Configure Kubernetes Spec for Model

In this section, you will set Kubernetes configurations on a trained Machine Learning (ML) model that you will serve with Kale and KFServing. Specifically, you will configure the Kubernetes spec and metadata for the InferenceService and its underlying resources.

What You'll Need


  1. Create a new Notebook server using the default Kale Docker image. The image will have the following naming scheme: <IMAGE_TAG>


    The <IMAGE_TAG> varies based on the MiniKF or Arrikto EKF release.

  2. Create a new Jupyter Notebook (that is, an IPYNB file):

  3. Copy and paste the import statements in the first code cell, then run it:

    import json
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler
    from sklearn.linear_model import LogisticRegression
    from kale.common.serveutils import serve

    This is what your notebook cell will look like:

  4. In a different code cell, prepare your dataset and train your model. Then, run it:

    # load the data
    x, y = make_classification(random_state=42)
    x_raw, x_test_raw, y, y_test = train_test_split(x, y, test_size=0.1)
    # process the data
    scaler = StandardScaler()
    x = scaler.fit_transform(x_raw)
    # train the model
    model = LogisticRegression(max_iter=1000)
    model.fit(x, y)

    This is what your notebook cell will look like:
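Though not part of the original steps, a quick hold-out evaluation can confirm the pipeline trains as expected. This is a standalone sketch using the same scikit-learn calls as the step above; the `random_state` on the split and the `y_train` name are illustrative additions:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Recreate the dataset and split from the step above.
x, y = make_classification(random_state=42)
x_raw, x_test_raw, y_train, y_test = train_test_split(
    x, y, test_size=0.1, random_state=42)

# Fit the scaler on the training data only, then train the model.
scaler = StandardScaler()
model = LogisticRegression(max_iter=1000)
model.fit(scaler.fit_transform(x_raw), y_train)

# Score on the held-out split, applying the same scaling first.
accuracy = model.score(scaler.transform(x_test_raw), y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```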

  5. Define the data processing function that you want to turn into a KFServing transformer component in a different code cell and run it:

    def process_raw(x):
        import numpy as np
        # convert your data to a NumPy array
        x = np.array(x)
        # expand your data's dimensions
        x = x[None,:]
        # scale your features
        x = scaler.transform(x)
        # turn your data back to a Python list and reduce the dimensions
        x = x.tolist()[0]
        return x

    This is what your notebook cell will look like:
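To illustrate what the transformer does, here is a standalone sketch. The toy data and the locally fitted `scaler` are stand-ins for the notebook's objects: the function reshapes a single raw sample into the 2-D shape that `transform` expects, scales it, and flattens it back into a plain list:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Stand-in for the scaler fitted during training.
rng = np.random.default_rng(0)
scaler = StandardScaler().fit(rng.normal(5.0, 2.0, size=(100, 3)))

def process_raw(x):
    import numpy as np
    x = np.array(x)          # list -> ndarray, shape (3,)
    x = x[None, :]           # add a batch dimension: shape (1, 3)
    x = scaler.transform(x)  # apply the training-time scaling
    return x.tolist()[0]     # drop the batch dimension, back to a list

scaled = process_raw([5.0, 5.0, 5.0])
print(len(scaled))  # one scaled value per feature
```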

  6. Specify your desired Kubernetes configuration for the deployment of your model in a different code cell and run it:

    deploy_config = {"limits": {"cpu": "1", "memory": "4Gi"},
                     "requests": {"cpu": "100m", "memory": "3Gi"},
                     "labels": {"my-kale-model": "logistic-regression"}}

    This is what your notebook cell will look like:



    To see all the available configurations you can specify, take a look at the DeployConfig API.

  7. Call the serve function and pass the trained model, the preprocessing function, its dependencies, and the deployment configuration as arguments. Then, run the cell:

    kfserver = serve(model, preprocessing_fn=process_raw,
                     preprocessing_assets={'scaler': scaler},
                     deploy_config=deploy_config)

    This is what your notebook cell will look like:



    Unless you explicitly specify limits and requests, the KFServing controller falls back to default values that the admin sets. Note that Kubernetes will not schedule Pods whose requests exceed their limits.

    To ensure that the resulting resources have valid specs, whenever you set one of limits or requests, specify a value for the other as well, respecting the restriction above. If you provide a value only for requests and it exceeds the default limits, Kubernetes will not schedule the resulting Pods.
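You can check this rule yourself before calling serve. The helper below is hypothetical (it is not part of Kale or KFServing): it parses common Kubernetes quantity suffixes and rejects a config whose requests exceed its limits:

```python
# Hypothetical pre-flight check, not part of Kale or KFServing.
# Suffix factors for common Kubernetes resource quantities; two-letter
# suffixes must come first so "Gi" is not misread as "G".
UNITS = {"m": 1e-3, "Ki": 2**10, "Mi": 2**20, "Gi": 2**30,
         "K": 1e3, "M": 1e6, "G": 1e9}

def to_number(quantity):
    """Parse a quantity like '100m', '4Gi', or '1' into a plain number."""
    for suffix, factor in UNITS.items():
        if quantity.endswith(suffix):
            return float(quantity[: -len(suffix)]) * factor
    return float(quantity)

def check_resources(config):
    """Raise if any resource request exceeds the corresponding limit."""
    limits = config.get("limits", {})
    requests = config.get("requests", {})
    for resource, requested in requests.items():
        if resource in limits and to_number(requested) > to_number(limits[resource]):
            raise ValueError(f"request for {resource} exceeds its limit")

deploy_config = {"limits": {"cpu": "1", "memory": "4Gi"},
                 "requests": {"cpu": "100m", "memory": "3Gi"}}
check_resources(deploy_config)  # passes: 100m <= 1 CPU, 3Gi <= 4Gi
```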

    The deploy_config argument accepts either a dict, which Kale will use to initialize a DeployConfig object, or a DeployConfig directly.

  8. Invoke the server to get predictions in a different code cell and run it:

    data = json.dumps({"instances": x_test_raw.tolist()})
    predictions = kfserver.predict(data)

    This is what your notebook cell will look like:
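For reference, the payload built above follows the KFServing v1 data plane protocol: a JSON object whose "instances" key holds a list of input rows. A standalone sketch with stand-in data in place of the notebook's x_test_raw:

```python
import json

# Stand-in for x_test_raw.tolist(): two rows of three raw features each.
rows = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
data = json.dumps({"instances": rows})

# The server parses the payload back into the same structure.
payload = json.loads(data)
print(len(payload["instances"]))  # number of rows sent for prediction
```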



You have successfully served and invoked a model with Kale, specifying custom Kubernetes configurations for the corresponding resources.

What's Next

Check out how you can invoke an already deployed model.