Configure Specs for Model

In this section, you will configure the underlying Kubernetes objects, as well as the InferenceService, that Kale creates to serve a trained Machine Learning (ML) model. You will train an SKLearn model, configure the Kubernetes and InferenceService specs, serve the trained model, and get predictions.

What You’ll Need


  1. Create a new notebook server using the default Kale Docker image. The image will have the following naming scheme:<IMAGE_TAG>


    The <IMAGE_TAG> varies based on the MiniKF or Arrikto EKF release.

  2. Create a new Jupyter notebook (that is, an IPYNB file).

  3. Copy and paste the following import statements into the first code cell, and run it:

    import json

    from sklearn.feature_extraction import text
    from sklearn.datasets import fetch_20newsgroups
    from sklearn.model_selection import train_test_split
    from sklearn.feature_extraction.text import TfidfVectorizer

    from kale.serve import Endpoint


  4. In a different code cell, fetch the dataset and print the topic names. Copy and paste the following code, and run it:

    # download dataset
    newsgroups_dataset = fetch_20newsgroups(random_state=42)

    # dataset target groups
    class_names = newsgroups_dataset.target_names
    print(*class_names, sep="\n")


  5. Load the features and targets of the dataset, and split it into train and test subsets. In a new cell, copy and paste the following code, and run it:

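    # create the dataset: raw documents as features, topic indices as targets
    x = newsgroups_dataset.data
    y = newsgroups_dataset.target

    # split the dataset
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=.2,
                                                        random_state=42)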


  6. Use the TF-IDF vectorizer to transform the raw training and test subsets into a form that you can use to train a machine learning model:

    # calculate TF-IDF vectors
    stop_words = text.ENGLISH_STOP_WORDS
    vectorizer = TfidfVectorizer(stop_words=stop_words)
    x_train_transformed = vectorizer.fit_transform(x_train)
    x_test_transformed = vectorizer.transform(x_test)

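
The vectorizer in step 6 is fitted on the training subset only: `fit_transform` learns the vocabulary and IDF weights from `x_train`, and `transform` reuses them for `x_test`, so both matrices share the same columns. The following is a minimal pure-Python sketch of that idea, not scikit-learn's implementation. It assumes plain whitespace tokenization and the smoothed IDF formula `ln((1 + n) / (1 + df)) + 1` with L2 normalization, as described in the scikit-learn documentation for the default settings. The helper names `fit_tfidf` and `transform_tfidf` and the sample sentences are hypothetical.

```python
import math
from collections import Counter


def fit_tfidf(docs):
    """Learn a vocabulary and smoothed IDF weights from the training docs."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # document frequency: in how many training docs each term appears
    df = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    return vocab, idf


def transform_tfidf(docs, vocab, idf):
    """Vectorize docs with an already-fitted vocabulary.

    Terms that were not seen during fitting are silently dropped.
    """
    rows = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        row = [counts[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(v * v for v in row))
        rows.append([v / norm for v in row] if norm else row)
    return rows


# Made-up sample data, standing in for x_train and x_test
train_docs = ["the rocket launch was delayed", "the hockey game was close"]
test_docs = ["a rocket game on ice"]

vocab, idf = fit_tfidf(train_docs)
train_vectors = transform_tfidf(train_docs, vocab, idf)
test_vectors = transform_tfidf(test_docs, vocab, idf)

# Both matrices have one column per training-vocabulary term.
print(len(vocab), len(train_vectors[0]), len(test_vectors[0]))
```

Calling `fit_transform` on the test subset instead would produce a different vocabulary, and the trained model would receive vectors whose columns no longer match the ones it was trained on.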

  7. In the same notebook server, open a terminal, and create a new Python file. Name it

    $ touch
  8. Copy and paste the following code inside
    # Copyright © 2022 Arrikto Inc. All Rights Reserved.

    """Kale SDK.

    This script uses an ML pipeline to train and serve an SKLearn Model.
    """

    from import Signature
    from kale.sdk import pipeline, step
    from kale.common import mlmdutils, artifacts

    from sklearn.feature_extraction import text
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.datasets import fetch_20newsgroups
    from sklearn.model_selection import train_test_split
    from sklearn.feature_extraction.text import TfidfVectorizer


    def load_split_dataset():
        """Fetch the 20 newsgroups dataset."""
        # load the data
        newsgroups_dataset = fetch_20newsgroups(random_state=42)
        x = newsgroups_dataset.data
        y = newsgroups_dataset.target

        # split the dataset
        x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=.2,
                                                            random_state=42)
        return x_train, y_train, x_test, y_test


    def preprocess(x_train, x_test):
        """Preprocess the input data."""
        # get stopwords
        stop_words = text.ENGLISH_STOP_WORDS
        # TF-IDF vectors
        vectorizer = TfidfVectorizer(stop_words=stop_words)
        vectors_train = vectorizer.fit_transform(x_train)
        vectors_test = vectorizer.transform(x_test)
        return vectors_train, vectors_test


    def train(x, y):
        """Train a MultinomialNB model."""
        classifier = MultinomialNB(alpha=.01)
        model = classifier.fit(x, y)
        return model


    def register_model(model, x, y):
        mlmd = mlmdutils.get_mlmd_instance()

        signature = Signature(
            input_size=[1] + list(x[0].shape),
            output_size=[1] + list(y[0].shape),
            input_dtype=x.dtype,
            output_dtype=y.dtype)

        model_artifact = artifacts.SklearnModel(
            model=model,
            description="A simple MultinomialNB model",
            version="1.0.0",
            author="Kale",
            signature=signature,
            tags={"app": "sklearn-model"}).submit_artifact()

        mlmd.link_artifact_as_output(
        return


    @pipeline(name="classification", experiment="sklearn-model")
    def ml_pipeline():
        """Run the ML pipeline."""
        x_train, y_train, x_test, y_test = load_split_dataset()
        vectors_train, vectors_test = preprocess(x_train, x_test)
        model = train(vectors_train, y_train)
        register_model(model, vectors_train, y_train)


    if __name__ == "__main__":
        ml_pipeline()
  9. Create a new step function which specifies your desired Kubernetes and InferenceService configurations, and serves the trained model:
     # Copyright © 2022 Arrikto Inc. All Rights Reserved.

     """Kale SDK.

     This script uses an ML pipeline to train and serve an SKLearn Model.
     """

     from import Signature
    +from kale.serve import serve
     from kale.sdk import pipeline, step
     from kale.common import mlmdutils, artifacts

     from sklearn.feature_extraction import text
     from sklearn.naive_bayes import MultinomialNB
     from sklearn.datasets import fetch_20newsgroups
     from sklearn.model_selection import train_test_split
     from sklearn.feature_extraction.text import TfidfVectorizer


     def load_split_dataset():
         """Fetch the 20 newsgroups dataset."""
         # load the data
         newsgroups_dataset = fetch_20newsgroups(random_state=42)
         x = newsgroups_dataset.data
         y = newsgroups_dataset.target

         # split the dataset
         x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=.2,
                                                             random_state=42)
         return x_train, y_train, x_test, y_test


     def preprocess(x_train, x_test):
         """Preprocess the input data."""
         # get stopwords
         stop_words = text.ENGLISH_STOP_WORDS
         # TF-IDF vectors
         vectorizer = TfidfVectorizer(stop_words=stop_words)
         vectors_train = vectorizer.fit_transform(x_train)
         vectors_test = vectorizer.transform(x_test)
         return vectors_train, vectors_test


     def train(x, y):
         """Train a MultinomialNB model."""
         classifier = MultinomialNB(alpha=.01)
         model = classifier.fit(x, y)
         return model


     def register_model(model, x, y):
         mlmd = mlmdutils.get_mlmd_instance()

         signature = Signature(
             input_size=[1] + list(x[0].shape),
             output_size=[1] + list(y[0].shape),
             input_dtype=x.dtype,
             output_dtype=y.dtype)

         model_artifact = artifacts.SklearnModel(
             model=model,
             description="A simple MultinomialNB model",
             version="1.0.0",
             author="Kale",
             signature=signature,
             tags={"app": "sklearn-model"}).submit_artifact()

         mlmd.link_artifact_as_output(
         return


    +def serve_model(model_artifact_id):
    +    serve_config = {"limits": {"cpu": 1, "memory": "4Gi"},
    +                    "requests": {"cpu": "100m", "memory": "3Gi"},
    +                    "labels": {"my-sklearn-model": "logistic-regression"},
    +                    "annotations": {"": "false"},
    +                    "protocol_version": "v1"}
    +    serve(name="sklearn-model",
    +          model_id=model_artifact_id,
    +          serve_config=serve_config)


     @pipeline(name="classification", experiment="sklearn-model")
     def ml_pipeline():
         """Run the ML pipeline."""
         x_train, y_train, x_test, y_test = load_split_dataset()
         vectors_train, vectors_test = preprocess(x_train, x_test)
         model = train(vectors_train, y_train)
    -    register_model(model, vectors_train, y_train)
    +    artifact_id = register_model(model, vectors_train, y_train)
    +    serve_model(artifact_id)


     if __name__ == "__main__":
         ml_pipeline()

    Unless you specify them explicitly, the KServe controller applies the default limits and requests that the admin has set. At the same time, Kubernetes will not schedule Pods whose requests exceed their limits.

    To ensure that the resulting resources have valid specs, whenever you set either limits or requests, set the other one as well, and keep each request at or below its limit. If you provide only requests and a value exceeds the corresponding default limit, Kubernetes will not schedule the resulting Pods.

    The protocol_version attribute sets the protocol version for an InferenceService created for an SKLearn predictor, whereas the other attributes set configurations for the underlying Kubernetes objects.

    The serve_config argument accepts either a dict, which Kale will use to initialize a ServeConfig object, or a ServeConfig directly.
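
The rule above (set limits and requests together, and keep each request at or below its limit) can be checked locally before submitting the pipeline. The following is a minimal sketch under stated assumptions: it is not part of Kale or Kubernetes, the helper names `parse_cpu`, `parse_memory`, and `check_resources` are hypothetical, and it understands only a subset of Kubernetes quantity notation (plain numbers, millicores such as `100m`, and the memory suffixes listed in the code).

```python
from fractions import Fraction

# Subset of Kubernetes quantity suffixes; other forms are not handled here.
_MEMORY_UNITS = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_cpu(value):
    """Return CPU as a number of cores; accepts 1, 0.5, "2" or "100m"."""
    text = str(value)
    if text.endswith("m"):
        return Fraction(text[:-1]) / 1000
    return Fraction(text)


def parse_memory(value):
    """Return memory in bytes; accepts a byte count or a suffix such as "4Gi"."""
    text = str(value)
    # check two-letter suffixes before one-letter ones
    for suffix in sorted(_MEMORY_UNITS, key=len, reverse=True):
        if text.endswith(suffix):
            return Fraction(text[: -len(suffix)]) * _MEMORY_UNITS[suffix]
    return Fraction(text)


def check_resources(serve_config):
    """Return a list of problems found in the "limits"/"requests" entries."""
    limits = serve_config.get("limits")
    requests = serve_config.get("requests")
    if (limits is None) != (requests is None):
        return ["set both 'limits' and 'requests', or neither"]
    problems = []
    parsers = {"cpu": parse_cpu, "memory": parse_memory}
    for resource, parse in parsers.items():
        if resource in (limits or {}) and resource in (requests or {}):
            if parse(requests[resource]) > parse(limits[resource]):
                problems.append(f"{resource} request exceeds limit")
    return problems


# Same resource values as the serve_config in the step above
serve_config = {"limits": {"cpu": 1, "memory": "4Gi"},
                "requests": {"cpu": "100m", "memory": "3Gi"}}
print(check_resources(serve_config))
```

An empty list means the sketch found no problems. It cannot see the defaults that the admin has configured, so a config that sets neither limits nor requests passes trivially.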

  10. Deploy and run your code as a KFP pipeline:

    $ python3 -m kale --kfp
  11. In the existing notebook, in a different code cell, initialize a Kale Endpoint object using the name of the InferenceService you created. Then, run the cell:

    endpoint = Endpoint(name="sklearn-model")


    When initializing an Endpoint, you can also pass the namespace of the InferenceService. If you do not provide one, Kale assumes the namespace of the notebook server.


  12. Visualize a test sample. Copy and paste the following code in a new cell, and run it:

    # visualize the test sample you will use
    index_test = 2
    print(x_test[index_test])
    print("Topic:", class_names[y_test[index_test]])


  13. Prepare the data payload for the prediction request. Copy and paste the following code in a new cell, and run it:

    # convert the test sample into JSON format
    data = {"instances": x_test_transformed[index_test].toarray().tolist()}


  14. Invoke the server to get predictions. Copy and paste the following snippet in a different code cell, and run it:

    # get and print the prediction
    res = endpoint.predict(json.dumps(data))
    print(f"The prediction is {class_names[res['predictions'][0]]}")

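
The request and response shapes used in steps 13 and 14 can be tried without a running InferenceService. The sketch below uses stand-in data only: `dense_row`, `fake_response`, and `sample_class_names` are made up for illustration, and the reply is assumed to have the same shape that the tutorial indexes with `res['predictions'][0]`.

```python
import json

# Stand-in for x_test_transformed[index_test].toarray().tolist():
# one sample, so a list holding a single row of feature values.
dense_row = [[0.0, 0.31, 0.0, 0.95]]

# Request body in the shape used in the tutorial: {"instances": [...]}
payload = json.dumps({"instances": dense_row})

# Hypothetical reply; a real one comes from endpoint.predict(payload).
fake_response = {"predictions": [2]}

# Made-up label list standing in for newsgroups_dataset.target_names
sample_class_names = ["alt.atheism", "comp.graphics", "sci.space"]

# One prediction per instance; map the class index back to a topic name.
predicted_index = fake_response["predictions"][0]
print(f"The prediction is {sample_class_names[predicted_index]}")
```

Each inner list in `instances` is one sample, so sending several rows at once should yield one entry per row in `predictions`.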



You have successfully served and invoked a model with Kale, specifying custom Kubernetes and InferenceService configurations for the corresponding resources.

What’s Next

Check out how you can invoke an already deployed model.