Kale Serve API

Attention

This feature is a Tech Preview, so it is not fully supported by Arrikto and may not be functionally complete. While it is not intended for production use, we encourage you to try it out and provide us with feedback.

Read our Tech Preview Policy for more information.

Also, check out the Kale Serve Guide to find out more about how to serve your models.

The Kale serve API is a Python API for serving ML (Machine Learning) models. You can create a V1beta1InferenceService and serve your model seamlessly in two different ways:

  • by passing a model Kubeflow Artifact ID and, optionally, a transformer Kubeflow Artifact ID or a custom transformer image.
  • by passing a custom model image, and, optionally, a transformer Kubeflow Artifact ID or a custom transformer image.

Import

To use the Kale serve API, import it as follows:

from kale.serve import serve

Attributes

Name                      Type         Default  Description
name                      str          None     The name of the InferenceService
model_id                  int          -        The Kale Model Artifact ID
transformer_id            int          None     The Kale Transformer Artifact ID
serve_config              ServeConfig  None     The ServeConfig object
wait                      boolean      True     Wait for the InferenceService to become ready
propagate_configurations  boolean      False    Propagate the PodDefault manifests applied in the Notebook Server to the InferenceService. The default PodDefault labels are Allow access to Rok and Allow access to Kubeflow Pipelines.
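For example, here is a minimal sketch of a call that overrides the wait and propagate_configurations defaults. The name and artifact ID are placeholders; use your own Model Artifact ID:

from kale.serve import serve

# Minimal sketch: serve a stored Model artifact without blocking until the
# InferenceService becomes ready, and propagate the Notebook Server's
# PodDefault manifests to the InferenceService.
# The model_id value below is a placeholder for your own Model Artifact ID.
serve(
    name="my-model",
    model_id=19,
    wait=False,
    propagate_configurations=True,
)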

Initialization

To serve a model using a Model Artifact ID, you first need to store your ML model as a Kubeflow artifact. Then, you may use the serve API to serve the ML model using the Model Artifact ID. If needed, you may also use a transformer to preprocess the data the model receives and postprocess the model's predictions.

If you choose to create a transformer using a Transformer Artifact ID, initialize the serve API like so:

serve_config = {"limits": {"memory": "4Gi"}, "annotations": {"sidecar.istio.io/inject": "false"}, "runtime_version": "1.0.1-xgboost"} serve(name="xgboost-model", model_id=19, transformer_id=3, serve_config=serve_config)

Otherwise, if you choose to create the transformer using a custom image, initialize the serve API like so:

serve_config = {"limits": {"memory": "4Gi"}, "annotations": {"sidecar.istio.io/inject": "false"}, "runtime_version": "1.0.1-xgboost", "transformer": {"container": {"image": "image_str", "name": "container_name"}}} serve(name="xgboost-model", model_id=19, serve_config=serve_config)

Alternatively, to serve a model packaged in a custom image, you first need to create a custom image that packages your ML model and its dependencies. Then, you may pass the image to the ServeConfig object and use the serve API to serve the ML model. If needed, you may also use a transformer to preprocess the data the model receives and postprocess the model's predictions.

If you choose to create a transformer using a Transformer Artifact ID, initialize the serve API like so:

serve_config = {"limits": {"memory": "4Gi"}, "annotations": {"sidecar.istio.io/inject": "false"}, "runtime_version": "1.0.1-xgboost", "predictor": {"container": {"image": "pred_image_str", "name": "container_name"}}} serve(name="xgboost-model", transformer_id=3, serve_config=serve_config)

Otherwise, if you choose to create the transformer using a custom image, initialize the serve API like so:

serve_config = {"limits": {"memory": "4Gi"}, "annotations": {"sidecar.istio.io/inject": "false"}, "runtime_version": "1.0.1-xgboost", "transformer": {"container": {"image": "image_str", "name": "container_name"}}, "predictor": {"container": {"image": "pred_image_str", "name": "container_name"}}} serve(name="xgboost-model", serve_config=serve_config)

See also

You can see several use cases of the serve API in the following user guides: