Kale Model Artifact

kale.Model is an ML Metadata (MLMD) ArtifactType that lets you log arbitrary Machine Learning (ML) models in MLMD.

Import

The object lives in the kale.common.artifacts module. Import it as follows:

from kale.common.artifacts import Model

Attributes

Name          Type               Default  Description
model         Any                -        The model object
name          str                None     The name of the model artifact
description   str                None     A short description of the model artifact
version       str                None     The version of the model artifact
author        str                None     The author of the model artifact
signature     kale.ml.Signature  None     The signature of the model artifact (i.e., the model’s input and output shapes)
tags          Dict[str, str]     None     A dictionary of tags associated with the model artifact
model_type    str                None     The type of the model artifact (e.g., TensorFlow, PyTorch, etc.)
artifact_uri  str                None     A path to the serialized model

Initialization

You can initialize a Model artifact like any other Python object:

from kale.common.artifacts import Model
from kale.ml import Signature

# Describe the model's input and output shapes and data types.
signature = Signature(
    input_size=["batch_size"] + [64],
    output_size=["batch_size"] + [1],
    input_dtype="float64",
    output_dtype="float32")

# Placeholder: replace None with your trained model object.
model = None

model_artifact = Model(
    model,
    name="Regressor",
    description="A model for predicting housing prices",
    version="1.0.0",
    author="User",
    signature=signature,
    tags={"stage": "experimental"},
    model_type="sklearn")
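The signature above uses the string "batch_size" as a placeholder for a dimension whose size is not fixed. As a rough illustration of how such a size specification could be compared with a concrete array shape, here is a minimal sketch in plain Python. The helper shape_matches is hypothetical and not part of Kale, and it assumes that any string entry stands for a dimension of arbitrary size.

```python
from typing import Sequence, Union

Dim = Union[int, str]


def shape_matches(spec: Sequence[Dim], shape: Sequence[int]) -> bool:
    """Return True if a concrete shape is compatible with a size spec.

    Assumption (not Kale behavior): a string entry such as "batch_size"
    is a named dimension that accepts any size; an int must match exactly.
    """
    if len(spec) != len(shape):
        return False
    for expected, actual in zip(spec, shape):
        if isinstance(expected, str):
            continue  # named dimension: any size is accepted
        if expected != actual:
            return False
    return True


input_size = ["batch_size"] + [64]  # same spec as in the example above
print(shape_matches(input_size, (32, 64)))  # True
print(shape_matches(input_size, (32, 63)))  # False
```

A check like this is useful before logging a model, because it catches a signature that does not agree with the data the model was trained on.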

After you initialize a Model artifact, you can submit it to MLMD by calling its submit_artifact() method. For example:

model_artifact.submit_artifact()

Note

MLMD is a “metadata” store, which means that it does not store the (model) object itself. Instead, every MLMD artifact you submit via Kale has an artifact_uri property that points to the location of the object.

When you call submit_artifact(), Kale marshals the object to the local file system and then uses Rok to take a snapshot. In this way, Rok acts as your artifact store, and you can use this snapshot reference later to restore the entire environment where you produced the artifact.
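To make the marshalling step more concrete, the following sketch writes an object to the local file system and reads it back. It is only an illustration under stated assumptions: it uses the standard pickle module, the helpers marshal_to_disk and load_from_disk are hypothetical names, and it does not show how Kale actually serializes models or how Rok takes a snapshot.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any


def marshal_to_disk(obj: Any, directory: Path, name: str) -> Path:
    """Serialize obj with pickle under directory and return the file path."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{name}.pkl"
    with path.open("wb") as f:
        pickle.dump(obj, f)
    return path


def load_from_disk(path: Path) -> Any:
    """Read back an object written by marshal_to_disk."""
    with path.open("rb") as f:
        return pickle.load(f)


# Stand-in for a trained model; any picklable object works here.
model = {"weights": [0.5, -1.2], "bias": 0.1}

with tempfile.TemporaryDirectory() as tmp:
    path = marshal_to_disk(model, Path(tmp), "regressor")
    restored = load_from_disk(path)
    print(path.name)          # regressor.pkl
    print(restored == model)  # True
```

The path returned by such a step is the kind of location that an artifact_uri would point to.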

If you don’t want Kale to take a snapshot, or if you want to provide your own artifact location, you can use the artifact_uri field to specify the location of the serialized model. For example:

model_artifact = Model(
    model,
    name="Regressor",
    description="A model for predicting housing prices",
    version="1.0.0",
    author="User",
    signature=signature,
    tags={"stage": "experimental"},
    model_type="sklearn",
    artifact_uri="gs://my-bucket/model.pkl")

In this case, Kale does not take a snapshot of the model, but it still registers the artifact in MLMD. KServe supports all major cloud storage services, such as AWS S3, Azure Blob Storage, and Google Cloud Storage. You can also expose the location of your serialized model via HTTP or HTTPS. For more information on the supported platforms, see the KServe documentation.
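Before passing your own artifact_uri, it can help to check that the URI carries a scheme you expect. The sketch below does this with the standard library. The helper check_artifact_uri is hypothetical, and the set of schemes is an assumption inferred from the services named above; the KServe documentation remains the authoritative list.

```python
from urllib.parse import urlparse

# Assumption: schemes inferred from the storage options mentioned in the text
# (Google Cloud Storage, AWS S3, HTTP, HTTPS). Not an authoritative list.
ASSUMED_SCHEMES = {"gs", "s3", "http", "https"}


def check_artifact_uri(uri: str) -> str:
    """Return the URI scheme, or raise ValueError if it is not in ASSUMED_SCHEMES."""
    scheme = urlparse(uri).scheme
    if scheme not in ASSUMED_SCHEMES:
        raise ValueError(f"Unsupported or missing scheme in artifact URI: {uri!r}")
    return scheme


print(check_artifact_uri("gs://my-bucket/model.pkl"))  # gs
```

A plain local path such as /tmp/model.pkl has no scheme, so this check rejects it with a ValueError.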

ML Frameworks

We support specialized Model artifact implementations for these frameworks:

  • Scikit Learn (SklearnModel)
  • Keras TensorFlow (TFKerasModel)
  • PyTorch (PyTorchModel)
  • XGBoost (XGBoostModel)

You can initialize any of these model artifacts the same way you initialize the base Model artifact. However, you do not need to specify the model_type. Kale populates this field automatically, along with other framework-specific properties of the model.

# Scikit Learn
from kale.ml import Signature
from kale.common.artifacts import SklearnModel
from sklearn.linear_model import LinearRegression

sklearn_model = LinearRegression()
signature = Signature(
    input_size=["batch_size"] + [64],
    output_size=["batch_size"] + [1],
    input_dtype="float64",
    output_dtype="float32")
model_artifact = SklearnModel(
    model=sklearn_model,
    description="An Sklearn linear regression model",
    version="1.0.0",
    author="Kale",
    signature=signature,
    tags={"stage": "draft"})

# XGBoost
from kale.ml import Signature
from kale.common.artifacts import XGBoostModel
import xgboost as xgb
from sklearn.metrics import mean_absolute_error

xgb_model = xgb.XGBRegressor(tree_method="hist",
                             eval_metric=mean_absolute_error)
signature = Signature(
    input_size=["batch_size"] + [8],
    output_size=["batch_size"] + [1],
    input_dtype="float64",
    output_dtype="float32")
model_artifact = XGBoostModel(
    model=xgb_model,
    description="An XGBoost regressor",
    version="1.0.0",
    author="Kale",
    signature=signature,
    tags={"stage": "draft"})

# PyTorch
from kale.ml import Signature
from kale.common.artifacts import PyTorchModel
import torchvision.models as models

torch_model = models.AlexNet()
signature = Signature(
    input_size=[1, 3, 256, 256],
    output_size=[1, 1000],
    input_dtype="float32",
    output_dtype="float32")
model_artifact = PyTorchModel(
    model=torch_model,
    description="A PyTorch AlexNet implementation",
    version="1.0.0",
    author="Kale",
    signature=signature,
    tags={"stage": "draft"})

# Keras TensorFlow
from kale.ml import Signature
from kale.common.artifacts import TFKerasModel
import tensorflow as tf

keras_model = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3))
# The input size is channels-last, matching input_shape above.
signature = Signature(
    input_size=[1, 224, 224, 3],
    output_size=[1, 1000],
    input_dtype="float32",
    output_dtype="float32")
model_artifact = TFKerasModel(
    model=keras_model,
    description="A Keras MobileNet implementation",
    version="1.0.0",
    author="Kale",
    signature=signature,
    tags={"stage": "draft"})
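One way to picture how a model type can be filled in automatically is to look at the package that defines the model's class. The sketch below illustrates that idea only. The helper infer_model_type and the label mapping are hypothetical, and this is not a description of how Kale's framework-specific artifacts actually set model_type.

```python
from typing import Any

# Assumption: top-level package name -> model_type label. These labels are
# illustrative; the values Kale records may differ.
FRAMEWORK_LABELS = {
    "sklearn": "sklearn",
    "xgboost": "xgboost",
    "torch": "pytorch",
    "torchvision": "pytorch",
    "keras": "keras",
    "tensorflow": "keras",
}


def infer_model_type(model: Any) -> str:
    """Guess a framework label from the package that defines the model's class."""
    top_level = type(model).__module__.split(".")[0]
    return FRAMEWORK_LABELS.get(top_level, "unknown")


# Stand-in class that mimics an object defined in sklearn.linear_model,
# so the sketch runs without scikit-learn installed.
FakeRegressor = type("FakeRegressor", (), {"__module__": "sklearn.linear_model"})

print(infer_model_type(FakeRegressor()))  # sklearn
print(infer_model_type(object()))         # unknown
```

With the real libraries installed, passing an instance such as LinearRegression() would follow the same lookup, because its class is defined under the sklearn package.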