HP Tuning with the Kale SDK

This section guides you through configuring and running a Katib experiment with the Kale SDK to tune the hyperparameters (HPs) of your Machine Learning (ML) model.

What You’ll Need

  • An EKF or MiniKF deployment with the default Kale Docker image.
  • An understanding of how the Kale SDK works.

Procedure

  1. Create a new Notebook server using the default Kale Docker image. The image will have the following naming scheme:

    gcr.io/arrikto/jupyter-kale-py38:<IMAGE_TAG>

    Note

    The <IMAGE_TAG> varies based on the MiniKF or EKF release.

  2. Connect to the server, open a terminal, and install scikit-learn:

    $ pip3 install --user scikit-learn==0.23.0
  3. Create a new Python file and name it kale_katib.py:

    $ touch kale_katib.py
  4. Copy and paste the following code inside kale_katib.py:

    metrics.py

    # Copyright © 2021 Arrikto Inc. All Rights Reserved.

    """Kale SDK.

    This script trains an ML pipeline to solve a binary classification task.
    """

    from kale.sdk import has_metrics, pipeline, step
    from kale.sdk.logging import log_metric
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split


    @step(name="data_loading")
    def load(random_state):
        """Create a random dataset for binary classification."""
        rs = int(random_state)
        x, y = make_classification(random_state=rs)
        return x, y


    @step(name="data_split")
    def split(x, y):
        """Split the data into train and test sets."""
        x, x_test, y, y_test = train_test_split(x, y, test_size=0.1)
        return x, x_test, y, y_test


    @step(name="model_training")
    def train(x, x_test, y, training_iterations):
        """Train a Logistic Regression model."""
        iters = int(training_iterations)
        model = LogisticRegression(max_iter=iters)
        model.fit(x, y)
        return model


    @has_metrics
    @step(name="model_evaluation")
    def evaluate(model, x_test, y_test):
        """Evaluate the model on the test dataset."""
        y_pred = model.predict(x_test)
        accuracy = accuracy_score(y_test, y_pred)
        log_metric(name="accuracy", value=accuracy)


    @pipeline(name="binary-classification", experiment="kale-tutorial")
    def ml_pipeline(rs=42, iters=100):
        """Run the ML pipeline."""
        x, y = load(rs)
        x, x_test, y, y_test = split(x, y)
        model = train(x, x_test, y, iters)
        evaluate(model, x_test, y_test)


    if __name__ == "__main__":
        ml_pipeline(rs=42, iters=100)

    In this code sample, you start from a standard Python script that trains a Logistic Regression model and decorate its functions using the Kale SDK. To read more about how to create this file, head to the corresponding KFP metrics user guide.
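    Before running the script through Kale, you can sanity-check the underlying ML logic with plain scikit-learn. The following sketch reproduces the same load/split/train/evaluate flow without the Kale decorators (variable names are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Same logic as the decorated pipeline steps, run as ordinary Python.
x, y = make_classification(random_state=42)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.1)
model = LogisticRegression(max_iter=100).fit(x_train, y_train)
accuracy = accuracy_score(y_test, model.predict(x_test))
print(f"accuracy: {accuracy:.3f}")
```

    If this runs cleanly, the decorated version should run as well, since Kale executes the same function bodies as pipeline steps.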

  5. Define the Katib experiment configuration using the Katib SDK. The following snippet summarizes the changes in code:

    configuration.py

    # Copyright © 2021 Arrikto Inc. All Rights Reserved.

    """Kale SDK.

    This script trains an ML pipeline to solve a binary classification task.
    """

    from kale.sdk import has_metrics, pipeline, step
    from kale.sdk.logging import log_metric
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split
    +from kubeflow import katib
    +
    +
    +# The Katib experiment definition.
    +katib_experiment = katib.V1beta1ExperimentSpec(
    +    max_trial_count=3,
    +    parallel_trial_count=1,
    +    max_failed_trial_count=0,
    +    algorithm=katib.V1beta1AlgorithmSpec(
    +        algorithm_name="grid"
    +    ),
    +    objective=katib.V1beta1ObjectiveSpec(
    +        type="maximize",
    +        objective_metric_name="accuracy"
    +    ),
    +    parameters=[katib.V1beta1ParameterSpec(
    +        name="c",
    +        parameter_type="double",
    +        feasible_space=katib.V1beta1FeasibleSpace(
    +            min="0.1",
    +            max="1.0",
    +            step="0.3"),
    +    ), katib.V1beta1ParameterSpec(
    +        name="penalty",
    +        parameter_type="categorical",
    +        feasible_space=katib.V1beta1FeasibleSpace(
    +            list=["l2", "none"])
    +    )],
    +)


    @step(name="data_loading")
    def load(random_state):
        """Create a random dataset for binary classification."""
        rs = int(random_state)
        x, y = make_classification(random_state=rs)
        return x, y


    @step(name="data_split")
    def split(x, y):
        """Split the data into train and test sets."""
        x, x_test, y, y_test = train_test_split(x, y, test_size=0.1)
        return x, x_test, y, y_test


    @step(name="model_training")
    def train(x, x_test, y, training_iterations):
        """Train a Logistic Regression model."""
        iters = int(training_iterations)
        model = LogisticRegression(max_iter=iters)
        model.fit(x, y)
        return model


    @has_metrics
    @step(name="model_evaluation")
    def evaluate(model, x_test, y_test):
        """Evaluate the model on the test dataset."""
        y_pred = model.predict(x_test)
        accuracy = accuracy_score(y_test, y_pred)
        log_metric(name="accuracy", value=accuracy)


    @pipeline(name="binary-classification", experiment="kale-tutorial")
    def ml_pipeline(rs=42, iters=100):
        """Run the ML pipeline."""
        x, y = load(rs)
        x, x_test, y, y_test = split(x, y)
        model = train(x, x_test, y, iters)
        evaluate(model, x_test, y_test)


    if __name__ == "__main__":
        ml_pipeline(rs=42, iters=100)

    A Katib experiment configuration has four main sections:

    • Trials: The maximum number of Trials that Katib will start, how many of them run in parallel, and after how many failed Trials the experiment terminates.
    • Algorithm: The algorithm that you want to use for HP tuning (e.g., grid), specified as a katib.V1beta1AlgorithmSpec spec object.
    • Metric: The objective that you use as an end goal (e.g., accuracy), specified as a katib.V1beta1ObjectiveSpec spec object. Note that the name of the objective should match one of the metrics you log with the log_metric API. See the guide on how to produce KFP metrics for more details.
    • Hyperparameters: The names of the HPs you want to optimize, as a list of katib.V1beta1ParameterSpec spec objects. For each HP, you should specify the feasible space inside a katib.V1beta1FeasibleSpace spec object. This can take the form of a numerical range (e.g., see the HP named c in the above code snippet) or a list of possible values (e.g., see the HP named penalty in the above code snippet).
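    To get a feel for the search space this configuration defines, you can enumerate it by hand. Assuming the grid algorithm steps through the numerical range from min to max in increments of step (with max inclusive), the sketch below counts the candidate combinations. This is only an illustration of the feasible space, not Katib's actual trial scheduler:

```python
import itertools

# c: min=0.1, max=1.0, step=0.3 -> 0.1, 0.4, 0.7, 1.0 (assuming max is inclusive)
c_values = []
c = 0.1
while c <= 1.0 + 1e-9:
    c_values.append(round(c, 1))
    c += 0.3

penalty_values = ["l2", "none"]

grid = list(itertools.product(c_values, penalty_values))
print(len(grid))  # -> 8 combinations; max_trial_count=3 explores only a subset
```

    With max_trial_count set to 3, Katib stops well before exhausting the 8 grid points, so raise that value if you want the full grid explored.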

    Note

    The Jupyter Kale images in the EKF or MiniKF deployments already come with the Katib SDK installed.

  6. Pass the Katib configuration as an argument to the pipeline decorated function:

    experiment.py

    # Copyright © 2021 Arrikto Inc. All Rights Reserved.

    """Kale SDK.

    This script trains an ML pipeline to solve a binary classification task.
    """

    from kale.sdk import has_metrics, pipeline, step
    from kale.sdk.logging import log_metric
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split
    from kubeflow import katib


    # The Katib experiment definition.
    katib_experiment = katib.V1beta1ExperimentSpec(
        max_trial_count=3,
        parallel_trial_count=1,
        max_failed_trial_count=0,
        algorithm=katib.V1beta1AlgorithmSpec(
            algorithm_name="grid"
        ),
        objective=katib.V1beta1ObjectiveSpec(
            type="maximize",
            objective_metric_name="accuracy"
        ),
        parameters=[katib.V1beta1ParameterSpec(
            name="c",
            parameter_type="double",
            feasible_space=katib.V1beta1FeasibleSpace(
                min="0.1",
                max="1.0",
                step="0.3"),
        ), katib.V1beta1ParameterSpec(
            name="penalty",
            parameter_type="categorical",
            feasible_space=katib.V1beta1FeasibleSpace(
                list=["l2", "none"])
        )],
    )


    @step(name="data_loading")
    def load(random_state):
        """Create a random dataset for binary classification."""
        rs = int(random_state)
        x, y = make_classification(random_state=rs)
        return x, y


    @step(name="data_split")
    def split(x, y):
        """Split the data into train and test sets."""
        x, x_test, y, y_test = train_test_split(x, y, test_size=0.1)
        return x, x_test, y, y_test


    @step(name="model_training")
    def train(x, x_test, y, training_iterations):
        """Train a Logistic Regression model."""
        iters = int(training_iterations)
        model = LogisticRegression(max_iter=iters)
        model.fit(x, y)
        return model


    @has_metrics
    @step(name="model_evaluation")
    def evaluate(model, x_test, y_test):
        """Evaluate the model on the test dataset."""
        y_pred = model.predict(x_test)
        accuracy = accuracy_score(y_test, y_pred)
        log_metric(name="accuracy", value=accuracy)


    -@pipeline(name="binary-classification", experiment="kale-tutorial")
    +@pipeline(name="binary-classification", experiment="kale-tutorial",
    +          katib_experiment=katib_experiment)
    def ml_pipeline(rs=42, iters=100):
        """Run the ML pipeline."""
        x, y = load(rs)
        x, x_test, y, y_test = split(x, y)
        model = train(x, x_test, y, iters)
        evaluate(model, x_test, y_test)


    if __name__ == "__main__":
        ml_pipeline(rs=42, iters=100)
  7. Create a parameterized pipeline, which receives the HPs that you want to tune as inputs. To read more about creating parameterized pipelines with the Kale SDK, head to the relevant Kale SDK guide:

    katib.py

    # Copyright © 2021 Arrikto Inc. All Rights Reserved.

    """Kale SDK.

    This script trains an ML pipeline to solve a binary classification task.
    """

    from kale.sdk import has_metrics, pipeline, step
    from kale.sdk.logging import log_metric
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split
    from kubeflow import katib


    # The Katib experiment definition.
    katib_experiment = katib.V1beta1ExperimentSpec(
        max_trial_count=3,
        parallel_trial_count=1,
        max_failed_trial_count=0,
        algorithm=katib.V1beta1AlgorithmSpec(
            algorithm_name="grid"
        ),
        objective=katib.V1beta1ObjectiveSpec(
            type="maximize",
            objective_metric_name="accuracy"
        ),
        parameters=[katib.V1beta1ParameterSpec(
            name="c",
            parameter_type="double",
            feasible_space=katib.V1beta1FeasibleSpace(
                min="0.1",
                max="1.0",
                step="0.3"),
        ), katib.V1beta1ParameterSpec(
            name="penalty",
            parameter_type="categorical",
            feasible_space=katib.V1beta1FeasibleSpace(
                list=["l2", "none"])
        )],
    )


    @step(name="data_loading")
    def load(random_state):
        """Create a random dataset for binary classification."""
        rs = int(random_state)
        x, y = make_classification(random_state=rs)
        return x, y


    @step(name="data_split")
    def split(x, y):
        """Split the data into train and test sets."""
        x, x_test, y, y_test = train_test_split(x, y, test_size=0.1)
        return x, x_test, y, y_test


    @step(name="model_training")
    -def train(x, x_test, y, training_iterations):
    +def train(x, x_test, y, training_iterations, c, penalty):
        """Train a Logistic Regression model."""
        iters = int(training_iterations)
    -    model = LogisticRegression(max_iter=iters)
    +    c = float(c)
    +    model = LogisticRegression(max_iter=iters, C=c, penalty=penalty)
        model.fit(x, y)
        return model


    @has_metrics
    @step(name="model_evaluation")
    def evaluate(model, x_test, y_test):
        """Evaluate the model on the test dataset."""
        y_pred = model.predict(x_test)
        accuracy = accuracy_score(y_test, y_pred)
        log_metric(name="accuracy", value=accuracy)


    @pipeline(name="binary-classification", experiment="kale-tutorial",
              katib_experiment=katib_experiment)
    -def ml_pipeline(rs=42, iters=100):
    +def ml_pipeline(rs=42, c=1.0, penalty="l2", iters=100):
        """Run the ML pipeline."""
        x, y = load(rs)
        x, x_test, y, y_test = split(x, y)
    -    model = train(x, x_test, y, iters)
    +    model = train(x, x_test, y, iters, c, penalty)
        evaluate(model, x_test, y_test)


    if __name__ == "__main__":
    -    ml_pipeline(rs=42, iters=100)
    +    ml_pipeline(rs=42, c=1.0, penalty="l2", iters=100)
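    Katib passes trial parameters to the pipeline as strings, which is why the training step casts training_iterations and c before handing them to scikit-learn. The standalone sketch below mirrors that casting logic; the train helper is illustrative and not part of the Kale SDK:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def train(x, y, training_iterations, c, penalty):
    # Katib-supplied values arrive as strings, so cast them first.
    model = LogisticRegression(max_iter=int(training_iterations),
                               C=float(c), penalty=penalty)
    return model.fit(x, y)

x, y = make_classification(random_state=42)
x_tr, x_te, y_tr, y_te = train_test_split(x, y, test_size=0.1)

# Try two of the grid's candidate HP combinations by hand.
for c, penalty in [("0.1", "l2"), ("1.0", "l2")]:
    model = train(x_tr, y_tr, "100", c, penalty)
    print(c, penalty, accuracy_score(y_te, model.predict(x_te)))
```

    Each Katib Trial effectively does the same thing: it invokes the pipeline with one HP combination from the feasible space and reads back the logged accuracy metric.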
  8. Run the script locally, using Kale’s marshalling mechanism, to test whether your code runs successfully:

    $ python3 -m kale kale_katib.py
  9. (Optional) Produce a workflow YAML file that you can inspect:

    $ python3 -m kale kale_katib.py --compile

    After this command completes successfully, look for the workflow YAML file inside a .kale directory in your working directory. You can upload this file and submit it to Kubeflow manually through the Kubeflow Pipelines User Interface (KFP UI).

  10. Deploy and run your code as a KFP pipeline:

    $ python3 -m kale kale_katib.py --kfp

    Note

    To see the complete list of arguments and their respective usage, run python3 -m kale --help.

Summary

You have successfully created a Katib experiment using the Kale SDK.

What’s Next

Check out the rest of the Kale user guides.