HP Tuning with the Kale SDK

This section will guide you through configuring and running a Katib experiment with the Kale SDK to tune the hyperparameters (HP) of your Machine Learning (ML) model.

What You’ll Need

  • An EKF or MiniKF deployment with the default Kale Docker image.
  • An understanding of how the Kale SDK works.

Procedure

  1. Create a new Notebook server using the default Kale Docker image. The image will have the following naming scheme:

    gcr.io/arrikto/jupyter-kale-py36:<IMAGE_TAG>
    

    Note

    The <IMAGE_TAG> varies based on the MiniKF or EKF release.

  2. Connect to the server, open a terminal, and install scikit-learn:

    $ pip3 install --user scikit-learn==0.23.0
    
  3. Create a new Python file and name it kale_katib.py:

    $ touch kale_katib.py
    
  4. Copy and paste the following code inside kale_katib.py:

    # Copyright © 2021 Arrikto Inc.  All Rights Reserved.
    
    """Kale SDK.
    
    This script trains an ML pipeline to solve a binary classification task.
    """
    
    from kale.sdk import has_metrics, pipeline, step
    from kale.sdk.logging import log_metric
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split
    
    
    @step(name="data_loading")
    def load(random_state):
        """Create a random dataset for binary classification."""
        rs = int(random_state)
        x, y = make_classification(random_state=rs)
        return x, y
    
    
    @step(name="data_split")
    def split(x, y):
        """Split the data into train and test sets."""
        x, x_test, y, y_test = train_test_split(x, y, test_size=0.1)
        return x, x_test, y, y_test
    
    
    @step(name="model_training")
    def train(x, x_test, y, training_iterations):
        """Train a Logistic Regression model."""
        iters = int(training_iterations)
        model = LogisticRegression(max_iter=iters)
        model.fit(x, y)
        return model
    
    
    @has_metrics
    @step(name="model_evaluation")
    def evaluate(model, x_test, y_test):
        """Evaluate the model on the test dataset."""
        y_pred = model.predict(x_test)
        accuracy = accuracy_score(y_test, y_pred)
        log_metric(name="accuracy", value=accuracy)
    
    
    @pipeline(name="binary-classification", experiment="kale-tutorial")
    def ml_pipeline(rs=42, iters=100):
        """Run the ML pipeline."""
        x, y = load(rs)
        x, x_test, y, y_test = split(x, y)
        model = train(x, x_test, y, iters)
        evaluate(model, x_test, y_test)
    
    
    if __name__ == "__main__":
        ml_pipeline(rs=42, iters=100)
    

    Alternatively, download the kale_katib_starter_code.py Python file.

    This code sample starts from a standard Python script that trains a Logistic Regression model, with its functions decorated using the Kale SDK. To read more about how to create this file, head to the corresponding KFP metrics user guide.
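
    As a minimal sketch, the same workflow written as plain scikit-learn code, without the Kale decorators, looks roughly like this. It is for illustration only and is not one of the guide's files:

    # Plain scikit-learn version of the decorated pipeline above
    # (illustration only, not part of kale_katib.py).
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    # Create a random dataset for binary classification.
    x, y = make_classification(random_state=42)
    # Hold out 10% of the samples for evaluation.
    x, x_test, y, y_test = train_test_split(x, y, test_size=0.1)
    # Train a Logistic Regression model and report its accuracy.
    model = LogisticRegression(max_iter=100)
    model.fit(x, y)
    print("accuracy:", accuracy_score(y_test, model.predict(x_test)))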

  5. Define the Katib experiment configuration using the Katib SDK. The following snippet summarizes the code changes:

    --- examples/metrics.py
    +++ examples/configuration.py
    @@ -11,6 +11,35 @@
     from sklearn.linear_model import LogisticRegression
     from sklearn.metrics import accuracy_score
     from sklearn.model_selection import train_test_split
    +from kubeflow import katib
    +
    +
    +# The Katib experiment definition.
    +katib_experiment = katib.V1beta1ExperimentSpec(
    +    max_trial_count=3,
    +    parallel_trial_count=1,
    +    max_failed_trial_count=0,
    +    algorithm=katib.V1beta1AlgorithmSpec(
    +        algorithm_name="grid"
    +    ),
    +    objective=katib.V1beta1ObjectiveSpec(
    +        type="maximize",
    +        objective_metric_name="accuracy"
    +    ),
    +    parameters=[katib.V1beta1ParameterSpec(
    +        name="c",
    +        parameter_type="double",
    +        feasible_space=katib.V1beta1FeasibleSpace(
    +            min="0.1",
    +            max="1.0",
    +            step="0.3"),
    +    ), katib.V1beta1ParameterSpec(
    +        name="penalty",
    +        parameter_type="categorical",
    +        feasible_space=katib.V1beta1FeasibleSpace(
    +            list=["l2", "none"])
    +    )],
    +)
     
     
     @step(name="data_loading")
    

    Copy the resulting code below or download the kale_katib_conf.py Python file.

    A Katib experiment configuration has four main sections:

    • Trials: The maximum number of Trials that Katib will start, how many of them will run in parallel, and how many failed Trials will cause the experiment to terminate.
    • Algorithm: The algorithm that you want to use for HP tuning (e.g., grid), specified as a katib.V1beta1AlgorithmSpec spec object.
    • Metric: The objective metric that you use as an end goal (e.g., accuracy), specified as a katib.V1beta1ObjectiveSpec spec object. Note that the name of the objective should match one of the metrics you log with the log_metric API. See the guide on how to produce KFP metrics for more details.
    • Hyperparameters: The names of the HPs you want to optimize, as a list of katib.V1beta1ParameterSpec spec objects. For each HP, you should specify the feasible space inside a katib.V1beta1FeasibleSpace spec object. This can take the form of a numerical range (e.g., see the HP named c in the above code snippet) or a list of possible values (e.g., see the HP named penalty in the above code snippet). The sketch after this list illustrates the search space that these two parameter specs define.
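
    To get a feel for the search space that the two parameter specs above define, the following rough sketch enumerates the candidate combinations in plain Python. It is an illustration only: the exact way the grid algorithm discretizes the c range and orders the Trials is up to Katib, and max_trial_count=3 caps the number of Trials that will actually run:

    import itertools

    # Candidate values for "c": a double in [0.1, 1.0] with step 0.3
    # (roughly 0.1, 0.4, 0.7, 1.0; the exact grid is decided by Katib).
    c_values = [0.1, 0.4, 0.7, 1.0]
    # Candidate values for "penalty": a categorical list.
    penalty_values = ["l2", "none"]

    # The full grid has len(c_values) * len(penalty_values) combinations,
    # but the experiment above stops after max_trial_count=3 Trials.
    for c, penalty in itertools.product(c_values, penalty_values):
        print("c={}, penalty={}".format(c, penalty))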

    Note

    The Jupyter Kale images in the EKF or MiniKF deployments already come with the Katib SDK installed.
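
    If you want to double-check this from the Notebook server, the following snippet simply imports the Katib SDK classes used in this guide (assuming the SDK is importable as kubeflow.katib, as in the code above):

    # Sanity check: these imports should succeed on the default Kale image.
    from kubeflow import katib

    print(katib.V1beta1ExperimentSpec)
    print(katib.V1beta1AlgorithmSpec)
    print(katib.V1beta1ObjectiveSpec)
    print(katib.V1beta1ParameterSpec)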

  6. Pass the Katib configuration as an argument to the pipeline-decorated function:

    --- examples/configuration.py
    +++ examples/experiment.py
    @@ -75,7 +75,8 @@
         log_metric(name="accuracy", value=accuracy)
     
     
    -@pipeline(name="binary-classification", experiment="kale-tutorial")
    +@pipeline(name="binary-classification", experiment="kale-tutorial",
    +          katib_experiment=katib_experiment)
     def ml_pipeline(rs=42, iters=100):
         """Run the ML pipeline."""
         x, y = load(rs)
    

    Copy the resulting code below or download the kale_katib_experiment.py Python file.

  7. Create a parameterized pipeline, which receives the HPs that you want to tune as inputs. To read more about creating parameterized pipelines with the Kale SDK, head to the relevant Kale SDK guide:

    --- examples/experiment.py
    +++ examples/katib.py
    @@ -58,10 +58,11 @@
     
     
     @step(name="model_training")
    -def train(x, x_test, y, training_iterations):
    +def train(x, x_test, y, training_iterations, c, penalty):
         """Train a Logistic Regression model."""
         iters = int(training_iterations)
    -    model = LogisticRegression(max_iter=iters)
    +    c = float(c)
    +    model = LogisticRegression(max_iter=iters, C=c, penalty=penalty)
         model.fit(x, y)
         return model
     
    @@ -77,13 +78,13 @@
     
     @pipeline(name="binary-classification", experiment="kale-tutorial",
               katib_experiment=katib_experiment)
    -def ml_pipeline(rs=42, iters=100):
    +def ml_pipeline(rs=42, c=1.0, penalty="l2", iters=100):
         """Run the ML pipeline."""
         x, y = load(rs)
         x, x_test, y, y_test = split(x, y)
    -    model = train(x, x_test, y, iters)
    +    model = train(x, x_test, y, iters, c, penalty)
         evaluate(model, x_test, y_test)
     
     
     if __name__ == "__main__":
    -    ml_pipeline(rs=42, iters=100)
    +    ml_pipeline(rs=42, c=1.0, penalty="l2", iters=100)
    

    Copy the resulting code below or download the kale_katib.py Python file.
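
    For reference, after applying the diff above, the training step and the pipeline function in kale_katib.py look as follows (this is an excerpt of the resulting file, not a standalone script):

    @step(name="model_training")
    def train(x, x_test, y, training_iterations, c, penalty):
        """Train a Logistic Regression model."""
        iters = int(training_iterations)
        c = float(c)
        model = LogisticRegression(max_iter=iters, C=c, penalty=penalty)
        model.fit(x, y)
        return model


    @pipeline(name="binary-classification", experiment="kale-tutorial",
              katib_experiment=katib_experiment)
    def ml_pipeline(rs=42, c=1.0, penalty="l2", iters=100):
        """Run the ML pipeline."""
        x, y = load(rs)
        x, x_test, y, y_test = split(x, y)
        model = train(x, x_test, y, iters, c, penalty)
        evaluate(model, x_test, y_test)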

  8. Run the script locally, using Kale's marshalling mechanism, to test whether your code runs successfully:

    $ python3 -m kale kale_katib.py
    
  9. (Optional) Produce a workflow YAML file that you can inspect:

    $ python3 -m kale kale_katib.py --compile
    

    After the command completes successfully, look for the workflow YAML file inside the .kale directory in your working directory. This is a file that you could manually upload and submit to Kubeflow through its user interface (KFP UI).
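
    As a small convenience sketch (assuming you run it from the same working directory), you can list whatever Kale wrote under .kale/ with a few lines of Python:

    # List the files that Kale compiled under the .kale directory.
    import glob

    for path in sorted(glob.glob(".kale/*")):
        print(path)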

  10. Deploy and run your code as a KFP pipeline:

    $ python3 -m kale kale_katib.py --kfp
    

    Note

    To see the complete list of arguments and their respective usage, run python3 -m kale --help.

Summary

You have successfully created a Katib experiment using the Kale SDK.

What’s Next

Check out the rest of the Kale user guides.