Upgrade Kubeflow
This section describes how to upgrade Kubeflow.
Fast Forward
If you have not deployed Kubeflow in your cluster, skip this section:
- Proceed to the What’s Next section.
What You’ll Need
- An upgraded management environment.
- An existing Kubeflow deployment.
- Your clone of the Arrikto GitOps repository.
- Arrikto manifests for EKF version 2.0.1.
Procedure
Go to your GitOps repository inside your rok-tools management environment:
    root@rok-tools:~# cd ~/ops/deployments
Upgrade your Spark Operator installation first:
    root@rok-tools:~/ops/deployments# rok-deploy \
    > --apply kubeflow/manifests/contrib/spark/spark-operator/overlays/deploy \
    > --force --force-kinds Deployment
Upgrade your Kubeflow installation:
    root@rok-tools:~/ops/deployments# rok-deploy --apply install/kubeflow
Remove the deprecated resources left by the previous version of Kubeflow:
    root@rok-tools:~/ops/deployments# rok-kf-prune --app kubeflow
Remove the deprecated resources left by the previous version of Knative:
    root@rok-tools:~/ops/deployments# rok-kf-prune --app knative
Migrate from KFServing to KServe.
Migrate any existing KFServing InferenceServices to KServe:
    root@rok-tools:~/ops/deployments# rok-kserve-migrate
Important
During the migration, KServe will use the default ServingRuntime of the specific framework as the backend of your new InferenceService. As such, your InferenceService may not become ready if:
- You were using a custom image to serve your model, for example, using a specific runtimeVersion in the InferenceService spec or changing the inferenceservice-config ConfigMap.
- Your model is not compatible with the new default ServingRuntime, for example, there is a mismatch between the version of the library used to build your model and the version of the library the default ServingRuntime is using.
In such cases, you need to:
- Create a new ServingRuntime with your custom image or an image compatible with the model you want to serve.
- Patch your InferenceService to use this ServingRuntime.
Note that the old Revision remains up and running until the new Revision becomes ready.
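The two steps above could look roughly like the following sketch. It assumes the KServe serving.kserve.io/v1alpha1 ServingRuntime API and an InferenceService that uses the spec.predictor.model form. The function names, runtime name, model format, namespace, and image are hypothetical placeholders, not values taken from your deployment, and a custom image may need additional container arguments that are not shown here.

```shell
# Sketch only: every name below is an illustrative placeholder.

render_serving_runtime() {
    # $1: ServingRuntime name, $2: model format name, $3: container image
    cat <<EOF
apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: $1
spec:
  supportedModelFormats:
    - name: $2
      autoSelect: false
  containers:
    - name: kserve-container
      image: $3
EOF
}

render_runtime_patch() {
    # $1: ServingRuntime name. Prints a JSON merge patch that points the
    # predictor at that runtime (assumes the spec.predictor.model form).
    printf '{"spec":{"predictor":{"model":{"runtime":"%s"}}}}\n' "$1"
}

# Example usage (hypothetical names; review the rendered output before applying):
#   render_serving_runtime my-sklearn-runtime sklearn registry.example.com/sklearnserver:custom \
#       | kubectl -n my-namespace apply -f -
#   kubectl -n my-namespace patch inferenceservices.serving.kserve.io my-model \
#       --type merge -p "$(render_runtime_patch my-sklearn-runtime)"
```

Rendering the manifest to standard output first lets you inspect it before it reaches the cluster. If your InferenceService still uses a framework-specific predictor field, the patch above does not apply as written and the spec needs to be adapted by hand.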
Verify that there are no KFServing inference services present:
    root@rok-tools:~/ops/deployments# kubectl get inferenceservices.serving.kubeflow.org -A
    No resources found
Delete the deprecated KFServing resources:
    root@rok-tools:~/ops/deployments# rok-deploy --delete kubeflow/manifests/apps/kfserving/upstream/overlays/deploy
Optional
KServe 0.8 supports path-based serving. If you have already exposed serving and want to switch from host-based to path-based serving, follow the corresponding Operations guide.
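If you prefer to preview the upgrade and prune commands of this procedure before running them, a small wrapper along these lines may help. This is only a sketch: DRY_RUN, run_step, and upgrade_kubeflow are illustrative names that are not part of rok-tools, and the commands themselves are copied from the steps above.

```shell
# Sketch only: DRY_RUN, run_step, and upgrade_kubeflow are illustrative names.
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the commands, 0 = execute them

run_step() {
    # Print the command, then execute it unless this is a dry run.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

upgrade_kubeflow() {
    # Same commands and order as the procedure; stops at the first failure.
    run_step cd "$HOME/ops/deployments" &&
    run_step rok-deploy \
        --apply kubeflow/manifests/contrib/spark/spark-operator/overlays/deploy \
        --force --force-kinds Deployment &&
    run_step rok-deploy --apply install/kubeflow &&
    run_step rok-kf-prune --app kubeflow &&
    run_step rok-kf-prune --app knative
}
```

Calling upgrade_kubeflow as is prints the five commands without running them. Calling it with DRY_RUN=0 executes them in order. The KFServing to KServe migration is left out on purpose, since it may need the manual follow-up described above.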
Verify
Verify that the Dex Pod is up and running. Check the Pod status and verify that field STATUS is Running and field READY is 2/2:
    root@rok-tools:~# kubectl -n auth get pods
    NAME    READY   STATUS    RESTARTS   AGE
    dex-0   2/2     Running   3          1m
Verify that the Pods in the cert-manager namespace are up and running. Check the Pod status and verify that field STATUS is Running and field READY is 1/1 for all Pods:
    root@rok-tools:~# kubectl -n cert-manager get pods
    NAME                                       READY   STATUS    RESTARTS   AGE
    cert-manager-6d86476c77-qwgnj              1/1     Running   0          1m
    cert-manager-cainjector-5b9cd446fd-kl9gg   1/1     Running   0          1m
    cert-manager-webhook-64d967c45-jmxcz       1/1     Running   0          1m
Verify that the Pods in the istio-system namespace are up and running. Check the Pod status and verify that field STATUS is Running and field READY is 1/1 for all Pods:
    root@rok-tools:~# kubectl -n istio-system get pods
    NAME                                    READY   STATUS    RESTARTS   AGE
    authservice-0                           1/1     Running   0          1m
    istio-ingressgateway-57f58bf544-x45kw   1/1     Running   0          1m
    istiod-68f6c899f5-wzjfm                 1/1     Running   0          1m
Verify that the Pods in the knative-monitoring namespace are up and running. Check the Pod status and verify that field STATUS is Running and field READY is N/N for all Pods:
    root@rok-tools:~# kubectl -n knative-monitoring get pods
    NAME                                  READY   STATUS    RESTARTS   AGE
    grafana-6695587d6f-ktf86              1/1     Running   0          1m
    kube-state-metrics-79ddb7fc64-w7s5m   1/1     Running   0          1m
    node-exporter-xlj2v                   2/2     Running   0          1m
    node-exporter-zfjh5                   2/2     Running   0          1m
    prometheus-system-0                   1/1     Running   0          1m
    prometheus-system-1                   1/1     Running   0          1m
Verify that the Pods in the knative-serving namespace are up and running. Check the Pod status and verify that field STATUS is Running and field READY is 2/2 for all Pods:
    root@rok-tools:~# kubectl -n knative-serving get pods
    NAME                                READY   STATUS    RESTARTS   AGE
    activator-5d6754bc67-qb2ct          2/2     Running   0          1m
    autoscaler-6dd6dbbb84-zgwkf         2/2     Running   0          1m
    controller-687f6c6995-27fkw         2/2     Running   0          1m
    istio-webhook-8d4f5fbfb-tg6h4       2/2     Running   0          1m
    networking-istio-785675596f-nnqbr   2/2     Running   0          1m
    webhook-6d776d968c-gmnbz            2/2     Running   0          1m
Verify that the Pods in the kubeflow namespace are up and running. Check the Pod status and verify that field STATUS is Running and field READY is N/N for all Pods:
    root@rok-tools:~# kubectl -n kubeflow get pods
    NAME                                                         READY   STATUS    RESTARTS   AGE
    admission-webhook-deployment-5d4cf6bbdb-jszsw                2/2     Running   0          1m
    cache-server-68ffc8d4ff-ltl8q                                2/2     Running   0          1m
    centraldashboard-fd8774874-56587                             2/2     Running   0          1m
    jupyter-web-app-deployment-7987d45c7d-5gwss                  2/2     Running   0          1m
    katib-controller-54f895f874-g29bx                            2/2     Running   2          1m
    katib-db-manager-6f5d8f5945-wmmnb                            2/2     Running   1          1m
    katib-mysql-857bfdb7f9-w5zj8                                 2/2     Running   0          1m
    katib-ui-696fc69ddc-jkk2x                                    2/2     Running   2          1m
    kfp-cache-d96f57c8b-5cjht                                    3/3     Running   4          1m
    kfserving-controller-manager-0                               3/3     Running   1          1m
    kfserving-models-web-app-77cc4c8dd6-86v92                    2/2     Running   0          1m
    kubeflow-reception-9c67996fc-46djf                           2/2     Running   1          1m
    metadata-db-d48d67699-89fg9                                  2/2     Running   0          1m
    metadata-envoy-deployment-775b466c45-4gbkx                   1/1     Running   0          1m
    metadata-grpc-deployment-5c975cb96d-tq5vr                    2/2     Running   4          1m
    minio-7c9b6578cd-7f2tb                                       2/2     Running   0          1m
    ml-pipeline-7867b5b879-dgmnj                                 2/2     Running   0          1m
    ml-pipeline-persistenceagent-8495768cbb-vpfjt                2/2     Running   0          1m
    ml-pipeline-scheduledworkflow-7f58d84f9f-4pf7d               2/2     Running   0          1m
    ml-pipeline-ui-678cb55d6f-z9spc                              2/2     Running   0          1m
    ml-pipeline-viewer-crd-57768dc6c6-wtxjm                      2/2     Running   1          1m
    ml-pipeline-visualizationserver-68498d6df6-ms74w             2/2     Running   0          1m
    mysql-55d57856d7-bzvgd                                       2/2     Running   0          1m
    notebook-controller-deployment-6cf9974cd9-2p9mj              2/2     Running   1          1m
    profiles-deployment-64cf74dfd4-b6dx2                         3/3     Running   1          1m
    pvcviewer-controller-controller-manager-6dd55d9dfd-m5j8s     3/3     Running   1          1m
    spark-operatorsparkoperator-5775c699bb-4xgn2                 2/2     Running   0          1m
    tensorboard-controller-controller-manager-7f766c8676-8g6fq   3/3     Running   2          1m
    tensorboards-web-app-deployment-6b4dfd598c-r9xgk             1/1     Running   0          1m
    training-operator-747f797684-f6jhd                           2/2     Running   0          1m
    volumes-web-app-deployment-7b58b4c478-btfmw                  2/2     Running   0          1m
    workflow-controller-76579565dd-8f6vw                         2/2     Running   1          1m
Verify that the Pods in the kserve namespace are up and running. Check the Pod status and verify that field STATUS is Running and field READY is N/N for all Pods:
    root@rok-tools:~# kubectl -n kserve get pods
    NAME                          READY   STATUS    RESTARTS   AGE
    kserve-controller-manager-0   3/3     Running   1          1m
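Reading the READY and STATUS columns by eye gets tedious for the larger namespaces. As a rough aid, the sketch below flags every Pod whose READY column is not N/N or whose STATUS is not Running. The function names pods_not_ready and verify_namespaces are illustrative and not part of rok-tools. The sketch assumes the default kubectl get pods column layout shown above.

```shell
# Sketch only: pods_not_ready and verify_namespaces are illustrative names.

pods_not_ready() {
    # Reads "kubectl get pods" output on stdin and prints the name of every
    # Pod whose READY column is not N/N or whose STATUS is not Running.
    awk '
        NF == 0 { next }                  # ignore blank lines
        $1 == "NAME" { next }             # ignore the header row
        {
            split($2, ready, "/")         # READY looks like "2/2"
            if (ready[1] != ready[2] || $3 != "Running")
                print $1
        }
    '
}

verify_namespaces() {
    # Checks the namespaces listed in this section; returns 1 if any Pod is flagged.
    status=0
    for ns in auth cert-manager istio-system knative-monitoring \
              knative-serving kubeflow kserve; do
        bad="$(kubectl -n "$ns" get pods | pods_not_ready)"
        if [ -n "$bad" ]; then
            echo "Not ready in $ns:"
            echo "$bad"
            status=1
        fi
    done
    return "$status"
}
```

Running verify_namespaces prints nothing when every Pod passes. Note that this simple check also flags Pods of finished Jobs, whose STATUS is Completed, so treat its output as a prompt to look closer rather than as a definitive failure.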
Summary
You have successfully upgraded Kubeflow.