Install Kubeflow¶
This section will guide you through installing Kubeflow alongside Rok, using the rok-deploy tool.
Fast Forward
If you have already deployed Kubeflow, proceed to the Verify section.
Choose one of the following options to install Kubeflow: automatically (Option 1, preferred) or manually (Option 2).
What You’ll Need¶
- A configured management environment.
- Your clone of the Arrikto GitOps repository.
- An existing Kubernetes cluster.
- A configured Rok user.
Option 1: Install Kubeflow Automatically (preferred)¶
Choose one of the following options, based on your cloud provider.
Install Kubeflow by following the on-screen instructions on the rok-deploy user interface.
If rok-deploy is not already running, start it with:

Proceed to the Summary section.
Option 2: Install Kubeflow Manually¶
If you want to install Kubeflow manually, follow the instructions below.
Procedure¶
Go to your GitOps repository, inside your rok-tools management environment:
    root@rok-tools:~# cd ~/ops/deployments
Deploy Kubeflow:
    root@rok-tools:~/ops/deployments# rok-deploy --apply install/kubeflow
Troubleshooting
Cannot create resources in user namespaces
If you previously uninstalled Kubeflow and are now reinstalling it by re-applying your existing manifests, the namespace resources may fail to apply because the user namespaces do not exist yet. In this case, follow the steps below to apply the profiles before installing Kubeflow, so that the user namespaces are created during the deployment.
Go to your GitOps repository, inside your rok-tools management environment:
    root@rok-tools:~# cd ~/ops/deployments
Apply the Profile CRD:
    root@rok-tools:~/ops/deployments# rok-deploy --apply \
    >     kubeflow/manifests/apps/profiles/upstream/crd/
Create all user profiles:
    root@rok-tools:~/ops/deployments# find kubeflow/manifests/common/namespace-resources/profiles/*.yaml \
    >     | xargs -n1 kubectl apply -f
Deploy Kubeflow:
    root@rok-tools:~/ops/deployments# rok-deploy --apply install/kubeflow
Air Gapped
Follow the Use Mirrored Kale Python Image guide to configure Jupyter Web App and enable the PodDefault for the mirrored Kale Python image.
Configure the Argo workflow executor, if necessary. Choose one of the following options, based on your cloud provider:
- Skip this step. The executor that EKF configures by default is compatible with AWS.
- Follow the Configure Argo Workflow Executor guide to set the executor to PNS. Then, come back to this guide and follow the rest of the procedure.
- Skip this step. The executor that EKF configures by default is compatible with Google Cloud.
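The "Create all user profiles" troubleshooting step pipes find into xargs -n1, which runs one kubectl apply -f command per profile manifest. The following is a minimal sketch for previewing those commands before touching the cluster. The function name preview_apply and the use of echo as a dry run are illustrative assumptions, not part of rok-deploy or kubectl, and the sketch assumes file names without whitespace.

```shell
# preview_apply DIR
# Print the "kubectl apply -f FILE" command that xargs -n1 would run for each
# YAML file directly under DIR. Nothing is sent to the cluster, because the
# command handed to xargs is echo (hypothetical dry-run helper).
preview_apply() {
    dir=$1
    # The shell expands the glob; find prints each matching path on its own
    # line; xargs -n1 then runs the command once per path, appending the path
    # after "-f".
    find "$dir"/*.yaml | xargs -n1 echo kubectl apply -f
}
```

For example, running preview_apply kubeflow/manifests/common/namespace-resources/profiles from the GitOps repository should list one command per profile file. Removing echo from the pipeline gives the command used in the troubleshooting step.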
Verify¶
Verify that the Dex pod is up-and-running. Check the pod status and verify field STATUS is Running and field READY is 2/2:
    root@rok-tools:~# kubectl -n auth get pods
    NAME                   READY   STATUS    RESTARTS   AGE
    dex-57c98bb9bb-l466d   2/2     Running   3          17m
Verify that the pods in the cert-manager namespace are up-and-running. Check the pod status and verify field STATUS is Running and field READY is 1/1:
    root@rok-tools:~# kubectl -n cert-manager get pods
    NAME                                       READY   STATUS    RESTARTS   AGE
    cert-manager-6d86476c77-qwgnj              1/1     Running   0          16m
    cert-manager-cainjector-5b9cd446fd-kl9gg   1/1     Running   0          16m
    cert-manager-webhook-64d967c45-jmxcz       1/1     Running   0          16m
Verify that the pods in the istio-system namespace are up-and-running. Check the pod status and verify field STATUS is Running and field READY is 1/1:
    root@rok-tools:~# kubectl -n istio-system get pods
    NAME                                    READY   STATUS    RESTARTS   AGE
    authservice-0                           1/1     Running   0          17m
    cluster-local-gateway-b76ff5885-2rjg5   1/1     Running   0          2m23s
    istio-ingressgateway-57f58bf544-x45kw   1/1     Running   0          19m
    istiod-68f6c899f5-wzjfm                 1/1     Running   0          19m
Verify that the pods in the knative-monitoring namespace are up-and-running. Check the pod status and verify field STATUS is Running and field READY is N/N:
    root@rok-tools:~# kubectl -n knative-monitoring get pods
    NAME                                  READY   STATUS    RESTARTS   AGE
    grafana-6695587d6f-ktf86              1/1     Running   0          2m41s
    kube-state-metrics-79ddb7fc64-w7s5m   1/1     Running   0          2m38s
    node-exporter-xlj2v                   2/2     Running   0          2m3s
    node-exporter-zfjh5                   2/2     Running   0          2m3s
    prometheus-system-0                   1/1     Running   0          2m3s
    prometheus-system-1                   1/1     Running   0          2m3s
Verify that the pods in the knative-serving namespace are up-and-running. Check the pod status and verify field STATUS is Running and field READY is 2/2:
    root@rok-tools:~# kubectl -n knative-serving get pods
    NAME                                READY   STATUS    RESTARTS   AGE
    activator-5d6754bc67-qb2ct          2/2     Running   0          2m47s
    autoscaler-6dd6dbbb84-zgwkf         2/2     Running   0          2m46s
    controller-687f6c6995-27fkw         2/2     Running   0          2m42s
    istio-webhook-8d4f5fbfb-tg6h4       2/2     Running   0          2m40s
    networking-istio-785675596f-nnqbr   2/2     Running   0          2m43s
    webhook-6d776d968c-gmnbz            2/2     Running   0          2m43s
Verify that the pods in the kubeflow namespace are up-and-running. Check the pod status and verify field STATUS is Running and field READY is N/N:
    root@rok-tools:~# kubectl -n kubeflow get pods
    NAME                                                        READY   STATUS    RESTARTS   AGE
    admission-webhook-deployment-5d4cf6bbdb-jszsw               2/2     Running   0          16m
    centraldashboard-fd8774874-56587                            2/2     Running   0          2m42s
    jupyter-web-app-deployment-7987d45c7d-5gwss                 2/2     Running   0          2m42s
    katib-controller-54f895f874-g29bx                           2/2     Running   2          2m41s
    katib-db-manager-6f5d8f5945-wmmnb                           2/2     Running   1          2m48s
    katib-mysql-857bfdb7f9-w5zj8                                2/2     Running   0          2m39s
    katib-ui-696fc69ddc-jkk2x                                   2/2     Running   2          2m38s
    kfp-cache-d96f57c8b-5cjht                                   3/3     Running   4          2m46s
    kfserving-controller-manager-0                              3/3     Running   1          2m20s
    kubeflow-reception-9c67996fc-46djf                          2/2     Running   1          15m
    metadata-db-d48d67699-89fg9                                 2/2     Running   0          2m44s
    metadata-envoy-deployment-775b466c45-4gbkx                  1/1     Running   0          2m38s
    metadata-grpc-deployment-5c975cb96d-tq5vr                   2/2     Running   4          2m37s
    minio-7c9b6578cd-7f2tb                                      2/2     Running   0          2m35s
    ml-pipeline-7867b5b879-dgmnj                                2/2     Running   0          2m41s
    ml-pipeline-persistenceagent-8495768cbb-vpfjt               2/2     Running   0          2m33s
    ml-pipeline-scheduledworkflow-7f58d84f9f-4pf7d              2/2     Running   0          2m37s
    ml-pipeline-ui-678cb55d6f-z9spc                             2/2     Running   0          2m32s
    ml-pipeline-viewer-crd-57768dc6c6-wtxjm                     2/2     Running   1          2m30s
    ml-pipeline-visualizationserver-68498d6df6-ms74w            2/2     Running   0          2m28s
    models-web-app-748f8776df-zrc66                             2/2     Running   0          2m34s
    mpi-operator-f658c675b-6jrln                                1/1     Running   0          2m34s
    mxnet-operator-6594fb56b-q68pp                              1/1     Running   0          2m25s
    mysql-55d57856d7-bzvgd                                      2/2     Running   0          2m25s
    notebook-controller-deployment-6cf9974cd9-2p9mj             2/2     Running   1          2m25s
    profiles-deployment-64cf74dfd4-b6dx2                        3/3     Running   1          15m
    pvcviewer-controller-controller-manager-6dd55d9dfd-m5j8s    3/3     Running   1          2m23s
    pytorch-operator-74788b9d8c-prdsb                           2/2     Running   0          2m29s
    spark-operatorsparkoperator-5775c699bb-4xgn2                2/2     Running   0          2m27s
    tensorboard-controller-controller-manager-7f766c8676-8g6fq  3/3     Running   2          2m22s
    tensorboards-web-app-deployment-6b4dfd598c-r9xgk            1/1     Running   0          2m25s
    tf-job-operator-d8b96567b-qj48v                             2/2     Running   1          2m22s
    volumes-web-app-deployment-7b58b4c478-btfmw                 2/2     Running   0          2m24s
    workflow-controller-76579565dd-8f6vw                        2/2     Running   1          2m22s
    xgboost-operator-deployment-7dcff8bf85-t9hvr                2/2     Running   1          2m22s
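The checks above all apply the same rule: every pod must report STATUS Running and a READY count of N/N. As a rough sketch, that rule could be automated by parsing the table that kubectl get pods prints. The helper names check_pods and verify_all are hypothetical and not part of kubectl or rok-deploy; the namespace list is taken from the steps above.

```shell
# check_pods
# Read "kubectl get pods" table output on stdin and print every pod that is
# not Running or whose READY count is incomplete (for example 1/2).
# Returns 0 when all pods pass, 1 otherwise. Hypothetical helper.
check_pods() {
    awk '
        NR == 1 && $1 == "NAME" { next }   # skip the header row
        NF < 3 { next }                     # skip blank or malformed lines
        {
            split($2, ready, "/")           # READY column looks like "2/2"
            if ($3 != "Running" || ready[1] != ready[2]) {
                printf "NOT READY: %s (READY %s, STATUS %s)\n", $1, $2, $3
                bad = 1
            }
        }
        END { exit (bad ? 1 : 0) }
    '
}

# verify_all
# Run check_pods against each namespace verified in this section.
# Returns 1 if any namespace has a pod that fails the check.
verify_all() {
    rc=0
    for ns in auth cert-manager istio-system knative-monitoring \
              knative-serving kubeflow; do
        echo "== $ns"
        kubectl -n "$ns" get pods | check_pods || rc=1
    done
    return "$rc"
}
```

A single namespace can be checked with kubectl -n kubeflow get pods | check_pods. Note that this sketch flags any pod whose STATUS is not Running, so it would also report pods in other states, such as Completed, that may be harmless.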
What’s Next¶
The next step is to integrate Rok with the Kubeflow dashboard.