Install Kubeflow
This section will guide you through installing Kubeflow alongside Rok, using the rok-deploy tool.
Fast Forward
If you have already deployed Kubeflow, follow these steps to fast-forward.
Go to your GitOps repository, inside your rok-tools management environment:
    root@rok-tools:~# cd ~/ops/deployments
Save your state:
    root@rok-tools:~/ops/deployments# rok-j2 deploy/env.kubeflow-deploy.j2 \
    > -o deploy/env.kubeflow-deploy
Commit your changes:
    root@rok-tools:~/ops/deployments# git commit -am "Install Kubeflow"
Proceed to the Verify section.
Choose one of the following options to install Kubeflow:
- Option 1: Install Kubeflow Automatically (preferred)
- Option 2: Install Kubeflow Manually
What You’ll Need
- A configured management environment.
- Your clone of the Arrikto GitOps repository.
- An existing Kubernetes cluster.
- A configured Rok user.
Option 1: Install Kubeflow Automatically (preferred)
Choose one of the following options, based on your cloud provider.
Install Kubeflow by following the on-screen instructions on the rok-deploy user interface.
If rok-deploy is not already running, start it with:

Proceed to the Summary section.
Option 2: Install Kubeflow Manually
If you want to install Kubeflow manually, follow the instructions below.
Procedure
Go to your GitOps repository, inside your rok-tools management environment:
    root@rok-tools:~# cd ~/ops/deployments
Deploy Kubeflow:
    root@rok-tools:~/ops/deployments# rok-deploy --apply install/kubeflow
Troubleshooting
Cannot create resources in user namespaces
If you have previously uninstalled Kubeflow and are re-applying your existing manifests to reinstall it, the namespace resources may fail to apply because the user namespaces do not exist yet. In this case, follow the steps below to apply the profiles before installing Kubeflow, so that the namespaces are created during the deployment.
Go to your GitOps repository, inside your rok-tools management environment:
    root@rok-tools:~# cd ~/ops/deployments
Apply the Profile CRD:
    root@rok-tools:~/ops/deployments# rok-deploy --apply \
    > kubeflow/manifests/apps/profiles/upstream/crd/
Create all user profiles:
    root@rok-tools:~/ops/deployments# find kubeflow/manifests/common/namespace-resources/profiles/*.yaml \
    > | xargs -n1 kubectl apply -f
Deploy Kubeflow:
    root@rok-tools:~/ops/deployments# rok-deploy --apply install/kubeflow
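The "Create all user profiles" step in the troubleshooting box above pipes every profile manifest to kubectl, with xargs -n1 running one kubectl apply per file. The following is a minimal sketch for previewing those commands before running them. The function name preview_profile_apply is hypothetical and not part of rok-tools, and the function only echoes the commands instead of applying anything.

```shell
# Hypothetical dry-run helper (not part of rok-tools): print the
# "kubectl apply" command that would run for each *.yaml manifest in
# the given directory, one command per file, without executing it.
preview_profile_apply() {
    dir=$1
    # Same pipeline shape as the troubleshooting step, with "echo"
    # placed in front of kubectl and a sort for stable ordering.
    find "$dir"/*.yaml | sort | xargs -n1 echo kubectl apply -f
}

# Example (run from ~/ops/deployments):
#   preview_profile_apply kubeflow/manifests/common/namespace-resources/profiles
```

Each printed line corresponds to one invocation of kubectl. Removing the echo yields the command used in the step itself.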
Save your state:
    root@rok-tools:~/ops/deployments# rok-j2 deploy/env.kubeflow-deploy.j2 \
    > -o deploy/env.kubeflow-deploy
Commit your changes:
    root@rok-tools:~/ops/deployments# git commit -am "Install Kubeflow"
Mark your progress:
    root@rok-tools:~/ops/deployments# export DATE=$(date -u "+%Y-%m-%dT%H.%M.%SZ")
    root@rok-tools:~/ops/deployments# git tag \
    > -a deploy/${DATE?}/release-2.0/kubeflow-deploy \
    > -m "Install Kubeflow"
Air Gapped
Follow the Use Mirrored Kale Python Image guide to configure Jupyter Web App and enable the PodDefault for the mirrored Kale Python image.
Configure the Argo workflow executor, if necessary. Choose one of the following options, based on your cloud provider:
- Skip this step. The executor that EKF configures by default is compatible with AWS.
- Follow the Configure Argo Workflow Executor guide to set the executor to PNS. Then, come back to this guide and follow the rest of the procedure.
- Skip this step. The executor that EKF configures by default is compatible with Google Cloud.
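The tag created in the "Mark your progress" step of the procedure above embeds a UTC timestamp whose time fields are separated by dots. Git ref names cannot contain a colon, which is presumably why dots are used instead. The following is a minimal sketch of how such a tag name is assembled. The function name deploy_tag_name is hypothetical, and the release and step values passed to it are taken from the procedure.

```shell
# Hypothetical helper (not part of rok-tools): print a progress tag
# name of the form deploy/<UTC timestamp>/<release>/<step>.
deploy_tag_name() {
    release=$1
    step=$2
    # Same date format as the procedure: dots, not colons, in the time.
    stamp=$(date -u "+%Y-%m-%dT%H.%M.%SZ")
    printf 'deploy/%s/%s/%s\n' "$stamp" "$release" "$step"
}

# Example:
#   git tag -a "$(deploy_tag_name release-2.0 kubeflow-deploy)" -m "Install Kubeflow"
#   git tag -l 'deploy/*'    # list the progress tags recorded so far
```

Because the timestamp sorts lexicographically, listing the tags shows the deployment steps in the order they were performed.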
Verify
Verify that the Dex pod is up-and-running. Check the pod status and verify field STATUS is Running and field READY is 2/2:
    root@rok-tools:~# kubectl -n auth get pods
    NAME    READY   STATUS    RESTARTS   AGE
    dex-0   2/2     Running   3          17m
Verify that the pods in the cert-manager namespace are up-and-running. Check the pod status and verify field STATUS is Running and field READY is 1/1:
    root@rok-tools:~# kubectl -n cert-manager get pods
    NAME                                       READY   STATUS    RESTARTS   AGE
    cert-manager-6d86476c77-qwgnj              1/1     Running   0          16m
    cert-manager-cainjector-5b9cd446fd-kl9gg   1/1     Running   0          16m
    cert-manager-webhook-64d967c45-jmxcz       1/1     Running   0          16m
Verify that the pods in the istio-system namespace are up-and-running. Check the pod status and verify field STATUS is Running and field READY is 1/1:
    root@rok-tools:~# kubectl -n istio-system get pods
    NAME                                    READY   STATUS    RESTARTS   AGE
    authservice-0                           1/1     Running   0          17m
    istio-ingressgateway-57f58bf544-x45kw   1/1     Running   0          19m
    istiod-68f6c899f5-wzjfm                 1/1     Running   0          19m
Verify that the pods in the knative-monitoring namespace are up-and-running. Check the pod status and verify field STATUS is Running and field READY is N/N:
    root@rok-tools:~# kubectl -n knative-monitoring get pods
    NAME                                  READY   STATUS    RESTARTS   AGE
    grafana-6695587d6f-ktf86              1/1     Running   0          2m41s
    kube-state-metrics-79ddb7fc64-w7s5m   1/1     Running   0          2m38s
    node-exporter-xlj2v                   2/2     Running   0          2m3s
    node-exporter-zfjh5                   2/2     Running   0          2m3s
    prometheus-system-0                   1/1     Running   0          2m3s
    prometheus-system-1                   1/1     Running   0          2m3s
Verify that the pods in the knative-serving namespace are up-and-running. Check the pod status and verify field STATUS is Running and field READY is N/N:
    root@rok-tools:~# kubectl -n knative-serving get pods
    NAME                                                      READY   STATUS    RESTARTS   AGE
    activator-56776cbc47-2l6vl                                2/2     Running   0          24h
    autoscaler-fc7884b85-ncv8w                                2/2     Running   0          24h
    controller-6fb67d5db5-zdwzn                               2/2     Running   0          24h
    domain-mapping-5d56bfc7d-dqjq4                            2/2     Running   0          39m
    domainmapping-webhook-75d7c89dcb-6dtpm                    2/2     Running   0          24h
    knative-serving-cluster-ingressgateway-7d45565bd6-z7prd   1/1     Running   0          24h
    knative-serving-ingressgateway-764d54dbc7-hkcjc           1/1     Running   0          24h
    net-istio-controller-6c6b88bc9-cfn8c                      2/2     Running   0          39m
    net-istio-webhook-5b57488bb6-n7pw8                        2/2     Running   0          39m
    webhook-74bbb5c8d5-577gx                                  2/2     Running   0          39m
Verify that the pods in the kubeflow namespace are up-and-running. Check the pod status and verify field STATUS is Running and field READY is N/N:
    root@rok-tools:~# kubectl -n kubeflow get pods
    NAME                                                         READY   STATUS    RESTARTS   AGE
    admission-webhook-deployment-5d4cf6bbdb-jszsw                2/2     Running   0          16m
    cache-server-68ffc8d4ff-ltl8q                                2/2     Running   0          1m
    centraldashboard-fd8774874-56587                             2/2     Running   0          2m42s
    jupyter-web-app-deployment-7987d45c7d-5gwss                  2/2     Running   0          2m42s
    katib-controller-54f895f874-g29bx                            2/2     Running   2          2m41s
    katib-db-manager-6f5d8f5945-wmmnb                            2/2     Running   1          2m48s
    katib-mysql-857bfdb7f9-w5zj8                                 2/2     Running   0          2m39s
    katib-ui-696fc69ddc-jkk2x                                    2/2     Running   2          2m38s
    kfp-cache-d96f57c8b-5cjht                                    3/3     Running   4          2m46s
    kfserving-controller-manager-0                               3/3     Running   1          2m20s
    kfserving-models-web-app-77cc4c8dd6-86v92                    2/2     Running   0          1m
    kubeflow-reception-9c67996fc-46djf                           2/2     Running   1          15m
    metadata-db-d48d67699-89fg9                                  2/2     Running   0          2m44s
    metadata-envoy-deployment-775b466c45-4gbkx                   1/1     Running   0          2m38s
    metadata-grpc-deployment-5c975cb96d-tq5vr                    2/2     Running   4          2m37s
    minio-7c9b6578cd-7f2tb                                       2/2     Running   0          2m35s
    ml-pipeline-7867b5b879-dgmnj                                 2/2     Running   0          2m41s
    ml-pipeline-persistenceagent-8495768cbb-vpfjt                2/2     Running   0          2m33s
    ml-pipeline-scheduledworkflow-7f58d84f9f-4pf7d               2/2     Running   0          2m37s
    ml-pipeline-ui-678cb55d6f-z9spc                              2/2     Running   0          2m32s
    ml-pipeline-viewer-crd-57768dc6c6-wtxjm                      2/2     Running   1          2m30s
    ml-pipeline-visualizationserver-68498d6df6-ms74w             2/2     Running   0          2m28s
    mysql-55d57856d7-bzvgd                                       2/2     Running   0          2m25s
    notebook-controller-deployment-6cf9974cd9-2p9mj              2/2     Running   1          2m25s
    profiles-deployment-64cf74dfd4-b6dx2                         3/3     Running   1          15m
    pvcviewer-controller-controller-manager-6dd55d9dfd-m5j8s     3/3     Running   1          2m23s
    spark-operatorsparkoperator-5775c699bb-4xgn2                 2/2     Running   0          2m27s
    tensorboard-controller-controller-manager-7f766c8676-8g6fq   3/3     Running   2          2m22s
    tensorboards-web-app-deployment-6b4dfd598c-r9xgk             1/1     Running   0          2m25s
    training-operator-747f797684-f6jhd                           2/2     Running   0          1m
    volumes-web-app-deployment-7b58b4c478-btfmw                  2/2     Running   0          2m24s
    workflow-controller-76579565dd-8f6vw                         2/2     Running   1          2m22s
Verify that the pods in the kyverno namespace are up-and-running. Check the pod status and verify field STATUS is Running and field READY is 1/1:
    root@rok-tools:~# kubectl -n kyverno get pods
    NAME                      READY   STATUS    RESTARTS   AGE
    kyverno-dd4fcd768-x955r   1/1     Running   0          2m47s
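The per-namespace checks above all follow the same rule: every pod must have STATUS Running and a READY value of the form N/N. The following is a minimal sketch that applies this rule to the output of kubectl get pods. It assumes the default table layout shown above (NAME, READY, STATUS as the first three columns). The function name pods_not_ready is hypothetical and not part of rok-tools.

```shell
# Hypothetical helper (not part of rok-tools): read the table printed by
# "kubectl get pods" on stdin and print every pod whose STATUS is not
# Running or whose READY column is not N/N. The exit status is non-zero
# if at least one such pod is found.
pods_not_ready() {
    awk '
        BEGIN { bad = 0 }
        $1 == "NAME" || NF < 3 { next }   # skip the header row and blank lines
        {
            split($2, ready, "/")         # READY column, e.g. "2/2"
            if ($3 != "Running" || ready[1] != ready[2]) {
                print $1, $2, $3
                bad = 1
            }
        }
        END { exit bad }
    '
}

# Example, using the namespaces checked in this section:
#   for ns in auth cert-manager istio-system knative-monitoring \
#             knative-serving kubeflow kyverno; do
#       kubectl -n "$ns" get pods | pods_not_ready \
#           || echo "namespace $ns has pods that are not ready"
#   done
```

Pods may take a few minutes to become ready after the deployment, so a pod reported here is a prompt to look closer rather than proof of a failed installation.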
What’s Next
The next step is to integrate Rok with the Kubeflow dashboard.