Install Kubeflow

This section will guide you through installing Kubeflow 1.5 alongside Rok, using the rok-deploy tool.

Choose one of the following options to install Kubeflow:

  Option 1: Install Kubeflow Automatically (preferred)
  Option 2: Install Kubeflow Manually

What You’ll Need

Option 1: Install Kubeflow Automatically (preferred)

Choose one of the following options, based on your platform.

On AWS, install Kubeflow by following the on-screen instructions on the rok-deploy user interface.

If rok-deploy is not already running, start it with:

root@rok-tools:~# rok-deploy --run-from kubeflow-deploy

Proceed to the Summary section.

Rok does not currently support automatic deployment on Azure. Follow the instructions in the Option 2: Install Kubeflow Manually section to install Kubeflow manually.

Rok does not currently support automatic deployment on Google Cloud. Follow the instructions in the Option 2: Install Kubeflow Manually section to install Kubeflow manually.

Rok does not currently support automatic deployment on premises. Follow the instructions in the Option 2: Install Kubeflow Manually section to install Kubeflow manually.

Option 2: Install Kubeflow Manually

If you want to install Kubeflow manually, follow the instructions below.

Procedure

  1. Go to your GitOps repository, inside your rok-tools management environment:

    root@rok-tools:~# cd ~/ops/deployments
  2. Deploy Kubeflow:

    root@rok-tools:~/ops/deployments# rok-deploy --apply install/kubeflow

    Troubleshooting

    Cannot create resources in user namespaces

    If you have previously uninstalled Kubeflow and are re-applying your existing manifests to reinstall it, the namespace resources may fail to apply because the user namespaces do not exist yet. In this case, follow the steps below to apply the profiles before installing Kubeflow, so that the user namespaces are created during the deployment.

    1. Go to your GitOps repository, inside your rok-tools management environment:

      root@rok-tools:~# cd ~/ops/deployments
    2. Apply the Profile CRD:

      root@rok-tools:~/ops/deployments# rok-deploy --apply \
      >     kubeflow/manifests/apps/profiles/upstream/crd/
    3. Create all user profiles:

      root@rok-tools:~/ops/deployments# find kubeflow/manifests/common/namespace-resources/profiles/*.yaml \
      >     | xargs -n1 kubectl apply -f
    4. Deploy Kubeflow:

      root@rok-tools:~/ops/deployments# rok-deploy --apply install/kubeflow
  3. Save your state:

    root@rok-tools:~/ops/deployments# rok-j2 deploy/env.kubeflow-deploy.j2 \
    >     -o deploy/env.kubeflow-deploy
  4. Commit your changes:

    root@rok-tools:~/ops/deployments# git commit -am "Install Kubeflow"
  5. Mark your progress:

    root@rok-tools:~/ops/deployments# export DATE=$(date -u "+%Y-%m-%dT%H.%M.%SZ")
    root@rok-tools:~/ops/deployments# git tag \
    >     -a deploy/${DATE?}/develop/kubeflow-deploy \
    >     -m "Install Kubeflow"
  6. Configure the Argo workflow executor, if necessary. Choose one of the following options, based on your platform:

    Skip this step. The executor that EKF configures by default is compatible with AWS.

    On Azure, follow the Configure Argo Workflow Executor guide to set the executor to PNS. Then, come back to this guide and follow the rest of the procedure.

    Skip this step. The executor that EKF configures by default is compatible with Google Cloud.

    Skip this step. The executor that EKF configures by default is compatible with on-premises deployments.
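The tag created in step 5 embeds a UTC timestamp, and the `${DATE?}` expansion makes the shell stop with an error if `DATE` is unset, rather than silently producing a malformed tag name. The timestamp uses dots between hours, minutes, and seconds, which fits the fact that Git ref names cannot contain colons. The following is a minimal sketch of how the tag name is assembled; the `make_tag_name` helper is hypothetical and is not part of rok-tools:

```shell
#!/bin/sh
# Hypothetical helper (not part of rok-tools): build the tag name
# used in step 5 from a timestamp passed as the first argument.
make_tag_name() {
    # ${1?...} makes the shell report an error if no timestamp is given.
    printf 'deploy/%s/develop/kubeflow-deploy\n' "${1?missing timestamp}"
}

# Same timestamp format as in step 5: UTC, dots instead of colons.
DATE=$(date -u "+%Y-%m-%dT%H.%M.%SZ")

# ${DATE?} fails loudly if DATE was never exported in this shell.
TAG=$(make_tag_name "${DATE?}")
echo "${TAG}"
```

Previewing the name this way before running `git tag` may help catch an unset or stale `DATE` when you resume the procedure in a new shell.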

Verify

  1. Verify that the Dex pod is up-and-running. Check the pod status and verify field STATUS is Running and field READY is 2/2:

    root@rok-tools:~# kubectl -n auth get pods
    NAME    READY   STATUS    RESTARTS   AGE
    dex-0   2/2     Running   3          17m
  2. Verify that the pods in the cert-manager namespace are up-and-running. Check the pod status and verify field STATUS is Running and field READY is 1/1:

    root@rok-tools:~# kubectl -n cert-manager get pods
    NAME                                       READY   STATUS    RESTARTS   AGE
    cert-manager-6d86476c77-qwgnj              1/1     Running   0          16m
    cert-manager-cainjector-5b9cd446fd-kl9gg   1/1     Running   0          16m
    cert-manager-webhook-64d967c45-jmxcz       1/1     Running   0          16m
  3. Verify that the pods in the istio-system namespace are up-and-running. Check the pod status and verify field STATUS is Running and field READY is 1/1:

    root@rok-tools:~# kubectl -n istio-system get pods
    NAME                                    READY   STATUS    RESTARTS   AGE
    authservice-0                           1/1     Running   0          17m
    istio-ingressgateway-57f58bf544-x45kw   1/1     Running   0          19m
    istiod-68f6c899f5-wzjfm                 1/1     Running   0          19m
  4. Verify that the pods in the knative-monitoring namespace are up-and-running. Check the pod status and verify field STATUS is Running and field READY is N/N:

    root@rok-tools:~# kubectl -n knative-monitoring get pods
    NAME                                  READY   STATUS    RESTARTS   AGE
    grafana-6695587d6f-ktf86              1/1     Running   0          2m41s
    kube-state-metrics-79ddb7fc64-w7s5m   1/1     Running   0          2m38s
    node-exporter-xlj2v                   2/2     Running   0          2m3s
    node-exporter-zfjh5                   2/2     Running   0          2m3s
    prometheus-system-0                   1/1     Running   0          2m3s
    prometheus-system-1                   1/1     Running   0          2m3s
  5. Verify that the pods in the knative-serving namespace are up-and-running. Check the pod status and verify field STATUS is Running and field READY is N/N:

    root@rok-tools:~# kubectl -n knative-serving get pods
    NAME                                                      READY   STATUS    RESTARTS   AGE
    activator-56776cbc47-2l6vl                                2/2     Running   0          24h
    autoscaler-fc7884b85-ncv8w                                2/2     Running   0          24h
    controller-6fb67d5db5-zdwzn                               2/2     Running   0          24h
    domain-mapping-5d56bfc7d-dqjq4                            2/2     Running   0          39m
    domainmapping-webhook-75d7c89dcb-6dtpm                    2/2     Running   0          24h
    knative-serving-cluster-ingressgateway-7d45565bd6-z7prd   1/1     Running   0          24h
    knative-serving-ingressgateway-764d54dbc7-hkcjc           1/1     Running   0          24h
    net-istio-controller-6c6b88bc9-cfn8c                      2/2     Running   0          39m
    net-istio-webhook-5b57488bb6-n7pw8                        2/2     Running   0          39m
    webhook-74bbb5c8d5-577gx                                  2/2     Running   0          39m
  6. Verify that the pods in the kubeflow namespace are up-and-running. Check the pod status and verify field STATUS is Running and field READY is N/N:

    root@rok-tools:~# kubectl -n kubeflow get pods
    NAME                                                         READY   STATUS    RESTARTS   AGE
    admission-webhook-deployment-5d4cf6bbdb-jszsw                2/2     Running   0          16m
    cache-server-68ffc8d4ff-ltl8q                                2/2     Running   0          1m
    centraldashboard-fd8774874-56587                             2/2     Running   0          2m42s
    jupyter-web-app-deployment-7987d45c7d-5gwss                  2/2     Running   0          2m42s
    katib-controller-54f895f874-g29bx                            2/2     Running   2          2m41s
    katib-db-manager-6f5d8f5945-wmmnb                            2/2     Running   1          2m48s
    katib-mysql-857bfdb7f9-w5zj8                                 2/2     Running   0          2m39s
    katib-ui-696fc69ddc-jkk2x                                    2/2     Running   2          2m38s
    kfp-cache-d96f57c8b-5cjht                                    3/3     Running   4          2m46s
    kfserving-controller-manager-0                               3/3     Running   1          2m20s
    kfserving-models-web-app-77cc4c8dd6-86v92                    2/2     Running   0          1m
    kubeflow-reception-9c67996fc-46djf                           2/2     Running   1          15m
    metadata-db-d48d67699-89fg9                                  2/2     Running   0          2m44s
    metadata-envoy-deployment-775b466c45-4gbkx                   1/1     Running   0          2m38s
    metadata-grpc-deployment-5c975cb96d-tq5vr                    2/2     Running   4          2m37s
    minio-7c9b6578cd-7f2tb                                       2/2     Running   0          2m35s
    ml-pipeline-7867b5b879-dgmnj                                 2/2     Running   0          2m41s
    ml-pipeline-persistenceagent-8495768cbb-vpfjt                2/2     Running   0          2m33s
    ml-pipeline-scheduledworkflow-7f58d84f9f-4pf7d               2/2     Running   0          2m37s
    ml-pipeline-ui-678cb55d6f-z9spc                              2/2     Running   0          2m32s
    ml-pipeline-viewer-crd-57768dc6c6-wtxjm                      2/2     Running   1          2m30s
    ml-pipeline-visualizationserver-68498d6df6-ms74w             2/2     Running   0          2m28s
    mysql-55d57856d7-bzvgd                                       2/2     Running   0          2m25s
    notebook-controller-deployment-6cf9974cd9-2p9mj              2/2     Running   1          2m25s
    profiles-deployment-64cf74dfd4-b6dx2                         3/3     Running   1          15m
    pvcviewer-controller-controller-manager-6dd55d9dfd-m5j8s     3/3     Running   1          2m23s
    spark-operatorsparkoperator-5775c699bb-4xgn2                 2/2     Running   0          2m27s
    tensorboard-controller-controller-manager-7f766c8676-8g6fq   3/3     Running   2          2m22s
    tensorboards-web-app-deployment-6b4dfd598c-r9xgk             1/1     Running   0          2m25s
    training-operator-747f797684-f6jhd                           2/2     Running   0          1m
    volumes-web-app-deployment-7b58b4c478-btfmw                  2/2     Running   0          2m24s
    workflow-controller-76579565dd-8f6vw                         2/2     Running   1          2m22s
  7. Verify that the pods in the kyverno namespace are up-and-running. Check the pod status and verify field STATUS is Running and field READY is 1/1:

    root@rok-tools:~# kubectl -n kyverno get pods
    NAME                      READY   STATUS    RESTARTS   AGE
    kyverno-dd4fcd768-x955r   1/1     Running   0          2m47s
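The checks above all apply the same rule to each namespace: every pod should report STATUS Running and a READY value of the form N/N. As a rough sketch of how this could be scripted, the following hypothetical `check_pods` helper (not part of rok-tools) reads the default `kubectl get pods` output on standard input and lists any pod that does not meet the rule. It assumes the default column order NAME, READY, STATUS:

```shell
#!/bin/sh
# Hypothetical helper (not part of rok-tools): read `kubectl get pods`
# output on stdin and print every pod that is not Running or whose
# READY column is not N/N. Exit status is 0 only if all pods pass.
check_pods() {
    awk '
        BEGIN { bad = 0 }
        NF == 0 { next }            # ignore blank lines
        $1 == "NAME" { next }       # ignore the header row
        {
            split($2, ready, "/")   # READY column, for example "2/2"
            if ($3 != "Running" || ready[1] != ready[2]) {
                print $1 ": READY=" $2 " STATUS=" $3
                bad = 1
            }
        }
        END { exit bad }
    '
}

# Possible usage against the namespaces verified in this section:
#
#   for ns in auth cert-manager istio-system knative-monitoring \
#             knative-serving kubeflow kyverno; do
#       kubectl -n "$ns" get pods | check_pods \
#           || echo "namespace $ns has pods that are not ready"
#   done
```

Note that this sketch treats any status other than Running as a failure, so it would also flag pods that have finished normally with a Completed status, if any exist in a namespace.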

Summary

You have successfully installed Kubeflow.

What’s Next

The next step is to integrate Rok with the Kubeflow dashboard.