Kubeflow

This guide describes how to deploy Kubeflow alongside Rok using the installation manifests provided by Arrikto. For example, to obtain the manifests:

$ git clone https://github.com/arrikto/deployments
$ cd deployments

CertManager

Cert-manager is a native Kubernetes certificate management controller. It can issue certificates from a variety of sources, such as a simple signing key pair or a self-signed issuer. Kubeflow needs a self-signed issuer, so we are going to install and configure cert-manager accordingly.

  1. Apply the cert-manager base manifests:

    $ kubectl apply -k rok/cert-manager/cert-manager-kube-system-resources/base
    $ kubectl apply -k rok/cert-manager/cert-manager/base --validate=false
    

    Note

    We apply the base kustomization first and the self-signed overlay afterwards, because kubectl version 1.18 fails otherwise due to an upstream issue (see https://github.com/kubernetes/kubectl/issues/845).

  2. Apply the cert-manager self-signed overlay:

    $ kubectl apply -k rok/cert-manager/cert-manager/overlays/self-signed --validate=false
    

    Note

    We skip validation since, otherwise, kubectl fails with:

    error validating data: ValidationError(CustomResourceDefinition.spec.validation.openAPIV3Schema.properties.spec.properties.solver.properties.dns01.properties.webhook.properties.config): unknown field "x-kubernetes-preserve-unknown-fields" in io.k8s.apiextensions-apiserver.pkg.apis.apiextensions.v1beta1.JSONSchemaProps
    

    Warning

    This command depends on the previous ones, so wait about a minute before executing it. If it fails with:

    apiservice.apiregistration.k8s.io/v1beta1.webhook.cert-manager.io unchanged
    Error from server (InternalError): error when creating "rok/cert-manager/cert-manager/overlays/self-signed":
    Internal error occurred: failed calling webhook "webhook.cert-manager.io": the server is currently unable to handle the request
    

    simply re-execute it.

  3. Verify that the cert-manager Pods are created in the cert-manager namespace:

    $ kubectl get pods -n cert-manager
    NAME                                       READY   STATUS    RESTARTS   AGE
    cert-manager-6fc9549998-pl95m              1/1     Running   0          2m12s
    cert-manager-cainjector-84747978b7-glwsw   1/1     Running   0          2m13s
    cert-manager-webhook-849c8bdf78-xf8cv      1/1     Running   1          2m12s
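
The Warning in step 2 suggests re-executing the command when the cert-manager webhook is not ready yet. A minimal sketch of a retry helper that automates this follows. The retry function name, the number of attempts, and the delay are illustrative assumptions and are not part of the Arrikto tooling.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Example (not executed here):
#   retry 5 15 kubectl apply -k rok/cert-manager/cert-manager/overlays/self-signed --validate=false
```

The helper retries on any failure, not only on the webhook error shown above, so inspect the output if it gives up.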
    

EKF (Enterprise Kubeflow)

Configure Authentication

EKF authenticates users using OIDC. We use Dex as our default OIDC Provider and AuthService as our OIDC Client (authenticating proxy). If you have another OIDC Provider (e.g., GitLab), you can skip installing Dex. In this section we describe how to set up authentication for EKF using Dex and AuthService.

Specifically:

  • Change password of default user
  • Change credentials of OIDC client

By default, Dex is installed with a single static user. To change the default user’s password or create new users, one has to modify Dex’s ConfigMap. To change the password of the default user:

  1. Pick a password for the default user, whose username is user, and hash it using bcrypt:

    $ python3 -c 'from passlib.hash import bcrypt; import getpass; print(bcrypt.using(rounds=12, ident="2y").hash(getpass.getpass()))'
    
  2. Edit kubeflow/kfctl_config.yaml and fill the relevant field with the hash of the password you chose:

    ...
    - kustomizeConfig:
        parameters:
        - name: static_password_hash
          value: <enter your generated hash here>
      name: dex
    
  3. Generate OIDC Client credentials for the AuthService. These credentials are used by the AuthService to authenticate to Dex. The credentials must be filled in both Dex and AuthService kustomizations:

    $ export OIDC_CLIENT_ID="authservice"
    $ export OIDC_CLIENT_SECRET="$(openssl rand -base64 32)"
    $ envsubst < kubeflow/manifests/dex-auth/dex-crds/overlays/ekf/secret_params.env.in > kubeflow/manifests/dex-auth/dex-crds/overlays/ekf/secret_params.env
    $ envsubst < kubeflow/manifests/istio/oidc-authservice/overlays/ekf/secret_params.env.in > kubeflow/manifests/istio/oidc-authservice/overlays/ekf/secret_params.env
    
  4. Commit changes locally:

    $ git commit -am "kubeflow: Configure authentication"
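
Before committing, it can help to sanity-check the values produced in steps 1 and 3. The sketch below assumes two hypothetical helpers, is_bcrypt_hash and require_vars, which are not part of the deployment tooling. The first checks only the shape of a bcrypt hash (prefix, two-digit cost, 53 trailing characters), not that it matches your password. The second guards against envsubst silently writing empty values when a variable is unset.

```shell
# Succeeds if $1 has the shape of a bcrypt hash: a "$2a$", "$2b$" or "$2y$"
# prefix, a two-digit cost, then 53 characters of salt and digest.
is_bcrypt_hash() {
  printf '%s\n' "$1" | grep -Eq '^\$2[aby]\$[0-9]{2}\$[./A-Za-z0-9]{53}$'
}

# Succeeds only if every named variable is set and non-empty.
# Pass literal variable names only, since the names are evaluated by the shell.
require_vars() {
  for name in "$@"; do
    eval "value=\${$name-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      return 1
    fi
  done
}

# Example (not executed here):
#   require_vars OIDC_CLIENT_ID OIDC_CLIENT_SECRET && envsubst < secret_params.env.in > secret_params.env
```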
    

Deploy Kubeflow

Kubeflow is deployed in a generate/commit/apply manner as well.

More specifically:

  1. Generate: kfctl takes KfDef and a manifest and generates kustomizations once
  2. Commit: The end-user can modify kustomizations and commit everything
  3. Apply: Apply everything to Kubernetes

Follow the steps below to deploy Kubeflow:

  1. If you have already deployed Rok, delete Dex and AuthService, as we’re going to install them as part of Kubeflow:

    $ kubectl delete -k rok/rok-external-services/dex/overlays/deploy
    $ kubectl delete -k rok/rok-external-services/authservice/overlays/deploy
    
  2. Navigate to the kubeflow directory of the local GitOps repository, where both kfctl config and manifests reside:

    $ cd kubeflow
    
  3. Auto-generate Kubeflow-related kustomizations:

    $ kfctl build -V -f kfctl_config.yaml
    
  4. Commit the generated stock kustomizations:

    $ git add kustomize
    $ git commit -m "Add initial Kubeflow kustomizations"
    
  5. Deploy Kubeflow with:

    $ kfctl apply -V -f kfctl_config.yaml
    
  6. This slightly modifies kfctl_config. Commit this as well:

    $ git commit -am "Add kfctl configuration changes after apply"
    

Integrate Rok with the Kubeflow Dashboard

To integrate Rok with the Kubeflow dashboard, so that you can visit it from the “Snapshot Store” tab in the Kubeflow UI, you need to:

  1. Go to the deployment repository:

    $ cd ~/ops/deployments
    
  2. Edit rok/rok-cluster/overlays/deploy/patches/configvars.yaml and add the gw.ui.kubeflow_dashboard_enabled: true config variable, like so:

    ...
    configVars:
      daemons.s3d.use_iam_role: true
      gw.ui.kubeflow_dashboard_enabled: true # <-- Copy this line.
    
  3. Commit the new option:

    $ git add rok/rok-cluster/overlays/deploy
    $ git commit -m "Enable Kubeflow dashboard integration"
    
  4. Re-apply the Rok cluster overlay:

    $ kubectl apply -k rok/rok-cluster/overlays/deploy
    

Set up namespaces

Each namespace where users run requires some resources that enable access to Rok and Kubeflow Pipelines. In this section we describe how to set up your namespaces to achieve this.

In kubeflow/manifests/ you will find a directory called namespace-resources/. This contains a base/ kustomization, along with a template-kustomization.yaml.

Important

You need to follow these instructions, and create a new overlay, for every namespace you wish to set up.

  1. Save the name of the namespace in an environment variable. For example:

    $ export NAMESPACE=kubeflow-user
    
  2. Make sure you have logged in as a user who can access $NAMESPACE at least once. This ensures the namespace exists and has the default namespace-specific RoleBindings for Kubeflow.

    Note

    If you are using Dex for authentication, see the Tweak Dex section on how to add a new user.

  3. Switch to the kubeflow/manifests/ directory:

    $ cd kubeflow/manifests
    
  4. Create the new overlay:

    $ mkdir -p namespace-resources/overlays/$NAMESPACE
    $ envsubst < namespace-resources/template-kustomization.yaml > namespace-resources/overlays/$NAMESPACE/kustomization.yaml
    
  5. Commit the changes:

    $ git add namespace-resources/overlays/$NAMESPACE
    $ git commit -m "Set up namespace '$NAMESPACE' for access to Rok and KFP"
    
  6. Apply the kustomization:

    $ rok-deploy --apply namespace-resources/overlays/$NAMESPACE
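
Step 2 requires the namespace to exist before you create its overlay. When setting up several namespaces, a small check like the following sketch can report which ones are still missing. The function names are hypothetical; the only external call is kubectl get namespace, run against your current cluster context.

```shell
# Succeeds if the namespace $1 exists in the current cluster context.
namespace_exists() {
  kubectl get namespace "$1" >/dev/null 2>&1
}

# Prints, one per line, the namespaces from the argument list that do not exist.
missing_namespaces() {
  for ns in "$@"; do
    namespace_exists "$ns" || printf '%s\n' "$ns"
  done
}

# Example (not executed here):
#   missing_namespaces kubeflow-user kubeflow-other
```

Any namespace printed by missing_namespaces needs its owner to log in once before you continue with steps 3 to 6.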
    

Enable namespace sharing

This section describes how to share a namespace with another user. It uses the namespace-permissions base kustomization and creates an overlay from the existing templates.

Important

The namespace should already exist, i.e., the owner has logged in at least once, and it should be configured as described in the Set up namespaces section.

Important

You need to follow the instructions below, i.e., generate, commit and apply an overlay, for every namespace and every user you wish to set up.

  1. Set the namespace to be shared, the desired role that the new user will have inside the namespace, and the user ID of the new user. For example:

    $ export NAMESPACE=kubeflow-user-example-com
    $ export USER=user1@example.com
    $ export ROLE=edit
    

    Note

    NAMESPACE is created automatically by the profile controller after sanitizing the user ID obtained by authservice.

    Note

    USER depends on the way authservice is configured, i.e., via USERID_CLAIM, and can be a username, or email.

    Note

    ROLE can be one of view/edit/admin.

  2. Set the name prefix for the K8s resources that will be generated:

    $ export NAME=${USER//[^a-zA-Z0-9\-]/-}-$ROLE
    $ export OVERLAY=$NAMESPACE-$NAME
    

    Note

    We need a unique, DNS-1123 compatible name (prefix) for all K8s resources that will be created. Here we replace every invalid character of USER (which is usually an email) with a dash.

  3. Switch to the kubeflow/manifests/ directory:

    $ cd kubeflow/manifests
    
  4. Create the new overlay:

    $ mkdir -p namespace-permissions/overlays/$OVERLAY
    $ envsubst < namespace-permissions/template-kustomization.yaml > namespace-permissions/overlays/$OVERLAY/kustomization.yaml
    $ envsubst < namespace-permissions/template-params.env > namespace-permissions/overlays/$OVERLAY/params.env
    
  5. Commit the changes:

    $ git add namespace-permissions/overlays/$OVERLAY
    $ git commit -m "Assign '$ROLE' access on namespace '$NAMESPACE' to user '$USER'"
    
  6. Apply the kustomization:

    $ rok-deploy --apply namespace-permissions/overlays/$OVERLAY
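
The substitution in step 2 keeps uppercase letters, which DNS-1123 names do not allow. If your user IDs may contain uppercase characters, a stricter variant could look like the sketch below. Both function names are hypothetical, and the check assumes the DNS-1123 label rules (lowercase alphanumerics and dashes, starting and ending with an alphanumeric, at most 63 characters).

```shell
# Lowercases $1, replaces every character outside [a-z0-9-] with a dash,
# and strips leading and trailing dashes.
sanitize_name() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9-]/-/g' -e 's/^-*//' -e 's/-*$//'
}

# Succeeds if $1 is a valid DNS-1123 label.
is_dns1123_label() {
  [ "${#1}" -le 63 ] &&
    printf '%s\n' "$1" | grep -Eq '^[a-z0-9]([-a-z0-9]*[a-z0-9])?$'
}

# Example (not executed here):
#   NAME=$(sanitize_name "$USER")-$ROLE
#   is_dns1123_label "$NAME" || echo "invalid name: $NAME" >&2
```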
    

Access Private Registries

To be able to pull images from a private registry, first create a Secret following the Official Kubernetes Guide.

To enable Jupyter Web App to create Notebooks using images from a private registry, create a PodDefault that references the previously created Secret as an imagePullSecret, for example:

apiVersion: kubeflow.org/v1alpha1
kind: PodDefault
metadata:
  name: access-prv-registry
spec:
  desc: Allow access to private registry
  selector:
    matchLabels:
      registry-pull-secret: "true"
  imagePullSecrets:
  - name: <secret_name>

Note the selector.matchLabels field: this PodDefault is applied to every Pod that carries the label registry-pull-secret: "true". Jupyter Web App will now show this new PodDefault in the “Configurations” section. To have the PodDefault selected by default, edit kubeflow/kustomize/jupyter-web-app/overlays/ekf/patches/config-map.yaml and append the label key, e.g., registry-pull-secret, to the existing spawnerFormDefaults.configurations.value list.

Then commit and apply the changes:

$ git commit -am "Allow pulling private images when creating Notebooks"
$ kfctl apply -V -f kfctl_config.yaml

Finally, restart the JWA pod so that it “sees” the change in the jupyter-web-app-config ConfigMap:

$ kubectl delete pods -n kubeflow -l app.kubernetes.io/name=jupyter-web-app

Make sure you also commit both Secret and PodDefault in the GitOps repo.
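
For the first step, creating the Secret, the Kubernetes guide describes kubectl create secret docker-registry. A minimal sketch of wrapping that command is shown below. The create_pull_secret function and the REGISTRY_PASSWORD variable are hypothetical names chosen for illustration, and the Secret has to live in the same namespace as the Notebook Pods that use it.

```shell
# create_pull_secret NAME NAMESPACE SERVER USERNAME
# Reads the registry password from the REGISTRY_PASSWORD environment
# variable and aborts with an error message if it is unset or empty.
create_pull_secret() {
  kubectl create secret docker-registry "$1" \
    --namespace "$2" \
    --docker-server="$3" \
    --docker-username="$4" \
    --docker-password="${REGISTRY_PASSWORD:?set REGISTRY_PASSWORD first}"
}

# Example (not executed here); all values are placeholders:
#   REGISTRY_PASSWORD=... create_pull_secret regcred kubeflow-user registry.example.com alice
```

The Secret name passed here is the one to use as <secret_name> in the PodDefault.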

Verify

To verify that Kubeflow was deployed correctly, visit https://demo.example.com/ and log in as user with the password you generated before. You will be redirected to the Kubeflow dashboard.

Congratulations, you have just deployed Kubeflow! To make sure that all Kubeflow services are running as expected, we advise you to run an e2e data science workflow with Kale. Kale is a workflow tool that lets you orchestrate Kubeflow pipelines, starting from a Jupyter notebook.

Follow chapters 4 and 5 of the Titanic Codelab to run a Jupyter notebook as a Kubeflow pipeline in a reproducible environment, spawn a new notebook from a step’s snapshot and debug the state of the pipeline.

Note

Some notable differences from what is described in the Codelab:

  1. When creating a new Notebook Server, use the provided gcr.io/arrikto/jupyter-kale image, instead of gcr.io/arrikto-public/tensorflow-1.14.0-notebook-cpu:kubecon-workshop.

  2. This base image does not come with any data science library installed. Just installing seaborn, as stated in the Codelab, will not be enough. Make sure to run the following commands:

    $ cd /home/jovyan/data/examples/titanic-ml-dataset
    $ pip3 install -r requirements.txt --user
    
  3. In chapter 5, section Reproduce prior state, you are asked to look at the Artifacts tab in the Pipelines dashboard. This tab has been renamed to Visualizations.

  4. In chapter 5, when spawning a new Notebook, starting from a snapshot’s URL, you will notice that the Autofill button is missing. All the form fields are now automatically filled as soon as you paste the Rok URL.

Important

Due to a known issue, the snapshot procedure could be ignored the first time you convert a Notebook. To mitigate this, make sure to disable and then re-enable the Use this notebook’s volumes toggle, before clicking the Compile and Run button.

Tweak Kubeflow

In this section, we document common tweaks needed in a Kubeflow deployment.

Extend the list of default images of Jupyter Web App

  1. Go to the deployment repository:

    $ cd ~/ops/deployments
    
  2. Edit the Jupyter Web App config kubeflow/manifests/jupyter/jupyter-web-app/overlays/ekf/patches/config-map.yaml and add your custom image to the Jupyter Web App configuration:

    data:
      spawner_ui_config.yaml: |
        spawnerFormDefaults:
          image:
            # The container Image for the user's Jupyter Notebook
            # If readonly, this value must be a member of the list below
            value: <your_custom_image>
            # The list of available standard container Images
            options:
              - <your_custom_image>
              - gcr.io/arrikto/jupyter-kale:v0.5.0-13-gf5308d4
    
  3. Commit the new configuration:

    $ git add kubeflow/manifests/jupyter/jupyter-web-app/overlays/ekf/patches/config-map.yaml
    $ git commit -m "kubeflow: Update Jupyter Web App image list"
    
  4. Follow the Upgrade guide to re-apply Kubeflow.

Tweak Dex

In this section we describe how to tweak the default Dex deployment (see also the corresponding section in official Kubeflow docs). Specifically:

  • change password of default user
  • change frontend theme and issuer
  • change credentials of OIDC client

By default, Dex is installed with a single static user. To change its password or create new users, one has to modify the ConfigMap patch.

Note

Here we avoid modifying params.env and relying on variable substitution inside the ConfigMap, since that approach limits us to a single user.

First, pick a password and hash it, then generate a random UUID:

$ python3 -c 'from passlib.hash import bcrypt; import getpass; print(bcrypt.using(rounds=12, ident="2y").hash(getpass.getpass()))'
$ cat /proc/sys/kernel/random/uuid
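
The /proc path above exists only on Linux. If you work from another platform, a portable fallback could look like the following sketch. The gen_uuid function name is hypothetical, and the fallbacks assume that either uuidgen or python3 is installed.

```shell
# Prints a random, lowercase UUID using the first available source.
gen_uuid() {
  if [ -r /proc/sys/kernel/random/uuid ]; then
    cat /proc/sys/kernel/random/uuid
  elif command -v uuidgen >/dev/null 2>&1; then
    # Some uuidgen implementations print uppercase; normalize to lowercase.
    uuidgen | tr '[:upper:]' '[:lower:]'
  else
    python3 -c 'import uuid; print(uuid.uuid4())'
  fi
}

# Example (not executed here):
#   gen_uuid
```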

Edit kustomize/dex/overlays/ekf/patches/config-map.yaml and set the hash and userID of the static user:

apiVersion: v1
kind: ConfigMap
metadata:
  name: dex
data:
   ...
  staticPasswords:
  - email: user
    hash: <enter the generated hash here>
    username: user
    userID: <enter the random UUID here>

To change the default frontend theme and issuer, edit the ConfigMap again and set the frontend fields:

apiVersion: v1
kind: ConfigMap
metadata:
  name: dex
data:
  config.yaml: |
    frontend:
      dir: /arrikto_web
      issuer: Kubeflow
      theme: ekf

Finally, to change the OIDC client credentials, edit kustomize/dex/overlays/ekf/secret_params.env and set:

OIDC_CLIENT_ID=kubeflow-oidc-authservice
OIDC_CLIENT_SECRET=pUBnBOY80SnXgjibTYM9ZWNzY2xreNGQok

Important

If you change the default values, also update the corresponding secret of the oidc-authservice component so that it matches, i.e., kustomize/oidc-authservice/overlays/ekf/secret_params.env.

Commit and apply changes:

$ git commit -am "dex: Change default user, OIDC client creds and theme"
$ kfctl apply -V -f kfctl_config.yaml

For changes to take effect we have to restart the pods manually:

$ kubectl delete pods -n auth -l app=dex
$ kubectl delete pods -n istio-system -l app=authservice

Upgrade

Important

The upgrade procedure for Kubeflow does not follow the fetch/rebase/apply pattern, since one has to re-generate the kustomizations using kfctl. This is a current limitation of how the Kubeflow manifests are organized and of how the kfctl tool, which will be replaced in upcoming releases, manages them.

First, make sure you follow the Upgrade manifests guide, which brings in the latest Kubeflow manifests. As described above, we used kfctl first to generate the kustomizations and then to apply them. The final kustomizations are generated locally and tracked by the user; they are not included in the Arrikto-provided manifests. Thus, the user must re-generate them to pick up “upstream” changes, e.g., new images.

To upgrade Kubeflow follow the steps below:

  1. Switch to the kubeflow directory:

    $ cd kubeflow
    
  2. Delete generated kustomizations and stale cache folder:

    $ git rm -r kustomize/
    $ rm -r .cache/
    
  3. Re-generate the kustomizations based on the new manifests and KfDef:

    $ kfctl build -V -f kfctl_config.yaml
    
  4. Add new kustomizations and commit them:

    $ git add kustomize/
    $ git commit -m "Upgrade Kubeflow kustomizations"
    
  5. At this point, you have to re-create any changes you made during the initial deployment of Kubeflow that updated the content of:

    • the kustomize/ directory that we purged above.
    • the manifests/ directory that might have been overridden during rebase (see Upgrade manifests).

    For example, you might need to cherry-pick the commit that configured Kubeflow’s OIDC authentication or a custom commit that extended the list of available images in Kubeflow’s Jupyter Web App under jupyter/jupyter-web-app/overlays/ekf/patches/config-map.yaml.

    These tweaks should be separate commits in your Git history. To ensure that your local GitOps deployment repository contains these changes:

    1. Go through your log, and look for changes you need to re-apply after the upgrade:

      $ git log
      
    2. Inspect the changes, one-by-one. E.g., for a single change / a single commit:

      $ git show <commit_id>
      
    3. Re-apply the change by cherry-picking it as a new commit on top of your current HEAD, and confirm the log looks good:

      $ git cherry-pick <commit_id>
      $ git log
      
  6. Re-apply configuration:

    $ kfctl apply -V -f kfctl_config.yaml
    
  7. Re-apply your kustomizations for all namespaces you have set up:

    $ rok-deploy --apply manifests/namespace-resources/overlays/*
    

    Important

    If you have not performed any namespace setup yet, please follow the instructions in the Set up namespaces section.

  8. Re-apply your kustomizations for all namespace sharing you have set up:

    $ rok-deploy --apply manifests/namespace-permissions/overlays/*
    

    Note

    If you have not set up any namespace sharing and would like to, please follow the instructions in the Enable namespace sharing section.
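
Steps 7 and 8 rely on a shell glob. If an overlays directory is empty, the shell passes the literal overlays/* string to the command. A minimal sketch that lists the overlays first, so that you can skip the step when there are none, is shown below. The list_overlays function name is hypothetical.

```shell
# Prints one overlay directory per line; prints nothing if none exist.
list_overlays() {
  for dir in "$1"/*/; do
    # An unmatched glob stays literal and fails the -d test, so it is skipped.
    [ -d "$dir" ] && printf '%s\n' "${dir%/}"
  done
  return 0
}

# Example (not executed here); assumes overlay names contain no whitespace:
#   overlays=$(list_overlays manifests/namespace-permissions/overlays)
#   [ -n "$overlays" ] && rok-deploy --apply $overlays
```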