Rok v1.0 (unreleased)

This guide assumes that you have already cloned Arrikto's deployment repository that contains the needed Kubernetes manifests, i.e., https://github.com/arrikto/deployments.

Change your current directory to your local clone of Arrikto’s GitOps deployment repository. For example:

$ cd ~/ops/deployments

Note

The commands in this guide use the deploy overlay, since this is the overlay that users are expected to tailor to their needs and preferences.

Notable changes v0.15 -> v1.0

  • Until v0.15, Rok integrated with Istio v1.3.1. In v1.0, Rok has been upgraded to work with Istio v1.5.7.
  • rok-tools now persists the whole of /root by default, not only /root/ops.

Upgrade your management environment

This version of Rok changes the way data is persisted in your management environment: from now on, rok-tools persists the whole of /root, i.e., not only the local GitOps repository but also user settings and credentials, e.g., those under ~/.aws and ~/.ssh.

Important

The location where the volume of rok-tools is mounted has changed from /root/ops to /root.

Important

The steps below show you how to mirror the local GitOps repo to a private remote, so that you can later clone it into the new, empty volume that the upgraded instance of rok-tools will use to persist data.

  1. First, add an extra remote of your choice so you can push local changes there:

    root@rok-tools-0:/# cd ~/ops/deployments
    root@rok-tools-0:~/ops/deployments# git remote add private <repo-url>
    
  2. Then, push your local GitOps repository to the remote you just added to safely keep your current changes. For example:

    root@rok-tools-0:~/ops/deployments# git push private develop:develop
    
  3. (Docker only) If you are running rok-tools as a local Docker container, you need to clear the host directory that Docker previously used as the volume of rok-tools. For example:

    $ rm -rf rok-tools-data/*
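    Note that the `*` glob does not match hidden files. If the directory also contains dotfiles, one option (a sketch, assuming rok-tools-data sits in your current working directory, as above) is to remove and recreate the directory itself:

```shell
# Remove the host directory entirely (including any hidden files) and
# recreate it empty, so Docker can reuse it as a fresh volume.
rm -rf rok-tools-data
mkdir rok-tools-data
```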
    
  4. Follow the steps described in the Upgrade your management environment section.

  5. Exec into the upgraded rok-tools container:

    For Kubernetes:

    $ kubectl exec -ti statefulset/rok-tools -- /bin/bash
    

    For Docker:

    $ docker exec -ti <ROK_TOOLS_CONTAINER_ID> /bin/bash
    
  6. Since any changes under /root have been lost, you have to reconfigure your management environment based on the individual subsections of the Configure your management environment section, as needed. From now on, changes to files under /root will be persisted in the external volume.
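    To verify that /root is now backed by the external volume, you can inspect its mount from within the container. This is a hedged check: the device and filesystem reported depend on your environment:

```shell
# Show which filesystem backs /root; inside the upgraded rok-tools
# container this should be the external data volume, not the
# container's own writable layer.
df -P /root
```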

  7. Restore your GitOps repository locally to make all your latest changes available in the new rok-tools instance:

    root@rok-tools-0:/# mkdir ~/ops && cd ~/ops
    root@rok-tools-0:~/ops# git clone --branch develop --origin private <repo-url> deployments
    
  8. Add the Arrikto-provided remote:

    root@rok-tools-0:/# cd ~/ops/deployments
    root@rok-tools-0:~/ops/deployments# git remote add origin git@github.com:arrikto/deployments.git
    
  9. Update local repo:

    root@rok-tools-0:~/ops/deployments# git fetch --all -p
    
  10. Ensure your local branch tracks the Arrikto-managed one:

    root@rok-tools-0:~/ops/deployments# git branch -u origin/develop
    
  11. (Optional) Remove the private remote:

    root@rok-tools-0:~/ops/deployments# git remote remove private
    
  12. Reconfigure kubectl so that you can connect to your Kubernetes cluster. Depending on your environment and setup you might need to follow the Access EKS Cluster section or copy in your kubeconfig file.

  13. (Kubernetes only) If you are running rok-tools as a Kubernetes StatefulSet, you can optionally delete the old PVC, i.e., rok-ops-rok-tools-0, since the PVC has been renamed to data-rok-tools-0 and the old one is no longer needed.

Upgrade Manifests

Upgrade the local deployments repo and rebase your work over the release-1.0 branch to track the 1.0 version of Rok.

  1. Fetch latest upstream changes:

    $ git fetch --all -p
    
  2. Retrieve the upstream branch on which your current work is based:

    $ git rev-parse --abbrev-ref --symbolic-full-name @{u}
    origin/develop
    

    Important

    The next command assumes you were working off the origin/develop upstream branch until now. Otherwise, use the correct branch, based on the output of this step.
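    If you want to see what this command reports in a minimal setting, here is a throwaway sandbox (hypothetical repository and branch contents, not your deployments clone) where the checked-out branch tracks origin/develop:

```shell
# Build a tiny upstream repo with a develop branch, clone it, and ask
# git which upstream branch the current checkout is based on.
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q upstream
cd upstream
git config user.email demo@example.com
git config user.name demo
git checkout -qb develop
echo base > base.txt
git add base.txt
git commit -qm "upstream base"
cd ..
git clone -q upstream clone
cd clone
git rev-parse --abbrev-ref --symbolic-full-name @{u}   # prints origin/develop
```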

  3. Rebase over release-1.0 and favor local changes upon conflicts:

    $ git rebase -Xtheirs --onto origin/release-1.0 origin/develop
    

    Important

    The above may cause conflicts, e.g., when a file was modified locally but removed upstream:

    CONFLICT (modify/delete): kubeflow/kfctl_config.yaml deleted in origin/develop and modified in HEAD~61. Version HEAD~61 of kubeflow/kfctl_config.yaml left in tree.
    

    We suggest deleting those files, e.g.:

    $ git status --porcelain | awk '{if ($1=="DU") print $2}' | xargs git rm
    

    And proceed with the rebase:

    $ git rebase --continue
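    The semantics of --onto are easy to misremember. The following self-contained sketch (a throwaway repository with hypothetical branch contents) replays a local commit that was based on develop onto release-1.0, mirroring the rebase above:

```shell
# develop holds the upstream base, release-1.0 adds release work, and
# local-work adds one local commit on top of develop. The rebase moves
# that local commit onto release-1.0.
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo && cd repo
git config user.email demo@example.com
git config user.name demo
git checkout -qb develop
echo a > base.txt && git add . && git commit -qm "upstream base"
git checkout -qb release-1.0
echo b > rel.txt && git add . && git commit -qm "release work"
git checkout -q develop
git checkout -qb local-work
echo c > local.txt && git add . && git commit -qm "my local change"
# Replay the commits in local-work that are not in develop onto release-1.0:
git rebase -q --onto release-1.0 develop
ls   # base.txt local.txt rel.txt: the local commit now sits on the release work
```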
    
  4. Configure the local branch to track release-1.0:

    $ git branch --set-upstream-to=origin/release-1.0
    

Upgrade Istio

Note

The upgrade procedure described below deviates from the generate/commit/apply model, since it requires some manual deletions.

Rok v1.0 uses a newer version of Istio (v1.5.7), which contains many bug fixes and improvements. To upgrade from v1.3.1 to v1.5.7:

  1. Delete the previous Istio control plane installation:

    $ kubectl delete -k rok/rok-external-services/istio/istio-1-3-1/istio-install-1-3-1/overlays/deploy
    
  2. Apply the new Istio control plane:

    $ kubectl apply -k rok/rok-external-services/istio/istio-1-5-7/istio-crds-1-5-7/overlays/deploy
    $ kubectl apply -k rok/rok-external-services/istio/istio-1-5-7/istio-namespace-1-5-7/overlays/deploy
    $ kubectl apply -k rok/rok-external-services/istio/istio-1-5-7/istio-install-1-5-7/overlays/deploy
    
  3. Check the Envoy Proxy sidecars that exist in the cluster:

    $ istioctl proxy-status
    
    NAME                                                   CDS                            LDS        EDS        RDS        PILOT                      VERSION
    activator-cfc66dc7-tzdzx.knative-serving               SYNCED                         SYNCED     SYNCED     SYNCED     istiod-7c855cc66-fdqcg     1.3.1
    autoscaler-6cc8bc459b-jghcg.knative-serving            SYNCED                         SYNCED     SYNCED     SYNCED     istiod-7c855cc66-fdqcg     1.3.1
    cluster-local-gateway-9d544d7db-c2xbq.istio-system     STALE (Never Acknowledged)     SYNCED     SYNCED     SYNCED     istiod-7c855cc66-fdqcg     1.3.1
    istio-ingressgateway-74649669b7-x77tt.istio-system     SYNCED                         SYNCED     SYNCED     SYNCED     istiod-7c855cc66-fdqcg     1.5.7
    prometheus-6c846d79b9-r8v4b.istio-system               SYNCED                         SYNCED     SYNCED     SYNCED     istiod-7c855cc66-fdqcg     1.5.7
    

    Note

    The above list might include Kubeflow-related components that will be upgraded later on.

  4. This release of Rok comes with an upgraded version of Istio that fixes an issue with the handling of X-Forwarded-* headers by intermediate proxies, which caused Rok URLs to show up as http URLs instead of https ones. If you have HTTP proxies in front of your Istio IngressGateway installation (e.g., ALB, NGINX, etc.), make sure to configure the X-Forwarded-* settings.

  5. Also, have Ingress NGINX compute the full X-Forwarded-For header by patching its configuration:

    $ kubectl apply -f rok/nginx-ingress-controller/patch-configmap-l7.yaml
    
  6. Delete Pod Disruption Budgets for Istio resources:

    $ kubectl -n istio-system delete pdb istio-ingressgateway istiod cluster-local-gateway
    

Upgrade Rok components

Upgrade Rok components using latest kustomizations as described in the Upgrade components section.

Upgrade Kubeflow

If you have integrated your deployment with Kubeflow then you have to perform some manual actions before proceeding to the normal upgrade. Specifically:

  1. upgrade Kubeflow to use new Istio
  2. delete some deprecated resources

Upgrade Kubeflow to use new Istio

  1. Using the new Istio requires manually deleting some resources that cannot be updated in place. Specifically:

    $ kubectl delete poddisruptionbudgets -n istio-system cluster-local-gateway
    

    and:

    $ kubectl delete deployment -n istio-system cluster-local-gateway
    
  2. Also, delete the Knative Serving Pods, so that their Istio sidecars get upgraded:

    $ kubectl delete pods -n knative-serving -l app=activator
    $ kubectl delete pods -n knative-serving -l app=autoscaler
    

Upgrade to use new Manifests

In Kubeflow 1.0, the upgrade procedure did not follow the fetch/rebase/apply model, since one had to re-generate kustomizations using kfctl. This was a limitation of using kfctl to produce the Kubeflow 1.0 manifests.

For EKF with Kubeflow 1.1, we use kustomize directly instead of kfctl. Using kustomize allows us to streamline upgrades using a fetch/rebase/apply workflow.

To migrate your manifests to the new Kubeflow 1.1 structure, follow the steps below:

  1. Follow the Upgrade manifests guide to get the latest manifests.

  2. At this point you have to recreate any changes you may have made during the initial deployment of Kubeflow, which updated the contents of:

    • the kubeflow/kustomize directory.
    • the kubeflow/manifests directory, which might have been overridden during the rebase (see Upgrade manifests).

    These changes should be moved to the new manifests structure, where:

    • the kustomize folder no longer exists.
    • all custom changes should end up in the deploy overlay of the corresponding application, not in the ekf one.

    You are essentially configuring Kubeflow with the new-style manifests from scratch. This is a one-time task, so go through the following sections and ensure the configuration matches your needs:

    The above steps will modify the deploy overlays of the new manifests. If you want to compare the new changes with the ones you previously made, here is a mapping between the two trees:

    Kubeflow 1.0 vs Kubeflow 1.1: each old path (kustomize/, ekf overlays) maps to a new one (manifests/, deploy overlays) as follows:

    • kubeflow/kustomize/dex/overlays/ekf/secret_params.env ->
      kubeflow/manifests/dex-auth/dex-crds/overlays/deploy/secret_params.env
    • kubeflow/kustomize/dex/overlays/ekf/patches/config-map.yaml ->
      kubeflow/manifests/dex-auth/dex-crds/overlays/deploy/patches/config-map.yaml
    • kubeflow/kustomize/oidc-authservice/overlays/ekf/params.env ->
      kubeflow/manifests/istio/oidc-authservice/overlays/deploy/params.env
    • kubeflow/kustomize/oidc-authservice/overlays/ekf/secret_params.env ->
      kubeflow/manifests/istio/oidc-authservice/overlays/deploy/secret_params.env
    • kubeflow/kustomize/jupyter-web-app/overlays/ekf/patches/config-map.yaml ->
      kubeflow/manifests/jupyter/jupyter-web-app/overlays/deploy/patches/config-map.yaml
    • kubeflow/manifests/dex-auth/dex-crds/overlays/ekf/secret_params.env ->
      kubeflow/manifests/dex-auth/dex-crds/overlays/deploy/secret_params.env
    • kubeflow/manifests/istio/oidc-authservice/overlays/ekf/secret_params.env ->
      kubeflow/manifests/istio/oidc-authservice/overlays/deploy/secret_params.env
    • kubeflow/manifests/jupyter/jupyter-web-app/overlays/ekf/patches/config-map.yaml ->
      kubeflow/manifests/jupyter/jupyter-web-app/overlays/deploy/patches/config-map.yaml
  3. Undo custom changes in ekf overlays:

    $ git checkout origin/develop -- \
    >     kubeflow/manifests/dex-auth/dex-crds/overlays/ekf/secret_params.env \
    >     kubeflow/manifests/istio/oidc-authservice/overlays/ekf/secret_params.env \
    >     kubeflow/manifests/jupyter/jupyter-web-app/overlays/ekf/patches/config-map.yaml
    
  4. Commit changes:

    $ git add kubeflow
    $ git commit -m "kubeflow: Refactor to 1.1 manifests"
    
  5. Delete deprecated files related to kfctl and commit changes:

    $ git rm -r kubeflow/kustomize
    $ rm -rf kubeflow/.cache
    $ git commit -m "kubeflow: Delete deprecated kustomizations"
    
  6. Apply:

    $ rok-deploy --apply install/kubeflow --force
    

    Warning

    We have to force the update because there are changes to immutable fields of resources, and thus we have to delete and recreate them. Note that this is done only for specific kinds of resources that are safe to delete. These kinds are specified by --force-kinds and default to Deployments and StatefulSets.

Delete deprecated resources

Previous versions created some resources that are no longer needed. Follow this section to clean them up.

  1. Gather all user namespaces:

    $ NAMESPACES=$(kubectl get profiles -o jsonpath="{.items[*].metadata.name}")
    
  2. Select the namespace-scoped and cluster-scoped resources to delete:

    $ NAMESPACE_RESOURCES="rolebinding.rbac.authorization.k8s.io/sa-pipeline-runner-edit rolebinding.rbac.authorization.k8s.io/sa-default-editor-edit rolebinding.rbac.authorization.k8s.io/pipeline-runner-admin"
    $ CLUSTER_RESOURCES="clusterrole.rbac.authorization.k8s.io/cr-empty"
    
  3. Delete them:

    $ for ns in $NAMESPACES; do kubectl delete -n $ns $NAMESPACE_RESOURCES; done
    $ kubectl delete $CLUSTER_RESOURCES
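    Before touching a live cluster, you can dry-print the loop expansion with a stubbed namespace list (hypothetical names) and echo in place of kubectl:

```shell
# Stubbed dry run: print the exact kubectl commands the cleanup loop
# would execute, one line per user namespace.
NAMESPACES="kubeflow-alice kubeflow-bob"   # stand-in for the profiles output
NAMESPACE_RESOURCES="rolebinding.rbac.authorization.k8s.io/sa-pipeline-runner-edit rolebinding.rbac.authorization.k8s.io/sa-default-editor-edit rolebinding.rbac.authorization.k8s.io/pipeline-runner-admin"
for ns in $NAMESPACES; do
    echo kubectl delete -n $ns $NAMESPACE_RESOURCES
done
```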