Rok v0.15

This guide assumes that you have already cloned Arrikto’s deployment repository, which contains the required Kubernetes manifests, i.e., https://github.com/arrikto/deployments.

Change your current directory to your local clone of Arrikto’s GitOps deployment repository. For example:

$ cd ~/ops/deployments

Note

This guide uses the deploy overlay in the commands to be executed since this is the overlay that users are supposed to tailor based on their needs and preferences.

Upgrade on Kubernetes

Rok v0.15 is the first version to support software upgrades that are orchestrated by the Operator. In this guide you can find the steps needed to seamlessly upgrade a v0.14 Rok cluster to v0.15. Each step roughly corresponds to one of the components that need to be upgraded:

Component         v0.14  v0.15
RokCluster CR     ✓
RokCluster CRD    ✓
Rok Operator      ✓
Rok Disk Manager  ✓
Rok kmod          ✓

Notable changes v0.14 -> v0.15

  • The Operator manifests have been updated in v0.15, and some of them cannot be transitioned by simply updating the v0.14 resources. As such, we need to manually delete those resources and re-create them by applying the v0.15 manifests. Otherwise, Kubernetes reports:

    The StatefulSet "rok-operator" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', and 'updateStrategy' are forbidden
    
  • The RokCluster and RokRegistryCluster CRDs have also been updated. By deploying the v0.15 Operator manifests you also update these cluster CRDs. Note that the API versions of the CRDs have not been updated, since we do not yet support CRD versioning. Most importantly, a new required field has been added to the RokCluster CRD to specify the Redis endpoint needed by Rok’s composer.

  • The StorageClass and VolumeSnapshotClass resources are no longer managed by the Operator; they are now created by kubectl apply -k. Also, the StorageClass parameters have been updated, so we need to manually delete the StorageClass and re-create it by applying the v0.15 manifest. Otherwise, Kubernetes reports:

    The StorageClass "rok" is invalid: parameters: Forbidden: updates to parameters are forbidden
    
  • Rok cluster secondary resources (e.g., Rok DaemonSet, Rok CSI StatefulSet) have also been updated, so when the v0.15 Operator gets deployed, cluster reconciliation will fail with 422 Unprocessable Entity errors. As a fallback, the Operator will delete the old resources so that it can apply the new desired state. During this time, and until the cluster CR also gets updated, Pods might fail and the cluster might be in a half-baked state.

  • Rok cluster manifests now include the admin ClusterRole, which provides administrative access to all Rok resources.

  • Rok on Kubernetes now integrates with Dex as OIDC Provider for static users.

  • Rok on Kubernetes now runs behind the AuthService authentication proxy.

  • Rok on Kubernetes now ships with Istio (1.3.1) to manage network traffic and expose Rok to the outside world via an Ingress Gateway. In upcoming releases Rok will also use Istio as its service mesh.

  • The ConfigMap that holds cluster member join metadata has been renamed from $CLUSTERNAME-metadata to $CLUSTERNAME-join-metadata. The old ConfigMap can be safely deleted for housekeeping, after the upgrade completes.

  • The Operator has been granted elevated privileges to update the status subresource of both the RokCluster and RokRegistryCluster CRs.

  • Configuration variables on the Rok and Rok Registry CRs have been converted from string to object, i.e., users can now specify them as key-value pairs.

    Important

    If you have applied patches to the cluster CR that still refer to the spec.configVars field as a string, you will need to update them accordingly.

    For example, the following patch

    apiVersion: crd.arrikto.com/v1alpha1
    kind: RokCluster
    metadata:
      name: rok
    spec:
      configVars: cluster.behind_proxy=True; cluster.check_host_header=False;
    

    should be converted to:

    apiVersion: crd.arrikto.com/v1alpha1
    kind: RokCluster
    metadata:
      name: rok
    spec:
      configVars:
        cluster.behind_proxy: True
        cluster.check_host_header: False
    

Upgrade v0.14 -> v0.15

This is a major software version upgrade.

We assume that you are already running a v0.14 Rok cluster on Kubernetes and that you also have access to the v0.15 kustomization tree you are upgrading to.

1. Increase observability (optional)

To gain insight into the status of the cluster upgrade, execute the following commands in a separate window:

  • For live cluster status:

    $ watch kubectl get rokcluster -n rok
    
  • For live cluster events:

    $ watch 'kubectl describe rokcluster -n rok rok | tail -n 20'
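If the watch utility is not available, kubectl can stream changes itself. The sketch below wraps two such commands in functions with hypothetical names. It assumes that the Operator records Kubernetes Events against the RokCluster object, which the describe output above suggests but which is not confirmed here.

```shell
#!/bin/sh
# Stream RokCluster updates and related Events without the `watch` utility.
# The namespace "rok" is taken from the commands above.
stream_rokcluster_status() {
    # --watch keeps the request open and prints a row on every change.
    kubectl get rokcluster -n rok --watch
}

stream_rokcluster_events() {
    # Assumption: the Operator emits Events whose involved object is a RokCluster.
    kubectl get events -n rok --field-selector involvedObject.kind=RokCluster --watch
}

# Usage (each in its own terminal):
#   stream_rokcluster_status
#   stream_rokcluster_events
```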
    

2. Inspect current version (optional)

  1. Get into the Rok Pod:

    $ kubectl exec -ti -n rok daemonset/rok -- /bin/bash
    
  2. Verify the config version of your cluster:

    root@rok-fkmvz|rok-minikube|rok.rok.svc.cluster.local:/# rok-config version
    Current version [Cluster-wide]: v001400_0004
    Desired version [This appliance]: v001400_0004
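To record the version before upgrading, or to check it from a script, the output of rok-config version can be parsed. This is a minimal sketch under the assumption that the output keeps the two-line format shown above; the helper name rok_versions_agree is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `rok-config version` output on stdin and succeed
# only if the cluster-wide and appliance versions are present and identical.
rok_versions_agree() {
    awk -F': ' '
        /Current version/ { current = $2 }
        /Desired version/ { desired = $2 }
        END {
            printf "current=%s desired=%s\n", current, desired
            exit !(current != "" && current == desired)
        }'
}

# Usage (no -ti, since the output is piped rather than shown interactively):
#   kubectl exec -n rok daemonset/rok -- rok-config version | rok_versions_agree
```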
    

3. Upgrade Rok Disk Manager

  1. Apply the v0.15 Rok Disk Manager manifests:

    $ kubectl apply -k rok/rok-disk-manager/overlays/deploy
    
Component         v0.14  v0.15
RokCluster CR     ✓
RokCluster CRD    ✓
Rok Operator      ✓
Rok Disk Manager         ✓
Rok kmod          ✓
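Before applying an overlay, in this step or in the following ones, it can help to preview what would change on the live cluster. The sketch below uses kubectl diff, which exits with 0 when nothing differs, 1 when differences exist, and a higher code on errors. The helper name preview_overlay is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: preview what `kubectl apply -k <overlay>` would change.
preview_overlay() {
    overlay=$1
    rc=0
    kubectl diff -k "$overlay" || rc=$?
    case $rc in
        0) echo "no changes pending for $overlay" ;;
        1) echo "changes pending for $overlay" ;;
        *) echo "kubectl diff failed for $overlay (exit $rc)" >&2
           return "$rc" ;;
    esac
}

# Usage:
#   preview_overlay rok/rok-disk-manager/overlays/deploy
```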

4. Upgrade Rok kmod

  1. Apply the v0.15 Rok kmod manifests:

    $ kubectl apply -k rok/rok-kmod/overlays/deploy
    
Component         v0.14  v0.15
RokCluster CR     ✓
RokCluster CRD    ✓
Rok Operator      ✓
Rok Disk Manager         ✓
Rok kmod                 ✓

5. Upgrade Rok Operator

  1. Delete the Operator StatefulSet manually:

    $ kubectl delete sts -n rok-system rok-operator
    

    Note

    Due to bad handling of the SIGTERM signal in v0.14, the Operator Pod might take 30-45 seconds to delete. This issue has been resolved in v0.15.

  2. Apply the v0.15 Operator manifests:

    $ kubectl apply -k rok/rok-operator/overlays/deploy
    

Note

The above command updates the RokCluster CRD.

After the manifests have been applied, ensure Rok Operator has become ready before continuing to upgrade the Rok cluster by running the following command:

$ watch kubectl get pods -n rok-system -l app=rok-operator

Component         v0.14  v0.15
RokCluster CR     ✓
RokCluster CRD           ✓
Rok Operator             ✓
Rok Disk Manager         ✓
Rok kmod                 ✓
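The Operator upgrade in this step can also be run as one function that stops at the first failure. This is a sketch rather than a supported procedure: resource names, namespace and label come from the commands above, the timeouts are arbitrary, and the final check assumes the StatefulSet uses the default RollingUpdate strategy, which kubectl rollout status requires.

```shell
#!/bin/sh
# Sketch of the Operator upgrade as a single function (hypothetical name).
upgrade_rok_operator() {
    # Remove the v0.14 StatefulSet, whose spec cannot be updated in place.
    kubectl delete sts -n rok-system rok-operator || return 1

    # Wait for the old Operator Pod to disappear. Some kubectl versions report
    # an error when the Pod is already gone, so a failure here is not fatal.
    kubectl wait --for=delete pod -n rok-system -l app=rok-operator --timeout=120s ||
        echo "warning: could not confirm deletion of the old Operator Pod" >&2

    # Re-create the Operator (and update the CRDs) from the v0.15 manifests.
    kubectl apply -k rok/rok-operator/overlays/deploy || return 1

    # Block until the new StatefulSet reports its Pods as ready.
    kubectl rollout status statefulset/rok-operator -n rok-system --timeout=300s
}

# Usage:
#   upgrade_rok_operator && echo "Operator is ready"
```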

Warning

Until the cluster CR is also updated, the Operator will attempt to reconcile the v0.14 cluster CR based on the v0.15 CRD. This will lead to API errors, which can be viewed in the Operator logs.

For example, the Operator won’t be able to find the newly added, required spec.redis.endpoint key in the cluster CR spec and will fail to sync resources. In addition, the Operator won’t be able to patch the existing RokCluster to set its status.

Important

This is a known limitation that we plan to overcome with proper CRD versioning in the near future.

6. Upgrade Rok cluster

  1. Delete the existing Rok StorageClass manually:

    $ kubectl delete storageclass rok
    
  2. Rok v0.15 uses Redis as an in-memory data store. If you don’t already run Redis on your Kubernetes cluster, you can spin one up with:

    $ kubectl apply -k rok/rok-external-services/redis/overlays/deploy
    
  3. Rok v0.15 uses Istio (1.3.1) to expose itself to the outside world. If you don’t already have Istio (1.3.1) running on your Kubernetes cluster, you can install it with:

    $ kubectl apply -k rok/rok-external-services/istio/istio-1-3-1/istio-crds-1-3-1/overlays/deploy
    $ kubectl apply -k rok/rok-external-services/istio/istio-1-3-1/istio-install-1-3-1/overlays/deploy
    $ kubectl apply -f rok/rok-istio-ingress/rok-istio-1-3-1.yaml
    
  4. Rok v0.15 integrates with Dex as OIDC Provider. You can install it with:

    $ kubectl apply -k rok/rok-external-services/dex/overlays/deploy
    
  5. Rok v0.15 uses AuthService as its authentication proxy. You can install it with:

    $ kubectl apply -k rok/rok-external-services/authservice/overlays/deploy
    
  6. Make sure that the desired version is specified for both Rok and Rok CSI in the Rok cluster’s kustomize patches, for example:

    ...
    spec:
      ...
      images:
        rok: gcr.io/arrikto/roke:v0.15
        rokCSI: gcr.io/arrikto/rok-csi:v0.15
    
  7. Make sure that the desired Redis endpoint is specified in the spec of the RokCluster CR, for example:

    ...
    spec:
      ...
      redis:
        endpoint: redis://rok-redis.rok.svc.cluster.local:6379
    
  8. Apply the v0.15 Rok cluster manifests:

    $ kubectl apply -k rok/rok-cluster/overlays/deploy
    
Component         v0.14  v0.15
RokCluster CR            ✓
RokCluster CRD           ✓
Rok Operator             ✓
Rok Disk Manager         ✓
Rok kmod                 ✓
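Before running the final apply of this step, the rendered manifests can be checked for the image tags and the Redis endpoint required above. The sketch below is a plain text match, not a YAML parser: it assumes each key and its value share one line and that the image tag equals the target version. The helper name check_rokcluster_spec is hypothetical.

```shell
#!/bin/sh
# Hypothetical pre-flight check: read rendered manifests on stdin and verify
# that the rok and rokCSI images carry the wanted tag and that a Redis
# endpoint is set. The tag is used as a regular expression, so a "." in it
# matches any character.
check_rokcluster_spec() {
    want_tag=$1
    manifest=$(cat)
    status=0
    for key in rok rokCSI; do
        printf '%s\n' "$manifest" |
            grep -Eq "^[[:space:]]+${key}:[[:space:]]+[^[:space:]]+:${want_tag}\$" ||
            { echo "missing or unexpected image tag for '${key}'" >&2; status=1; }
    done
    printf '%s\n' "$manifest" |
        grep -Eq '^[[:space:]]+endpoint:[[:space:]]+redis://' ||
        { echo "missing spec.redis.endpoint" >&2; status=1; }
    return "$status"
}

# Usage:
#   kubectl kustomize rok/rok-cluster/overlays/deploy | check_rokcluster_spec v0.15
```

Because the overlay renders several resources, a match only shows that such lines exist somewhere in the output. Inspect the RokCluster resource itself if the check passes unexpectedly.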

The v0.15 Operator will now attempt to reconcile the v0.15 cluster CR based on the v0.15 CRD. The cluster upgrade Job will upgrade the cluster config on etcd. There are no GW or Fort migrations between v0.14 and v0.15.

During the upgrade, you should be able to view log messages similar to the following:

[INFO] [rokcluster=rok, ns=rok] Current version: `v0.14', Desired version: `v0.15'
[INFO] [rokcluster=rok, ns=rok] Cluster needs to be upgraded from `v0.14' to `v0.15'
[INFO] [rokcluster=rok, ns=rok] Upgrading cluster to version: `v0.15'
[WARNING] [rokcluster=rok, ns=rok] Scaling cluster components to zero, one-by-one. Downtime is expected during the upgrade.
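To follow these messages live, the Operator logs can be streamed and filtered by cluster. The sketch below assumes that the messages are emitted by the Operator Pod, which carries the app=rok-operator label used earlier in this guide. Both function names are hypothetical.

```shell
#!/bin/sh
# Keep only log lines tagged with a given RokCluster name and namespace.
# $1: cluster name, $2: namespace; matches the "[rokcluster=..., ns=...]" tag.
filter_cluster_logs() {
    grep -F "[rokcluster=$1, ns=$2]"
}

# Assumption: the upgrade messages come from the Operator Pod in rok-system.
follow_upgrade_logs() {
    kubectl logs -n rok-system -l app=rok-operator --follow --tail=50 |
        filter_cluster_logs rok rok
}

# Usage:
#   follow_upgrade_logs
```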

7. Verify successful upgrade

  1. Check the status of the cluster upgrade Job:

    $ kubectl get job -n rok rok-upgrade-XYZ
    
  2. Ensure that Rok is up and running after the upgrade Job finishes:

    $ kubectl get rokcluster -n rok rok
    NAME   VERSION  HEALTH   TOTAL MEMBERS   READY MEMBERS   PHASE     AGE
    rok    v0.15    OK       1               1               Running   1h18m
    
  3. Get into the Rok Pod:

    $ kubectl exec -ti -n rok daemonset/rok -- /bin/bash
    
  4. Verify the config version of your cluster:

    root@rok-fkmvz|rok-minikube|rok.rok.svc.cluster.local:/# rok-config version
    Current version [Cluster-wide]: v001500_0010
    Desired version [This appliance]: v001500_0010
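
The checks in this step can also be scripted. The sketch below parses the output of kubectl get rokcluster and succeeds only if the cluster reports the expected version, HEALTH "OK" and PHASE "Running". It assumes the column order shown above; the helper name rokcluster_is_healthy is hypothetical, and rok-upgrade-XYZ stands for the actual Job name, as in the step above.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get rokcluster` output on stdin.
# $1: cluster name, $2: expected version.
# Data rows are split on whitespace: NAME, VERSION, HEALTH, total members,
# ready members, PHASE, AGE.
rokcluster_is_healthy() {
    awk -v name="$1" -v version="$2" '
        $1 == name {
            found = 1
            ok = ($2 == version && $3 == "OK" && $6 == "Running")
        }
        END { exit !(found && ok) }'
}

# Usage:
#   kubectl wait --for=condition=complete -n rok job/rok-upgrade-XYZ --timeout=600s
#   kubectl get rokcluster -n rok rok | rokcluster_is_healthy rok v0.15
```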