Protect Rok System Pods

This guide describes the steps needed to patch an existing Rok cluster on Kubernetes, so that essential Rok System Pods are protected from termination under memory pressure.

Procedure

  1. Get the version of your Rok Operator:

    root@rok-tools:~# kubectl get -n rok-system sts rok-operator --no-headers \
    >     -o custom-columns=:.spec.template.spec.containers[0].image
    gcr.io/arrikto-deploy/rok-operator:release-1.1-l0-release-1.1

    If the image tag of your Rok Operator is release-1.1-l0-release-1.1 or newer, you can skip the rest of this procedure and proceed directly to the Verify section.

  2. Watch the rok-csi-controller logs and ensure that no pipelines or snapshot policies are running, that is, that nothing is logged for 30 seconds:

    root@rok-tools:~# kubectl -n rok logs -l app=rok-csi-controller -c csi-controller -f --tail=100

  3. Scale down the rok-operator StatefulSet:

    root@rok-tools:~# kubectl -n rok-system scale sts rok-operator --replicas=0
    statefulset.apps/rok-operator scaled

  4. Ensure rok-operator has scaled down to zero:

    root@rok-tools:~# kubectl -n rok-system get sts rok-operator
    NAME           READY   AGE
    rok-operator   0/0     2h

  5. Scale down the rok-csi-controller StatefulSet:

    root@rok-tools:~# kubectl -n rok scale sts rok-csi-controller --replicas=0
    statefulset.apps/rok-csi-controller scaled

  6. Ensure rok-csi-controller has scaled down to zero:

    root@rok-tools:~# kubectl get -n rok sts rok-csi-controller
    NAME                 READY   AGE
    rok-csi-controller   0/0     2h

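    As an optional sanity check, you can also list Pods using the same app=rok-csi-controller label as the log command above. If the StatefulSet has fully scaled down, kubectl should report that no resources were found:

    root@rok-tools:~# kubectl -n rok get pods -l app=rok-csi-controller
    No resources found in rok namespace.
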
  7. Watch the rok-csi-node logs and ensure that all pending operations have finished, that is, that nothing is logged for 30 seconds:

    root@rok-tools:~# kubectl -n rok logs -l app=rok-csi-node -c csi-node -f --tail=100

  8. Delete the rok-csi-node DaemonSet:

    root@rok-tools:~# kubectl -n rok delete ds rok-csi-node
    daemonset.apps "rok-csi-node" deleted

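    Optionally, verify that the DaemonSet is gone; kubectl is expected to report a NotFound error:

    root@rok-tools:~# kubectl -n rok get ds rok-csi-node
    Error from server (NotFound): daemonsets.apps "rok-csi-node" not found
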
  9. Specify the image for the new rok-operator, which will assign the pre-defined critical Priority Classes (system-node-critical and system-cluster-critical) to the Rok and Rok CSI resources:

    root@rok-tools:~# export ROK_OPERATOR_IMAGE=gcr.io/arrikto-deploy/rok-operator:release-1.1-l0-release-1.1
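
    Note that system-node-critical and system-cluster-critical are Priority Classes that Kubernetes defines out of the box, so nothing extra needs to be created. You can inspect them and their priority values with the following command; apart from AGE, your output should look similar to this:

    root@rok-tools:~# kubectl get priorityclass system-node-critical system-cluster-critical
    NAME                      VALUE        GLOBAL-DEFAULT   AGE
    system-node-critical      2000001000   false            2h
    system-cluster-critical   2000000000   false            2h
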
  10. Patch rok-operator to pull the new image:

    root@rok-tools:~# kubectl -n rok-system patch sts rok-operator \
    >     --patch "{\"spec\": {\"template\": {\"spec\": {\"containers\": [{\"name\": \"rok-operator\", \"image\": \"${ROK_OPERATOR_IMAGE}\"}]}}}}"
    statefulset.apps/rok-operator patched

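    Optionally, repeat the query of step 1 to confirm that the StatefulSet now references the new image:

    root@rok-tools:~# kubectl get -n rok-system sts rok-operator --no-headers \
    >     -o custom-columns=:.spec.template.spec.containers[0].image
    gcr.io/arrikto-deploy/rok-operator:release-1.1-l0-release-1.1
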
  11. Scale rok-operator back up to its initial size to recreate the Rok and Rok CSI resources:

    root@rok-tools:~# kubectl -n rok-system scale sts rok-operator --replicas=1
    statefulset.apps/rok-operator scaled
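
    Optionally, you can block until the new rok-operator Pod becomes Ready before moving on to verification:

    root@rok-tools:~# kubectl -n rok-system rollout status sts rok-operator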

Verify

  1. Ensure that the Rok cluster is up and running:

    root@rok-tools:~# watch kubectl get rokcluster -n rok
    NAME   VERSION                          HEALTH   TOTAL MEMBERS   READY MEMBERS   PHASE     AGE
    rok    release-1.1-l0-release-1.1-rc5   OK       3               3               Running   2h

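    You can also list the Pods in the rok namespace; all of them should eventually reach the Running state:

    root@rok-tools:~# kubectl -n rok get pods
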
  2. Ensure that rok, rok-csi-node, and rok-csi-guard now have the system-node-critical Priority Class:

    root@rok-tools:~# kubectl get -n rok daemonset rok --no-headers \
    >     -o custom-columns=:.spec.template.spec.priorityClassName
    system-node-critical

    root@rok-tools:~# kubectl get -n rok daemonset rok-csi-node --no-headers \
    >     -o custom-columns=:.spec.template.spec.priorityClassName
    system-node-critical

    root@rok-tools:~# kubectl get -n rok deploy rok-csi-guard --no-headers \
    >     -o custom-columns=:.spec.template.spec.priorityClassName
    system-node-critical

  3. Ensure that rok-csi-controller now has the system-cluster-critical Priority Class:

    root@rok-tools:~# kubectl get -n rok sts rok-csi-controller --no-headers \
    >     -o custom-columns=:.spec.template.spec.priorityClassName
    system-cluster-critical
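
    As a final sanity check, you can inspect the priority value that Kubernetes resolved on the running Rok Pods. The app=rok label below is an assumption; adjust it to match the labels of your deployment. Each Pod should report 2000001000, the value of system-node-critical:

    root@rok-tools:~# kubectl -n rok get pods -l app=rok --no-headers \
    >     -o custom-columns=:.spec.priority
    2000001000
    2000001000
    2000001000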

Summary

You have successfully patched all Rok System Pods with the highest pre-defined Kubernetes Priority Classes, protecting them against eviction and termination under memory pressure.

What’s Next

The next step is to protect the Rok External Services Pods.