Scale Down Rok etcd

This guide will walk you through reducing the size of your Rok etcd cluster by removing one of its members. To remove more than one member, repeat this guide once for each member.

Important

Scaling Rok etcd down to fewer than three members will leave Rok etcd unable to serve requests if it loses even one of its remaining members. Therefore, we recommend that you always run at least three Rok etcd cluster members to prevent cluster unavailability.
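An etcd cluster needs a quorum of ⌊N/2⌋ + 1 members to serve requests. For example, a three-member cluster has a quorum of two and tolerates the loss of one member, whereas a two-member cluster also has a quorum of two and tolerates none.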

See also

Official guide on etcd clustering.

Check Your Environment

  1. Retrieve the endpoints of all etcd cluster members:

    root@rok-tools:~/ops/deployments# export ETCD_ENDPOINTS=$(kubectl \
    >     exec -ti -n rok sts/rok-etcd -- etcdctl member list -w json \
    >     | jq -r '.members[].clientURLs[]' | paste -sd, -)
  2. Ensure that the etcd cluster is currently healthy. Inspect the etcd endpoints and verify that the HEALTH field is true for all endpoints. If any endpoint is unhealthy, resolve the issue before proceeding, since removing a member from a degraded cluster risks losing quorum:

    root@rok-tools:~# kubectl exec -ti -n rok sts/rok-etcd -c etcd -- \
    >     etcdctl --endpoints ${ETCD_ENDPOINTS?} endpoint health -w table
    +--------------------------------------+--------+------------+-------+
    |               ENDPOINT               | HEALTH |    TOOK    | ERROR |
    +--------------------------------------+--------+------------+-------+
    | rok-etcd-0.rok-etcd-cluster.rok:2379 |   true | 9.302141ms |       |
    | rok-etcd-1.rok-etcd-cluster.rok:2379 |   true | 9.325642ms |       |
    | rok-etcd-2.rok-etcd-cluster.rok:2379 |   true | 9.317423ms |       |
    +--------------------------------------+--------+------------+-------+

Procedure

  1. Go to your GitOps repository, inside your rok-tools management environment:

    root@rok-tools:~# cd ~/ops/deployments
  2. Retrieve the current size of the etcd cluster:

    root@rok-tools:~/ops/deployments# export ETCD_CLUSTER_SIZE=$(kubectl get sts \
    >     -n rok rok-etcd -o jsonpath="{.spec.replicas}") \
    >     && echo ${ETCD_CLUSTER_SIZE?}
    3
  3. Decrease the Rok etcd cluster size by one:

    root@rok-tools:~/ops/deployments# let ETCD_CLUSTER_SIZE--
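    As a sanity check, echo the new value. Continuing the three-member example above, it should now be 2:

    root@rok-tools:~/ops/deployments# echo ${ETCD_CLUSTER_SIZE?}
    2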
  4. Set the name of the last etcd cluster member. Since StatefulSet ordinals start at zero, the decremented ETCD_CLUSTER_SIZE is exactly the ordinal of the member you are removing:

    root@rok-tools:~/ops/deployments# export \
    >     NAME=rok-etcd-${ETCD_CLUSTER_SIZE?}.rok-etcd-cluster.rok
  5. Retrieve the ID of the last etcd cluster member:

    root@rok-tools:~/ops/deployments# export ID=$(kubectl exec -ti -n rok sts/rok-etcd -c etcd -- \
    >     etcdctl member list -w json --hex \
    >     | jq -r '.members[] | select(.name == "'${NAME?}'") | .ID') \
    >     && echo ${ID?}
    39212b442e2e4e54
  6. Remove the member from the etcd cluster:

    root@rok-tools:~/ops/deployments# kubectl exec -ti -n rok sts/rok-etcd -c etcd -- \
    >     etcdctl member remove ${ID?}
    Member 39212b442e2e4e54 removed from cluster 844c2991de84c0b
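    Optionally, confirm that the cluster now reports one member fewer. Continuing the example, two members should remain:

    root@rok-tools:~/ops/deployments# kubectl exec -ti -n rok sts/rok-etcd -c etcd -- \
    >     etcdctl member list | wc -l
    2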
  7. Render the patch for the etcd cluster size:

    root@rok-tools:~/ops/deployments# j2 \
    >     rok/rok-external-services/etcd/overlays/deploy/patches/cluster-size.yaml.j2 \
    >     -o rok/rok-external-services/etcd/overlays/deploy/patches/cluster-size.yaml
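    Note that j2 (the j2cli tool used here) renders the template using your current environment variables when no data file is given, so ETCD_CLUSTER_SIZE must still be exported in this shell session. A quick check:

    root@rok-tools:~/ops/deployments# env | grep ETCD_CLUSTER_SIZE
    ETCD_CLUSTER_SIZE=2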
  8. Set the etcd cluster state to existing, so that the remaining members rejoin the existing cluster on restart instead of bootstrapping a new one:

    root@rok-tools:~/ops/deployments# export ETCD_CLUSTER_STATE=existing
  9. Render the patch for the etcd cluster state:

    root@rok-tools:~/ops/deployments# j2 \
    >     rok/rok-external-services/etcd/overlays/deploy/patches/cluster-state.yaml.j2 \
    >     -o rok/rok-external-services/etcd/overlays/deploy/patches/cluster-state.yaml
  10. Edit rok/rok-external-services/etcd/overlays/deploy/kustomization.yaml and ensure that both cluster-size and cluster-state patches are enabled:

    patches:
    - path: patches/cluster-size.yaml
      target:
        kind: StatefulSet
        name: etcd
    - path: patches/cluster-state.yaml
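    Optionally, review all pending changes before committing. The exact diff depends on your previous configuration, so no output is shown here:

    root@rok-tools:~/ops/deployments# git diff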
  11. Commit your changes:

    root@rok-tools:~/ops/deployments# git commit -am "Scale Rok etcd to ${ETCD_CLUSTER_SIZE?} members"
  12. Apply the kustomization:

    root@rok-tools:~/ops/deployments# rok-deploy --apply rok/rok-external-services/etcd/overlays/deploy
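    The StatefulSet controller will then terminate the Pod with the highest ordinal, rok-etcd-2 in the running example. You can wait for the scale-down to settle:

    root@rok-tools:~/ops/deployments# kubectl rollout status -n rok sts/rok-etcd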
  13. Set the PVC name of the removed member:

    root@rok-tools:~/ops/deployments# export PVC=data-rok-etcd-${ETCD_CLUSTER_SIZE?}
  14. Delete the PVC of the removed member:

    root@rok-tools:~/ops/deployments# kubectl delete pvc -n rok ${PVC?}
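    Optionally, confirm the deletion. kubectl should report that the PVC no longer exists; the name below assumes the running example:

    root@rok-tools:~/ops/deployments# kubectl get pvc -n rok ${PVC?}
    Error from server (NotFound): persistentvolumeclaims "data-rok-etcd-2" not found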

Verify

  1. Ensure that the remaining etcd Pods are ready and that the Pod of the removed member is gone. Verify that field READY is 2/2 and field STATUS is Running for all remaining Pods:

    root@rok-tools:~/ops/deployments# watch kubectl get pods -n rok -l app=etcd
    Every 2.0s: kubectl get pods -n rok -l app=etcd    rok-tools: Mon Aug 8 12:36:35 2022

    NAME         READY   STATUS    RESTARTS   AGE
    rok-etcd-0   2/2     Running   0          2d22h
    rok-etcd-1   2/2     Running   0          2d22h
  2. Ensure that the PVC of the removed member has been deleted and that only the PVCs of the remaining members are left. Verify that field STATUS is Bound for all remaining PVCs:

    root@rok-tools:~/ops/deployments# kubectl get pvc -n rok -l app=etcd
    NAME              STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    data-rok-etcd-0   Bound    pvc-f1a26d8b-a172-4bea-8700-c10c125f64e4   80Gi       RWO            gp2            42d
    data-rok-etcd-1   Bound    pvc-4afa5bf2-cf8b-40c7-ad5b-8df05192af7a   80Gi       RWO            gp2            11d
  3. Retrieve the endpoints of all etcd cluster members:

    root@rok-tools:~/ops/deployments# export ETCD_ENDPOINTS=$(kubectl \
    >     exec -ti -n rok sts/rok-etcd -- etcdctl member list -w json \
    >     | jq -r '.members[].clientURLs[]' | paste -sd, -)
  4. Ensure that the etcd cluster is currently healthy. Inspect the etcd endpoints and verify that the HEALTH field is true for all endpoints:

    root@rok-tools:~# kubectl exec -ti -n rok sts/rok-etcd -c etcd -- \
    >     etcdctl --endpoints ${ETCD_ENDPOINTS?} endpoint health -w table
    +--------------------------------------+--------+------------+-------+
    |               ENDPOINT               | HEALTH |    TOOK    | ERROR |
    +--------------------------------------+--------+------------+-------+
    | rok-etcd-0.rok-etcd-cluster.rok:2379 |   true | 9.302141ms |       |
    | rok-etcd-1.rok-etcd-cluster.rok:2379 |   true | 9.325642ms |       |
    +--------------------------------------+--------+------------+-------+
  5. Ensure that the Rok etcd cluster has the expected member count. Verify that the output of the following command matches the new cluster size, for example 2:

    root@rok-tools:~# kubectl exec -ti -n rok sts/rok-etcd -c etcd -- \
    >     etcdctl member list | wc -l
    2
  6. List the members of the etcd cluster. Verify that the removed member no longer appears, and that field STATUS is started and field IS LEARNER is false for all remaining members:

    root@rok-tools:~/ops/deployments# kubectl exec -ti -n rok sts/rok-etcd -c etcd -- \
    >     etcdctl member list -w table
    +------------------+---------+---------------------------------+---------------------------------------------+---------------------------------------------+------------+
    |        ID        | STATUS  |              NAME               |                 PEER ADDRS                  |                CLIENT ADDRS                 | IS LEARNER |
    +------------------+---------+---------------------------------+---------------------------------------------+---------------------------------------------+------------+
    | b2ff88bb2eae13b7 | started | rok-etcd-0.rok-etcd-cluster.rok | http://rok-etcd-0.rok-etcd-cluster.rok:2380 | http://rok-etcd-0.rok-etcd-cluster.rok:2379 | false      |
    | f823900dacf44825 | started | rok-etcd-1.rok-etcd-cluster.rok | http://rok-etcd-1.rok-etcd-cluster.rok:2380 | http://rok-etcd-1.rok-etcd-cluster.rok:2379 | false      |
    +------------------+---------+---------------------------------+---------------------------------------------+---------------------------------------------+------------+

Summary

You have successfully removed a member from the Rok etcd cluster.

What’s Next

Check out the rest of the maintenance operations you can perform on your Rok etcd cluster.