Scale In EKS Cluster¶
EKF supports automatic scaling operations on the Kubernetes cluster using a modified version of the Cluster Autoscaler that supports Rok volumes.
This guide will walk you through manually scaling in your EKS cluster by selecting and removing nodes one by one.
See also
- Scale In Kubernetes Cluster using rok-k8s-drain to forcefully scale your EKS cluster to a desired size.
What You'll Need¶
- A configured management environment.
- An existing EKS cluster.
- One or more managed or self-managed node groups.
Optional
- A working Cluster Autoscaler.
Procedure¶
Go to your GitOps repository, inside your rok-tools management environment:
root@rok-tools:~# cd ~/ops/deployments
List the Kubernetes nodes of your cluster:
root@rok-tools:~# kubectl get nodes
NAME                                               STATUS   ROLES    AGE   VERSION
ip-192-168-147-191.eu-central-1.compute.internal   Ready    <none>   18d   v1.21.5-eks-bc4871b
ip-192-168-168-207.eu-central-1.compute.internal   Ready    <none>   18d   v1.21.5-eks-bc4871b
Specify the node you want to remove:
root@rok-tools:~# export NODE=<NODE>
Replace <NODE> with the node name. For example:
root@rok-tools:~# export NODE=ip-192-168-168-207.eu-central-1.compute.internal
Note
Normally, the Cluster Autoscaler finds a scale-in candidate automatically. In order to find a good candidate manually, you have to:
- Pick an underutilized node.
- Ensure that you don't try to scale in past the ASG's minSize.
- Ensure that existing EBS volumes are reachable from other nodes in the cluster.
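For example, you could spot an underutilized node and confirm how much room each Auto Scaling group has above its minSize with commands like the following. This is only a sketch: kubectl top requires the Kubernetes Metrics Server, and the --query expression is illustrative.
root@rok-tools:~# # requires the Metrics Server; shows CPU/memory usage per node
root@rok-tools:~# kubectl top nodes
root@rok-tools:~# aws autoscaling describe-auto-scaling-groups \
>    --query 'AutoScalingGroups[].[AutoScalingGroupName,MinSize,DesiredCapacity]' \
>    --output table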
Start a drain operation for the selected node:
root@rok-tools:~# kubectl drain --ignore-daemonsets --delete-local-data ${NODE?}
...
node/ip-192-168-168-207.eu-central-1.compute.internal evicted
Note
This may take a while, since Rok is unpinning all volumes on this node and, as such, rok-csi-guard pods are expected to be evicted last.
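If you want to follow the drain's progress, one option (not part of the procedure itself) is to list the Pods still scheduled on the node; the field selector below is standard kubectl:
root@rok-tools:~# # lists Pods that have not been evicted from the node yet
root@rok-tools:~# kubectl get pods --all-namespaces \
>    --field-selector spec.nodeName=${NODE?}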
Warning
Do not delete rok-csi-guard pods manually, since this might cause data loss.
Troubleshooting
The command does not complete.
Most likely, the unpinning of a Rok PVC is failing. Inspect the logs of the Rok CSI controller to debug further.
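As a sketch of that inspection, assuming the Rok CSI controller runs as a StatefulSet named rok-csi-controller in the rok namespace (adjust the namespace and name to match your deployment):
root@rok-tools:~# # namespace and StatefulSet name are assumptions; adjust to your deployment
root@rok-tools:~# kubectl logs -n rok statefulset/rok-csi-controller \
>    --all-containers --tail=100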
Once the drain operation completes, remove the node.
Fast Forward
Skip this step if you have a Cluster Autoscaler instance running in your cluster: it will see the drained node, consider it unneeded, and, after a period of time (based on the scale-down-unneeded-time option), automatically terminate the EC2 instance and decrement the desired size of the Auto Scaling group.
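If you are unsure what that period is, and assuming the Cluster Autoscaler runs as a Deployment named cluster-autoscaler in the kube-system namespace (names vary per installation), you can check its arguments; upstream, scale-down-unneeded-time defaults to 10 minutes when the flag is not set:
root@rok-tools:~# # Deployment name and namespace are assumptions; adjust to your setup
root@rok-tools:~# kubectl get deployment cluster-autoscaler -n kube-system -o yaml \
>    | grep scale-down-unneeded-time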
Find the EC2 instance of the drained node:
root@rok-tools:~# export INSTANCE=$(kubectl get nodes ${NODE?} \
>    -o jsonpath={.spec.providerID} \
>    | sed 's|aws:///.*/||')
Terminate the instance and decrement the desired capacity of its Auto Scaling group:
root@rok-tools:~# aws autoscaling terminate-instance-in-auto-scaling-group \
>    --instance-id ${INSTANCE?} \
>    --should-decrement-desired-capacity
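To follow the termination itself, one option (not part of the procedure itself) is to list the most recent scaling activities in the region:
root@rok-tools:~# # shows recent Auto Scaling activities, including the termination above
root@rok-tools:~# aws autoscaling describe-scaling-activities --max-items 5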
Verify¶
Ensure that the selected node has been removed from your Kubernetes cluster:
root@rok-tools:~# kubectl get nodes ${NODE?}
Error from server (NotFound): nodes "ip-192-168-168-207.eu-central-1.compute.internal" not found
Ensure that the underlying instance has been deleted:
root@rok-tools:~# aws ec2 describe-instances --instance-ids ${INSTANCE?}
An error occurred (InvalidInstanceID.NotFound) when calling the DescribeInstances operation: The instance ID 'i-0f992f0b02d777901' does not exist
What’s Next¶
Check out the rest of the EKS maintenance operations that you can perform on your cluster.