Create Node Pool

In this section you will add a node pool to your GKE cluster. This node pool will host all Arrikto EKF-related workloads.

Note

When creating the GKE cluster, a default node pool is created as well. If you want to use the default node pool, you may proceed to the What's Next section.
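
If you are not sure which node pools your cluster already has, you can list them with the following optional command. It assumes that the GKE_CLUSTER environment variable is already set, as in the rest of this guide:

    root@rok-tools:~# gcloud container node-pools list --cluster ${GKE_CLUSTER?}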

What You'll Need

This section assumes that you have an existing GKE cluster and that the GKE_CLUSTER and CLUSTER_ADMIN_ACCOUNT environment variables are set in your rok-tools management environment.

Procedure

To create a node pool:

  1. Specify the name of the node pool:

    root@rok-tools:~# export NODE_POOL_NAME=workers
    
  2. Specify the Kubernetes version of the node pool:

    root@rok-tools:~# export NODE_VERSION=1.19.16-gke.1500
    
  3. Specify the machine type:

    root@rok-tools:~# export MACHINE_TYPE=n1-standard-8
    
  4. Specify the number of nodes to create:

    root@rok-tools:~# export NUM_NODES=3
    
  5. Specify the number of local NVMe SSDs to add:

    root@rok-tools:~# export NUM_SSD=3
    

    Note

    Each local NVMe SSD is 375 GB in size. You can attach a maximum of 24 local SSD partitions per instance, for a total of 9 TB. For a quick way to compute the scratch capacity that your setting yields, see the example right after this procedure.

  6. Create the node pool:

    root@rok-tools:~# gcloud alpha container node-pools create ${NODE_POOL_NAME?} \
    >   --account ${CLUSTER_ADMIN_ACCOUNT?} \
    >   --cluster ${GKE_CLUSTER?} \
    >   --node-version ${NODE_VERSION?} \
    >   --machine-type ${MACHINE_TYPE?} \
    >   --image-type UBUNTU \
    >   --disk-type pd-ssd \
    >   --disk-size 200 \
    >   --local-ssd-volumes count=${NUM_SSD?},type=nvme,format=block \
    >   --metadata disable-legacy-endpoints=true \
    >   --workload-metadata=GKE_METADATA \
    >   --scopes gke-default \
    >   --num-nodes ${NUM_NODES?} \
    >   --no-enable-autoupgrade \
    >   --max-surge-upgrade 1 \
    >   --max-unavailable-upgrade 0 \
    >   --no-enable-autorepair
    

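Optionally, before moving on, you can do a quick back-of-the-envelope check of the local NVMe scratch capacity that each node in the pool will get. This is a minimal sketch based on the NUM_SSD value you set above and the 375 GB size of each local SSD:

    root@rok-tools:~# echo "$((${NUM_SSD?} * 375)) GB of local NVMe storage per node"
    1125 GB of local NVMe storage per node
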
Verify

  1. Verify that the node pool exists and its status is RUNNING:

    root@rok-tools:~# gcloud container node-pools describe ${NODE_POOL_NAME?} \
    >    --cluster=${GKE_CLUSTER?}
    ...
    name: workers
    ...
    status: RUNNING
    
  2. Verify that the nodes show up in the Kubernetes cluster (a node pool-specific filter is shown right after these verification steps):

    root@rok-tools:~# kubectl get nodes
    NAME                                         STATUS   ROLES   AGE     VERSION
    ...
    gke-arrikto-cluster-workers-1108b534-3rgs    Ready    <none>   26m   v1.19.16-gke.1500
    gke-arrikto-cluster-workers-1108b534-jztj    Ready    <none>   26m   v1.19.16-gke.1500
    gke-arrikto-cluster-workers-1108b534-wm2j    Ready    <none>   26m   v1.19.16-gke.1500
    
  3. Verify that all instances of your node pool have the necessary storage attached:

    1. Find the Instance group that corresponds to the workers node pool:

      root@rok-tools:~# export INSTANCE_GROUP=$(gcloud container node-pools describe ${NODE_POOL_NAME?} \
      >     --cluster=${GKE_CLUSTER?} \
      >     --format="value(instanceGroupUrls)")
      
    2. Find the Template of the Instance group:

      root@rok-tools:~# export TEMPLATE=$(gcloud compute instance-groups managed describe ${INSTANCE_GROUP?} \
      >     --format="value(instanceTemplate)")
      
    3. Inspect the Template and ensure that the kube-env metadata key contains the expected NODE_LOCAL_SSDS_EXT value:

      root@rok-tools:~# gcloud compute instance-templates describe ${TEMPLATE?} --format json | \
      >     jq -r '.properties.metadata.items[] | select(.key == "kube-env") | .value' | \
      >        grep NODE_LOCAL_SSDS
      NODE_LOCAL_SSDS_EXT: 3,nvme,block
      
    4. Inspect the Template and ensure that it has NVMe local SSDs attached. The command below lists all disks of type SCRATCH along with their interface, which should be NVME:

      root@rok-tools:~# gcloud compute instance-templates describe ${TEMPLATE?} --format json | \
      >     jq -r '.properties.disks[] | select(.type == "SCRATCH") | .index, .deviceName, .interface' | paste - - -
      1 local-ssd-0 NVME
      2 local-ssd-1 NVME
      3 local-ssd-2 NVME
      
    5. Ensure that all instances inside the instance group run with the desired template:

      root@rok-tools:~# gcloud compute instance-groups managed describe ${INSTANCE_GROUP?} \
      >    --format="value(status.versionTarget.isReached)"
      True
      

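If you want to list only the nodes of the new node pool, you can filter on the cloud.google.com/gke-nodepool label that GKE sets on its nodes. This is an optional check:

    root@rok-tools:~# kubectl get nodes -l cloud.google.com/gke-nodepool=${NODE_POOL_NAME?}
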
Summary

You have successfully created a node pool.

What's Next

Check out the rest of the maintenance operations that you can perform on your cluster.