Create GKE Cluster

This section will guide you through creating a GKE cluster using the Google Cloud SDK. After completing this guide, you will have a GKE cluster with:

  • Kubernetes 1.19.
  • Worker nodes with local NVMe SSDs.

Procedure

To create the GKE cluster, follow the steps below:

  1. Switch to your management environment and specify the cluster name:

    root@rok-tools:~# export CLUSTERNAME=arrikto-cluster
    
  2. Specify the Kubernetes version:

    root@rok-tools:~# export CLUSTER_VERSION=1.19.12-gke.2100
    
  3. Specify the name of the default node pool:

    root@rok-tools:~# export NODE_POOL_NAME=default-workers
    
  4. Specify the machine type:

    root@rok-tools:~# export MACHINE_TYPE=n1-standard-8
    
  5. Specify the number of nodes to create:

    root@rok-tools:~# export NUM_NODES=3
    
  6. Specify the number of local NVMe SSDs to add:

    root@rok-tools:~# export NUM_SSD=3
    

    Note

    Rok will automatically find and use all local SSDs, which are expected to be unformatted. Each local NVMe SSD has a fixed size of 375 GB. You can attach a maximum of 24 local SSDs per instance, for a total of 9 TB.
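
    The sizing in the note above can be sanity-checked with simple arithmetic. The sketch below is illustrative plain shell (not part of the procedure): it reuses the NUM_SSD value from step 6 and the GCP constants quoted in the note to print the local NVMe capacity each node will get:

```shell
# Sanity-check sketch: local NVMe capacity per node. Each GCP local SSD
# is a fixed 375 GB, and at most 24 can attach to a single instance.
NUM_SSD=3                 # value exported in step 6
SSD_SIZE_GB=375           # fixed size of a GCP local NVMe SSD
MAX_SSDS_PER_INSTANCE=24  # GCP per-instance cap (24 x 375 GB = 9 TB)

if [ "${NUM_SSD}" -gt "${MAX_SSDS_PER_INSTANCE}" ]; then
    echo "Cannot attach ${NUM_SSD} local SSDs: per-instance cap is ${MAX_SSDS_PER_INSTANCE}" >&2
    exit 1
fi
echo "Each node will get $((NUM_SSD * SSD_SIZE_GB)) GB of local NVMe storage"
```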

  7. Create the cluster:

    root@rok-tools:~# gcloud alpha container clusters create ${CLUSTERNAME?} \
    >   --account ${CLUSTER_ADMIN_ACCOUNT?} \
    >   --cluster-version ${CLUSTER_VERSION?} \
    >   --release-channel None \
    >   --no-enable-basic-auth \
    >   --node-pool-name ${NODE_POOL_NAME?} \
    >   --machine-type ${MACHINE_TYPE?} \
    >   --image-type UBUNTU \
    >   --disk-type pd-ssd \
    >   --disk-size 200 \
    >   --local-ssd-volumes count=${NUM_SSD?},type=nvme,format=block \
    >   --metadata disable-legacy-endpoints=True \
    >   --workload-pool=${PROJECT_ID?}.svc.id.goog \
    >   --scopes gke-default \
    >   --num-nodes ${NUM_NODES?} \
    >   --enable-stackdriver-kubernetes \
    >   --enable-ip-alias \
    >   --default-max-pods-per-node 110 \
    >   --no-enable-master-authorized-networks \
    >   --no-enable-intra-node-visibility \
    >   --addons HorizontalPodAutoscaling,HttpLoadBalancing,GcePersistentDiskCsiDriver \
    >   --max-surge-upgrade 1 \
    >   --max-unavailable-upgrade 0 \
    >   --no-enable-autoupgrade \
    >   --no-enable-autorepair \
    >   --enable-shielded-nodes
    
    Troubleshooting
    The command fails with ‘Insufficient regional quota to satisfy request: resource “SSD_TOTAL_GB”’

    Ensure that your region has a sufficient quota for local SSDs. To inspect the current usage and limit, run:

    root@rok-tools:~# gcloud compute regions describe ${REGION?} --format json | \
    >     jq -r '.quotas[] | select(.metric=="SSD_TOTAL_GB") | "\(.usage)/\(.limit)"'
    

    Either delete some resources to free up quota, or choose a different region or zone.
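
    Before retrying, you can estimate whether the cluster will fit in the remaining quota. The sketch below is illustrative only: USAGE and LIMIT are made-up stand-ins for the usage/limit pair printed by the jq command above, and the per-node figure counts the 200 GB pd-ssd boot disk plus the local SSDs requested by the create command:

```shell
# Sketch: estimate the SSD gigabytes the create command will request and
# compare against the regional quota. USAGE and LIMIT are illustrative
# stand-ins for the values reported by `gcloud compute regions describe`.
USAGE=500    # example: current SSD usage in the region, in GB
LIMIT=2000   # example: regional SSD limit, in GB
NUM_NODES=3
NUM_SSD=3
PER_NODE_GB=$((200 + NUM_SSD * 375))   # 200 GB pd-ssd boot disk + local SSDs
REQUESTED=$((NUM_NODES * PER_NODE_GB))
if [ $((USAGE + REQUESTED)) -gt "${LIMIT}" ]; then
    echo "Insufficient quota: need ${REQUESTED} GB, only $((LIMIT - USAGE)) GB free"
else
    echo "Quota OK: ${REQUESTED} GB fits in the remaining $((LIMIT - USAGE)) GB"
fi
```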

    Note

    This will create a zonal cluster with 3 nodes in the cluster’s primary zone. It will use the default network and subnet in the zone.

Verify

Switch back to your management environment and follow the steps below:

  1. Ensure that the GKE cluster exists and its status is RUNNING:

    root@rok-tools:~# gcloud container clusters describe ${CLUSTERNAME?}
    ...
    name: arrikto-cluster
    ...
    status: RUNNING
    
  2. Get the list of the node pools:

    root@rok-tools:~# gcloud container node-pools list --cluster=${CLUSTERNAME?}
    NAME             MACHINE_TYPE   DISK_SIZE_GB  NODE_VERSION
    default-workers  n1-standard-8  200           1.19.12-gke.2100
    
  3. Ensure that the default node pool exists and its status is RUNNING:

    root@rok-tools:~# gcloud container node-pools describe ${NODE_POOL_NAME?} \
    >   --cluster=${CLUSTERNAME?}
    ...
    name: default-workers
    ...
    status: RUNNING
    
  4. Verify that all instances of your node pool have the necessary storage attached:

    1. Find the instance group that corresponds to the worker node pool:

      root@rok-tools:~# export INSTANCE_GROUP=$(gcloud container node-pools describe ${NODE_POOL_NAME?} \
      >     --cluster=${CLUSTERNAME?} \
      >     --format="value(instanceGroupUrls)")
      
    2. Find the instance template of the instance group:

      root@rok-tools:~# export TEMPLATE=$(gcloud compute instance-groups managed describe ${INSTANCE_GROUP?} \
      >     --format="value(instanceTemplate)")
      
    3. Inspect the template and ensure that the kube-env metadata key has the expected NODE_LOCAL_SSDS_EXT value:

      root@rok-tools:~# gcloud compute instance-templates describe ${TEMPLATE?} --format json | \
      >     jq -r '.properties.metadata.items[] | select(.key == "kube-env") | .value' | \
      >        grep NODE_LOCAL_SSDS
      NODE_LOCAL_SSDS_EXT: 3,nvme,block
      
    4. Inspect the template and ensure that it has NVMe local SSDs attached. The command below lists all disks of type SCRATCH along with their interface, which should be NVME:

      root@rok-tools:~# gcloud compute instance-templates describe ${TEMPLATE?} --format json | \
      >     jq -r '.properties.disks[] | select(.type == "SCRATCH") | .index, .deviceName, .interface' | paste - - -
      1 local-ssd-0 NVME
      2 local-ssd-1 NVME
      3 local-ssd-2 NVME
      
    5. Ensure that all instances inside the instance group run with the desired template:

      root@rok-tools:~# gcloud compute instance-groups managed describe ${INSTANCE_GROUP?} \
      >    --format="value(status.versionTarget.isReached)"
      True
      
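
The NODE_LOCAL_SSDS_EXT value checked in step 3 above is a comma-separated <count>,<interface>,<format> triple. As an illustrative aside, the POSIX-shell sketch below (no gcloud needed; the value shown is the one the template is expected to carry) splits the triple and checks the count against the NUM_SSD you exported earlier:

```shell
# Sketch: split NODE_LOCAL_SSDS_EXT (<count>,<interface>,<format>) and
# verify the count matches the NUM_SSD exported in step 6 of the procedure.
NODE_LOCAL_SSDS_EXT="3,nvme,block"   # expected value from the template
NUM_SSD=3

COUNT=${NODE_LOCAL_SSDS_EXT%%,*}     # text before the first comma
rest=${NODE_LOCAL_SSDS_EXT#*,}       # text after the first comma
INTERFACE=${rest%%,*}
FORMAT=${rest#*,}

echo "count=${COUNT} interface=${INTERFACE} format=${FORMAT}"
[ "${COUNT}" -eq "${NUM_SSD}" ] || { echo "count mismatch" >&2; exit 1; }
```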

Summary

You have successfully created your GKE cluster.

What’s Next

The next step is to get access to your GKE cluster.