Provision EKS cluster

This section will guide you through the steps needed to create a fresh, 3-node EKS Kubernetes cluster on which to deploy Rok.

Important

This guide contains instructions to deploy an EKS cluster that runs Kubernetes 1.16.X. Rok already runs on Kubernetes 1.16.X and will transition to Kubernetes 1.17.X in upcoming releases. For more information, see the official EKS docs on Kubernetes versioning.

To provision an EKS cluster we need to create some IAM roles, as shown below. These IAM roles allow Amazon EKS and the Kubernetes control plane to manage AWS resources on your behalf.

Note

If you already have an EKS Kubernetes cluster targeted for installing Rok, you can skip this section.

Configure IAM

EKS Service Role

These steps follow the official EKS docs on how to create the service IAM role.

  1. Download the JSON Policy document directly inside rok-tools with:

    $ wget <download_root>/eks-assume-role-policy-document.json
    

    Optionally, you can download eks-assume-role-policy-document.json locally on your machine to view it.
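
    The downloaded file is expected to contain a standard trust policy that lets the EKS service assume the role, along the lines of the following sketch (your downloaded copy is the authoritative version):

    $ cat eks-assume-role-policy-document.json
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Service": "eks.amazonaws.com"
                },
                "Action": "sts:AssumeRole"
            }
        ]
    }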

  2. Create an IAM role for the EKS cluster:

    $ aws iam create-role \
    >     --role-name eksServiceRole \
    >     --assume-role-policy-document file://eks-assume-role-policy-document.json
    
  3. Attach policies to the IAM role you previously created:

    $ aws iam attach-role-policy \
    >     --role-name eksServiceRole \
    >     --policy-arn arn:aws:iam::aws:policy/AmazonEKSClusterPolicy
    $ aws iam attach-role-policy \
    >     --role-name eksServiceRole \
    >     --policy-arn arn:aws:iam::aws:policy/AmazonEKSServicePolicy
    
  4. Verify:

    $ aws iam get-role --role-name eksServiceRole
    $ aws iam list-attached-role-policies --role-name eksServiceRole
    

EKS Worker Node Role

These steps follow the official EKS docs on how to create the worker node IAM role.

  1. Download the JSON Policy document directly inside rok-tools with:

    $ wget <download_root>/ec2-assume-role-policy-document.json
    

    Optionally, you can download ec2-assume-role-policy-document.json locally on your machine to view it.

  2. Create an IAM role for the worker nodes:

    $ aws iam create-role \
    >     --role-name eksWorkerNodeRole \
    >     --assume-role-policy-document file://ec2-assume-role-policy-document.json
    

  3. Attach policies to the IAM role you previously created:

    $ aws iam attach-role-policy \
    >     --role-name eksWorkerNodeRole \
    >     --policy-arn arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
    $ aws iam attach-role-policy \
    >     --role-name eksWorkerNodeRole \
    >     --policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly
    $ aws iam attach-role-policy \
    >     --role-name eksWorkerNodeRole \
    >     --policy-arn arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
    $ aws iam attach-role-policy \
    >     --role-name eksWorkerNodeRole \
    >     --policy-arn arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy
    
  4. Verify:

    $ aws iam get-role --role-name eksWorkerNodeRole
    $ aws iam list-attached-role-policies --role-name eksWorkerNodeRole
    

Select VPC

To create an EKS cluster you need a VPC that satisfies specific requirements: for example, it must have subnets in at least two Availability Zones, and any public subnets must be configured to auto-assign public IP addresses. It is recommended to use a VPC with both public and private subnets, so that Kubernetes can create public load balancers in the public subnets that load balance traffic to Pods running on nodes in the private subnets. To create a new VPC tailored to EKS requirements, you can follow the official getting started guide. Alternatively, a public-only approach is to use the default VPC in your region:

$ export VPCID=$(aws ec2 describe-vpcs --filters Name=isDefault,Values=true | jq -r '.Vpcs[0].VpcId')
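
To confirm that the public subnets of the selected VPC auto-assign public IP addresses, as required above, you can inspect the MapPublicIpOnLaunch attribute of each subnet:

$ aws ec2 describe-subnets \
>     --filters Name=vpc-id,Values=${VPCID?} | \
>     jq -r '.Subnets[] | .SubnetId, .MapPublicIpOnLaunch' | paste - -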

Warning

If you have specific network requirements, e.g., deploy on a specific VPC and you already know the VPC ID, you can specify it explicitly with:

$ export VPCID=vpc-12345

Select subnets

On AWS, a VPC can contain more than one subnet, each of which is tied to an Availability Zone. To see the mapping between subnet IDs and Availability Zones in your selected VPC, run the following command:

$ aws ec2 describe-subnets \
>     --filters Name=vpc-id,Values=${VPCID?} | \
>     jq -r '.Subnets[] | .SubnetId, .AvailabilityZone' | paste - -

To use all subnets of the previously selected VPC:

$ export SUBNETIDS=$(aws ec2 describe-subnets --filters Name=vpc-id,Values=${VPCID?} | jq -r '.Subnets[].SubnetId' | xargs)

Note

If you have specific network requirements, e.g., deploy on a subset of the available subnets, and you already know the VPC Subnet IDs, you can specify them explicitly with:

$ export SUBNETIDS="subnet-1 subnet-2"

Create Security Group

As part of the cluster creation we need to create an EC2 security group. A Security Group acts as a virtual firewall that controls the traffic for one or more instances.

Note

In case you opted to create a new VPC using Amazon’s official instructions in the Select VPC section, you have already created the required security groups, so you can skip this section.

  1. Choose the security group name:

    $ export SECURITYGROUP=demo-eks-clusters
    
  2. Set a trusted CIDR. For example:

    $ export CIDR=1.2.3.4/32
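
    If you want to trust only your current public IP address, one option is to derive it from a public IP echo service, assuming outbound internet access and curl are available in rok-tools:

    $ export CIDR=$(curl -s https://checkip.amazonaws.com)/32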
    
  3. Create a security group in the default VPC:

    $ aws ec2 create-security-group \
    >     --description "Demo EKS clusters" \
    >     --group-name ${SECURITYGROUP?} \
    >     --vpc-id ${VPCID?}
    
  4. Obtain the security group ID:

    $ export SECURITYGROUPID=$(aws ec2 describe-security-groups --filters Name=vpc-id,Values=${VPCID?} Name=group-name,Values=${SECURITYGROUP?} | jq -r '.SecurityGroups[].GroupId')
    
  5. Only allow traffic to your EKS cluster from a specific IP or CIDR:

    $ aws ec2 authorize-security-group-ingress \
    >     --group-id ${SECURITYGROUPID?} \
    >     --protocol tcp \
    >     --port 0-65535 \
    >     --cidr ${CIDR?}
    $ aws ec2 authorize-security-group-ingress \
    >     --group-id ${SECURITYGROUPID?} \
    >     --protocol icmp \
    >     --port -1 \
    >     --cidr ${CIDR?}
    
  6. Verify:

    $ aws ec2 describe-security-groups --group-ids ${SECURITYGROUPID?}
    

Create EKS Cluster

In this section we will create an EKS cluster running Kubernetes 1.16.X that allows public access to the Kubernetes cluster only from trusted CIDRs, i.e., behind a firewall.

  1. First choose the cluster name and the trusted CIDRs:

    $ export CIDRS=$CIDR
    $ export CLUSTERNAME=$AWS_ACCOUNT-$AWS_IAM_USER-cluster
    
  2. Obtain the AWS account ID:

    $ export ACCOUNT_ID=$(aws sts get-caller-identity | jq -r '.Account')
    
  3. Decide on the VPC configuration used by the cluster control plane, i.e., the Kubernetes master nodes.

  4. Specify the subnets to use to host resources for your cluster. For the EKS control plane you must specify at least two subnets in different Availability Zones. It is advised to select all available subnets in the VPC, including the private ones (if any), so that you can deploy worker nodes on private subnets and use the internal Kubernetes endpoint. See Select subnets above for how to obtain the desired subnet IDs and make sure you set the SUBNETIDS environment variable accordingly.

  5. Specify the security groups to use (up to five):

    To use the security group created in the previous section, along with the default security group of the VPC:

    $ export SECURITYGROUPIDS=$(aws ec2 describe-security-groups --filters Name=vpc-id,Values=${VPCID?} Name=group-name,Values=${SECURITYGROUP?},default | jq -r '.SecurityGroups[].GroupId' | xargs)
    

    Warning

    If you have specific network requirements, e.g., use pre-existing security groups, and you already know the Security Group IDs, you can specify them explicitly with:

    $ export SECURITYGROUPIDS="sg-1 sg-2"
    
  6. Create an EKS cluster:

    $ aws eks create-cluster \
    >      --name ${CLUSTERNAME?} \
    >      --role-arn arn:aws:iam::${ACCOUNT_ID?}:role/eksServiceRole \
    >      --resources-vpc-config subnetIds=${SUBNETIDS// /,},securityGroupIds=${SECURITYGROUPIDS// /,},endpointPublicAccess=true,endpointPrivateAccess=true,publicAccessCidrs=${CIDRS// /,} \
    >      --tags owner=${AWS_ACCOUNT?}/${AWS_IAM_USER?} \
    >      --kubernetes-version 1.16
    
  7. Verify that the EKS cluster exists:

    $ aws eks describe-cluster --name ${CLUSTERNAME?}
    

Enable IAM roles for Kubernetes Service Accounts

Note

You must wait for your cluster to become ACTIVE before you can create an OIDC provider for it.
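
One way to block until the cluster becomes ACTIVE is to use the corresponding AWS CLI waiter:

$ aws eks wait cluster-active --name ${CLUSTERNAME?}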

Create an OIDC provider and associate it with the K8s cluster to enable IAM roles for service accounts:

$ eksctl utils associate-iam-oidc-provider --cluster $CLUSTERNAME --approve

To verify:

$ export OIDC_PROVIDER=$(aws eks describe-cluster --name $CLUSTERNAME --query "cluster.identity.oidc.issuer" --output text | sed -e "s/^https:\/\///")
$ aws iam get-open-id-connect-provider \
>     --open-id-connect-provider-arn arn:aws:iam::$ACCOUNT_ID:oidc-provider/$OIDC_PROVIDER

Access EKS Cluster

To access your newly created EKS cluster you need to update your kubeconfig. For more information, see the official EKS kubeconfig docs:

$ aws eks update-kubeconfig --name $CLUSTERNAME

Inspect the generated config using kubectl:

$ kubectl config current-context
$ kubectl config view --minify=true

Note

Set the --minify flag to output info only for the current context.

Since the EKS cluster is behind a firewall, make sure you have access to the underlying ELB of the Kubernetes endpoint:

$ kubectl config view -o json --raw --minify=true | jq -r '.clusters[0].cluster.server'
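
As a quick connectivity check, you can list the cluster's namespaces; this should work even before any worker nodes exist:

$ kubectl get namespaces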

Create EKS Node Group

EKS supports both managed and self-managed node groups. You can use either of them based on your needs. For example, if you want to use a specific AMI, you should use self-managed node groups. For easier management via the AWS Console, prefer managed node groups.

Both node group types use AutoScalingGroups (ASG) underneath, which, in turn, use a Launch Template that specifies the Kubernetes node instance configuration.

In this section we will cover:

  • Local Storage requirements
  • Managed node groups
  • Self-managed node groups
  • Add extra EBS volumes
  • Verify

Local Storage requirements

Rok can run on any instance type, as long as there are disks available for it to use.

  • For instance types that have instance store volumes (local NVMe storage) attached, Rok will automatically find and use all of them.
  • For instance types without instance store volumes (EBS-only), you will need one or more extra EBS volumes of the exact same size. Rok will use all extra EBS volumes with device names /dev/sd[f-p] (see also recommended device names for EBS volumes).

Important

Rok expects to use all the available extra volumes mentioned above exclusively.

Note

Since the size of EBS volumes affects their performance, it’s advised to use at least 500 GiB per disk.

Note

You should not add extra EBS volumes to instances with instance store volumes. By default, Rok uses all available volumes in a single RAID0 array, and mixing instance store and EBS volumes in the same array will lead to unpredictable performance.
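
To check whether an instance type provides instance store volumes before you choose one, you can query the EC2 API. For example, for m5d.4xlarge (EBS-only instance types return no InstanceStorageInfo):

$ aws ec2 describe-instance-types \
>     --instance-types m5d.4xlarge \
>     --query 'InstanceTypes[].InstanceStorageInfo'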

Managed node groups

Amazon EKS managed node groups automate the provisioning and lifecycle management of nodes (Amazon EC2 instances) for Amazon EKS Kubernetes clusters.

  1. Specify the subnets to use. For the Auto Scaling Group, in contrast to the control plane, you can use a single AZ. The rule of thumb here is that the ASG minSize should be equal to or greater than the number of Availability Zones it spans. Still, since we will make use of EBS volumes, it is highly recommended that the ASG span a single Availability Zone. See Select subnets above for how to obtain the desired subnet IDs and make sure you set the SUBNETIDS environment variable accordingly, for example as shown below.
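
    For example, to restrict SUBNETIDS to the subnets of a single Availability Zone (us-west-2a here is just an example; pick the AZ you want to deploy in):

    $ export SUBNETIDS=$(aws ec2 describe-subnets --filters Name=vpc-id,Values=${VPCID?} Name=availability-zone,Values=us-west-2a | jq -r '.Subnets[].SubnetId' | xargs)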

  2. Choose an instance type of your preference. It is recommended to use an instance type that has instance store volumes (local NVMe storage) attached. For example, use m5d.4xlarge, which has 16 vCPUs, 64 GiB RAM, and 2x300 GB NVMe SSDs:

    $ export INSTANCE_TYPE=m5d.4xlarge
    

    If you prefer to use an EBS-only instance type, e.g., m5.large, create the node group as described below, and then make sure to Add extra EBS volumes so that Rok can use them.

  3. Create the managed node group:

    $ aws eks create-nodegroup \
    >     --cluster-name ${CLUSTERNAME?} \
    >     --nodegroup-name general-workers \
    >     --disk-size 200 \
    >     --scaling-config minSize=1,maxSize=3,desiredSize=2 \
    >     --subnets ${SUBNETIDS?} \
    >     --instance-types ${INSTANCE_TYPE?} \
    >     --ami-type AL2_x86_64 \
    >     --node-role arn:aws:iam::${ACCOUNT_ID?}:role/eksWorkerNodeRole \
    >     --labels role=general-worker \
    >     --tags owner=${AWS_ACCOUNT?}/${AWS_IAM_USER?},kubernetes.io/cluster/${CLUSTERNAME?}=owned \
    >     --release-version 1.16.15-20201211 \
    >     --kubernetes-version 1.16
    

    Note

    Creating an EKS managed nodegroup with zero minimum size is currently not supported.

    Note

    You can create a Node Group for your cluster once its status is ACTIVE.

    Important

    The above command might fail with the following error message:

    “An error occurred (InvalidParameterException) when calling the CreateNodegroup operation: Requested Nodegroup release version X is invalid.”

    This means that Amazon has released a new AMI for EKS, most likely with an upgraded amazonlinux kernel. In this case, please coordinate with Arrikto’s Tech Team to ensure that container images that support the latest AMI and kernel are available.

Self-managed node groups

AWS CLI does not support creating self-managed node groups. To create one, follow the official instructions that use a CloudFormation template. Make sure to:

  1. Pick a name for the stack, e.g., <CLUSTERNAME>-workers

  2. Specify the name of the EKS cluster we created previously.

  3. For ClusterControlPlaneSecurityGroup, select the security group you created during the Create Security Group section. In case you created a dedicated VPC, choose the SecurityGroups value from the AWS CloudFormation output:

    ../../../_images/cf-eks-vpc-output.png

    This is the security group to allow communication between your worker nodes and the Kubernetes control plane.

  4. Pick a name for the node group, e.g., general-workers. The created instances will be named <CLUSTERNAME>-<NODEGROUPNAME>-Node.

  5. For the node image, use the latest one for Kubernetes 1.16 by setting the corresponding NodeImageIdSSMParam.

  6. Set a large enough NodeVolumeSize, e.g., 200 (GB), since this volume will hold Docker images and Pods’ ephemeral storage.

  7. Select your desired instance type. If it is EBS-only see how to Add extra EBS volumes after node group creation.

  8. Select the VPC and the subnets to spawn the workers in. Since we will use EBS volumes, it is highly recommended that the ASG span a single Availability Zone. See Select subnets above for how to obtain the desired subnet IDs and make sure you choose them from the given drop-down list.

  9. In Configure stack options, specify the Tags that Cluster Autoscaler requires so that it can discover the instances of the ASG automatically:

    Key                                        Value
    k8s.io/cluster-autoscaler/enabled          true
    k8s.io/cluster-autoscaler/<CLUSTERNAME>    owned

Create the stack and wait for CloudFormation to:

  • Create an IAM role that worker nodes will consume.
  • Create an AutoScalingGroup with a new Launch Template.
  • Create a security group that the worker nodes will use.
  • Modify the given cluster security group to allow communication between the control plane and the worker nodes.

After the stack has finished creating, continue with the enable nodes to join your cluster section to complete the setup of the node group.
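
You can also track the stack creation from the CLI. For example, assuming you named the stack ${CLUSTERNAME}-workers as suggested above:

$ aws cloudformation describe-stacks \
>     --stack-name ${CLUSTERNAME?}-workers \
>     --query 'Stacks[0].StackStatus'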

Add extra EBS volumes

In case you have used an EBS-only instance type for the node group, you will have to attach an extra EBS volume for Rok to use as local storage. To do that, you have to edit the Launch template that the underlying ASG is using. This is similar to what we do for EKS Upgrade. Specifically:

  1. Go to https://console.aws.amazon.com/ec2autoscaling/.

  2. Find the ASG associated with our node group.

  3. Edit its Launch template.

    ../../../_images/asg-edit-lt.png
  4. Create a new launch template version.

    ../../../_images/lt-new-version.png
  5. In the Storage (volumes) section click on Add new volume.

    ../../../_images/lt-add-new-volume.png
  6. Specify the specs of the extra EBS volume:

    • Size: 500 GiB
    • Device name: /dev/sdf
    • Volume type: gp2
    ../../../_images/lt-extra-ebs-volume.png
  7. Create the new template version.

    ../../../_images/lt-new-version-success.png
  8. Go back to the Launch template section of the ASG, refresh the drop down menu with the versions and select the newly created one.

    ../../../_images/asg-update-lt-version.png

From now on, all newly created instances will have an extra EBS disk. To replace the existing instances, start an instance refresh operation and specify zero for Minimum healthy percentage, so that all instances are replaced at the same time.
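
If you prefer the CLI over the console for this step, you can start the instance refresh with a command similar to the following, replacing <ASG-NAME> with the name of your node group's ASG:

$ aws autoscaling start-instance-refresh \
>     --auto-scaling-group-name <ASG-NAME> \
>     --preferences MinHealthyPercentage=0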

Verify

  1. Verify that EC2 instances have been created:

    $ aws ec2 describe-instances \
    >     --filters Name=tag-key,Values=kubernetes.io/cluster/$CLUSTERNAME
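
    If you created a managed node group, you can also check that it is ACTIVE:

    $ aws eks describe-nodegroup \
    >     --cluster-name ${CLUSTERNAME?} \
    >     --nodegroup-name general-workers \
    >     --query 'nodegroup.status'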
    
  2. Verify that Kubernetes nodes have appeared:

    $ kubectl get nodes
    NAME                                         STATUS   ROLES    AGE    VERSION
    ip-172-31-0-86.us-west-2.compute.internal    Ready    <none>   8m2s   v1.16.13-eks-2ba88
    ip-172-31-24-96.us-west-2.compute.internal   Ready    <none>   8m4s   v1.16.13-eks-2ba88
    

(Optional) Share EKS cluster

In case you wish to allow other users to access your EKS cluster, you need to:

  1. Edit the kube-system/aws-auth ConfigMap and add an entry for each user you wish to grant access:

    mapUsers: |
       - userarn: arn:aws:iam::<account_id>:user/<username>
         username: <username>
         groups:
           - system:masters
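
    To open the ConfigMap for editing you can use, for example:

    $ kubectl edit configmap aws-auth -n kube-system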
    
  2. Make sure additional users have sufficient permissions on EKS resources (see https://docs.aws.amazon.com/eks/latest/userguide/security_iam_id-based-policy-examples.html).

    Note

    For example, create a new group with the corresponding policy, e.g., AmazonEKSAdminPolicy and add the user to this group (see https://github.com/kubernetes-sigs/aws-iam-authenticator/issues/174#issuecomment-476442197)

  3. Have the user follow the Configure AWS guide so that they can access AWS resources with aws.

  4. Have the user follow the Access EKS Cluster guide so that they can access Kubernetes with kubectl.

    Important

    In case the Kubernetes API server is firewalled, the user needs to make sure they are connecting from a trusted source, e.g., via a trusted VPN.