Configure Time Window for Exclusive GPU Access

This guide shows you how to configure the exclusive access time window, also called the time quantum (TQ), for the Kiwi Scheduler.

The Kiwi Scheduler implements exclusive access to prevent thrashing when the combined GPU memory working sets of the applications exceed the physical GPU memory capacity. Each application that needs to do GPU work gets exclusive access to the GPU for TQ seconds at a time. If the GPU bursts of two or more applications overlap, Kiwi grants exclusive access to the GPU in a round-robin manner.
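
To make the round-robin behavior concrete, the following shell sketch replays it for two hypothetical applications. It is a toy model and not Kiwi’s implementation: the application names and burst lengths are invented, and the cost of moving data on and off the GPU is ignored.

```shell
#!/bin/sh
# Toy model of round-robin exclusive access. NOT Kiwi's actual implementation.
# Usage: simulate_round_robin TQ name:seconds_of_gpu_work ...
# The names and durations passed in are hypothetical workloads.
simulate_round_robin() (
    tq=$1
    shift
    now=0
    queue="$*"
    while [ -n "$queue" ]; do
        next=""
        for entry in $queue; do
            name=${entry%%:*}
            remaining=${entry##*:}
            # Grant at most TQ seconds of exclusive access per turn.
            if [ "$remaining" -gt "$tq" ]; then
                slice=$tq
            else
                slice=$remaining
            fi
            echo "t=${now}s: ${name} holds the GPU for ${slice}s"
            now=$((now + slice))
            remaining=$((remaining - slice))
            # Applications with work left wait for their next turn.
            if [ "$remaining" -gt 0 ]; then
                next="$next ${name}:${remaining}"
            fi
        done
        queue=${next# }
    done
)

simulate_round_robin 30 app-a:45 app-b:20
```

With a TQ of 30 seconds, app-a runs for 30 seconds, app-b then runs for its full 20 seconds, and app-a finishes its remaining 15 seconds.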

Note

If you use a large TQ, for example 100 seconds, you sacrifice interactivity in favor of maximum throughput. Each time a different application gets exclusive access to the GPU, Kiwi must fetch that application’s data to the GPU and evict the previous application’s data. A larger TQ therefore means fewer of these costly switches, but longer waits for the other applications.

Important

By default, the Kiwi Scheduler’s time quantum is 30 seconds. We do not recommend setting it to 5 seconds or less, because each application would then spend most of its exclusive access window waiting for its data to become resident on the GPU.
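
The trade-off can be estimated with simple arithmetic. The sketch below assumes that moving an application’s data onto the GPU takes a fixed number of seconds out of each window. That swap time is a hypothetical figure chosen for illustration, not a measured Kiwi value.

```shell
#!/bin/sh
# Share of each exclusive access window left for GPU work, as an integer
# percentage (rounded down).
# $1 = TQ in seconds
# $2 = hypothetical time, in seconds, to make the data resident on the GPU
useful_share_pct() {
    tq_s=$1
    swap_s=$2
    if [ "$swap_s" -ge "$tq_s" ]; then
        # The whole window is spent waiting for data.
        echo 0
    else
        echo $(( (tq_s - swap_s) * 100 / tq_s ))
    fi
}

# Compare a small, the default, and a large TQ, assuming a 4-second swap.
for tq in 5 30 100; do
    echo "TQ=${tq}s: about $(useful_share_pct "$tq" 4)% of the window is left for GPU work"
done
```

Under this assumption, a 5-second TQ leaves about 20% of the window for GPU work, while the default 30 seconds leaves about 86%.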

Important

Changes to the configuration of a Kiwi Scheduler instance only affect that particular instance and are not persisted across Pod restarts.

To make the changes persistent and apply them to all Kiwi Scheduler instances, you must redeploy Kiwi with the desired configuration.

What You’ll Need

Procedure

  1. Find the node for which you wish to change the time window.

    1. Specify the name of the application Pod whose time window you want to change:

      root@rok-tools:~# KIWI_POD_NAME=<POD_NAME>

      Replace <POD_NAME> with the name of the Pod you want to configure, for example:

      root@rok-tools:~# KIWI_POD_NAME=kiwi-pod
    2. Specify the namespace of the application Pod:

      root@rok-tools:~# KIWI_POD_NAMESPACE=<POD_NAMESPACE>

      Replace <POD_NAMESPACE> with the namespace of the Pod you want to configure, for example:

      root@rok-tools:~# KIWI_POD_NAMESPACE=kiwi-pod-namespace
    3. Find the node on which the application Pod is running:

      root@rok-tools:~# KIWI_NODENAME=$(kubectl get pod ${KIWI_POD_NAME?} \
      >     -n ${KIWI_POD_NAMESPACE?} -o json \
      >     | jq -r '.spec.nodeName') \
      >     && echo ${KIWI_NODENAME?}
      ip-192-168-109-143.eu-central-1.compute.internal
  2. Get the Kiwi Scheduler’s Pod name for the specified node:

    root@rok-tools:~# KIWI_SCHEDULER_POD_NAME=$(kubectl get pod \
    >     -n kiwi-system -l name=kiwi-scheduler -o json \
    >     | jq -r '.items[] | select(.spec.nodeName == "'$KIWI_NODENAME'") | .metadata.name') \
    >     && echo ${KIWI_SCHEDULER_POD_NAME?}
    kiwi-scheduler-4pk55
  3. Change the Kiwi Scheduler’s TQ:

    root@rok-tools:~# kubectl exec -it ${KIWI_SCHEDULER_POD_NAME?} \
    >     -n kiwi-system -- kiwictl -T <NEW_TQ>

    Replace <NEW_TQ> with the desired time quantum for the Kiwi Scheduler, for example:

    root@rok-tools:~# kubectl exec -it ${KIWI_SCHEDULER_POD_NAME?} \
    >     -n kiwi-system -- kiwictl -T 20
    [INFO] Successfully set the scheduler TQ to 20 seconds.
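
If you script this procedure, it may help to check the value before passing it to kiwictl. The following is a minimal sketch: the function names check_tq and set_tq are hypothetical, and the check assumes that the TQ is a whole number of seconds, which this guide does not state explicitly.

```shell
#!/bin/sh
# Validate a candidate TQ before handing it to kiwictl.
# Assumption: the TQ is a whole number of seconds.
# Returns 0 if the value is usable, 1 otherwise. Messages go to stderr.
check_tq() {
    case "$1" in
        ''|*[!0-9]*)
            echo "error: TQ must be a positive whole number of seconds" >&2
            return 1
            ;;
    esac
    if [ "$1" -eq 0 ]; then
        echo "error: TQ must be greater than zero" >&2
        return 1
    fi
    if [ "$1" -le 5 ]; then
        # Mirrors the recommendation in this guide; allowed, but discouraged.
        echo "warning: a TQ of 5 seconds or less is not recommended" >&2
    fi
    return 0
}

# Hypothetical wrapper around the command from step 3.
# $1 = Kiwi Scheduler Pod name, $2 = new TQ in seconds
set_tq() {
    check_tq "$2" || return 1
    kubectl exec -it "$1" -n kiwi-system -- kiwictl -T "$2"
}

check_tq 20 && echo "TQ 20 passes the check"
```

You could then call set_tq with the value of KIWI_SCHEDULER_POD_NAME and the new TQ instead of running kubectl exec directly.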

Verify

  1. Inspect the Kiwi Scheduler logs and verify that the TQ equals the value you set:

    root@rok-tools:~# kubectl logs ${KIWI_SCHEDULER_POD_NAME?} -n kiwi-system
    [INFO] New TQ = 20
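
When the scheduler log is long, a small filter can pull out the most recent value. This sketch assumes that every TQ change is logged in the "[INFO] New TQ = <seconds>" form shown above. The function name last_logged_tq is hypothetical.

```shell
#!/bin/sh
# Print the most recent TQ reported in scheduler log lines read from stdin.
# Assumes the "[INFO] New TQ = <seconds>" log format shown in this guide.
last_logged_tq() {
    sed -n 's/^.*New TQ = \([0-9][0-9]*\).*$/\1/p' | tail -n 1
}

# Demonstration with sample log lines. In practice, pipe in the output of
# the kubectl logs command instead.
printf '%s\n' '[INFO] New TQ = 30' '[INFO] New TQ = 20' | last_logged_tq
```

The demonstration prints 20, the last value in the sample input. If the log contains no matching line, the function prints nothing.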

Summary

You have successfully configured the exclusive access time window for the Kiwi Scheduler.

What’s Next

Check out the rest of the Kiwi operations you can perform.