This section contains various troubleshooting instructions regarding Rok deployment or cleanup.
Rok cleanup is stuck
In this section we describe the reason why Rok cleanup might be stuck and the way to fix it.
The RokCluster Custom Resource (CR) is protected by the rokclustercleanup.arrikto.com finalizer.
Normally, upon CR deletion, the Rok Operator removes this finalizer, allowing the resource to actually be deleted.
However, if the Rok Operator is not running when the RokCluster is marked for deletion, the
rokclustercleanup.arrikto.com finalizer will remain on the CR and its deletion will block indefinitely.
In addition, any attempt to delete the rok namespace (where the RokCluster CR lives by default) will not
succeed, since Kubernetes needs to ensure that all resources that exist in a
namespace are deleted before deleting the namespace itself.
To unblock this situation, you can manually remove the rokclustercleanup.arrikto.com finalizer from the RokCluster CR.
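One way to remove the finalizer is with kubectl patch. The following is a minimal sketch, not necessarily the exact command this guide originally showed. It assumes that the CR is served under the resource name rokcluster and that both the CR and its namespace are named rok; verify these against your cluster first. The helper name remove_rokcluster_finalizers is hypothetical. Note that the merge patch clears the whole finalizers list, not only rokclustercleanup.arrikto.com.

```shell
# Sketch: clear the finalizers of a RokCluster CR so that its deletion can proceed.
# Assumptions (not confirmed by this guide): the resource name "rokcluster",
# the CR name "rok", and the default namespace "rok".
# The helper name remove_rokcluster_finalizers is hypothetical.
remove_rokcluster_finalizers() {
    name="${1:-rok}"
    namespace="${2:-rok}"
    # A JSON merge patch that sets metadata.finalizers to null removes ALL
    # finalizers on the object, including rokclustercleanup.arrikto.com.
    kubectl patch rokcluster "$name" -n "$namespace" \
        --type merge -p '{"metadata":{"finalizers":null}}'
}

# Usage (requires access to the cluster):
#   remove_rokcluster_finalizers rok rok
```

Once the finalizer is gone, Kubernetes can complete the pending deletion of the CR, and a pending deletion of the rok namespace is no longer blocked by it.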
Kubeflow cleanup is stuck
In this section we describe the reason why Kubeflow cleanup might be stuck and the way to fix it.
User namespaces, kubeflow-XXX, may get stuck in the Terminating phase.
If that is the case, list all resources in the namespace to see which ones do not get deleted.
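One common way to list everything that remains in a namespace is to iterate over all namespaced resource types that support listing. The following is a sketch of that approach, not necessarily the command this guide originally showed. The helper name list_namespace_resources is hypothetical, and kubeflow-XXX stands for the stuck namespace.

```shell
# Sketch: list every namespaced resource that still exists in a namespace.
# The helper name list_namespace_resources is hypothetical.
list_namespace_resources() {
    namespace="$1"
    # Enumerate all listable, namespaced resource types, then query each one.
    kubectl api-resources --verbs=list --namespaced -o name |
        while read -r resource; do
            # Print remaining objects as kind/name; stay silent for empty types.
            kubectl get "$resource" -n "$namespace" \
                --ignore-not-found -o name
        done
}

# Usage (requires access to the cluster):
#   list_namespace_resources kubeflow-XXX
```

Any object that this prints for a namespace in the Terminating phase is a candidate for what is blocking the deletion.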
If you find Katib Trial Custom Resources, it is because they are protected by a finalizer.
Due to a race between resource deletions, Trials cannot fulfill their finalizer and, thus, are never deleted.
To unblock the deletion, patch every Trial to remove its finalizers.
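The patching step can be sketched as a loop over all Trials in the stuck namespace. This is a minimal sketch, assuming that Katib Trials are served under the resource name trials.kubeflow.org. The helper name remove_trial_finalizers is hypothetical, and kubeflow-XXX stands for the stuck namespace.

```shell
# Sketch: clear the finalizers of every Katib Trial in a namespace.
# Assumption: Trials are served under the resource name "trials.kubeflow.org".
# The helper name remove_trial_finalizers is hypothetical.
remove_trial_finalizers() {
    namespace="$1"
    # "-o name" prints one <resource>/<name> entry per line.
    kubectl get trials.kubeflow.org -n "$namespace" -o name |
        while read -r trial; do
            # Setting metadata.finalizers to null removes all finalizers.
            kubectl patch "$trial" -n "$namespace" \
                --type merge -p '{"metadata":{"finalizers":null}}'
        done
}

# Usage (requires access to the cluster):
#   remove_trial_finalizers kubeflow-XXX
```

With the finalizers removed, the Trials can be deleted and the namespace termination can continue.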