PyTorch Integration

The Kubeflow Training Operator lets you distribute the training of PyTorch models on Kubernetes. You create and submit PyTorchJob Custom Resources (CRs), and then manage the resulting PyTorch jobs like any built-in Kubernetes resource.
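
For orientation, here is a minimal sketch of the general shape of a PyTorchJob CR, built as a plain Python dictionary. It only illustrates the resource layout (one Master replica plus a number of Worker replicas). The helper `build_pytorchjob`, the job name, and the container image are hypothetical placeholders, and this is not necessarily how Kale generates the CR.

```python
import json


def build_pytorchjob(name, image, workers, namespace="default"):
    """Return a minimal PyTorchJob manifest as a plain dict.

    `name`, `image`, and `namespace` are illustrative placeholders.
    """

    def replica_spec(replicas):
        # Each replica type gets its own Pod template. The Training
        # Operator expects the training container to be named "pytorch".
        return {
            "replicas": replicas,
            "restartPolicy": "OnFailure",
            "template": {
                "spec": {
                    "containers": [{"name": "pytorch", "image": image}]
                }
            },
        }

    return {
        "apiVersion": "kubeflow.org/v1",
        "kind": "PyTorchJob",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "pytorchReplicaSpecs": {
                "Master": replica_spec(1),        # a single coordinating replica
                "Worker": replica_spec(workers),  # additional training replicas
            }
        },
    }


# Hypothetical job name and image, used only for illustration.
manifest = build_pytorchjob("mnist-ddp", "example.com/train:latest", workers=2)
print(json.dumps(manifest, indent=2))
```

Writing and maintaining such a manifest by hand is the step that a higher-level tool can take over.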

Kale provides a simple way to translate your Python code into a PyTorchJob CR, along with a client for monitoring and managing running jobs.

Important

This API is in beta and is subject to change.

What You’ll Need