DistributedConfig¶
`DistributedConfig` is a Kale object that holds the configuration of distributed training experiments.
Import¶
The object lives in the `kale.distributed` module. Import it as follows:
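A minimal sketch, assuming the class is exported directly from the `kale.distributed` module described above:

```python
from kale.distributed import DistributedConfig
```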
Attributes¶
| Name | Type | Default | Description |
|---|---|---|---|
| `env` | `List[V1EnvVar]` | `[]` | Extends the `env` field of the container |
| `env_from` | `List[V1EnvFromSource]` | `[]` | Extends the `envFrom` field of the container |
| `requests` | `Dict` | `{}` | Sets `resources.requests` for the container |
| `limits` | `Dict` | `{}` | Sets `resources.limits` for the container |
| `annotations` | `Dict` | `{}` | Sets annotations for the Pod |
| `labels` | `Dict` | `{}` | Sets labels for the Pod |
| `node_selector` | `Dict` | `{}` | Sets the `nodeSelector` for the Pod |
| `affinity` | `V1Affinity` | `None` | Sets the affinity of the Pod |
| `tolerations` | `List[V1Toleration]` | `[]` | Sets tolerations for the Pod |
| `run_policy` | `Dict` or `V1RunPolicy` | `None` | Encapsulates the runtime policies of the distributed training job |
Note
In the table above, we also mention objects that are part of the Kubernetes Python client library, as well as the Kubeflow Training Operator Python client library. For details on the structure of the Kubernetes objects please refer to the Official Python client library for Kubernetes. For details on the structure of the Kubeflow Training Operator objects please refer to the Official Python client library for the Kubeflow Training Operator.
Important
The container-level options that you set in the configuration object, such as `env`, `labels`, `annotations`, `limits`, `requests`, etc., are propagated to every container that is part of the distributed training process. Thus, the master and every worker Pod will have the same container-level options.
Initialization¶
You may initialize a `DistributedConfig` object similarly to any other Python object:
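For instance, a minimal sketch: the attribute names follow the table above, while the environment variable, labels, and resource values are purely illustrative.

```python
from kubernetes.client import V1EnvVar

from kale.distributed import DistributedConfig

# Illustrative values only; the keyword arguments mirror the attributes table.
config = DistributedConfig(
    env=[V1EnvVar(name="NCCL_DEBUG", value="INFO")],
    labels={"app": "my-training-job"},
    requests={"cpu": "4", "memory": "8Gi"},
    limits={"nvidia.com/gpu": "1"},
)
```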
However, you can also initialize a field that expects Kubernetes objects by passing a dictionary, which Kale will then deserialize into the corresponding Kubernetes object. For example:
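One possible sketch, assuming the dictionary keys mirror the fields of the corresponding Kubernetes objects (here the dictionaries stand in for `V1EnvVar` and `V1Toleration` objects):

```python
from kale.distributed import DistributedConfig

# Sketch assuming Kale deserializes plain dictionaries into the corresponding
# Kubernetes objects; the keys below mirror the Kubernetes API fields.
config = DistributedConfig(
    env=[{"name": "NCCL_DEBUG", "value": "INFO"}],
    tolerations=[{
        "key": "nvidia.com/gpu",
        "operator": "Exists",
        "effect": "NoSchedule",
    }],
)
```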