Topology Aware Scheduling
Topology Aware Scheduling (TAS) ensures that all pods in a gang are placed within the same network topology domain (e.g., the same rack, block, or data center zone). This is critical for distributed training workloads where inter-pod communication latency directly impacts performance.
Prerequisites
In addition to the general WAS prerequisites, you must:
- Enable the TopologyAwareWorkloadScheduling feature gate. This feature gate is included in the kind cluster configuration and is required for topology-aware placement.
- Label your nodes with topology keys. The scheduler uses node labels to determine topology domains. Apply labels that represent your topology hierarchy, for example rack, block, or zone:
  kubectl label node <node-name> topology.example.com/rack=rack-1
  Every node that should participate in topology-aware placement must carry the label key named in the PodGroup's schedulingConstraints.topology field.
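When several nodes belong to the same rack, a short loop avoids repeating the label command. The sketch below is only an illustration: the function name label_rack, the node names, and the rack values are hypothetical, and the KUBECTL variable is a convenience assumed here so the commands can be previewed with KUBECTL='echo kubectl' before they are run.

```shell
# label_rack RACK NODE...
# Applies topology.example.com/rack=RACK to every listed node.
# KUBECTL defaults to kubectl; set KUBECTL='echo kubectl' for a dry run.
label_rack() {
  rack="$1"
  shift
  for node in "$@"; do
    ${KUBECTL:-kubectl} label node "$node" "topology.example.com/rack=${rack}"
  done
}

# Usage (node names are hypothetical):
#   label_rack rack-1 node-a node-b
#   label_rack rack-2 node-c node-d
```

Running it in dry-run mode first makes it easy to confirm that each node is assigned to the intended rack.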
How It Works
The PodGroup’s schedulingConstraints.topology field tells the scheduler which topology domain to consider when placing pods. The scheduler finds a single topology domain that can accommodate the entire gang and co-locates all pods within it.
For example, setting topology.example.com/rack as the topology key ensures all pods in the gang land on nodes within the same rack, minimizing network hops between them.
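The domain selection described above can be approximated with a small shell sketch. It is a simplified model, not the scheduler's actual algorithm: it assumes each node advertises a number of free pod slots (a hypothetical stand-in for real resource fitting) and reports every rack whose combined free slots cover the gang. The function name feasible_racks and the three-column input format are assumptions made for illustration.

```shell
# feasible_racks GANG_SIZE
# Reads "node rack free_slots" lines on stdin (a hypothetical input format)
# and prints, in sorted order, each rack whose nodes together have at least
# GANG_SIZE free slots. Racks that cannot hold the whole gang are skipped.
feasible_racks() {
  awk -v need="$1" '
    { free[$2] += $3 }   # sum the free slots of all nodes in each rack
    END {
      for (rack in free)
        if (free[rack] >= need) print rack
    }
  ' | sort
}

# Example: rack-1 has 2 + 2 free slots, rack-2 has only 3,
# so only rack-1 can host a gang of 4.
printf '%s\n' 'node-a rack-1 2' 'node-b rack-1 2' 'node-c rack-2 3' |
  feasible_racks 4
# prints: rack-1
```

If no rack is printed, no single domain can hold the gang in this model, which corresponds to the gang staying unscheduled rather than being split across racks.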
Step 1: Create the Workload
The Workload references the JobSet and defines a pod group template with a gang scheduling policy:
apiVersion: scheduling.k8s.io/v1alpha2
kind: Workload
metadata:
  name: js-abc
spec:
  controllerRef:
    apiGroup: jobset.x-k8s.io
    kind: JobSet
    name: js
  podGroupTemplates:
  - name: workers
    schedulingPolicy:
      gang:
        minCount: 4
The controllerRef links the Workload to the JobSet. The podGroupTemplates entry defines a gang scheduling policy with minCount: 4, so the gang is placed only when all 4 pods are schedulable.
Step 2: Create the PodGroup
The PodGroup references the pod group template and adds topology constraints:
apiVersion: scheduling.k8s.io/v1alpha2
kind: PodGroup
metadata:
  name: js-abc-workers-def
  namespace: default
spec:
  podGroupTemplateRef:
    workload:
      workloadName: js-abc
      podGroupTemplateName: workers
  schedulingPolicy:
    gang:
      minCount: 4
  schedulingConstraints:
    topology:
    - key: topology.example.com/rack
The key addition compared to basic gang scheduling is schedulingConstraints.topology:
- key: topology.example.com/rack: The scheduler will find a single rack (i.e., a set of nodes sharing the same topology.example.com/rack label value) that can fit all 4 pods and schedule them there.
Step 3: Create the JobSet
The JobSet pods reference the PodGroup through schedulingGroup.podGroupName:
apiVersion: jobset.x-k8s.io/v1alpha2
kind: JobSet
metadata:
  name: js
spec:
  failurePolicy:
    maxRestarts: 10
  replicatedJobs:
  - name: rj
    replicas: 2
    template:
      spec:
        completions: 2
        parallelism: 2
        backoffLimit: 0
        template:
          spec:
            terminationGracePeriodSeconds: 0
            schedulingGroup:
              podGroupName: js-abc-workers-def
            containers:
            - name: worker
              image: busybox
              command: ["sleep", "infinity"]
              resources:
                requests:
                  cpu: "500m"
This JobSet creates 4 pods total (2 replicas × 2 completions). All 4 pods will be gang-scheduled onto nodes within the same topology.example.com/rack domain.
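The pod count matters because a gang can only be placed if the JobSet produces at least minCount pods at the same time. A quick arithmetic check makes the relationship explicit for a single replicated job: replicas × parallelism must be greater than or equal to minCount. The helper name gang_satisfiable is hypothetical and not part of any tool.

```shell
# gang_satisfiable REPLICAS PARALLELISM MIN_COUNT
# Succeeds when a single replicated job runs at least MIN_COUNT pods at once.
gang_satisfiable() {
  [ $(( $1 * $2 )) -ge "$3" ]
}

# Values from this page's manifests: 2 replicas x 2 parallel pods, minCount 4.
if gang_satisfiable 2 2 4; then
  echo "pod count covers minCount"
else
  echo "gang can never be satisfied"
fi
# prints: pod count covers minCount
```

If you lower replicas or parallelism without also lowering minCount, the check fails, which signals that the gang would wait indefinitely for pods that are never created.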
Verify Topology Placement
After applying all three resources, verify that the pods were co-located within the same topology domain:
kubectl get pods -l jobset.sigs.k8s.io/jobset-name=js -o wide
Check that all pods landed on nodes sharing the same topology label value:
kubectl get nodes -L topology.example.com/rack
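The two commands above can be combined into a scripted check. The following is a minimal sketch, assuming the JobSet is named js and the topology key is topology.example.com/rack as in this page; the function names pod_racks and same_domain are hypothetical helpers introduced for illustration.

```shell
# pod_racks: print one rack value per pod of JobSet "js" (needs a cluster).
pod_racks() {
  kubectl get pods -l jobset.sigs.k8s.io/jobset-name=js \
    -o jsonpath='{range .items[*]}{.spec.nodeName}{"\n"}{end}' |
  while read -r node; do
    # An unscheduled pod has no node; emit an empty line so the check fails.
    if [ -z "$node" ]; then echo; continue; fi
    # Dots inside the label key are escaped for kubectl's JSONPath syntax.
    kubectl get node "$node" \
      -o jsonpath='{.metadata.labels.topology\.example\.com/rack}{"\n"}'
  done
}

# same_domain: read rack values on stdin; succeed only if there is exactly
# one distinct value and no empty (unlabeled or unscheduled) entry.
same_domain() {
  sort -u | awk 'NF == 0 { bad = 1 } END { exit !(NR == 1 && !bad) }'
}

# Usage:
#   pod_racks | same_domain && echo "all pods share one rack"
```

Keeping the comparison in same_domain separate from the kubectl calls means the check can also be run against saved output, for example rack values copied from a previous run.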