
DLC and Watts per Rack

DLC enables 120+ kW racks (NVL72) where air-cooling tops out near 30 kW. Density is the binding constraint on what you can train.
  • Air ceiling: ~30 kW
  • HGX DLC: ~60 kW
  • NVL72: ~120 kW

Per-rack power is the binding constraint on modern GPU clusters. Compute scales faster than power and cooling. The decisions you make at facility build-out about cooling, distribution, and rack density set the ceiling on what your fleet can train for the next decade.

The four rack power tiers

  • 5 kW: 1× DGX, 8 GPUs, air-cooled
  • 30 kW: the air ceiling, roughly what air can carry
  • 60 kW: HGX H100 DLC, 8-GPU nodes on cold plates
  • 120 kW: NVL72, 72 GPUs, GB200

Density is the binding constraint; cooling decides what density you can build.

The historical baseline for an enterprise datacenter rack was 5 to 10 kW. A single DGX node with 8 GPUs lands roughly at this figure; two such racks fit comfortably in any Tier III facility built in the last 20 years.

The air-cooled ceiling sits near 30 kW. This is not a vendor preference; it is what the physics of air at room temperature can carry away through a 42U envelope at any plausible CFM. Past 30 kW you need new air paths (rear-door heat exchangers, contained hot aisles), at which point you might as well admit you are halfway to liquid.

Direct liquid cooling removes the constraint. An HGX H100 DLC rack lands near 60 kW: the same 8-GPU node design as DGX, but with cold plates instead of fans, packed tighter. NVL72 is the next jump: 72 GPUs (36 GB200 superchips, 2 GPUs each) in a single rack at roughly 120 kW. Same floor tile as a single DGX, 9× the GPUs and 24× the watts.
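As a back-of-envelope illustration of what these tiers mean at facility scale, here is a small Python sketch converting each tier's approximate per-rack draw into racks per megawatt of IT load. The 1 MW budget is hypothetical, and per-rack GPU counts are only filled in where the text gives them; the HGX DLC count depends on how many 8-GPU nodes you pack per rack.

```python
# Back-of-envelope view of the rack tiers above: racks per MW of IT load,
# and GPUs per MW where the per-rack GPU count is stated in the text.
# Figures are the article's approximations, not vendor specifications.

tiers = [
    # (label, kW per rack, GPUs per rack or None where not stated)
    ("DGX, air-cooled",   5,   8),
    ("Air ceiling",      30,   None),   # physical limit of air, not a product
    ("HGX H100 DLC",     60,   None),   # depends on 8-GPU nodes packed per rack
    ("NVL72 (GB200)",   120,   72),
]

it_load_kw = 1_000  # hypothetical 1 MW of IT load, ignoring cooling overhead

for label, kw, gpus in tiers:
    racks = it_load_kw / kw
    msg = f"{label:16s} {kw:>4d} kW/rack -> {racks:6.1f} racks per MW"
    if gpus is not None:
        msg += f", {racks * gpus:6.0f} GPUs per MW"
    print(msg)
```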

What 120 kW per rack unlocks

The reason NVIDIA reference-designed NVL72 to land at this density is not "more power for the sake of it." It is the NVLink domain size. A 72-GPU NVLink domain inside one rack means an all-reduce that would have crossed an InfiniBand fabric in an HGX-class deployment now stays inside the rack, on switched copper, at NVLink speed (1.8 TB/s per GPU as of GB200).

Concretely:

  • Tensor-parallel groups can stretch to 72-way without leaving the rack.
  • The first stages of pipeline-parallel training can fit on a single rack.
  • Collective bandwidth at 72-way is ~10× what it is when the same 72 GPUs span 9 air-cooled racks connected by InfiniBand.

The compute is the same; the coupling is fundamentally different. That coupling is what changes the achievable model size and the achievable training throughput.
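To make the coupling argument concrete, here is a rough sketch using the bandwidth term of the standard ring all-reduce cost model. The 1.8 TB/s per-GPU NVLink figure comes from above; the cross-rack case simply assumes the ~10× lower effective collective bandwidth quoted in the list, and the 10 GB payload is an arbitrary illustration. Real fabrics add latency, congestion, and topology effects this ignores.

```python
# Bandwidth-only estimate of a 72-way all-reduce, in-rack vs cross-rack.
# Ring all-reduce bandwidth term: t = 2 * (N - 1) / N * S / B,
# with S the payload per GPU and B the per-GPU bandwidth.

def ring_allreduce_seconds(n_gpus: int, payload_bytes: float, bw_bytes_per_s: float) -> float:
    return 2 * (n_gpus - 1) / n_gpus * payload_bytes / bw_bytes_per_s

N = 72
payload = 10e9              # 10 GB of gradients per GPU -- illustrative assumption
nvlink_bw = 1.8e12          # 1.8 TB/s per GPU inside the NVL72 rack (from the text)
cross_rack_bw = nvlink_bw / 10   # assume the ~10x collective-bandwidth gap noted above

t_in_rack = ring_allreduce_seconds(N, payload, nvlink_bw)
t_cross_rack = ring_allreduce_seconds(N, payload, cross_rack_bw)
print(f"72-way all-reduce, in-rack NVLink : {t_in_rack * 1e3:6.1f} ms")
print(f"72-way all-reduce, cross-rack     : {t_cross_rack * 1e3:6.1f} ms")
```

With these assumptions the in-rack collective finishes in roughly 11 ms versus roughly 110 ms across racks, which is the gap a training step pays on every synchronization.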

The 10-year decision

Power and cooling are facility-level decisions. You can swap a generation of GPUs in months. You cannot swap a CDU plant, a power feed, a chilled-water plant, or a row of in-row coolers without taking the facility offline for weeks.

This means decisions made when a hall is built largely fix what fleets you can run inside it for a decade or more:

  • A facility designed at 10 kW per rack and 5 MW total can host a few thousand H100s in HGX nodes. It cannot host NVL72 without retrofit.
  • A facility designed at 60 kW per rack, with DLC headroom beyond that, can host both HGX and NVL72 today, and has at least a generation of headroom for what comes next.
  • A facility designed at 120+ kW per rack with N+1 CDU and 30 to 50 MW total is purpose-built for the current generation, and the operator is betting that the next two generations stay inside this density envelope.
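A rough sketch of the arithmetic behind these bullets, assuming an illustrative PUE of 1.3 and treating each hall's per-rack design density as a hard cap. The hall names, power figures, and the racks_supported helper are hypothetical, chosen only to mirror the three cases above.

```python
# Facility-level arithmetic: given a hall's total power and per-rack density
# design, how many racks of a given tier can it power, and can it accept an
# NVL72-class rack at all? All numbers here are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class Hall:
    name: str
    total_mw: float       # utility power available to the hall
    rack_kw_limit: float  # per-rack density the cooling and distribution support
    pue: float = 1.3      # assumed overhead for cooling, UPS losses, etc.

def racks_supported(hall: Hall, rack_kw: float) -> int:
    """Racks at the given draw the hall can power, or 0 if the density exceeds the design."""
    if rack_kw > hall.rack_kw_limit:
        return 0
    it_budget_kw = hall.total_mw * 1000 / hall.pue
    return int(it_budget_kw // rack_kw)

halls = [
    Hall("10 kW legacy hall, 5 MW",      total_mw=5,  rack_kw_limit=10),
    Hall("60 kW DLC hall w/ headroom",   total_mw=20, rack_kw_limit=120),
    Hall("120+ kW purpose-built, 40 MW", total_mw=40, rack_kw_limit=132),
]

for hall in halls:
    hgx = racks_supported(hall, rack_kw=10)    # one air-cooled 8-GPU HGX node per rack
    nvl = racks_supported(hall, rack_kw=120)   # one NVL72 per rack
    print(f"{hall.name:30s} HGX racks: {hgx:5d} ({hgx * 8:6d} GPUs)"
          f"   NVL72 racks: {nvl:4d} ({nvl * 72:6d} GPUs)")
```

Under these assumptions the 10 kW hall powers a few hundred HGX racks, roughly three thousand GPUs, and zero NVL72 racks, which is the first bullet in numbers.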

The mistake to avoid is sizing for the current GPU and shipping a building that cannot accept the next one. Several large operators are publicly retrofitting 2022-era halls precisely because they sized for 10 kW per rack on the assumption that air cooling would hold.

What this means in practice

  1. If you are an ML lab choosing capacity, ask the operator what their per-rack density ceiling is and what their cooling architecture looks like before you ask about price. The cheaper rack is often the one that cannot grow with you.
  2. If you are an operator, the rack density you commit to today is a 10-year contract with your buildings. Plan accordingly.
  3. If you are sizing facility power against a workload, remember that the PDU and breaker side is a parallel constraint. Cooling can carry 120 kW; the distribution side has to deliver it without tripping under transient load.
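On point 3, a minimal distribution-side sizing sketch: it applies the three-phase power relation I = P / (√3 · V · PF) and the common practice of keeping continuous load at or below roughly 80% of a breaker's rating. The 415 V line voltage, 0.95 power factor, and 1.2× transient factor are assumptions for illustration, not guidance for any specific installation.

```python
# Feed sizing per rack: three-phase current at an assumed transient peak,
# kept under ~80% of the breaker rating. All electrical parameters below
# are illustrative assumptions; consult the actual installation's design.

import math

def required_feed_amps(rack_kw: float,
                       line_voltage: float = 415.0,    # assumed line-to-line voltage
                       power_factor: float = 0.95,     # assumed PSU power factor
                       transient_factor: float = 1.2,  # assumed swing above steady state
                       continuous_derate: float = 0.8  # keep the peak under 80% of rating
                       ) -> float:
    steady_amps = rack_kw * 1000 / (math.sqrt(3) * line_voltage * power_factor)
    peak_amps = steady_amps * transient_factor
    return peak_amps / continuous_derate

for kw in (30, 60, 120):
    print(f"{kw:3d} kW rack -> plan for roughly a {required_feed_amps(kw):4.0f} A feed")
```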

The 30 kW ceiling defined the previous decade of cluster design. The 120 kW ceiling defines the next one. The transition is where the upgrade pain lands.


Updated 2026-05-09