Power Capping

Limiting GPU power draw below TDP to control thermals and rack density.

What it is

Power capping sets a GPU's enforced power limit below its default TDP, trading peak performance for lower power consumption, less heat, and better power efficiency. Caps are configured with nvidia-smi -pl or the DCGM API and take effect immediately. The active limit is reported in DCGM_FI_DEV_ENFORCED_POWER_LIMIT and the actual draw in DCGM_FI_DEV_POWER_USAGE. The performance impact is non-linear: reducing an H100 from 700W to 600W (a roughly 14% power cut) typically yields only a 5-8% throughput decrease for memory-bound LLM inference.
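The non-linear tradeoff can be made concrete with a little arithmetic. The sketch below is illustrative only: it takes the 700W-to-600W example and the 5-8% throughput range quoted above and computes the implied change in throughput per watt. The function name efficiency_gain is hypothetical, and the numbers are the ones from this entry, not measurements.

```python
def efficiency_gain(base_watts, capped_watts, throughput_loss):
    """Relative change in throughput-per-watt after applying a cap.

    throughput_loss is the fractional throughput decrease (0.05 means 5%).
    """
    power_ratio = capped_watts / base_watts
    throughput_ratio = 1.0 - throughput_loss
    return throughput_ratio / power_ratio - 1.0


# Figures taken from the H100 example in the text (assumed, not measured here).
power_cut = 1.0 - 600 / 700
gain_low = efficiency_gain(700, 600, 0.08)   # pessimistic end: 8% throughput loss
gain_high = efficiency_gain(700, 600, 0.05)  # optimistic end: 5% throughput loss

print(f"power cut: {power_cut:.1%}")                        # power cut: 14.3%
print(f"perf/W gain: {gain_low:.1%} to {gain_high:.1%}")    # perf/W gain: 7.3% to 10.8%
```

Under these assumptions, giving up 5-8% of throughput buys roughly 7-11% more work per watt, which is why a moderate cap is usually a net efficiency win for memory-bound workloads.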

Why it matters

Power capping is a critical lever in dense GPU deployments where rack power limits constrain how many GPUs can run at full TDP simultaneously. A 1000-GPU cluster running at 600W instead of 700W saves 100 kW of GPU power plus the equivalent cooling load, which frees room for more GPUs per rack within the same facility power budget. When a GPU's actual draw consistently sits at the enforced limit, the cap is actively constraining performance, and any further reduction has a disproportionately larger throughput impact.
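The cluster-level numbers above follow from simple arithmetic, sketched here. The helper names are hypothetical, and the calculation counts GPU power only; it ignores CPUs, fans, networking, and power-supply losses, so treat the result as an upper bound rather than a capacity plan.

```python
def cluster_savings_kw(num_gpus, base_watts, capped_watts):
    """GPU power saved across the cluster, in kilowatts."""
    return num_gpus * (base_watts - capped_watts) / 1000.0


def extra_gpus_in_budget(num_gpus, base_watts, capped_watts):
    """Additional capped GPUs that fit in the power budget originally
    sized for num_gpus running at base_watts (GPU power only)."""
    budget_watts = num_gpus * base_watts
    return budget_watts // capped_watts - num_gpus


savings = cluster_savings_kw(1000, 700, 600)
extra = extra_gpus_in_budget(1000, 700, 600)

print(f"saved: {savings:.0f} kW")        # saved: 100 kW
print(f"extra GPUs at 600W: {extra}")    # extra GPUs at 600W: 166
```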

How to monitor

Compare DCGM_FI_DEV_POWER_USAGE against DCGM_FI_DEV_ENFORCED_POWER_LIMIT to detect whether a cap is binding. To distinguish power-driven from thermal-driven throttling, check the software power cap bit in DCGM_FI_DEV_CLOCK_THROTTLE_REASONS (mask 0x4): if it is set while draw sits at the limit, the cap is the cause, whereas the thermal slowdown bits (0x20 software, 0x40 hardware) point to a thermal cause. Factryze uses power capping as an automated remediation action during thermal emergencies, dynamically lowering limits on overheating GPUs and restoring full power once thermal conditions stabilize.
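A minimal sketch of that check, operating on values already scraped from the three DCGM fields. The mask constants follow the NVML clocks-throttle-reason definitions (0x4 software power cap, 0x20 software thermal slowdown, 0x40 hardware thermal slowdown); verify them against the headers for your driver and DCGM version. The function name classify_sample and the 2% margin are assumptions made for illustration.

```python
# Throttle-reason masks, assumed to match the NVML/DCGM header definitions.
SW_POWER_CAP = 0x4
SW_THERMAL_SLOWDOWN = 0x20
HW_THERMAL_SLOWDOWN = 0x40


def classify_sample(power_usage_w, enforced_limit_w, throttle_reasons, margin=0.02):
    """Classify one sample of power usage, enforced limit, and throttle bitmask.

    margin is a hypothetical tolerance: draw within 2% of the limit
    counts as the cap binding.
    """
    cap_binding = power_usage_w >= enforced_limit_w * (1.0 - margin)
    power_throttled = bool(throttle_reasons & SW_POWER_CAP)
    thermal_throttled = bool(
        throttle_reasons & (SW_THERMAL_SLOWDOWN | HW_THERMAL_SLOWDOWN)
    )
    return {
        "cap_binding": cap_binding,
        "power_throttled": power_throttled,
        "thermal_throttled": thermal_throttled,
    }


# Draw at the limit with the power-cap bit set: the cap is the constraint.
capped = classify_sample(598.0, 600.0, 0x4)
# Draw well under the limit with a thermal bit set: heat is the constraint.
hot = classify_sample(450.0, 600.0, 0x20)
print(capped)
print(hot)
```

In practice a single sample is noisy, so an alert would typically require the condition to hold over a window of samples before acting.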

Figure: Power Capping - Performance vs Power Tradeoff
DCGM Metric Field
DCGM_FI_DEV_POWER_USAGE / DCGM_FI_DEV_ENFORCED_POWER_LIMIT
