PCIe Bandwidth

Measured data transfer rate between the GPU and the host over PCI Express, usually expressed in GB/s.

What it is

PCIe bandwidth is the measured data transfer rate between the GPU and the host system over PCI Express. DCGM tracks transmit and receive separately via DCGM_FI_DEV_PCIE_TX_THROUGHPUT and DCGM_FI_DEV_PCIE_RX_THROUGHPUT, both reported in KB/s. Theoretical maximums are roughly 32 GB/s per direction for PCIe Gen4 x16 and 64 GB/s per direction for Gen5 x16 (about 31.5 GB/s and 63 GB/s once 128b/130b line encoding is accounted for), with real-world throughput at approximately 25 GB/s and 50 GB/s respectively. PCIe bandwidth is critical for data loading pipelines, frequent host-to-GPU transfers, and checkpoint writes.
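The per-direction figures above follow from the per-lane signaling rate, the line encoding, and the lane count. The sketch below reproduces that arithmetic and converts a DCGM reading to GB/s. The function names are illustrative, and the conversion assumes 1 KB = 1000 bytes, which is an assumption rather than something this page states.

```python
# Per-lane signaling rate (GT/s) and line-encoding efficiency per PCIe generation.
PCIE_GENERATIONS = {
    1: (2.5, 8 / 10),      # 8b/10b encoding
    2: (5.0, 8 / 10),      # 8b/10b encoding
    3: (8.0, 128 / 130),   # 128b/130b encoding
    4: (16.0, 128 / 130),
    5: (32.0, 128 / 130),
}


def theoretical_gb_per_s(generation: int, lanes: int) -> float:
    """Theoretical per-direction bandwidth in GB/s, before protocol overhead."""
    rate_gt_s, efficiency = PCIE_GENERATIONS[generation]
    # One transfer carries one bit per lane; divide by 8 to get bytes.
    return rate_gt_s * efficiency * lanes / 8


def dcgm_kb_to_gb_per_s(kb_per_s: float) -> float:
    """Convert a DCGM throughput reading from KB/s to GB/s.

    Assumption: 1 KB = 1000 bytes. Adjust if your tooling uses 1024.
    """
    return kb_per_s * 1000 / 1e9
```

With these definitions, theoretical_gb_per_s(4, 16) comes out near 31.5 and theoretical_gb_per_s(4, 8) is exactly half of that, which is why a width drop from x16 to x8 halves the available bandwidth.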

Why it matters

A sudden drop to about half of the expected PCIe throughput is the signature of link width degradation from x16 to x8. This is a hardware fault caused by marginal connections, damaged traces, or riser card issues, and it generates no Xid error and no default DCGM alert. For example, a data-loading pipeline that drops from 24 GB/s to 12 GB/s can starve the GPU, with utilization falling from 95% to 60% and no error message in any log. This is one of the most common silent hardware failures in dense GPU deployments.
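As a rough illustration of the halving signature, the sketch below compares the median of recent throughput samples against a baseline median and reports whether the ratio sits near 0.5. The function name and the 0.1 tolerance are hypothetical choices, not values from DCGM or Factryze. A ratio near one half can also come from lower workload demand, so a match is a prompt to check the negotiated link width, not proof of a fault.

```python
from statistics import median


def looks_like_width_halving(baseline_gb_s, recent_gb_s, tolerance=0.1):
    """Return True if recent throughput is close to half of the baseline.

    baseline_gb_s, recent_gb_s: sequences of throughput samples in GB/s,
    ideally taken while the same workload is saturating the link.
    tolerance: allowed deviation of the ratio from 0.5 (assumed value).
    """
    if not baseline_gb_s or not recent_gb_s:
        return False
    baseline = median(baseline_gb_s)
    if baseline <= 0:
        return False
    ratio = median(recent_gb_s) / baseline
    return abs(ratio - 0.5) <= tolerance
```

Medians are used instead of means so that a single idle or bursty sample does not dominate the comparison.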

How to monitor

Track DCGM_FI_DEV_PCIE_TX_THROUGHPUT and DCGM_FI_DEV_PCIE_RX_THROUGHPUT and compare them against the expected throughput for the negotiated link generation and width. Confirm the negotiated width with nvidia-smi --query-gpu=pcie.link.width.current --format=csv. Factryze monitors PCIe throughput patterns and flags bandwidth degradation consistent with link width downtraining before it causes GPU starvation.
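The width check can be scripted. The following is a minimal sketch, assuming nvidia-smi is on the PATH and that comparing pcie.link.width.current with pcie.link.width.max is a reasonable proxy for downtraining on your hardware. The helper names are hypothetical, and the parser is kept separate from the subprocess call so it can be exercised without a GPU.

```python
import csv
import io
import subprocess

QUERY_FIELDS = "index,pcie.link.width.current,pcie.link.width.max"


def read_link_widths() -> str:
    """Run nvidia-smi and return its CSV output (requires an NVIDIA driver)."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def find_degraded_links(csv_text: str):
    """Return (gpu_index, current_width, max_width) for each GPU whose
    negotiated link width is below the reported maximum."""
    degraded = []
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        fields = [field.strip() for field in row]
        # Skip blank lines and rows with non-numeric values such as "[N/A]".
        if len(fields) != 3 or not all(f.isdigit() for f in fields):
            continue
        index, current, maximum = (int(f) for f in fields)
        if current < maximum:
            degraded.append((index, current, maximum))
    return degraded
```

Calling find_degraded_links(read_link_widths()) on a GPU host returns an empty list when every link is running at its maximum width. A GPU installed in a slot that is wired for fewer lanes by design may also appear in the result, so interpret matches against the known topology.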

Figure: PCIe Bandwidth - Generation Comparison and Lane Degradation
DCGM Metric Field
DCGM_FI_DEV_PCIE_TX_THROUGHPUT / DCGM_FI_DEV_PCIE_RX_THROUGHPUT
