Memory Clock

GPU HBM/GDDR memory frequency in MHz that determines memory bandwidth.

What it is

The memory clock is the operating frequency of the GPU's memory subsystem (HBM2e, HBM3, or GDDR6X), measured in MHz and reported via DCGM_FI_DEV_MEM_CLOCK. It directly determines the memory bandwidth available to feed the streaming multiprocessors: on the HBM parts below, peak bandwidth is approximately the reported clock × 2 transfers per clock × the 5120-bit bus width (640 bytes). On H100 SXM, HBM3 runs at 2619 MHz for 3.35 TB/s peak; on A100 SXM 80 GB, HBM2e runs at 1593 MHz for about 2.0 TB/s, while the 40 GB A100 (HBM2) runs at 1215 MHz for about 1.6 TB/s. Unlike SM clocks, data center GPU memory clocks are typically locked at rated speed during active operation.

Why it matters

Because the memory clock should not vary under normal conditions, any reduction is a significant anomaly; even a 50-100 MHz drop means a proportional bandwidth reduction. A 10% memory clock reduction translates directly to 10% less bandwidth, which cuts throughput for memory-bandwidth-bound workloads like LLM inference and attention computation. A 40 GB A100 dropping from 1215 MHz to 1100 MHz (about 9.5% lower) during inference will see roughly a 10% latency increase in every bandwidth-bound layer, and therefore roughly 10% more end-to-end latency across the model.
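The arithmetic behind that example can be sketched in a few lines. This is a rough model, assuming the workload is fully memory-bandwidth-bound so that bandwidth scales linearly with the memory clock and latency scales with its inverse; the function name `clock_drop_impact` is hypothetical and not part of DCGM.

```python
def clock_drop_impact(rated_mhz: float, observed_mhz: float) -> dict:
    """Estimate the effect of a memory clock drop on a bandwidth-bound workload.

    Assumption: bandwidth is proportional to memory clock, and latency of a
    fully bandwidth-bound kernel is inversely proportional to bandwidth.
    """
    if rated_mhz <= 0 or observed_mhz <= 0:
        raise ValueError("clock values must be positive")

    # Fraction of rated bandwidth still available at the observed clock.
    bandwidth_fraction = observed_mhz / rated_mhz

    return {
        "bandwidth_loss_pct": (1.0 - bandwidth_fraction) * 100.0,
        "latency_increase_pct": (1.0 / bandwidth_fraction - 1.0) * 100.0,
    }


impact = clock_drop_impact(rated_mhz=1215, observed_mhz=1100)
print(f"bandwidth loss:   {impact['bandwidth_loss_pct']:.1f}%")    # 9.5%
print(f"latency increase: {impact['latency_increase_pct']:.1f}%")  # 10.5%
```

The latency increase is slightly larger than the bandwidth loss because it is measured against the reduced bandwidth (1215 / 1100), not the rated one. Real workloads that are only partly bandwidth-bound would see a smaller effect.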

How to monitor

Track DCGM_FI_DEV_MEM_CLOCK and alert on any deviation from the GPU's rated frequency while the GPU is under load. Correlate memory clock drops with DCGM_FI_DEV_MEMORY_TEMP (memory temperature) and DCGM_FI_DEV_ECC_SBE_VOL_TOTAL (volatile single-bit ECC error count) to distinguish a thermal protection response from failing HBM hardware. Factryze monitors memory clock continuously and correlates anomalies with HBM temperature and ECC error rates to classify the root cause.
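The correlation described above can be sketched as a small classifier over two consecutive metric samples. This is only an illustration, not how Factryze or DCGM classify anomalies: the `Sample` type, the `classify` function, the 5 MHz tolerance, and the 95 °C temperature threshold are all hypothetical and should be replaced with values appropriate to the GPU model. It assumes the samples were already collected from the three DCGM fields while the GPU was under load.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    mem_clock_mhz: float   # value of DCGM_FI_DEV_MEM_CLOCK
    memory_temp_c: float   # value of DCGM_FI_DEV_MEMORY_TEMP
    sbe_total: int         # value of DCGM_FI_DEV_ECC_SBE_VOL_TOTAL


def classify(prev: Sample, curr: Sample, rated_mhz: float,
             tolerance_mhz: float = 5.0,   # hypothetical noise margin
             hot_temp_c: float = 95.0      # hypothetical thermal threshold
             ) -> str:
    """Label the current sample based on clock, temperature, and ECC trend."""
    # Clock at (or within tolerance of) the rated speed: nothing to report.
    if curr.mem_clock_mhz >= rated_mhz - tolerance_mhz:
        return "ok"
    # Clock is low and memory is hot: likely a thermal protection response.
    if curr.memory_temp_c >= hot_temp_c:
        return "thermal"
    # Clock is low, memory is cool, and single-bit errors are rising.
    if curr.sbe_total > prev.sbe_total:
        return "possible_hbm_fault"
    # Clock is low with no matching signal: needs manual investigation.
    return "unexplained"


prev = Sample(mem_clock_mhz=1215, memory_temp_c=70, sbe_total=3)
curr = Sample(mem_clock_mhz=1100, memory_temp_c=72, sbe_total=9)
print(classify(prev, curr, rated_mhz=1215))  # possible_hbm_fault
```

Checking temperature before ECC counts is a design choice in this sketch: a hot memory stack can also raise error counts, so the thermal explanation is tried first.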

Figure: Memory Clock - HBM Frequency and Bandwidth
DCGM Metric Field
DCGM_FI_DEV_MEM_CLOCK
