kvserver: expose a per-store replica cpu histogram metric #138672

angles-n-daemons · 2025-01-08T20:22:48Z

In the interest of better troubleshooting hotspots, we're aiming to expose more observability around range usage and load.

To be considered complete: for each store, whenever updateReplicationGauges is called, this new metric should be updated with a histogram of the replica CPU values observed.

Epic: CRDB-43150
Jira issue: CRDB-46300


One of our goals with adding more hotspot telemetry is to better understand what is happening to a cluster when it has a hotspot. Today this is possible in real time, but information is limited when trying to understand hotspots from the past.

We currently have a log for the hot ranges in a cluster, which can be enabled to report the hot ranges periodically. To limit its output it runs infrequently, and it is therefore likely to miss temporary, or short-lived, hotspots.

The replica deciders already contain some functionality for determining when a specific replica is the target of a disproportionate share of the system's load. This change allows other parts of the system (namely the hot range logger) to subscribe to be notified when that tipping point is reached. A follow-up change will link the hot range logger to this new notification system, so that temporary hotspots can be better examined.

Fixes: cockroachdb#138672
Epic: CRDB-43150
Release note: none

angles-n-daemons added the C-enhancement label (Solution expected to add code/behavior + preserve backward-compat; pg compat issues are exception) Jan 8, 2025

angles-n-daemons changed the title ~~kvserver: expose a per-store maximum replica cpu metric~~ kvserver: expose a per-store replica cpu histogram metric Jan 9, 2025

angles-n-daemons added the T-observability label Jan 9, 2025

angles-n-daemons mentioned this issue Jan 9, 2025

kvserver: expose a per-store replica throughput histograms #138756

Open

exalate-issue-sync bot assigned angles-n-daemons Jan 14, 2025

exalate-issue-sync bot closed this as completed Jan 14, 2025
