Dashboards That Drive Action

Create dashboards that drive decisions: golden paths, drill-downs, and runbook links for fast response.

Dashboards That Operators Use

Start with a "service overview": traffic, errors, latency, saturation.
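Provide drill-down links from each panel: a pre-filled logs query, a traces filter, and the runbook.
Overlay deploy markers and config changes on the time-series panels so a regression can be matched to what changed.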
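
As a rough illustration of the drill-down links above, the sketch below builds logs, traces, and runbook URLs scoped to one service and time window. The base URLs, query parameter names, and the `drilldown_links` helper are all hypothetical; real logging and tracing tools each define their own URL formats.

```python
from urllib.parse import urlencode

# Hypothetical base URLs; substitute the ones your own tools expose.
LOGS_URL = "https://logs.example.com/search"
TRACES_URL = "https://traces.example.com/search"
RUNBOOK_URL = "https://wiki.example.com/runbooks"


def drilldown_links(service, start, end):
    """Build one-click links scoped to a service and the panel's time range."""
    # Parameter names ("query", "from", "to", "error") are assumptions.
    window = {"from": start, "to": end}
    logs_query = urlencode({"query": f"service={service} level=error", **window})
    traces_query = urlencode({"service": service, "error": "true", **window})
    return {
        "logs": f"{LOGS_URL}?{logs_query}",
        "traces": f"{TRACES_URL}?{traces_query}",
        "runbook": f"{RUNBOOK_URL}/{service}",
    }


links = drilldown_links("checkout", "now-1h", "now")
for name, url in links.items():
    print(f"{name}: {url}")
```

Carrying the panel's time window into every link is the important part: the operator lands on logs and traces for the same interval they were looking at, not the tool's default range.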

Core Panels

RPS and error rate (stacked by status class)
Latency percentiles (p50/p95/p99)
Latency and error rate for each downstream dependency
Resource saturation (CPU/memory/disk/net)
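
To make the first two panels concrete, here is a minimal sketch of the numbers behind them, computed from raw samples with the Python standard library. In practice a metrics backend usually derives these from counters and histograms; the function names and input shapes here are assumptions for illustration only.

```python
from collections import Counter
from statistics import quantiles


def status_class_rates(status_codes, window_seconds):
    """Requests per second for each status class (2xx, 4xx, 5xx, ...)."""
    counts = Counter(f"{code // 100}xx" for code in status_codes)
    return {cls: n / window_seconds for cls, n in counts.items()}


def latency_percentiles(latencies_ms):
    """p50/p95/p99 from raw latency samples (needs at least two samples)."""
    # n=100 yields 99 cut points; index i holds the (i + 1)th percentile.
    cuts = quantiles(latencies_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


rates = status_class_rates([200, 200, 200, 404, 500, 503], window_seconds=2)
print(rates)
print(latency_percentiles(list(range(1, 102))))
```

Stacking the per-class rates gives total RPS as the height of the stack, while the 5xx band shows the error share at a glance.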

Checklist

Can I answer "is the service healthy" in 10 seconds?
Can I find "what changed" quickly?
Can I pivot to logs/traces in one click?

Failure Modes

Pretty dashboards that support no decision: panels have no thresholds and no runbook links.
Too many panels: slow to load and cognitively overwhelming.
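
Both failure modes can be caught mechanically if dashboards are stored as data. The sketch below assumes a hypothetical dashboard dictionary with a `panels` list whose entries may carry `thresholds` and `runbook_url` keys; the schema and the panel budget of 12 are illustrative assumptions, not a real tool's format.

```python
MAX_PANELS = 12  # assumed budget; tune to what your team can scan quickly


def lint_dashboard(dashboard):
    """Flag panels that cannot drive a decision and dashboards that are too large."""
    problems = []
    panels = dashboard.get("panels", [])
    for panel in panels:
        title = panel.get("title", "<untitled>")
        if not panel.get("thresholds"):
            problems.append(f"{title}: no thresholds")
        if not panel.get("runbook_url"):
            problems.append(f"{title}: no runbook link")
    if len(panels) > MAX_PANELS:
        problems.append(f"{len(panels)} panels exceeds budget of {MAX_PANELS}")
    return problems


example = {
    "panels": [
        {
            "title": "Error rate",
            "thresholds": [0.01],
            "runbook_url": "https://wiki.example.com/runbooks/checkout",
        },
        {"title": "CPU"},
    ]
}
for problem in lint_dashboard(example):
    print(problem)
```

A check like this could run in review whenever a dashboard definition changes, so gaps are caught before an incident rather than during one.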
