Walk Through a Realistic Config
This page shows a compact, production-style config.yaml. See the Configuration reference for field-level defaults and validation.
Example
cluster: prod-us-east-1
metricsAddr: ":9090"
filters:
  watchedNamespaces: "^(prod|staging)-.*"
  ignoredNamespaces: "kube-,system-debug"
  watchedPodNamePrefixes: ""
  ignoredPodNamePrefixes: "debug-,test-"
behavior:
  muteSeconds: 600
  ignoreRestartCount: 30
  ignoreRestartsWithExitCodeZero: false
  resolveTTLSeconds: 600
  startupGraceSeconds: 30
  pvcPendingSeconds: 300
  disableLogCollection: false
  disableAnnotationSilences: false
channels:
  critical: alerts-critical
  warning: alerts-warning
  info: alerts-info
routing:
  - match: {severity: critical}
    sinks: [slack, pagerduty]
  - match: {severity: warning, namespace: prod-.*}
    sinks: [slack]
  - match: {severity: info}
    sinks: [slack]
  - match: {kind: Pod, reason: ImagePullBackOff, namespace: staging-.*}
    sinks: [slack]
severityOverrides:
  - match: {kind: Pod, reason: ImagePullBackOff, namespace: dev-.*}
    severity: info
sinkRates:
  pagerduty:
    perSecond: 10
    burst: 20
  discord:
    perSecond: 2
    burst: 5
grouping:
  enabled: true
  windowSeconds: 30
  by: [kind, namespace, reason, severity]
escalations:
  - match: {severity: critical}
    afterMinutes: 15
    sinks: [pagerduty]
receiver:
  enabled: true
  allowAnonymous: false
inhibitions:
  - source: {kind: Node, reason: NodeNotReady}
    target: {kind: Pod}
    equal: [node]
    duration: 10m
silences:
  - matchers: {namespace: kube-system}
    until: "2026-06-30T00:00:00Z"
persistence:
  enabled: true
  configMapName: alertkube-state
What Each Section Does
| Section | Purpose |
|---|---|
| cluster, metricsAddr | Name alerts and expose /metrics, /healthz, /readyz, /api/alerts, /api/v1/alerts. |
| filters | Limit watched namespaces and pod name prefixes. Namespace filters apply to all watchers; pod filters apply to Pods. |
| behavior | Dedupe, resolve timing, restart handling, startup grace, PVC pending threshold, log enrichment, annotation silences. |
| channels | Slack channel names by severity. Modern Slack apps need bot-token mode for this to work. |
| routing | First-match rules mapping alerts to sinks. namespace and reason are anchored regexes; most other fields are exact. |
| severityOverrides | Remap default watcher severity before dedupe and routing. |
| sinkRates | Per-sink token-bucket limits; defaults are conservative. |
| grouping | Storm folding. First alert dispatches immediately; later same-group alerts summarize. |
| escalations | Re-dispatch still-unresolved alerts to extra sinks after a delay. |
| receiver | Accept Alertmanager webhooks on POST /api/v1/alerts. |
| inhibitions | Suppress target alerts while a matching source alert is active. |
| silences | Suppress matching alerts until an RFC3339 timestamp. |
| persistence | Snapshot active alerts and mute history to a ConfigMap. |
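The first-match behavior described for routing can be modeled in a few lines of Python. This is an illustrative sketch, not the tool's implementation: the names RULES, matches, and route, the dict shape of an alert, and the empty result when no rule matches are all assumptions. The rules mirror the routing section of the example config.

```python
import re

# Hypothetical in-memory copy of the routing rules from the example config.
RULES = [
    ({"severity": "critical"}, ["slack", "pagerduty"]),
    ({"severity": "warning", "namespace": "prod-.*"}, ["slack"]),
    ({"severity": "info"}, ["slack"]),
    ({"kind": "Pod", "reason": "ImagePullBackOff", "namespace": "staging-.*"}, ["slack"]),
]

# Assumption: only these two fields are treated as regexes; the rest are exact.
REGEX_FIELDS = {"namespace", "reason"}


def matches(match, alert):
    """Return True when every field in the rule's match block accepts the alert."""
    for field, pattern in match.items():
        value = alert.get(field, "")
        if field in REGEX_FIELDS:
            # re.fullmatch anchors the pattern at both ends of the value.
            if re.fullmatch(pattern, value) is None:
                return False
        elif value != pattern:
            return False
    return True


def route(alert, rules=RULES):
    """Return the sinks of the first matching rule, or [] if none match (assumed)."""
    for match, sinks in rules:
        if matches(match, alert):
            return sinks
    return []
```

Because matching is anchored, a warning in a namespace named nonprod-api does not match prod-.* in this model, and because the first match wins, a critical ImagePullBackOff in staging is routed by the first rule rather than the last.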
Important invariants:
- muteSeconds and resolveTTLSeconds must be greater than 300 seconds.
- PagerDuty and Opsgenie receive every individual alert and resolve; they never receive grouped summaries.
- Resolves bypass silences and inhibitions so incidents can close.
- Config-file silences are operator-controlled. Annotation silences can be disabled with behavior.disableAnnotationSilences: true.
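The timing invariant can be checked before a config reaches the cluster. The following is a minimal sketch that assumes the behavior section has already been parsed into a Python dict; the function name check_behavior and the choice to skip omitted keys are assumptions, not part of the tool.

```python
MIN_EXCLUSIVE_SECONDS = 300  # from the invariant: values must be greater than 300


def check_behavior(behavior):
    """Return a list of problems found; an empty list means the check passed."""
    problems = []
    for key in ("muteSeconds", "resolveTTLSeconds"):
        value = behavior.get(key)
        if value is None:
            # Assumption: an omitted key falls back to a valid default.
            continue
        if not isinstance(value, int) or value <= MIN_EXCLUSIVE_SECONDS:
            problems.append(
                f"behavior.{key} must be greater than {MIN_EXCLUSIVE_SECONDS}, got {value!r}"
            )
    return problems
```

Calling check_behavior({"muteSeconds": 600, "resolveTTLSeconds": 600}) returns an empty list, while a value of exactly 300 is reported because the bound is exclusive.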
Test the Config
- Validate YAML syntax.
- Run Helm with --dry-run=client.
- Apply, then trigger a known test alert.
- Check suppression counters.
Common Patterns
Large Cluster
behavior:
  muteSeconds: 900  # longer mute window
  resolveTTLSeconds: 900
grouping:
  enabled: true
  windowSeconds: 60  # wider window
  by: [kind, namespace, reason]  # no severity key: fold across severities
inhibitions:
  - source: {kind: Node}  # suppress pods when nodes fail
    target: {kind: Pod}
    equal: [node]
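How the grouping by list decides which alerts fold together can be sketched as a key function. This is a hypothetical model (the helper group_key and the alert dicts are assumptions): alerts fold only when every field named in by is equal, so a field left out of the list is the one alerts fold across.

```python
def group_key(alert, by):
    """Build a grouping key from the fields named in `by` (hypothetical helper)."""
    return tuple(alert.get(field, "") for field in by)


by = ["kind", "namespace", "reason"]

a = {"kind": "Pod", "namespace": "prod-api", "reason": "OOMKilled", "severity": "warning"}
b = dict(a, severity="critical")   # differs from a only in severity
c = dict(a, namespace="prod-web")  # differs from a only in namespace

# severity is not in `by`, so a and b share a group.
folds_across_severity = group_key(a, by) == group_key(b, by)

# namespace is in `by`, so a and c land in different groups.
splits_by_namespace = group_key(a, by) != group_key(c, by)
```

Under this model, folding across namespaces would require removing namespace from the by list.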
Small Cluster
behavior:
  muteSeconds: 360
  resolveTTLSeconds: 360
grouping:
  enabled: false  # each alert is meaningful
Strict Environment
behavior:
  disableAnnotationSilences: true  # only config-file silences apply
  disableLogCollection: true
Multi-Sink Routing
routing:
  - match: {severity: critical}
    sinks: [slack, pagerduty, opsgenie]  # reach everyone
  - match: {severity: warning}
    sinks: [slack, opsgenie]  # ops see warnings
  - match: {severity: info}
    sinks: [slack]  # only chat
See Also
- Configuration schema reference - all keys, types, defaults, and validation rules.
- Configure alert sinks - set up each sink (Slack, PagerDuty, etc.).
- Configure Alertmanager webhook receiver - receiver and API token setup.
- Tune the mute window and grouping - deep dive on dedup and storm folding.
- Suppress dependent alerts with inhibitions - inhibition patterns and examples.
- Silence alerts for a time window - time-bounded suppression.