Commit Graph

3 Commits

Author SHA1 Message Date
chemavx 4facdd8515 fix(monitoring): correct alert rule pipeline to A→B(reduce)→C(threshold)
Grafana threshold expression requires a scalar input, not a raw time
series. Added explicit reduce step (type: reduce, reducer: last) as
refId B between the Prometheus query (A) and the threshold check (C).

All 4 rules updated: CrashLoopBackOff, Disk >80%, RAM >85%, Pod Failed.
The condition field changed from B → C on each rule.
2026-04-26 15:46:39 +00:00
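
For illustration, the A→B(reduce)→C(threshold) pipeline described in this commit might look roughly like the following in a Grafana alerting provisioning file. This is an untested sketch, not the actual rule from the repository: the group name, folder, uid, datasource uid, and PromQL expression are all assumptions, and only the three-step shape, the `reducer: last` setting, `condition: C`, and `for: 5m` come from the commit messages.

```yaml
apiVersion: 1
groups:
  - orgId: 1
    name: homelab                 # hypothetical group name
    folder: Homelab               # hypothetical folder
    interval: 1m
    rules:
      - uid: disk-usage-high      # hypothetical uid
        title: Disk > 80%
        condition: C              # must point at the threshold step, not the reduce step
        for: 5m
        data:
          # A: raw Prometheus query, returns a time series per filesystem
          - refId: A
            relativeTimeRange: { from: 300, to: 0 }
            datasourceUid: prometheus   # hypothetical datasource uid
            model:
              refId: A
              # hypothetical expression using node_exporter metrics
              expr: (1 - node_filesystem_avail_bytes{fstype!="tmpfs"} / node_filesystem_size_bytes{fstype!="tmpfs"}) * 100
          # B: reduce each series to a single value (its last sample)
          - refId: B
            datasourceUid: __expr__
            model:
              refId: B
              type: reduce
              expression: A
              reducer: last
          # C: threshold check on the reduced value from B
          - refId: C
            datasourceUid: __expr__
            model:
              refId: C
              type: threshold
              expression: B
              conditions:
                - evaluator:
                    type: gt
                    params: [80]
```

The point of step B is that the threshold in C compares one number per series against 80; without it, C would receive the full series from A.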
chemavx bb64cc9e62 fix(monitoring): hardcode chatid as string in Telegram contact point
Grafana's env var substitution of a numeric TELEGRAM_CHAT_ID caused a
JSON unmarshal error (number into string field). chatid is not sensitive,
so hardcode it directly as a string; only bottoken uses ${TELEGRAM_BOT_TOKEN}.
2026-04-26 15:40:21 +00:00
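
A contact point along the lines this commit describes could be sketched as below. The contact point name, receiver uid, and the chat ID value are placeholders, not values from the repository; the `bottoken` and `chatid` setting names and the `${TELEGRAM_BOT_TOKEN}` interpolation are the parts taken from the commit message.

```yaml
apiVersion: 1
contactPoints:
  - orgId: 1
    name: telegram-homelab          # hypothetical contact point name
    receivers:
      - uid: telegram-homelab       # hypothetical uid
        type: telegram
        settings:
          # Sensitive: still interpolated from the environment
          bottoken: ${TELEGRAM_BOT_TOKEN}
          # Not sensitive: quoted so YAML parses it as a string, not a number
          chatid: "123456789"       # placeholder value
```

Quoting the value is what keeps it a string after parsing; an unquoted all-digit value would be read as a number.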
chemavx 94c059ccb9 feat(monitoring): Grafana alerting → Telegram for homelab
- Secret grafana-telegram: bot token + chat ID (env var injection)
- ConfigMap grafana-alerting: provisioning files for contact point,
  notification policy, and 4 alert rules
  * Pod CrashLoopBackOff (for: 1m, noData: OK)
  * Disk > 80% on non-tmpfs filesystems (for: 5m)
  * RAM > 85% (for: 5m)
  * Pod Failed/Unknown (for: 3m, noData: OK)
- Deployment: TELEGRAM_* env vars from secret + alerting volume mount

Token interpolated via ${TELEGRAM_BOT_TOKEN} in provisioning YAML.
2026-04-26 15:25:07 +00:00
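
As a rough sketch of the Deployment wiring listed in this commit, the fragment below injects the TELEGRAM_* variables from the grafana-telegram Secret and mounts the grafana-alerting ConfigMap into Grafana's alerting provisioning directory. The Deployment name, labels, image tag, volume name, and Secret key names are assumptions; only the Secret name, ConfigMap name, and env var names appear in the commit message.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana                     # hypothetical name
spec:
  selector:
    matchLabels:
      app: grafana                  # hypothetical label
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
        - name: grafana
          image: grafana/grafana    # tag omitted; pin one in practice
          env:
            - name: TELEGRAM_BOT_TOKEN
              valueFrom:
                secretKeyRef:
                  name: grafana-telegram
                  key: bot-token    # hypothetical key name
            - name: TELEGRAM_CHAT_ID
              valueFrom:
                secretKeyRef:
                  name: grafana-telegram
                  key: chat-id      # hypothetical key name
          volumeMounts:
            # Grafana reads alerting provisioning files from this directory
            - name: alerting
              mountPath: /etc/grafana/provisioning/alerting
              readOnly: true
      volumes:
        - name: alerting
          configMap:
            name: grafana-alerting
```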