Grafana threshold expression requires a scalar input, not a raw time
series. Added explicit reduce step (type: reduce, reducer: last) as
refId B between the Prometheus query (A) and the threshold check (C).
All 4 rules updated: CrashLoopBackOff, Disco >80%, RAM >85%, Pod Failed.
condition field changed from B → C on each rule.
Grafana env var substitution of a numeric TELEGRAM_CHAT_ID caused
json unmarshal error (number into string field). chatid is not sensitive
so hardcode it directly; only bottoken uses ${TELEGRAM_BOT_TOKEN}.
- Delete 26 secret manifests containing REDACTED placeholder values
(15 cert-manager TLS + 11 app secrets across 8 namespaces)
- REDACTED is valid base64 that decodes to non-UTF-8 bytes — ArgoCD
applying these manifests corrupts live secrets in the cluster
- Add .githooks/pre-commit that rejects any .yaml with REDACTED
- Add README.md documenting secret management policy and manual
creation commands for each service
- n8n secret manifests already fixed in previous commits (618b1e8, db04fd2)