# Cardinal Rule Violations Audit Report — clavis-telemetry

**Auditor:** Yurii (Code & Principle Review)  
**Domain:** clavis-telemetry  
**Domain Owner:** Hans (operations, monitoring, NOC)  
**Date:** 2026-04-08  
**Handbook Version:** 1.0

---

## Audit Summary

4 Cardinal Rule violations identified in `clavis/clavis-telemetry/`:

| Issue | File | Violation | Severity | Assignee |
|-------|------|-----------|----------|----------|
| #001 | main.go | Missing unique error codes | Medium | @hans |
| #002 | main.go | Unhandled database errors | **High** | @hans |
| #003 | kuma.go | Silent failure in Kuma push | **High** | @hans |
| #004 | main.go | Unchecked flush error | Low | @hans |

---

## Cardinal Rules Status

| Rule | Status | Notes |
|------|--------|-------|
| #1 — Security failures are LOUD | ⚠️ Partial | Error handling present but lacks unique codes |
| #2 — Server never holds L2/L3 | ✅ Pass | Telemetry doesn't handle key material |
| #3 — Browser is trust anchor | N/A | Not applicable to telemetry service |
| #4 — Threat defenses | ✅ Pass | mTLS properly implemented |
| #5 — WL3 GDPR-out-of-scope | ✅ Pass | No PII in telemetry schema |
| #6 — Admin over Tailscale | ✅ Pass | Separate from public interface |
| #7 — Webhooks signed | N/A | No webhook handlers in telemetry |

---

## Part 1 Violations (Culture/Foundation)

### Mandatory Error Handling with Unique Codes — VIOLATED

**Rule:**

> Mandatory error handling with unique codes:
> - Every `if` needs an `else`. The `if` exists because the condition IS possible
> - Use unique error codes: `ToLog("ERR-12345: L3 unavailable in decrypt")`

**Violations Found:**

1. **Generic error messages** (main.go:41, 47, 228, etc.)
   - `log.Fatalf("Failed to open operations.db: %v", err)`
   - Missing `ERR-TELEMETRY-XXX` prefix

2. **Silent failures** (kuma.go:56-58)
   - Kuma push errors completely ignored
   - Comment rationalizes the silence

3. **Unchecked errors** (main.go:328, 335, 338, 354)
   - Database `QueryRow` and `Exec` errors not handled
   - Silent data corruption risk

---

## Recommended Fix Priority

### Immediate (Before Next Deploy)

1. **Issue #002** — Unhandled database errors in `updateSpan()`
   - Risk: Silent outage tracking failures
   - Impact: Operations blind spot during incidents

2. **Issue #003** — Silent Kuma push failure
   - Risk: Monitoring blind spot
   - Impact: "Down" status with no logs explaining why

### This Sprint

3. **Issue #001** — Missing unique error codes
   - Add `ERR-TELEMETRY-XXX` codes to all error paths
   - Update test cases to verify codes appear

4. **Issue #004** — Unchecked tarpit flush error
   - Minor fix, principle consistency

---

## Files Created

- `issues/001-missing-error-codes.md` — Assignee: Hans
- `issues/002-unhandled-db-errors.md` — Assignee: Hans
- `issues/003-silent-kuma-failure.md` — Assignee: Hans
- `issues/004-tarpit-flush-error.md` — Assignee: Hans

---

## Domain Compliance

Per CLAVITOR-AGENT-HANDBOOK.md Section V:

> ### `clavis-telemetry` — operator telemetry
> **Owns:**
> - Heartbeat metrics from POPs to central
> - Per-POP system metrics (CPU, memory, disk, vault count)
>
> **Must never:**
> - Send any vault content. ✅ Telemetry is operational, not data.
> - Send raw IP addresses of users. ✅ Aggregate counts only.
> - Run in community edition by default. ✅ `//go:build commercial` tag present.

**Domain compliance: PASS** — The telemetry service properly adheres to its domain constraints. The violations are in error handling hygiene, not in security model violations.

---

## Next Steps

1. Hans addresses issues #002 and #003 (High priority)
2. Hans addresses issues #001 and #004 (Medium/Low priority)
3. Yurii reviews PRs before merge
4. Add daily review checklist to telemetry directory per Section III

---

*Foundation First. No mediocrity. Ever.*
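
For reference when addressing issues #001 to #003, here is a minimal Go sketch of what coded, non-silent error handling could look like. It is not the telemetry service's actual code. The `CodedError` type, the `withCode` and `codeOf` helpers, and the specific `ERR-TELEMETRY-NNN` numbers are hypothetical. The database and Kuma calls are stood in for by plain `func() error` values, so the real `QueryRow`, `Exec`, and push logic would need to be adapted.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical code assignments; the real numbering is up to the domain owner.
const (
	CodeSpanQuery = "ERR-TELEMETRY-002"
	CodeSpanExec  = "ERR-TELEMETRY-003"
	CodeKumaPush  = "ERR-TELEMETRY-004"
)

// CodedError pairs a unique, greppable code with the operation and its cause.
type CodedError struct {
	Code string
	Op   string
	Err  error
}

func (e *CodedError) Error() string {
	return fmt.Sprintf("%s: %s: %v", e.Code, e.Op, e.Err)
}

// Unwrap keeps the original cause reachable via errors.Is / errors.As.
func (e *CodedError) Unwrap() error { return e.Err }

// withCode wraps a non-nil error with a code; a nil error stays nil.
func withCode(code, op string, err error) error {
	if err == nil {
		return nil
	}
	return &CodedError{Code: code, Op: op, Err: err}
}

// codeOf returns the outermost code in an error chain, or "" if none is present.
func codeOf(err error) string {
	var ce *CodedError
	if errors.As(err, &ce) {
		return ce.Code
	}
	return ""
}

// updateSpan sketches the shape of the fix for issue #002: every database
// step is checked and returned with its own code instead of being dropped.
// query and exec are stand-ins for the real QueryRow(...).Scan and Exec calls.
func updateSpan(query, exec func() error) error {
	if err := query(); err != nil {
		return withCode(CodeSpanQuery, "updateSpan: read open span", err)
	}
	if err := exec(); err != nil {
		return withCode(CodeSpanExec, "updateSpan: write span", err)
	}
	return nil
}

// pushKuma sketches the fix for issue #003: a failed push is surfaced to the
// caller rather than ignored. send is a stand-in for the real HTTP push.
func pushKuma(send func() error) error {
	if err := send(); err != nil {
		return withCode(CodeKumaPush, "kuma push", err)
	}
	return nil
}
```

A caller would log the returned error once, for example with `log.Printf("%v", err)`, which yields a line starting with the code. Tests can then assert on `codeOf(err)` rather than matching message text, which fits the "verify codes appear" item under issue #001.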