Cardinal Rule Violations Audit Report — clavis-telemetry

Auditor: Yurii (Code & Principle Review)
Domain: clavis-telemetry
Domain Owner: Hans (operations, monitoring, NOC)
Date: 2026-04-08
Handbook Version: 1.0


Audit Summary

4 Cardinal Rule violations identified in clavis/clavis-telemetry/:

| Issue | File | Violation | Severity | Assignee |
| --- | --- | --- | --- | --- |
| #001 | main.go | Missing unique error codes | Medium | @hans |
| #002 | main.go | Unhandled database errors | High | @hans |
| #003 | kuma.go | Silent failure in Kuma push | High | @hans |
| #004 | main.go | Unchecked flush error | Low | @hans |

Cardinal Rules Status

| Rule | Status | Notes |
| --- | --- | --- |
| #1 — Security failures are LOUD | ⚠️ Partial | Error handling present but lacks unique codes |
| #2 — Server never holds L2/L3 | Pass | Telemetry doesn't handle key material |
| #3 — Browser is trust anchor | N/A | Not applicable to telemetry service |
| #4 — Threat defenses | Pass | mTLS properly implemented |
| #5 — WL3 GDPR-out-of-scope | Pass | No PII in telemetry schema |
| #6 — Admin over Tailscale | Pass | Separate from public interface |
| #7 — Webhooks signed | N/A | No webhook handlers in telemetry |

Part 1 Violations (Culture/Foundation)

Mandatory Error Handling with Unique Codes — VIOLATED

Rule:

Mandatory error handling with unique codes:

  • Every if needs an else. The if exists because the condition IS possible
  • Use unique error codes: ToLog("ERR-12345: L3 unavailable in decrypt")

Violations Found:

  1. Generic error messages (main.go:41, 47, 228, etc.)

    • log.Fatalf("Failed to open operations.db: %v", err)
    • Missing ERR-TELEMETRY-XXX prefix
  2. Silent failures (kuma.go:56-58)

    • Kuma push errors completely ignored
    • Comment rationalizes the silence
  3. Unchecked errors (main.go:328, 335, 338, 354)

    • Database QueryRow and Exec errors not handled
    • Silent data corruption risk
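The unchecked `Exec` case can be sketched as follows. This is an illustration under assumptions, not the code in main.go: the function `closeSpan`, the `spans` table, its columns, and the error numbers are all hypothetical. A small interface stands in for `*sql.DB` so the example runs without a database driver.

```go
package main

import (
	"database/sql"
	"errors"
	"fmt"
	"log"
)

// execer is the subset of *sql.DB used here; *sql.DB and *sql.Tx satisfy it.
type execer interface {
	Exec(query string, args ...any) (sql.Result, error)
}

// closeSpan is a hypothetical stand-in for one write inside updateSpan().
// Every failure path returns a distinct code instead of being dropped.
func closeSpan(db execer, spanID, endedAt int64) error {
	res, err := db.Exec("UPDATE spans SET ended_at = ? WHERE id = ?", endedAt, spanID)
	if err != nil {
		return fmt.Errorf("ERR-TELEMETRY-020: close span %d: %w", spanID, err)
	}
	n, err := res.RowsAffected()
	if err != nil {
		return fmt.Errorf("ERR-TELEMETRY-021: rows affected for span %d: %w", spanID, err)
	}
	if n == 0 {
		return fmt.Errorf("ERR-TELEMETRY-022: span %d not found", spanID)
	}
	return nil
}

// failingDB simulates a database whose writes always fail.
type failingDB struct{}

func (failingDB) Exec(string, ...any) (sql.Result, error) {
	return nil, errors.New("database is locked")
}

func main() {
	if err := closeSpan(failingDB{}, 7, 1700000000); err != nil {
		log.Print(err)
	} else {
		log.Print("span 7 closed")
	}
}
```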

Immediate (Before Next Deploy)

  1. Issue #002 — Unhandled database errors in updateSpan()

    • Risk: Outage tracking fails silently
    • Impact: Operations blind spot during incidents
  2. Issue #003 — Silent Kuma push failure

    • Risk: Monitoring blind spot
    • Impact: "Down" status with no logs explaining why

This Sprint

  1. Issue #001 — Missing unique error codes

    • Add ERR-TELEMETRY-XXX codes to all error paths
    • Update test cases to verify codes appear
  2. Issue #004 — Unchecked tarpit flush error

    • Minor fix, principle consistency

Files Created

  • issues/001-missing-error-codes.md — Assignee: Hans
  • issues/002-unhandled-db-errors.md — Assignee: Hans
  • issues/003-silent-kuma-failure.md — Assignee: Hans
  • issues/004-tarpit-flush-error.md — Assignee: Hans

Domain Compliance

Per CLAVITOR-AGENT-HANDBOOK.md Section V:

clavis-telemetry — operator telemetry

Owns:

  • Heartbeat metrics from POPs to central
  • Per-POP system metrics (CPU, memory, disk, vault count)

Must never:

  • Send any vault content. Telemetry is operational, not data.
  • Send raw IP addresses of users. Aggregate counts only.
  • Run in community edition by default. //go:build commercial tag present.

Domain compliance: PASS — The telemetry service adheres to its domain constraints. The violations concern error-handling hygiene, not the security model.


Next Steps

  1. Hans addresses issues #002 and #003 (High priority)
  2. Hans addresses issues #001 and #004 (Medium/Low priority)
  3. Yurii reviews PRs before merge
  4. Add daily review checklist to telemetry directory per Section III

Foundation First. No mediocrity. Ever.