Issue: Missing unique error codes in clavis-telemetry
Domain: clavis-telemetry
Assignee: @hans
Labels: violation, cardinal-rule-part-1, error-handling
Priority: Medium
Date: 2026-04-08
Violation
Cardinal Rule Violated: Part 1 — "Mandatory error handling with unique codes"
Per CLAVITOR-AGENT-HANDBOOK.md Part 1:
Mandatory error handling with unique codes:
- Every `if` needs an `else`. The `if` exists because the condition IS possible
- Use unique error codes: `ToLog("ERR-12345: L3 unavailable in decrypt")`
- When your "impossible" case triggers in production, you need to know exactly which assumption failed and where.
Error messages that actually help:
Every error message shown to a user must be:
- Uniquely recognizable: include an error code (`ERR-12345: ...`)
- Actionable: the user must know what to do next
- Routed to the actor who can resolve it
Location
File: clavis/clavis-telemetry/main.go
Lines with violations:
| Line | Current Code | Violation |
|---|---|---|
| 41 | `log.Fatalf("Failed to open operations.db: %v", err)` | No unique error code |
| 47 | `log.Fatalf("Failed to load CA chain for mTLS: %v", err)` | No unique error code |
| 228 | `log.Printf("Invalid certificate from %s: %v", popID, err)` | No unique error code |
| 337 | `log.Printf("SPAN EXTEND node=%s gap=%ds...")` | No unique error code |
| 342-351 | `log.Printf("OUTAGE SPAN node=%s...")` | No unique error code |
| 367-370 | `log.Printf("OUTAGE SPAN... alerting disabled")` | No unique error code |
| 383 | `log.Printf("OUTAGE SPAN ntfy error creating request: %v", err)` | No unique error code |
| 395 | `log.Printf("OUTAGE SPAN ntfy error sending alert: %v", err)` | No unique error code |
| 398 | `log.Printf("OUTAGE SPAN ntfy alert sent for node=%s", nodeID)` | No unique error code |
File: clavis/clavis-telemetry/kuma.go
| Line | Current Code | Violation |
|---|---|---|
| 56-58 | Silent fail on Kuma push error | Missing error handling entirely |
Required Fix
- Assign unique error codes for each error path (e.g., `ERR-TELEMETRY-001` through `ERR-TELEMETRY-020`)
- Format: `ERR-TELEMETRY-XXX: <actionable message>`
- Include error codes in:
  - Fatal logs (database/CA loading failures)
  - Certificate validation failures
  - External alerting failures (ntfy)
  - Kuma push failures (currently silent)
Example Fix
// Before:
log.Fatalf("Failed to open operations.db: %v", err)
// After:
log.Fatalf("ERR-TELEMETRY-001: Failed to open operations.db at %s - %v. Check permissions and disk space.", dbPath, err)
Verification Checklist
- All `log.Fatalf` calls include `ERR-TELEMETRY-XXX` codes
- All `log.Printf` error logs include `ERR-TELEMETRY-XXX` codes
- Kuma push errors are no longer silent (kuma.go, lines 56-58)
- Certificate validation failures include error codes
- External alert failures (ntfy) include error codes
- Test cases verify error codes appear in output
Reporter: Yurii (Code & Principle Review)
Reference: CLAVITOR-AGENT-HANDBOOK.md Part 1, "Mandatory error handling with unique codes"