CRITICAL: clavis-telemetry silent failure in Kuma push #3

Closed
opened 2026-04-09 04:46:09 +00:00 by johan · 0 comments
Owner

Violation

Per CLAVITOR-AGENT-HANDBOOK.md Part 1:

Every if needs an else.
Quick fixes are not fixes. A "temporary" hack that ships is permanent.

Location

File: clavis/clavis-telemetry/kuma.go
Lines 53-59:

// POST to Kuma
payload := `{"status":"...` 
resp, err := http.Post(kumaURL, "application/json", strings.NewReader(payload))
if err != nil {
    // Silent fail - Kuma will detect silence as down
    return
}
resp.Body.Close()

The Violation

  1. Silent failure: The error is caught and completely ignored with only a comment
  2. No error code: No ERR-XXXXX code for operational forensics
  3. No logging: The failure is invisible in logs
  4. Comment is misleading: "Kuma will detect silence as down" is true, but operators won't know why the pushes stopped

Why This Matters

When Kuma shows "down", operators need to know if it's because:

  • The telemetry service is actually down (DB failure)
  • The telemetry service can't reach Kuma (network issue)
  • Kuma itself is having issues

Silent failures create blind spots in operational monitoring.

Required Fix

  1. Log Kuma push failures with a unique error code
  2. Include the error details in the log
  3. Handle non-OK HTTP responses
  4. Handle response body close errors
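
A minimal sketch of what the four points above could look like, not the actual patch. The function name `pushKuma`, the `kumaClient` variable, its 10-second timeout, and the example URL and payload in `main` are all assumptions for illustration; `ERR-XXXXX` is left as the placeholder from this issue because no real code has been assigned here. "Non-OK" is read as "any status outside 2xx", which is also an assumption.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"strings"
	"time"
)

// errKumaPush is a placeholder; the issue only asks for a unique ERR-XXXXX code.
const errKumaPush = "ERR-XXXXX"

// kumaClient uses an assumed 10-second timeout so a hung Kuma cannot block the caller.
var kumaClient = &http.Client{Timeout: 10 * time.Second}

// pushKuma POSTs payload to kumaURL and returns a coded error for every
// failure path: transport error, non-2xx status, and body close error.
func pushKuma(kumaURL, payload string) (err error) {
	resp, err := kumaClient.Post(kumaURL, "application/json", strings.NewReader(payload))
	if err != nil {
		return fmt.Errorf("%s: Kuma push failed: %w", errKumaPush, err)
	}
	defer func() {
		// Report a close failure only if no earlier error is already being returned.
		if cerr := resp.Body.Close(); cerr != nil && err == nil {
			err = fmt.Errorf("%s: closing Kuma response body: %w", errKumaPush, cerr)
		}
	}()
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return fmt.Errorf("%s: Kuma push returned HTTP %s", errKumaPush, resp.Status)
	}
	return nil
}

func main() {
	// Hypothetical URL and payload; the real values are not shown in this issue.
	if err := pushKuma("http://127.0.0.1:3001/api/push/example", `{"status":"up"}`); err != nil {
		log.Print(err) // the failure is now visible in logs, with its code
	}
}
```

Returning the error and logging it at the call site keeps the decision about log level and retry with the caller. Logging inside `pushKuma` would satisfy the issue equally well.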

Assignment

  • Domain: clavis-telemetry
  • Domain Owner: Hans (per Section I agent mapping)
  • Priority: Critical
  • Review by: Yurii (after fix)
hans was assigned by johan 2026-04-09 05:41:54 +00:00
johan closed this issue 2026-04-09 06:34:03 +00:00