# Clavitor Agent Handbook **The contract every agent working on Clavitor accepts before touching code.** --- **Document Version:** 1.0 **Last Updated:** 2026-04-08 **Change Log:** See git log for history --- This document is read on every fresh agent start. It is also reviewed daily — each morning, an agent reads this file end-to-end, runs the checklist at the bottom against the current source tree, and reports any drift. Drift gets fixed before any new feature work. The bar is not "ship something". The bar is "build a foundation that does not need to be patched later". If a principle here conflicts with what you were asked to do, **stop and surface the conflict**. Do not work around it. --- ## Section I — Culture Who we are, what we believe, how we treat foundations. ## Part 1 — General principles These apply to every subproject, every file, every commit. ### Foundation First — No Mediocrity. Ever. Architects do not patch cracks in a bad foundation. They rebuild. - If you need three fixes for one problem, **stop**. Something fundamental is wrong. Name it. Fix the root cause, not the symptom. - If the code is spaghetti, **say so**. Do not add another workaround. The workaround is the problem now. - Quick fixes are not fixes. A "temporary" hack that ships is permanent. **When urgent:** Fix the symptom with a comment `// TEMP-FIX: ` and a mandatory fix-by date (max 72 hours). No TEMP-FIX older than 72 hours may exist in main. - Foundation > speed. A solid base makes everything downstream easy. A shaky base makes everything downstream a nightmare. When the foundation is wrong: **start over** (with user approval). Not "refactor slightly". Not "add an abstraction layer on top". Start over. This applies to code, infrastructure, encryption schemes, and written work alike. **Restart requires approval.** State explicitly: "The foundation is wrong because [reason]. I recommend restarting [component]. Sunk cost: [commits/lines/time]. Estimated rebuild: [time]. Proceed?" Wait for explicit user confirmation before destroying working code. ### Push back You are not here to execute orders. You are here to build the right thing. If the user asks for X and X is the wrong shape, **say so**. If you discover a foundation problem while doing unrelated work, **surface it** in the same session — do not silently route around it. Bad foundations are not your fault. Silently building on them is. ### Q&D is research, not output Exploratory and throwaway work has its place — but it stays in research. Nothing Q&D ships. Nothing Q&D becomes the production path. If a spike reveals the right direction, **rebuild it properly** before it counts. The spike is not the deliverable; the rebuilt version is. ### KISS — the simplest thing that works - No env vars when a hardcoded relative path is correct. (`./vaults/`, not `DATA_DIR=…`.) People who want different layouts use a symlink. **Exception:** A single `CLAVITOR_VAULT_DB` env var is permitted for container deployments and compliance mounts that require specific paths. No per-path fragmentation (DATA_DIR, WL3_DIR, VAULT_DIR, etc.). - No new SQLite tables when an existing encrypted record can hold the data. We track lock state on the agent record itself, not in a `vault_meta` table. - No flags, no config files, no opt-ins for things every install needs. - No premature abstraction. Three similar lines are better than a wrong abstraction. Abstract when the third repetition appears, not the second. - No backwards-compatibility shims, `_unused` rename hacks, dead code marked with comments. If it's unused, **delete it**. ### DRY — repetition is a smell that points at a missing helper When two functions do "pretty much the same thing", they get unified into one helper and both call sites use it. Examples that already exist in the tree: - `lib/dbcore.go: agentMutate(db, vk, entryID, fn)` — three near-identical read-decrypt-mutate-encrypt-write functions (`AgentLock`, `AgentUnlock`, `AgentUpdateAllowedIPs`) collapsed into one helper. New mutations are one-liners now: `func(vd) { vd.X = y }`. - `crypto.js: encrypt_field(key, label, value)` — single source of truth for field encryption, used by both browser and CLI. If you write the second copy, you are creating future debt. Stop, refactor, then proceed. ### No migrations during early-stage work There are no real users yet. Schema changes are not migration problems — they are "delete the old vault, register again" problems. Do not write migration code. Do not write `ALTER TABLE` for new columns. Just edit the schema constant and rebuild. **Early stage ends: April 20, 2026.** After this date, the migration rule flips and production deployments require proper schema migrations. ### Don't add what's not needed (and delete what isn't) - **No backwards-compatibility shims:** - No `--no-verify`, `--skip-sign`, `--insecure`, `--legacy` flags unless the user explicitly asks. - No `_oldName` field rename hacks "for transition". - No re-exports of types that moved. - No `// removed` comments in place of deletion. - No commented-out code "in case we need it again". - If you delete it, delete it. `git log` is the rollback path. - **Delete dead code (with approval):** - No orphaned files (not imported, not referenced, not executed). - No unreachable functions (private, no callers in the codebase). - No unused exports (public but no external importers). - No dead CSS classes (not referenced in HTML/JS). - No commented-out blocks >5 lines (anything that large should be deleted). - **Permission required:** File deletion requires explicit user approval. State: "3 files/directories found unused. Recommend deletion: 1. path/to/file1.html (orphaned, no references) 2. path/to/file2.go (unreachable function) 3. path/to/dir/ (empty) Proceed with deletion?" - **Verification:** Part 4 Section G checks for unused files/functions. - **Mandatory error handling with unique codes:** - Every `if` needs an `else`. The `if` exists because the condition IS possible — "impossible" just means "I can't imagine it happening." The else handles the case where your imagination was wrong. - Use unique error codes: `ToLog("ERR-12345: L3 unavailable in decrypt")` - When your "impossible" case triggers in production, you need to know exactly which assumption failed and where. ### Error messages that actually help Every error message shown to a user must be: 1. **Uniquely recognizable** — include an error code: `ERR-12345: ...` 2. **Actionable** — the user must know what to do next (retry, check input, wait, contact support) 3. **Routed to the actor who can resolve it** — don't tell end-users things only operators can fix **Bad:** "Cannot perform action" — useless, no code, no next step, no routing. **Bad:** "Server down" to an end-user — they can't fix it. This creates helplessness and support tickets. **Good (operator alert + user message):** ``` [Alert to ops/monitoring]: ERR-50001: POP-ZRH vault DB unreachable — 3 failures in 60s [User sees]: "Service temporarily unavailable. Our team has been alerted and is working on it. Reference: ERR-50001. No need to open a ticket." ``` The user knows: (a) what happened, (b) it's being handled, (c) they don't need to act. This prevents unnecessary support load while keeping the user informed. - No validation at trusted internal boundaries. Validate at the system perimeter (HTTP requests, file uploads, parsed user input) and nowhere else. - No feature flags for hypothetical futures. Build what's asked, no more. - Docstrings where understanding requires tracing >2 files or non-local reasoning. If you had to grep across modules to figure out what a function does, the next reader will too — document it. Obvious single-file logic needs no comment. ### Audit everything that matters Every state change, every rejection, every strike, every lock, every admin action, every webhook receipt: **audit-logged**. The audit log is the forensic tool when something goes wrong. If it's not in the audit log, it didn't happen and you can't reconstruct it. Audit format already in use: `lib.AuditLog(db, &lib.AuditEvent{ Action: ..., Actor: ..., Title: ..., IPAddr: ..., EntryID: ... })`. ### Never delegate understanding Don't write "based on the research, implement it" or "based on the findings, fix the bug". Those phrases push synthesis onto someone else (or onto a future agent). Write code that proves you understood: include file paths, line numbers, and what specifically to change. --- ### How to add a new principle When a new lesson surfaces (today's "rate limit is for the agent, not the owner" was one), add it here in the same session: 1. Pick the right section: General (Part 1), Cardinal (Part 2), or subproject-specific (Part 3). 2. State the principle in one sentence at the top of the entry. 3. Explain WHY in 1–3 sentences. The reasoning matters more than the rule — future reviewers need it to apply the principle correctly to new situations. 4. If the principle is auditable, add a grep check to Part 4. 5. Reference the incident or insight that surfaced the principle (commit hash, issue number, or session date). Principles are additive. Don't delete old ones to make room — they exist because something went wrong once. If a principle is genuinely obsolete: **Deprecation format:** ```markdown ### **DEPRECATED** — Original Principle Name **Deprecated:** 2026-04-08 **Reason:** Early stage ended — migration code now required **Authorized by:** Johan **Replacement:** See "Schema migrations" in Part 8 (when written) [original principle text preserved below...] ``` - Deprecated principles remain in place for 90 days, then move to Appendix A - A deprecated principle's grep checks in Part 4 are commented out with reason - Deprecation requires explicit user approval (state: "Deprecating principle X because Y. Authorized by you?") ### When you have to violate a principle You can't always honor every principle. Sometimes the principles themselves conflict. When that happens: 1. **Surface the conflict to the user.** Don't choose silently. 2. **Pick the principle with the higher consequence.** Security vetos beat DRY. Foundation rules beat KISS. The Cardinal Rules in Part 2 beat everything else. 3. **Document the violation in the code itself**, with a comment explaining what principle was broken, why, and what would have to be true for the violation to be reversed. 4. **Add a TODO to the daily checklist** so the violation is reviewed tomorrow and either fixed or formally accepted. Example of an acceptable, documented violation: the IP whitelist first-contact race in `api/middleware.go` — there's a 30-line comment explaining that the race is theoretically possible, why it cannot be reproduced in practice, and why fixing it would require complexity disproportionate to the threat. The violation is acknowledged, scoped, and dated. ### Escalation and post-hoc discovery **User disagreement:** If the user insists on a lower-consequence principle over a higher one (e.g., speed over security), escalate to project owner (Johan) before proceeding. Document the override in the PR description. **Post-hoc discovery:** If you find a committed violation during daily review: 1. Assess if it's actively dangerous (security veto) — if yes, escalate immediately 2. If not dangerous, add to TODO and address in current sprint 3. Never leave a violation undocumented for >24 hours **Equal consequence conflicts:** When two principles of equal tier conflict, prefer the one that preserves user agency and data integrity. **Severe security violations:** For critical security breaches (harvester detected, token compromise, unauthorized data exfiltration): - Immediate escalation to Johan - Consider permanent ban of the agent and/or **all agents running on that model** - Quarantine affected vault(s) until forensic review complete - Legal/compliance review if customer data was accessed --- --- ## Section II — Security The threat model and cardinal rules. Non-negotiable. The threat model for Clavitor is **a malicious skill that subverts an agent** into harvesting credentials. Not random brute-forcers. Not script kiddies. The attacker is already inside the trust envelope — they're an authenticated agent with a real CVT token. Every defense in the system is built around slowing, detecting, and stopping that attacker. ### Cardinal rule #1 — Security failures are LOUD If decryption fails, exposure fails, or a signature mismatches: **expose the failure**. Never fall back to plaintext. Never silently continue. Never return a partial result that omits the failed item. **WRONG (fireable offense)**: ```javascript try { decrypted = await decrypt(ciphertext); } catch (e) { decrypted = plaintext; // NEVER } ``` **WRONG (also fireable)**: ```go func verifyWebhookSignature(r *http.Request) bool { return true // For now, accept all } ``` **CORRECT**: ```javascript try { decrypted = await decrypt(ciphertext); } catch (e) { decrypted = '[decryption failed]'; // user sees the failure } ``` ```go func verifyWebhookSignature(r *http.Request, body []byte) bool { secret := os.Getenv("PADDLE_WEBHOOK_SECRET") if secret == "" { log.Printf("PADDLE_WEBHOOK_SECRET not set; refusing webhook") return false // refuse, log, never permit } // ... real HMAC verification ... } ``` Failures must be **noisy, visible, and blocking** — never silent, hidden, or permissive. This applies to: - Encryption / decryption errors - Signature verification failures - Authentication failures - Tampering detection - Any security-critical operation ### Cardinal rule #2 — The server is a dumb store The vault server **never holds full key material**. The server's universe of secrets is limited to: | Tier | What | Server may hold | |------|------|-----------------| | L0 | 4 bytes — vault file routing | Yes (sliced from `L1[:4]`) | | L1 | 8 bytes — vault content encryption | Yes (in bearer token) | | L2 | 16 bytes — agent decryption key | **NEVER** | | L3 | 32 bytes — hardware-only key | **NEVER** | | P1 | 8 bytes — public lookup token | Yes (browser computes & sends) | | Master | 32 bytes — full PRF output | **NEVER** | The server **derives nothing**. It receives what the browser sends, validates shapes, and persists. P1 is computed by the browser via HKDF and sent on the wire — the server does not know how to derive it because it does not have the master key. There is no `lib.DeriveP1` function; if you find one, delete it. The server is allowed to hold L1 because L1 is the bearer token used for per-entry encryption keys (`DeriveEntryKey`). L1 grants the ability to read vault content, but **not** the L2/L3 fields: L2 is the agent's own key (never leaves the C CLI / browser trust anchor), and L3 is a random 32-byte value the server has never seen in plaintext — it's only present in the WL3 file, wrapped with the user's PRF. ### Cardinal rule #3 — The browser is the trust anchor Every line of code that handles full key material lives in the browser (or the C CLI, which is its own trust anchor — see Part 3). The browser: - Generates a random 32-byte L3 at vault creation. L3 is pure entropy — not derived from anything. The PRF's job is to *wrap* L3, not to derive it. - Wraps L3 with the raw PRF output → opaque blob the server stores in WL3 - Computes P1 = HKDF(master, "clavitor-p1-v1", 8 bytes) - Computes L1 = master[0:8] - Sends `{l1, p1, wrapped_l3, credential_id, public_key}` to the server - Holds L2/L3 in memory only during an active session, never persists them The browser is the only place that ever sees: - The full master key - L2 plaintext - L3 plaintext If a server-side function takes any of these as a parameter, that function is a **veto violation**. Delete it. **Exceptionally rare terms:** The strings `master_key`, `L3` (or `l3_plaintext`), and `P3` should appear almost nowhere in the codebase. If you find yourself using them, stop. These are browser-only concepts. Their presence outside `auth.js`, `crypto.js`, or the C CLI is a sign of architectural drift. ### Cardinal rule #3.5 — Cryptographic hygiene **FIPS 140-3 compliance:** All cryptographic operations must be FIPS 140-3 validated or in the process of validation. This means: - Use standard library crypto (Go: `crypto/` packages, C: OpenSSL FIPS module) - No custom cryptographic primitives - No home-grown key derivation or random number generation **Avoid CGO where possible:** CGO (Go's C interoperability) creates build complexity, deployment friction, and audit surface. Prefer: - Pure Go implementations for new code - CGO only when FIPS mandates it (e.g., specific HSM integrations) - The C CLI (`clavis-cli`) is the only CGO-heavy component by design **Compiler optimizations:** Standard `memset` or `clear()` may be optimized away. Use `runtime.memclrNoHeapPointers` (Go) or `memset_explicit` (C11) to ensure key material is actually cleared from memory. ### Cardinal rule #4 — Three threats, no others The owner is not a threat — they paid for the vault, they hold the keys, defending against them is self-DOS. The three threats below are the **primary** universe Clavitor defends against at the application layer. Infrastructure, supply chain, and social engineering are handled through operational security practices outside this codebase. #### Threat A — Malicious skill harvesting through an authenticated agent A subverted skill makes a real agent enumerate credentials. Attacker is already inside the trust envelope: valid CVT token, IP whitelist passes, normal-looking client. Defenses: - Per-agent unique-entries quota (default 3/min, 10/hour). Repeats of the same credential are free; only distinct credentials count. - Two-strike lockdown: second hour-limit hit within 2h → agent locked, owner PRF unlock required. State persists in the encrypted agent record; restart does not reset. - **Slow-harvest detection:** An agent accessing >12 distinct credentials in any 24-hour period triggers an immediate audit alert (regardless of rate limits). This is a dark-red flag — normal agents need at most a dozen credentials. - L3 fields require keys the server has never seen (Threat B's defenses apply here too). - Owner-only endpoints exempt from global rate limit (agents blocked at the handler entry — rate-limiting that path defends against nothing). When reasoning about a rate limit, ask: *agent or owner?* If owner, it's wrong. #### Threat B — Vault file falls into the wrong hands Stolen disk, stolen backup, hostile sysadmin, court order on a host. The attacker has every byte the server has. **Result must be: useless ciphertext.** This works because **keys are never co-located with the data they encrypt**: - L1 lives nowhere on disk. Derived from the user's WebAuthn PRF; the PRF requires the physical authenticator. No keyfile, no recovery copy. - L2/L3 plaintext are absent from the server's universe at compile time (see the hard veto list and Part 4 → Section A). - WL3 holds `wrapped_l3` = AES-256-GCM(L3, raw PRF). Without the authenticator, this is uniformly random ciphertext. - 32-byte AES-256-GCM keys are not brute-forceable. ~10⁵⁷ years at 10¹² guesses/sec classically. Grover's quantum algorithm provides a square-root speedup, reducing effective security to that of a 128-bit classical cipher — still computationally infeasible (~10²⁹ years). - Backups use `VACUUM INTO` — same encrypted blobs, same property. If you ever put L1, L2, L3, master_key, the raw PRF, or any decrypted field value into a log line, backup, metric, error message, audit row, or debug dump: you have broken Threat B. Stop. #### Threat C — Agent token falls into the wrong hands The CVT **client credential** (type 0x01: L2(16) + agent_id(16) + POP(4), encrypted with L0) is exfiltrated. The agent stores this locally; at request time it derives L1 = L2[:8] and mints a wire token to authenticate. A stolen credential gives the attacker **the agent's full decryption capability**, not just authentication: L2 decrypts L2-flagged fields the agent fetches. This is a complete agent impersonation. They are the agent. Defenses: - **Per-agent IP whitelist.** First contact fills the whitelist with the agent's source IP; every subsequent request must come from a listed IP, CIDR, or FQDN. A stolen credential used from anywhere else is refused at `L1Middleware` before any handler runs. - Threat A's defenses still apply: unique-entries quota throttles harvest attempts, two-strike lockdown stops sustained patterns. Refused-IP attempts are audit-logged for forensic review. - Owner revokes the agent at any time (admin-token gated, PRF tap). **Gap — Sub-vector C2 (credential theft at creation):** When an agent is enrolled, the browser generates L2 and creates the type-0x01 credential. This token exists transiently in browser memory before being delivered to the agent. A malicious browser extension or XSS could exfiltrate it during this window. **Status:** Acknowledged. The token exists in memory while the user is logged in (with L3 in sessionStorage), so side-channel delivery (QR, file+scp) doesn't eliminate the risk — it just moves it. Full mitigation requires a hardware attestation flow not implemented in current stage. **Deferred until post-April 20** as current IP whitelist + rate limits provide sufficient baseline defense. - L3 fields remain unreachable — L2 does not unlock L3. A perfect attacker (same IP, same access cadence, same scope of reads) is indistinguishable from the legitimate agent until the owner reviews the audit log. Whitelist + rate limit + audit are layered on purpose — no single defense carries the full load. ### Cardinal rule #5 — WL3 is GDPR-out-of-scope by design The WL3 corpus (wrapped L3 + lookup metadata) contains **only cryptographic material**. No customer ID, no name, no email, no slot ID. The link from a credential to a human lives only in the central admin DB. Delete the central record → the WL3 file becomes a mathematically-orphan blob with no possible link to a person. Functionally anonymous. This is not a GDPR risk we tolerate. It is a property we get for free because we **designed the data out** of the WL3 file. If you ever add a personal-data field to the WL3 schema, you have just made the entire WL3 corpus subject to GDPR deletion requests, retention rules, and audit obligations across every POP that mirrors it. **Do not.** ### Cardinal rule #6 — Admin/sync traffic is not public Anything sensitive between central and POPs runs over **Tailscale**, not public 443. The admin endpoints listen only on the Tailscale interface. Public clavitor.ai serves users; the ops control plane is invisible to the internet. App-level HMAC on top of Tailscale is defense-in-depth — if a tailnet node is ever compromised, the per-POP shared secret is the second lock. Both layers required, neither optional. ### Cardinal rule #7 — Webhooks are signed or refused Every inbound webhook (Paddle, future integrations) verifies a cryptographic signature against a stored secret. If the secret is missing, refuse. If the signature is missing, refuse. If the signature mismatches, refuse. If the timestamp is outside ±5 minutes of now (anti-replay), refuse. If the body has been touched between read and verify, refuse. There is no debug mode that disables this. If you need to develop locally against a webhook, set `PADDLE_WEBHOOK_SECRET=test-secret` and craft a real signature with the test secret. The verification path is always on. --- --- ## Section III — Workflow How we work together day-to-day. ### Daily review checklist **Run this every morning. Any check that fails is a foundation alert — fix before any new feature work.** **Methodology evolution:** These checks were historically grep-driven for reproducibility. With LLM analysis available, the daily reviewer must run a **full LLM audit of all changed source files** to catch semantic violations, not just pattern matches. Grep catches the letter; LLM catches the intent. **Procedure:** 1. Run the automated tests (F1) 2. Run LLM analysis on all files changed since last review 3. Address any findings before new feature work ### Where to find your tasks **Gitea Issues (https://git.clavitor.ai):** - Issues assigned to you: `tea issues list --repo johan/clavitor --assignees ` - Or: https://git.clavitor.ai/johan/clavitor/issues?q=assignee%3A **Priority order:** 1. CRITICAL issues (fix immediately) 2. HIGH priority issues (this sprint) 3. MEDIUM/LOW (backlog) ### The review workflow **Engineers (Sarah, Charles, Hans, etc.):** 1. Pick up assigned issue 2. Create branch: `git checkout -b /fix-###` 3. Implement fix per issue spec 4. Create PR referencing issue: `Fixes ####` in commit 5. **Wait for reviewer approval** — never merge your own PR **Reviewers (Yurii, Victoria, Arthur):** - Review PRs for principle compliance - Comment: "Approved" or "Request changes" - Engineers address feedback, push updates **You (Johan):** - Merge approved PRs - Override only in emergencies (document in PR) Use this prompt with your LLM code review tool: ``` Review the following files for violations of CLAVITOR-PRINCIPLES.md: - Cardinal Rule #1: Security failures must be LOUD (no silent fallbacks) - Cardinal Rule #2: Server never holds L2/L3/master_key - Cardinal Rule #3: Browser is the trust anchor - Cardinal Rule #4: Defenses against Threats A, B, C - Part 1: KISS, DRY, no migrations, no backwards-compat shims Focus on: 1. Security: Any key material (L1, L2, L3, master_key, PRF) flowing to logs, errors, or responses 2. Silent fallbacks: catch blocks that don't expose failures to users 3. DRY violations: Similar logic that should be unified 4. Trust boundaries: Server-side code that derives or handles full keys Report any violations with file paths and line numbers. ``` ### Section A — Server hard vetos **A1. The server never receives `master_key` over the wire.** ```bash #!/bin/bash # Verify server never sees full master key matches=$(grep -r "master_key\|MasterKey\|masterKey" clavis-vault/api/ clavis-vault/lib/ --include="*.go" 2>/dev/null | grep -v "_test.go" | wc -l) if [ "$matches" -gt 0 ]; then echo "❌ FAIL: master_key handling found on server ($matches matches)" grep -r "master_key\|MasterKey\|masterKey" clavis-vault/api/ clavis-vault/lib/ --include="*.go" | grep -v "_test.go" exit 1 else echo "✅ PASS: No master_key handling on server" fi ``` **A2. `lib.DeriveP1` does not exist on the server.** ```bash #!/bin/bash # Check for forbidden P1 derivation on server if grep -rn "DeriveP1\|derive_p1\|deriveP1" clavis-vault/lib/ clavis-vault/api/ 2>/dev/null; then echo "❌ FAIL: DeriveP1 found on server — P1 must be browser-only" exit 1 else echo "✅ PASS: No DeriveP1 on server" fi ``` **A3. L2-handling CVT functions do not exist on the server.** ```bash #!/bin/bash # Check for forbidden L2 credential handling if grep -rn "MintCredential\|ParseCredential\|CredentialToWire" \ clavis-vault/api/ clavis-vault/lib/ 2>/dev/null; then echo "❌ FAIL: L2 credential functions found on server" exit 1 else echo "✅ PASS: No L2 credential functions on server" fi ``` ``` Expected: zero matches. The Go CVT module only handles type-0x00 wire tokens (L1, agent_id). The L2-bearing type-0x01 is C-only in `clavis-cli/src/cvt.c`. **A4. `AgentData` has no L2 field.** ``` grep -n "L2 *\[\]byte" clavis-vault/lib/types.go ``` Expected: zero matches. (Field-level `L2 bool` flag on `VaultField` is fine — that's metadata, not key material.) **A5. WL3 JSON contains no PII.** ``` grep -n "customer_id\|name\|email\|slot_id" clavis-vault/lib/wl3.go ``` Expected: zero matches in field definitions of `WL3Entry`. Adding any of these makes the entire WL3 corpus GDPR-scoped. **A6. Keys are never logged or audited.** ``` grep -rn "log\.\|fmt\.Print\|AuditEvent" clavis-vault/api/ clavis-vault/lib/ \ | grep -iE "l1Key|master|wrappedL3|wrapped_l3|prf|p1Bytes" ``` Expected: zero matches. Threat B violation if any key material reaches a log, metric, error message, or audit row — even in hex, even in debug builds. **A7. Per-agent IP whitelist enforcement is on.** ``` grep -n "AgentIPAllowed\|allowed_ips\|AllowedIPs" clavis-vault/api/middleware.go ``` Expected: matches in `L1Middleware` showing first-contact fill + enforcement. This is Threat C's primary defense. If the check is missing or commented out, a stolen agent token works from anywhere on the internet. ### Section B — Silent fallback **B1. No silent decryption fallback in browser code.** ``` grep -rn "catch.*plaintext\|catch.*ciphertext\|catch.*=.*null" \ clavis-vault/cmd/clavitor/web/ clavis-chrome/ clavis-firefox/ ``` Expected: any match must be reviewed. The acceptable pattern is `catch (e) { result = '[decryption failed]'; }` — visible to the user. **B2. No `return true` in security check stubs.** ``` grep -rn "return true.*//.*TODO\|return true.*//.*for now\|return true.*//.*accept" \ clavitor.ai/admin/ clavis-vault/ ``` Expected: zero matches. If you see one, you've found a SECURITY.md violation. Implement it or delete the call site. **B3. No commented-out security checks.** ``` grep -rn "// *if !verify\|// *if !auth\|// *return.*Forbidden" \ clavis-vault/ clavitor.ai/admin/ ``` Expected: zero matches. **B4. No silent fallbacks in catch blocks (manual LLM review required).** Grep catches patterns, not intent. All catch blocks in decryption/security paths must be manually reviewed to ensure they don't silently swallow errors. The LLM must verify: no `catch (e) { return data; }`, no `catch { decrypted = ""; }`, no `finally` blocks that restore old values without logging. ### Section C — Test fixture security **C0. Test fixture key material is obviously fake.** Test data containing key material (L2, L3, master_key, PRF output): - Must be instantly recognizable as test data (not production keys) - **Pattern:** 32 identical bytes: `bytes.Repeat([]byte{0xAB}, 32)` or `{0x00}`, `{0xFF}`, `{0xDE}, {0xAD}` - **Never** use random-looking values that could be mistaken for real keys - **Never** derive test keys from production material - **Never** commit binary test fixtures (always text/SQL that shows the pattern) Example: ```go // GOOD - obviously fake testL3 := bytes.Repeat([]byte{0xAB}, 32) testMaster := bytes.Repeat([]byte{0x00}, 32) // BAD - looks like a real key testL3 := []byte{0x4a, 0x9f, 0x2c, ...} // 32 random bytes ``` ### Section D — Foundation drift **C1. No `_old/` directory.** ``` find . -type d -name "_old" ``` Expected: zero results. Dead code is deleted, not parked. **C2. No env vars for fundamental paths.** ``` grep -rn "os.Getenv(\"DATA_DIR\")\|os.Getenv(\"WL3_DIR\")" clavis-vault/lib/ ``` Expected: zero matches. These are hardcoded in `lib/config.go` to `vaults` and `WL3` respectively. Symlink if you want different paths. **C3. No migration code.** ``` grep -rn "ALTER TABLE\|migration\|Migration" clavis-vault/lib/dbcore.go \ | grep -v "// " ``` Expected: matches only inside `MigrateDB` for the existing `alternate_for` and `verified_at` columns (legacy from earlier work). No NEW `ALTER TABLE` should appear during early-stage development. **C4. Hardcoded build constants in HTML.** ``` grep -rn "var BUILD = '" clavis-vault/cmd/clavitor/web/ ``` Expected: zero matches. Build version must come from `/api/version` at runtime, not from a hardcoded constant that goes stale. ### Section E — DRY violations **E1. Two functions doing the same thing.** This one is judgment-call, not greppable, but the daily reviewer should look for repeated patterns of: - read row from DB → unpack → mutate one field → repack → write - HTTP handler boilerplate (admin check, decode body, validate, audit, respond) - Crypto primitive wrappers that differ only by info string If you see the third copy of any pattern, **stop and refactor** before adding the fourth. The canonical example to model after: `lib/dbcore.go:agentMutate` — replaced three near-identical functions with a higher-order helper. Net negative line count, easier to add new mutations, single point of correctness. **E2. Two crypto implementations of the same primitive.** Browser code uses `clavis-vault/cmd/clavitor/web/crypto.js`. CLI uses `clavis-cli/crypto/crypto.js`. These should be **byte-identical**. If they diverge, the encrypted output diverges, and a credential created in one client becomes unreadable in the other. ``` diff clavis-vault/cmd/clavitor/web/crypto.js clavis-cli/crypto/crypto.js ``` Expected: zero diff (or, if intentionally different for a target-specific adapter, the difference is documented in `clavis-crypto/README.md`). ### Section F — Threat model alignment **E1. Per-agent rate limits have safe defaults.** ``` grep -n "RateLimit *: *3\|RateLimit *= *3\|RateLimitMinute.*3" \ clavis-vault/api/handlers.go grep -n "RateLimitHour *: *10\|RateLimitHour *= *10" \ clavis-vault/api/handlers.go ``` Expected: matches in `HandleCreateAgent`. If the defaults drift (e.g. someone changes them to 0 = unlimited), the harvester defense is silently disabled for new agents. **E2. `agentReadEntry` is called from every credential-read handler.** For each handler that returns decrypted credential data, the call is mandatory. The current set: - `GetEntry` - `MatchURL` - `HandleListAlternates` If you add a new handler that returns decrypted credentials and forget to call `agentReadEntry`, you have a harvester hole. Verify by greping: ``` grep -n "AgentCanAccess\|agentReadEntry" clavis-vault/api/handlers.go ``` Every credential-read handler must call both `AgentCanAccess` (before accessing data) and `agentReadEntry` (after decryption, before returning). Verify both functions appear in the handler code path. **E3. `/api/entries/batch` is exempt from global rate limit.** ``` grep -A2 "/api/entries/batch" clavis-vault/api/middleware.go ``` Expected: a `next.ServeHTTP(w, r); return` block with the explanatory comment intact. If the comment is missing, the next reviewer will reintroduce the rate limit and break import. ### Section F — Test posture **F1. All tests pass.** ``` cd clavis-vault && go test ./lib/... ./api/... cd clavitor.ai/admin && go test . ``` Expected: all green. **Failing tests are foundation alerts**, not "tomorrow" problems. **F2. JS syntax check passes.** ``` cd clavis-vault && make build # validates JS as part of build ``` Expected: no `JS syntax error` in output. JS files are embedded in the binary; syntax errors break the running app silently in some browsers. **F3. New code has tests.** Not greppable. Judgment call. The rule: if you added a new function with non-trivial logic, there should be a test for it in the same commit. The tests model what the function is supposed to do. If you can't write a test because "it's hard to test", the function is probably too coupled — refactor. ### Section G — Dead code elimination **G1. No empty directories.** ```bash find . -type d -empty | grep -v ".git" | grep -v "vendor" ``` Expected: zero results. Empty directories are deleted. **G2. No orphaned HTML/JS/CSS files (with exceptions).** ```bash # Find HTML files not referenced in Go code # Exceptions: design-system/, marketing/, test-*.html (dev/test files) for f in $(find . -name "*.html" | grep -v vendor | grep -v "design-system" | grep -v "marketing" | grep -v "test-"); do name=$(basename "$f") if ! grep -r "$name" --include="*.go" . 2>/dev/null | grep -q .; then echo "Orphaned: $f" fi done ``` Review any matches — may be standalone pages or may be dead. **Exceptions (intentionally not checked):** - `design-system/` — standalone design documentation, not embedded - `marketing/` — marketing assets, separate from application - `test-*.html` — development/test files (manual review for cleanup) **G3. No unreachable Go functions (requires go-dead-code or similar tool).** Manual check: review private functions (lowercase) with no callers in the same package. Public functions with no importers are also suspicious — verify they are API endpoints or intentionally exported. **G4. No commented-out blocks >5 lines.** ```bash grep -rn "^\s*//\s*\w" --include="*.go" --include="*.js" . | \ awk -F: '{print $1}' | sort | uniq -c | sort -rn | head -20 ``` Review files with many commented lines — likely dead code. --- ### Git workflow ### Commit philosophy **Commit early, commit often.** A commit is a save point, not a publish event. - One logical change per commit — don't mix bug fix + refactoring + new feature - If you can't explain the commit in one sentence, it's too big — split it - Work-in-progress commits are fine locally; clean up before sharing ### Commit message format ``` : : (optional but encouraged)