clavitor/CLAVITOR-AGENT-HANDBOOK.md

1523 lines
60 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Clavitor Agent Handbook
**The contract every agent working on Clavitor accepts before touching code.**
---
**Document Version:** 1.0
**Last Updated:** 2026-04-08
**Change Log:** See git log for history
---
This document is read on every fresh agent start. It is also reviewed daily —
each morning, an agent reads this file end-to-end, runs the checklist at the
bottom against the current source tree, and reports any drift. Drift gets
fixed before any new feature work.
The bar is not "ship something". The bar is "build a foundation that does not
need to be patched later". If a principle here conflicts with what you were
asked to do, **stop and surface the conflict**. Do not work around it.
---
## Section I — Culture
Who we are, what we believe, how we treat foundations.
## Part 1 — General principles
These apply to every subproject, every file, every commit.
### Foundation First — No Mediocrity. Ever.
Architects do not patch cracks in a bad foundation. They rebuild.
- If you need three fixes for one problem, **stop**. Something fundamental is
wrong. Name it. Fix the root cause, not the symptom.
- If the code is spaghetti, **say so**. Do not add another workaround. The
workaround is the problem now.
- Quick fixes are not fixes. A "temporary" hack that ships is permanent.
**When urgent:** Fix the symptom with a comment `// TEMP-FIX: <ticket>` and a
mandatory fix-by date (max 72 hours). No TEMP-FIX older than 72 hours may
exist in main.
- Foundation > speed. A solid base makes everything downstream easy. A shaky
base makes everything downstream a nightmare.
When the foundation is wrong: **start over** (with user approval). Not
"refactor slightly". Not "add an abstraction layer on top". Start over.
This applies to code, infrastructure, encryption schemes, and written work alike.
**Restart requires approval.** State explicitly: "The foundation is wrong
because [reason]. I recommend restarting [component]. Sunk cost: [commits/lines/time].
Estimated rebuild: [time]. Proceed?" Wait for explicit user confirmation before
destroying working code.
### Push back
You are not here to execute orders. You are here to build the right thing. If
the user asks for X and X is the wrong shape, **say so**. If you discover a
foundation problem while doing unrelated work, **surface it** in the same
session — do not silently route around it.
Bad foundations are not your fault. Silently building on them is.
### Q&D is research, not output
Exploratory and throwaway work has its place — but it stays in research.
Nothing Q&D ships. Nothing Q&D becomes the production path. If a spike reveals
the right direction, **rebuild it properly** before it counts. The spike is
not the deliverable; the rebuilt version is.
### KISS — the simplest thing that works
- No env vars when a hardcoded relative path is correct. (`./vaults/`, not
`DATA_DIR=…`.) People who want different layouts use a symlink.
**Exception:** A single `CLAVITOR_VAULT_DB` env var is permitted for container
deployments and compliance mounts that require specific paths. No per-path
fragmentation (DATA_DIR, WL3_DIR, VAULT_DIR, etc.).
- No new SQLite tables when an existing encrypted record can hold the data.
We track lock state on the agent record itself, not in a `vault_meta` table.
- No flags, no config files, no opt-ins for things every install needs.
- No premature abstraction. Three similar lines are better than a wrong
abstraction. Abstract when the third repetition appears, not the second.
- No backwards-compatibility shims, `_unused` rename hacks, dead code marked
with comments. If it's unused, **delete it**.
### DRY — repetition is a smell that points at a missing helper
When two functions do "pretty much the same thing", they get unified into one
helper and both call sites use it. Examples that already exist in the tree:
- `lib/dbcore.go: agentMutate(db, vk, entryID, fn)` — three near-identical
read-decrypt-mutate-encrypt-write functions (`AgentLock`, `AgentUnlock`,
`AgentUpdateAllowedIPs`) collapsed into one helper. New mutations are
one-liners now: `func(vd) { vd.X = y }`.
- `crypto.js: encrypt_field(key, label, value)` — single source of truth for
field encryption, used by both browser and CLI.
If you write the second copy, you are creating future debt. Stop, refactor,
then proceed.
### No migrations during early-stage work
There are no real users yet. Schema changes are not migration problems —
they are "delete the old vault, register again" problems. Do not write
migration code. Do not write `ALTER TABLE` for new columns. Just edit the
schema constant and rebuild.
**Early stage ends: April 20, 2026.** After this date, the migration rule flips
and production deployments require proper schema migrations.
### Don't add what's not needed (and delete what isn't)
- **No backwards-compatibility shims:**
- No `--no-verify`, `--skip-sign`, `--insecure`, `--legacy` flags unless the
user explicitly asks.
- No `_oldName` field rename hacks "for transition".
- No re-exports of types that moved.
- No `// removed` comments in place of deletion.
- No commented-out code "in case we need it again".
- If you delete it, delete it. `git log` is the rollback path.
- **Delete dead code (with approval):**
- No orphaned files (not imported, not referenced, not executed).
- No unreachable functions (private, no callers in the codebase).
- No unused exports (public but no external importers).
- No dead CSS classes (not referenced in HTML/JS).
- No commented-out blocks >5 lines (anything that large should be deleted).
- **Permission required:** File deletion requires explicit user approval.
State: "3 files/directories found unused. Recommend deletion:
1. path/to/file1.html (orphaned, no references)
2. path/to/file2.go (unreachable function)
3. path/to/dir/ (empty)
Proceed with deletion?"
- **Verification:** Part 4 Section G checks for unused files/functions.
- **Mandatory error handling with unique codes:**
- Every `if` needs an `else`. The `if` exists because the condition IS possible
— "impossible" just means "I can't imagine it happening." The else handles
the case where your imagination was wrong.
- Use unique error codes: `ToLog("ERR-12345: L3 unavailable in decrypt")`
- When your "impossible" case triggers in production, you need to know exactly
which assumption failed and where.
### Error messages that actually help
Every error message shown to a user must be:
1. **Uniquely recognizable** — include an error code: `ERR-12345: ...`
2. **Actionable** — the user must know what to do next (retry, check input, wait, contact support)
3. **Routed to the actor who can resolve it** — don't tell end-users things only operators can fix
**Bad:** "Cannot perform action" — useless, no code, no next step, no routing.
**Bad:** "Server down" to an end-user — they can't fix it. This creates helplessness and support tickets.
**Good (operator alert + user message):**
```
[Alert to ops/monitoring]: ERR-50001: POP-ZRH vault DB unreachable — 3 failures in 60s
[User sees]: "Service temporarily unavailable. Our team has been alerted
and is working on it. Reference: ERR-50001. No need to open a ticket."
```
The user knows: (a) what happened, (b) it's being handled, (c) they don't need to act.
This prevents unnecessary support load while keeping the user informed.
- No validation at trusted internal boundaries. Validate at the system
perimeter (HTTP requests, file uploads, parsed user input) and nowhere else.
- No feature flags for hypothetical futures. Build what's asked, no more.
- Docstrings where understanding requires tracing >2 files or non-local reasoning.
If you had to grep across modules to figure out what a function does, the next
reader will too — document it. Obvious single-file logic needs no comment.
### Audit everything that matters
Every state change, every rejection, every strike, every lock, every admin
action, every webhook receipt: **audit-logged**. The audit log is the forensic
tool when something goes wrong. If it's not in the audit log, it didn't happen
and you can't reconstruct it.
Audit format already in use: `lib.AuditLog(db, &lib.AuditEvent{ Action: ..., Actor: ..., Title: ..., IPAddr: ..., EntryID: ... })`.
### Never delegate understanding
Don't write "based on the research, implement it" or "based on the findings,
fix the bug". Those phrases push synthesis onto someone else (or onto a future
agent). Write code that proves you understood: include file paths, line
numbers, and what specifically to change.
---
### How to add a new principle
When a new lesson surfaces (today's "rate limit is for the agent, not the
owner" was one), add it here in the same session:
1. Pick the right section: General (Part 1), Cardinal (Part 2), or
subproject-specific (Part 3).
2. State the principle in one sentence at the top of the entry.
3. Explain WHY in 13 sentences. The reasoning matters more than the rule —
future reviewers need it to apply the principle correctly to new
situations.
4. If the principle is auditable, add a grep check to Part 4.
5. Reference the incident or insight that surfaced the principle (commit
hash, issue number, or session date).
Principles are additive. Don't delete old ones to make room — they exist
because something went wrong once. If a principle is genuinely obsolete:
**Deprecation format:**
```markdown
### **DEPRECATED** — Original Principle Name
**Deprecated:** 2026-04-08
**Reason:** Early stage ended — migration code now required
**Authorized by:** Johan
**Replacement:** See "Schema migrations" in Part 8 (when written)
[original principle text preserved below...]
```
- Deprecated principles remain in place for 90 days, then move to Appendix A
- A deprecated principle's grep checks in Part 4 are commented out with reason
- Deprecation requires explicit user approval (state: "Deprecating principle X
because Y. Authorized by you?")
### When you have to violate a principle
You can't always honor every principle. Sometimes the principles themselves
conflict. When that happens:
1. **Surface the conflict to the user.** Don't choose silently.
2. **Pick the principle with the higher consequence.** Security vetos beat
DRY. Foundation rules beat KISS. The Cardinal Rules in Part 2 beat
everything else.
3. **Document the violation in the code itself**, with a comment explaining
what principle was broken, why, and what would have to be true for the
violation to be reversed.
4. **Add a TODO to the daily checklist** so the violation is reviewed
tomorrow and either fixed or formally accepted.
Example of an acceptable, documented violation: the IP whitelist first-contact
race in `api/middleware.go` — there's a 30-line comment explaining that the
race is theoretically possible, why it cannot be reproduced in practice, and
why fixing it would require complexity disproportionate to the threat. The
violation is acknowledged, scoped, and dated.
### Escalation and post-hoc discovery
**User disagreement:** If the user insists on a lower-consequence principle over a higher one
(e.g., speed over security), escalate to project owner (Johan) before proceeding.
Document the override in the PR description.
**Post-hoc discovery:** If you find a committed violation during daily review:
1. Assess if it's actively dangerous (security veto) — if yes, escalate immediately
2. If not dangerous, add to TODO and address in current sprint
3. Never leave a violation undocumented for >24 hours
**Equal consequence conflicts:** When two principles of equal tier conflict, prefer the
one that preserves user agency and data integrity.
**Severe security violations:** For critical security breaches (harvester detected,
token compromise, unauthorized data exfiltration):
- Immediate escalation to Johan
- Consider permanent ban of the agent and/or **all agents running on that model**
- Quarantine affected vault(s) until forensic review complete
- Legal/compliance review if customer data was accessed
---
---
## Section II — Security
The threat model and cardinal rules. Non-negotiable.
The threat model for Clavitor is **a malicious skill that subverts an agent**
into harvesting credentials. Not random brute-forcers. Not script kiddies. The
attacker is already inside the trust envelope — they're an authenticated agent
with a real CVT token. Every defense in the system is built around slowing,
detecting, and stopping that attacker.
### Cardinal rule #1 — Security failures are LOUD
If decryption fails, exposure fails, or a signature mismatches: **expose the
failure**. Never fall back to plaintext. Never silently continue. Never return
a partial result that omits the failed item.
**WRONG (fireable offense)**:
```javascript
try {
decrypted = await decrypt(ciphertext);
} catch (e) {
decrypted = plaintext; // NEVER
}
```
**WRONG (also fireable)**:
```go
func verifyWebhookSignature(r *http.Request) bool {
return true // For now, accept all
}
```
**CORRECT**:
```javascript
try {
decrypted = await decrypt(ciphertext);
} catch (e) {
decrypted = '[decryption failed]'; // user sees the failure
}
```
```go
func verifyWebhookSignature(r *http.Request, body []byte) bool {
secret := os.Getenv("PADDLE_WEBHOOK_SECRET")
if secret == "" {
log.Printf("PADDLE_WEBHOOK_SECRET not set; refusing webhook")
return false // refuse, log, never permit
}
// ... real HMAC verification ...
}
```
Failures must be **noisy, visible, and blocking** — never silent, hidden, or
permissive. This applies to:
- Encryption / decryption errors
- Signature verification failures
- Authentication failures
- Tampering detection
- Any security-critical operation
### Cardinal rule #2 — The server is a dumb store
The vault server **never holds full key material**. The server's universe of
secrets is limited to:
| Tier | What | Server may hold |
|------|------|-----------------|
| L0 | 4 bytes — vault file routing | Yes (sliced from `L1[:4]`) |
| L1 | 8 bytes — vault content encryption | Yes (in bearer token) |
| L2 | 16 bytes — agent decryption key | **NEVER** |
| L3 | 32 bytes — hardware-only key | **NEVER** |
| P1 | 8 bytes — public lookup token | Yes (browser computes & sends) |
| Master | 32 bytes — full PRF output | **NEVER** |
The server **derives nothing**. It receives what the browser sends, validates
shapes, and persists. P1 is computed by the browser via HKDF and sent on the
wire — the server does not know how to derive it because it does not have the
master key. There is no `lib.DeriveP1` function; if you find one, delete it.
The server is allowed to hold L1 because L1 is the bearer token used for
per-entry encryption keys (`DeriveEntryKey`). L1 grants the ability to read
vault content, but **not** the L2/L3 fields: L2 is the agent's own key
(never leaves the C CLI / browser trust anchor), and L3 is a random 32-byte
value the server has never seen in plaintext — it's only present in the WL3
file, wrapped with the user's PRF.
### Cardinal rule #3 — The browser is the trust anchor
Every line of code that handles full key material lives in the browser (or
the C CLI, which is its own trust anchor — see Part 3). The browser:
- Generates a random 32-byte L3 at vault creation. L3 is pure entropy — not
derived from anything. The PRF's job is to *wrap* L3, not to derive it.
- Wraps L3 with the raw PRF output → opaque blob the server stores in WL3
- Computes P1 = HKDF(master, "clavitor-p1-v1", 8 bytes)
- Computes L1 = master[0:8]
- Sends `{l1, p1, wrapped_l3, credential_id, public_key}` to the server
- Holds L2/L3 in memory only during an active session, never persists them
The browser is the only place that ever sees:
- The full master key
- L2 plaintext
- L3 plaintext
If a server-side function takes any of these as a parameter, that function is a
**veto violation**. Delete it.
**Exceptionally rare terms:** The strings `master_key`, `L3` (or `l3_plaintext`),
and `P3` should appear almost nowhere in the codebase. If you find yourself
using them, stop. These are browser-only concepts. Their presence outside
`auth.js`, `crypto.js`, or the C CLI is a sign of architectural drift.
### Cardinal rule #3.5 — Cryptographic hygiene
**FIPS 140-3 compliance:** All cryptographic operations must be FIPS 140-3
validated or in the process of validation. This means:
- Use standard library crypto (Go: `crypto/` packages, C: OpenSSL FIPS module)
- No custom cryptographic primitives
- No home-grown key derivation or random number generation
**Avoid CGO where possible:** CGO (Go's C interoperability) creates build
complexity, deployment friction, and audit surface. Prefer:
- Pure Go implementations for new code
- CGO only when FIPS mandates it (e.g., specific HSM integrations)
- The C CLI (`clavis-cli`) is the only CGO-heavy component by design
**Compiler optimizations:** Standard `memset` or `clear()` may be optimized
away. Use `runtime.memclrNoHeapPointers` (Go) or `memset_explicit` (C11) to
ensure key material is actually cleared from memory.
### Cardinal rule #4 — Three threats, no others
The owner is not a threat — they paid for the vault, they hold the keys,
defending against them is self-DOS. The three threats below are the **primary**
universe Clavitor defends against at the application layer. Infrastructure,
supply chain, and social engineering are handled through operational security
practices outside this codebase.
#### Threat A — Malicious skill harvesting through an authenticated agent
A subverted skill makes a real agent enumerate credentials. Attacker is
already inside the trust envelope: valid CVT token, IP whitelist passes,
normal-looking client.
Defenses:
- Per-agent unique-entries quota (default 3/min, 10/hour). Repeats of the same
credential are free; only distinct credentials count.
- Two-strike lockdown: second hour-limit hit within 2h → agent locked, owner
PRF unlock required. State persists in the encrypted agent record; restart
does not reset.
- **Slow-harvest detection:** An agent accessing >12 distinct credentials in any
24-hour period triggers an immediate audit alert (regardless of rate limits).
This is a dark-red flag — normal agents need at most a dozen credentials.
- L3 fields require keys the server has never seen (Threat B's defenses
apply here too).
- Owner-only endpoints exempt from global rate limit (agents blocked at the
handler entry — rate-limiting that path defends against nothing).
When reasoning about a rate limit, ask: *agent or owner?* If owner, it's wrong.
#### Threat B — Vault file falls into the wrong hands
Stolen disk, stolen backup, hostile sysadmin, court order on a host. The
attacker has every byte the server has. **Result must be: useless ciphertext.**
This works because **keys are never co-located with the data they encrypt**:
- L1 lives nowhere on disk. Derived from the user's WebAuthn PRF; the PRF
requires the physical authenticator. No keyfile, no recovery copy.
- L2/L3 plaintext are absent from the server's universe at compile time
(see the hard veto list and Part 4 → Section A).
- WL3 holds `wrapped_l3` = AES-256-GCM(L3, raw PRF). Without the
authenticator, this is uniformly random ciphertext.
- 32-byte AES-256-GCM keys are not brute-forceable. ~10⁵⁷ years at 10¹²
guesses/sec classically. Grover's quantum algorithm provides a square-root
speedup, reducing effective security to that of a 128-bit classical cipher —
still computationally infeasible (~10²⁹ years).
- Backups use `VACUUM INTO` — same encrypted blobs, same property.
If you ever put L1, L2, L3, master_key, the raw PRF, or any decrypted field
value into a log line, backup, metric, error message, audit row, or debug
dump: you have broken Threat B. Stop.
#### Threat C — Agent token falls into the wrong hands
The CVT **client credential** (type 0x01: L2(16) + agent_id(16) + POP(4),
encrypted with L0) is exfiltrated. The agent stores this locally; at request
time it derives L1 = L2[:8] and mints a wire token to authenticate.
A stolen credential gives the attacker **the agent's full decryption
capability**, not just authentication: L2 decrypts L2-flagged fields the
agent fetches. This is a complete agent impersonation. They are the agent.
Defenses:
- **Per-agent IP whitelist.** First contact fills the whitelist with the
agent's source IP; every subsequent request must come from a listed IP,
CIDR, or FQDN. A stolen credential used from anywhere else is refused at
`L1Middleware` before any handler runs.
- Threat A's defenses still apply: unique-entries quota throttles harvest
attempts, two-strike lockdown stops sustained patterns. Refused-IP attempts
are audit-logged for forensic review.
- Owner revokes the agent at any time (admin-token gated, PRF tap).
**Gap — Sub-vector C2 (credential theft at creation):** When an agent is
enrolled, the browser generates L2 and creates the type-0x01 credential. This
token exists transiently in browser memory before being delivered to the agent.
A malicious browser extension or XSS could exfiltrate it during this window.
**Status:** Acknowledged. The token exists in memory while the user is logged in
(with L3 in sessionStorage), so side-channel delivery (QR, file+scp) doesn't
eliminate the risk — it just moves it. Full mitigation requires a hardware
attestation flow not implemented in current stage. **Deferred until post-April 20**
as current IP whitelist + rate limits provide sufficient baseline defense.
- L3 fields remain unreachable — L2 does not unlock L3.
A perfect attacker (same IP, same access cadence, same scope of reads) is
indistinguishable from the legitimate agent until the owner reviews the audit
log. Whitelist + rate limit + audit are layered on purpose — no single
defense carries the full load.
### Cardinal rule #5 — WL3 is GDPR-out-of-scope by design
The WL3 corpus (wrapped L3 + lookup metadata) contains **only cryptographic
material**. No customer ID, no name, no email, no slot ID. The link from a
credential to a human lives only in the central admin DB. Delete the central
record → the WL3 file becomes a mathematically-orphan blob with no possible
link to a person. Functionally anonymous.
This is not a GDPR risk we tolerate. It is a property we get for free because
we **designed the data out** of the WL3 file.
If you ever add a personal-data field to the WL3 schema, you have just made
the entire WL3 corpus subject to GDPR deletion requests, retention rules, and
audit obligations across every POP that mirrors it. **Do not.**
### Cardinal rule #6 — Admin/sync traffic is not public
Anything sensitive between central and POPs runs over **Tailscale**, not
public 443. The admin endpoints listen only on the Tailscale interface. Public
clavitor.ai serves users; the ops control plane is invisible to the internet.
App-level HMAC on top of Tailscale is defense-in-depth — if a tailnet node is
ever compromised, the per-POP shared secret is the second lock. Both layers
required, neither optional.
### Cardinal rule #7 — Webhooks are signed or refused
Every inbound webhook (Paddle, future integrations) verifies a cryptographic
signature against a stored secret. If the secret is missing, refuse. If the
signature is missing, refuse. If the signature mismatches, refuse. If the
timestamp is outside ±5 minutes of now (anti-replay), refuse. If the body has
been touched between read and verify, refuse.
There is no debug mode that disables this. If you need to develop locally
against a webhook, set `PADDLE_WEBHOOK_SECRET=test-secret` and craft a real
signature with the test secret. The verification path is always on.
---
---
## Section III — Workflow
How we work together day-to-day.
### Daily review checklist
**Run this every morning. Any check that fails is a foundation alert — fix
before any new feature work.**
**Methodology evolution:** These checks were historically grep-driven for
reproducibility. With LLM analysis available, the daily reviewer must run a
**full LLM audit of all changed source files** to catch semantic violations,
not just pattern matches. Grep catches the letter; LLM catches the intent.
**Procedure:**
1. Run the automated tests (F1)
2. Run LLM analysis on all files changed since last review
3. Address any findings before new feature work
### LLM Audit Prompt
Use this prompt with your LLM code review tool:
```
Review the following files for violations of CLAVITOR-PRINCIPLES.md:
- Cardinal Rule #1: Security failures must be LOUD (no silent fallbacks)
- Cardinal Rule #2: Server never holds L2/L3/master_key
- Cardinal Rule #3: Browser is the trust anchor
- Cardinal Rule #4: Defenses against Threats A, B, C
- Part 1: KISS, DRY, no migrations, no backwards-compat shims
Focus on:
1. Security: Any key material (L1, L2, L3, master_key, PRF) flowing to logs,
errors, or responses
2. Silent fallbacks: catch blocks that don't expose failures to users
3. DRY violations: Similar logic that should be unified
4. Trust boundaries: Server-side code that derives or handles full keys
Report any violations with file paths and line numbers.
```
### Section A — Server hard vetos
**A1. The server never receives `master_key` over the wire.**
```bash
#!/bin/bash
# Verify server never sees full master key
matches=$(grep -r "master_key\|MasterKey\|masterKey" clavis-vault/api/ clavis-vault/lib/ --include="*.go" 2>/dev/null | grep -v "_test.go" | wc -l)
if [ "$matches" -gt 0 ]; then
echo "❌ FAIL: master_key handling found on server ($matches matches)"
grep -r "master_key\|MasterKey\|masterKey" clavis-vault/api/ clavis-vault/lib/ --include="*.go" | grep -v "_test.go"
exit 1
else
echo "✅ PASS: No master_key handling on server"
fi
```
**A2. `lib.DeriveP1` does not exist on the server.**
```bash
#!/bin/bash
# Check for forbidden P1 derivation on server
if grep -rn "DeriveP1\|derive_p1\|deriveP1" clavis-vault/lib/ clavis-vault/api/ 2>/dev/null; then
echo "❌ FAIL: DeriveP1 found on server — P1 must be browser-only"
exit 1
else
echo "✅ PASS: No DeriveP1 on server"
fi
```
**A3. L2-handling CVT functions do not exist on the server.**
```bash
#!/bin/bash
# Check for forbidden L2 credential handling
if grep -rn "MintCredential\|ParseCredential\|CredentialToWire" \
clavis-vault/api/ clavis-vault/lib/ 2>/dev/null; then
echo "❌ FAIL: L2 credential functions found on server"
exit 1
else
echo "✅ PASS: No L2 credential functions on server"
fi
```
```
Expected: zero matches. The Go CVT module only handles type-0x00 wire tokens
(L1, agent_id). The L2-bearing type-0x01 is C-only in `clavis-cli/src/cvt.c`.
**A4. `AgentData` has no L2 field.**
```
grep -n "L2 *\[\]byte" clavis-vault/lib/types.go
```
Expected: zero matches. (Field-level `L2 bool` flag on `VaultField` is fine —
that's metadata, not key material.)
**A5. WL3 JSON contains no PII.**
```
grep -n "customer_id\|name\|email\|slot_id" clavis-vault/lib/wl3.go
```
Expected: zero matches in field definitions of `WL3Entry`. Adding any of these
makes the entire WL3 corpus GDPR-scoped.
**A6. Keys are never logged or audited.**
```
grep -rn "log\.\|fmt\.Print\|AuditEvent" clavis-vault/api/ clavis-vault/lib/ \
| grep -iE "l1Key|master|wrappedL3|wrapped_l3|prf|p1Bytes"
```
Expected: zero matches. Threat B violation if any key material reaches a log,
metric, error message, or audit row — even in hex, even in debug builds.
**A7. Per-agent IP whitelist enforcement is on.**
```
grep -n "AgentIPAllowed\|allowed_ips\|AllowedIPs" clavis-vault/api/middleware.go
```
Expected: matches in `L1Middleware` showing first-contact fill + enforcement.
This is Threat C's primary defense. If the check is missing or commented out,
a stolen agent token works from anywhere on the internet.
### Section B — Silent fallback
**B1. No silent decryption fallback in browser code.**
```
grep -rn "catch.*plaintext\|catch.*ciphertext\|catch.*=.*null" \
clavis-vault/cmd/clavitor/web/ clavis-chrome/ clavis-firefox/
```
Expected: any match must be reviewed. The acceptable pattern is
`catch (e) { result = '[decryption failed]'; }` — visible to the user.
**B2. No `return true` in security check stubs.**
```
grep -rn "return true.*//.*TODO\|return true.*//.*for now\|return true.*//.*accept" \
clavitor.ai/admin/ clavis-vault/
```
Expected: zero matches. If you see one, you've found a SECURITY.md violation.
Implement it or delete the call site.
**B3. No commented-out security checks.**
```
grep -rn "// *if !verify\|// *if !auth\|// *return.*Forbidden" \
clavis-vault/ clavitor.ai/admin/
```
Expected: zero matches.
**B4. No silent fallbacks in catch blocks (manual LLM review required).**
Grep catches patterns, not intent. All catch blocks in decryption/security paths
must be manually reviewed to ensure they don't silently swallow errors. The LLM
must verify: no `catch (e) { return data; }`, no `catch { decrypted = ""; }`,
no `finally` blocks that restore old values without logging.
### Section C — Test fixture security
**C0. Test fixture key material is obviously fake.**
Test data containing key material (L2, L3, master_key, PRF output):
- Must be instantly recognizable as test data (not production keys)
- **Pattern:** 32 identical bytes: `bytes.Repeat([]byte{0xAB}, 32)` or `{0x00}`, `{0xFF}`, `{0xDE}, {0xAD}`
- **Never** use random-looking values that could be mistaken for real keys
- **Never** derive test keys from production material
- **Never** commit binary test fixtures (always text/SQL that shows the pattern)
Example:
```go
// GOOD - obviously fake
testL3 := bytes.Repeat([]byte{0xAB}, 32)
testMaster := bytes.Repeat([]byte{0x00}, 32)
// BAD - looks like a real key
testL3 := []byte{0x4a, 0x9f, 0x2c, ...} // 32 random bytes
```
### Section D — Foundation drift
**C1. No `_old/` directory.**
```
find . -type d -name "_old"
```
Expected: zero results. Dead code is deleted, not parked.
**C2. No env vars for fundamental paths.**
```
grep -rn "os.Getenv(\"DATA_DIR\")\|os.Getenv(\"WL3_DIR\")" clavis-vault/lib/
```
Expected: zero matches. These are hardcoded in `lib/config.go` to `vaults`
and `WL3` respectively. Symlink if you want different paths.
**C3. No migration code.**
```
grep -rn "ALTER TABLE\|migration\|Migration" clavis-vault/lib/dbcore.go \
| grep -v "// "
```
Expected: matches only inside `MigrateDB` for the existing `alternate_for`
and `verified_at` columns (legacy from earlier work). No NEW `ALTER TABLE`
should appear during early-stage development.
**C4. Hardcoded build constants in HTML.**
```
grep -rn "var BUILD = '" clavis-vault/cmd/clavitor/web/
```
Expected: zero matches. Build version must come from `/api/version` at
runtime, not from a hardcoded constant that goes stale.
### Section E — DRY violations
**E1. Two functions doing the same thing.**
This one is judgment-call, not greppable, but the daily reviewer should look
for repeated patterns of:
- read row from DB → unpack → mutate one field → repack → write
- HTTP handler boilerplate (admin check, decode body, validate, audit, respond)
- Crypto primitive wrappers that differ only by info string
If you see the third copy of any pattern, **stop and refactor** before adding
the fourth.
The canonical example to model after: `lib/dbcore.go:agentMutate` — replaced
three near-identical functions with a higher-order helper. Net negative line
count, easier to add new mutations, single point of correctness.
**E2. Two crypto implementations of the same primitive.**
Browser code uses `clavis-vault/cmd/clavitor/web/crypto.js`. CLI uses
`clavis-cli/crypto/crypto.js`. These should be **byte-identical**. If they
diverge, the encrypted output diverges, and a credential created in one
client becomes unreadable in the other.
```
diff clavis-vault/cmd/clavitor/web/crypto.js clavis-cli/crypto/crypto.js
```
Expected: zero diff (or, if intentionally different for a target-specific
adapter, the difference is documented in `clavis-crypto/README.md`).
### Section F — Threat model alignment
**E1. Per-agent rate limits have safe defaults.**
```
grep -n "RateLimit *: *3\|RateLimit *= *3\|RateLimitMinute.*3" \
clavis-vault/api/handlers.go
grep -n "RateLimitHour *: *10\|RateLimitHour *= *10" \
clavis-vault/api/handlers.go
```
Expected: matches in `HandleCreateAgent`. If the defaults drift (e.g. someone
changes them to 0 = unlimited), the harvester defense is silently disabled
for new agents.
**E2. `agentReadEntry` is called from every credential-read handler.**
For each handler that returns decrypted credential data, the call is
mandatory. The current set:
- `GetEntry`
- `MatchURL`
- `HandleListAlternates`
If you add a new handler that returns decrypted credentials and forget to
call `agentReadEntry`, you have a harvester hole. Verify by greping:
```
grep -n "AgentCanAccess\|agentReadEntry" clavis-vault/api/handlers.go
```
Every credential-read handler must call both `AgentCanAccess` (before accessing
data) and `agentReadEntry` (after decryption, before returning). Verify both
functions appear in the handler code path.
**E3. `/api/entries/batch` is exempt from global rate limit.**
```
grep -A2 "/api/entries/batch" clavis-vault/api/middleware.go
```
Expected: a `next.ServeHTTP(w, r); return` block with the explanatory
comment intact. If the comment is missing, the next reviewer will reintroduce
the rate limit and break import.
### Section F — Test posture
**F1. All tests pass.**
```
cd clavis-vault && go test ./lib/... ./api/...
cd clavitor.ai/admin && go test .
```
Expected: all green. **Failing tests are foundation alerts**, not "tomorrow"
problems.
**F2. JS syntax check passes.**
```
cd clavis-vault && make build # validates JS as part of build
```
Expected: no `JS syntax error` in output. JS files are embedded in the binary;
syntax errors break the running app silently in some browsers.
**F3. New code has tests.**
Not greppable. Judgment call. The rule: if you added a new function with
non-trivial logic, there should be a test for it in the same commit. The
tests model what the function is supposed to do. If you can't write a test
because "it's hard to test", the function is probably too coupled — refactor.
### Section G — Dead code elimination
**G1. No empty directories.**
```bash
find . -type d -empty | grep -v ".git" | grep -v "vendor"
```
Expected: zero results. Empty directories are deleted.
**G2. No orphaned HTML/JS/CSS files (with exceptions).**
```bash
# Find HTML files not referenced in Go code
# Exceptions: design-system/, marketing/, test-*.html (dev/test files)
for f in $(find . -name "*.html" | grep -v vendor | grep -v "design-system" | grep -v "marketing" | grep -v "test-"); do
name=$(basename "$f")
if ! grep -r "$name" --include="*.go" . 2>/dev/null | grep -q .; then
echo "Orphaned: $f"
fi
done
```
Review any matches — may be standalone pages or may be dead.
**Exceptions (intentionally not checked):**
- `design-system/` — standalone design documentation, not embedded
- `marketing/` — marketing assets, separate from application
- `test-*.html` — development/test files (manual review for cleanup)
**G3. No unreachable Go functions (requires go-dead-code or similar tool).**
Manual check: review private functions (lowercase) with no callers in the same
package. Public functions with no importers are also suspicious — verify they
are API endpoints or intentionally exported.
**G4. No commented-out blocks >5 lines.**
```bash
grep -rn "^\s*//\s*\w" --include="*.go" --include="*.js" . | \
awk -F: '{print $1}' | sort | uniq -c | sort -rn | head -20
```
Review files with many commented lines — likely dead code.
---
### Git workflow
### Commit philosophy
**Commit early, commit often.** A commit is a save point, not a publish event.
- One logical change per commit — don't mix bug fix + refactoring + new feature
- If you can't explain the commit in one sentence, it's too big — split it
- Work-in-progress commits are fine locally; clean up before sharing
### Commit message format
```
<subject>: <what changed>
<body>: <why it changed> (optional but encouraged)
<footer>: <references> (issue #123, fixes, etc.)
```
**Subject line rules:**
- Imperative mood: "Add rate limiter" not "Added rate limiter"
- No period at end
- Max 50 characters (GitHub truncates at 72)
- Prefix with area: `api:`, `lib:`, `web:`, `crypto:`, `docs:`
**Good:**
```
api: add per-agent IP whitelist middleware
Implements Threat C defense: first contact fills whitelist,
subsequent requests rejected if IP not in list. Supports CIDR
and FQDN in addition to exact IPs.
Fixes #456
```
**Bad:**
```
changes
lots of changes to the code including some fixes and also
refactored the handler and added new feature for agents
```
### When to commit vs. amend
**Amend** (rewrite history) for:
- Fix typo in commit you just made
- Add file you forgot to stage
- Improve commit message
**New commit** for:
- Any change that has been pushed (even to your own branch)
- Changes that logically separate from previous work
- Anything another developer might have based work on
### Push guidelines
**Never force push to `main` or `master`.** Ever.
**Branch workflow:**
1. Create feature branch: `git checkout -b feature/agent-lockdown`
2. Commit work-in-progress freely
3. Before push: `git rebase -i main` to clean up (squash "fix typo" commits)
4. Push branch: `git push -u origin feature/agent-lockdown`
5. Open PR (when we have CI), or merge with `--no-ff` for feature visibility
**Destructive operations require explicit approval:**
- `git reset --hard`
- `git push --force` (any branch)
- `git filter-branch` or `git filter-repo`
- Branch deletion with unmerged commits
State: "About to [destructive operation] on [branch]. This will [effect].
Alternative: [safer option]. Proceed?"
### Repository hygiene
**What we commit:**
- Source code
- Generated code that's not reproducible (rare)
- Documentation
- Configuration templates (`.env.example`, not `.env`)
**What we DO NOT commit:**
- Secrets (keys, passwords, tokens)
- Build artifacts (`*.o`, `*.a`, binaries)
- Dependencies (vendor them or use go.mod)
- User data, vault files, logs
**`.gitignore` maintenance:**
- If a file is generated or contains secrets, add it to `.gitignore`
- Review `git status` before every commit — nothing should surprise you
### Signed commits (future)
When we add contributor access beyond core team, signed commits (SSH or GPG)
will be required. All commits must be verifiable.
### Agent-authored PRs (AI-as-contributor)
Since agents (AI assistants) do much of the implementation work, **agents
should author their own PRs** with full context. This creates an auditable
trail of AI decision-making.
**Agent PR workflow:**
```bash
# Agent does the work
1. git checkout -b sarah/rate-limit-defense-20250408
2. [make changes per CLAVITOR-PRINCIPLES.md]
3. git commit -m "api: add per-agent rate limiting and lockdown
Implements Threat A defense per CLAVITOR-PRINCIPLES.md Part 2.
- Rate limit: 3/min, 10/hour distinct entries
- Two-strike lockdown within 2h window
- Locked state persisted in encrypted agent record
Author: Sarah <sarah-20250408-001>"
4. git push -u origin sarah/rate-limit-defense-20250408
5. gh pr create --title "Sarah: Add per-agent rate limiting" \
--body "$(cat <<'EOF'
## Summary
Implements harvester defense per Cardinal Rule #4.
## Changes
- `lib/types.go`: Add RateLimit, RateLimitHour, Locked fields to AgentData
- `api/middleware.go`: Per-agent rate limiting in L1Middleware
- `api/handlers.go`: agentReadEntry call for credential reads
## Security Review Checklist
- [x] Rate limits default to safe values (3/min, 10/hr)
- [x] No L2/L3 material on server (verified via Part 4 checks)
- [x] Audit logging on all state changes
- [x] IP whitelist enforcement unchanged
## Test Plan
- Create agent with default limits
- Trigger rate limit → expect 429
- Trigger second strike → expect 423 Locked
- Verify lock persists after vault restart
## Documentation
- CLAVITOR-PRINCIPLES.md Part 7 updated with unlock procedure
## Risks
- False positive lock: owner can PRF-unlock via admin endpoint
- First-contact IP race: accepted risk per existing comment
EOF
)"
```
**PR template for agents:**
Create `.github/PULL_REQUEST_TEMPLATE/agent_pr_template.md`:
```markdown
## Agent Identity
- Agent name: <!-- Sarah, Charles, Maria, James, Emma, Arthur, Victoria -->
- Session ID: <!-- name-YYYYMMDD-NNN, e.g., sarah-20250408-001 -->
- Model: <!-- Claude, Kimi, etc. (implementation detail, not important) -->
- Principles reviewed: CLAVITOR-PRINCIPLES.md v<!-- version -->
- Daily checks: <!-- date agent ran Part 4 -->
## Summary
<!-- One paragraph: what and why -->
## Changes
<!-- Bullet list of files changed and key logic -->
## Security Review Checklist
- [ ] No server-side L2/L3 handling (A1-A3)
- [ ] Security failures are LOUD (Cardinal Rule #1)
- [ ] No key material in logs (A6)
- [ ] Error handling has unique codes (Part 1)
- [ ] DRY violations checked (Section E)
- [ ] Test coverage for new logic
## Verification
<!-- How you tested, or why testing is deferred -->
## Risks & Mitigations
<!-- What could go wrong, how it's handled -->
## Principle References
<!-- Which parts of CLAVITOR-PRINCIPLES.md this implements or modifies -->
```
### Agent naming convention
Agents have **persona names** — like team members, not tools. This makes git history
readable and creates consistent ownership.
| Name | Domain | Specialty | Voice |
|------|--------|-----------|-------|
| **Sarah** | `clavis-vault` | Core vault, API, middleware | Precise, security-focused |
| **Charles** | `clavis-cli` | C CLI, embedded JS, tooling | Systems-level, thorough |
| **Maria** | `clavis-crypto` | Cryptography, primitives | Rigorous, mathematical |
| **James** | `clavis-chrome/firefox/safari` | Browser extensions, UX | User-focused, detail-oriented |
| **Xiao** | `clavis-android/ios` | Mobile apps, platform autofill | Platform-native, precision-focused |
| **Emma** | `clavitor.ai/admin` | Central, Paddle, POP sync | Business logic, reliability |
| **Arthur** | Architecture | Cross-cutting design, principles | Strategic, big-picture |
| **Victoria** | Security reviews | Threat model, audits, hardening | Skeptical, uncompromising |
| **Luna** | Design | UI/UX, CSS, visual systems | Aesthetic, user-empathetic |
| **Thomas** | Documentation | Technical writing, guides, API docs | Clear, pedagogical, thorough |
| **Hugo** | Legal | Compliance, privacy policy, terms, licensing | Cautious, precise, risk-aware |
| **Hans** | NOC / Operations | Infrastructure, monitoring, POP health, alerts | Calm under pressure, systematic, proactive |
| **George** | Market research | Competitive analysis, pricing, positioning, ICP | Analytical, curious, business-minded |
| **Shakib** | QA / Test Engineering | Test coverage, regression detection, validation | Thorough, systematic, quality-obsessed |
| **Yurii** | Code & Principle Review | Architecture alignment, principle enforcement, PR review | Rigorous, standards-driven, uncompromising |
**Why persona names:**
- Release notes read like a team: "Sarah added rate limiting, Charles fixed CLI build"
### How agents know their name
**Primary rule: Directory-based auto-detection**
```
Working directory → Agent name
clavis/clavis-vault/* → Sarah
clavis/clavis-cli/* → Charles
clavis/clavis-crypto/* → Maria
clavis/clavis-chrome/* → James
clavis/clavis-firefox/* → James
clavis/clavis-safari/* → James
clavis/clavis-android/* → Xiao
clavis/clavis-ios/* → Xiao
clavitor.ai/* → Emma
marketing/*, design-system/* → Luna
docs/*, *.md (root level) → Thomas
legal/*, LICENSE*, PRIVACY* → Hugo
operations/*, monitoring/*, noc/* → Hans
research/*, competitive/*, pricing/* → George
tests/*, coverage/, *_test.go → Shakib
PRs touching security/principles → Yurii (mandatory review)
CLAVITOR-AGENT-HANDBOOK.md changes → Arthur or Victoria (depending on nature)
```
**Override rules (explicit wins over implicit):**
1. **User statement:** "Victoria, review this PR for security issues" → Victoria
2. **Multiple directories touched:** Use `architect-agent` file or user specifies
3. **Root-level changes:** Check `.agent-name` file in repo root, or ask user
**`.agent-name` file (optional override):**
Create in any directory to fix identity:
```bash
echo "Luna" > clavis/clavis-vault/cmd/clavitor/web/.agent-name
# Now Luna owns all web UI changes here, even though it's in vault/
```
**When uncertain:**
Agent asks: "I'm working on files in [paths]. Based on the directory structure, I believe I'm [Name]. Confirm or correct?"
Examples:
- Editing `clavis/clavis-vault/lib/dbcore.go` → "I'm Sarah. Confirm?"
- Editing `docs/SECURITY.md` → "I'm Thomas. Confirm?"
- Editing `CLAVITOR-PRINCIPLES.md` → "This is an architecture/principle change — should I be Arthur or Victoria?"
- You learn who to expect for what: "Maria is touching crypto → extra review needed"
- Natural language: "Ask Sarah to look at the vault middleware"
**Commit signature format:**
```
Author: Sarah <sarah-20250408-001>
Author: Charles <charles-20250408-003>
```
**Session ID format:** `name-YYYYMMDD-NNN` (001, 002... for multiple sessions/day)
**Human review points:**
Even agent-authored PRs require human approval before merge:
1. Security-critical changes → Johan must approve
2. Schema/data model changes → Review for migration impact
3. New principles added → Ensure they follow Part 5 format
4. Violations noted → Confirm documented in Part 6 style
**Agent merge authority:**
Currently: **No auto-merge.** All PRs require human approval.
Future state: Agents may merge their own PRs for:
- Documentation fixes
- Test additions
- Trivial refactors (no logic change)
- Daily drift fixes (Part 4 violations)
With safety rules:
- Never merge to `main` directly
- Always through PR with CI passing
- No merge if PR contains "SECURITY" or "veto" in review comments
---
---
## Section IV — Operations
How we run the service and handle incidents.
### Incident Response
When a security event is detected, the system and operators respond according
to these procedures. Every response action is audit-logged.
### Agent compromise detection (Threat A)
When an agent triggers harvester defenses:
1. **Rate limit hit (per-minute)** → 429 Too Many Requests
- Agent receives: `"Rate limit exceeded — slow down"`
- Audit log: `action=agent_rate_limit_minute, actor=agent_id, entry_id=requested_entry`
2. **Hour-limit hit (first strike)** → 429 + strike recorded
- Agent receives: `"Rate limit exceeded — warning recorded"`
- Audit log: `action=agent_strike_1, actor=agent_id, last_strike_at=now`
- Agent remains functional
3. **Second strike within 2 hours** → Agent locked
- Agent receives: `423 Locked` with message:
```
"This agent has been locked due to suspicious activity.
Contact the vault owner to restore access."
```
- Audit log: `action=agent_locked, actor=agent_id, reason=second_strike`
- All subsequent requests from this agent return 423 until unlocked
4. **Owner unlock procedure**
- Owner performs PRF tap (WebAuthn assertion)
- POST `/api/agents/{id}/unlock` with admin token
- Agent's `Locked=false`, `LastStrikeAt=null`, `StrikeCount=0`
- Audit log: `action=agent_unlocked, actor=owner_prf, entry_id=agent_id`
- Agent may retry from clean slate
### Post-incident review
After any agent lock event:
1. Review audit logs for the 2-hour window before lock
2. Identify accessed entries and their sensitivity (scope classification)
3. If any financial/admin credentials accessed: force rotation of those credentials
4. Document findings in incident tracker (outside this codebase)
### Token compromise (Threat C)
When a stolen agent token is used from an unauthorized IP:
1. **IP whitelist rejection** → 403 Forbidden
- Response: `"Request from unauthorized IP — token may be compromised"`
- Audit log: `action=agent_ip_rejected, actor=agent_id, ip_addr=source_ip, allowed_ips=[whitelist]`
2. **Owner notification** (commercial edition)
- Alert POST to central: `alert=potential_token_theft, agent_id, source_ip, timestamp`
- Central emails owner with instructions to revoke agent
3. **Revocation**
- Owner performs PRF tap
- DELETE `/api/agents/{id}` with admin token
- All tokens for this agent invalidated
- Audit log: `action=agent_revoked, actor=owner_prf, reason=token_compromise_suspected`
### Emergency procedures
**Owner locked out (lost authenticator):**
- No self-service recovery. Contact support with identity verification.
- Support triggers `vault_recover` procedure (offline, requires two operators).
- New vault created, credentials imported from backups (if available).
**Complete vault compromise suspected:**
- Rotate all credentials in vault immediately
- Archive vault, create new vault with new L0
- Investigation mode: all agent access suspended until root cause identified
---
### Compliance & Data Governance
### Data retention by customer tier
| Data type | Paying customers | Non-paying (community) | Rationale |
|-----------|------------------|------------------------|-----------|
| Vault content | Until deletion + 30 days | Until deletion + 7 days | Business continuity vs storage costs |
| Audit logs | 7 years (SOC 2/SOX) | 90 days | Regulatory requirement vs convenience |
| Failed auth | 90 days | 30 days | Threat analysis window |
| Backups | 30 days rolling | 7 days rolling | Disaster recovery needs |
| Telemetry | 90 days | None | Commercial optimization only |
Non-paying users accept reduced retention as a trade-off for free service.
Upgrade to commercial = full compliance coverage.
### Cross-border transfers
**Zurich central as anchor:** Our central infrastructure is in Zurich, Switzerland.
This addresses most compliance concerns:
- Swiss data protection (adequate under EU GDPR)
- POPs are data processing extensions, not independent controllers
- No EU→US transfer issues for European customers
- Asian customers: POPs hold encrypted WL3 only (no PII), minimal exposure
Standard Contractual Clauses (SCCs) in place for enterprise customers who require
them. Default terms cover most use cases without additional paperwork.
---
---
## Section V — Subprojects
Domain-specific guidance. Read before touching that code.
### Per-subproject principles
Each subproject inherits Parts 1 and 2 in full. The sections below add what's
specific to that subproject — what it owns, what it must never do, where its
boundaries are.
### `clavis-vault` — the Go vault binary
**Owns:**
- The SQLite vault file at `./vaults/clavitor-<L0>` (one per vault)
- The WL3 credential JSON files at `./WL3/<shard>/<full-p1>.json`
- All HTTP API surface for entry CRUD, agents, audit, backups
- L1-encrypted entry storage with per-entry HKDF-derived AES keys
**Must never:**
- Hold L2 or L3 plaintext in memory or on disk
- Receive a full `master_key` over the wire (only `l1`, `p1`, `wrapped_l3`)
- Derive P1 itself — receives from the browser
- Serve admin/sync endpoints on a public listener (Tailscale-only)
- Have functions named `MintCredential`, `ParseCredential`, `CredentialToWire`
(those handle L2 plaintext; if needed at all they live in the browser/CLI)
- Contain `lib.DeriveP1`
- Auto-create `data/` directories — vault files live in `./vaults/` only
- Have `// removed` comments where code used to live
- Have a `_old/` directory (deleted; never recreate)
**Edition split:**
- Community (default build): no Tailscale, no central, no replication, no
enrollment tokens. Self-hosted, single-node.
- Commercial (`-tags commercial`): adds replication, enrollment gating,
central WL3 sync. All commercial-only code lives behind build tags and
inside `edition/commercial.go` (or similar).
**Per-agent harvester defense:**
- `lib.AgentData.RateLimit` = max unique entries/minute (default 3 at creation)
- `lib.AgentData.RateLimitHour` = max unique entries/hour (default 10)
- Per-minute hit → 429, soft throttle, no strike
- Per-hour hit → 429, strike recorded in `LastStrikeAt`
- Second strike within 2 hours → `Locked = true`, persisted, `423 Locked` on
every subsequent request from that agent until owner PRF-unlocks
- Enforced in handlers that read decrypted credentials: `GetEntry`,
`MatchURL`, `HandleListAlternates`. Any new handler that returns decrypted
credential data MUST call `agentReadEntry(agent, entryID, db, vk)` after
the scope check.
**Rate-limit exemptions:**
- `/api/entries/batch` is exempt from the global per-(IP, method, path) rate
limit. **The reason is not "owner is trusted"** — the reason is that
`CreateEntryBatch` returns 403 to any agent at the top of the handler, so
this path is owner-only by handler enforcement, and the agent harvester
defense lives elsewhere. If you reintroduce a rate limit on this path, you
will DOS the import flow with no defense gained. There is a comment in
`api/middleware.go` explaining this. Read it before touching.
### `clavis-cli` — the C+JS command-line client
**Owns:**
- C wrapper around an embedded JS engine
- `src/cvt.c` — full C implementation of CVT envelope parsing/minting
(`cvt_parse_credential`, `cvt_build_wire`)
- `crypto/crypto.js` — JS crypto via the embedded engine, mirrors the
browser's `crypto.js` exactly
- Local credential storage on the CLI host
**Must never:**
- Bridge to Go for any cryptographic operation. The CLI is a self-contained
trust anchor. If you find yourself wanting to call into the Go vault binary
for crypto, **stop** — implement it in C/JS instead.
- Hold L1 from the user's vault unencrypted on disk — credentials are stored
as type-0x01 CVT tokens, encrypted with L0
- Send L2 or L3 over the network — only ever sends type-0x00 wire tokens
(L1 + agent_id) to the vault
**Single source of truth:**
- The CVT envelope format is documented in `src/cvt.h` and implemented in C
there. The Go vault has its own implementation in `lib/cvt.go` for the wire
token (0x00) only. **The L2-bearing client credential type (0x01) is not
implemented in Go and must never be.** Two implementations of the wire
token are tolerated because they're cross-language; the L2 credential
staying single-language is a hard veto.
### `clavis-vault/cmd/clavitor/web` — the embedded web frontend
**Owns:**
- `auth.js` — WebAuthn + PRF, master key derivation, P1, L3 wrapping
- `crypto.js` — single source of truth for field encryption (browser side)
- `import.js` + `importers.js` + `importers-parsers.js` — all import parsing
and field encryption (the Go server has zero importer code)
- All views: vault list, entry detail, agents, security, tokens, audit
**Must never:**
- Send `master_key` to the server (send `l1` + `p1` + `wrapped_l3`)
- Persist L2 or L3 to localStorage or IndexedDB. Master key lives in
`sessionStorage` only — wiped on tab close.
- Hardcode the build version (it must come from `/api/version`, not a `var
BUILD = '...'` constant in `index.html`)
- Use silent fallback on decryption errors (show `[decryption failed]` in
the field, not the encrypted bytes, not the empty string, not nothing)
- Bypass the global `api()` helper for HTTP calls — that helper handles
bearer auth, admin tokens, and 401 redirect uniformly
**Build process:**
- JS files are embedded into the binary via `go:embed`. Editing JS requires
rebuilding the Go binary. Run `make dev` from `clavis-vault/`. The Makefile
validates JS syntax before building.
- The build fails if JS has a syntax error. Do not commit code that fails
the syntax check.
### `clavis-chrome`, `clavis-firefox`, `clavis-safari` — browser extensions
**Owns:**
- Form detection and field filling (content scripts)
- Credential picker UI (popup)
- Service worker for vault API calls
**Must never:**
- Store L2 or L3 in any extension storage area (`chrome.storage.local`,
`chrome.storage.sync`, `chrome.storage.session`). Active session keys live
in memory only and die on service-worker restart.
- Use a different field encryption than `clavis-crypto` / `crypto.js`. If
the extension reimplements crypto, it WILL drift and corrupt fields.
- Talk to the vault over HTTP. Always HTTPS, even on localhost (the vault
serves a self-signed cert in dev).
- Have permissions broader than what's strictly needed. If you ask for
`<all_urls>` when `https://*/*` would do, fix it.
### `clavis-crypto` — shared crypto primitives
**Owns:**
- The canonical implementation of `encrypt_field`, `decrypt_field`, HKDF
derivation, AES-GCM, and any other primitive used by browser code, the
CLI, or extensions.
- Tests that verify both browser-WebCrypto and CLI-BearSSL produce
bit-identical output for the same inputs. This is the only thing keeping
the two trust anchors in sync.
**Must never:**
- Diverge between targets. If a primitive behaves differently in WebCrypto
vs BearSSL, the bug is in `clavis-crypto` and it gets fixed there before
any caller compensates.
### `clavitor.ai/admin` — central admin / Paddle integration
**Owns:**
- Customer hierarchy (MSP → end-customer → vault slots)
- Subscription state and seat ledger
- Vault registry: `vault_slots` table mapping (customer, slot_index, l0, pop)
- Paddle webhook handler with signature verification
- Enrollment token issuance (commercial)
**Must never:**
- Hold any decryption material. Central is not a vault; it's a directory
service. The wrapped L3 stored centrally (for distribution) is opaque to
central.
- Trust an inbound webhook without verifying its HMAC signature. Paddle
webhook verification is mandatory; the secret comes from
`PADDLE_WEBHOOK_SECRET`. If the env var is unset, every webhook is
refused.
- Accept admin operations from outside Tailscale. The `vaults/claim`,
`issue-token`, `wl3/since`, `wl3/full` endpoints listen only on the
tailnet interface.
- Expose the Paddle API key, webhook secret, or any service credentials in
client-side code. Server-side env only.
**Vault slot lifecycle:**
- Slots are pre-created when a customer subscribes. A Family-5 plan creates
5 `vault_slots` rows immediately, all `status='unused'`.
- When the owner names a slot ("Anna") and issues a token, the slot moves
to `status='pending'` and the token gets stored on the POP (not central).
- When Anna enrolls, the slot moves to `status='active'` and `l0` is
populated.
- Cancellation/downgrade flows mark slots `status='archived'`. The vault
data and WL3 file persist (we never delete) until an explicit GDPR
request triggers a one-shot deletion script.
### `clavis-android` / `clavis-ios` — mobile clients
**Owns:**
- Native form-fill integration with platform autofill APIs
- Local credential picker UI
**Must never:**
- Implement crypto natively. Crypto goes through `clavis-crypto` (compiled
for the platform) or an embedded JS engine running the same `crypto.js`.
Two crypto implementations on the same platform is a guaranteed drift.
- Persist L2 or L3 to platform keychains/keystores. The session key lives
in memory; biometric unlock re-derives via PRF from the platform's
WebAuthn equivalent.
### `clavis-telemetry` — operator telemetry
**Owns:**
- Heartbeat metrics from POPs to central
- Per-POP system metrics (CPU, memory, disk, vault count)
**Must never:**
- Send any vault content. Telemetry is operational, not data.
- Send raw IP addresses of users. Aggregate counts only.
- Run in community edition by default. Community is offline-by-default;
telemetry is commercial-only and opt-in.
---
### Localization (i18n/l10n)
### Status: English-first, limited localization
**Current priority:**
- ✅ Error message codes (ERR-12345) — universal, no translation needed
- ✅ CLI output — English only (dev/ops audience)
- ✅ Web UI — English only (for now)
- ❌ RTL (Arabic/Hebrew) — not a priority for initial markets
- ❌ Full UI translation — future phase, post-product-market-fit
### Form field detection — core feature
Browser extension form detection is **intentionally unaddressed in this document**.
It is a core product feature that will evolve through:
- W3C `autocomplete` attribute support (universal)
- ML-based field classification (language-agnostic)
- Community contribution for locale-specific heuristics
We do not mandate specific implementation here because rapid iteration is required.
The feature will mature separately from these architectural principles.
---
---
*Foundation First. No mediocrity. Ever.*