# Replication Design — Active-Passive with Async Sync

## Overview

Primary POP (e.g., Calgary) replicates every write to Backup POP (e.g., Zurich). Backup serves **read-only** traffic if primary fails.

## Key Principles

1. **Primary owns writes** — Backup never accepts mutations from clients
2. **Same wire format** — Replicate the exact request payload (not re-encoded)
3. **Async, non-blocking** — Primary doesn't wait for backup ACK (queue + retry)
4. **Dirty tracking per entry** — Each entry has `replicated_at` and a dirty flag
5. **Read failover only** — Clients read from backup if primary is down, but writes fail

## Architecture

### On Primary (Calgary)

```
Client Request → Primary Handler
        ↓
[1] Apply to local DB
[2] Queue for replication (async)
[3] Return success to client (don't wait for backup)
        ↓
Replication Worker (background)
        ↓
POST to Backup /api/replication/apply
```

**Queue Structure:**

```go
type ReplicationTask struct {
	EntryID    int64
	RawPayload []byte // Original request body (encrypted blob)
	Method     string // POST/PUT/DELETE
	Timestamp  int64  // When primary applied
	RetryCount int
	Dirty      bool // true = not yet confirmed by backup
}
```

**Per-Entry Status (in entries table):**

```sql
replicated_at     INTEGER, -- NULL = never replicated, timestamp = last confirmation
replication_dirty BOOLEAN  -- true = pending replication, false = synced
```

### On Backup (Zurich)

```
POST /api/replication/apply
        ↓
Validate: Is this from an authorized primary POP? (mTLS or shared secret)
        ↓
Apply to local DB (exact same data, including encrypted blobs)
        ↓
Return 200 ACK
```

**Backup rejects client writes:**

```go
if isClientRequest && isWriteOperation {
	http.Error(w, "Write operations not available on backup POP",
		http.StatusServiceUnavailable)
	return
}
```

## Failure Scenarios
### 1. Backup Unavailable (Primary Still Up)

- Primary queues replication tasks (in-memory + SQLite for persistence)
- Retries with exponential backoff
- Marks entries as `dirty=true`
- Client operations continue normally
- When backup comes back: bulk sync dirty entries

### 2. Primary Fails (Backup Becomes Active)

- DNS/healthcheck detects primary down
- Clients routed to backup
- **Backup serves reads only**
- Writes return 503 with header: `X-Primary-Location: https://calgary.clavitor.ai`
- Manual intervention required to promote backup to primary

### 3. Split Brain (Both Think They're Primary)

- Prevented by design: only one POP holds the "primary" role in the control plane
- Backup refuses writes from clients
- If the control plane fails: manual failover only

## Replication Endpoint (Backup)

```http
POST /api/replication/apply
Authorization: Bearer {inter-pop-token}
Content-Type: application/json

{
  "source_pop": "calgary-01",
  "entries": [
    {
      "entry_id": "abc123",
      "operation": "create",  // or "update", "delete"
      "encrypted_data": "base64...",
      "timestamp": 1743556800
    }
  ]
}
```

Response:

```json
{
  "acknowledged": ["abc123"],
  "failed": [],
  "already_exists": []  // For conflict detection
}
```

## Audit Log Handling

**Primary:** Logs all operations normally.

**Backup:** Logs its own operations (replication applies) but not client operations.

```go
// On backup, when applying replication:
lib.AuditLog(db, &lib.AuditEvent{
	Action:  lib.ActionReplicated, // Special action type
	EntryID: entryID,
	Title:   "replicated from " + sourcePOP,
	Actor:   "system:replication",
})
```

## Client Failover Behavior

The client detects the primary is down (connection timeout, 503, or similar).
It then automatically retries against the backup POP. On the backup:

```
GET /api/entries/123   // ✅ Allowed
PUT /api/entries/123   // ❌ 503 + X-Primary-Location header
```

## Improvements Over Original Design

| Original Proposal | Improved |
|-------------------|----------|
| Batch polling every 30s | **Real-time async queue** — faster, lower lag |
| Store just timestamp | **Add dirty flag** — faster recovery, less scanning |
| Replica rejects all client traffic | **Read-only allowed** — true failover capability |
| Single replication target | **Primary + Backup concept** — clearer roles |

## Database Schema Addition

```sql
ALTER TABLE entries ADD COLUMN replicated_at INTEGER; -- NULL = never
ALTER TABLE entries ADD COLUMN replication_dirty BOOLEAN DEFAULT 0;

-- Index for fast "dirty" lookup
CREATE INDEX idx_entries_dirty ON entries(replication_dirty)
  WHERE replication_dirty = 1;
```

## Code Structure

**Commercial-only files:**

```
edition/
├── replication.go         # Core replication logic (queue, worker)
├── replication_queue.go   # SQLite-backed persistent queue
├── replication_client.go  # HTTP client to backup POP
└── replication_handler.go # Backup's /api/replication/apply handler
```

**Modified:**

```
api/handlers.go    # Check if backup mode, reject writes
api/middleware.go  # Detect if backup POP, set context flag
```

## Security Considerations

1. **Inter-POP Auth:** mTLS or shared bearer token (rotated daily)
2. **Source Validation:** Backup verifies primary is authorized in control plane
3. **No Cascade:** Backup never replicates to another backup (prevent loops)
4. **Idempotency:** Replication operations are idempotent (safe to retry)

## Metrics to Track

- `replication_lag_seconds` — Time between primary apply and backup ACK
- `replication_queue_depth` — Number of pending entries
- `replication_failures_total` — Failed replication attempts
- `replication_fallback_reads` — Client reads served from backup

---

**Johan's Design + These Refinements = Production Ready**
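As a closing sketch, the idempotency requirement from the Security Considerations can be illustrated with an in-memory stand-in for the backup's apply step; `ReplEntry`, `Store`, and `applyBatch` are hypothetical names, and the timestamp comparison is one simple way to make replays safe:

```go
package main

import "fmt"

// ReplEntry mirrors one item in the /api/replication/apply payload.
type ReplEntry struct {
	EntryID       string
	Operation     string // "create", "update", "delete"
	EncryptedData string
	Timestamp     int64
}

// Store is an in-memory stand-in for the backup's database.
type Store map[string]ReplEntry

// applyBatch applies entries idempotently: replaying a batch the backup
// has already seen changes nothing and reports the IDs in alreadyExists,
// matching the response shape of the replication endpoint.
func applyBatch(s Store, entries []ReplEntry) (acked, alreadyExists []string) {
	for _, e := range entries {
		if prev, ok := s[e.EntryID]; ok && prev.Timestamp >= e.Timestamp {
			alreadyExists = append(alreadyExists, e.EntryID) // safe retry
			continue
		}
		if e.Operation == "delete" {
			delete(s, e.EntryID)
		} else {
			s[e.EntryID] = e
		}
		acked = append(acked, e.EntryID)
	}
	return acked, alreadyExists
}

func main() {
	s := Store{}
	batch := []ReplEntry{{EntryID: "abc123", Operation: "create", Timestamp: 1743556800}}
	acked, _ := applyBatch(s, batch)
	_, dup := applyBatch(s, batch) // replay: no double apply
	fmt.Println(acked, dup)        // [abc123] [abc123]
}
```

Because retries land in `already_exists` rather than failing, the primary's worker can resend a batch after a timeout without risking duplicate application.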