Replication Design — Active-Passive with Async Sync
Overview
The primary POP (e.g., Calgary) replicates every write to the backup POP (e.g., Zurich). The backup serves read-only traffic if the primary fails.
Key Principles
- Primary owns writes — Backup never accepts mutations from clients
- Same wire format — Replicate the exact request payload (not re-encoded)
- Async, non-blocking — Primary doesn't wait for backup ACK (queue + retry)
- Dirty tracking per entry — Each entry has `replicated_at` and a dirty flag
- Read failover only — Clients read from the backup if the primary is down, but writes fail
Architecture
On Primary (Calgary)
Client Request → Primary Handler
↓
[1] Apply to local DB
[2] Queue for replication (async)
[3] Return success to client (don't wait for backup)
↓
Replication Worker (background)
↓
POST to Backup /api/replication/apply
Queue Structure:
type ReplicationTask struct {
    EntryID    int64
    RawPayload []byte // Original request body (encrypted blob)
    Method     string // POST/PUT/DELETE
    Timestamp  int64  // When primary applied
    RetryCount int
    Dirty      bool   // true = not yet confirmed by backup
}
Per-Entry Status (in entries table):
replicated_at INTEGER, -- NULL = never replicated, timestamp = last confirmation
replication_dirty BOOLEAN -- true = pending replication, false = synced
On Backup (Zurich)
POST /api/replication/apply
↓
Validate: Is this from an authorized primary POP? (mTLS or shared secret)
↓
Apply to local DB (exact same data, including encrypted blobs)
↓
Return 200 ACK
Backup rejects client writes:
if isClientRequest && isWriteOperation {
    http.Error(w, "Write operations not available on backup POP", http.StatusServiceUnavailable)
    return
}
Failure Scenarios
1. Backup Unavailable (Primary Still Up)
- Primary queues replication tasks (in-memory + SQLite for persistence)
- Retries with exponential backoff
- Marks entries as `dirty=true`
- Client operations continue normally
- When backup comes back: bulk sync dirty entries
2. Primary Fails (Backup Becomes Active)
- DNS/healthcheck detects primary down
- Clients routed to backup
- Backup serves reads only
- Writes return 503 with header: `X-Primary-Location: https://calgary.clavitor.ai`
- Manual intervention required to promote backup to primary
3. Split Brain (Both Think They're Primary)
- Prevented by design: Only one POP has "primary" role in control plane
- Backup refuses writes from clients
- If control plane fails: manual failover only
Replication Endpoint (Backup)
POST /api/replication/apply
Authorization: Bearer {inter-pop-token}
Content-Type: application/json
{
  "source_pop": "calgary-01",
  "entries": [
    {
      "entry_id": "abc123",
      "operation": "create", // or "update", "delete"
      "encrypted_data": "base64...",
      "timestamp": 1743556800
    }
  ]
}
Response:
{
"acknowledged": ["abc123"],
"failed": [],
"already_exists": [] // For conflict detection
}
Audit Log Handling
Primary: Logs all operations normally.
Backup: Logs its own operations (replication applies) but not client operations.
// On backup, when applying replication:
lib.AuditLog(db, &lib.AuditEvent{
    Action:  lib.ActionReplicated, // Special action type
    EntryID: entryID,
    Title:   "replicated from " + sourcePOP,
    Actor:   "system:replication",
})
Client Failover Behavior
// Client detects primary down (connection timeout, 503, etc.)
// Automatically tries backup POP
// On backup:
GET /api/entries/123 // ✅ Allowed
PUT /api/entries/123 // ❌ 503 + X-Primary-Location header
Improvements Over Original Design
| Original Proposal | Improved |
|---|---|
| Batch polling every 30s | Real-time async queue — faster, lower lag |
| Store just timestamp | Add dirty flag — faster recovery, less scanning |
| Replica rejects all client traffic | Read-only allowed — true failover capability |
| Single replication target | Primary + Backup concept — clearer roles |
Database Schema Addition
ALTER TABLE entries ADD COLUMN replicated_at INTEGER; -- NULL = never
ALTER TABLE entries ADD COLUMN replication_dirty BOOLEAN DEFAULT 0;
-- Index for fast "dirty" lookup
CREATE INDEX idx_entries_dirty ON entries(replication_dirty) WHERE replication_dirty = 1;
Code Structure
Commercial-only files:
edition/
├── replication.go # Core replication logic (queue, worker)
├── replication_queue.go # SQLite-backed persistent queue
├── replication_client.go # HTTP client to backup POP
└── replication_handler.go # Backup's /api/replication/apply handler
Modified:
api/handlers.go # Check if backup mode, reject writes
api/middleware.go # Detect if backup POP, set context flag
Security Considerations
- Inter-POP Auth: mTLS or shared bearer token (rotated daily)
- Source Validation: Backup verifies primary is authorized in control plane
- No Cascade: Backup never replicates to another backup (prevent loops)
- Idempotency: Replication operations are idempotent (safe to retry)
Metrics to Track
- `replication_lag_seconds` — Time between primary apply and backup ACK
- `replication_queue_depth` — Number of pending entries
- `replication_failures_total` — Failed replication attempts
- `replication_fallback_reads` — Client reads served from backup
Johan's Design + These Refinements = Production Ready