
Clavitor — Global Infrastructure Plan

Created 2026-03-01 · Updated 2026-03-02


Strategy: Ontmoedigende Voorsprong (a Discouraging Head Start)

The AI agent market is exploding. OpenClaw is commonplace; Claude saw 500K App Store downloads in a single day. Every developer running agents has a credential problem, and nobody is solving it with field-level AI visibility and two-tier encryption.

The goal is to present Clavitor as the global standard from day one. When a competitor evaluates the space, they should see infrastructure everywhere and conclude there is nothing left to win. Like Google in search. The map matters more than the capacity: 16 nodes across 6 continents signals infrastructure, not a side project.

Each node runs a single Go binary + SQLite. Minimal resource requirements. Nodes are independent — no replication between regions. A user's vault lives on one instance. Scale up individual nodes only when demand justifies it.

Budget: $100/month. Deploy mini-nodes everywhere; spend the remainder upgrading whichever nodes see traction.


Deployment Map

Existing Infrastructure (Hostkey)

| City | Provider | Role | Cost |
|------|----------|------|------|
| Amsterdam | Hostkey | EU West, Benelux, Nordics | existing |
| Zurich | Hostkey | SOC hub; Switzerland, DACH backup | existing |
| Dubai | Hostkey | Gulf states, Middle East | ~$5-8/mo |

New Infrastructure (Vultr)

All Vultr nodes: VX1 tier — 1 vCPU, 512 MB RAM, 10 GB SSD, 0.5 TB bandwidth @ $2.50/mo.

| # | City | Region | Covers |
|---|------|--------|--------|
| 1 | New Jersey | US East | East coast, finance, enterprise |
| 2 | Silicon Valley | US West | Startups, AI companies |
| 3 | Dallas | US Central | Middle US, gaming corridor |
| 4 | London | UK | UK dev market |
| 5 | Frankfurt | EU Central | DACH, central Europe |
| 6 | Warsaw | EU East | Eastern Europe, Balkans, Turkey corridor |
| 7 | Tokyo | Asia East | Japan, China-facing (southern) |
| 8 | Seoul | Asia East | Korea, China-facing (northern) |
| 9 | Mumbai | South Asia | India (1.4B people) |
| 10 | São Paulo | LATAM | South America |
| 11 | Sydney | Oceania | Australia, New Zealand |
| 12 | Johannesburg | Africa | Africa (nobody else is there) |
| 13 | Tel Aviv | Middle East | Eastern Mediterranean, Israel |

Note: Hostkey covers Netherlands, Germany, Finland, Iceland, UK, Turkey, USA, Dubai, Israel — significant overlap with Vultr. Consider consolidating to Hostkey where they offer competitive mini-VPS pricing (existing relationship, single invoice). Vultr only for gaps: Tokyo, Seoul, Mumbai, São Paulo, Sydney, Johannesburg, Warsaw, Dallas, Silicon Valley.


Coverage Summary

  • 16 nodes across 6 continents
  • Every major economic zone covered
  • No major population center more than ~100 ms from a Clavitor node

Cost

| Provider | Nodes | Monthly |
|----------|-------|---------|
| Hostkey (existing) | 2 | $0 (already paid) |
| Hostkey (Dubai) | 1 | ~$5-8 |
| Vultr | 13 | $32.50 |
| Total | 16 | ~$40/mo |

Remaining ~$60/mo reserved for upgrading nodes that see traction.

Upgrade Path

When a node outgrows the $2.50 tier:

  1. $6/mo — 1 vCPU, 1 GB RAM, 25 GB SSD, 1 TB bandwidth
  2. $12/mo — 2 vCPU, 2 GB RAM, 50 GB SSD, 2 TB bandwidth
  3. Beyond: evaluate dedicated or move to Hostkey dedicated in that region

Node Stack

OS: NixOS

No Ubuntu. No Alpine. NixOS makes every node a deterministic clone of a single config file in the repo.

  • Declarative: One configuration.nix defines the entire node — OS, packages, services, firewall, users, TLS. Checked into the clavitor repo. Every node identical by definition.
  • Reproducible: No drift. The system IS the config.
  • Rollback: Atomic upgrades. nixos-rebuild switch --rollback instantly restores previous state.
  • Agent-friendly: OC pushes config, runs one command. Node converges or doesn't. No imperative state tracking.
  • Hostile to attackers: Read-only filesystem, no stray tooling, no package manager that behaves the way an attacker expects. Break in and you find a single Go binary, an encrypted SQLite file, and nothing else. L2 fields cannot be decrypted — the key doesn't exist on the server.

Footprint: ~500 MB disk, ~60 MB RAM idle. On a 10 GB / 512 MB box, that leaves plenty of room.

Deploy via nixos-infect (converts fresh Debian on Vultr to NixOS in-place).

Nix store maintenance: keep 2 generations max, periodic nix-collect-garbage. Each rebuild barely adds to the store — it's one Go binary + minimal system packages. Non-issue on these nodes.
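
This policy might be expressed in configuration.nix roughly as follows; the nix.gc options are standard NixOS, but the schedule and retention window here are assumptions, not settled policy:

```nix
# Sketch: automatic Nix store garbage collection (schedule and retention
# are illustrative assumptions).
nix.gc = {
  automatic = true;
  dates = "weekly";
  options = "--delete-older-than 14d";  # drops old generations before collecting
};

# Cap bootloader entries so stale generations don't accumulate in the menu.
boot.loader.grub.configurationLimit = 2;
```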

No Caddy. No Cloudflare Proxy.

Clavitor is a password vault. Routing all traffic through a third-party proxy (Cloudflare) defeats the trust model. Cloudflare DNS only, no proxying.

Caddy was considered for TLS termination, but Go's built-in autocert (golang.org/x/crypto/acme/autocert) handles Let's Encrypt natively — about 10 lines of code. This eliminates Caddy entirely (~40 MB binary, ~30-40 MB RAM). The Go binary terminates TLS itself.

Why this works (and won't fail like at Kaseya): 16 hostnames, all under *.clavitor.com — we control the DNS. No customer domains, no proxy chains, no Windows cert stores. Let's Encrypt's rate limit is 50 certificates per registered domain per week; we need 16. Renewal is automatic, in-process, starting 30 days before expiry.

Stack Per Node

[Vultr/Hostkey VPS — NixOS]
    |
sshd (WireGuard only — no public port 22)
    |
clavitor binary (Go, ~15 MB, :80 + :443)
    |
clavitor.db (SQLite + WAL)

Two processes. Nothing else installed, nothing else running.


Network & Access

WireGuard Hub-and-Spoke

All management access via WireGuard mesh. No public SSH port on any node.

  • Hub: Zurich (SOC) — 10.84.0.1/24, listens on UDP 51820
  • Spokes: All 15 other nodes — unique 10.84.0.x address, initiate connection to hub
  • SSH binds to WireGuard interface only (10.84.0.x:22)
  • Public internet sees only ports 80 and 443

Spoke NixOS config (example for Tokyo):

networking.wireguard.interfaces.wg0 = {
  ips = [ "10.84.0.2/24" ];
  privateKeyFile = "/etc/wireguard/private.key";
  peers = [{
    publicKey = "zurich-pub-key...";
    allowedIPs = [ "10.84.0.0/24" ];
    endpoint = "zurich.clavitor.com:51820";
    persistentKeepalive = 25;
  }];
};

services.openssh = {
  enable = true;
  listenAddresses = [{ addr = "10.84.0.2"; port = 22; }];
};

Overhead: WireGuard is a kernel module. Zero processes, ~0 MB RAM.

Key management: 16 key pairs (one per node), generated by wg genkey. Add/remove node = update Zurich peer list + rebuild. Five minutes.

Break-Glass SSH

If Zurich is completely down, SOC loses WireGuard access to all nodes. Nodes keep serving customers (public 443 still works), but no management access.

Break-glass: An emergency SSH key with access restricted to the jongsma.me IP. Disabled in normal operation — enable it via the Vultr/Hostkey console if Zurich is unrecoverable.


Telemetry & Monitoring

Push-to-Kuma Model

Nodes push status to Kuma (running in Zurich/SOC). No inbound metrics ports. No scraping. No Prometheus. No node_exporter.

clavitor (every 30s) ──POST──> https://kuma.zurich/api/push/xxxxx
                                    |
                              missing 2 posts = SEV2
                              missing ~5 min  = SEV1

The Clavitor binary reads its own /proc/meminfo, /proc/loadavg, and disk stats — trivial in Go (runtime.MemStats plus a few file reads) — and pushes JSON to Kuma. No extra software on the node.
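
A sketch of that loop, assuming a Kuma push-token URL; only the RAM fields are shown, parsed straight from /proc/meminfo, with names matching the payload format documented below:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
	"strconv"
	"strings"
	"time"
)

// memStats parses MemTotal/MemAvailable (in kB) out of /proc/meminfo text.
func memStats(meminfo string) (usedMB int, usedPct float64) {
	kb := map[string]int{}
	for _, line := range strings.Split(meminfo, "\n") {
		f := strings.Fields(line)
		if len(f) >= 2 {
			n, _ := strconv.Atoi(f[1])
			kb[strings.TrimSuffix(f[0], ":")] = n
		}
	}
	total, avail := kb["MemTotal"], kb["MemAvailable"]
	if total == 0 {
		return 0, 0
	}
	return (total - avail) / 1024, float64(total-avail) / float64(total) * 100
}

// pushMetrics POSTs one heartbeat to the Kuma push endpoint.
func pushMetrics(url, node string) error {
	raw, err := os.ReadFile("/proc/meminfo")
	if err != nil {
		return err
	}
	ramMB, ramPct := memStats(string(raw))
	payload, _ := json.Marshal(map[string]any{
		"node": node, "ts": time.Now().Unix(),
		"ram_mb": ramMB, "ram_pct": ramPct,
	})
	resp, err := http.Post(url, "application/json", bytes.NewReader(payload))
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

func main() {
	mb, pct := memStats("MemTotal:  524288 kB\nMemAvailable:  393216 kB\n")
	fmt.Println(mb, pct) // 128 25
}
```

A time.Ticker in the main binary (or a systemd timer) would call pushMetrics every 30 s; Kuma flags the monitor down when pushes stop arriving.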

Metrics Payload

{
  "node": "tokyo",
  "ts": 1709312400,
  "ram_mb": 142,
  "ram_pct": 27.7,
  "disk_mb": 3200,
  "disk_pct": 31.2,
  "cpu_pct": 2.1,
  "db_size_mb": 12,
  "db_integrity": "ok",
  "active_sessions": 3,
  "req_1h": 847,
  "err_1h": 2,
  "cert_days_remaining": 62,
  "nix_gen": 2,
  "uptime_s": 864000
}

Key metric: cert_days_remaining. If autocert silently fails renewal, this trends toward zero — visible before expiry.


SOC Operations (Zurich)

Zurich is the SOC hub. Kuma runs here. WireGuard hub here. All management flows through Zurich.

Routine (automated/scheduled)

  • Kuma monitoring: Push monitors for all 16 nodes, SEV2/SEV1 escalation on missed heartbeats
  • NixOS updates: Weekly nixos-rebuild switch across all nodes via WireGuard SSH
  • Nix garbage collection: Weekly, keep 2 generations max
  • SQLite integrity: Periodic PRAGMA integrity_check on vault DBs
  • Cert monitoring: Watch cert_days_remaining in Kuma payload

Reactive (on alert)

  • Node down: Check Kuma, SSH via WireGuard, diagnose. If unrecoverable, reprovision from configuration.nix.
  • Disk pressure: Nix garbage collection, check DB growth, upgrade tier if needed
  • Anomaly detection: Unusual API patterns (credential stuffing, brute force) visible in err_1h metric
  • Binary deploy: Push new Clavitor binary across nodes, rolling deploy, verify health after each

Deployment

  • Node config is single configuration.nix per node (templated), checked into clavitor repo

  • Go binary cross-compiled with CGO_ENABLED=0 (pure-Go SQLite driver) or as a static musl build; either way the binary is static, since dynamically linked foreign binaries break on NixOS's non-FHS library paths

  • Deploy: SCP binary + push config via WireGuard SSH, nixos-rebuild switch — atomic, rollback on failure

  • DNS-level failover: route away from unhealthy nodes


Gaps and Future Considerations

  • Istanbul/Dubai: Vultr has no Turkey or Gulf presence. Warsaw (~30ms to Istanbul) and Tel Aviv (~40ms) cover the gap. Hostkey Dubai covers the Gulf. Hostkey also has Istanbul directly.
  • China mainland: Requires ICP license + Chinese entity. Tokyo and Seoul serve as Phase 1 proxies. Evaluate Alibaba Cloud for Phase 2 if Chinese demand materializes.
  • Canada: Toronto available on Vultr if needed. Currently served by New Jersey + Silicon Valley.
  • Mexico/Central America: Mexico City available on Vultr. Currently served by Dallas + São Paulo.
  • Provider consolidation: Hostkey covers NL, DE, FI, IS, UK, TR, US, UAE, IL — check mini-VPS pricing with account manager. Could reduce Vultr dependency to ~9 nodes (Asia, LATAM, Africa, Oceania, US interior).

Node Configuration Template

Minimal configuration.nix for a Clavitor node:

{ config, pkgs, ... }:
{
  # WireGuard — management network
  networking.wireguard.interfaces.wg0 = {
    ips = [ "10.84.0.NODE_ID/24" ];
    privateKeyFile = "/etc/wireguard/private.key";
    peers = [{
      publicKey = "ZURICH_PUB_KEY";
      allowedIPs = [ "10.84.0.0/24" ];
      endpoint = "zurich.clavitor.com:51820";
      persistentKeepalive = 25;
    }];
  };

  # SSH — WireGuard only
  services.openssh = {
    enable = true;
    listenAddresses = [{ addr = "10.84.0.NODE_ID"; port = 22; }];
    settings.PasswordAuthentication = false;
  };

  # Clavitor
  systemd.services.clavitor = {
    description = "Clavitor";
    after = [ "network-online.target" ];
    wants = [ "network-online.target" ];
    wantedBy = [ "multi-user.target" ];
    serviceConfig = {
      ExecStart = "/opt/clavitor/clavitor";
      Restart = "always";
      RestartSec = 5;
      EnvironmentFile = "/etc/clavitor/env";
    };
  };

  # Firewall — public: 80+443 only. WireGuard is outbound-only on spokes
  # (the spoke initiates to Zurich), so no inbound UDP 51820 rule is needed.
  networking.firewall = {
    enable = true;
    allowedTCPPorts = [ 80 443 ];
    # SSH not in allowedTCPPorts — only reachable via WireGuard
  };
}

Clavitor — the vault that knows who it's talking to.