Disaster Recovery Plan

Last updated: 2026-02-04
Owner: James


Infrastructure Overview

| Component | Host | Purpose |
|---|---|---|
| forge | 192.168.1.16 | Primary server: OpenClaw gateway, all services |
| Zurich VPS | 82.22.36.202 | Git repos, Uptime Kuma, security scanning |
| 192.168.1.253 | LAN | Docker services (Immich, ClickHouse, Jellyfin, Signal, qBittorrent) |
| 192.168.1.252 | LAN | Home Assistant OS |
| Caddy | 192.168.0.2 | Reverse proxy (james.jongsma.me, inou.com) |

Backup Strategy

Tier 1: Git-backed (automated)

All source code is pushed to git@zurich.inou.com:<repo>.git. Hourly audit (scripts/git-audit.sh) checks for anomalies.
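The contents of scripts/git-audit.sh are not reproduced in this document. As a rough sketch of the kind of anomaly such an audit could flag (uncommitted changes, commits not yet pushed), assuming each repo tracks an upstream branch; the function name audit_repo and the status words are hypothetical:

```shell
#!/bin/sh
# Hypothetical sketch; NOT the actual scripts/git-audit.sh.
# Prints one status word for a repo directory:
# MISSING, NOT_A_REPO, DIRTY, UNPUSHED, or OK.
audit_repo() {
    dir=$1
    [ -d "$dir" ] || { echo "MISSING"; return 0; }
    [ -e "$dir/.git" ] || { echo "NOT_A_REPO"; return 0; }
    # Uncommitted or untracked changes in the working tree.
    if [ -n "$(git -C "$dir" status --porcelain)" ]; then
        echo "DIRTY"; return 0
    fi
    # Commits on the current branch that the upstream does not have yet.
    # Falls back to 0 when no upstream is configured.
    ahead=$(git -C "$dir" rev-list --count '@{u}..HEAD' 2>/dev/null || echo 0)
    if [ "${ahead:-0}" -gt 0 ]; then
        echo "UNPUSHED"; return 0
    fi
    echo "OK"
}
```

A possible invocation over the working copies: `for d in ~/dev/*/; do printf '%s %s\n' "$(audit_repo "$d")" "$d"; done`.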

Repos (as of 2026-02-04):

  • inou, azure-backup, james-dashboard, mail-bridge, mail-agent
  • inou-mobile, clawdnode-android, clawdnode-debug-server, clawdnode-gateway
  • message-bridge, messaging-center, docman, docsys, docs
  • moltmobile-android, moltmobile-gateway, screenshot-server
  • docproc, clawd (workspace)

Recovery: git clone git@zurich.inou.com:<repo>.git

Tier 2: Configuration (documented, manually recoverable)

These items can't be git-tracked but are documented here for recovery.

Signal CLI (bot number: +31634481877)

  • Data: ~/.clawdbot/tools/signal-cli/ (linked device keys)
  • Recovery: Re-link using QR code from primary device. Takes ~2 minutes.
  • Impact: Bot is offline until re-linked. No critical data loss: conversation history remains on the primary device (Signal servers only queue undelivered messages; they do not store history).
  • Note: signal-cli's version metadata and trust store are rebuilt automatically on first run.
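The re-link step can be sketched as follows, assuming signal-cli and qrencode are installed. `signal-cli link` prints a device-link URI, which is rendered here as a terminal QR code for the primary device to scan. The device name forge-bot and the helper function names are placeholders, not values taken from the existing setup.

```shell
#!/bin/sh
# Sketch: render the signal-cli link URI as a terminal QR code.
# Assumes signal-cli and qrencode are installed; "forge-bot" is a placeholder.

# Draw one URI as a QR code in the terminal.
show_qr() {
    qrencode -t ANSIUTF8 "$1"
}

# Read signal-cli output line by line: draw link URIs, pass other lines through.
render_link_output() {
    while IFS= read -r line; do
        case "$line" in
            sgnl://*|tsdevice:*) show_qr "$line" ;;
            *) printf '%s\n' "$line" ;;
        esac
    done
}

# Start linking as a secondary device and show the QR code.
relink_signal() {
    signal-cli link -n "${1:-forge-bot}" | render_link_output
}
```

Running `relink_signal` and scanning the code from the primary device's linked-devices screen should complete the link.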

WhatsApp (message-bridge, linked to +1 727 225 2475)

  • Data: ~/.message-bridge/whatsapp.db (session keys + media refs)
  • Recovery: Delete whatsapp.db, restart message-bridge, scan new QR code from Johan's phone.
  • Impact: Bot offline until re-linked. Historical messages in WhatsApp, not in our DB.
  • Note: QR code available at http://localhost:8030/qr?format=png after restart.
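A minimal sketch of the WhatsApp re-link sequence. Moving the database aside rather than deleting it is a cautious variation on the documented step, not a requirement. The 3-second startup delay and the output path /tmp/whatsapp-qr.png are assumptions; the unit name message-bridge is taken from the service table in this document.

```shell
#!/bin/sh
# Sketch of the WhatsApp re-link steps (assumed delay and output path).

# Move a file to <path>.bak.<UTC timestamp> and print the new name.
# Does nothing if the file is absent.
move_aside() {
    src=$1
    [ -e "$src" ] || return 0
    dst="$src.bak.$(date -u +%Y%m%dT%H%M%SZ)"
    mv -- "$src" "$dst" && printf '%s\n' "$dst"
}

# Reset the session, restart the bridge, and download the new QR code.
relink_whatsapp() {
    move_aside "$HOME/.message-bridge/whatsapp.db"
    systemctl --user restart message-bridge
    sleep 3   # assumed startup delay; adjust as needed
    curl -fsS -o "${1:-/tmp/whatsapp-qr.png}" 'http://localhost:8030/qr?format=png'
}
```

Once the new session works, the `.bak.*` file can be deleted.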

Proton Mail Bridge

  • Data: ~/.config/protonmail/bridge-v3/ (account link, encryption keys)
  • Recovery:
    1. apt install protonmail-bridge (or download from proton.me)
    2. Set keychain: echo '{"preferred_keychain": "pass"}' > ~/.config/protonmail/bridge-v3/prefs.json
    3. Run protonmail-bridge --cli, login with tj@jongsma.me credentials
    4. Note new bridge password, update mail-bridge config
  • Impact: Email processing offline until re-linked. ~10 min recovery.
  • Credentials: Account password with Johan. Bridge password regenerated on setup.

Tier 3: Data (critical, needs backup solution)

| Data | Path | Size | Backup Status |
|---|---|---|---|
| Sophia's documents | ~/sophia/ | 9.2 GB | ⚠️ SINGLE COPY, needs offsite backup |
| Document store | ~/documents/ | 7.9 MB | In git (docsys repo) for records, PDFs local only |
| GLM-OCR model | ~/models/glm-ocr/ | 2.5 GB | Re-downloadable from HuggingFace |
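No backup solution has been chosen for ~/sophia/ yet. One possible shape, sketched under assumptions: a checksum manifest so a restored copy can be verified, plus an rsync push over SSH. The remote destination is a hypothetical argument, and rsync is only one of several workable tools.

```shell
#!/bin/sh
# Sketch only: checksum manifest plus rsync push for an offsite copy.
# The remote destination passed to push_offsite is hypothetical.

# Print "sha256  ./relative/path" lines for every file under a directory.
build_manifest() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2 )
}

# Copy ~/sophia/ to a remote rsync destination (archive mode, resumable).
push_offsite() {
    rsync -a --partial "$HOME/sophia/" "$1"
}
```

For example, `build_manifest ~/sophia > ~/sophia.sha256` before a push, and later `cd ~/sophia && sha256sum -c ~/sophia.sha256` on a restored copy to confirm nothing was corrupted.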

Tier 4: Rebuildable (no backup needed)

| Component | Recovery |
|---|---|
| Python venvs (ocr-env/, .venv/) | pip install -r requirements.txt |
| Node modules | npm install |
| Flutter SDK | Re-download |
| Docker images on 253 | docker compose pull |
| OC session transcripts | Nice-to-have, not critical |

Service Recovery Procedures

Full Server Loss (forge)

Prerequisites: SSH key authorized on Zurich VPS, new Ubuntu 24.04 server.

  1. OS Setup:

    apt update && apt upgrade
    adduser johan
    # Install: git, go, node, python3, docker (if needed)
    
  2. SSH Keys:

    • Generate new: ssh-keygen -t ed25519
    • Authorize on Zurich: ssh root@zurich.inou.com, then append the new public key to /home/git/.ssh/authorized_keys

  3. Clone all repos:

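    mkdir -p ~/dev && cd ~/dev
    # Every repo from the Tier 1 inventory except clawd, which is cloned separately below
    for repo in inou azure-backup james-dashboard mail-bridge mail-agent \
      inou-mobile clawdnode-android clawdnode-debug-server clawdnode-gateway \
      message-bridge messaging-center docman docsys docs \
      moltmobile-android moltmobile-gateway screenshot-server docproc; do
      git clone "git@zurich.inou.com:$repo.git"
    done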
    git clone git@zurich.inou.com:clawd.git ~/clawd
    
  4. Install OpenClaw:

    npm install -g openclaw
    openclaw init
    # Restore gateway config from clawd/config-backups/ or memory
    
  5. Restore services (see systemd units below)

  6. Re-link integrations:

    • Signal CLI: QR code link
    • WhatsApp: QR code link
    • Proton Bridge: CLI login
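After the steps above, a quick check can confirm that every repo in the Tier 1 inventory was actually cloned. This is a minimal sketch; the function name missing_repos is hypothetical, and clawd is left out because it is cloned to ~/clawd rather than ~/dev.

```shell
#!/bin/sh
# Sketch: list Tier 1 repos that are absent under a base directory.
# clawd is excluded because it lives in ~/clawd, not ~/dev.
EXPECTED_REPOS="inou azure-backup james-dashboard mail-bridge mail-agent
inou-mobile clawdnode-android clawdnode-debug-server clawdnode-gateway
message-bridge messaging-center docman docsys docs
moltmobile-android moltmobile-gateway screenshot-server docproc"

# Print the name of each expected repo with no .git directory under $1.
missing_repos() {
    base=$1
    for repo in $EXPECTED_REPOS; do
        [ -d "$base/$repo/.git" ] || printf '%s\n' "$repo"
    done
}
```

`missing_repos ~/dev` prints nothing when the clone step completed for every repo.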

Systemd Service Units

All services run as user units (systemctl --user).

| Service | Binary/Command | Port | Working Dir |
|---|---|---|---|
| openclaw-gateway | node openclaw gateway | 18789 | |
| signal-cli | signal-cli daemon --http 0.0.0.0:8080 | 8080 | |
| protonmail-bridge | protonmail-bridge --noninteractive | 1143/1025 | |
| mail-bridge | message-center -config config.yaml | 8025 | ~/dev/mail-bridge |
| message-bridge | message-bridge | 8030 | ~/dev/message-bridge |
| james-dashboard | james-dashboard --dir . | 9200 | ~/dev/james-dashboard |
| ocr-service | python server.py | 8090 | ~/ocr-service |
| docsys | docsys | | ~/dev/docsys |

Unit files location: ~/.config/systemd/user/

Environment files:

  • ~/.config/message-center.env (mail-bridge credentials)
  • OpenClaw gateway env vars in unit file (API keys, tokens)
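Until the unit files are tracked in git, a unit can be rebuilt from the table values. The sketch below emits a user unit for mail-bridge. It assumes the message-center binary sits in the working directory and that Restart=on-failure is wanted; neither is confirmed by this document. The function name unit_for is hypothetical.

```shell
#!/bin/sh
# Sketch: print a systemd user unit built from the service table values.
# Assumptions (not confirmed here): binary location, Restart policy.
# Arguments: name, ExecStart line, working directory, optional env file.
unit_for() {
    name=$1 exec_start=$2 workdir=$3 envfile=$4
    cat <<EOF
[Unit]
Description=$name

[Service]
WorkingDirectory=$workdir
ExecStart=$exec_start
${envfile:+EnvironmentFile=$envfile}
Restart=on-failure

[Install]
WantedBy=default.target
EOF
}

# Example (%h is the systemd specifier for the user's home directory):
#   unit_for mail-bridge \
#     '%h/dev/mail-bridge/message-center -config config.yaml' \
#     '%h/dev/mail-bridge' '%h/.config/message-center.env' \
#     > ~/.config/systemd/user/mail-bridge.service
#   systemctl --user daemon-reload
#   systemctl --user enable --now mail-bridge
```

User units only start at boot without an active login session if lingering is enabled for the account, for example with `loginctl enable-linger johan`.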

Zurich VPS Loss

  1. Provision new VPS
  2. Install git, create git user with git-shell
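  3. Create an empty bare repo for each project under /home/git, then push all repos from forge (with Zurich gone, the working copies on forge are the authoritative copies)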
  4. Reinstall Uptime Kuma, Caddy, nuclei
  5. Update DNS if IP changes
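Steps 2 and 3 can be sketched as below, assuming a Debian or Ubuntu VPS and that repos live at /home/git/<repo>.git, which matches the clone URLs used in this document. The function names are hypothetical. create_git_user and init_bare run as root on the new VPS; repush runs on forge.

```shell
#!/bin/sh
# Sketch of the git-hosting rebuild. Function names are hypothetical.

# On the VPS (as root): create the git user, restricted to git-shell.
create_git_user() {
    adduser --disabled-password --gecos "" \
        --shell "$(command -v git-shell)" git
}

# On the VPS (as root): create an empty bare repo owned by the git user.
# A plain git-over-SSH host does not create repos on first push.
init_bare() {
    sudo -u git git init --bare "/home/git/$1.git"
}

# On forge: push every branch and every tag of one working copy to origin.
repush() {
    git -C "$1" push --all origin && git -C "$1" push --tags origin
}
```

On forge, something like `for d in ~/dev/*/ ~/clawd/; do repush "$d"; done` would then repopulate the new server once the bare repos exist and forge's key is in /home/git/.ssh/authorized_keys.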

Monitoring

| Check | Frequency | Tool |
|---|---|---|
| Service health | Every heartbeat | scripts/service-health.sh |
| Git audit | Hourly (:30) | scripts/git-audit.sh via cron |
| Claude usage | Hourly (:00) | scripts/claude-usage-check.sh via cron |
| Nuclei security scan | Monthly | Cron from Zurich |
| Docker updates (253) | Weekly (Sunday) | Heartbeat task |
| HAOS updates | Weekly (Sunday) | Heartbeat task |
| Uptime Kuma | Continuous | https://zurich.inou.com:3001 |
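The contents of scripts/service-health.sh are not shown in this document. A health check over the user units listed earlier could look roughly like this; the function name failed_services is hypothetical and the unit names are taken from the service table.

```shell
#!/bin/sh
# Hypothetical sketch; NOT the actual scripts/service-health.sh.
SERVICES="openclaw-gateway signal-cli protonmail-bridge mail-bridge
message-bridge james-dashboard ocr-service docsys"

# Print the name of every user unit that is not active.
# Returns 1 if any unit is down, 0 otherwise.
failed_services() {
    rc=0
    for svc in $SERVICES; do
        if ! systemctl --user is-active --quiet "$svc"; then
            printf '%s\n' "$svc"
            rc=1
        fi
    done
    return $rc
}
```

The non-zero return status makes it easy to chain an alert, for example `failed_services || echo "services down"`.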

Open Items

  • Sophia docs backup — 9.2 GB, single copy. Needs offsite (Proton Drive, Zurich, or both)
  • Systemd unit backup — Track in git (docs repo or clawd)
  • Automated config snapshots — Gateway config, env files
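For the systemd unit backup item, one low-effort approach is to copy the user unit files into a git-tracked directory and commit them. This is a sketch under assumptions: the destination directory is a hypothetical argument, and unit files that embed API keys or tokens would need scrubbing or encryption before being committed.

```shell
#!/bin/sh
# Sketch: copy user unit files into a git-tracked directory.
# Caution: units that embed API keys or tokens should be scrubbed or
# encrypted before they are committed.
snapshot_units() {
    src="$HOME/.config/systemd/user"
    dest=$1
    mkdir -p "$dest" || return 1
    for f in "$src"/*.service; do
        # Skip the unexpanded glob when no unit files exist.
        [ -f "$f" ] && cp -p -- "$f" "$dest/"
    done
    return 0
}
```

For example, `snapshot_units ~/clawd/config-backups/systemd` followed by a normal `git add` and `git commit` in the clawd repo.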