# Disaster Recovery Plan

*Last updated: 2026-02-04*
*Owner: James ⚡*

---

## Infrastructure Overview

| Component | Host | Purpose |
|---|---|---|
| **forge** | 192.168.1.16 | Primary server: OpenClaw gateway, all services |
| **Zurich VPS** | 82.22.36.202 | Git repos, Uptime Kuma, security scanning |
| **192.168.1.253** | LAN | Docker services (Immich, ClickHouse, Jellyfin, Signal, qBittorrent) |
| **192.168.1.252** | LAN | Home Assistant OS |
| **Caddy** | 192.168.0.2 | Reverse proxy (james.jongsma.me, inou.com) |

---

## Backup Strategy

### Tier 1: Git-backed (automated)

All source code is pushed to `git@zurich.inou.com:<repo>.git`. Hourly audit (`scripts/git-audit.sh`) checks for anomalies.

**Repos (as of 2026-02-04):**
- inou, azure-backup, james-dashboard, mail-bridge, mail-agent
- inou-mobile, clawdnode-android, clawdnode-debug-server, clawdnode-gateway
- message-bridge, messaging-center, docman, docsys, docs
- moltmobile-android, moltmobile-gateway, screenshot-server
- docproc, clawd (workspace)

**Recovery:** `git clone git@zurich.inou.com:<repo>.git`

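
The document does not say which anomalies `scripts/git-audit.sh` looks for. As a minimal sketch of one check such an audit could perform, the hypothetical `repo_status` function below classifies a working copy as `dirty` (uncommitted changes), `no-upstream`, `unpushed`, or `clean`. It illustrates the idea under those assumptions and is not the contents of the real script.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a one-word status for a git working copy.
repo_status() {
  local dir="$1"
  # Bail out early if the path is not a git repository.
  if ! git -C "$dir" rev-parse --git-dir >/dev/null 2>&1; then
    echo "not-a-repo"
    return 1
  fi
  # Any output from --porcelain means modified or untracked files.
  if [ -n "$(git -C "$dir" status --porcelain)" ]; then
    echo "dirty"
    return 0
  fi
  # Without an upstream branch there is nothing to compare against.
  if ! git -C "$dir" rev-parse --abbrev-ref '@{upstream}' >/dev/null 2>&1; then
    echo "no-upstream"
    return 0
  fi
  # Count commits that exist locally but not on the upstream branch.
  if [ "$(git -C "$dir" rev-list --count '@{upstream}..HEAD')" -gt 0 ]; then
    echo "unpushed"
  else
    echo "clean"
  fi
}

# Example use (assumes the repos live under ~/dev):
#   for d in ~/dev/*/; do printf '%s\t%s\n' "$(repo_status "$d")" "$d"; done
```

Anything other than `clean` means the copy on Zurich may be behind the working copy on forge.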
### Tier 2: Configuration (documented, manually recoverable)

These items can't be git-tracked but are documented here for recovery.

#### Signal CLI (bot number: +31634481877)
- **Data:** `~/.clawdbot/tools/signal-cli/` (linked device keys)
- **Recovery:** Re-link using QR code from primary device. Takes ~2 minutes.
- **Impact:** Bot is offline until re-linked. No data loss: message history stays on the primary device (Signal does not keep history on its servers).
- **Note:** Signal CLI version and trust store rebuild automatically on first run.

#### WhatsApp (message-bridge, linked to +1 727 225 2475)
- **Data:** `~/.message-bridge/whatsapp.db` (session keys + media refs)
- **Recovery:** Delete `whatsapp.db`, restart message-bridge, scan new QR code from Johan's phone.
- **Impact:** Bot offline until re-linked. Historical messages in WhatsApp, not in our DB.
- **Note:** QR code available at `http://localhost:8030/qr?format=png` after restart.

#### Proton Mail Bridge
- **Data:** `~/.config/protonmail/bridge-v3/` (account link, encryption keys)
- **Recovery:**
  1. `apt install protonmail-bridge` (or download from proton.me)
  2. Set keychain: `echo '{"preferred_keychain": "pass"}' > ~/.config/protonmail/bridge-v3/prefs.json`
  3. Run `protonmail-bridge --cli`, log in with tj@jongsma.me credentials
  4. Note new bridge password, update mail-bridge config
- **Impact:** Email processing offline until re-linked. ~10 min recovery.
- **Credentials:** Account password with Johan. Bridge password regenerated on setup.

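
Step 2 of the Proton Mail Bridge recovery writes `prefs.json` with a shell redirect, which fails if `~/.config/protonmail/bridge-v3/` does not exist yet, as may be the case on a freshly installed machine. A minimal sketch that creates the directory first follows; the function name `write_keychain_pref` is hypothetical, and the JSON content is the one given in the step.

```bash
#!/usr/bin/env bash
# Hypothetical helper: write the keychain preference, creating the directory if needed.
write_keychain_pref() {
  # Default to the bridge config directory named in this document.
  local dir="${1:-$HOME/.config/protonmail/bridge-v3}"
  mkdir -p "$dir"
  printf '%s\n' '{"preferred_keychain": "pass"}' > "$dir/prefs.json"
}

# Example use:
#   write_keychain_pref
```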
### Tier 3: Data (critical, needs backup solution)

| Data | Path | Size | Backup Status |
|---|---|---|---|
| **Sophia's documents** | `~/sophia/` | 9.2 GB | ⚠️ **SINGLE COPY**, needs offsite backup |
| **Document store** | `~/documents/` | 7.9 MB | Records in git (docsys repo); PDFs local only |
| **GLM-OCR model** | `~/models/glm-ocr/` | 2.5 GB | Re-downloadable from HuggingFace |

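
Once an offsite copy of `~/sophia/` exists, it is worth confirming that it matches the original before relying on it. The sketch below compares two directory trees by SHA-256 checksum. The function names `manifest` and `same_tree` are hypothetical, and it assumes GNU `find`, `sort`, `xargs`, and `sha256sum`, as shipped with Ubuntu.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print "<sha256>  <relative path>" for every file, sorted by path.
manifest() {
  ( cd "$1" && find . -type f -print0 | LC_ALL=C sort -z | xargs -0 -r sha256sum )
}

# Hypothetical helper: succeed only if both trees hold identical files.
same_tree() {
  [ "$(manifest "$1")" = "$(manifest "$2")" ]
}

# Example use, with a hypothetical locally mounted backup copy:
#   same_tree ~/sophia /mnt/backup/sophia && echo "backup verified"
```

For a remote copy, running `manifest` on each side and diffing the two outputs gives the same check without mounting anything.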
### Tier 4: Rebuildable (no backup needed)

| Component | Recovery |
|---|---|
| Python venvs (`ocr-env/`, `.venv/`) | `pip install -r requirements.txt` |
| Node modules | `npm install` |
| Flutter SDK | Re-download |
| Docker images on 253 | `docker compose pull` |
| OC session transcripts | Nice-to-have, not critical |

---

## Service Recovery Procedures

### Full Server Loss (forge)

**Prerequisites:** SSH key authorized on Zurich VPS, new Ubuntu 24.04 server.

1. **OS Setup:**
   ```
   apt update && apt upgrade
   adduser johan
   # Install: git, go, node, python3, docker (if needed)
   ```

2. **SSH Keys:**
   - Generate new: `ssh-keygen -t ed25519`
   - Authorize on Zurich: `ssh root@zurich.inou.com` → add to `/home/git/.ssh/authorized_keys`

3. **Clone all repos:**
   ```
   mkdir -p ~/dev && cd ~/dev
   # Repo list matches Tier 1; clawd is cloned separately below
   for repo in inou azure-backup james-dashboard mail-bridge mail-agent \
       inou-mobile clawdnode-android clawdnode-debug-server clawdnode-gateway \
       message-bridge messaging-center docman docsys docs \
       moltmobile-android moltmobile-gateway screenshot-server docproc; do
     git clone "git@zurich.inou.com:$repo.git"
   done
   git clone git@zurich.inou.com:clawd.git ~/clawd
   ```

4. **Install OpenClaw:**
   ```
   npm install -g openclaw
   openclaw init
   # Restore gateway config from clawd/config-backups/ or memory
   ```

5. **Restore services** (see systemd units below)

6. **Re-link integrations:**
   - Signal CLI: QR code link
   - WhatsApp: QR code link
   - Proton Bridge: CLI login

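
A failed clone in step 3 is easy to overlook in a long loop. As a sketch, the hypothetical `missing_repos` function below prints every expected repository that has no `.git` directory under a base path, so the loop can be re-run for just those names.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print each expected repo that is not cloned under the base dir.
missing_repos() {
  local base="$1"
  shift
  local repo
  for repo in "$@"; do
    [ -d "$base/$repo/.git" ] || echo "$repo"
  done
}

# Example use, with a few names from the Tier 1 list:
#   missing_repos ~/dev inou mail-bridge message-bridge docsys
```

Empty output means every listed repository is present.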
### Systemd Service Units

All services run as user units (`systemctl --user`).

| Service | Binary/Command | Port | Working Dir |
|---|---|---|---|
| `openclaw-gateway` | `node openclaw gateway` | 18789 | n/a |
| `signal-cli` | `signal-cli daemon --http 0.0.0.0:8080` | 8080 | n/a |
| `protonmail-bridge` | `protonmail-bridge --noninteractive` | 1143/1025 | n/a |
| `mail-bridge` | `message-center -config config.yaml` | 8025 | `~/dev/mail-bridge` |
| `message-bridge` | `message-bridge` | 8030 | `~/dev/message-bridge` |
| `james-dashboard` | `james-dashboard --dir .` | 9200 | `~/dev/james-dashboard` |
| `ocr-service` | `python server.py` | 8090 | `~/ocr-service` |
| `docsys` | `docsys` | n/a | `~/dev/docsys` |

**Unit files location:** `~/.config/systemd/user/`

**Environment files:**
- `~/.config/message-center.env` (mail-bridge credentials)
- OpenClaw gateway env vars in unit file (API keys, tokens)

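
Until the unit files are tracked in git, they may have to be recreated from the table above. The sketch below prints a basic user unit from a description, working directory, and command. The function name `render_user_unit`, the `Restart=on-failure` policy, and the binary path in the example are assumptions, not a copy of the real units.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a minimal systemd user unit to stdout.
render_user_unit() {
  local desc="$1" workdir="$2" exec_start="$3"
  cat <<EOF
[Unit]
Description=$desc

[Service]
WorkingDirectory=$workdir
ExecStart=$exec_start
Restart=on-failure

[Install]
WantedBy=default.target
EOF
}

# Example use (%h is the systemd specifier for the user's home directory;
# the binary location inside the repo is an assumption):
#   render_user_unit "message-bridge" "%h/dev/message-bridge" \
#     "%h/dev/message-bridge/message-bridge" \
#     > ~/.config/systemd/user/message-bridge.service
#   systemctl --user daemon-reload
#   systemctl --user enable --now message-bridge
```

User units normally stop when the user's last session ends, so `loginctl enable-linger johan` is likely needed for the services to start at boot on a headless server.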
### Zurich VPS Loss

1. Provision new VPS
2. Install git, create `git` user with `git-shell`
3. Push all repos from forge (they're the primary copies)
4. Reinstall Uptime Kuma, Caddy, nuclei
5. Update DNS if IP changes

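
Step 3 can be scripted once the new VPS is reachable. The sketch below pushes every branch and tag from each working copy under a base directory. It assumes each repo has a remote named `origin` pointing at the new server, and that an empty bare repository already exists there for each one (for example created with `git init --bare` as the `git` user). The function name `push_all` is hypothetical. A typical call would be `push_all ~/dev`, followed by a separate push from `~/clawd`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: push all branches and tags of every repo under a base dir.
push_all() {
  local base="$1" dir
  for dir in "$base"/*/; do
    # Skip directories that are not git working copies.
    [ -d "$dir/.git" ] || continue
    if git -C "$dir" push --all origin && git -C "$dir" push --tags origin; then
      echo "pushed: $dir"
    else
      echo "FAILED: $dir" >&2
    fi
  done
}
```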

---

## Monitoring

| Check | Frequency | Tool |
|---|---|---|
| Service health | Every heartbeat | `scripts/service-health.sh` |
| Git audit | Hourly (:30) | `scripts/git-audit.sh` via cron |
| Claude usage | Hourly (:00) | `scripts/claude-usage-check.sh` via cron |
| Nuclei security scan | Monthly | Cron from Zurich |
| Docker updates (253) | Weekly (Sunday) | Heartbeat task |
| HAOS updates | Weekly (Sunday) | Heartbeat task |
| Uptime Kuma | Continuous | https://zurich.inou.com:3001 |

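
After a rebuild, a quick way to confirm that the services are back is to probe the ports listed under Systemd Service Units. The sketch below is one possible check, not the contents of `scripts/service-health.sh`. It relies on bash's `/dev/tcp` redirection and coreutils `timeout`, assumes the services listen on the host being probed, and uses the hypothetical function names `port_open` and `check_ports`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed if a TCP connection to host:port opens within 2 seconds.
port_open() {
  local host="$1" port="$2"
  timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Hypothetical helper: print "<port> up" or "<port> DOWN" for each port given.
check_ports() {
  local host="$1"
  shift
  local port
  for port in "$@"; do
    if port_open "$host" "$port"; then
      echo "$port up"
    else
      echo "$port DOWN"
    fi
  done
}

# Example use, with the ports from the service table:
#   check_ports 127.0.0.1 18789 8080 1143 1025 8025 8030 9200 8090
```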

---

## Open Items

- [ ] **Sophia docs backup**: 9.2 GB, single copy. Needs offsite (Proton Drive, Zurich, or both)
- [ ] **Systemd unit backup**: Track in git (docs repo or clawd)
- [ ] **Automated config snapshots**: Gateway config, env files
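
As a starting point for the last two items, the sketch below bundles a set of config paths into a timestamped archive. The function name `snapshot_configs` and the archive naming scheme are assumptions. Because env files and unit files hold credentials, the archive is created with owner-only permissions and should not be committed to a repo unencrypted.

```bash
#!/usr/bin/env bash
# Hypothetical helper: archive the given paths into <out_dir>/config-<timestamp>.tar.gz
# and print the archive path.
snapshot_configs() {
  local out_dir="$1"
  shift
  local archive
  archive="$out_dir/config-$(date +%Y%m%d-%H%M%S).tar.gz"
  mkdir -p "$out_dir"
  # umask 077 keeps the archive readable by the owner only.
  ( umask 077 && tar -czf "$archive" "$@" ) || return 1
  echo "$archive"
}

# Example use (the output directory is an assumption):
#   snapshot_configs ~/config-snapshots ~/.config/systemd/user ~/.config/message-center.env
```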