Add disaster recovery plan and systemd unit backups
parent b3ef9747d3
commit 8f715111f0
@ -0,0 +1,172 @@
# Disaster Recovery Plan

*Last updated: 2026-02-04*
*Owner: James ⚡*

---

## Infrastructure Overview

| Component | Host | Purpose |
|---|---|---|
| **forge** | 192.168.1.16 | Primary server: OpenClaw gateway, all services |
| **Zurich VPS** | 82.22.36.202 | Git repos, Uptime Kuma, security scanning |
| **192.168.1.253** | LAN | Docker services (Immich, ClickHouse, Jellyfin, Signal, qBittorrent) |
| **192.168.1.252** | LAN | Home Assistant OS |
| **Caddy** | 192.168.0.2 | Reverse proxy (james.jongsma.me, inou.com) |

---

## Backup Strategy

### Tier 1: Git-backed (automated)

All source code is pushed to `git@zurich.inou.com:<repo>.git`. An hourly cron job (`scripts/git-audit.sh`) checks the repos for anomalies.

**Repos (as of 2026-02-04):**
- inou, azure-backup, james-dashboard, mail-bridge, mail-agent
- inou-mobile, clawdnode-android, clawdnode-debug-server, clawdnode-gateway
- message-bridge, messaging-center, docman, docsys, docs
- moltmobile-android, moltmobile-gateway, screenshot-server
- docproc, clawd (workspace)

**Recovery:** `git clone git@zurich.inou.com:<repo>.git`
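
After a restore it can help to confirm that every repo in the list above was actually cloned. The following is a minimal sketch, not the contents of `scripts/git-audit.sh`; the function name `missing_repos` and its arguments are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: list repos from the plan that are not yet cloned under a directory.
# missing_repos is a hypothetical helper, not part of git-audit.sh.

# missing_repos DIR REPO... prints each REPO that has no DIR/REPO/.git entry.
missing_repos() {
  local dir="$1" repo
  shift
  for repo in "$@"; do
    [ -e "$dir/$repo/.git" ] || echo "$repo"
  done
}
```

For example, `missing_repos ~/dev inou docsys docs` prints the names that still need a `git clone`. It only checks for a `.git` entry; it does not verify that the clone is complete or up to date.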

### Tier 2: Configuration (documented, manually recoverable)

These items can't be git-tracked but are documented here for recovery.

#### Signal CLI (bot number: +31634481877)
- **Data:** `~/.clawdbot/tools/signal-cli/` (linked device keys)
- **Recovery:** Re-link using QR code from primary device. Takes ~2 minutes.
- **Impact:** Bot is offline until re-linked. No data loss: message history remains on the primary device.
- **Note:** Signal CLI version and trust store are rebuilt automatically on first run.

#### WhatsApp (message-bridge, linked to +1 727 225 2475)
- **Data:** `~/.message-bridge/whatsapp.db` (session keys + media refs)
- **Recovery:** Delete `whatsapp.db`, restart message-bridge, scan the new QR code with Johan's phone.
- **Impact:** Bot offline until re-linked. Historical messages live in WhatsApp, not in our DB.
- **Note:** QR code available at `http://localhost:8030/qr?format=png` after restart.
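
The WhatsApp recovery step deletes `whatsapp.db`. As a more cautious variant (a swap, not the procedure above), the file can be moved aside so the old session data stays available until the re-link is confirmed. A minimal sketch; `set_aside_db` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Sketch: move a database file aside instead of deleting it.
# set_aside_db is a hypothetical helper, not part of message-bridge.

# set_aside_db PATH renames PATH to PATH.bak-<timestamp> and prints the new name.
set_aside_db() {
  local db="$1" backup
  [ -f "$db" ] || { echo "no such file: $db" >&2; return 1; }
  backup="$db.bak-$(date +%Y%m%d-%H%M%S)"
  mv -- "$db" "$backup" && echo "$backup"
}
```

Usage would look like `set_aside_db ~/.message-bridge/whatsapp.db && systemctl --user restart message-bridge`, followed by the QR scan. The set-aside copy can be removed once the new link works.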

#### Proton Mail Bridge
- **Data:** `~/.config/protonmail/bridge-v3/` (account link, encryption keys)
- **Recovery:**
  1. `apt install protonmail-bridge` (or download from proton.me)
  2. Set keychain: `echo '{"preferred_keychain": "pass"}' > ~/.config/protonmail/bridge-v3/prefs.json`
  3. Run `protonmail-bridge --cli`, log in with the tj@jongsma.me credentials
  4. Note the new bridge password, update the mail-bridge config
- **Impact:** Email processing offline until re-linked. ~10 min recovery.
- **Credentials:** Account password with Johan. Bridge password regenerated on setup.

### Tier 3: Data (critical, needs backup solution)

| Data | Path | Size | Backup Status |
|---|---|---|---|
| **Sophia's documents** | `~/sophia/` | 9.2 GB | ⚠️ **SINGLE COPY**, needs offsite backup |
| **Document store** | `~/documents/` | 7.9 MB | Records in git (docsys repo); PDFs local only |
| **GLM-OCR model** | `~/models/glm-ocr/` | 2.5 GB | Re-downloadable from HuggingFace |
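
Until an offsite solution is chosen for the single-copy data, one possible first step is a dated archive with a checksum that can be verified after transfer. This is a sketch under assumptions: `archive_dir` and `verify_archive` are hypothetical helpers, the destination directory is whatever backup location gets picked, and GNU `tar` and `sha256sum` are assumed to be installed.

```bash
#!/usr/bin/env bash
# Sketch: archive a directory and record a SHA-256 checksum next to it.
# archive_dir and verify_archive are hypothetical helpers.

# archive_dir SRC_DIR DEST_DIR writes DEST_DIR/<name>-<date>.tar.gz and a
# matching .sha256 file, then prints the archive path.
archive_dir() {
  local src="$1" dest="$2" name archive
  name="$(basename "$src")"
  archive="$dest/$name-$(date +%Y%m%d).tar.gz"
  mkdir -p "$dest" || return 1
  tar -czf "$archive" -C "$(dirname "$src")" "$name" || return 1
  # Store the checksum with a relative file name so it verifies anywhere.
  (cd "$dest" && sha256sum "$(basename "$archive")" \
    > "$(basename "$archive").sha256") || return 1
  echo "$archive"
}

# verify_archive ARCHIVE re-checks ARCHIVE against its .sha256 file.
verify_archive() {
  (cd "$(dirname "$1")" && sha256sum -c "$(basename "$1").sha256" >/dev/null)
}
```

For example, `archive_dir ~/sophia /mnt/backup` followed by `verify_archive` on the printed path (`/mnt/backup` is a placeholder). The archive still has to be copied offsite; this sketch only produces and checks it.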

### Tier 4: Rebuildable (no backup needed)

| Component | Recovery |
|---|---|
| Python venvs (`ocr-env/`, `.venv/`) | `pip install -r requirements.txt` |
| Node modules | `npm install` |
| Flutter SDK | Re-download |
| Docker images on 253 | `docker compose pull` |
| OC session transcripts | Nice-to-have, not critical |

---

## Service Recovery Procedures

### Full Server Loss (forge)

**Prerequisites:** SSH key authorized on Zurich VPS, new Ubuntu 24.04 server.

1. **OS Setup:**
   ```
   apt update && apt upgrade
   adduser johan
   # Install: git, go, node, python3, docker (if needed)
   ```

2. **SSH Keys:**
   - Generate new: `ssh-keygen -t ed25519`
   - Authorize on Zurich: `ssh root@zurich.inou.com`, then add the new public key to `/home/git/.ssh/authorized_keys`

3. **Clone all repos:**
   ```
   mkdir -p ~/dev && cd ~/dev
   # Repo list mirrors the Tier 1 list above; clawd is cloned separately
   for repo in inou azure-backup james-dashboard mail-bridge mail-agent \
       inou-mobile clawdnode-android clawdnode-debug-server clawdnode-gateway \
       message-bridge messaging-center docman docsys docs \
       moltmobile-android moltmobile-gateway screenshot-server docproc; do
     git clone "git@zurich.inou.com:$repo.git"
   done
   git clone git@zurich.inou.com:clawd.git ~/clawd
   ```

4. **Install OpenClaw:**
   ```
   npm install -g openclaw
   openclaw init
   # Restore gateway config from clawd/config-backups/ or memory
   ```

5. **Restore services** (see systemd units below)

6. **Re-link integrations:**
   - Signal CLI: QR code link
   - WhatsApp: QR code link
   - Proton Bridge: CLI login

### Systemd Service Units

All services run as user units (`systemctl --user`).

| Service | Binary/Command | Port | Working Dir |
|---|---|---|---|
| `openclaw-gateway` | `node openclaw gateway` | 18789 | n/a |
| `signal-cli` | `signal-cli daemon --http 0.0.0.0:8080` | 8080 | n/a |
| `protonmail-bridge` | `protonmail-bridge --noninteractive` | 1143/1025 | n/a |
| `mail-bridge` | `message-center -config config.yaml` | 8025 | `~/dev/mail-bridge` |
| `message-bridge` | `message-bridge` | 8030 | `~/dev/message-bridge` |
| `james-dashboard` | `james-dashboard --dir .` | 9200 | `~/dev/james-dashboard` |
| `ocr-service` | `python server.py` | 8090 | `~/ocr-service` |
| `docsys` | `docsys` | n/a | `~/dev/docsys` |

**Unit files location:** `~/.config/systemd/user/`

**Environment files:**
- `~/.config/message-center.env` (mail-bridge credentials)
- OpenClaw gateway env vars in unit file (API keys, tokens)
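
Because some unit files carry API keys and tokens in `Environment=` lines, a snapshot intended for git could mask those values on the way out. The sketch below is one way to do that, not an existing script: `snapshot_units` is a hypothetical helper, and the rule "variable name contains KEY, TOKEN or PASSWORD" is an assumption about naming that should be checked against the real units.

```bash
#!/usr/bin/env bash
# Sketch: copy user unit files to a backup directory with secrets masked.
# snapshot_units is a hypothetical helper.

# snapshot_units SRC_DIR DEST_DIR copies *.service and *.timer files and
# replaces the value of any Environment= variable whose name contains
# KEY, TOKEN or PASSWORD with REDACTED.
snapshot_units() {
  local src="$1" dest="$2" unit
  mkdir -p "$dest" || return 1
  for unit in "$src"/*.service "$src"/*.timer; do
    [ -f "$unit" ] || continue   # skip glob patterns that matched nothing
    sed -E 's/^(Environment="?[A-Za-z0-9_]*(KEY|TOKEN|PASSWORD)[A-Za-z0-9_]*=)[^"]*/\1REDACTED/' \
      "$unit" > "$dest/$(basename "$unit")" || return 1
  done
}
```

For example, `snapshot_units ~/.config/systemd/user ~/clawd/systemd-units` (the destination path is a placeholder). The real values then have to be restored from a password manager or env file during recovery, since the snapshot no longer contains them.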

### Zurich VPS Loss

1. Provision new VPS
2. Install git, create `git` user with `git-shell`
3. Create an empty bare repo per project (`git init --bare <repo>.git` as the `git` user), then push all repos from forge (its clones are the primary copies)
4. Reinstall Uptime Kuma, Caddy, nuclei
5. Update DNS if IP changes
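
For the push step above, it may be safer to review the commands before running them. A minimal dry-run sketch: `push_plan` is a hypothetical helper, it assumes each clone's `origin` remote already points at the new VPS, and it assumes directory names without spaces.

```bash
#!/usr/bin/env bash
# Sketch: print (not run) the git commands that re-populate a fresh remote.
# push_plan is a hypothetical helper.

# push_plan DEV_DIR prints push commands for every clone found directly
# under DEV_DIR; nothing is executed.
push_plan() {
  local dev="$1" d
  for d in "$dev"/*/; do
    [ -e "$d.git" ] || continue   # only directories that are git clones
    d="${d%/}"
    echo "git -C $d push --all origin"
    echo "git -C $d push --tags origin"
  done
}
```

Running `push_plan ~/dev` lists the commands; piping the reviewed output to `sh` executes them. `--all` and `--tags` are used rather than `--mirror` so that remote-tracking refs from the working clones are not pushed.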

---

## Monitoring

| Check | Frequency | Tool |
|---|---|---|
| Service health | Every heartbeat | `scripts/service-health.sh` |
| Git audit | Hourly (:30) | `scripts/git-audit.sh` via cron |
| Claude usage | Hourly (:00) | `scripts/claude-usage-check.sh` via cron |
| Nuclei security scan | Monthly | Cron from Zurich |
| Docker updates (253) | Weekly (Sunday) | Heartbeat task |
| HAOS updates | Weekly (Sunday) | Heartbeat task |
| Uptime Kuma | Continuous | https://zurich.inou.com:3001 |
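
A service health check over the user units could be as small as the sketch below. This is not the contents of `scripts/service-health.sh`; the function names `is_active` and `report_health` are hypothetical, and the only external call is `systemctl --user is-active --quiet`.

```bash
#!/usr/bin/env bash
# Sketch: report which user units are running.
# is_active and report_health are hypothetical helpers.

# is_active SERVICE succeeds when the user unit is currently active.
is_active() {
  systemctl --user is-active --quiet "$1"
}

# report_health SERVICE... prints one status line per service and returns
# non-zero if any of them is down.
report_health() {
  local failed=0 svc
  for svc in "$@"; do
    if is_active "$svc"; then
      echo "ok   $svc"
    else
      echo "DOWN $svc"
      failed=1
    fi
  done
  return "$failed"
}
```

For example, `report_health openclaw-gateway signal-cli protonmail-bridge mail-bridge message-bridge james-dashboard ocr-service docsys` covers the units listed in this plan. The non-zero exit status lets a heartbeat or cron wrapper decide whether to raise an alert.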

---

## Open Items

- [ ] **Sophia docs backup**: 9.2 GB, single copy. Needs offsite (Proton Drive, Zurich, or both)
- [ ] **Systemd unit backup**: Track in git (docs repo or clawd)
- [ ] **Automated config snapshots**: Gateway config, env files
@ -0,0 +1,7 @@
[Unit]
Description=Daily update check for OpenClaw, Claude Code, and OS packages

[Service]
Type=oneshot
ExecStart=/home/johan/clawd/scripts/daily-updates.sh
Environment=PATH=/home/johan/.npm-global/bin:/usr/local/bin:/usr/bin:/bin
@ -0,0 +1,9 @@
[Unit]
Description=Daily update check at 9:00 AM ET

[Timer]
OnCalendar=*-*-* 09:00:00
Persistent=true

[Install]
WantedBy=timers.target
@ -0,0 +1,19 @@
[Unit]
Description=DocSys - Document Management System
After=network.target

[Service]
Type=simple
WorkingDirectory=/home/johan/dev/docsys
ExecStart=/home/johan/dev/docsys/docsys
Restart=on-failure
RestartSec=5
Environment=HOME=/home/johan
Environment=PATH=/usr/local/bin:/usr/bin:/bin
Environment=DOCSYS_DATA_DIR=/srv/docsys
StandardOutput=journal
StandardError=journal
SyslogIdentifier=docsys

[Install]
WantedBy=default.target
@ -0,0 +1,13 @@
[Unit]
Description=James Dashboard
After=network.target

[Service]
Type=simple
WorkingDirectory=/home/johan/dev/james-dashboard
ExecStart=/home/johan/dev/james-dashboard/james-dashboard --dir /home/johan/dev/james-dashboard
Restart=always
RestartSec=5

[Install]
WantedBy=default.target
@ -0,0 +1,15 @@
[Unit]
Description=Mail Bridge - IMAP to HTTP/Webhook
After=network.target protonmail-bridge.service message-bridge.service
Wants=protonmail-bridge.service

[Service]
Type=simple
WorkingDirectory=/home/johan/dev/mail-bridge
EnvironmentFile=/home/johan/.config/message-center.env
ExecStart=/home/johan/dev/mail-bridge/message-center -config config.yaml
Restart=always
RestartSec=5

[Install]
WantedBy=default.target
@ -0,0 +1,15 @@
[Unit]
Description=Message Bridge (WhatsApp)
After=network.target

[Service]
Type=simple
ExecStart=/home/johan/dev/message-bridge/message-bridge
Environment=PORT=8030
Environment=WEBHOOK_URL=http://localhost:18789/hooks/message
Environment=WEBHOOK_TOKEN=kuma-alert-token-2026
Restart=on-failure
RestartSec=5

[Install]
WantedBy=default.target
@ -0,0 +1,14 @@
[Unit]
Description=GLM-OCR GPU Service
After=network.target

[Service]
Type=simple
WorkingDirectory=/home/johan/ocr-service
ExecStart=/home/johan/ocr-env/bin/python server.py
Restart=on-failure
RestartSec=10
Environment=PYTHONUNBUFFERED=1

[Install]
WantedBy=default.target
@ -0,0 +1,21 @@
[Unit]
Description=OpenClaw Gateway
After=network-online.target
Wants=network-online.target

[Service]
ExecStart="/usr/bin/node" "/home/johan/.npm-global/lib/node_modules/openclaw/dist/index.js" gateway --port 18789
Restart=always
RestartSec=5
KillMode=process
Environment=BRAVE_API_KEY=BSAc_o2YylVmDCYWP_AnUo3SLcjVeRj
Environment=HOME=/home/johan
Environment="PATH=/home/johan/.local/bin:/home/johan/.npm-global/bin:/home/johan/bin:/usr/local/go/bin:/usr/local/bin:/usr/bin:/bin"
Environment=OPENCLAW_GATEWAY_PORT=18789
Environment=OPENCLAW_GATEWAY_TOKEN=2dee57cc3ce2947c27ce9e848d5c3e95cc452f25a1477462
Environment="OPENCLAW_SYSTEMD_UNIT=openclaw-gateway.service"
Environment=OPENCLAW_SERVICE_MARKER=openclaw
Environment=OPENCLAW_SERVICE_KIND=gateway

[Install]
WantedBy=default.target
@ -0,0 +1,12 @@
[Unit]
Description=Proton Mail Bridge
After=network.target

[Service]
Type=simple
ExecStart=/usr/bin/protonmail-bridge --noninteractive
Restart=always
RestartSec=10

[Install]
WantedBy=default.target
@ -0,0 +1,17 @@
[Unit]
Description=Signal-CLI Daemon
After=network.target

[Service]
Type=simple
ExecStart=/home/johan/.clawdbot/tools/signal-cli/0.13.23/signal-cli -a +31634481877 daemon --http 0.0.0.0:8080 --no-receive-stdout
Restart=always
RestartSec=10
Environment=HOME=/home/johan

StandardOutput=journal
StandardError=journal
SyslogIdentifier=signal-cli

[Install]
WantedBy=default.target