Add disaster recovery plan and systemd unit backups
parent b3ef9747d3
commit 8f715111f0
@ -0,0 +1,172 @@
# Disaster Recovery Plan

*Last updated: 2026-02-04*
*Owner: James ⚡*

---

## Infrastructure Overview

| Component | Host | Purpose |
|---|---|---|
| **forge** | 192.168.1.16 | Primary server: OpenClaw gateway, all services |
| **Zurich VPS** | 82.22.36.202 | Git repos, Uptime Kuma, security scanning |
| **192.168.1.253** | LAN | Docker services (Immich, ClickHouse, Jellyfin, Signal, qBittorrent) |
| **192.168.1.252** | LAN | Home Assistant OS |
| **Caddy** | 192.168.0.2 | Reverse proxy (james.jongsma.me, inou.com) |

---

## Backup Strategy

### Tier 1: Git-backed (automated)

All source code is pushed to `git@zurich.inou.com:<repo>.git`. An hourly audit (`scripts/git-audit.sh`) checks the repos for anomalies.

**Repos (as of 2026-02-04):**
- inou, azure-backup, james-dashboard, mail-bridge, mail-agent
- inou-mobile, clawdnode-android, clawdnode-debug-server, clawdnode-gateway
- message-bridge, messaging-center, docman, docsys, docs
- moltmobile-android, moltmobile-gateway, screenshot-server
- docproc, clawd (workspace)

**Recovery:** `git clone git@zurich.inou.com:<repo>.git`

### Tier 2: Configuration (documented, manually recoverable)

These items can't be git-tracked but are documented here for recovery.

#### Signal CLI (bot number: +31634481877)

- **Data:** `~/.clawdbot/tools/signal-cli/` (linked device keys)
- **Recovery:** Re-link using QR code from primary device. Takes ~2 minutes.
- **Impact:** Bot is offline until re-linked. No critical data loss: message history stays on the primary device (Signal does not keep delivered messages on its servers).
- **Note:** Signal CLI version and trust store rebuild automatically on first run.

#### WhatsApp (message-bridge, linked to +1 727 225 2475)

- **Data:** `~/.message-bridge/whatsapp.db` (session keys + media refs)
- **Recovery:** Delete `whatsapp.db`, restart message-bridge, then scan the new QR code with Johan's phone.
- **Impact:** Bot offline until re-linked. Historical messages live in WhatsApp, not in our DB.
- **Note:** QR code available at `http://localhost:8030/qr?format=png` after restart.

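The WhatsApp re-link steps above can be sketched as a small helper. This is a minimal sketch under assumptions: the `message-bridge` user unit name is taken from the unit table in this plan, while the function name `relink_whatsapp`, the `DRY_RUN` switch, and the output path `/tmp/whatsapp-qr.png` are hypothetical. It moves the database aside instead of deleting it, so the old session file can still be inspected.

```shell
#!/bin/sh
# Sketch: reset the WhatsApp session and fetch a fresh QR code.
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"
DB="${DB:-$HOME/.message-bridge/whatsapp.db}"
QR_URL="http://localhost:8030/qr?format=png"

# Print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

relink_whatsapp() {
  run systemctl --user stop message-bridge
  # Keep the old session DB under a timestamped name instead of deleting it.
  run mv "$DB" "$DB.bak.$(date +%Y%m%d%H%M%S)"
  run systemctl --user start message-bridge
  # The bridge may need a moment after start before the QR endpoint answers.
  run curl -fsS -o /tmp/whatsapp-qr.png "$QR_URL"
}
```

Calling `relink_whatsapp` with the default `DRY_RUN=1` lists the four commands for review; `DRY_RUN=0 relink_whatsapp` runs them, after which the PNG is scanned from the phone.
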
#### Proton Mail Bridge

- **Data:** `~/.config/protonmail/bridge-v3/` (account link, encryption keys)
- **Recovery:**
  1. `apt install protonmail-bridge` (or download from proton.me)
  2. Set keychain: `echo '{"preferred_keychain": "pass"}' > ~/.config/protonmail/bridge-v3/prefs.json`
  3. Run `protonmail-bridge --cli`, login with tj@jongsma.me credentials
  4. Note new bridge password, update mail-bridge config
- **Impact:** Email processing offline until re-linked. ~10 min recovery.
- **Credentials:** Account password with Johan. Bridge password regenerated on setup.

### Tier 3: Data (critical, needs backup solution)

| Data | Path | Size | Backup Status |
|---|---|---|---|
| **Sophia's documents** | `~/sophia/` | 9.2 GB | ⚠️ **SINGLE COPY**, needs offsite backup |
| **Document store** | `~/documents/` | 7.9 MB | Records in git (docsys repo); PDFs local only |
| **GLM-OCR model** | `~/models/glm-ocr/` | 2.5 GB | Re-downloadable from HuggingFace |

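Until an offsite target is chosen for `~/sophia/`, a dated archive plus a checksum is one simple way to get a second, verifiable copy. The following is a minimal sketch, not an agreed procedure: the function name `snapshot_dir` and the destination directory are hypothetical, and where the archive is copied afterwards is left open.

```shell
#!/bin/sh
# snapshot_dir SRC DEST_DIR
# Writes DEST_DIR/<name>-<YYYYMMDD>.tar.gz plus a matching .sha256 file
# and prints the archive path on success.
snapshot_dir() {
  local src="$1" dest="$2"
  local name stamp archive
  name="$(basename "$src")"
  stamp="$(date +%Y%m%d)"
  archive="$dest/$name-$stamp.tar.gz"
  mkdir -p "$dest" || return 1
  # -C keeps paths inside the archive relative to the parent of SRC.
  tar -czf "$archive" -C "$(dirname "$src")" "$name" || return 1
  # Store the checksum with a relative file name so it verifies after copying.
  (cd "$dest" && sha256sum "$(basename "$archive")" > "$(basename "$archive").sha256") || return 1
  printf '%s\n' "$archive"
}
```

A call such as `snapshot_dir "$HOME/sophia" /mnt/backup` (the mount point is an assumption) produces the archive. Running `sha256sum -c` on the `.sha256` file from inside the destination directory verifies the copy after it has been moved.
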
### Tier 4: Rebuildable (no backup needed)

| Component | Recovery |
|---|---|
| Python venvs (`ocr-env/`, `.venv/`) | `pip install -r requirements.txt` |
| Node modules | `npm install` |
| Flutter SDK | Re-download |
| Docker images on 253 | `docker compose pull` |
| OC session transcripts | Nice-to-have, not critical |

---

## Service Recovery Procedures

### Full Server Loss (forge)

**Prerequisites:** SSH key authorized on Zurich VPS, new Ubuntu 24.04 server.

1. **OS Setup:**
   ```
   apt update && apt upgrade
   adduser johan
   # Install: git, go, node, python3, docker (if needed)
   ```

2. **SSH Keys:**
   - Generate new: `ssh-keygen -t ed25519`
   - Authorize on Zurich: `ssh root@zurich.inou.com` → add to `/home/git/.ssh/authorized_keys`

3. **Clone all repos:**
   ```
   mkdir -p ~/dev && cd ~/dev
   # Same repo list as Tier 1; clawd is cloned separately below
   for repo in inou azure-backup james-dashboard mail-bridge mail-agent \
     inou-mobile clawdnode-android clawdnode-debug-server clawdnode-gateway \
     message-bridge messaging-center docman docsys docs \
     moltmobile-android moltmobile-gateway screenshot-server docproc; do
     git clone "git@zurich.inou.com:$repo.git"
   done
   git clone git@zurich.inou.com:clawd.git ~/clawd
   ```

4. **Install OpenClaw:**
   ```
   npm install -g openclaw
   openclaw init
   # Restore gateway config from clawd/config-backups/ or memory
   ```

5. **Restore services** (see systemd units below)

6. **Re-link integrations:**
   - Signal CLI: QR code link
   - WhatsApp: QR code link
   - Proton Bridge: CLI login

### Systemd Service Units

All services run as user units (`systemctl --user`).

| Service | Binary/Command | Port | Working Dir |
|---|---|---|---|
| `openclaw-gateway` | `node openclaw gateway` | 18789 | - |
| `signal-cli` | `signal-cli daemon --http 0.0.0.0:8080` | 8080 | - |
| `protonmail-bridge` | `protonmail-bridge --noninteractive` | 1143/1025 | - |
| `mail-bridge` | `message-center -config config.yaml` | 8025 | `~/dev/mail-bridge` |
| `message-bridge` | `message-bridge` | 8030 | `~/dev/message-bridge` |
| `james-dashboard` | `james-dashboard --dir .` | 9200 | `~/dev/james-dashboard` |
| `ocr-service` | `python server.py` | 8090 | `~/ocr-service` |
| `docsys` | `docsys` | - | `~/dev/docsys` |

**Unit files location:** `~/.config/systemd/user/`

**Environment files:**
- `~/.config/message-center.env` (mail-bridge credentials)
- OpenClaw gateway env vars in unit file (API keys, tokens)

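Restoring the user units on a rebuilt forge might look like the sketch below. Everything not in the table above is an assumption: the backup directory `~/clawd/systemd` is a placeholder rather than a confirmed location, unit file names are assumed to be `<service>.service`, and `restore_units` and `DRY_RUN` are hypothetical names. `loginctl enable-linger` is included because user units otherwise only run while the user has an open session.

```shell
#!/bin/sh
# Sketch: copy backed-up unit files into place and enable them as user units.
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"
UNIT_BACKUP_DIR="${UNIT_BACKUP_DIR:-$HOME/clawd/systemd}"  # assumed backup location
UNIT_DIR="$HOME/.config/systemd/user"
SERVICES="openclaw-gateway signal-cli protonmail-bridge mail-bridge message-bridge james-dashboard ocr-service docsys"

# Print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

restore_units() {
  local svc
  run mkdir -p "$UNIT_DIR"
  for svc in $SERVICES; do
    run cp "$UNIT_BACKUP_DIR/$svc.service" "$UNIT_DIR/"
  done
  # Make the user manager pick up the new unit files.
  run systemctl --user daemon-reload
  for svc in $SERVICES; do
    run systemctl --user enable --now "$svc.service"
  done
  # Let user services start at boot without an interactive login.
  run loginctl enable-linger "$(id -un)"
}
```

Reviewing the dry-run output first is deliberate: units that embed tokens or depend on an environment file should have those values restored before `DRY_RUN=0 restore_units` is run.
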
### Zurich VPS Loss

1. Provision new VPS
2. Install git, create `git` user with `git-shell`
3. Push all repos from forge (they're the primary copies)
4. Reinstall Uptime Kuma, Caddy, nuclei
5. Update DNS if IP changes

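Step 3 can be sketched as a loop over the working copies on forge. This is a sketch under stated assumptions: each directory under `~/dev` is named after its repo, `origin` still points at `git@zurich.inou.com:<repo>.git`, bare repos live in `/home/git/` on the VPS, and `repush_all`, `DEV_DIR`, `VPS_HOST`, and `DRY_RUN` are hypothetical names. The `clawd` workspace sits outside `~/dev` and would need the same three commands run separately.

```shell
#!/bin/sh
# Sketch: recreate bare repos on a new VPS and push every branch and tag from forge.
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"
DEV_DIR="${DEV_DIR:-$HOME/dev}"
VPS_HOST="${VPS_HOST:-zurich.inou.com}"

# Print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

repush_all() {
  local dir name
  for dir in "$DEV_DIR"/*/; do
    dir="${dir%/}"
    # Skip anything that is not a git working copy.
    [ -d "$dir/.git" ] || continue
    name="$(basename "$dir")"
    # Create the empty bare repo as root, then hand it to the git user.
    run ssh "root@$VPS_HOST" "git init --bare /home/git/$name.git && chown -R git:git /home/git/$name.git"
    run git -C "$dir" push --all origin
    run git -C "$dir" push --tags origin
  done
}
```

`git push --all` followed by `git push --tags` is used instead of `--mirror` so that remote-tracking refs from the working copy are not pushed to the new server.
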
---

## Monitoring

| Check | Frequency | Tool |
|---|---|---|
| Service health | Every heartbeat | `scripts/service-health.sh` |
| Git audit | Hourly (:30) | `scripts/git-audit.sh` via cron |
| Claude usage | Hourly (:00) | `scripts/claude-usage-check.sh` via cron |
| Nuclei security scan | Monthly | Cron from Zurich |
| Docker updates (253) | Weekly (Sunday) | Heartbeat task |
| HAOS updates | Weekly (Sunday) | Heartbeat task |
| Uptime Kuma | Continuous | https://zurich.inou.com:3001 |

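The contents of `scripts/service-health.sh` are not part of this commit. A service health check of this kind could be as small as the loop below, which is a hypothetical sketch rather than the actual script: the function name `check_services` and the overridable `IS_ACTIVE` command are assumptions, and unit names are assumed to be `<service>.service`.

```shell
#!/bin/sh
# Sketch: report which user units are active; non-zero exit if any are down.
SERVICES="openclaw-gateway signal-cli protonmail-bridge mail-bridge message-bridge james-dashboard ocr-service docsys"
# Command used to test one unit; overridable so the loop can run without systemd.
IS_ACTIVE="${IS_ACTIVE:-systemctl --user is-active --quiet}"

check_services() {
  local failed=0 svc
  for svc in $SERVICES; do
    # IS_ACTIVE is intentionally unquoted so it splits into command and flags.
    if $IS_ACTIVE "$svc.service"; then
      printf 'ok   %s\n' "$svc"
    else
      printf 'DOWN %s\n' "$svc"
      failed=$((failed + 1))
    fi
  done
  [ "$failed" -eq 0 ]
}
```

The exit status lets a heartbeat task or cron wrapper alert only when something is down, for example `check_services || echo "service check failed"`.
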
---

## Open Items

- [ ] **Sophia docs backup**: 9.2 GB, single copy. Needs offsite (Proton Drive, Zurich, or both)
- [ ] **Systemd unit backup**: Track in git (docs repo or clawd)
- [ ] **Automated config snapshots**: Gateway config, env files

@ -0,0 +1,7 @@
[Unit]
Description=Daily update check for OpenClaw, Claude Code, and OS packages

[Service]
Type=oneshot
ExecStart=/home/johan/clawd/scripts/daily-updates.sh
Environment=PATH=/home/johan/.npm-global/bin:/usr/local/bin:/usr/bin:/bin

@ -0,0 +1,9 @@
[Unit]
Description=Daily update check at 9:00 AM ET

[Timer]
OnCalendar=*-*-* 09:00:00
Persistent=true

[Install]
WantedBy=timers.target

@ -0,0 +1,19 @@
[Unit]
Description=DocSys - Document Management System
After=network.target

[Service]
Type=simple
WorkingDirectory=/home/johan/dev/docsys
ExecStart=/home/johan/dev/docsys/docsys
Restart=on-failure
RestartSec=5
Environment=HOME=/home/johan
Environment=PATH=/usr/local/bin:/usr/bin:/bin
Environment=DOCSYS_DATA_DIR=/srv/docsys
StandardOutput=journal
StandardError=journal
SyslogIdentifier=docsys

[Install]
WantedBy=default.target

@ -0,0 +1,13 @@
[Unit]
Description=James Dashboard
After=network.target

[Service]
Type=simple
WorkingDirectory=/home/johan/dev/james-dashboard
ExecStart=/home/johan/dev/james-dashboard/james-dashboard --dir /home/johan/dev/james-dashboard
Restart=always
RestartSec=5

[Install]
WantedBy=default.target

@ -0,0 +1,15 @@
[Unit]
Description=Mail Bridge - IMAP to HTTP/Webhook
After=network.target protonmail-bridge.service message-bridge.service
Wants=protonmail-bridge.service

[Service]
Type=simple
WorkingDirectory=/home/johan/dev/mail-bridge
EnvironmentFile=/home/johan/.config/message-center.env
ExecStart=/home/johan/dev/mail-bridge/message-center -config config.yaml
Restart=always
RestartSec=5

[Install]
WantedBy=default.target

@ -0,0 +1,15 @@
[Unit]
Description=Message Bridge (WhatsApp)
After=network.target

[Service]
Type=simple
ExecStart=/home/johan/dev/message-bridge/message-bridge
Environment=PORT=8030
Environment=WEBHOOK_URL=http://localhost:18789/hooks/message
Environment=WEBHOOK_TOKEN=kuma-alert-token-2026
Restart=on-failure
RestartSec=5

[Install]
WantedBy=default.target

@ -0,0 +1,14 @@
[Unit]
Description=GLM-OCR GPU Service
After=network.target

[Service]
Type=simple
WorkingDirectory=/home/johan/ocr-service
ExecStart=/home/johan/ocr-env/bin/python server.py
Restart=on-failure
RestartSec=10
Environment=PYTHONUNBUFFERED=1

[Install]
WantedBy=default.target

@ -0,0 +1,21 @@
[Unit]
Description=OpenClaw Gateway
After=network-online.target
Wants=network-online.target

[Service]
ExecStart="/usr/bin/node" "/home/johan/.npm-global/lib/node_modules/openclaw/dist/index.js" gateway --port 18789
Restart=always
RestartSec=5
KillMode=process
Environment=BRAVE_API_KEY=BSAc_o2YylVmDCYWP_AnUo3SLcjVeRj
Environment=HOME=/home/johan
Environment="PATH=/home/johan/.local/bin:/home/johan/.npm-global/bin:/home/johan/bin:/usr/local/go/bin:/usr/local/bin:/usr/bin:/bin"
Environment=OPENCLAW_GATEWAY_PORT=18789
Environment=OPENCLAW_GATEWAY_TOKEN=2dee57cc3ce2947c27ce9e848d5c3e95cc452f25a1477462
Environment="OPENCLAW_SYSTEMD_UNIT=openclaw-gateway.service"
Environment=OPENCLAW_SERVICE_MARKER=openclaw
Environment=OPENCLAW_SERVICE_KIND=gateway

[Install]
WantedBy=default.target

@ -0,0 +1,12 @@
[Unit]
Description=Proton Mail Bridge
After=network.target

[Service]
Type=simple
ExecStart=/usr/bin/protonmail-bridge --noninteractive
Restart=always
RestartSec=10

[Install]
WantedBy=default.target

@ -0,0 +1,17 @@
[Unit]
Description=Signal-CLI Daemon
After=network.target

[Service]
Type=simple
ExecStart=/home/johan/.clawdbot/tools/signal-cli/0.13.23/signal-cli -a +31634481877 daemon --http 0.0.0.0:8080 --no-receive-stdout
Restart=always
RestartSec=10
Environment=HOME=/home/johan

StandardOutput=journal
StandardError=journal
SyslogIdentifier=signal-cli

[Install]
WantedBy=default.target