# Disaster Recovery Plan

| Version | Effective | Owner | Review | Last DR Test |
|---|---|---|---|---|
| 1.0 | January 2026 | Johan Jongsma | Annually | Not yet performed |
## 1. Purpose
Define procedures to recover inou services and data following a disaster affecting production systems.
## 2. Scope
| System | Location | Criticality |
|---|---|---|
| Production server | 192.168.100.2 | Critical |
| Production database | /tank/inou/data/inou.db | Critical |
| Master encryption key | /tank/inou/master.key | Critical |
| Staging server | 192.168.1.253 | Medium |
## 3. Recovery Objectives
| Metric | Target |
|---|---|
| RTO (Recovery Time Objective) | 4 hours |
| RPO (Recovery Point Objective) | 24 hours |
## 4. Backup Strategy

### Backup Inventory
| Data | Method | Frequency | Retention | Location |
|---|---|---|---|---|
| Database | ZFS snapshot | Daily | 30 days | Local (RAID-Z2) |
| Database | rclone sync | Daily | 90 days | Google Drive (encrypted) |
| Images | ZFS snapshot | Daily | 30 days | Local (RAID-Z2) |
| Images | rclone sync | Daily | 90 days | Google Drive (encrypted) |
| Master key | Manual copy | On change | Permanent | Proton Pass |
| Configuration | Git repository | Per change | Permanent | Local + remote |
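The daily rclone jobs above need a scheduler on the server. A minimal crontab sketch, assuming the `gdrive:inou-backup` remote named elsewhere in this plan; the 03:30/03:45 timings and the log path are illustrative choices, not requirements:

```shell
# Daily off-site backup (illustrative schedule and log path)
# m  h  dom mon dow  command
30 3 * * * rclone copy /tank/inou/data/inou.db gdrive:inou-backup/ --log-file=/tank/inou/backup.log
45 3 * * * rclone copy /tank/inou/data/images/ gdrive:inou-backup/images/ --log-file=/tank/inou/backup.log
```

`rclone copy` matches the command used in the Quick Reference section and never deletes remote files, which keeps the 90-day retention intact.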
### Encryption
All data is encrypted before leaving the server:
- Database fields: AES-256-GCM encryption
- Images: Stored encrypted
- Off-site backups: Already encrypted; Google cannot read contents
- Master key: Stored separately in Proton Pass (E2E encrypted)
### ZFS Snapshot Management

```shell
# List available snapshots
zfs list -t snapshot tank/inou

# Create a manual snapshot before major changes
zfs snapshot tank/inou@pre-change-$(date +%Y%m%d-%H%M)
```

Snapshots are also created automatically every day.
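The 30-day local retention in the table above implies pruning old snapshots. A hedged helper sketch, assuming snapshot names embed a YYYYMMDD datestamp (e.g. `tank/inou@auto-20260101`); the real naming scheme on the server may differ:

```shell
# Print snapshot names whose embedded YYYYMMDD stamp is strictly
# older than the cutoff. Destruction is left to the caller.
prune_candidates() {
  cutoff=$1                      # cutoff date as YYYYMMDD
  while IFS= read -r snap; do
    # Extract the first 8-digit run after the '@'
    stamp=$(printf '%s' "${snap#*@}" | grep -oE '[0-9]{8}' | head -n 1)
    if [ -n "$stamp" ] && [ "$stamp" -lt "$cutoff" ]; then
      printf '%s\n' "$snap"
    fi
  done
  return 0
}

# Usage (destroy is left as an echo so nothing is deleted by accident):
# zfs list -H -t snapshot -o name tank/inou \
#   | prune_candidates "$(date -d '30 days ago' +%Y%m%d)" \
#   | xargs -r -n1 echo zfs destroy
```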
## 5. Disaster Scenarios

### Scenario A: Hardware Failure (Single Component)

**Symptoms:** Server unresponsive, disk errors, network failure

**Recovery:**

- Identify the failed component
- Replace the hardware
- Boot from the existing ZFS pool, or restore from a snapshot
- Verify services: `make test`

**Estimated time:** 2-4 hours
### Scenario B: Database Corruption

**Symptoms:** Application errors, SQLite integrity failures

**Recovery:**

```shell
# 1. Stop services
ssh johan@192.168.100.2 "sudo systemctl stop inou-portal inou-api"

# 2. Preserve the corrupted DB for analysis
ssh johan@192.168.100.2 "cp /tank/inou/data/inou.db /tank/inou/data/inou.db.corrupted"

# 3. List available snapshots
ssh johan@192.168.100.2 "zfs list -t snapshot tank/inou"

# 4. Restore from a snapshot
ssh johan@192.168.100.2 "cp /tank/inou/.zfs/snapshot/<name>/data/inou.db /tank/inou/data/inou.db"

# 5. Restart services
ssh johan@192.168.100.2 "sudo systemctl start inou-portal inou-api"

# 6. Verify
make test
```

**Estimated time:** 1-2 hours
### Scenario C: Complete Server Loss

**Symptoms:** Server destroyed, stolen, or unrecoverable

**Recovery:**

```shell
# 1. Provision a new server with Ubuntu 24.04 LTS

# 2. Apply OS hardening (see security-policy.md)

# 3. Create the directory structure
mkdir -p /tank/inou/{bin,data,static,templates,lang}

# 4. Restore the master key from Proton Pass
#    (copy the 32-byte key to /tank/inou/master.key)
chmod 600 /tank/inou/master.key

# 5. Restore the database from Google Drive
rclone copy gdrive:inou-backup/inou.db /tank/inou/data/

# 6. Restore images from Google Drive
rclone copy gdrive:inou-backup/images/ /tank/inou/data/images/

# 7. Clone the application and build
cd ~/dev
git clone <repo> inou
cd inou
make build

# 8. Deploy
make deploy-prod

# 9. Update DNS if the IP changed

# 10. Verify
make test
```

**Estimated time:** 4-8 hours
### Scenario D: Ransomware/Compromise

**Symptoms:** Encrypted files, unauthorized access, system tampering

**Recovery:**

- Do not reuse the compromised system; assume attacker persistence
- Provision a fresh server from scratch
- Restore from a known-good backup taken before the compromise date
- Rotate the master key and re-encrypt all data
- Rotate all credentials
- Apply additional hardening
- Monitor closely for re-compromise

**Estimated time:** 8-24 hours
### Scenario E: Site Loss (Fire/Flood/Natural Disaster)

**Symptoms:** Physical location destroyed or inaccessible

**Recovery:**

- Obtain replacement hardware
- Restore from the off-site backup (Google Drive)
- Restore the master key from Proton Pass
- Rebuild and deploy the application
- Update DNS to the new IP

**Estimated time:** 24-48 hours
## 6. Key Management

### Master Key Recovery

The master key (`/tank/inou/master.key`) is critical: without it, all encrypted data is permanently unrecoverable.

Storage locations:

- Production server: `/tank/inou/master.key`
- Secure backup: Proton Pass (E2E encrypted, separate from data backups)
Recovery procedure:

```shell
# 1. Log into Proton Pass and retrieve the 32-byte master key

# 2. Create the key file
echo -n "<key>" > /tank/inou/master.key

# 3. Set permissions
chmod 600 /tank/inou/master.key

# 4. Verify length (must be exactly 32 bytes)
wc -c /tank/inou/master.key
```
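The length and permission checks above can be combined into one sanity check to run before starting services. A minimal sketch; `check_master_key` is an illustrative helper name, not deployed tooling:

```shell
# Verify a restored master key: exactly 32 bytes, owner-only mode.
check_master_key() {
  keyfile=$1
  size=$(wc -c < "$keyfile")
  if [ "$size" -ne 32 ]; then
    echo "FAIL: $keyfile is $size bytes, expected 32" >&2
    return 1
  fi
  # GNU stat first, BSD stat as fallback
  mode=$(stat -c %a "$keyfile" 2>/dev/null || stat -f %Lp "$keyfile")
  if [ "$mode" != 600 ]; then
    echo "WARN: $keyfile mode is $mode, expected 600" >&2
  fi
  echo OK
}

# check_master_key /tank/inou/master.key
```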
### Key Rotation (If Compromised)

If the master key may be compromised:

```shell
# 1. Generate a new key
head -c 32 /dev/urandom > /tank/inou/master.key.new
chmod 600 /tank/inou/master.key.new

# 2. Run the re-encryption migration (script to be created):
#    decrypts all data with the old key and re-encrypts with the new key

# 3. Replace the key, keeping the old one until the new key is verified
cp -p /tank/inou/master.key /tank/inou/master.key.old
mv /tank/inou/master.key.new /tank/inou/master.key

# 4. Update Proton Pass with the new key

# 5. Verify application functionality, then delete master.key.old
make test
```
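The key replacement step can be wrapped in a small helper so the old key is always preserved until re-encryption is verified. A sketch; `swap_key` and the `.old-<stamp>` suffix are illustrative names, not part of the deployed tooling:

```shell
# Atomically swap in the new master key, keeping a timestamped copy
# of the old key so a failed migration remains recoverable.
swap_key() {
  dir=$1                                   # e.g. /tank/inou
  stamp=$(date +%Y%m%d-%H%M%S)
  cp -p "$dir/master.key" "$dir/master.key.old-$stamp"   # keep the old key
  chmod 600 "$dir/master.key.new"
  mv "$dir/master.key.new" "$dir/master.key"             # atomic on one filesystem
  echo "old key saved as master.key.old-$stamp"
}

# swap_key /tank/inou
```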
## 7. Recovery Procedures

### Pre-Recovery Checklist
- Incident documented and severity assessed
- Stakeholders notified
- Backup integrity verified
- Recovery environment prepared
- Master key accessible
### Database Restore from ZFS

```shell
# Stop services
sudo systemctl stop inou-portal inou-api

# List snapshots
zfs list -t snapshot tank/inou

# Restore from a snapshot
cp /tank/inou/.zfs/snapshot/<snapshot-name>/data/inou.db /tank/inou/data/inou.db

# Start services
sudo systemctl start inou-portal inou-api

# Verify
make test
```
### Database Restore from Off-site

```shell
# Stop services
sudo systemctl stop inou-portal inou-api

# Download from Google Drive
rclone copy gdrive:inou-backup/inou.db /tank/inou/data/

# Start services
sudo systemctl start inou-portal inou-api

# Verify
make test
```
## 8. Communication During Disaster
| Audience | Method | Message |
|---|---|---|
| Users | Email + status page | "inou is experiencing technical difficulties. We expect to restore service by [time]." |
| Affected users | Direct email | Per incident response plan if data affected |
## 9. Testing Schedule
| Test Type | Frequency | Last Performed | Next Due |
|---|---|---|---|
| Backup verification | Monthly | January 2026 | February 2026 |
| Database restore (local) | Quarterly | Not yet | Q1 2026 |
| Database restore (off-site) | Quarterly | Not yet | Q1 2026 |
| Full DR drill | Annually | Not yet | Q4 2026 |
### Backup Verification Procedure

```shell
# Monthly: verify local snapshots exist and are readable
zfs list -t snapshot tank/inou
sqlite3 /tank/inou/.zfs/snapshot/<latest>/data/inou.db "SELECT COUNT(*) FROM dossiers"

# Monthly: verify the off-site backup exists
rclone ls gdrive:inou-backup/
```
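The off-site check above only eyeballs the listing. A small helper can make it scriptable by parsing `rclone ls` output (lines of `<size> <path>`) and failing if the backup is missing or empty; `backup_present` is an illustrative name:

```shell
# Read an `rclone ls` listing on stdin and succeed only if the named
# file is present with a nonzero size.
backup_present() {
  want=$1
  while read -r size path; do
    if [ "$path" = "$want" ] && [ "$size" -gt 0 ]; then
      echo "OK: $want ($size bytes)"
      return 0
    fi
  done
  echo "MISSING: $want" >&2
  return 1
}

# Usage:
# rclone ls gdrive:inou-backup/ | backup_present inou.db
```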
### Restore Test Procedure

```shell
# Quarterly: restore to staging and verify

# 1. Copy from off-site to staging
rclone copy gdrive:inou-backup/inou.db /tmp/restore-test/

# 2. Verify database integrity
sqlite3 /tmp/restore-test/inou.db "PRAGMA integrity_check"

# 3. Verify data is readable (requires the master key):
#    test decryption of sample records

# 4. Document results

# 5. Clean up test files
rm -rf /tmp/restore-test/
```
## 10. Post-Recovery Checklist

After any recovery:

- All services operational (`make test` passes)
- Data integrity verified (spot-check records)
- Logs reviewed for errors
- Users notified if there was a visible outage
- Incident documented
- Post-mortem scheduled if the event was significant
- This plan updated if gaps were discovered
## 11. Quick Reference

### Critical Paths
| Item | Path |
|---|---|
| Database | /tank/inou/data/inou.db |
| Auth database | /tank/inou/data/auth.db |
| Master key | /tank/inou/master.key |
| Binaries | /tank/inou/bin/ |
| Logs | /tank/inou/*.log |
### Service Commands

```shell
# Status
sudo systemctl status inou-portal inou-api

# Stop
sudo systemctl stop inou-portal inou-api

# Start
sudo systemctl start inou-portal inou-api

# Logs
journalctl -u inou-portal -f
journalctl -u inou-api -f
```
### Off-site Backup Commands

```shell
# List remote backups
rclone ls gdrive:inou-backup/

# Download a specific file
rclone copy gdrive:inou-backup/inou.db /tank/inou/data/

# Upload a backup manually
rclone copy /tank/inou/data/inou.db gdrive:inou-backup/
```