dealspace/docs/soc2/disaster-recovery-plan.md

# Disaster Recovery Plan

**Version:** 1.0
**Effective:** February 2026
**Owner:** Johan Jongsma
**Review:** Annually
**Last DR Test:** Not yet performed

---

## 1. Purpose

Define procedures to recover Dealspace services and data following a disaster affecting production systems.

---

## 2. Scope

| System | Location | Criticality |
|--------|----------|-------------|
| Production server | 82.24.174.112 (Zürich) | Critical |
| Database | /opt/dealspace/data/dealspace.db | Critical |
| Master encryption key | Secure storage | Critical |

---

## 3. Recovery Objectives

| Metric | Target |
|--------|--------|
| **RTO** (Recovery Time Objective) | 4 hours |
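| **RPO** (Recovery Point Objective) | 24 hours |

Note: the estimated recovery times for Scenarios C, D, and E (Section 5) exceed the 4-hour RTO target.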

---

## 4. Backup Strategy

### Backup Inventory

| Data | Method | Frequency | Retention | Location |
|------|--------|-----------|-----------|----------|
| Database | SQLite backup | Daily | 30 days | Encrypted off-site |
| Master key | Manual copy | On change | Permanent | Separate secure storage |
| Configuration | Git repository | Per change | Permanent | Remote repository |
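
The inventory above implies a daily job that snapshots the database and prunes copies older than 30 days. The following is a minimal sketch of those two steps, not the production job. It assumes the `sqlite3` CLI is installed, GNU `find` is available, and backups use a hypothetical `dealspace-YYYY-MM-DD.db` naming scheme in a staging directory. The encryption and upload steps are left out because this plan does not specify the tooling.

```bash
# Take a consistent snapshot with SQLite's online backup command.
# (A plain cp can capture a half-written file while the service is running.)
# The dealspace-YYYY-MM-DD.db naming is an assumption for illustration.
snapshot_db() {
  local src="$1" dest_dir="$2"
  local dest="${dest_dir}/dealspace-$(date +%F).db"
  sqlite3 "$src" ".backup '${dest}'"
  echo "$dest"
}

# Delete snapshots older than the retention window (30 days per the inventory).
prune_snapshots() {
  local dir="$1" days="${2:-30}"
  find "$dir" -maxdepth 1 -type f -name 'dealspace-*.db*' -mtime +"$days" -delete
}
```

A daily run might call `snapshot_db /opt/dealspace/data/dealspace.db "$STAGING_DIR"` followed by `prune_snapshots "$STAGING_DIR" 30`, where `STAGING_DIR` is a placeholder for wherever snapshots are staged before encryption.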

### Encryption

All data is encrypted before it leaves the server:
- Database fields: AES-256-GCM encryption with per-project keys
- Off-site backups: Encrypted before transfer; never stored off-site in plaintext
- Master key: Stored separately from data backups

---

## 5. Disaster Scenarios

### Scenario A: Hardware Failure (Single Component)

**Symptoms:** Server unresponsive, network failure

**Recovery:**
1. Contact Hostkey support
2. If the hardware cannot be repaired, restore from backup to a new VPS (see Scenario C)
3. Verify services via the health check endpoint (`https://muskepo.com/health`)
4. Update DNS if the IP address changed

**Estimated time:** 2-4 hours
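
After a restore or DNS change the service may take a while to come back, so a single health check can give a false negative. The sketch below polls the health endpoint with retries. The function name, attempt count, and delay are illustrative assumptions, not part of the existing tooling.

```bash
# Poll a health endpoint until it answers without an HTTP error status,
# or until the attempts run out.
# Usage: wait_for_health URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_health() {
  local url="$1" attempts="${2:-30}" delay="${3:-10}"
  local i
  for ((i = 1; i <= attempts; i++)); do
    # -f: treat HTTP status >= 400 as failure; -sS: quiet, but still show errors
    if curl -fsS --max-time 10 -o /dev/null "$url"; then
      echo "healthy after ${i} attempt(s)"
      return 0
    fi
    if (( i < attempts )); then
      sleep "$delay"
    fi
  done
  echo "still unhealthy after ${attempts} attempt(s)" >&2
  return 1
}
```

For example, `wait_for_health https://muskepo.com/health 30 10` waits up to roughly five minutes before reporting failure.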

### Scenario B: Database Corruption

**Symptoms:** Application errors, SQLite integrity failures

**Recovery:**

```bash
# 1. Stop services
ssh root@82.24.174.112 "systemctl stop dealspace"

# 2. Back up the corrupted DB for analysis
#    (if the DB runs in WAL mode, also keep dealspace.db-wal and dealspace.db-shm)
ssh root@82.24.174.112 "cp /opt/dealspace/data/dealspace.db /opt/dealspace/data/dealspace.db.corrupted"

# 3. Restore from backup
# Download the latest encrypted backup
scp backup-server:/backups/dealspace-latest.db.enc /tmp/
# Decrypt it, then place the result at /opt/dealspace/data/dealspace.db on the server

# 4. Restart services
ssh root@82.24.174.112 "systemctl start dealspace"

# 5. Verify (-f makes curl exit non-zero on an HTTP error status)
curl -fsS https://muskepo.com/health
```

**Estimated time:** 1-2 hours

### Scenario C: Complete Server Loss

**Symptoms:** Server destroyed, stolen, or unrecoverable

**Recovery:**

```bash
# 1. Provision new VPS at Hostkey
# 2. Apply OS hardening (see security-policy.md)

# 3. Create directory structure (on the new server)
ssh root@NEW_IP "mkdir -p /opt/dealspace/bin /opt/dealspace/data"

# 4. Restore master key from secure storage
# Copy the 32-byte key into place, then restrict it to the owner
ssh root@NEW_IP "chmod 600 /opt/dealspace/master.key"

# 5. Restore database from backup
# Download encrypted backup
# Decrypt and place at /opt/dealspace/data/dealspace.db

# 6. Deploy application binary (from the workstation that holds the build)
scp dealspace-linux root@NEW_IP:/opt/dealspace/bin/dealspace
ssh root@NEW_IP "chmod +x /opt/dealspace/bin/dealspace"

# 7. Configure systemd service
# 8. Start service
# 9. Update DNS to new IP

# 10. Verify (-f makes curl exit non-zero on an HTTP error status)
curl -fsS https://muskepo.com/health
```

**Estimated time:** 4-8 hours

### Scenario D: Ransomware/Compromise

**Symptoms:** Encrypted files, unauthorized access, system tampering

**Recovery:**
1. **Do not reuse the compromised system.** Assume the attacker has persistence
2. Provision a fresh VPS from scratch
3. Restore from a known-good backup taken before the compromise date (backups are retained for 30 days)
4. Rotate the master key and re-encrypt all data (see Section 6)
5. Rotate all credentials
6. Apply additional hardening
7. Monitor closely for re-compromise

**Estimated time:** 8-24 hours
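
Step 3 depends on picking a backup that predates the compromise. The sketch below selects the newest such file. It assumes a hypothetical date-stamped naming scheme, `dealspace-YYYY-MM-DD.db.enc`; this plan only names `dealspace-latest.db.enc`, so adjust the pattern to the real layout.

```bash
# Print the newest backup whose date stamp is strictly before CUTOFF (YYYY-MM-DD).
# Returns non-zero when no backup predates the cutoff.
# ISO dates sort lexicographically, so plain string comparison is enough.
latest_backup_before() {
  local dir="$1" cutoff="$2"
  local best="" best_stamp="" f stamp
  for f in "$dir"/dealspace-????-??-??.db.enc; do
    [ -e "$f" ] || continue          # glob matched nothing
    stamp="${f##*/dealspace-}"       # strip directory and file prefix
    stamp="${stamp%.db.enc}"         # strip suffix, leaving YYYY-MM-DD
    if [[ "$stamp" < "$cutoff" && "$stamp" > "$best_stamp" ]]; then
      best="$f"
      best_stamp="$stamp"
    fi
  done
  [ -n "$best" ] && echo "$best"
}
```

If the compromise is dated 2026-02-13, `latest_backup_before /backups 2026-02-13` prints the most recent backup from 2026-02-12 or earlier.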

### Scenario E: Provider/Region Loss

**Symptoms:** Hostkey Zürich unavailable

**Recovery:**
1. Provision new VPS at alternate provider
2. Restore from off-site backup
3. Restore master key from secure storage
4. Deploy application
5. Update DNS

**Estimated time:** 24-48 hours

---

## 6. Key Management

### Master Key Recovery

The master key is **critical**. Without it, all encrypted data is permanently unrecoverable.

**Storage locations:**
1. Production server: Secure location
2. Secure backup: Separate secure storage (not with data backups)

**Recovery procedure:**
1. Retrieve the 32-byte master key from secure storage
2. Create the key file with owner-only permissions (`chmod 600`)
3. Verify the file length (must be exactly 32 bytes)
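
The length and permission checks can be scripted so they are not skipped under pressure. This is a minimal sketch that assumes the key is stored as 32 raw bytes (not hex or base64 text) and that GNU `stat` is available; the function name is illustrative.

```bash
# Verify a restored master key file: exactly 32 bytes, mode 600.
check_master_key() {
  local key_file="$1" size mode
  [ -f "$key_file" ] || { echo "missing: $key_file" >&2; return 1; }
  size=$(wc -c < "$key_file")
  mode=$(stat -c '%a' "$key_file")   # GNU stat; prints e.g. 600
  if [ "$size" -ne 32 ]; then
    echo "bad length: ${size} bytes (expected 32)" >&2
    return 1
  fi
  if [ "$mode" != "600" ]; then
    echo "bad permissions: ${mode} (expected 600)" >&2
    return 1
  fi
  echo "master key OK"
}
```

Run it against the restored key file before starting the service; a non-zero exit means the key should not be used yet.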

### Key Rotation (If Compromised)

If the master key may be compromised:

1. Generate new master key
2. Run re-encryption migration (decrypt with old key, re-encrypt with new)
3. Replace key file
4. Update secure storage with new key
5. Verify application functionality
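
Step 1 can be sketched as below, assuming the key is 32 raw bytes drawn from the kernel's random source. The function name is hypothetical. The re-encryption migration in step 2 is application-specific and is not shown.

```bash
# Write 32 random bytes to a new key file, refusing to overwrite an existing one.
generate_master_key() {
  local out="$1"
  if [ -e "$out" ]; then
    echo "refusing to overwrite ${out}" >&2
    return 1
  fi
  # The subshell keeps the restrictive umask from leaking into the caller;
  # umask 077 means the file is created with mode 600.
  ( umask 077 && head -c 32 /dev/urandom > "$out" )
}
```

Writing the new key to a separate path keeps the old key file intact until the migration has finished and step 3 replaces it.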

---

## 7. Recovery Procedures

### Pre-Recovery Checklist

- [ ] Incident documented and severity assessed
- [ ] Stakeholders notified
- [ ] Backup integrity verified
- [ ] Recovery environment prepared
- [ ] Master key accessible

### Database Restore from Backup

```bash
# Stop services
ssh root@82.24.174.112 "systemctl stop dealspace"

# Download and decrypt backup
# Place at /opt/dealspace/data/dealspace.db

# Start services
ssh root@82.24.174.112 "systemctl start dealspace"

# Verify (-f makes curl exit non-zero on an HTTP error status)
curl -fsS https://muskepo.com/health
```

---

## 8. Communication During Disaster

| Audience | Method | Message |
|----------|--------|---------|
| Clients | Email + status page | "Dealspace is experiencing technical difficulties. We expect to restore service by [time]." |
| Affected clients | Direct email | Per incident response plan if data affected |

---

## 9. Testing Schedule

| Test Type | Frequency | Last Performed | Next Due |
|-----------|-----------|----------------|----------|
| Backup verification | Monthly | Not yet | March 2026 |
| Database restore | Quarterly | Not yet | Q1 2026 |
| Full DR drill | Annually | Not yet | Q4 2026 |

### Backup Verification Procedure

```bash
# Monthly: Verify backups exist and are readable
# List available backups
ssh backup-server "ls -lh /backups/"
# Decrypt the latest backup to a scratch location, then verify its database integrity
```
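
"Backups exist" is only useful if the newest one is recent enough to meet the 24-hour RPO. The sketch below checks a backup file's age from its modification time. It assumes GNU `stat` and that the file's mtime reflects when the backup was taken, so it is best run on the host that stores the backups (a copy fetched with `scp` gets a fresh mtime unless `-p` is used).

```bash
# Succeed only if FILE was modified within the last MAX_AGE_HOURS hours.
# Usage: backup_is_fresh FILE [MAX_AGE_HOURS]   (default 24, matching the RPO)
backup_is_fresh() {
  local file="$1" max_age_hours="${2:-24}"
  local now mtime age_seconds
  [ -f "$file" ] || { echo "missing: $file" >&2; return 1; }
  now=$(date +%s)
  mtime=$(stat -c '%Y' "$file")   # GNU stat: modification time in epoch seconds
  age_seconds=$((now - mtime))
  if (( age_seconds > max_age_hours * 3600 )); then
    echo "stale: $((age_seconds / 3600))h old (limit ${max_age_hours}h)" >&2
    return 1
  fi
  echo "fresh: $((age_seconds / 3600))h old"
}
```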

### Restore Test Procedure

```bash
# Quarterly: Restore to test environment and verify

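# 1. Download backup to test environment
scp backup-server:/backups/dealspace-latest.db.enc /tmp/
# 2. Decrypt the backup to test.db
# 3. Verify database integrity (prints "ok" when the file is sound)
sqlite3 test.db "PRAGMA integrity_check;"
# 4. Verify data is readable (requires master key)
# 5. Document results
# 6. Clean up test files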
```
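
For a scripted restore test, the integrity check needs a pass/fail exit status instead of output read by eye. This is a minimal sketch assuming the `sqlite3` CLI is installed and the backup has already been decrypted; the function name is illustrative. The explicit file check matters because `sqlite3` treats a missing path as a new empty database, which would otherwise report `ok`.

```bash
# Run SQLite's integrity check and succeed only on the single-line answer "ok".
sqlite_integrity_ok() {
  local db="$1" result
  [ -f "$db" ] || { echo "missing: $db" >&2; return 1; }
  result=$(sqlite3 "$db" "PRAGMA integrity_check;" 2>&1) || {
    echo "sqlite3 failed: ${result}" >&2
    return 1
  }
  if [ "$result" != "ok" ]; then
    echo "integrity check reported problems:" >&2
    echo "$result" >&2
    return 1
  fi
  echo "ok"
}
```

A passing integrity check only shows the file structure is sound; reading encrypted fields with the master key remains a separate step.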

---

## 10. Post-Recovery Checklist

After any recovery:

- [ ] All services operational (health check passes)
- [ ] Data integrity verified (spot-check records)
- [ ] Logs reviewed for errors
- [ ] Clients notified if there was a visible outage
- [ ] Incident documented
- [ ] Post-mortem scheduled if the event was significant
- [ ] This plan updated if gaps were discovered

---

## 11. Quick Reference

### Critical Paths

| Item | Path |
|------|------|
| Database | /opt/dealspace/data/dealspace.db |
| Binary | /opt/dealspace/bin/dealspace |
| Master key | Secure location |

### Service Commands

```bash
# Status
ssh root@82.24.174.112 "systemctl status dealspace"

# Stop
ssh root@82.24.174.112 "systemctl stop dealspace"

# Start
ssh root@82.24.174.112 "systemctl start dealspace"

# Logs
ssh root@82.24.174.112 "journalctl -u dealspace -f"

# Health check (-f makes curl exit non-zero on an HTTP error status)
curl -fsS https://muskepo.com/health
```

---

*Document end*