clawd/memory/forge-server.md

# Forge Server — James's Future Home
*Last updated: 2026-02-04*
**This IS my primary home.** Migration completed 2026-02-04. IP swapped to 192.168.1.16.

---
## Hardware
| Component | Details |
|-----------|---------|
| **Machine** | Lenovo ThinkServer TS140 (second unit) |
| **CPU** | Intel Core i7-6700K @ 4.0GHz (4c/8t, HyperThreading) |
| **RAM** | 64GB DDR4 |
| **GPU** | NVIDIA GeForce GTX 970 4GB (compute 5.2, Maxwell) |
| **Storage** | 469 GB NVMe (28 GB used, 417 GB free, 7% used) |
| **Network** | Single NIC, enp10s0, 192.168.1.16/22 (was 192.168.3.138 before the 2026-02-04 IP swap) |
## OS & Kernel
- **OS:** Ubuntu 24.04.3 LTS (Server, headless)
- **Kernel:** 6.8.0-94-generic
- **Timezone:** America/New_York
## Network
- **IP:** 192.168.1.16 (swapped from old james on 2026-02-04)
- **Subnet:** 192.168.0.0/22
- **Gateway:** (standard home network)
- **DNS:** systemd-resolved (127.0.0.53, 127.0.0.54)
## Access
- **SSH:** Key auth only (password auth disabled, root login disabled)
- **Authorized keys:**
  - `james@server` — James (primary)
  - `johan@ubuntu2404` — Johan
  - `claude@macbook` — Johan's Mac
- **Sudo:** Passwordless (`johan ALL=(ALL) NOPASSWD:ALL`)
- **Linger:** Enabled (user services persist without active SSH)
## Security
- **Firewall (UFW):** Active
  - Rule 1: SSH (22/tcp) from anywhere
  - Rule 2: All traffic from LAN (192.168.0.0/22)
  - Default: deny incoming, allow outgoing
- **Fail2ban:** Active, monitoring sshd
- **Unattended upgrades:** Enabled
- **Sysctl hardening:** `rp_filter` and TCP SYN cookies enabled
- **Disabled services:** snapd, ModemManager
- **Still enabled (fix later):** cloud-init
## GPU Stack
- **Driver:** nvidia-headless-580 + nvidia-utils-580 (v580.126.09)
- **CUDA:** 13.0 (reported by nvidia-smi)
- **Persistence mode:** Enabled
- **VRAM:** 4096 MiB total, ~2.2 GB used by OCR model
- **Temp:** ~44-51°C idle
- **CRITICAL:** GTX 970 = compute capability 5.2 (Maxwell)
  - PyTorch ≤ 2.2.x only (newer releases drop sm_52 support)
  - Must use CUDA 11.8 wheels
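The sm_52 gate above can be expressed as a small check. A minimal sketch, assuming a capability tuple in the shape `torch.cuda.get_device_capability()` returns and arch strings in the shape `torch.cuda.get_arch_list()` returns:

```python
def wheel_supports(capability, supported_archs):
    """True if a wheel compiled for supported_archs can run on a GPU
    with the given (major, minor) compute capability.

    capability:      tuple like (5, 2) for the GTX 970 (Maxwell)
    supported_archs: arch strings like {"sm_50", "sm_52", "sm_60"}
    """
    major, minor = capability
    return f"sm_{major}{minor}" in supported_archs
```

This is why the environment pins PyTorch 2.2.x: its cu118 wheels still ship sm_52 kernels, while later releases do not.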
## Python Environment
- **Path:** `/home/johan/ocr-env/`
- **Python:** 3.12.3
- **Key packages:**
  - PyTorch 2.2.2+cu118
  - torchvision 0.17.2+cu118
  - transformers 5.0.1.dev0 (installed from git, has GLM-OCR support)
  - accelerate 1.12.0
  - FastAPI 0.128.0
  - uvicorn 0.40.0
  - Pillow (for image processing)
- **Monkey-patch:** `transformers/utils/generic.py` patched for `torch.is_autocast_enabled()` compat with PyTorch 2.2
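The shape of that patch, sketched with a stand-in function instead of the real `torch.is_autocast_enabled` (the actual edit lives in `transformers/utils/generic.py` and may differ in detail): newer transformers passes a `device_type` argument that the PyTorch 2.2 signature does not accept, so the shim swallows it.

```python
import functools

def accept_optional_device(fn):
    """Wrap a zero-argument API so callers written against the newer
    signature (which passes a device_type) still work."""
    @functools.wraps(fn)
    def wrapper(device_type=None):
        return fn()  # PyTorch 2.2 takes no arguments; drop the extra one
    return wrapper

# Stand-in for torch.is_autocast_enabled under PyTorch 2.2:
def is_autocast_enabled():
    return False

is_autocast_enabled = accept_optional_device(is_autocast_enabled)
```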
## Services
### OCR Service (GLM-OCR)
- **Port:** 8090 (0.0.0.0)
- **Service:** `systemctl --user status ocr-service`
- **Unit file:** `~/.config/systemd/user/ocr-service.service`
- **Source:** `/home/johan/ocr-service/server.py`
- **Model:** `/home/johan/models/glm-ocr` (zai-org/GLM-OCR, 2.47 GB)
**Endpoints:**
- `GET /health` — status, GPU memory, model info
- `POST /ocr` — single image (multipart: file + prompt + max_tokens)
- `POST /ocr/batch` — multiple images
**Performance:**
- Model load: ~1.4s (stays warm in VRAM)
- Small images: ~2s
- Full-page documents: ~25s (auto-resized to 1280px max)
- VRAM: 2.2 GB idle, peaks ~2.8 GB during inference
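The 1280px cap suggests a longest-side scaling rule. A sketch of that arithmetic (assumed; the server's actual resize code may differ):

```python
def fit_within(width, height, max_side=1280):
    """Scale (width, height) down so the longest side is at most
    max_side, preserving aspect ratio; never upscales."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return round(width * scale), round(height * scale)
```

For example, a 2550x3300 scan (US Letter at 300 DPI) scales to 989x1280, roughly a sixfold pixel reduction before inference.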
**Usage (from the LAN):**
```bash
# Health check
curl http://192.168.1.16:8090/health

# OCR a single image
curl -X POST http://192.168.1.16:8090/ocr -F "file=@image.png"

# OCR with a custom prompt
curl -X POST http://192.168.1.16:8090/ocr -F "file=@doc.png" -F "prompt=Extract all text:"
```
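The same endpoints are reachable from Python using only the standard library. A minimal health-check sketch (the default base URL assumes the post-swap address from the Network section):

```python
import json
import urllib.request

def ocr_health(base="http://192.168.1.16:8090", timeout=10):
    """Fetch and parse the service's /health JSON
    (status, GPU memory, model info)."""
    with urllib.request.urlopen(f"{base}/health", timeout=timeout) as resp:
        return json.load(resp)
```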
### Ollama
- **Port:** 11434 (localhost only)
- **Version:** 0.15.4
- **Status:** Installed, waiting for v0.15.5 for native GLM-OCR support
- **Note:** Not currently used — Python/transformers handles OCR directly
## Migration Plan: james → forge (completed 2026-02-04; checklist kept for reference)
### What moves:
- [ ] OpenClaw gateway (port 18789)
- [ ] Signal-cli daemon (port 8080)
- [ ] Proton Mail Bridge (ports 1143, 1025)
- [ ] Mail Bridge / Message Center (port 8025)
- [ ] Message Bridge / WhatsApp (port 8030)
- [ ] Dashboard (port 9200)
- [ ] Headless Chrome (port 9223)
- [ ] All workspace files (`~/clawd/`)
- [ ] Document management system
- [ ] Cron jobs and heartbeat config
- [ ] SSH keys and configs
### What stays on james (or TBD):
- Legacy configs / backups
- SMB shares (maybe move too?)
### Pre-migration checklist:
- [ ] Install Node.js 22 on forge
- [ ] Install OpenClaw on forge
- [ ] Set up Signal-cli on forge
- [ ] Set up Proton Mail Bridge on forge
- [ ] Set up message-bridge (WhatsApp) on forge
- [ ] Set up headless Chrome on forge
- [ ] Copy workspace (`~/clawd/`) to forge
- [ ] Copy documents system to forge
- [ ] Test all services on forge before switchover
- [ ] Update DNS/Caddy to point to forge IP
- [ ] Update TOOLS.md, MEMORY.md with new IPs
- [ ] Verify GPU OCR still works alongside gateway
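For the "test all services" step, a quick TCP reachability sweep over the ports listed above (plus the OCR service on 8090). The default host is the post-swap IP and is an assumption; pass whatever address forge answers on:

```python
import socket

# Ports from the "What moves" checklist, plus OCR on 8090
SERVICE_PORTS = [18789, 8080, 1143, 1025, 8025, 8030, 9200, 9223, 8090]

def is_open(host, port, timeout=1.0):
    """True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def sweep(host="192.168.1.16"):
    """Map each expected service port to its reachability."""
    return {port: is_open(host, port) for port in SERVICE_PORTS}
```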
### Advantages of forge over james:
- **CPU:** i7-6700K (4c/8t, 4.0GHz) vs Xeon E3-1225v3 (4c/4t, 3.2GHz) — faster + HT
- **RAM:** 64GB vs 16GB — massive headroom
- **GPU:** GTX 970 for local ML inference
- **Storage:** 469GB NVMe vs 916GB SSD — less space but faster
- **Network:** Same /22 subnet, same LAN access to everything
### Risks:
- Storage is smaller (469G vs 916G) — may need to be selective about what moves
- GPU driver + gateway on same box — monitor for resource conflicts
- Signal-cli needs to re-link or transfer DB
- WhatsApp bridge needs QR re-link

---
## Directory Layout
```
/home/johan/
├── ocr-env/                      # Python venv (PyTorch + transformers)
├── ocr-service/                  # FastAPI OCR server
│   └── server.py
├── models/
│   └── glm-ocr/                  # GLM-OCR weights (2.47 GB)
├── .config/
│   └── systemd/user/
│       └── ocr-service.service
└── .ssh/
    └── authorized_keys
```
## Key Constraints
1. **PyTorch version locked to 2.2.x** — GTX 970 sm_52 not supported in newer
2. **CUDA 11.8 wheels only** — matches PyTorch 2.2 requirement
3. **Max image dimension 1280px** — larger causes excessive VRAM/time on GTX 970
4. **transformers from git** — stock pip version doesn't have GLM-OCR model class
5. **Monkey-patch required** — `torch.is_autocast_enabled()` API changed in PyTorch 2.4