chore: auto-commit uncommitted changes

2026-02-13 05:30:17 -05:00 · 2026-02-13 05:30:17 -05:00 · c7ac6b6e38
parent 7eb0d94559
commit c7ac6b6e38
3 changed files with 112 additions and 28 deletions
--- a/memory/2026-02-13.md
+++ b/memory/2026-02-13.md
@@ -1,27 +1,111 @@
-# 2026-02-13 (Thursday night / Friday early AM)
+# 2026-02-13 (Friday)

-## Local Models Conversation (continued from previous session)
+## GPU Research Session (4:14-4:48 AM)

-### Context
-Johan wants local models not just for coding but for EVERYTHING — a "chief of staff" model.
-- inou development, Kaseya projects, Sophia medical, general knowledge
-- All his "virtual employees" should get smarter over time
-- This is NOT just a coding subagent — it's a general-purpose assistant
+Johan explored GPU options for local AI/CoS. Key progression:

-### Key Discussion Points (previous session → this one)
-1. **3090 GPU upgrade for forge** — ~$850-900 total (used 3090 + PSU), runs 32B models at 25-35 tok/s
-2. **Fine-tuning transfers across models** — correction dataset is the asset, not the weights
-3. **OpenClaw stays on Opus** — person-knowledge, memory, judgment, routing
-4. **Local model gets coding DNA via LoRA** — knows Johan's coding style
-5. **I contradicted myself** — said local model "doesn't know you" then listed fine-tuning benefits. Johan caught it. Corrected: local model DOES know him as a coder via fine-tuning.
+1. **Started with RTX A6000 48GB ($4,599)** — too expensive for the value
+2. **RTX 8000 48GB (~$2,000)** — same VRAM, older/slower, better price
+3. **RTX 3090 24GB (~$850)** — faster than RTX 8000 but only 24GB
+4. **Tradeoff crystallized:** 3090 = fast but limited VRAM, RTX 8000 = slower but can run 70B models
+5. **Johan's concern:** slow assistant = "I'll do it myself" — speed matters for adoption
+6. **Real motivation revealed:** NOT cost savings — he wants **persistent memory/consistent CoS**. Tired of amnesia.
+7. **Cloud GPU rental** (RunPod, Vast.ai) works for periodic LoRA training without buying hardware
+8. **Conclusion direction:** Better memory pipeline (RAG + nightly distillation) > buying GPU hardware
+   - Distillation/memory work is cheap model work (Qwen, K2.5, Gemini Flash)
+   - Opus stays for live conversation judgment
+   - No hardware purchase needed — fix the software/memory problem instead

-### NEW this session: "Chief of Staff" vision
-- Johan clarified scope: not just coding, but "everything"
-- Wants model that handles inou, Kaseya (many projects), Sophia, general knowledge
-- I presented two paths: RAG-heavy (works on 3090) vs bigger model (needs more VRAM)
-- **Open question:** Does he prioritize reasoning-with-context (RAG) or built-in knowledge (bigger model)?
-- Conversation was cut by compaction — needs continuation
+### Key insight from Johan
+"It's not about money! It's about consistent memory" — the amnesia problem is his #1 frustration with AI assistants.

-### Infrastructure
-- Mail bridge returning empty on /messages/new (0 bytes) — might need investigation
-- Network fine: ping 1.1.1.1 → 4/4, ~34ms avg
+## Qwen2.5-32B Assessment
+- Compared to Opus 4: B/B+ vs A+ (solid mid-level vs senior engineer)
+- Compared to Opus 3.5: closer but still noticeable gap
+- No local model today is good enough for full autonomous CoS role
+- 6 months from now: maybe (open-source improving fast)
+
+## Alex Finn Post (@AlexFinn/2021992770370764878)
+- Guide on running local models via LM Studio — 1,891 likes
+- Good for basics but focused on cost savings, not memory persistence
+
+## Cloudflare Agent Content Negotiation
+- Cloudflare adding `Accept: text/markdown` at the edge for AI agents
+- **Added to inou TODO** (`/home/johan/dev/inou/docs/TODO.md`)
+- Relevant: inou should be agent-first, serve structured markdown to AI assistants
+- Competitive differentiator vs anti-bot health platforms
+
+## Email Triage
+- 1 new email: Amazon shipping (EZVALO motion sensor night lights, $20.32, arriving today)
+- Updated delivery tracker, trashed email
+- **MC performance issue:** queries taking 15-16 seconds consistently — needs investigation
+
+## RTX 5090 Scam
+- Johan found $299 "RTX 5090" on eBay — zero feedback seller, obvious scam. Warned him off.
+
+## Webchat Bug
+- Johan's message got swallowed (NO_REPLY triggered incorrectly), he had to resend
+
+## Cron Jobs → Kimi K2.5 on Fireworks
+- Switched 7 cron jobs from Opus to Kimi K2.5 (`fireworks/accounts/fireworks/models/kimi-k2p5`)
+- K2.5 Watchdog, claude-usage-hourly, git-audit-hourly, dashboard usage, git-audit-daily, update check, weekly memory synthesis
+- Qwen 2.5 32B deprecated/removed from Fireworks — only Qwen3 models remain
+- Qwen3 235B MoE had cold-start 503s (serverless scaling to zero) — unreliable
+- K2.5 stays warm (popular model), ~9s runs, proven in browser agent
+- Fireworks provider registered in OpenClaw config with two models: K2.5 (primary) + Qwen3 235B (backup)
+
+## Fireworks Blog Post
+- Fireworks published blog about OpenClaw + Fireworks integration
+- Pitch: use open models for routine tasks (10x cheaper), Opus for judgment
+- Validates our exact setup
+
+## Shannon VPS — New Credentials (from Hostkey/Maxim)
+- IP: 82.24.174.112, root / K_cX1aFThB
+- **DO NOT disable password login** until Johan confirms SSH key access (lesson learned from Feb 11 lockout)
+- Task: Install Shannon (KeygraphHQ/shannon) and test against inou portal ONLY
+- Server ID: 53643, HostKey panel: https://panel.hostkey.com/controlpanel.html?key=639551e73029b90f-c061af4412951b2e
+
+## Fire Tablet Alert Dashboard (new project)
+- Johan doesn't see Signal alerts reliably — wants a spare Fire tablet (Fully Kiosk) as alert display
+- Requirements: clock, calendar, notification push with sound ("modest pling")
+- Two approaches discussed: standalone web page (preferred) vs Home Assistant integration
+- Johan OK with me coding it or using HA
+- Plan: simple HTML dashboard on forge, SSE for push alerts, Fully Kiosk loads URL
+
+## GPU Purchase Decision
+- No GPU purchase yet — persistent memory problem better solved with software (RAG + nightly distillation)
+- If buying: RTX 8000 48GB (~$2K) best option for fine-tuning/70B models
+- Cloud GPU (RunPod/Vast.ai) viable for periodic LoRA training
+
+## MC Performance Issue
+- Message Center queries taking 15-16 seconds consistently — needs investigation
+
+## Alert Dashboard — Port Conflict Fixed
+- Subagent built alert-dashboard (Node.js/Express, SSE, analog clock, calendar, alert feed)
+- Initially deployed on port 9201 — **WRONG, that's DocSys's port**
+- Moved to port **9202**, restored DocSys on 9201
+- Service: `alert-dashboard.service` (systemd user, enabled)
+- Source: `/home/johan/dev/alert-dashboard/`
+- API: GET /, GET /api/alerts, POST /api/alerts, GET /api/alerts/stream (SSE)
+- Fully Kiosk URL: `http://192.168.1.16:9202`
+
+## Shannon VPS — Setup Progress
+- SSH key from forge works ✅ (root@82.24.174.112)
+- Password login: root / K_cX1aFThB — **LEFT ENABLED per instructions**
+- Repo cloned to /opt/shannon
+- Docker build started (still building when subagent finished)
+- TODO: Check build completion, run portal test against inou.com
+
+## Kaseya Device Policy Change (IMPORTANT)
+- CISO Jason Manar announced: only Kaseya-issued IT-managed devices on corporate network
+- Personal/BYO devices → BYO network only, no VPN access
+- Rolling out "starting tomorrow" (Feb 14) over coming weeks
+- Johan currently uses personal Mac Mini for EVERYTHING (Kaseya + inou)
+- Has a Kaseya XPS14 laptop he hates
+- **Recommended:** Request a MacBook Pro (CTO-level ask), keep Mac Mini for inou on BYO network
+- Johan is upset about this — impacts his entire workflow
+
+## Cron Job Fixes
+- git-audit-hourly timeout bumped 60s → 120s (K2.5 needs more time for git operations)
+- claude-usage-hourly had stale Qwen3 235B session — will self-correct on next run
+- K2.5 Watchdog hit session lock error — transient from concurrent subagent spawns
--- a/memory/claude-usage.json
+++ b/memory/claude-usage.json
@@ -1,9 +1,9 @@
 {
-  "last_updated": "2026-02-13T09:19:37.432834Z",
+  "last_updated": "2026-02-13T10:00:01.902264Z",
  "source": "api",
-  "session_percent": 12,
-  "session_resets": "2026-02-13T10:00:00.397771+00:00",
-  "weekly_percent": 63,
-  "weekly_resets": "2026-02-14T19:00:00.397796+00:00",
+  "session_percent": 0,
+  "session_resets": null,
+  "weekly_percent": 64,
+  "weekly_resets": "2026-02-14T18:59:59.832971+00:00",
  "sonnet_percent": 0
 }
--- a/memory/git-audit-lastfull.txt
+++ b/memory/git-audit-lastfull.txt
@@ -1 +1 @@
-1770901262
+1770975061