diff --git a/memory/2026-02-13.md b/memory/2026-02-13.md
index 615466e..c1a0818 100644
--- a/memory/2026-02-13.md
+++ b/memory/2026-02-13.md
@@ -1,27 +1,111 @@
-# 2026-02-13 (Thursday night / Friday early AM)
+# 2026-02-13 (Friday)
-## Local Models Conversation (continued from previous session)
+## GPU Research Session (4:14-4:48 AM)
-### Context
-Johan wants local models not just for coding but for EVERYTHING — a "chief of staff" model.
-- inou development, Kaseya projects, Sophia medical, general knowledge
-- All his "virtual employees" should get smarter over time
-- This is NOT just a coding subagent — it's a general-purpose assistant
+Johan explored GPU options for local AI/CoS. Key progression:
-### Key Discussion Points (previous session → this one)
-1. **3090 GPU upgrade for forge** — ~$850-900 total (used 3090 + PSU), runs 32B models at 25-35 tok/s
-2. **Fine-tuning transfers across models** — correction dataset is the asset, not the weights
-3. **OpenClaw stays on Opus** — person-knowledge, memory, judgment, routing
-4. **Local model gets coding DNA via LoRA** — knows Johan's coding style
-5. **I contradicted myself** — said local model "doesn't know you" then listed fine-tuning benefits. Johan caught it. Corrected: local model DOES know him as a coder via fine-tuning.
+1. **Started with RTX A6000 48GB ($4,599)** — too expensive for the value
+2. **RTX 8000 48GB (~$2,000)** — same VRAM, older/slower, better price
+3. **RTX 3090 24GB (~$850)** — faster than RTX 8000 but only 24GB
+4. **Tradeoff crystallized:** 3090 = fast but limited VRAM, RTX 8000 = slower but can run 70B models
+5. **Johan's concern:** slow assistant = "I'll do it myself" — speed matters for adoption
+6. **Real motivation revealed:** NOT cost savings — he wants **persistent memory/consistent CoS**. Tired of amnesia.
+7. **Cloud GPU rental** (RunPod, Vast.ai) works for periodic LoRA training without buying hardware
+8. **Conclusion direction:** Better memory pipeline (RAG + nightly distillation) > buying GPU hardware
+   - Distillation/memory work is cheap model work (Qwen, K2.5, Gemini Flash)
+   - Opus stays for live conversation judgment
+   - No hardware purchase needed — fix the software/memory problem instead
-### NEW this session: "Chief of Staff" vision
-- Johan clarified scope: not just coding, but "everything"
-- Wants model that handles inou, Kaseya (many projects), Sophia, general knowledge
-- I presented two paths: RAG-heavy (works on 3090) vs bigger model (needs more VRAM)
-- **Open question:** Does he prioritize reasoning-with-context (RAG) or built-in knowledge (bigger model)?
-- Conversation was cut by compaction — needs continuation
+### Key insight from Johan
+"It's not about money! It's about consistent memory" — the amnesia problem is his #1 frustration with AI assistants.
-### Infrastructure
-- Mail bridge returning empty on /messages/new (0 bytes) — might need investigation
-- Network fine: ping 1.1.1.1 → 4/4, ~34ms avg
+## Qwen2.5-32B Assessment
+- Compared to Opus 4: B/B+ vs A+ (solid mid-level vs senior engineer)
+- Compared to Opus 3.5: closer but still noticeable gap
+- No local model today is good enough for full autonomous CoS role
+- 6 months from now: maybe (open-source improving fast)
+
+## Alex Finn Post (@AlexFinn/2021992770370764878)
+- Guide on running local models via LM Studio — 1,891 likes
+- Good for basics but focused on cost savings, not memory persistence
+
+## Cloudflare Agent Content Negotiation
+- Cloudflare adding `Accept: text/markdown` at the edge for AI agents
+- **Added to inou TODO** (`/home/johan/dev/inou/docs/TODO.md`)
+- Relevant: inou should be agent-first, serve structured markdown to AI assistants
+- Competitive differentiator vs anti-bot health platforms
+
+## Email Triage
+- 1 new email: Amazon shipping (EZVALO motion sensor night lights, $20.32, arriving today)
+- Updated delivery tracker, trashed email
+- **MC performance issue:** queries taking 15-16 seconds consistently — needs investigation
+
+## RTX 5090 Scam
+- Johan found $299 "RTX 5090" on eBay — zero feedback seller, obvious scam. Warned him off.
+
+## Webchat Bug
+- Johan's message got swallowed (NO_REPLY triggered incorrectly), he had to resend
+
+## Cron Jobs → Kimi K2.5 on Fireworks
+- Switched 7 cron jobs from Opus to Kimi K2.5 (`fireworks/accounts/fireworks/models/kimi-k2p5`)
+- K2.5 Watchdog, claude-usage-hourly, git-audit-hourly, dashboard usage, git-audit-daily, update check, weekly memory synthesis
+- Qwen 2.5 32B deprecated/removed from Fireworks — only Qwen3 models remain
+- Qwen3 235B MoE had cold-start 503s (serverless scaling to zero) — unreliable
+- K2.5 stays warm (popular model), ~9s runs, proven in browser agent
+- Fireworks provider registered in OpenClaw config with two models: K2.5 (primary) + Qwen3 235B (backup)
+
+## Fireworks Blog Post
+- Fireworks published blog about OpenClaw + Fireworks integration
+- Pitch: use open models for routine tasks (10x cheaper), Opus for judgment
+- Validates our exact setup
+
+## Shannon VPS — New Credentials (from Hostkey/Maxim)
+- IP: 82.24.174.112, root / K_cX1aFThB
+- **DO NOT disable password login** until Johan confirms SSH key access (lesson learned from Feb 11 lockout)
+- Task: Install Shannon (KeygraphHQ/shannon) and test against inou portal ONLY
+- Server ID: 53643, HostKey panel: https://panel.hostkey.com/controlpanel.html?key=639551e73029b90f-c061af4412951b2e
+
+## Fire Tablet Alert Dashboard (new project)
+- Johan doesn't see Signal alerts reliably — wants a spare Fire tablet (Fully Kiosk) as alert display
+- Requirements: clock, calendar, notification push with sound ("modest pling")
+- Two approaches discussed: standalone web page (preferred) vs Home Assistant integration
+- Johan OK with me coding it or using HA
+- Plan: simple HTML dashboard on forge, SSE for push alerts, Fully Kiosk loads URL
+
+## GPU Purchase Decision
+- No GPU purchase yet — persistent memory problem better solved with software (RAG + nightly distillation)
+- If buying: RTX 8000 48GB (~$2K) best option for fine-tuning/70B models
+- Cloud GPU (RunPod/Vast.ai) viable for periodic LoRA training
+
+## MC Performance Issue
+- Message Center queries taking 15-16 seconds consistently — needs investigation
+
+## Alert Dashboard — Port Conflict Fixed
+- Subagent built alert-dashboard (Node.js/Express, SSE, analog clock, calendar, alert feed)
+- Initially deployed on port 9201 — **WRONG, that's DocSys's port**
+- Moved to port **9202**, restored DocSys on 9201
+- Service: `alert-dashboard.service` (systemd user, enabled)
+- Source: `/home/johan/dev/alert-dashboard/`
+- API: GET /, GET /api/alerts, POST /api/alerts, GET /api/alerts/stream (SSE)
+- Fully Kiosk URL: `http://192.168.1.16:9202`
+
+## Shannon VPS — Setup Progress
+- SSH key from forge works ✅ (root@82.24.174.112)
+- Password login: root / K_cX1aFThB — **LEFT ENABLED per instructions**
+- Repo cloned to /opt/shannon
+- Docker build started (still building when subagent finished)
+- TODO: Check build completion, run portal test against inou.com
+
+## Kaseya Device Policy Change (IMPORTANT)
+- CISO Jason Manar announced: only Kaseya-issued IT-managed devices on corporate network
+- Personal/BYO devices → BYO network only, no VPN access
+- Rolling out "starting tomorrow" (Feb 14) over coming weeks
+- Johan currently uses personal Mac Mini for EVERYTHING (Kaseya + inou)
+- Has a Kaseya XPS14 laptop he hates
+- **Recommended:** Request a MacBook Pro (CTO-level ask), keep Mac Mini for inou on BYO network
+- Johan is upset about this — impacts his entire workflow
+
+## Cron Job Fixes
+- git-audit-hourly timeout bumped 60s → 120s (K2.5 needs more time for git operations)
+- claude-usage-hourly had stale Qwen3 235B session — will self-correct on next run
+- K2.5 Watchdog hit session lock error — transient from concurrent subagent spawns
diff --git a/memory/claude-usage.json b/memory/claude-usage.json
index 93b1e01..7a322c9 100644
--- a/memory/claude-usage.json
+++ b/memory/claude-usage.json
@@ -1,9 +1,9 @@
 {
-  "last_updated": "2026-02-13T09:19:37.432834Z",
+  "last_updated": "2026-02-13T10:00:01.902264Z",
   "source": "api",
-  "session_percent": 12,
-  "session_resets": "2026-02-13T10:00:00.397771+00:00",
-  "weekly_percent": 63,
-  "weekly_resets": "2026-02-14T19:00:00.397796+00:00",
+  "session_percent": 0,
+  "session_resets": null,
+  "weekly_percent": 64,
+  "weekly_resets": "2026-02-14T18:59:59.832971+00:00",
   "sonnet_percent": 0
 }
\ No newline at end of file
diff --git a/memory/git-audit-lastfull.txt b/memory/git-audit-lastfull.txt
index b8bae05..2fe5c55 100644
--- a/memory/git-audit-lastfull.txt
+++ b/memory/git-audit-lastfull.txt
@@ -1 +1 @@
-1770901262
+1770975061
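
The log also records `alert-dashboard.service` as an enabled systemd user unit. A plausible layout for such a unit is sketched below; the `ExecStart` command, entry-point filename, and `PORT` variable are assumptions — only the unit name, source directory, and port 9202 come from the log.

```ini
# ~/.config/systemd/user/alert-dashboard.service (hypothetical layout)
[Unit]
Description=Fire tablet alert dashboard (SSE)

[Service]
# Entry-point filename and PORT variable are assumptions
ExecStart=/usr/bin/node /home/johan/dev/alert-dashboard/server.js
Environment=PORT=9202
Restart=on-failure

[Install]
WantedBy=default.target
```

Enabled as a user unit (`systemctl --user enable --now alert-dashboard.service`), it survives logout only if lingering is enabled for the account.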