chore: auto-commit uncommitted changes

James 2026-02-13 05:30:17 -05:00
parent 7eb0d94559
commit c7ac6b6e38
3 changed files with 112 additions and 28 deletions

@@ -1,27 +1,111 @@
# 2026-02-13 (Thursday night / Friday early AM)
# 2026-02-13 (Friday)
## Local Models Conversation (continued from previous session)
## GPU Research Session (4:14-4:48 AM)
### Context
Johan wants local models not just for coding but for EVERYTHING — a "chief of staff" model.
- inou development, Kaseya projects, Sophia medical, general knowledge
- All his "virtual employees" should get smarter over time
- This is NOT just a coding subagent — it's a general-purpose assistant
Johan explored GPU options for local AI/CoS. Key progression:
### Key Discussion Points (previous session → this one)
1. **3090 GPU upgrade for forge** — ~$850-900 total (used 3090 + PSU), runs 32B models at 25-35 tok/s
2. **Fine-tuning transfers across models** — correction dataset is the asset, not the weights
3. **OpenClaw stays on Opus** — person-knowledge, memory, judgment, routing
4. **Local model gets coding DNA via LoRA** — knows Johan's coding style
5. **I contradicted myself** — said local model "doesn't know you" then listed fine-tuning benefits. Johan caught it. Corrected: local model DOES know him as a coder via fine-tuning.
1. **Started with RTX A6000 48GB ($4,599)** — too expensive for the value
2. **RTX 8000 48GB (~$2,000)** — same VRAM, older/slower, better price
3. **RTX 3090 24GB (~$850)** — faster than RTX 8000 but only 24GB
4. **Tradeoff crystallized:** 3090 = fast but limited VRAM, RTX 8000 = slower but can run 70B models
5. **Johan's concern:** slow assistant = "I'll do it myself" — speed matters for adoption
6. **Real motivation revealed:** NOT cost savings — he wants **persistent memory/consistent CoS**. Tired of amnesia.
7. **Cloud GPU rental** (RunPod, Vast.ai) works for periodic LoRA training without buying hardware
8. **Conclusion direction:** Better memory pipeline (RAG + nightly distillation) > buying GPU hardware
- Distillation/memory work is cheap model work (Qwen, K2.5, Gemini Flash)
- Opus stays for live conversation judgment
- No hardware purchase needed — fix the software/memory problem instead
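A rough weights-only estimate makes the 24GB vs 48GB tradeoff concrete. This is a sketch under assumptions that are not in the notes above: 4-bit quantization (0.5 bytes per parameter) and a hypothetical 20% overhead factor for KV cache and activations. Real usage depends on context length and runtime.

```python
def estimate_vram_gb(params_billion: float, bits_per_param: int = 4,
                     overhead: float = 1.2) -> float:
    """Rough VRAM estimate in GB: quantized weight size times an assumed overhead."""
    # 1B parameters at 8 bits per parameter is roughly 1 GB of weights.
    weight_gb = params_billion * bits_per_param / 8
    return weight_gb * overhead


def fits(params_billion: float, vram_gb: float, **kwargs) -> bool:
    """True if the estimated footprint fits in the given VRAM budget."""
    return estimate_vram_gb(params_billion, **kwargs) <= vram_gb


if __name__ == "__main__":
    for size in (32, 70):
        for card, vram in (("RTX 3090", 24), ("RTX 8000", 48)):
            print(f"{size}B on {card} ({vram}GB): fits={fits(size, vram)}")
```

Under these assumptions a 32B model fits on a 24GB card, while a 70B model only fits on a 48GB card, which matches the tradeoff in item 4.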
### NEW this session: "Chief of Staff" vision
- Johan clarified scope: not just coding, but "everything"
- Wants model that handles inou, Kaseya (many projects), Sophia, general knowledge
- I presented two paths: RAG-heavy (works on 3090) vs bigger model (needs more VRAM)
- **Open question:** Does he prioritize reasoning-with-context (RAG) or built-in knowledge (bigger model)?
- Conversation was cut by compaction — needs continuation
### Key insight from Johan
"It's not about money! It's about consistent memory" — the amnesia problem is his #1 frustration with AI assistants.
### Infrastructure
- Mail bridge returning empty on /messages/new (0 bytes) — might need investigation
- Network fine: ping 1.1.1.1 → 4/4, ~34ms avg
## Qwen2.5-32B Assessment
- Compared to Opus 4: B/B+ vs A+ (solid mid-level vs senior engineer)
- Compared to Opus 3.5: closer but still noticeable gap
- No local model today is good enough for full autonomous CoS role
- 6 months from now: maybe (open-source improving fast)
## Alex Finn Post (@AlexFinn/2021992770370764878)
- Guide on running local models via LM Studio — 1,891 likes
- Good for basics but focused on cost savings, not memory persistence
## Cloudflare Agent Content Negotiation
- Cloudflare adding `Accept: text/markdown` at the edge for AI agents
- **Added to inou TODO** (`/home/johan/dev/inou/docs/TODO.md`)
- Relevant: inou should be agent-first, serve structured markdown to AI assistants
- Competitive differentiator vs anti-bot health platforms
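A minimal sketch of how inou could pick a representation from the `Accept` header. The function names are hypothetical and this is not inou's or Cloudflare's actual implementation. It only checks whether `text/markdown` is listed with a non-zero q-value and does not rank it against other media types.

```python
def wants_markdown(accept_header: str) -> bool:
    """Return True if the Accept header lists text/markdown with q > 0."""
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        if fields[0].lower() != "text/markdown":
            continue
        q = 1.0  # q defaults to 1 when the parameter is absent
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        if q > 0:
            return True
    return False


def choose_representation(accept_header: str) -> str:
    """Serve markdown to clients that ask for it, HTML to everyone else."""
    return "text/markdown" if wants_markdown(accept_header) else "text/html"
```

A browser sending `text/html,application/xhtml+xml` would still get HTML, while an agent sending `text/markdown, text/html;q=0.8` would get markdown.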
## Email Triage
- 1 new email: Amazon shipping (EZVALO motion sensor night lights, $20.32, arriving today)
- Updated delivery tracker, trashed email
- **MC (Message Center) performance issue:** queries consistently take 15-16 seconds and need investigation
## RTX 5090 Scam
- Johan found $299 "RTX 5090" on eBay — zero feedback seller, obvious scam. Warned him off.
## Webchat Bug
- Johan's message got swallowed (NO_REPLY triggered incorrectly), he had to resend
## Cron Jobs → Kimi K2.5 on Fireworks
- Switched 7 cron jobs from Opus to Kimi K2.5 (`fireworks/accounts/fireworks/models/kimi-k2p5`)
- K2.5 Watchdog, claude-usage-hourly, git-audit-hourly, dashboard usage, git-audit-daily, update check, weekly memory synthesis
- Qwen 2.5 32B deprecated/removed from Fireworks — only Qwen3 models remain
- Qwen3 235B MoE had cold-start 503s (serverless scaling to zero) — unreliable
- K2.5 stays warm (popular model), ~9s runs, proven in browser agent
- Fireworks provider registered in OpenClaw config with two models: K2.5 (primary) + Qwen3 235B (backup)
## Fireworks Blog Post
- Fireworks published blog about OpenClaw + Fireworks integration
- Pitch: use open models for routine tasks (10x cheaper), Opus for judgment
- Validates our exact setup
## Shannon VPS — New Credentials (from Hostkey/Maxim)
- IP: 82.24.174.112, root / K_cX1aFThB
- **DO NOT disable password login** until Johan confirms SSH key access (lesson learned from Feb 11 lockout)
- Task: Install Shannon (KeygraphHQ/shannon) and test against inou portal ONLY
- Server ID: 53643, HostKey panel: https://panel.hostkey.com/controlpanel.html?key=639551e73029b90f-c061af4412951b2e
## Fire Tablet Alert Dashboard (new project)
- Johan doesn't see Signal alerts reliably — wants a spare Fire tablet (Fully Kiosk) as alert display
- Requirements: clock, calendar, notification push with sound ("modest pling")
- Two approaches discussed: standalone web page (preferred) vs Home Assistant integration
- Johan OK with me coding it or using HA
- Plan: simple HTML dashboard on forge, SSE for push alerts, Fully Kiosk loads URL
## GPU Purchase Decision
- No GPU purchase yet — persistent memory problem better solved with software (RAG + nightly distillation)
- If buying: RTX 8000 48GB (~$2K) best option for fine-tuning/70B models
- Cloud GPU (RunPod/Vast.ai) viable for periodic LoRA training
## MC Performance Issue
- Message Center queries taking 15-16 seconds consistently — needs investigation
## Alert Dashboard — Port Conflict Fixed
- Subagent built alert-dashboard (Node.js/Express, SSE, analog clock, calendar, alert feed)
- Initially deployed on port 9201 — **WRONG, that's DocSys's port**
- Moved to port **9202**, restored DocSys on 9201
- Service: `alert-dashboard.service` (systemd user, enabled)
- Source: `/home/johan/dev/alert-dashboard/`
- API: GET /, GET /api/alerts, POST /api/alerts, GET /api/alerts/stream (SSE)
- Fully Kiosk URL: `http://192.168.1.16:9202`
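A client of `GET /api/alerts/stream` has to split the Server-Sent Events stream into events. Below is a minimal parser sketch that only handles `data:` fields and comment lines. The JSON payload shape (a `message` field) is an assumption for illustration, not the dashboard's documented format.

```python
import json


def parse_sse(lines):
    """Yield the data payload of each event from an iterable of SSE text lines."""
    data = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event, if any data was collected.
            if data:
                yield "\n".join(data)
                data = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is dropped
        if field == "data":
            data.append(value)


if __name__ == "__main__":
    sample = [": keep-alive\n", 'data: {"message": "test alert"}\n', "\n"]
    for payload in parse_sse(sample):
        print(json.loads(payload)["message"])
```

The same generator could be fed the decoded lines of a streaming HTTP response from the dashboard instead of the in-memory sample.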
## Shannon VPS — Setup Progress
- SSH key from forge works ✅ (root@82.24.174.112)
- Password login: root / K_cX1aFThB — **LEFT ENABLED per instructions**
- Repo cloned to /opt/shannon
- Docker build started (still building when subagent finished)
- TODO: Check build completion, run portal test against inou.com
## Kaseya Device Policy Change (IMPORTANT)
- CISO Jason Manar announced: only Kaseya-issued IT-managed devices on corporate network
- Personal/BYO devices → BYO network only, no VPN access
- Rolling out "starting tomorrow" (Feb 14) over coming weeks
- Johan currently uses personal Mac Mini for EVERYTHING (Kaseya + inou)
- Has a Kaseya XPS14 laptop he hates
- **Recommended:** Request a MacBook Pro (CTO-level ask), keep Mac Mini for inou on BYO network
- Johan is upset about this — impacts his entire workflow
## Cron Job Fixes
- git-audit-hourly timeout bumped 60s → 120s (K2.5 needs more time for git operations)
- claude-usage-hourly had stale Qwen3 235B session — will self-correct on next run
- K2.5 Watchdog hit session lock error — transient from concurrent subagent spawns

@@ -1,9 +1,9 @@
{
"last_updated": "2026-02-13T09:19:37.432834Z",
"last_updated": "2026-02-13T10:00:01.902264Z",
"source": "api",
"session_percent": 12,
"session_resets": "2026-02-13T10:00:00.397771+00:00",
"weekly_percent": 63,
"weekly_resets": "2026-02-14T19:00:00.397796+00:00",
"session_percent": 0,
"session_resets": null,
"weekly_percent": 64,
"weekly_resets": "2026-02-14T18:59:59.832971+00:00",
"sonnet_percent": 0
}

@@ -1 +1 @@
1770901262
1770975061