Commit Graph

79 Commits

Author SHA1 Message Date
Othavio Quiliao 33f28d6877
fix: align Mission Control device auth handshake with OpenClaw protocol
2026-03-03 20:51:22 +07:00
nyk f23c78f43a
fix: resolve WebSocket disconnect bugs and add SSE reconnect backoff (#97)
- Fix stale closure: onclose now calls connectRef.current instead of
  capturing connect by value, so reconnect always uses the latest version
- Fix disconnect-reconnect race: manualDisconnectRef prevents onclose
  from scheduling a new reconnect after explicit disconnect()
- Fix double-connect guard: check both OPEN and CONNECTING states
- Add SSE exponential backoff with 20-attempt cap (was flat 3s infinite)
- Add SSE error logging (was silently swallowed)
- Update README: fix stale counts (28 panels, 66 routes, 21 migrations,
  148 E2E tests), add missing features (SOUL system, Ed25519, agent
  messaging, update checker), document NEXT_PUBLIC_GATEWAY_TOKEN
2026-03-03 17:53:02 +07:00
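
The SSE reconnect policy from commit f23c78f43a (exponential backoff with a 20-attempt cap, replacing a flat 3s infinite retry) could look roughly like the sketch below. Only the 20-attempt cap comes from the commit message; the function name `sseReconnectDelay`, the 1s base delay, the doubling factor, and the 30s ceiling are assumptions made for illustration.

```typescript
// Hypothetical constants: only the 20-attempt cap is stated in the commit.
const MAX_SSE_ATTEMPTS = 20;
const SSE_BASE_DELAY_MS = 1_000; // assumed base delay
const SSE_MAX_DELAY_MS = 30_000; // assumed ceiling

/**
 * Returns the delay before reconnect attempt `attempt` (0-based),
 * or null once the attempt cap is reached and the client should give up.
 */
function sseReconnectDelay(attempt: number): number | null {
  if (attempt >= MAX_SSE_ATTEMPTS) return null;
  // Double the delay on each attempt, but never exceed the ceiling.
  return Math.min(SSE_BASE_DELAY_MS * 2 ** attempt, SSE_MAX_DELAY_MS);
}
```

A caller would schedule the next `EventSource` connection with `setTimeout` when the result is a number and stop retrying when it is `null`.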
nyk e6bae7ad88
fix: sync agent SOUL content with workspace files (#91) (#95)
Workspace file is now the primary source for soul.md with DB as
fallback. Reads prefer workspace → DB. Writes go to both. Config sync
imports soul.md from each agent's workspace using double resolveWithin
guard to prevent path traversal.
2026-03-03 17:36:14 +07:00
nyk 274b726df4
feat: add Update Available banner with GitHub release check (#94)
* fix: migrate middleware.ts to proxy.ts for Next.js 16 (#88)

Next.js 16 deprecated the `middleware` file convention in favor of
`proxy`. The proxy runs on the Node.js runtime instead of Edge, so
safeCompare now uses crypto.timingSafeEqual instead of manual XOR.

All auth logic, CSRF validation, host matching, and security headers
are preserved unchanged.

* feat: add "Update Available" banner with GitHub release check

Add a dismissible emerald banner that appears when a newer GitHub release
exists, so self-hosting users know an update is available. The banner
dismisses per-version (reappears for new releases).

- Create src/lib/version.ts as single source of truth from package.json
- Add /api/releases/check route with 1hr caching and graceful fallback
- Add UpdateBanner component mirroring LocalModeBanner pattern
- Add update state to Zustand store with localStorage persistence
- Fix hardcoded v2.0 in header-bar.tsx and 2.0.0 in websocket.ts
2026-03-03 17:17:15 +07:00
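
The per-version dismissal described in commit 274b726df4 can be sketched as a small pure function. This is an illustration under assumptions, not the project's code: the helper names `parseVersion`, `isNewer`, and `shouldShowBanner` are hypothetical, and pre-release suffixes such as `-beta` are ignored for simplicity.

```typescript
/** Parses "v1.2.3" or "1.2.3" into numeric parts; unparsable parts count as 0. */
function parseVersion(v: string): number[] {
  return v
    .trim()
    .replace(/^v/i, "")
    .split(".")
    .map((part) => parseInt(part, 10) || 0);
}

/** True when `latest` is strictly greater than `current`, compared part by part. */
function isNewer(latest: string, current: string): boolean {
  const a = parseVersion(latest);
  const b = parseVersion(current);
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    if (x !== y) return x > y;
  }
  return false;
}

/**
 * The banner shows when a newer release exists and the user has not
 * dismissed that exact release; a later release makes it reappear.
 */
function shouldShowBanner(
  latest: string,
  current: string,
  dismissedVersion: string | null,
): boolean {
  return isNewer(latest, current) && dismissedVersion !== latest;
}
```

Storing the dismissed version string rather than a boolean is what makes the banner come back for the next release.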
nyk 71f2627138
fix: task board SSE wiring, priority enum, webhook event, auto-advance, parallel broadcast (#73) (#89)
- Wire task board panel into Zustand store for real-time SSE updates
  instead of local useState; add useSmartPoll fallback when SSE disconnects
- Fix priority enum mismatch: UI now uses 'critical' matching the Zod
  validation schema instead of 'urgent'
- Add 'task.status_changed' to webhook EVENT_MAP so external consumers
  receive status transition events
- Auto-advance task to 'done' column when aegis quality review approves,
  broadcasting task.status_changed for real-time UI update
- Parallelize broadcast loop with Promise.allSettled so N agents execute
  concurrently (~10s) instead of serially (N×10s)

Closes #73
2026-03-03 16:20:53 +07:00
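
The parallel broadcast from commit 71f2627138 relies on `Promise.allSettled`, which starts all sends at once and reports each outcome without letting one failure reject the whole batch. A minimal sketch follows; the `broadcast` function, the `send` callback, and the `BroadcastResult` shape are assumptions for illustration.

```typescript
type BroadcastResult = { agent: string; ok: boolean; error?: string };

/**
 * Sends to every agent concurrently and collects one result per agent.
 * `send` is a hypothetical per-agent delivery function.
 */
async function broadcast(
  agents: string[],
  send: (agent: string) => Promise<void>,
): Promise<BroadcastResult[]> {
  // All promises are created up front, so total time is roughly the slowest
  // send rather than the sum of all sends.
  const settled = await Promise.allSettled(agents.map((agent) => send(agent)));
  return settled.map((result, i) => {
    const agent = agents[i] ?? "unknown";
    return result.status === "fulfilled"
      ? { agent, ok: true }
      : { agent, ok: false, error: String(result.reason) };
  });
}
```

Unlike `Promise.all`, a rejected send here only marks that agent's entry as failed; the other agents' results are still returned.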
nyk 6ce38b13dc
feat: sync side panel navigation with URL routes (#76) (#87)
Move page.tsx to [[...panel]] optional catch-all route so each panel
gets its own URL (e.g. /tasks, /agents, /settings). URL is the source
of truth — synced into Zustand via usePathname on every navigation.
Enables bookmarking, refresh persistence, deep-linking, and browser
back/forward.
2026-03-03 15:08:59 +07:00
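
With the URL as the source of truth (commit 6ce38b13dc), the active panel can be derived from the pathname alone. The sketch below assumes a hypothetical panel list and an `overview` default; the real panel names beyond `/tasks`, `/agents`, and `/settings` are not given in the commit.

```typescript
// Hypothetical panel list; only tasks, agents, and settings appear in the commit.
const PANELS = ["overview", "tasks", "agents", "settings"] as const;
type Panel = (typeof PANELS)[number];

/** Maps a pathname such as "/tasks" to a panel id, falling back to the default. */
function panelFromPathname(pathname: string): Panel {
  // The first non-empty segment names the panel; "/" has no segments.
  const first = pathname.split("/").filter(Boolean)[0];
  if (first !== undefined && (PANELS as readonly string[]).includes(first)) {
    return first as Panel;
  }
  return "overview";
}
```

In a component, the value from `usePathname()` would be passed through this function and written into the store on every navigation.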
nyk a4a606d5ac
feat: Ed25519 device identity for WebSocket challenge-response handshake (#85)
Add client-side Ed25519 key pair generation and nonce signing for
OpenClaw gateway protocol v3 connect.challenge flow. Keys persist in
localStorage and are reused across sessions. The handshake falls back
gracefully to auth-token-only mode when Ed25519 is unavailable.

Closes #74, closes #79, closes #81
2026-03-03 14:30:25 +07:00
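
The challenge-response idea in commit a4a606d5ac (sign a server-issued nonce with a device key) can be illustrated with Node's built-in `crypto` module. This is a sketch under assumptions: the project generates keys client-side and persists them in localStorage, whereas this example keeps them in memory, and the actual framing of the `connect.challenge` message is not shown in the commit, so `signNonce` and `verifyNonce` are hypothetical helpers.

```typescript
import { generateKeyPairSync, sign, verify } from "node:crypto";

// Device key pair; held in memory here purely for illustration.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");

/** Signs the nonce. Ed25519 takes no separate digest algorithm, hence `null`. */
function signNonce(nonce: string): Buffer {
  return sign(null, Buffer.from(nonce, "utf8"), privateKey);
}

/** What the verifying side would do with the device's public key. */
function verifyNonce(nonce: string, signature: Buffer): boolean {
  return verify(null, Buffer.from(nonce, "utf8"), publicKey, signature);
}
```

Because the nonce is fresh per connection, a captured signature cannot be replayed against a later challenge.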
nyk f0f22129be
fix: healthcheck auth, secure cookie auto-detect, model object crash (#84)
Bug 1 (#78): Dockerfile HEALTHCHECK curled authenticated /api/status,
always got 401 in production. Changed to /login which is public.

Bug 2 (#78): Login hangs on HTTP deployments because secure=true cookie
is silently rejected. Now auto-detects protocol from x-forwarded-proto
header, only sets secure when request actually came over HTTPS.

Bug 3 (#78): Agent model field from OpenClaw 2026.3.x is {primary: "name"}
object instead of string, causing React error #31. Added normalizeModel()
helper and applied it in all WebSocket/session mapping code paths.
2026-03-03 14:19:34 +07:00
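
Bugs 2 and 3 from commit f0f22129be both reduce to small normalisation helpers. The sketch below is illustrative: `normalizeModel` is named in the commit but its body is not shown, so the `"unknown"` fallback is an assumption, and `shouldSetSecureCookie` is a hypothetical name.

```typescript
type ModelField = string | { primary?: string } | null | undefined;

/** Accepts either the old string form or the newer { primary } object form. */
function normalizeModel(model: ModelField): string {
  if (typeof model === "string") return model;
  // Rendering the raw object in JSX is what triggers React error #31.
  return model?.primary ?? "unknown"; // fallback label is an assumption
}

/** Only mark the cookie Secure when the original request used HTTPS. */
function shouldSetSecureCookie(forwardedProto: string | null): boolean {
  // The header can hold a comma-separated list when several proxies are
  // chained; the first entry describes the client-facing hop.
  const first = (forwardedProto ?? "").split(",")[0] ?? "";
  return first.trim().toLowerCase() === "https";
}
```

On a plain HTTP deployment the header is absent or `http`, so the cookie is set without `Secure` and the browser no longer drops it.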
nyk d826435401
feat: add local mode auto-detection for no-gateway users (#83)
When no OpenClaw gateway is detected, Mission Control now automatically
switches to Local Mode — showing a clear info banner, greying out
gateway-dependent panels, and surfacing Claude Code session stats,
GitHub profile data, and subscription-aware cost display.

Changes:
- Add capabilities endpoint to detect gateway, Claude home, subscription
- Add dashboardMode/gatewayAvailable/subscription state to Zustand store
- Add dismissible LocalModeBanner component
- Grey out Agents/Spawn/Config nav items when no gateway
- Show blue "Local Mode" indicator instead of red "Disconnected"
- Dashboard shows local metric cards (sessions, projects, tokens, cost)
- Claude Code Stats panel with session/token/cost breakdown
- GitHub panel with repo stats, languages, star/fork counts
- Subscription detection from ~/.claude/.credentials.json
- Show "Included (Max plan)" instead of dollar cost for subscribers
- Fix token cost estimation (cache reads at 10%, not 100%)
- Sessions API falls back to local Claude session scanner
- Live feed injects session items in local mode
- Memory browser auto-creates data dir with fallback path
2026-03-03 13:41:55 +07:00
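
The cost-estimation fix in commit d826435401 (cache reads at 10%, not 100%) can be sketched as below. The commit does not say what the 10% is relative to; this sketch assumes it means 10% of the input-token price. The `Usage` and `Price` shapes and the per-million-token pricing are hypothetical.

```typescript
interface Usage {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens: number;
}

interface Price {
  inputPerMTok: number; // USD per million input tokens (hypothetical)
  outputPerMTok: number; // USD per million output tokens (hypothetical)
}

// From the commit: cache reads are billed at 10% rather than 100%.
const CACHE_READ_FACTOR = 0.1;

/** Estimates cost in USD, discounting cache-read tokens. */
function estimateCostUsd(usage: Usage, price: Price): number {
  const billableInput =
    usage.inputTokens + usage.cacheReadTokens * CACHE_READ_FACTOR;
  const total =
    billableInput * price.inputPerMTok + usage.outputTokens * price.outputPerMTok;
  return total / 1_000_000;
}
```

Counting cache reads at full price would overstate cost substantially for long sessions, where most input tokens are served from cache.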
nyk 304a9b3194
fix: cherry-pick improvements from PR #57 (#71)
Cherry-picks three valuable fixes from @doanbactam's WebSocket refactor PR:

1. Feed item ID collision fix — prefix log IDs with 'log-' to avoid
   React key collisions with activity IDs in the combined feed

2. Jittered reconnect backoff — add random jitter (0-50% of base) to
   WebSocket exponential backoff to prevent thundering-herd reconnects
   when multiple tabs reconnect after a server restart

3. Cron job deduplication + async I/O — deduplicate jobs.json entries
   by name (keeps latest), prevent duplicates on add, and convert
   sync file reads/writes to async to avoid blocking the event loop

Co-authored-by: Doan Bac Tam <24356000+doanbactam@users.noreply.github.com>
2026-03-02 23:54:20 +07:00
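
The jittered backoff cherry-picked in commit 304a9b3194 adds a random 0-50% of the base delay on top of the exponential delay. A minimal sketch, assuming a 1s base and 30s ceiling (neither is stated in the commit) and a hypothetical function name:

```typescript
/**
 * Exponential backoff plus random jitter of 0-50% of the computed base.
 * `random` is injectable so the function can be tested deterministically.
 */
function jitteredBackoff(
  attempt: number,
  baseMs = 1_000, // assumed
  maxMs = 30_000, // assumed
  random: () => number = Math.random,
): number {
  const base = Math.min(baseMs * 2 ** attempt, maxMs);
  // Spreading reconnects over [base, 1.5 * base) keeps many tabs from
  // hitting a restarted server at the same instant.
  return base + random() * 0.5 * base;
}
```

Without jitter, every tab that lost its connection at the same moment would retry on exactly the same schedule.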
nyk 96168fe2f4
feat: audit hardening, webhook retry, and local Claude session tracking (#68)
Security hardening:
- Fix timing-safe comparison bugs in webhooks.ts and auth.ts (was comparing buffer with itself)
- Harden rate limiter IP extraction — use rightmost untrusted IP from XFF chain with MC_TRUSTED_PROXIES support
- Add 12-char minimum password validation in Zod schema and runtime check
- Add Zod validation on PUT /api/tasks bulk status update

Webhook retry system (completing in-progress feature):
- Exponential backoff with circuit breaker in webhooks.ts
- POST /api/webhooks/retry endpoint for manual retry
- GET /api/webhooks/verify-docs endpoint for signature verification docs
- Scheduler integration for automatic retry processing
- Unit tests for signature verification and backoff logic

Local Claude Code session tracking:
- New claude-sessions.ts scanner parses JSONL transcripts from ~/.claude/projects/
- Extracts model, tokens, messages, cost estimates, active status per session
- Migration 020 adds claude_sessions table
- GET/POST /api/claude/sessions endpoint with filtering and aggregate stats
- Scheduler runs scan every 60s with MC_CLAUDE_HOME config

Quality improvements:
- Replace all console.error/warn with structured logger across 31 API routes
- Add Docker HEALTHCHECK directive
- Add vitest coverage config with v8 provider (60% threshold)
- Update README with new features, API docs, env vars, and roadmap items
- Fix E2E tests for password length and rate limiter IP changes
2026-03-02 22:17:35 +07:00
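
The rate-limiter hardening in commit 96168fe2f4 picks the rightmost untrusted address from the `X-Forwarded-For` chain. The sketch below shows the idea; `clientIpFromXff` is a hypothetical name, and how `MC_TRUSTED_PROXIES` is parsed into the trusted set is not shown in the commit, so the set is passed in directly.

```typescript
/**
 * Walks the X-Forwarded-For chain from right to left and returns the first
 * address that is not a trusted proxy. Entries further left are supplied by
 * the client and can be spoofed, so they are never preferred.
 */
function clientIpFromXff(
  xff: string | null,
  trustedProxies: ReadonlySet<string>,
  fallback: string,
): string {
  const chain = (xff ?? "")
    .split(",")
    .map((entry) => entry.trim())
    .filter(Boolean);
  for (let i = chain.length - 1; i >= 0; i--) {
    const ip = chain[i];
    if (ip !== undefined && !trustedProxies.has(ip)) return ip;
  }
  // No header, or every hop was a trusted proxy.
  return fallback;
}
```

Taking the leftmost entry instead would let a client choose its own rate-limit bucket by sending a forged header.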
nyk b2703b37d5
fix: resolve all 44 failing CI E2E tests (#64)
* fix: resolve all 44 failing CI E2E tests

- Bypass non-critical rate limiters in test env (MC_DISABLE_RATE_LIMIT=1)
  to prevent 429s when 165 tests share the same IP bucket
- Make admin seed idempotent (INSERT OR IGNORE) to fix UNIQUE constraint
  race when multiple Next.js workers initialize concurrently
- Add distinct x-forwarded-for headers to login-flow tests so they never
  share the critical login rate-limit bucket with other test suites
- Add missing 018_token_usage migration that the heartbeat POST handler
  depends on, fixing the 500 on inline token reporting

* docs: update README with latest features and test count

- Update migration count from 15 to 18
- Update E2E test count from 146 to 165
- Move Direct CLI, OpenAPI docs, and GitHub sync to completed roadmap
- Add Direct CLI and GitHub sync feature descriptions
- Add /api/connect and /api/github to API reference
- Remove resolved known limitation (vitest stubs)
- Update repo description

* fix: prevent build-time admin seed with wrong credentials in CI

Move `cp .env.test .env` before `pnpm build` in CI workflow so env vars
are present during build. Add NEXT_PHASE guard to skip seed during build
as belt-and-suspenders — env vars may not be available at build time.

Root cause: `next build` imports db.ts, triggering seedAdminUserFromEnv()
with undefined AUTH_USER/AUTH_PASS, seeding user `admin` instead of
`testadmin`. Runtime seed then sees count > 0 and skips. Tests log in
as `testadmin`, which doesn't exist → 401.
2026-03-02 13:53:00 +07:00
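
The NEXT_PHASE guard from commit b2703b37d5 can be sketched as a pure predicate. `shouldSeedAdmin` is a hypothetical name, and requiring both `AUTH_USER` and `AUTH_PASS` is an assumption; `phase-production-build` is the value Next.js uses for `NEXT_PHASE` during `next build`.

```typescript
type Env = Record<string, string | undefined>;

/** Decides whether the admin seed should run for the given environment. */
function shouldSeedAdmin(env: Env): boolean {
  // During `next build`, modules such as db.ts are imported but runtime
  // credentials may be missing, so seeding would create the wrong user.
  if (env.NEXT_PHASE === "phase-production-build") return false;
  // Assumed extra check: never seed without explicit credentials.
  return Boolean(env.AUTH_USER && env.AUTH_PASS);
}
```

At runtime this would be called with `process.env` before `seedAdminUserFromEnv()` runs.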
Nyk 60197ab21f
feat: add GitHub Issues sync (Phase 1, Issue #58)
Import GitHub issues as Mission Control tasks with duplicate detection,
priority mapping from labels, and bidirectional actions (comment/close).

- Migration 017: github_syncs table for sync history tracking
- GitHub API client (src/lib/github.ts) with fetch, comment, close ops
- POST/GET /api/github route with sync, comment, close, status actions
- GitHubSyncPanel UI: import form, issue preview, sync history, linked tasks
- Nav rail + page router wiring
- 6 E2E tests (all passing)
- Validation schema + github.synced event type
2026-03-02 12:45:39 +07:00
Nyk ebdc8de8b9
fix: resolve reconnect storm and improve Ubuntu deployment
Fix WebSocket reconnect storm (issue #53) caused by stale closure
reading connection.reconnectAttempts from Zustand state. Use a ref
to track attempts, avoiding the closure capture problem entirely.

Improve Dockerfile: create .data directory with correct ownership for
SQLite, set PORT/HOSTNAME env vars explicitly.

Add deployment guide documenting Ubuntu prerequisites (python3, make,
g++ for better-sqlite3 native compilation) and platform-specific
build constraints.
2026-03-02 12:15:19 +07:00
nyk f3e6c896a5
Merge pull request #54 from rezero-household/fix/websocket-auth-token-field
fix: use correct auth field in gateway WebSocket handshake
2026-03-02 11:51:19 +07:00
Nyk f7aa1db27e
feat: add direct CLI integration for gateway-free tool connections
- Add migration 016 for direct_connections table
- Add POST/GET/DELETE /api/connect for CLI tool registration
- Enhance heartbeat POST to accept connection_id and inline token_usage
- Add connectSchema to validation
- Add connection.created/disconnected event types to event bus
- Show direct CLI connections in gateway manager panel
- Add 5 E2E tests for connection lifecycle
- Add CLI integration documentation (docs/cli-integration.md)
- Fix openapi.json brace mismatch on line 642 (Phase 2 bug)
- Add /api/connect endpoints to OpenAPI spec
2026-03-02 11:45:12 +07:00
rezero-household 2eec86cc87
fix: use correct auth field in gateway WebSocket handshake
OpenClaw gateway configured with auth.mode='token' expects
{ token: '...' } in the connect handshake params, not { password: '...' }.
Sending 'password' causes the gateway to reject the handshake, resulting
in a disconnect→reconnect loop that floods the error log.

Tested against OpenClaw gateway v2026.2.25 with auth.mode='token'.
2026-03-01 14:46:04 -08:00
Nyk 45ad4a488b
test: add 94 E2E tests covering all CRUD routes + fix middleware location
Add comprehensive Playwright E2E test coverage for all major API routes:
- tasks-crud (18 tests): full lifecycle, filters, Aegis approval gate
- agents-crud (15 tests): CRUD, lookup by name/id, admin-only delete
- task-comments (7 tests): threaded comments, validation
- workflows-crud (8 tests): workflow template lifecycle
- webhooks-crud (9 tests): secret masking, regeneration
- alerts-crud (8 tests): alert rule lifecycle
- notifications (7 tests): delivery tracking, read status
- quality-review (6 tests): reviews with batch lookup
- search-and-export (7 tests): global search, export, activities
- user-management (8 tests): user admin CRUD
- helpers.ts: shared factory functions and cleanup utilities

Infrastructure fixes:
- Move middleware.ts to src/middleware.ts (Next.js 16 Turbopack
  requires middleware in src/ when using src/app/ directory — the
  root-level file was silently ignored, breaking CSRF protection)
- Add MC_DISABLE_RATE_LIMIT env var to bypass non-critical rate
  limiters during E2E runs (login limiter stays active via critical flag)
- Fix limit-caps test: /api/activities caps at 500, not 200
- Set playwright workers=1, fullyParallel=false for serial execution
- Add CSRF origin fallback to request.nextUrl.host

Roadmap additions from user feedback:
- Agent-agnostic gateway support (not just OpenClaw)
- Direct CLI integration (Codex, Claude Code, etc.)
- Native macOS app (Electron or Tauri)

146/146 E2E tests passing (up from 51).
2026-03-02 02:21:10 +07:00
Nyk df06c3a2ad
feat: v1.2.0 — validation hardening, unit tests, quality improvements
- Fix task status enum mismatch (blocked → quality_review)
- Add 12 Zod schemas for all unvalidated mutation routes
- Apply validateBody() across 11 API route handlers
- Add readLimiter (120/min) for GET-heavy endpoints
- Extend heavyLimiter to search, backup, cleanup routes
- Add security headers (X-Content-Type-Options, X-Frame-Options, Referrer-Policy)
- Fill auth test stubs with real assertions (safeCompare, requireRole)
- Add validation, rate-limit, and db-helpers unit test suites (60 tests total)
- Replace as-any casts with typed interfaces (SessionQueryRow, UserQueryRow, CountRow)
- Bump version to 1.2.0, add CHANGELOG.md, update README roadmap
2026-03-02 00:22:59 +07:00
Nyk c8f932344f
fix: patch command injection, missing rate limit, Docker build, logger crash
- Sanitize session ID in control route to prevent command injection
  via unsanitized URL params interpolated into shell commands
- Add mutationLimiter and structured logging to session control endpoint
- Install python3/make/g++ in Dockerfile deps stage for better-sqlite3
  native addon compilation
- Handle missing public/ directory in Docker COPY with glob pattern
- Guard pino-pretty transport against missing devDependency at runtime
2026-02-27 21:57:50 +07:00
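
The command-injection fix in commit c8f932344f sanitizes the session ID before it reaches a shell command. One common approach is an allowlist check, sketched below; the permitted character set, the 64-character limit, and the name `assertSafeSessionId` are assumptions, since the commit does not show the actual rule.

```typescript
// Assumed format: letters, digits, underscore, and hyphen, 1 to 64 characters.
const SESSION_ID_RE = /^[A-Za-z0-9_-]{1,64}$/;

/** Returns the id unchanged when it is safe, otherwise throws. */
function assertSafeSessionId(id: string): string {
  // An allowlist rejects shell metacharacters (; | $ ` spaces, etc.)
  // without having to enumerate every dangerous one.
  if (!SESSION_ID_RE.test(id)) {
    throw new Error("Invalid session id");
  }
  return id;
}
```

Passing arguments as an array to `child_process.execFile` instead of interpolating them into a shell string is a complementary defence, since no shell parses the value at all.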
Nyk c104b7e071
Merge remote-tracking branch 'origin/main' into feat/medium-priority-v1.1
# Conflicts:
#	src/app/api/agents/route.ts
#	src/app/api/alerts/route.ts
#	src/app/api/auth/login/route.ts
#	src/app/api/spawn/route.ts
#	src/app/api/tasks/[id]/route.ts
#	src/app/api/tasks/route.ts
#	src/app/api/webhooks/route.ts
#	src/lib/validation.ts
2026-02-27 21:47:56 +07:00
Nyk 321a7c2db2
feat: error boundaries, pino logger, a11y, HSTS, zod validation, export limits
2026-02-27 21:37:06 +07:00
Nyk 299faf50e3
feat: add Docker support, session controls, model catalog, API rate limiting
2026-02-27 20:56:02 +07:00
Nyk b5766b0850
fix: enable foreign_keys pragma and add missing indexes
- Add `PRAGMA foreign_keys = ON` to db.ts — without this, all
  ON DELETE CASCADE constraints across 7 tables are silently ignored
  (SQLite disables foreign keys by default)
- Add migration 015 with indexes on hot query paths:
  notifications(read_at), notifications(recipient, read_at),
  activities(actor), activities(entity_type, entity_id),
  messages(read_at)
2026-02-27 20:07:50 +07:00
Nyk 8de9e0b5c3
test: add 52 Playwright E2E tests covering all critical fixes
8 test suites verifying:
- Auth guards on 19 GET endpoints (Issue #4)
- Timing-safe API key comparison (Issue #5)
- Legacy cookie auth removal (Issue #7)
- Login rate limiting (Issue #8)
- CSRF Origin header validation (Issue #20)
- DELETE body standardization (Issue #18)
- Query limit caps at 200 (Issue #19)
- Login flow and session lifecycle

Also fixes migration 013 crash on fresh DB when gateways table
doesn't exist (created lazily by gateways API, not in migrations).
2026-02-27 15:38:49 +07:00
Nyk bf0df9b6d0
fix: strict mode, test stubs, pagination counts, N+1 queries, CSP hardening
- Enable TypeScript strict mode and fix all resulting type errors
- Add auth test stubs for requireRole and safeCompare
- Add proper COUNT(*) pagination totals to agents, tasks, notifications,
  messages, conversations, and standup history endpoints
- Fix N+1 queries by hoisting db.prepare() outside loops in agents,
  activities, notifications, conversations, standup, gateway health,
  and notification delivery routes
- Remove unsafe-eval from CSP script-src directive
- Remove deprecated X-XSS-Protection header
2026-02-27 14:02:52 +07:00
Nyk 3b600d817e
fix: remove legacy auth, add login rate limiting, block SSRF metadata, parameterize migration SQL
2026-02-27 13:58:52 +07:00
Nyk 1ee506b4cf
fix: add auth checks on all GET endpoints, timing-safe comparisons, and XSS sanitization
2026-02-27 13:04:24 +07:00
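
The timing-safe comparison introduced in commit 1ee506b4cf can be sketched with Node's `crypto.timingSafeEqual`. This is an illustration, not the project's `safeCompare`: hashing both inputs first is a common way to satisfy the equal-length requirement of `timingSafeEqual`, and whether the project does so is an assumption.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

/** Compares two secrets without leaking how many leading bytes match. */
function safeCompare(a: string, b: string): boolean {
  // timingSafeEqual throws on length mismatch, so compare fixed-size digests.
  const digestA = createHash("sha256").update(a, "utf8").digest();
  const digestB = createHash("sha256").update(b, "utf8").digest();
  // Note the two distinct buffers: passing the same buffer twice would
  // always return true.
  return timingSafeEqual(digestA, digestB);
}
```

A plain `===` on strings can return as soon as the first differing character is found, which is the timing signal this function avoids.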
Nyk 99815d20b3
feat: initial open-source release
OpenClaw Mission Control — agent orchestration dashboard.

Built with Next.js 16, React 19, TypeScript, SQLite, and Tailwind CSS.
MIT License.
2026-02-23 02:00:44 +07:00