
# Shannon AI Pentesting Tool — Security Audit

**Date:** 2025-02-09
**Repo:** https://github.com/KeygraphHQ/shannon
**Auditor:** James (automated)
**Verdict: ✅ SAFE** (with caveats)

---

## Summary

Shannon is a TypeScript-based AI pentesting orchestrator that uses the Claude Agent SDK to run multi-phase security assessments. The codebase is clean, well-structured, and **makes no outbound network calls of its own**. All network activity is delegated to:

1. **Anthropic Claude API** (via `@anthropic-ai/claude-agent-sdk`): the LLM backend
2. **Temporal** (optional, localhost:7233 by default): workflow orchestration
3. **Playwright MCP** (spawned via npx, which fetches the package from the npm registry): browser automation against the *target* app
4. **The target application itself**: whatever URL you point it at

There is **no telemetry, no phone-home, no data exfiltration, no uploads to cloud storage, no webhook calls to Keygraph or any third party.**

---

## Exfiltration Risk Assessment

| Check | Result |
|-------|--------|
| Outbound HTTP/HTTPS calls (fetch, axios, http.get) | ✅ **None found** in Shannon source |
| Telemetry / analytics | ✅ **None found** |
| Phone-home behavior | ✅ **None found** |
| S3/GCS/cloud storage uploads | ✅ **None found** |
| Webhook calls to KeygraphHQ | ✅ **None found** |
| Base64 encoded payloads | ✅ **None found**; the only encoding logic is a standard base32 decode of the secret for TOTP generation (RFC 4226/6238) |
| Obfuscated URLs | ✅ **None found** |
| eval()/exec() with remote input | ✅ **None found**; the only `exec` calls are `regex.exec()` for prompt template parsing |
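
For context on the base32 row, the following is a minimal sketch of standard TOTP generation (RFC 6238 over HMAC-SHA1, with an RFC 4648 base32 secret). It is not Shannon's implementation; the function names `base32Decode` and `totp` are hypothetical and chosen only for illustration. The point is that the decode step feeds a local HMAC computation and involves no network I/O.

```typescript
import { createHmac } from "node:crypto";

const BASE32_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";

// Decode an RFC 4648 base32 string into bytes, ignoring padding and whitespace.
function base32Decode(input: string): Buffer {
  const clean = input.toUpperCase().replace(/[\s=]/g, "");
  const bytes: number[] = [];
  let acc = 0;  // bit accumulator
  let bits = 0; // number of valid bits currently held in acc
  for (let i = 0; i < clean.length; i++) {
    const value = BASE32_ALPHABET.indexOf(clean.charAt(i));
    if (value === -1) throw new Error("Invalid base32 character: " + clean.charAt(i));
    acc = (acc << 5) | value;
    bits += 5;
    if (bits >= 8) {
      bits -= 8;
      bytes.push((acc >>> bits) & 0xff);
      acc &= (1 << bits) - 1; // keep only the bits not yet emitted
    }
  }
  return Buffer.from(bytes);
}

// Compute an HMAC-SHA1 TOTP code for the given Unix time in seconds.
function totp(secretBase32: string, unixSeconds: number, digits = 6, stepSeconds = 30): string {
  const counter = Math.floor(unixSeconds / stepSeconds);
  const msg = Buffer.alloc(8); // 8-byte big-endian time-step counter
  msg.writeUInt32BE(Math.floor(counter / 0x100000000), 0);
  msg.writeUInt32BE(counter % 0x100000000, 4);
  const hmac = createHmac("sha1", base32Decode(secretBase32)).update(msg).digest();
  const offset = hmac.readUInt8(hmac.length - 1) & 0x0f; // dynamic truncation offset
  const binary = hmac.readUInt32BE(offset) & 0x7fffffff;
  let code = String(binary % Math.pow(10, digits));
  while (code.length < digits) code = "0" + code; // left-pad with zeros
  return code;
}
```

Calling `totp(secret, Math.floor(Date.now() / 1000))` would yield the current code for a base32-encoded secret.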

---

## All Outbound Network Connections

### 1. Anthropic API (Claude)
- **How:** Via `@anthropic-ai/claude-agent-sdk` `query()` function
- **Destination:** `api.anthropic.com` (or `ANTHROPIC_BASE_URL` if set for router mode)
- **Auth:** `ANTHROPIC_API_KEY` env var (handled by SDK, not Shannon)
- **Data sent:** Prompts + tool results; source code is sent as prompt context to Claude

### 2. Temporal Server (Optional)
- **How:** `@temporalio/client` and `@temporalio/worker`
- **Destination:** `TEMPORAL_ADDRESS` env var, default `localhost:7233`
- **Purpose:** Workflow orchestration (parallel agent execution, retries)
- **Note:** Local by default; only remote if you configure it

### 3. Playwright MCP (Browser Automation)
- **How:** Spawned as child process via `npx @playwright/mcp@latest`
- **Destination:** The target application URL only
- **Purpose:** Browser-based testing of the target app
- **Note:** Each agent gets an isolated browser profile in `/tmp/playwright-agentN`

### 4. npm Registry (npx)
- **How:** `npx @playwright/mcp@latest` downloads on first run
- **Destination:** `registry.npmjs.org`
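- **One-time:** Downloaded on first execution; can be pre-installed. Because the spec uses `@latest`, npx may still contact the registry on later runs to resolve the tag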

---

## Recommended Firewall Whitelist

```
# Required
api.anthropic.com:443          # Claude API (or your ANTHROPIC_BASE_URL)
<your-target-app>              # Whatever you're pentesting

# Optional (Temporal, only if using workflow mode)
localhost:7233                  # Temporal server (default, local)

# One-time (npx playwright download)
registry.npmjs.org:443         # npm packages
```

**Block everything else.** Shannon makes no other outbound connections.
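
The entries above depend on configuration, so the list can be derived rather than hard-coded. The sketch below assumes the env var names and defaults stated in this audit (`ANTHROPIC_BASE_URL`, `TEMPORAL_ADDRESS`); the helper names `toHostPort` and `egressAllowlist` are hypothetical and not part of Shannon.

```typescript
import { URL } from "node:url";

// Reduce a URL to "host:port", filling in the scheme's default port when none is given.
function toHostPort(urlString: string): string {
  const url = new URL(urlString);
  const port = url.port !== "" ? url.port : url.protocol === "https:" ? "443" : "80";
  return url.hostname + ":" + port;
}

// Build the egress allowlist for one run from the target URL and the environment.
function egressAllowlist(
  targetUrl: string,
  env: Record<string, string | undefined>
): string[] {
  const entries = [
    toHostPort(env["ANTHROPIC_BASE_URL"] || "https://api.anthropic.com"), // Claude API or router
    toHostPort(targetUrl),                                                // app under test
    env["TEMPORAL_ADDRESS"] || "localhost:7233",                          // Temporal (optional)
    "registry.npmjs.org:443",                                             // npx download
  ];
  // Drop duplicates while preserving order.
  return entries.filter((entry, index) => entries.indexOf(entry) === index);
}
```

For example, `egressAllowlist("https://staging.example.test:8443/login", process.env)` returns the four `host:port` strings to allow for that run, where the target URL is a placeholder.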

---

## Credential Handling

| Credential | Handling | Risk |
|------------|----------|------|
| `ANTHROPIC_API_KEY` | Read by SDK from env var | ✅ Never stored/logged by Shannon |
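| `ANTHROPIC_BASE_URL` | Optional env var for router mode | ⚠️ Redirects all API traffic, including source code, to the configured endpoint; set it only to a host you trust |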
| `ROUTER_DEFAULT` | Optional env var (e.g., `gemini,gemini-2.5-pro`) | ✅ Safe |
| Target source code | Copied to local `sourceDir`, git-managed | ⚠️ Sent to Claude API as prompt context |

**Key concern:** Target source code IS sent to the Anthropic API as part of the prompt. This is inherent to how the tool works — Claude analyzes the code. If your source code is highly sensitive, this is the main risk vector (data goes to Anthropic's API).

---

## Source Code / Temp File Handling

- Source is cloned to a local directory you specify
- Git initialized for checkpoints/rollbacks within that directory
- Deliverables written to `<sourceDir>/deliverables/`
- Playwright profiles in `/tmp/playwright-agent{1-5}`
- Error logs written to `<sourceDir>/error.log`
- No cleanup of temp files after execution (manual cleanup needed)
- **No uploads anywhere** — all files stay local
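
Since nothing is cleaned up automatically, the Playwright profiles can be removed after a run. This is a minimal sketch assuming the `/tmp/playwright-agent{1-5}` layout described above; `profileDirs` and `cleanupPlaywrightProfiles` are hypothetical helpers, not part of Shannon.

```typescript
import { existsSync, rmSync } from "node:fs";
import { join } from "node:path";

// List the profile directories this audit observed: <baseDir>/playwright-agent1..N.
function profileDirs(baseDir = "/tmp", agentCount = 5): string[] {
  const dirs: string[] = [];
  for (let n = 1; n <= agentCount; n++) {
    dirs.push(join(baseDir, "playwright-agent" + n));
  }
  return dirs;
}

// Remove every profile directory that exists and return the paths that were deleted.
function cleanupPlaywrightProfiles(baseDir = "/tmp", agentCount = 5): string[] {
  const removed: string[] = [];
  for (const dir of profileDirs(baseDir, agentCount)) {
    if (existsSync(dir)) {
      rmSync(dir, { recursive: true, force: true });
      removed.push(dir);
    }
  }
  return removed;
}
```

Deliverables and `error.log` live under `<sourceDir>` and are deliberately left alone here, since they are the output of the assessment.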

---

## Dependencies Risk Assessment

### Direct Dependencies (package.json)

| Package | Version | Risk | Notes |
|---------|---------|------|-------|
| `@anthropic-ai/claude-agent-sdk` | ^0.1.0 | ✅ Low | Official Anthropic SDK |
| `@temporalio/activity` | ^1.11.0 | ✅ Low | Well-known workflow engine |
| `@temporalio/client` | ^1.11.0 | ✅ Low | " |
| `@temporalio/worker` | ^1.11.0 | ✅ Low | " |
| `@temporalio/workflow` | ^1.11.0 | ✅ Low | " |
| `ajv` | ^8.12.0 | ✅ Low | JSON schema validator, widely used |
| `ajv-formats` | ^2.1.1 | ✅ Low | AJV extension |
| `boxen` | ^8.0.1 | ✅ Low | Terminal box drawing |
| `chalk` | ^5.0.0 | ✅ Low | Terminal colors |
| `dotenv` | ^16.4.5 | ✅ Low | Env file loader |
| `figlet` | ^1.9.3 | ✅ Low | ASCII art text |
| `gradient-string` | ^3.0.0 | ✅ Low | Terminal gradients |
| `js-yaml` | ^4.1.0 | ✅ Low | YAML parser |
| `zod` | ^3.22.4 | ✅ Low | Schema validation |
| `zx` | ^8.0.0 | ⚠️ Medium | Google's shell scripting lib — powerful but well-known; used for `fs`, `path`, `$` shell execution |

### Supply Chain

- **No postinstall/preinstall scripts** in package.json
- **All 205 `resolved` entries in package-lock.json point to registry.npmjs.org**
- **No suspicious transitive dependencies detected**
- `zx` gives shell access via `$` template tag — but Shannon only uses it for `git` commands in `environment.ts`
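
The registry check above can be repeated after any dependency update. The sketch below assumes an npm lockfile in the v2/v3 layout, where a top-level `packages` map carries a `resolved` URL per entry; `nonRegistryPackages` is a hypothetical helper, and the interfaces model only the fields it reads.

```typescript
import { URL } from "node:url";

interface LockPackage {
  resolved?: string;
}

interface PackageLock {
  packages?: Record<string, LockPackage>;
}

// Return the lockfile entries whose tarball URL is not served by the expected registry host.
function nonRegistryPackages(lock: PackageLock, registryHost = "registry.npmjs.org"): string[] {
  const offenders: string[] = [];
  const packages = lock.packages || {};
  for (const name of Object.keys(packages)) {
    const pkg = packages[name];
    const resolved = pkg ? pkg.resolved : undefined;
    if (resolved === undefined) continue; // root project and linked entries have no URL
    try {
      if (new URL(resolved).hostname !== registryHost) offenders.push(name);
    } catch (_err) {
      offenders.push(name); // an unparseable location is treated as suspicious
    }
  }
  return offenders;
}
```

Passing `JSON.parse(readFileSync("package-lock.json", "utf8"))` to this function should return an empty array if the finding above still holds.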

---

## Code Sections Requiring Closer Review

### 1. `permissionMode: 'bypassPermissions'` (claude-executor.ts:201)
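The Claude agent runs with **all permissions bypassed**. This means Claude Code can read, write, and execute anything in the `sourceDir` without asking for confirmation. This is by design (pentesting requires it). Note that bypassing permission prompts is not a sandbox: unless the process is confined by other means (for example, Docker), the agent's tools run with the privileges of the user who launched Shannon.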

### 2. `npx @playwright/mcp@latest` (claude-executor.ts:83)
Downloads the latest Playwright MCP from npm at runtime. Pin the version for reproducibility:
```
'@playwright/mcp@0.0.28'  // or whatever the current version is
```
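
A guard along these lines could catch an unpinned spec before the child process is spawned. It is a sketch only: `isPinnedSpec` is a hypothetical helper, and the regular expression accepts plain `MAJOR.MINOR.PATCH` versions with optional pre-release and build suffixes.

```typescript
// Return true when an npm package spec ends in an exact version, e.g. "@scope/name@1.2.3".
function isPinnedSpec(spec: string): boolean {
  const at = spec.lastIndexOf("@");
  if (at <= 0) return false; // no version part; a leading "@" only marks the scope
  const version = spec.slice(at + 1);
  // Exact semver only: tags ("latest") and ranges ("^1.0.0") are rejected.
  return /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$/.test(version);
}
```

With this check, `@playwright/mcp@latest` is rejected and `@playwright/mcp@0.0.28` is accepted.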

### 3. `process.env` passthrough (claude-executor.ts:95)
The full `process.env` is passed to the Playwright MCP child process. This includes `ANTHROPIC_API_KEY` and any other env vars. Playwright MCP doesn't need the API key, so consider filtering the env vars before spawning it.
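
One way to do the filtering is an explicit allowlist. This is a minimal sketch, not Shannon's code: `filterEnv` is a hypothetical helper, and the variable names in `PLAYWRIGHT_ENV_ALLOWLIST` are an assumption about what a browser child process typically needs, to be adjusted per platform.

```typescript
// Assumed minimal set of variables for the child process; adjust for your platform.
const PLAYWRIGHT_ENV_ALLOWLIST: readonly string[] = ["PATH", "HOME", "TMPDIR", "DISPLAY"];

// Copy only the allowlisted variables that are actually set; everything else is dropped.
function filterEnv(
  env: Record<string, string | undefined>,
  allowed: readonly string[]
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const key of allowed) {
    const value = env[key];
    if (value !== undefined) out[key] = value;
  }
  return out;
}
```

The result can be passed as the `env` option of Node's `child_process.spawn`, for example `{ env: filterEnv(process.env, PLAYWRIGHT_ENV_ALLOWLIST) }`, so that `ANTHROPIC_API_KEY` never reaches the child.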

### 4. Router mode (`ANTHROPIC_BASE_URL`)
If set, all API calls go to a custom endpoint instead of Anthropic. An attacker who controls this env var captures all prompts including source code.
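
A startup check can reduce this risk by refusing to run when the variable points somewhere unexpected. The sketch below is illustrative: `checkBaseUrl` and the trusted-host list are hypothetical, and the default host is the one this audit reports for the SDK.

```typescript
import { URL } from "node:url";

// Return the API host that will receive prompts, or throw if a custom endpoint is not trusted.
function checkBaseUrl(baseUrl: string | undefined, trustedHosts: readonly string[]): string {
  if (!baseUrl) return "api.anthropic.com"; // unset or empty: default endpoint per this audit
  const host = new URL(baseUrl).hostname;
  if (trustedHosts.indexOf(host) === -1) {
    throw new Error("ANTHROPIC_BASE_URL points to untrusted host: " + host);
  }
  return host;
}
```

Calling `checkBaseUrl(process.env["ANTHROPIC_BASE_URL"], ["localhost"])` before any agent starts would, under these assumptions, allow only the default endpoint or a local router.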

---

## Architecture Overview

```
Shannon CLI
  ├── Pre-recon agent (code analysis, no browser)
  ├── Recon agent (browser + code analysis)
  ├── 5x Vulnerability agents (parallel, browser + code)
  ├── 5x Exploitation agents (parallel, browser + code)
  └── Report agent (generates final report)

Each agent = Claude API call with:
  - System prompt from prompts/*.txt
  - Target source code as context
  - MCP tools: save_deliverable, generate_totp, playwright
  - Isolated browser profile
  - Git checkpoints for rollback
```

---

## Recommendations

1. **✅ Safe to use** against your own infrastructure
2. **Pin Playwright MCP version** instead of `@latest`
3. **Filter env vars** passed to Playwright child process
4. **Be aware:** your source code goes to Anthropic's API — standard for any AI code analysis tool
5. **Run in Docker** (supported via `SHANNON_DOCKER=true`) for additional isolation
6. **Set firewall rules** per the whitelist above — block all other egress