# Shannon AI Pentesting Tool — Security Audit

**Date:** 2025-02-09
**Repo:** https://github.com/KeygraphHQ/shannon
**Auditor:** James (automated)
**Verdict: ✅ SAFE** (with caveats)

---

## Summary

Shannon is a TypeScript-based AI pentesting orchestrator that uses the Claude Agent SDK to run multi-phase security assessments. The codebase is clean, well-structured, and **makes no outbound network calls of its own**. All network activity is delegated to:

1. **Anthropic Claude API** (via `@anthropic-ai/claude-agent-sdk`): the LLM backend
2. **Temporal** (optional, localhost:7233 by default): workflow orchestration
3. **Playwright MCP** (spawned via npx): browser automation against the *target* app
4. **The target application itself**: whatever URL you point it at

In addition, `npx` downloads the Playwright MCP package from `registry.npmjs.org` on first run.

There is **no telemetry, no phone-home, no data exfiltration, no uploads to cloud storage, no webhook calls to Keygraph or any third party.**

---

## Exfiltration Risk Assessment

| Check | Result |
|-------|--------|
| Outbound HTTP/HTTPS calls (fetch, axios, http.get) | ❌ **None found** in Shannon source |
| Telemetry / analytics | ❌ **None** |
| Phone-home behavior | ❌ **None** |
| S3/GCS/cloud storage uploads | ❌ **None** |
| Webhook calls to KeygraphHQ | ❌ **None** |
| Base64 encoded payloads | Only standard base32 decode for TOTP generation (RFC 4226/6238) |
| Obfuscated URLs | ❌ **None** |
| eval()/exec() with remote input | ❌ **None**; only `regex.exec()` for prompt template parsing |
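
Checks like the ones in the table can be approximated with a simple pattern scan over the source. The sketch below is illustrative only and is not the method this audit used; the `findOutboundCalls` name and the pattern list are assumptions, and a regex scan will miss anything dynamic or obfuscated.

```typescript
// Hypothetical helper: flag source lines that look like outbound calls or eval.
// The pattern list is an assumption and is far from exhaustive.
interface Finding {
  line: number;    // 1-based line number in the scanned text
  pattern: string; // label of the pattern that matched
  text: string;    // trimmed source line
}

const OUTBOUND_PATTERNS: Array<[string, RegExp]> = [
  ["fetch", /\bfetch\s*\(/],
  ["axios", /\baxios\b/],
  ["http(s).get/request", /\bhttps?\.(get|request)\s*\(/],
  ["eval", /\beval\s*\(/],
];

function findOutboundCalls(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split("\n").forEach((text, index) => {
    for (const [pattern, regex] of OUTBOUND_PATTERNS) {
      // No `g` flag on the patterns, so test() carries no state between lines.
      if (regex.test(text)) {
        findings.push({ line: index + 1, pattern, text: text.trim() });
      }
    }
  });
  return findings;
}
```

Note that `regex.exec(...)` does not trip the `eval` pattern, which matches the distinction the table draws between `regex.exec()` and code execution.
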

---

## All Outbound Network Connections

### 1. Anthropic API (Claude)
- **How:** Via `@anthropic-ai/claude-agent-sdk` `query()` function
- **Destination:** `api.anthropic.com` (or `ANTHROPIC_BASE_URL` if set for router mode)
- **Auth:** `ANTHROPIC_API_KEY` env var (handled by SDK, not Shannon)
- **Data sent:** Prompts + tool results; source code is sent as prompt context to Claude

### 2. Temporal Server (Optional)
- **How:** `@temporalio/client` and `@temporalio/worker`
- **Destination:** `TEMPORAL_ADDRESS` env var, default `localhost:7233`
- **Purpose:** Workflow orchestration (parallel agent execution, retries)
- **Note:** Local by default; only remote if you configure it

### 3. Playwright MCP (Browser Automation)
- **How:** Spawned as child process via `npx @playwright/mcp@latest`
- **Destination:** The target application URL only
- **Purpose:** Browser-based testing of the target app
- **Note:** Each agent gets isolated browser profile in `/tmp/playwright-agentN`

### 4. npm Registry (npx)
- **How:** `npx @playwright/mcp@latest` downloads on first run
- **Destination:** `registry.npmjs.org`
- **One-time:** Only on first execution; can be pre-installed

---

## Recommended Firewall Whitelist

```
# Required
api.anthropic.com:443          # Claude API (or your ANTHROPIC_BASE_URL)
<your-target-app>              # Whatever you're pentesting

# Optional (Temporal, only if using workflow mode)
localhost:7233                 # Temporal server (default, local)

# One-time (npx playwright download)
registry.npmjs.org:443         # npm packages
```

**Block everything else.** Shannon makes no other outbound connections.
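
The whitelist can also be expressed as data, which makes it easy to check a proposed connection against it in tests or tooling. This is a minimal sketch, not part of Shannon; `EgressRule`, `buildAllowlist`, and `isEgressAllowed` are hypothetical names, and the target host and port are placeholders you supply.

```typescript
// Hypothetical representation of the firewall whitelist above.
interface EgressRule {
  host: string;
  port: number;
}

// targetHost/targetPort stand in for <your-target-app>; both are assumptions.
function buildAllowlist(
  targetHost: string,
  targetPort: number,
  useTemporal: boolean,
): EgressRule[] {
  const rules: EgressRule[] = [
    { host: "api.anthropic.com", port: 443 },  // Claude API
    { host: targetHost, port: targetPort },    // application under test
    { host: "registry.npmjs.org", port: 443 }, // one-time npx download
  ];
  if (useTemporal) {
    rules.push({ host: "localhost", port: 7233 }); // default Temporal address
  }
  return rules;
}

function isEgressAllowed(rules: EgressRule[], host: string, port: number): boolean {
  const wanted = host.toLowerCase();
  return rules.some((rule) => rule.host.toLowerCase() === wanted && rule.port === port);
}
```

If `ANTHROPIC_BASE_URL` is set, the first rule would need to point at that endpoint instead.
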

---

## Credential Handling

| Credential | Handling | Risk |
|------------|----------|------|
| `ANTHROPIC_API_KEY` | Read by SDK from env var | ✅ Never stored/logged by Shannon |
| `ANTHROPIC_BASE_URL` | Optional env var for router mode | ⚠️ Safe if you control it; redirects all prompts (including source code) to the configured endpoint |
| `ROUTER_DEFAULT` | Optional env var (e.g., `gemini,gemini-2.5-pro`) | ✅ Safe |
| Target source code | Copied to local `sourceDir`, git-managed | ⚠️ Sent to Claude API as prompt context |

**Key concern:** Target source code IS sent to the Anthropic API as part of the prompt. This is inherent to how the tool works, because Claude analyzes the code. If your source code is highly sensitive, this is the main risk vector (data goes to Anthropic's API).

---

## Source Code / Temp File Handling

- Source is cloned to a local directory you specify
- Git initialized for checkpoints/rollbacks within that directory
- Deliverables written to `<sourceDir>/deliverables/`
- Playwright profiles in `/tmp/playwright-agent{1-5}`
- Error logs written to `<sourceDir>/error.log`
- No cleanup of temp files after execution (manual cleanup needed)
- **No uploads anywhere**: all files stay local
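
Because nothing is cleaned up automatically, the Playwright profiles can be removed with a small script after a run. The sketch below assumes the `/tmp/playwright-agent1` through `/tmp/playwright-agent5` layout described above; the function names are hypothetical and this is not something Shannon ships.

```typescript
import { rmSync } from "node:fs";
import { join } from "node:path";

// Builds the profile paths described in this audit: <tmpRoot>/playwright-agent1..N.
function playwrightProfileDirs(tmpRoot = "/tmp", agentCount = 5): string[] {
  return Array.from({ length: agentCount }, (_, i) =>
    join(tmpRoot, `playwright-agent${i + 1}`),
  );
}

// Removes each profile directory and returns the paths it attempted to remove.
function cleanupProfiles(tmpRoot = "/tmp", agentCount = 5): string[] {
  const dirs = playwrightProfileDirs(tmpRoot, agentCount);
  for (const dir of dirs) {
    // force: true turns a missing directory into a no-op instead of an error.
    rmSync(dir, { recursive: true, force: true });
  }
  return dirs;
}
```

Deliverables and `error.log` live under `sourceDir` and are left alone here, since they are usually the output you want to keep.
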

---

## Dependencies Risk Assessment

### Direct Dependencies (package.json)

| Package | Version | Risk | Notes |
|---------|---------|------|-------|
| `@anthropic-ai/claude-agent-sdk` | ^0.1.0 | ✅ Low | Official Anthropic SDK |
| `@temporalio/activity` | ^1.11.0 | ✅ Low | Well-known workflow engine |
| `@temporalio/client` | ^1.11.0 | ✅ Low | Well-known workflow engine |
| `@temporalio/worker` | ^1.11.0 | ✅ Low | Well-known workflow engine |
| `@temporalio/workflow` | ^1.11.0 | ✅ Low | Well-known workflow engine |
| `ajv` | ^8.12.0 | ✅ Low | JSON schema validator, widely used |
| `ajv-formats` | ^2.1.1 | ✅ Low | AJV extension |
| `boxen` | ^8.0.1 | ✅ Low | Terminal box drawing |
| `chalk` | ^5.0.0 | ✅ Low | Terminal colors |
| `dotenv` | ^16.4.5 | ✅ Low | Env file loader |
| `figlet` | ^1.9.3 | ✅ Low | ASCII art text |
| `gradient-string` | ^3.0.0 | ✅ Low | Terminal gradients |
| `js-yaml` | ^4.1.0 | ✅ Low | YAML parser |
| `zod` | ^3.22.4 | ✅ Low | Schema validation |
| `zx` | ^8.0.0 | ⚠️ Medium | Google's shell scripting lib; powerful but well-known; used for `fs`, `path`, `$` shell execution |

### Supply Chain

- **No postinstall/preinstall scripts** in package.json
- **All resolved packages from registry.npmjs.org** (verified in package-lock.json, 205 resolved entries, all npmjs.org)
- **No suspicious transitive dependencies detected**
- `zx` gives shell access via the `$` template tag, but Shannon only uses it for `git` commands in `environment.ts`
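
The registry check on `package-lock.json` is easy to repeat after dependency updates. A minimal sketch follows, assuming a lockfile with a top-level `packages` map whose entries may carry a `resolved` URL (the lockfileVersion 2/3 layout); `nonNpmResolved` is a hypothetical helper, not part of Shannon.

```typescript
// Minimal shape of the parts of package-lock.json this check reads.
interface LockfileLike {
  packages?: Record<string, { resolved?: string }>;
}

// Returns "name -> url" for every entry resolved from outside the given registry.
function nonNpmResolved(
  lock: LockfileLike,
  registryHost = "registry.npmjs.org",
): string[] {
  const offenders: string[] = [];
  for (const [name, entry] of Object.entries(lock.packages ?? {})) {
    if (!entry.resolved) continue; // the root project entry has no resolved URL
    let host = "";
    try {
      host = new URL(entry.resolved).hostname;
    } catch {
      // An unparsable URL leaves host empty, so the entry is reported below.
    }
    if (host !== registryHost) offenders.push(`${name} -> ${entry.resolved}`);
  }
  return offenders;
}
```

In practice you would pass it `JSON.parse(readFileSync("package-lock.json", "utf8"))` and treat a non-empty result as a reason to look closer.
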

---

## Code Sections Requiring Closer Review

### 1. `permissionMode: 'bypassPermissions'` (claude-executor.ts:201)
The Claude agent runs with **all permissions bypassed**, so it can read, write, and execute anything in the `sourceDir` without prompting. This is by design (pentesting requires it), but it means the agent has full filesystem access within its working directory. The setting removes permission prompts rather than adding a sandbox, so any stricter confinement has to come from the environment, for example the Docker mode listed under Recommendations.

### 2. `npx @playwright/mcp@latest` (claude-executor.ts:83)
Downloads latest Playwright MCP from npm at runtime. Pin the version for reproducibility:
```
'@playwright/mcp@0.0.28' // or whatever current version is
```

### 3. `process.env` passthrough (claude-executor.ts:95)
Full `process.env` is passed to Playwright MCP child process. This includes `ANTHROPIC_API_KEY` and any other env vars. Playwright MCP doesn't need the API key, so consider filtering env vars.
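
One way to do the filtering is an explicit allowlist, so that only named variables reach the child process. The sketch below is an assumption about how this could look, not Shannon's code; `filterEnv` and the default allowlist are hypothetical and should be adjusted to what the child actually needs.

```typescript
// Assumed allowlist of variables a browser-automation child typically needs.
const DEFAULT_ENV_ALLOWLIST: readonly string[] = ["PATH", "HOME", "TMPDIR", "LANG", "NODE_ENV"];

// Copies only allowlisted, defined variables into a fresh object.
function filterEnv(
  env: Record<string, string | undefined>,
  allowlist: readonly string[] = DEFAULT_ENV_ALLOWLIST,
): Record<string, string> {
  const filtered: Record<string, string> = {};
  for (const key of allowlist) {
    const value = env[key];
    if (value !== undefined) filtered[key] = value;
  }
  return filtered;
}
```

Passing `filterEnv(process.env)` wherever the child's environment is assembled would keep `ANTHROPIC_API_KEY` out of the Playwright MCP process. An allowlist is preferred over a denylist here because unknown secrets are excluded by default.
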

### 4. Router mode (`ANTHROPIC_BASE_URL`)
If set, all API calls go to a custom endpoint instead of Anthropic. An attacker who controls this env var captures all prompts including source code.
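
A launcher script can guard against this by refusing to start when the variable points somewhere unexpected. This is a hedged sketch only: `isTrustedBaseUrl` and the trusted-host list are hypothetical, and the rule that plain HTTP is acceptable only for localhost is an assumption about how a local router would be run.

```typescript
// Hypothetical guard: accept ANTHROPIC_BASE_URL only if it is unset or points
// at an approved host, over HTTPS unless the host is local.
function isTrustedBaseUrl(baseUrl: string | undefined, trustedHosts: string[]): boolean {
  if (baseUrl === undefined || baseUrl === "") return true; // SDK default endpoint applies

  let parsed: URL;
  try {
    parsed = new URL(baseUrl);
  } catch {
    return false; // not a valid absolute URL
  }

  const isLocal = parsed.hostname === "localhost" || parsed.hostname === "127.0.0.1";
  if (parsed.protocol !== "https:" && !(isLocal && parsed.protocol === "http:")) {
    return false;
  }
  return trustedHosts.includes(parsed.hostname);
}
```

A caller might run `isTrustedBaseUrl(process.env.ANTHROPIC_BASE_URL, ["api.anthropic.com", "localhost"])` before starting a scan and abort on `false`.
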

---

## Architecture Overview

```
Shannon CLI
├── Pre-recon agent (code analysis, no browser)
├── Recon agent (browser + code analysis)
├── 5x Vulnerability agents (parallel, browser + code)
├── 5x Exploitation agents (parallel, browser + code)
└── Report agent (generates final report)

Each agent = Claude API call with:
  - System prompt from prompts/*.txt
  - Target source code as context
  - MCP tools: save_deliverable, generate_totp, playwright
  - Isolated browser profile
  - Git checkpoints for rollback
```

---

## Recommendations

1. **✅ Safe to use** against your own infrastructure
2. **Pin Playwright MCP version** instead of `@latest`
3. **Filter env vars** passed to the Playwright child process
4. **Be aware:** your source code goes to Anthropic's API, which is standard for any AI code analysis tool
5. **Run in Docker** (Shannon supports it via `SHANNON_DOCKER=true`) for additional isolation
6. **Set firewall rules** per the whitelist above and block all other egress