Shannon AI Pentesting Tool — Security Audit

Date: 2025-02-09
Repo: https://github.com/KeygraphHQ/shannon
Auditor: James (automated)
Verdict: ✅ SAFE (with caveats)

Summary

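Shannon is a TypeScript-based AI pentesting orchestrator that uses the Claude Agent SDK to run multi-phase security assessments. The codebase is clean and well-structured, and Shannon's own source makes no direct outbound network calls. All network activity is delegated to the following (plus a package download from registry.npmjs.org when npx fetches Playwright MCP, covered under All Outbound Network Connections):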

Anthropic Claude API (via @anthropic-ai/claude-agent-sdk) — the LLM backend
Temporal (optional, localhost:7233 by default) — workflow orchestration
Playwright MCP (spawned via npx) — browser automation against the target app
The target application itself — whatever URL you point it at

The audit found no telemetry, phone-home behavior, data exfiltration, uploads to cloud storage, or webhook calls to Keygraph or any third party in Shannon's own source.

Exfiltration Risk Assessment

Check	Result
Outbound HTTP/HTTPS calls (fetch, axios, http.get)	❌ None found in Shannon source
Telemetry / analytics	❌ None
Phone-home behavior	❌ None
S3/GCS/cloud storage uploads	❌ None
Webhook calls to KeygraphHQ	❌ None
Base64 encoded payloads	❌ None; only a standard Base32 decode (RFC 4648) of the secret used for TOTP generation (RFC 4226/6238)
Obfuscated URLs	❌ None
eval()/exec() with remote input	❌ None — only `regex.exec()` for prompt template parsing
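
For reviewers who want to recognize the benign pattern flagged in the table, a standard Base32 decode looks roughly like the sketch below. This is not Shannon's code; the name `base32Decode` is hypothetical, and the sketch only assumes the RFC 4648 alphabet.

```typescript
// Illustrative only: a plain RFC 4648 Base32 decoder, the kind of routine
// typically used to turn a TOTP secret into raw key bytes.
const BASE32_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";

function base32Decode(input: string): Uint8Array {
  // Normalize: drop whitespace, upper-case, strip trailing "=" padding.
  const clean = input.replace(/\s+/g, "").toUpperCase().replace(/=+$/, "");
  const bytes: number[] = [];
  let buffer = 0; // bit accumulator
  let bits = 0;   // number of valid bits currently in the accumulator
  for (let i = 0; i < clean.length; i++) {
    const value = BASE32_ALPHABET.indexOf(clean.charAt(i));
    if (value === -1) {
      throw new Error("Invalid Base32 character: " + clean.charAt(i));
    }
    buffer = (buffer << 5) | value; // each character contributes 5 bits
    bits += 5;
    if (bits >= 8) {
      bits -= 8;
      bytes.push((buffer >> bits) & 0xff); // emit the top 8 bits
      buffer &= (1 << bits) - 1;           // keep only the unconsumed low bits
    }
  }
  return Uint8Array.from(bytes);
}
```

A decoder of this shape reads a local string and returns bytes; it performs no I/O, which is why it does not count as an encoded-payload concern.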

All Outbound Network Connections

1. Anthropic API (Claude)

How: Via @anthropic-ai/claude-agent-sdk query() function
Destination: api.anthropic.com (or ANTHROPIC_BASE_URL if set for router mode)
Auth: ANTHROPIC_API_KEY env var (handled by SDK, not Shannon)
Data sent: Prompts + tool results; source code is sent as prompt context to Claude

2. Temporal Server (Optional)

How: @temporalio/client and @temporalio/worker
Destination: TEMPORAL_ADDRESS env var, default localhost:7233
Purpose: Workflow orchestration (parallel agent execution, retries)
Note: Local by default; only remote if you configure it

3. Playwright MCP (Browser Automation)

How: Spawned as child process via npx @playwright/mcp@latest
Destination: The target application URL only
Purpose: Browser-based testing of the target app
Note: Each agent gets isolated browser profile in /tmp/playwright-agentN

4. npm Registry (npx)

How: npx @playwright/mcp@latest downloads on first run
Destination: registry.npmjs.org
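One-time: Mainly on first execution; can be pre-installed. Because the spec uses @latest, npx may contact the registry again on later runs to resolve the tag unless the version is pinned.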

Recommended Firewall Whitelist

# Required
api.anthropic.com:443          # Claude API (or your ANTHROPIC_BASE_URL)
<your-target-app>              # Whatever you're pentesting

# Optional (Temporal, only if using workflow mode)
localhost:7233                  # Temporal server (default, local)

# One-time (npx playwright download)
registry.npmjs.org:443         # npm packages

Block everything else. Shannon makes no other outbound connections.
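
Because the allowlist depends on a few settings (the target URL, ANTHROPIC_BASE_URL, TEMPORAL_ADDRESS), it can be derived rather than maintained by hand. The following is a minimal sketch under that assumption; `EgressInputs`, `hostPort`, and `buildEgressAllowlist` are hypothetical names, not part of Shannon.

```typescript
// Hypothetical helper: derives host:port allowlist entries from the
// settings this audit identifies as determining Shannon's egress.
interface EgressInputs {
  targetUrl: string;          // the application under test
  anthropicBaseUrl?: string;  // value of ANTHROPIC_BASE_URL, if set
  temporalAddress?: string;   // value of TEMPORAL_ADDRESS, e.g. "localhost:7233"
  allowNpmRegistry?: boolean; // true only while npx still needs to download packages
}

// Convert a URL into "host:port", filling in the scheme's default port.
function hostPort(rawUrl: string): string {
  const url = new URL(rawUrl);
  const defaultPort = url.protocol === "https:" ? "443" : "80";
  return url.hostname + ":" + (url.port !== "" ? url.port : defaultPort);
}

function buildEgressAllowlist(inputs: EgressInputs): string[] {
  const entries: string[] = [
    hostPort(inputs.anthropicBaseUrl ?? "https://api.anthropic.com"),
    hostPort(inputs.targetUrl),
  ];
  if (inputs.temporalAddress) entries.push(inputs.temporalAddress);
  if (inputs.allowNpmRegistry) entries.push("registry.npmjs.org:443");
  return Array.from(new Set(entries)); // de-duplicate, keeping insertion order
}
```

The resulting list can then be translated into whatever rule syntax your firewall uses; that translation is deliberately left out here.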

Credential Handling

Credential	Handling	Risk
`ANTHROPIC_API_KEY`	Read by SDK from env var	✅ Never stored/logged by Shannon
`ANTHROPIC_BASE_URL`	Optional env var for router mode	✅ Safe
`ROUTER_DEFAULT`	Optional env var (e.g., `gemini,gemini-2.5-pro`)	✅ Safe
Target source code	Copied to local `sourceDir`, git-managed	⚠️ Sent to Claude API as prompt context

Key concern: Target source code is sent to the Anthropic API (or to ANTHROPIC_BASE_URL in router mode) as part of the prompt. This is inherent to how the tool works, since Claude analyzes the code. If your source code is highly sensitive, this is the main risk vector.

Source Code / Temp File Handling

Source is cloned to a local directory you specify
Git initialized for checkpoints/rollbacks within that directory
Deliverables written to <sourceDir>/deliverables/
Playwright profiles in /tmp/playwright-agent{1-5}
Error logs written to <sourceDir>/error.log
No cleanup of temp files after execution (manual cleanup needed)
No uploads anywhere — all files stay local
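
Since nothing is cleaned up automatically, a small post-run step can remove the browser profiles. The sketch below assumes the /tmp/playwright-agent{1-5} naming noted above; `playwrightProfileDirs`, `Remover`, and `cleanupProfiles` are hypothetical names, and the actual delete function is injected so the sketch does not depend on a specific filesystem API.

```typescript
// Hypothetical helpers, not part of Shannon.
// Builds the profile paths the audit observed: <base>/playwright-agent1 .. agentN.
function playwrightProfileDirs(agentCount: number = 5, base: string = "/tmp"): string[] {
  const dirs: string[] = [];
  for (let i = 1; i <= agentCount; i++) {
    dirs.push(base + "/playwright-agent" + i);
  }
  return dirs;
}

// The caller supplies the function that actually deletes a directory.
type Remover = (path: string) => Promise<void>;

// Removes every profile directory and returns the list of paths it attempted.
async function cleanupProfiles(remove: Remover, agentCount: number = 5): Promise<string[]> {
  const dirs = playwrightProfileDirs(agentCount);
  await Promise.all(dirs.map((dir) => remove(dir)));
  return dirs;
}
```

In Node, one reasonable remover is `(p) => rm(p, { recursive: true, force: true })` with `rm` imported from `node:fs/promises`. Deliverables and error.log under sourceDir are intentionally not touched, since you will usually want to keep them.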

Dependencies Risk Assessment

Direct Dependencies (package.json)

Package	Version	Risk	Notes
`@anthropic-ai/claude-agent-sdk`	^0.1.0	✅ Low	Official Anthropic SDK
`@temporalio/activity`	^1.11.0	✅ Low	Well-known workflow engine
`@temporalio/client`	^1.11.0	✅ Low	Well-known workflow engine
`@temporalio/worker`	^1.11.0	✅ Low	Well-known workflow engine
`@temporalio/workflow`	^1.11.0	✅ Low	Well-known workflow engine
`ajv`	^8.12.0	✅ Low	JSON schema validator, widely used
`ajv-formats`	^2.1.1	✅ Low	AJV extension
`boxen`	^8.0.1	✅ Low	Terminal box drawing
`chalk`	^5.0.0	✅ Low	Terminal colors
`dotenv`	^16.4.5	✅ Low	Env file loader
`figlet`	^1.9.3	✅ Low	ASCII art text
`gradient-string`	^3.0.0	✅ Low	Terminal gradients
`js-yaml`	^4.1.0	✅ Low	YAML parser
`zod`	^3.22.4	✅ Low	Schema validation
`zx`	^8.0.0	⚠️ Medium	Google's shell scripting lib — powerful but well-known; used for `fs`, `path`, `$` shell execution

Supply Chain

No postinstall/preinstall scripts in package.json
All resolved packages from registry.npmjs.org (verified in package-lock.json, 205 resolved entries, all npmjs.org)
No suspicious transitive dependencies detected
zx gives shell access via $ template tag — but Shannon only uses it for git commands in environment.ts
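
The registry check on package-lock.json can be repeated mechanically. The sketch below assumes a lockfile with a top-level `packages` map (lockfileVersion 2 or 3) whose entries carry a `resolved` URL; `Lockfile`, `LockfilePackage`, and `findNonRegistryPackages` are hypothetical names used only for illustration.

```typescript
// Minimal shape of a package-lock.json: only the fields this check reads.
interface LockfilePackage {
  resolved?: string;
}
interface Lockfile {
  packages?: Record<string, LockfilePackage>;
}

// Returns the keys of all entries whose `resolved` URL is not an https URL
// on the expected registry host.
function findNonRegistryPackages(lock: Lockfile, expectedHost: string = "registry.npmjs.org"): string[] {
  const offenders: string[] = [];
  const packages = lock.packages ?? {};
  for (const name of Object.keys(packages)) {
    const resolved = packages[name]?.resolved;
    if (resolved === undefined) continue; // entries without `resolved` (such as the root project) are skipped
    try {
      const url = new URL(resolved);
      if (url.protocol !== "https:" || url.hostname !== expectedHost) offenders.push(name);
    } catch {
      offenders.push(name); // unparseable `resolved` values are treated as suspicious
    }
  }
  return offenders;
}
```

An empty result corresponds to the finding above that all resolved entries point at registry.npmjs.org. Note that this sketch would also flag local or git-sourced entries, which is usually what an auditor wants to look at anyway.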

Code Sections Requiring Closer Review

1. `permissionMode: 'bypassPermissions'` (claude-executor.ts:201)

The Claude agent runs with permission prompts bypassed, so it can read, write, and execute within the sourceDir without confirmation. This is by design (pentesting requires it), but bypassPermissions is not a sandbox: commands the agent runs inherit the privileges of the Shannon process, so any real isolation has to come from the environment (for example, the Docker mode).

2. `npx @playwright/mcp@latest` (claude-executor.ts:83)

Downloads latest Playwright MCP from npm at runtime. Pin the version for reproducibility:

'@playwright/mcp@0.0.28'  // or whatever current version is
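
To keep a pin from silently regressing to a tag or range, a small guard can reject anything that is not an exact version. This is a sketch under that assumption; `isPinnedSpec` is a hypothetical helper and not something Shannon ships.

```typescript
// Exact x.y.z version, optionally followed by a prerelease or build suffix.
const EXACT_VERSION = /^\d+\.\d+\.\d+(?:[-+][0-9A-Za-z.-]+)?$/;

// Returns true only for specs such as "@scope/name@1.2.3" or "name@1.2.3".
function isPinnedSpec(spec: string): boolean {
  const at = spec.lastIndexOf("@");
  if (at <= 0) return false; // no version part (an "@" at index 0 is only the scope marker)
  return EXACT_VERSION.test(spec.slice(at + 1));
}
```

For example, `isPinnedSpec("@playwright/mcp@latest")` is false while `isPinnedSpec("@playwright/mcp@0.0.28")` is true, so the check could run wherever the npx argument is assembled.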

3. `process.env` passthrough (claude-executor.ts:95)

The full process.env is passed to the Playwright MCP child process, including ANTHROPIC_API_KEY and any other env vars. Playwright MCP doesn't need the API key, so consider filtering the env vars it receives.
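
One way to filter is an explicit allowlist of variable names. The sketch below is illustrative: `filterEnv` and `DEFAULT_ALLOWED` are hypothetical, and the default list is an assumption about what a child process typically needs, not a verified requirement of Playwright MCP.

```typescript
type Env = Record<string, string | undefined>;

// Assumed allowlist: adjust to whatever the child process actually requires.
const DEFAULT_ALLOWED: readonly string[] = ["PATH", "HOME", "TMPDIR", "LANG"];

// Copies only the allowlisted variables that are actually set.
function filterEnv(env: Env, allowed: readonly string[] = DEFAULT_ALLOWED): Record<string, string> {
  const filtered: Record<string, string> = {};
  for (const name of allowed) {
    const value = env[name];
    if (value !== undefined) filtered[name] = value;
  }
  return filtered;
}
```

Passing `filterEnv(process.env)` as the child's environment instead of `process.env` keeps ANTHROPIC_API_KEY out of the browser automation process. An allowlist is preferred over a denylist here because new secrets added to the environment later stay excluded by default.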

4. Router mode (`ANTHROPIC_BASE_URL`)

If set, all API calls go to a custom endpoint instead of Anthropic. An attacker who controls this env var can capture all prompts, including source code, along with whatever credentials the SDK attaches to those requests.
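
A startup guard can make an unexpected base URL fail loudly. The following is a sketch, not Shannon behavior: `isTrustedBaseUrl` is a hypothetical function, and the trusted host list is an assumption you would extend with your own router host.

```typescript
// Hypothetical guard for ANTHROPIC_BASE_URL.
// Rules in this sketch: unset is fine, the host must be on the trusted list,
// and plain http is accepted only for loopback hosts.
function isTrustedBaseUrl(
  baseUrl: string | undefined,
  trustedHosts: readonly string[] = ["api.anthropic.com"],
): boolean {
  if (baseUrl === undefined || baseUrl === "") return true; // unset: default endpoint is used
  let url: URL;
  try {
    url = new URL(baseUrl);
  } catch {
    return false; // not a parseable URL
  }
  const isLoopback = url.hostname === "localhost" || url.hostname === "127.0.0.1";
  if (url.protocol !== "https:" && !(isLoopback && url.protocol === "http:")) return false;
  return trustedHosts.indexOf(url.hostname) !== -1;
}
```

Calling it with the value of ANTHROPIC_BASE_URL before any agent starts, and aborting on `false`, turns a tampered environment into a visible error rather than a silent redirect of prompts.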

Architecture Overview

Shannon CLI
  ├── Pre-recon agent (code analysis, no browser)
  ├── Recon agent (browser + code analysis)
  ├── 5x Vulnerability agents (parallel, browser + code)
  ├── 5x Exploitation agents (parallel, browser + code)
  └── Report agent (generates final report)

Each agent = Claude API call with:
  - System prompt from prompts/*.txt
  - Target source code as context
  - MCP tools: save_deliverable, generate_totp, playwright
  - Isolated browser profile
  - Git checkpoints for rollback
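
The diagram above can be read as a list of phases with a fixed number of agents each. The sketch below encodes that reading; it assumes phases run in the listed order with agents inside a phase running concurrently, which the diagram suggests but this audit does not verify. `Phase`, `PIPELINE`, `totalAgents`, `peakConcurrency`, and `runPipeline` are hypothetical names.

```typescript
interface Phase {
  name: string;
  agents: number; // number of agents started in this phase
}

// Phase list taken from the architecture diagram.
const PIPELINE: readonly Phase[] = [
  { name: "pre-recon", agents: 1 },
  { name: "recon", agents: 1 },
  { name: "vulnerability", agents: 5 },
  { name: "exploitation", agents: 5 },
  { name: "report", agents: 1 },
];

// Total number of agent runs across the whole assessment.
function totalAgents(phases: readonly Phase[] = PIPELINE): number {
  return phases.reduce((sum, phase) => sum + phase.agents, 0);
}

// Largest number of agents active at once, assuming phases do not overlap.
function peakConcurrency(phases: readonly Phase[] = PIPELINE): number {
  return phases.reduce((max, phase) => Math.max(max, phase.agents), 0);
}

type RunAgent = (phase: string, index: number) => Promise<void>;

// Assumed execution model: one phase at a time, agents within a phase in parallel.
async function runPipeline(runAgent: RunAgent, phases: readonly Phase[] = PIPELINE): Promise<void> {
  for (const phase of phases) {
    const runs: Promise<void>[] = [];
    for (let i = 1; i <= phase.agents; i++) runs.push(runAgent(phase.name, i));
    await Promise.all(runs);
  }
}
```

Under these assumptions a full run involves 13 agent invocations with at most 5 active at once, which is consistent with the five Playwright profile directories noted earlier.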

Recommendations

✅ Safe to use against your own infrastructure
Pin Playwright MCP version instead of @latest
Filter env vars passed to Playwright child process
Be aware: your source code goes to Anthropic's API — standard for any AI code analysis tool
Run in Docker (they support it via SHANNON_DOCKER=true) for additional isolation
Set firewall rules per the whitelist above — block all other egress
