Feat/mcp ecosystem by frankgraziano · Pull Request #691 · DataDog/guarddog

frankgraziano · 2026-03-23T17:53:13Z

PR inspired by @AlecRandazzo's thread: https://detection-response.slack.com/archives/C01TNGGQ84V/p1774018433400609

Summary

Add MCP (Model Context Protocol) as a first-class ecosystem in guarddog, enabling static security analysis of MCP server configurations across 11 AI coding clients.

MCP configs define how AI agents connect to external tool servers. Misconfigurations — hardcoded secrets, shell wrappers, unpinned packages, overbroad filesystem access — create real attack surface as MCP adoption grows across developer tooling. This PR adds discovery, parsing, and 7 metadata security rules to catch these issues.

What's included

MCP ecosystem plumbing

ECOSYSTEM.MCP = "mcp" enum value, which automatically registers the CLI subcommands (guarddog mcp scan, guarddog mcp verify, guarddog mcp list-rules)
MCPConfigScanner (package scanner) and MCPDiscoveryScanner (project scanner)
Config file discovery that finds MCP configs in both project-scoped paths (.mcp.json, .cursor/mcp.json, .vscode/mcp.json, etc.) and user-scoped paths (~/Library/Application Support/Claude/, ~/.codex/config.toml, etc.)
Recursive globs are scoped to known VS Code extension and config directories to avoid expensive walks over $HOME
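
To illustrate the discovery approach described above, here is a minimal sketch, not the PR's actual implementation. The function names, the `PROJECT_CONFIG_PATHS` list, and the `**/mcp*.json` glob pattern are assumptions made for illustration; only the three project-scoped paths come from the description.

```python
from pathlib import Path
from typing import Iterator, List

# Subset of the project-scoped locations named in the PR description.
PROJECT_CONFIG_PATHS = [".mcp.json", ".cursor/mcp.json", ".vscode/mcp.json"]


def discover_project_configs(project_root: Path) -> List[Path]:
    """Return the known project-scoped MCP config files that exist."""
    candidates = (project_root / rel for rel in PROJECT_CONFIG_PATHS)
    return [path for path in candidates if path.is_file()]


def discover_scoped(
    home: Path, scoped_dirs: List[str], pattern: str = "**/mcp*.json"
) -> Iterator[Path]:
    """Glob recursively only inside known directories, never over all of home.

    `scoped_dirs` and `pattern` are hypothetical; the real scanner may use
    different directories and file name patterns.
    """
    for rel in scoped_dirs:
        base = home / rel
        if base.is_dir():
            yield from sorted(base.glob(pattern))
```

Restricting the recursive glob to a short list of base directories keeps the walk cost proportional to those directories rather than to the whole home directory.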

11 client parsers

Normalized parsing for: Claude Desktop, Claude Code, Cursor, VS Code, Windsurf, Cline, Roo Code, Continue, Codex, Gemini CLI, Copilot CLI. Each parser handles the client's specific config format (JSON/YAML/TOML, varying key names like mcpServers vs servers vs mcp_servers) and produces a uniform MCPServerConfig → MCPConfigFile → MCPInventory model.
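
The normalization step can be pictured roughly as follows. This is a hedged sketch under stated assumptions: `ServerEntry` is a hypothetical stand-in for `MCPServerConfig`, its fields are guesses, and `normalize_servers` is not a function from the PR. Only the three key names (`mcpServers`, `servers`, `mcp_servers`) come from the description.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

# Key names the description lists as varying between clients.
SERVER_KEYS = ("mcpServers", "servers", "mcp_servers")


@dataclass
class ServerEntry:
    """Hypothetical stand-in for MCPServerConfig; field names are assumed."""

    name: str
    command: Optional[str] = None
    args: List[str] = field(default_factory=list)
    env: Dict[str, str] = field(default_factory=dict)
    url: Optional[str] = None


def normalize_servers(raw: Dict[str, Any]) -> List[ServerEntry]:
    """Map an already-parsed config (from JSON, YAML, or TOML) to uniform entries."""
    for key in SERVER_KEYS:
        section = raw.get(key)
        if isinstance(section, dict):
            return [
                ServerEntry(
                    name=name,
                    command=spec.get("command"),
                    args=list(spec.get("args", [])),
                    env=dict(spec.get("env", {})),
                    url=spec.get("url"),
                )
                for name, spec in section.items()
                if isinstance(spec, dict)  # skip malformed entries
            ]
    return []
```

Once every client format is reduced to the same entry type, the security rules only need to be written once against that model.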

7 metadata security rules

| Rule | What it catches |
| -- | -- |
| inline-secret-in-mcp-config | Hard-coded API keys, tokens, passwords in env vars or headers |
| plaintext-http-mcp | MCP server endpoints using http:// instead of https:// |
| arbitrary-shell-launcher | Servers launched via bash -c, sh -c, cmd /c, etc. |
| shared-project-mcp-config | Project-scoped configs (.mcp.json, .cursor/mcp.json) likely committed to VCS |
| floating-package-launcher | Unpinned npx, uvx, pipx, or docker :latest package resolution |
| dangerous-tool-surface | Server names/commands suggesting exec, delete, ssh, kubectl, etc. |
| overbroad-filesystem-access | Broad or sensitive paths (/, ~, .ssh, .aws) in server args or cwd |
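
As a rough illustration of what three of these rules check, the sketch below implements simplified versions of the shell-launcher, plaintext-HTTP, and floating-package checks. The function names, the shell list, and the matching logic are assumptions for illustration and are likely narrower than the detectors in the PR (for example, flags that take a value and Windows-style paths are not handled).

```python
import os
from typing import List, Optional

# Assumed sets; the real detectors may recognize more shells and flags.
SHELLS = {"bash", "sh", "zsh", "cmd", "cmd.exe", "powershell", "pwsh"}
SHELL_FLAGS = {"-c", "/c", "-command"}


def is_shell_launcher(command: str, args: List[str]) -> bool:
    """True when a server is started through a shell wrapper such as `bash -c`."""
    name = os.path.basename(command).lower()
    return name in SHELLS and any(arg.lower() in SHELL_FLAGS for arg in args)


def is_plaintext_http(url: Optional[str]) -> bool:
    """True when an endpoint uses http:// rather than https://."""
    return bool(url) and url.lower().startswith("http://")


def is_floating_npx(args: List[str]) -> bool:
    """True when the first non-flag npx argument has no pinned version.

    Skipping the first character keeps the leading '@' of a scoped package
    (e.g. '@scope/pkg') from being mistaken for a version separator.
    """
    for arg in args:
        if arg.startswith("-"):
            continue
        _, sep, version = arg[1:].rpartition("@")
        return not sep or version in ("", "latest")
    return False
```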

`--verbose` flag and finding citations (cross-ecosystem)

Detector base class gains optional help_url and verbose_description fields
Human-readable reporter now shows a ref: link for each metadata finding
--verbose adds a why: explanation with remediation guidance
Available for all ecosystems, not just MCP — existing detectors can adopt help_url/verbose_description incrementally
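
The relationship between the optional detector fields and the reporter output might look roughly like this. It is a simplified sketch: the `Detector` class below is a stand-in rather than guarddog's actual base class, and `format_finding` and the indentation of the `ref:` and `why:` lines are assumptions.

```python
from typing import Optional


class Detector:
    """Simplified stand-in for a detector base class with optional citation fields."""

    def __init__(
        self,
        name: str,
        description: str,
        help_url: Optional[str] = None,
        verbose_description: Optional[str] = None,
    ) -> None:
        self.name = name
        self.description = description
        # Both fields default to None so existing detectors keep working unchanged.
        self.help_url = help_url
        self.verbose_description = verbose_description


def format_finding(detector: Detector, message: str, verbose: bool = False) -> str:
    """Render one finding, adding ref: when available and why: only in verbose mode."""
    lines = [f"{detector.name}: {message}"]
    if detector.help_url:
        lines.append(f"  ref: {detector.help_url}")
    if verbose and detector.verbose_description:
        lines.append(f"  why: {detector.verbose_description}")
    return "\n".join(lines)
```

Because both fields are optional, a detector that sets neither renders exactly as before, which is what allows incremental adoption.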

Lazy rule loading (cross-ecosystem perf fix)

Metadata detector imports are now lazy (deferred to first use inside each match case)
_LazyRulesChoice in the CLI defers rule set computation until Click actually needs tab-completion or validation
Fixes an issue where every CLI invocation eagerly instantiated all ecosystem detectors, triggering network requests (e.g. rubygems top-packages cache refresh)
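
The deferral idea can be sketched with the standard library alone. `LazyChoices` below is a hypothetical illustration of the pattern, not the PR's `_LazyRulesChoice` and not a Click type: the expensive loader runs only when something iterates over the choices or checks membership.

```python
from typing import Callable, Iterator, List, Optional


class LazyChoices:
    """Sequence-like container that computes its items on first access."""

    def __init__(self, loader: Callable[[], List[str]]) -> None:
        self._loader = loader
        self._items: Optional[List[str]] = None

    def _load(self) -> List[str]:
        # Run the loader at most once, and only when the items are needed.
        if self._items is None:
            self._items = list(self._loader())
        return self._items

    def __iter__(self) -> Iterator[str]:
        return iter(self._load())

    def __contains__(self, item: object) -> bool:
        return item in self._load()

    def __len__(self) -> int:
        return len(self._load())
```

Constructing the object is cheap, so a CLI can declare its options at import time without instantiating any detectors; commands that never validate a rule name never pay the loading cost.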

Tests

48 unit tests for all 7 MCP detectors (positive detection, negative/safe cases, edge cases for placeholders, substrings, scoped paths)
5 integration tests for MCPConfigScanner.scan_local() with temporary .mcp.json configs
All pre-existing tests remain unaffected

Example output

$ guarddog mcp scan /path/to/project
Found 3 potentially malicious indicators in /path/to/project

inline-secret-in-mcp-config: MCP server 'my-server' contains an inline secret in 'env.API_KEY'
  ref: https://github.com/DataDog/guarddog/wiki/MCP-Rules#inline-secret-in-mcp-config

arbitrary-shell-launcher: MCP server 'my-server' is launched via shell command 'bash'
  ref: https://github.com/DataDog/guarddog/wiki/MCP-Rules#arbitrary-shell-launcher

shared-project-mcp-config: MCP config '.cursor/mcp.json' is project-scoped and may be shared with collaborators or CI
  ref: https://github.com/DataDog/guarddog/wiki/MCP-Rules#shared-project-mcp-config

With --verbose, each finding also shows a why: line with a detailed explanation and remediation guidance.

Commits

Add MCP ecosystem enum and scanner infrastructure — enum, discovery, 11 parsers, data models, both scanner classes
Add MCP metadata detectors — 7 security rules + registry wiring
Add --verbose flag, finding citations, and lazy rule loading — cross-ecosystem CLI/reporter enhancements + startup perf fix
Add tests for MCP detectors and config scanner — 53 tests total

Test plan

python -m pytest tests/analyzer/metadata/test_mcp_detectors.py -v — 48 detector unit tests
python -m pytest tests/core/test_mcp_config_scanner.py -v — 5 integration tests
python -m guarddog mcp list-rules — lists all 7 metadata rules + 1 source code rule
python -m guarddog mcp scan <dir> — scans a directory and reports findings with ref: links
python -m guarddog mcp scan <dir> --verbose — includes why: explanations
python -m guarddog pypi scan requests — verify existing ecosystems unaffected
make test — full suite (no new failures introduced)

Introduce MCP (Model Context Protocol) as a first-class ecosystem in guarddog. This adds the ECOSYSTEM.MCP enum value, config file discovery for 11 MCP clients (Claude Desktop, Claude Code, Cursor, VS Code, Windsurf, Cline, Roo Code, Continue, Codex, Gemini CLI, Copilot CLI), a normalized data model (MCPServerConfig / MCPConfigFile / MCPInventory), and two scanner classes: MCPConfigScanner (package scanner) and MCPDiscoveryScanner (project scanner). Discovery scopes recursive globs to known VS Code extension and config directories to avoid expensive walks over the entire home directory. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add 7 security rules for MCP configuration analysis:
- inline-secret-in-mcp-config: hard-coded credentials in env/headers
- plaintext-http-mcp: insecure HTTP endpoints
- arbitrary-shell-launcher: shell wrapper execution (bash, sh, cmd, etc.)
- shared-project-mcp-config: project-scoped configs shared via VCS
- floating-package-launcher: unpinned npx/uvx/pipx/docker packages
- dangerous-tool-surface: server names suggesting risky capabilities
- overbroad-filesystem-access: broad paths (/, ~, .ssh, .aws) in args

Register the rules in the metadata detector registry so they are available via 'guarddog mcp list-rules' and 'guarddog mcp scan'.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Extend the Detector base class with optional help_url and verbose_description fields. The human-readable reporter now shows a 'ref:' link for each metadata finding, and a 'why:' explanation when --verbose is passed. The --verbose flag is threaded through all CLI scan/verify paths and all reporter interfaces. Also introduce _LazyRulesChoice in the CLI to defer metadata detector instantiation until rule names are actually needed. This prevents eager imports that triggered network requests (e.g. rubygems cache refresh) on every CLI invocation, even for unrelated ecosystems. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add 48 unit tests covering all 7 MCP metadata detectors, including positive detection, negative/safe cases, edge cases (placeholders, substrings, scoped paths), and the rule registry. Add 5 integration tests for MCPConfigScanner.scan_local() using temporary .mcp.json configs to verify end-to-end detection of inline secrets, plaintext HTTP, shell launchers, and benign configs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

frankgraziano and others added 4 commits March 23, 2026 13:38

frankgraziano requested a review from a team as a code owner March 23, 2026 17:53

Merge branch 'main' into feat/mcp-ecosystem

c3f6df8

frankgraziano wants to merge 5 commits into DataDog:main from frankgraziano:feat/mcp-ecosystem
