Feat/mcp ecosystem #691

Open
frankgraziano wants to merge 5 commits into DataDog:main from frankgraziano:feat/mcp-ecosystem

@frankgraziano

PR inspired by @AlecRandazzo's thread: https://detection-response.slack.com/archives/C01TNGGQ84V/p1774018433400609

Summary

Add MCP (Model Context Protocol) as a first-class ecosystem in guarddog, enabling static security analysis of MCP server configurations across 11 AI coding clients.

MCP configs define how AI agents connect to external tool servers. Misconfigurations — hardcoded secrets, shell wrappers, unpinned packages, overbroad filesystem access — create real attack surface as MCP adoption grows across developer tooling. This PR adds discovery, parsing, and 7 metadata security rules to catch these issues.

What's included

MCP ecosystem plumbing

  • ECOSYSTEM.MCP = "mcp" enum value — automatically registers CLI subcommand (guarddog mcp scan, guarddog mcp verify, guarddog mcp list-rules)
  • MCPConfigScanner (package scanner) and MCPDiscoveryScanner (project scanner)
  • Config file discovery that finds MCP configs in both project-scoped paths (.mcp.json, .cursor/mcp.json, .vscode/mcp.json, etc.) and user-scoped paths (~/Library/Application Support/Claude/, ~/.codex/config.toml, etc.)
  • Recursive globs are scoped to known VS Code extension and config directories to avoid expensive walks over $HOME
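The scoped-discovery idea above can be sketched as follows. This is a minimal illustration, not the PR's implementation: the function name `discover_configs`, the constant names, the `.vscode/extensions` root, and the `**/mcp*.json` pattern are all assumptions. Only the project-scoped paths and `~/.codex/config.toml` come from the description.

```python
from pathlib import Path

# Project-scoped locations named in the PR description.
PROJECT_CONFIG_PATHS = (".mcp.json", ".cursor/mcp.json", ".vscode/mcp.json")

# User-scoped location named in the PR description, relative to the home directory.
USER_CONFIG_PATHS = (".codex/config.toml",)

# Hypothetical (root, pattern) pairs: recursion happens only below these roots,
# never over the whole home directory.
SCOPED_GLOBS = ((".vscode/extensions", "**/mcp*.json"),)


def discover_configs(project_root: Path, home: Path) -> list[Path]:
    """Return MCP config files found in fixed paths and scoped recursive globs."""
    found: list[Path] = []

    # Fixed paths are cheap: one stat call per candidate.
    for rel in PROJECT_CONFIG_PATHS:
        candidate = project_root / rel
        if candidate.is_file():
            found.append(candidate)
    for rel in USER_CONFIG_PATHS:
        candidate = home / rel
        if candidate.is_file():
            found.append(candidate)

    # Recursive globs run only inside known directories.
    for base, pattern in SCOPED_GLOBS:
        root = home / base
        if root.is_dir():
            found.extend(sorted(p for p in root.glob(pattern) if p.is_file()))

    return found
```

A file such as `~/other/mcp.json` is never visited here, because no glob is rooted at the home directory itself.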

11 client parsers

Normalized parsing for: Claude Desktop, Claude Code, Cursor, VS Code, Windsurf, Cline, Roo Code, Continue, Codex, Gemini CLI, Copilot CLI. Each parser handles the client's specific config format (JSON/YAML/TOML, varying key names like mcpServers vs servers vs mcp_servers) and produces a uniform MCPServerConfig / MCPConfigFile / MCPInventory model.
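As a rough sketch of what normalization across key names could look like for a JSON config: the class names `MCPServerConfig` and `MCPConfigFile` come from this PR, but their fields, the `parse_json_config` helper, and the lookup order are assumptions made for illustration.

```python
import json
from dataclasses import dataclass, field
from typing import Optional

# Key-name variants mentioned in the PR description.
SERVER_KEYS = ("mcpServers", "servers", "mcp_servers")


@dataclass
class MCPServerConfig:
    # Field names here are hypothetical; the PR only names the class.
    name: str
    command: Optional[str] = None
    args: list = field(default_factory=list)
    env: dict = field(default_factory=dict)
    url: Optional[str] = None


@dataclass
class MCPConfigFile:
    path: str
    client: str
    servers: list = field(default_factory=list)


def parse_json_config(path: str, client: str, text: str) -> MCPConfigFile:
    """Parse a JSON MCP config and map whichever server key it uses to one model."""
    data = json.loads(text)
    if not isinstance(data, dict):
        data = {}

    # Use the first key variant that holds a mapping of server definitions.
    raw_servers = {}
    for key in SERVER_KEYS:
        if isinstance(data.get(key), dict):
            raw_servers = data[key]
            break

    servers = [
        MCPServerConfig(
            name=name,
            command=spec.get("command"),
            args=list(spec.get("args", [])),
            env=dict(spec.get("env", {})),
            url=spec.get("url"),
        )
        for name, spec in raw_servers.items()
        if isinstance(spec, dict)
    ]
    return MCPConfigFile(path=path, client=client, servers=servers)
```

With this shape, detectors only ever see `MCPServerConfig` objects and do not need to know which client wrote the file.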

7 metadata security rules

| Rule | What it catches |
| -- | -- |
| inline-secret-in-mcp-config | Hard-coded API keys, tokens, passwords in env vars or headers |
| plaintext-http-mcp | MCP server endpoints using http:// instead of https:// |
| arbitrary-shell-launcher | Servers launched via bash -c, sh -c, cmd /c, etc. |
| shared-project-mcp-config | Project-scoped configs (.mcp.json, .cursor/mcp.json) likely committed to VCS |
| floating-package-launcher | Unpinned npx, uvx, pipx, or docker :latest package resolution |
| dangerous-tool-surface | Server names/commands suggesting exec, delete, ssh, kubectl, etc. |
| overbroad-filesystem-access | Broad or sensitive paths (/, ~, .ssh, .aws) in server args or cwd |
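To make two of these rules concrete, here is a simplified sketch of the kind of check plaintext-http-mcp and floating-package-launcher describe. The function names, the version-pin regex, and the package names are assumptions for illustration; the detectors in this PR may use different heuristics.

```python
import re
from typing import Optional

# Hypothetical pin pattern: "pkg@1.2.3", "@scope/pkg@1.2.3", or "pkg==1.2.3".
# The leading "." keeps a scope prefix such as "@scope/pkg" from counting as a pin.
VERSION_PIN = re.compile(r".@\d|==\d")

PACKAGE_RUNNERS = {"npx", "uvx", "pipx"}


def is_plaintext_http(url: Optional[str]) -> bool:
    """True when a server endpoint uses http:// rather than https://."""
    return bool(url) and url.lower().startswith("http://")


def is_floating_launcher(command: str, args: list) -> bool:
    """True when a package runner or docker resolves an unpinned package or image."""
    # Compare on the executable name so "/usr/local/bin/npx" is handled too.
    name = command.replace("\\", "/").rsplit("/", 1)[-1]
    if name in PACKAGE_RUNNERS:
        # Floating when no argument carries an explicit version pin.
        return not any(VERSION_PIN.search(arg) for arg in args)
    if name == "docker":
        # Only the explicit :latest tag is checked in this sketch.
        return any(arg.endswith(":latest") for arg in args)
    return False
```

For example, `is_floating_launcher("npx", ["-y", "example-mcp-server"])` flags the launcher, while the same call with `"example-mcp-server@1.2.3"` does not.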

--verbose flag and finding citations (cross-ecosystem)

  • Detector base class gains optional help_url and verbose_description fields
  • Human-readable reporter now shows a ref: link for each metadata finding
  • --verbose adds a why: explanation with remediation guidance
  • Available for all ecosystems, not just MCP — existing detectors can adopt help_url/verbose_description incrementally
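The reporter behavior described above might look roughly like this. The `Detector` class below is a simplified stand-in, not guarddog's actual base class, and `format_finding` is a hypothetical helper. Only the `help_url` and `verbose_description` field names and the `ref:` / `why:` prefixes come from this PR.

```python
from typing import Optional


class Detector:
    """Simplified stand-in for a metadata detector with the two optional fields."""

    def __init__(
        self,
        name: str,
        description: str,
        help_url: Optional[str] = None,
        verbose_description: Optional[str] = None,
    ) -> None:
        self.name = name
        self.description = description
        # Both fields default to None so existing detectors keep working unchanged.
        self.help_url = help_url
        self.verbose_description = verbose_description


def format_finding(detector: Detector, message: str, verbose: bool = False) -> str:
    """Render one finding, adding ref:/why: lines only when data is available."""
    lines = [f"{detector.name}: {message}"]
    if detector.help_url:
        lines.append(f"ref: {detector.help_url}")
    if verbose and detector.verbose_description:
        lines.append(f"why: {detector.verbose_description}")
    return "\n".join(lines)
```

Because both fields are optional, a detector that sets neither renders exactly as before, which is what makes incremental adoption possible.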

Lazy rule loading (cross-ecosystem perf fix)

  • Metadata detector imports are now lazy (deferred to first use inside each match case)
  • _LazyRulesChoice in the CLI defers rule set computation until Click actually needs tab-completion or validation
  • Fixes an issue where every CLI invocation eagerly instantiated all ecosystem detectors, triggering network requests (e.g. rubygems top-packages cache refresh)
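The deferral idea can be shown without Click. The sketch below is not the PR's `_LazyRulesChoice` (which integrates with Click's choice handling); `LazyChoices` and `load_rule_names` are hypothetical names, and the loader stands in for the expensive detector instantiation.

```python
from collections.abc import Sequence
from typing import Callable, Optional


class LazyChoices(Sequence):
    """Sequence of rule names that is computed on first access, then cached."""

    def __init__(self, loader: Callable[[], list]) -> None:
        self._loader = loader
        self._values: Optional[list] = None

    def _resolve(self) -> list:
        # The loader runs at most once, and only if something asks for the values.
        if self._values is None:
            self._values = list(self._loader())
        return self._values

    def __getitem__(self, index):
        return self._resolve()[index]

    def __len__(self) -> int:
        return len(self._resolve())


# Counts loader invocations so the deferral is observable.
load_calls = []


def load_rule_names() -> list:
    """Hypothetical expensive loader; in guarddog this is where detectors would be built."""
    load_calls.append(1)
    return ["inline-secret-in-mcp-config", "plaintext-http-mcp"]
```

Constructing `LazyChoices(load_rule_names)` does no work. The loader runs the first time the choices are iterated, indexed, or tested for membership, which corresponds to validation or tab-completion in the CLI.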

Tests

  • 48 unit tests for all 7 MCP detectors (positive detection, negative/safe cases, edge cases for placeholders, substrings, scoped paths)
  • 5 integration tests for MCPConfigScanner.scan_local() with temporary .mcp.json configs
  • All pre-existing tests remain unaffected

Example output

$ guarddog mcp scan /path/to/project
Found 3 potentially malicious indicators in /path/to/project

inline-secret-in-mcp-config: MCP server 'my-server' contains an inline secret in 'env.API_KEY'
ref: https://github.com/DataDog/guarddog/wiki/MCP-Rules#inline-secret-in-mcp-config

arbitrary-shell-launcher: MCP server 'my-server' is launched via shell command 'bash'
ref: https://github.com/DataDog/guarddog/wiki/MCP-Rules#arbitrary-shell-launcher

shared-project-mcp-config: MCP config '.cursor/mcp.json' is project-scoped and may be shared with collaborators or CI
ref: https://github.com/DataDog/guarddog/wiki/MCP-Rules#shared-project-mcp-config

With --verbose, each finding also shows a why: line with a detailed explanation and remediation guidance.
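To try the scan locally, a fixture along these lines could be generated first. This is a guess at a config that matches the example output (server name, `env.API_KEY`, `bash` launcher, `.cursor/mcp.json` path). The helper name `write_example_config` and the dummy key are made up, and whether the dummy value trips the secret rule depends on the detector's placeholder heuristics, which have not been verified here.

```python
import json
from pathlib import Path


def write_example_config(project_dir: Path) -> Path:
    """Write a deliberately risky .cursor/mcp.json fixture and return its path."""
    config = {
        "mcpServers": {
            "my-server": {
                # Shell wrapper: the pattern arbitrary-shell-launcher describes.
                "command": "bash",
                "args": ["-c", "node server.js"],
                # Dummy value only; never commit a real credential.
                "env": {"API_KEY": "sk-EXAMPLE-ONLY-0000000000000000"},
            }
        }
    }
    path = project_dir / ".cursor" / "mcp.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return path
```

After calling `write_example_config(Path("/path/to/project"))`, running `guarddog mcp scan /path/to/project` should exercise the project-scoped discovery path.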

Commits

  1. Add MCP ecosystem enum and scanner infrastructure — enum, discovery, 11 parsers, data models, both scanner classes
  2. Add MCP metadata detectors — 7 security rules + registry wiring
  3. Add --verbose flag, finding citations, and lazy rule loading — cross-ecosystem CLI/reporter enhancements + startup perf fix
  4. Add tests for MCP detectors and config scanner — 53 tests total

Test plan

  •  python -m pytest tests/analyzer/metadata/test_mcp_detectors.py -v — 48 detector unit tests
  •  python -m pytest tests/core/test_mcp_config_scanner.py -v — 5 integration tests
  •  python -m guarddog mcp list-rules — lists all 7 metadata rules + 1 source code rule
  •  python -m guarddog mcp scan <dir> — scans a directory and reports findings with ref: links
  •  python -m guarddog mcp scan <dir> --verbose — includes why: explanations
  •  python -m guarddog pypi scan requests — verify existing ecosystems unaffected
  •  make test — full suite (no new failures introduced)

frankgraziano and others added 4 commits March 23, 2026 13:38
Introduce MCP (Model Context Protocol) as a first-class ecosystem in
guarddog. This adds the ECOSYSTEM.MCP enum value, config file discovery
for 11 MCP clients (Claude Desktop, Claude Code, Cursor, VS Code,
Windsurf, Cline, Roo Code, Continue, Codex, Gemini CLI, Copilot CLI),
a normalized data model (MCPServerConfig / MCPConfigFile / MCPInventory),
and two scanner classes: MCPConfigScanner (package scanner) and
MCPDiscoveryScanner (project scanner).

Discovery scopes recursive globs to known VS Code extension and config
directories to avoid expensive walks over the entire home directory.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add 7 security rules for MCP configuration analysis:
- inline-secret-in-mcp-config: hard-coded credentials in env/headers
- plaintext-http-mcp: insecure HTTP endpoints
- arbitrary-shell-launcher: shell wrapper execution (bash, sh, cmd, etc.)
- shared-project-mcp-config: project-scoped configs shared via VCS
- floating-package-launcher: unpinned npx/uvx/pipx/docker packages
- dangerous-tool-surface: server names suggesting risky capabilities
- overbroad-filesystem-access: broad paths (/, ~, .ssh, .aws) in args

Register the rules in the metadata detector registry so they are
available via 'guarddog mcp list-rules' and 'guarddog mcp scan'.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Extend the Detector base class with optional help_url and
verbose_description fields. The human-readable reporter now shows a
'ref:' link for each metadata finding, and a 'why:' explanation when
--verbose is passed. The --verbose flag is threaded through all CLI
scan/verify paths and all reporter interfaces.

Also introduce _LazyRulesChoice in the CLI to defer metadata detector
instantiation until rule names are actually needed. This prevents
eager imports that triggered network requests (e.g. rubygems cache
refresh) on every CLI invocation, even for unrelated ecosystems.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add 48 unit tests covering all 7 MCP metadata detectors, including
positive detection, negative/safe cases, edge cases (placeholders,
substrings, scoped paths), and the rule registry.

Add 5 integration tests for MCPConfigScanner.scan_local() using
temporary .mcp.json configs to verify end-to-end detection of
inline secrets, plaintext HTTP, shell launchers, and benign configs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@frankgraziano requested a review from a team as a code owner March 23, 2026 17:53