fix(ai_guard): scan Anthropic document content blocks by avara1986 · Pull Request #18576 · DataDog/dd-trace-py

avara1986 · 2026-06-11T08:04:47Z

Description

The Anthropic AI Guard converter (ddtrace/appsec/_ai_guard/_anthropic.py) listed document in _DROPPED_BLOCK_TYPES, so document content blocks were dropped before evaluation. Anthropic document blocks carry model-visible content:

source.type == "text" → plain text in source.data
source.type == "content" → nested text/image blocks
title / context → model-visible strings

Combined with the before-hook's skip-when-empty path (if not ai_guard_messages: return None), a document-only prompt produced no convertible messages and evaluation was skipped entirely; a benign-text + malicious-document prompt was scanned only on the surrounding text. Either way an attacker who can place document content into a traced Anthropic call bypasses the AI Guard prompt-injection / security check. Streaming and non-streaming hooks share the behavior via the same converter.

This change removes document from _DROPPED_BLOCK_TYPES and adds _format_document_block():

text / content sources (and title/context) are extracted and scanned.
Binary (base64) / remote (url) sources — which AI Guard cannot read as text — emit a [non-text document] placeholder so a document-only message still yields an evaluable payload (no silent skip), without pretending to OCR binary PDFs.

Resolves APMSP-3286. This is the Anthropic counterpart of the Strands fix in #18574 (APMSP-3089).

Testing

In tests/appsec/ai_guard/anthropic/test_anthropic.py:

Updated two tests that asserted the old drop-document behavior (one now uses redacted_thinking to keep empty-wrapper-suppression coverage; the other asserts the binary-document marker).
Added converter tests: text source, content source, title/context, document-only evaluability, and binary-source → marker.
Added a before-hook regression proving a document-only prompt now reaches client.evaluate instead of being skipped.

test_anthropic.py (82 passed / 5 version-skipped) and test_streaming.py (7 passed / 2 skipped) pass on Python 3.11. lint fmt, typing, and spelling pass.

Risks

Low. Behavior is unchanged for text-only conversations and for genuinely non-scannable blocks (redacted_thinking, etc.). Document blocks now contribute text (or a short placeholder) to the AI Guard payload, which can cause evaluation to run where it previously did not — the intended fix.

Additional Notes

The placeholder keeps binary/remote document sources from silently bypassing evaluation; forwarding richer representations (e.g. document images as image_url parts) could be a follow-up.

🤖 Generated with Claude Code

The Anthropic AI Guard converter treated `document` blocks as non-scannable and dropped them before evaluation. Anthropic document blocks carry model-visible content: `source.type == "text"` holds plain text, `source.type == "content"` nests text/image blocks, and `title`/`context` are model-visible strings. A document-only prompt therefore produced no convertible messages and the before-hook skipped evaluation entirely, while a benign-text + malicious-document prompt was scanned only on the surrounding text — an AI Guard bypass (APMSP-3286). Streaming and non-streaming hooks shared the behavior via the same converter. Document blocks are now converted: readable text sources are scanned, and binary (`base64`) / remote (`url`) sources emit a `[non-text document]` placeholder so a document-only message still yields an evaluable payload. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

cit-pr-commenter-54b7da · 2026-06-11T08:05:52Z

Codeowners resolved as

ddtrace/appsec/_ai_guard/_anthropic.py                                  @DataDog/asm-python
releasenotes/notes/ai-guard-anthropic-document-content-9d4b0c945a0db0f7.yaml  @DataDog/apm-python
tests/appsec/ai_guard/anthropic/test_anthropic.py                       @DataDog/asm-python

datadog-datadog-prod-us1-2 · 2026-06-11T08:09:30Z

Tests

✨ Fix all issues with BitsAI

⚠️ Warnings

🚦 8 Pipeline jobs failed

DataDog/apm-reliability/dd-trace-py | build linux serverless: [amd64, cp315-cp315, v113741238-d2b8243-manylinux2014_x86_64, 1]

DataDog/apm-reliability/dd-trace-py | build linux serverless: [amd64, cp315-cp315, v113741491-d2b8243-musllinux_1_2_x86_64, 1]

DataDog/apm-reliability/dd-trace-py | build linux serverless: [arm64, cp315-cp315, v113741357-d2b8243-manylinux2014_aarch64, 1]

View all 8 failed jobs.

ℹ️ Info

No other issues found (see more)

🧪 All tests passed
❄️ No new flaky tests detected

Useful? React with 👍 / 👎

_{This comment will be updated automatically if new data arrives.

🔗 Commit SHA: 93f77e2 | Docs | Datadog PR Page | Give us feedback!}

pr-commenter · 2026-06-11T08:19:19Z

Benchmarks

Benchmark execution time: 2026-06-11 14:26:16

Comparing candidate commit 93f77e2 in PR branch fix/ai-guard-anthropic-document-content with baseline commit 42c7b35 in branch main.

Found 0 performance improvements and 1 performance regressions! Performance is the same for 83 metrics, 0 unstable metrics.

scenario:iastaspectsospath-ospathbasename_aspect

🟥 execution_time [+99.411µs; +109.074µs] or [+23.255%; +25.516%]

avara1986 · 2026-06-11T13:49:07Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c7a22bb57c

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

avara1986 · 2026-06-12T07:26:13Z

/merge

gh-worker-devflow-routing-ef8351 · 2026-06-12T07:26:19Z

View all feedbacks in Devflow UI.

2026-06-12 07:26:19 UTC ℹ️ Start processing command /merge

2026-06-12 07:26:24 UTC ℹ️ MergeQueue: pull request added to the queue

The expected merge time in main is approximately 55m (p90).

2026-06-12 08:09:34 UTC ℹ️ MergeQueue: This merge request was merged

avara1986 added 2 commits June 11, 2026 11:10

Merge branch 'main' into fix/ai-guard-anthropic-document-content

8f9ad2c

refactor function

c7a22bb

avara1986 marked this pull request as ready for review June 11, 2026 13:48

avara1986 requested review from a team as code owners June 11, 2026 13:48

avara1986 requested review from brettlangdon and rachelyangdog June 11, 2026 13:48

avara1986 requested a review from christophe-papazian June 11, 2026 13:49

chatgpt-codex-connector Bot reviewed Jun 11, 2026

View reviewed changes

Comment thread ddtrace/appsec/_ai_guard/_anthropic.py

avara1986 added 2 commits June 11, 2026 16:06

fix comment

6069487

Merge branch 'main' into fix/ai-guard-anthropic-document-content

93f77e2

christophe-papazian approved these changes Jun 11, 2026

View reviewed changes

gh-worker-dd-mergequeue-cf854d Bot merged commit 28814a8 into main Jun 12, 2026
666 checks passed

gh-worker-dd-mergequeue-cf854d Bot deleted the fix/ai-guard-anthropic-document-content branch June 12, 2026 08:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(ai_guard): scan Anthropic document content blocks#18576

fix(ai_guard): scan Anthropic document content blocks#18576
gh-worker-dd-mergequeue-cf854d[bot] merged 5 commits into
mainfrom
fix/ai-guard-anthropic-document-content

avara1986 commented Jun 11, 2026 •

edited by atlassian Bot

Loading

Uh oh!

cit-pr-commenter-54b7da Bot commented Jun 11, 2026 •

edited

Loading

Uh oh!

datadog-datadog-prod-us1-2 Bot commented Jun 11, 2026 •

edited by datadog-prod-us1-4 Bot

Loading

Uh oh!

pr-commenter Bot commented Jun 11, 2026 •

edited

Loading

Uh oh!

avara1986 commented Jun 11, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

avara1986 commented Jun 12, 2026

Uh oh!

gh-worker-devflow-routing-ef8351 Bot commented Jun 12, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

avara1986 commented Jun 11, 2026 • edited by atlassian Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Testing

Risks

Additional Notes

Uh oh!

cit-pr-commenter-54b7da Bot commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codeowners resolved as

Uh oh!

datadog-datadog-prod-us1-2 Bot commented Jun 11, 2026 • edited by datadog-prod-us1-4 Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ Warnings

ℹ️ Info

Uh oh!

pr-commenter Bot commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Benchmarks

scenario:iastaspectsospath-ospathbasename_aspect

Uh oh!

avara1986 commented Jun 11, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

avara1986 commented Jun 12, 2026

Uh oh!

gh-worker-devflow-routing-ef8351 Bot commented Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

avara1986 commented Jun 11, 2026 •

edited by atlassian Bot

Loading

cit-pr-commenter-54b7da Bot commented Jun 11, 2026 •

edited

Loading

datadog-datadog-prod-us1-2 Bot commented Jun 11, 2026 •

edited by datadog-prod-us1-4 Bot

Loading

pr-commenter Bot commented Jun 11, 2026 •

edited

Loading

gh-worker-devflow-routing-ef8351 Bot commented Jun 12, 2026 •

edited

Loading