
fix(osint): refine verifier claim filtering#19509

Closed
BrianCLong wants to merge 1 commit into main from codex/add-osint-hallucination-mitigation-features-25gntw

Conversation

@BrianCLong
Owner

Motivation

  • Reduce false positives from the verifier by treating explicit gap/unknown statements as non-claims and preferring explicit Evidence ID detection.
  • Align verifier heuristics with the OSINT evidence-first standard so narrative claims are only flagged when truly unsupported.

Description

  • Tighten claim extraction by adding GAP_PREFIXES and _is_gap_statement to ignore sentences starting with explicit gap phrases.
  • Add EVIDENCE_ID_PATTERN detection and update extract_claims and verify_report to prefer evidence-bearing statements when deciding unsupported claims.
  • Introduce lightweight claim-candidate heuristics in _is_claim_candidate to balance sensitivity and reduce noise.
  • Update tests/test_verifier_flags_unsupported.py to include an explicit unknown/gap sentence and assert the verifier still flags unsupported claims appropriately.
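The gap-statement filter described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the phrases in `GAP_PREFIXES` and the naive sentence splitter are assumptions, since the actual values and the body of `_is_gap_statement` are not shown in this thread.

```python
import re

# Hypothetical gap phrases; the PR's real GAP_PREFIXES values are not shown.
GAP_PREFIXES = ("unknown:", "gap:", "no evidence", "not confirmed", "unverified")


def is_gap_statement(sentence: str) -> bool:
    """Return True when the sentence opens with an explicit gap phrase."""
    return sentence.strip().lower().startswith(GAP_PREFIXES)


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after ., ! or ? when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_claims(text: str) -> list[str]:
    """Keep every sentence that is not an explicit gap statement."""
    return [s for s in split_sentences(text) if not is_gap_statement(s)]


report = "Unknown: the operator of the server. The domain was registered in 2021."
claims = extract_claims(report)  # only the second sentence remains
```

Matching on a prefix keeps the filter narrow: a sentence that merely contains the word "unknown" somewhere in the middle is still treated as a claim.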

Testing

  • Ran the targeted pytest suite: pytest tests/test_verifier_flags_unsupported.py tests/test_provenance_required.py tests/test_unknown_degradation.py tests/test_two_source_promotion.py tests/test_deterministic_evidence_ids.py, and all tests passed (5 passed).
{
  "agent_id": "codex",
  "task_id": "osint-hallucination-mitigation-mws",
  "prompt_hash": "9d6a71f809270db4fc2c78ef7741f3517a60cac3e92488d811a3e6b0cac9a49b",
  "domains": ["osint", "governance", "documentation", "testing"],
  "verification_tiers": ["C"],
  "debt_delta": 0,
  "declared_scope": {
    "paths": [
      "packages/osint/src/hallucination/",
      "tests/test_provenance_required.py",
      "tests/test_unknown_degradation.py",
      "tests/test_verifier_flags_unsupported.py",
      "tests/test_two_source_promotion.py",
      "tests/test_deterministic_evidence_ids.py",
      "docs/standards/osint-hallucination-mitigation.md",
      "docs/security/data-handling/osint-hallucination-mitigation.md",
      "docs/ops/runbooks/osint-hallucination-mitigation.md",
      "docs/roadmap/STATUS.json",
      "prompts/osint/osint-hallucination-mitigation@v1.md",
      "prompts/registry.yaml"
    ]
  },
  "allowed_operations": ["create", "edit"]
}

Codex Task

@BrianCLong BrianCLong added the codex Codex-owned implementation work label Mar 7, 2026 — with ChatGPT Codex Connector
@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the robustness of OSINT (Open Source Intelligence) processes by refining the claim verification mechanism. The changes aim to reduce false positives in unsupported claim detection and align verification heuristics with an evidence-first standard. This is achieved through new code modules that introduce deterministic evidence IDs, enforce provenance requirements, and provide a more intelligent verifier that can distinguish between actual unsupported claims and explicit 'gap' statements. The update also includes comprehensive documentation to guide the implementation and operation of these new hallucination mitigation standards.

Highlights

  • Refined Verifier Logic: The verifier's claim extraction process has been enhanced to reduce false positives by explicitly ignoring sentences starting with 'gap' phrases and prioritizing the detection of explicit Evidence IDs within claims.
  • New OSINT Hallucination Mitigation Modules: Introduced new Python modules (evidence_id.py, facts.py, verifier.py) to handle deterministic evidence ID generation, define fact and provenance schemas, and implement the core verification logic for unsupported claims.
  • Comprehensive Documentation Added: New documentation has been added, including a runbook for OSINT hallucination mitigation, data handling guidelines, and a detailed standard document outlining non-negotiables and required fields for hallucination resistance.
  • Expanded Test Coverage: New tests were added to validate deterministic evidence ID generation, ensure provenance requirements are met, confirm the two-source promotion policy, verify unknown degradation behavior, and test the verifier's ability to flag unsupported claims.
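As an illustration of what deterministic evidence ID generation might look like, here is a minimal sketch. The name `compute_evidence_id` appears in this PR's test description, but its signature, the `EVID-` prefix, the 12-character truncation, and the canonicalization rules below are all assumptions made for the example.

```python
import hashlib
import re


def canonicalize_snippet(snippet: str) -> str:
    """Collapse whitespace runs and lowercase so cosmetic edits do not change the ID."""
    return re.sub(r"\s+", " ", snippet).strip().lower()


def compute_evidence_id(source_url: str, snippet: str) -> str:
    """Hash the (url, canonical snippet) pair; the same inputs always give the same ID."""
    payload = f"{source_url.strip()}\n{canonicalize_snippet(snippet)}".encode("utf-8")
    # Hypothetical format: "EVID-" plus the first 12 hex characters of SHA-256.
    return "EVID-" + hashlib.sha256(payload).hexdigest()[:12]


first = compute_evidence_id("https://example.com/report", "Server  located in\nOslo")
second = compute_evidence_id("https://example.com/report", "server located in Oslo")
# first == second, because both snippets canonicalize to the same text
```

Because the ID depends only on its inputs (no timestamps or random values), re-running extraction over the same source yields the same identifiers, which is what makes the artifacts reproducible.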


Changelog
  • docs/ops/runbooks/osint-hallucination-mitigation.md
    • Added a new runbook detailing the objective, execution, interpretation of verification output, and triage checklist for OSINT hallucination mitigation.
  • docs/roadmap/STATUS.json
    • Updated the roadmap to include the 'osint-hallucination-mitigation' initiative with its current status, owner, and notes.
  • docs/security/data-handling/osint-hallucination-mitigation.md
    • Added a new document outlining data handling policies for OSINT hallucination mitigation, covering scope, never-log list, retention, determinism rules, and verification requirements.
  • docs/standards/osint-hallucination-mitigation.md
    • Added a new standard document defining the purpose, non-negotiables, required fact and evidence fields, deterministic artifacts, import/export matrix, non-goals, and security alignment for OSINT hallucination mitigation.
  • packages/osint/src/hallucination/__init__.py
    • Added an initialization file for the new 'hallucination' package.
  • packages/osint/src/hallucination/evidence_id.py
    • Added utilities for deterministic evidence ID generation, including functions to normalize source URLs and canonicalize snippets.
  • packages/osint/src/hallucination/facts.py
    • Added data classes for Provenance and Fact, along with functions to check for missing provenance fields and apply provenance and two-source policies.
  • packages/osint/src/hallucination/verifier.py
    • Added a verifier agent module with functions to identify gap statements, determine claim candidates, extract claims from report text, and verify reports against facts to detect unsupported claims.
  • prompts/osint/osint-hallucination-mitigation@v1.md
    • Added a new prompt definition for the OSINT hallucination mitigation task, detailing its objective, scope, non-goals, and required outputs.
  • prompts/registry.yaml
    • Registered the new 'osint-hallucination-mitigation' prompt, including its version, path, SHA256 hash, description, scope, verification requirements, and allowed operations.
  • tests/test_deterministic_evidence_ids.py
    • Added a test to ensure the compute_evidence_id function generates consistent and deterministic evidence IDs.
  • tests/test_provenance_required.py
    • Added a test to verify that facts with missing provenance fields are correctly identified and degraded to 'unknown' verdict.
  • tests/test_two_source_promotion.py
    • Added a test to confirm that facts marked as 'confirmed' without at least two independent sources are downgraded to 'unconfirmed'.
  • tests/test_unknown_degradation.py
    • Added a test to ensure that facts without any provenance are correctly degraded to an 'unknown' verdict.
  • tests/test_verifier_flags_unsupported.py
    • Added a test to confirm the verifier correctly flags unsupported claims in a report, even when explicit gap statements are present.
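The provenance and two-source policies described in the test entries above could be sketched as follows. The field names, the `apply_policies` helper, and the definition of "independent" as distinct hostnames are assumptions for illustration; the actual schema in facts.py is not shown here.

```python
from dataclasses import dataclass, replace
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Provenance:
    source_url: str
    evidence_id: str


@dataclass(frozen=True)
class Fact:
    statement: str
    verdict: str  # "confirmed", "unconfirmed", or "unknown"
    provenance: tuple[Provenance, ...] = ()


def apply_policies(fact: Fact) -> Fact:
    """Degrade a fact's verdict when its provenance does not justify it."""
    usable = [p for p in fact.provenance if p.source_url and p.evidence_id]
    if not usable:
        # No usable provenance at all: degrade to "unknown".
        return replace(fact, verdict="unknown")
    # Assumption: two sources count as independent when their hostnames differ.
    hosts = {urlsplit(p.source_url).hostname for p in usable}
    if fact.verdict == "confirmed" and len(hosts) < 2:
        return replace(fact, verdict="unconfirmed")
    return fact


single = Fact("Server is in Oslo", "confirmed",
              (Provenance("https://a.example/1", "E1"),))
downgraded = apply_policies(single)  # verdict becomes "unconfirmed"
```

Returning a new `Fact` instead of mutating the input keeps the original record available for auditing.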
Activity
  • The targeted pytest suite (tests/test_verifier_flags_unsupported.py, tests/test_provenance_required.py, tests/test_unknown_degradation.py, tests/test_two_source_promotion.py, tests/test_deterministic_evidence_ids.py) was executed, and all 5 tests passed.

@coderabbitai

coderabbitai bot commented Mar 7, 2026

Warning

Rate limit exceeded

@BrianCLong has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 5 minutes and 27 seconds before requesting another review.


📥 Commits

Reviewing files that changed from the base of the PR and between 68c99c3 and 00d59b0.

📒 Files selected for processing (15)
  • docs/ops/runbooks/osint-hallucination-mitigation.md
  • docs/roadmap/STATUS.json
  • docs/security/data-handling/osint-hallucination-mitigation.md
  • docs/standards/osint-hallucination-mitigation.md
  • packages/osint/src/hallucination/__init__.py
  • packages/osint/src/hallucination/evidence_id.py
  • packages/osint/src/hallucination/facts.py
  • packages/osint/src/hallucination/verifier.py
  • prompts/osint/osint-hallucination-mitigation@v1.md
  • prompts/registry.yaml
  • tests/test_deterministic_evidence_ids.py
  • tests/test_provenance_required.py
  • tests/test_two_source_promotion.py
  • tests/test_unknown_degradation.py
  • tests/test_verifier_flags_unsupported.py

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a comprehensive suite of tools for OSINT hallucination mitigation, including deterministic evidence ID generation, fact/provenance schemas, and a verifier to flag unsupported claims. The changes are well-structured, with new modules, documentation, and tests. My review focuses on refining the implementation of the verifier to improve its accuracy and robustness. I've identified a critical issue in the claim verification logic that could allow hallucinated evidence IDs to pass, a high-severity issue with an overly broad claim detection heuristic, and a few medium-severity suggestions to improve code clarity and follow Python idioms.

Comment on lines +54 to +55
has_evidence_id = bool(EVIDENCE_ID_PATTERN.search(claim))
if not has_evidence_id and not any(evidence_id in claim for evidence_id in evidence_ids):

critical

There is a critical flaw in the claim verification logic. The current implementation only checks if a claim contains a string that looks like an evidence ID, but it does not validate this ID against the list of known evidence IDs from the provided facts. This means a claim with a hallucinated (i.e., fake but well-formed) evidence ID will incorrectly pass verification. The logic must be changed to ensure that a claim is only considered supported if it contains at least one of the known evidence IDs.

        if not any(evidence_id in claim for evidence_id in evidence_ids):
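One way to implement the stricter check is to extract the IDs a claim cites and intersect them with the known set, instead of using substring containment. This is a sketch under assumptions: the `EVID-` ID format and the helper names are hypothetical, since the real `EVIDENCE_ID_PATTERN` is not shown in this thread.

```python
import re

# Hypothetical ID format: "EVID-" followed by 12 lowercase hex characters.
EVIDENCE_ID_PATTERN = re.compile(r"\bEVID-[0-9a-f]{12}\b")


def is_supported(claim: str, known_ids: set[str]) -> bool:
    """A claim is supported only if it cites at least one known evidence ID."""
    cited = set(EVIDENCE_ID_PATTERN.findall(claim))
    return bool(cited & known_ids)


def find_unsupported(claims: list[str], known_ids: set[str]) -> list[str]:
    """Return the claims that cite no evidence ID present in the provided facts."""
    return [c for c in claims if not is_supported(c, known_ids)]


known = {"EVID-0123456789ab"}
claims = [
    "The server is hosted in Oslo [EVID-0123456789ab].",
    "The owner is Acme Corp [EVID-ffffffffffff].",  # well-formed but unknown ID
    "The domain changed hands twice.",              # no ID at all
]
unsupported = find_unsupported(claims, known)  # the last two claims
```

A stricter variant could also reject claims that mix a known ID with an unknown one; whether that is desirable depends on how the report format attaches IDs to sentences.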

Comment on lines +20 to +27
def _is_claim_candidate(sentence: str) -> bool:
    if CLAIM_PATTERN.search(sentence):
        return True
    if re.search(r"\b\d{1,4}\b", sentence):
        return True
    if re.search(r"\b[A-Z][a-z]+\b", sentence):
        return True
    return False

high

The heuristic re.search(r"\b[A-Z][a-z]+\b", sentence) is too broad for identifying claim candidates. It will match the first word of most sentences in English, as well as any proper noun, leading to a high number of false positives. This would cause many ordinary sentences to be flagged as unsupported claims, which runs counter to the goal of reducing noise. The heuristic should be more specific or removed.

def _is_claim_candidate(sentence: str) -> bool:
    if CLAIM_PATTERN.search(sentence):
        return True
    if re.search(r"\b\d{1,4}\b", sentence):
        return True
    return False
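A quick check makes the breadth of the capitalized-word heuristic concrete. The two regular expressions below are copied from the diff; the helper names and sample sentences are made up for illustration, and `CLAIM_PATTERN` is left out because its definition is not shown.

```python
import re

CAPITALIZED_WORD = re.compile(r"\b[A-Z][a-z]+\b")
SHORT_NUMBER = re.compile(r"\b\d{1,4}\b")


def matches_capitalized(sentence: str) -> bool:
    """True when the sentence contains any capitalized word of two or more letters."""
    return bool(CAPITALIZED_WORD.search(sentence))


def matches_number(sentence: str) -> bool:
    """True when the sentence contains a standalone number of one to four digits."""
    return bool(SHORT_NUMBER.search(sentence))


samples = [
    "This section summarizes the approach.",  # matched only via "This"
    "Further work is needed.",                # matched only via "Further"
    "the actor registered 3 domains.",        # matched only via the number
]
results = [(matches_capitalized(s), matches_number(s)) for s in samples]
# results == [(True, False), (True, False), (False, True)]
```

The first two sentences make no factual assertion, yet the capitalized-word rule alone would mark them as claim candidates simply because they start with a capital letter.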

Comment on lines +20 to +28
def _drop_tracking_params(params: Iterable[tuple[str, str]]) -> list[tuple[str, str]]:
    cleaned: list[tuple[str, str]] = []
    for key, value in params:
        if key.startswith("utm_"):
            continue
        if key in TRACKING_QUERY_KEYS:
            continue
        cleaned.append((key, value))
    return cleaned

medium

For improved readability and conciseness, this function can be refactored to use a list comprehension. This is a more idiomatic Python approach for filtering and transforming lists.

def _drop_tracking_params(params: Iterable[tuple[str, str]]) -> list[tuple[str, str]]:
    return [
        (key, value)
        for key, value in params
        if not key.startswith("utm_") and key not in TRACKING_QUERY_KEYS
    ]
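For context, a filter like this would typically sit inside a URL normalizer. The sketch below uses only the standard library; the contents of `TRACKING_QUERY_KEYS`, the `normalize_source_url` name, and the choices to sort parameters, lowercase the host, and drop the fragment are assumptions, not the PR's confirmed behavior.

```python
from typing import Iterable
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical key set; the PR's actual TRACKING_QUERY_KEYS is not shown.
TRACKING_QUERY_KEYS = frozenset({"fbclid", "gclid"})


def drop_tracking_params(params: Iterable[tuple[str, str]]) -> list[tuple[str, str]]:
    """Remove utm_* parameters and known tracking keys."""
    return [
        (key, value)
        for key, value in params
        if not key.startswith("utm_") and key not in TRACKING_QUERY_KEYS
    ]


def normalize_source_url(url: str) -> str:
    """Strip tracking parameters, sort the rest, lowercase the host, drop the fragment."""
    parts = urlsplit(url)
    query = sorted(drop_tracking_params(parse_qsl(parts.query, keep_blank_values=True)))
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path, urlencode(query), "")
    )


normalized = normalize_source_url(
    "https://Example.com/a?utm_source=x&b=2&a=1&gclid=z#frag"
)
# normalized == "https://example.com/a?a=1&b=2"
```

Sorting the remaining parameters means two links that differ only in parameter order or tracking tags normalize to the same string.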

Comment on lines +40 to +46
def _collect_evidence_ids(facts: Iterable[Fact]) -> List[str]:
    evidence_ids: List[str] = []
    for fact in facts:
        for prov in fact.provenance:
            if prov.evidence_id:
                evidence_ids.append(prov.evidence_id)
    return evidence_ids

medium

This function can be made more concise and Pythonic by using a nested list comprehension to collect the evidence IDs. This improves readability by expressing the logic in a single statement.

def _collect_evidence_ids(facts: Iterable[Fact]) -> List[str]:
    return [
        prov.evidence_id for fact in facts for prov in fact.provenance if prov.evidence_id
    ]
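To try the comprehension in isolation, the following self-contained sketch pairs it with stand-in `Fact` and `Provenance` dataclasses. The fields shown are assumptions inferred from the attribute accesses in the diff (`fact.provenance`, `prov.evidence_id`); the real classes in facts.py may carry more.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Provenance:
    source_url: str
    evidence_id: Optional[str] = None  # may be missing on incomplete records


@dataclass(frozen=True)
class Fact:
    statement: str
    provenance: tuple[Provenance, ...] = ()


def collect_evidence_ids(facts: Iterable[Fact]) -> list[str]:
    """Flatten all non-empty evidence IDs, preserving fact and provenance order."""
    return [
        prov.evidence_id for fact in facts for prov in fact.provenance if prov.evidence_id
    ]


facts = [
    Fact("A", (Provenance("https://a.example", "E1"), Provenance("https://b.example"))),
    Fact("B", (Provenance("https://c.example", "E2"),)),
    Fact("C"),
]
ids = collect_evidence_ids(facts)  # ["E1", "E2"]
```

The `for` clauses read left to right in the same order as the nested loops they replace, so the output order matches the loop version.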

Contributor

@github-actions github-actions bot left a comment


🤖 Auto-approved by Mega Merge Orchestrator


@BrianCLong
Owner Author

Temporarily closing to reduce Actions queue saturation and unblock #22241. Reopen after the golden-main convergence PR merges.


@BrianCLong BrianCLong closed this Mar 30, 2026

Labels

codex Codex-owned implementation work queue:blocked
