fix(osint): refine verifier claim filtering by BrianCLong · Pull Request #19509 · BrianCLong/summit

BrianCLong · 2026-03-07T02:14:27Z

Motivation

Reduce false positives from the verifier by treating explicit gap/unknown statements as non-claims and preferring explicit Evidence ID detection.
Align verifier heuristics with the OSINT evidence-first standard so narrative claims are only flagged when truly unsupported.

Description

Tighten claim extraction by adding GAP_PREFIXES and _is_gap_statement to ignore sentences starting with explicit gap phrases.
Add EVIDENCE_ID_PATTERN detection and update extract_claims and verify_report to prefer evidence-bearing statements when deciding unsupported claims.
Introduce lightweight claim-candidate heuristics in _is_claim_candidate to balance sensitivity and reduce noise.
Update tests/test_verifier_flags_unsupported.py to include an explicit unknown/gap sentence and assert the verifier still flags unsupported claims appropriately.
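
To illustrate the gap filtering described above, here is a minimal sketch. The names GAP_PREFIXES and EVIDENCE_ID_PATTERN come from the PR, but their contents are not shown on this page, so the prefix list, the ID shape, and the sentence splitter below are assumptions rather than the PR's actual code.

```python
import re

# Assumed values: the PR names these constants but does not show their contents here.
GAP_PREFIXES = ("unknown:", "gap:", "no evidence", "not confirmed", "unverified:")
EVIDENCE_ID_PATTERN = re.compile(r"\bEVID-[0-9a-f]{8}\b")  # hypothetical ID shape


def is_gap_statement(sentence: str) -> bool:
    """Return True when the sentence opens with an explicit gap phrase."""
    return sentence.strip().lower().startswith(GAP_PREFIXES)


def extract_claims(report_text: str) -> list[str]:
    """Split on sentence-ending punctuation and drop explicit gap statements."""
    sentences = re.split(r"(?<=[.!?])\s+", report_text.strip())
    return [s for s in sentences if s and not is_gap_statement(s)]


report = (
    "Unknown: the operator of the domain. "
    "The domain was registered in 2021 [EVID-1a2b3c4d]."
)
claims = extract_claims(report)
has_id = [bool(EVIDENCE_ID_PATTERN.search(c)) for c in claims]
```

With this input, the gap sentence is dropped before any claim check runs, so only the evidence-bearing sentence remains as a claim.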

Testing

Ran the targeted pytest suite: pytest tests/test_verifier_flags_unsupported.py tests/test_provenance_required.py tests/test_unknown_degradation.py tests/test_two_source_promotion.py tests/test_deterministic_evidence_ids.py, and all tests passed (5 passed).

{
  "agent_id": "codex",
  "task_id": "osint-hallucination-mitigation-mws",
  "prompt_hash": "9d6a71f809270db4fc2c78ef7741f3517a60cac3e92488d811a3e6b0cac9a49b",
  "domains": ["osint", "governance", "documentation", "testing"],
  "verification_tiers": ["C"],
  "debt_delta": 0,
  "declared_scope": {
    "paths": [
      "packages/osint/src/hallucination/",
      "tests/test_provenance_required.py",
      "tests/test_unknown_degradation.py",
      "tests/test_verifier_flags_unsupported.py",
      "tests/test_two_source_promotion.py",
      "tests/test_deterministic_evidence_ids.py",
      "docs/standards/osint-hallucination-mitigation.md",
      "docs/security/data-handling/osint-hallucination-mitigation.md",
      "docs/ops/runbooks/osint-hallucination-mitigation.md",
      "docs/roadmap/STATUS.json",
      "prompts/osint/osint-hallucination-mitigation@v1.md",
      "prompts/registry.yaml"
    ]
  },
  "allowed_operations": ["create", "edit"]
}

Codex Task

chatgpt-codex-connector · 2026-03-07T02:14:32Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

gemini-code-assist · 2026-03-07T02:14:50Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the robustness of OSINT (Open Source Intelligence) processes by refining the claim verification mechanism. The changes aim to reduce false positives in unsupported claim detection and align verification heuristics with an evidence-first standard. This is achieved through new code modules that introduce deterministic evidence IDs, enforce provenance requirements, and provide a more intelligent verifier that can distinguish between actual unsupported claims and explicit 'gap' statements. The update also includes comprehensive documentation to guide the implementation and operation of these new hallucination mitigation standards.

Highlights

Refined Verifier Logic: The verifier's claim extraction process has been enhanced to reduce false positives by explicitly ignoring sentences starting with 'gap' phrases and prioritizing the detection of explicit Evidence IDs within claims.
New OSINT Hallucination Mitigation Modules: Introduced new Python modules (evidence_id.py, facts.py, verifier.py) to handle deterministic evidence ID generation, define fact and provenance schemas, and implement the core verification logic for unsupported claims.
Comprehensive Documentation Added: New documentation has been added, including a runbook for OSINT hallucination mitigation, data handling guidelines, and a detailed standard document outlining non-negotiables and required fields for hallucination resistance.
Expanded Test Coverage: New tests were added to validate deterministic evidence ID generation, ensure provenance requirements are met, confirm the two-source promotion policy, verify unknown degradation behavior, and test the verifier's ability to flag unsupported claims.
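
As a rough illustration of the provenance and two-source policies mentioned in the highlights, the sketch below degrades facts without provenance to 'unknown' and downgrades single-source 'confirmed' facts to 'unconfirmed'. The field names, the apply_policies helper, and the use of distinct URLs as a proxy for independent sources are all assumptions; the real schemas in facts.py are not shown on this page.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Provenance:
    source_url: str   # hypothetical field
    evidence_id: str  # hypothetical field


@dataclass(frozen=True)
class Fact:
    statement: str
    verdict: str  # assumed values: "confirmed", "unconfirmed", "unknown"
    provenance: tuple[Provenance, ...] = ()


def apply_policies(fact: Fact) -> Fact:
    """Apply the provenance rule first, then the two-source rule."""
    if not fact.provenance:
        # No provenance at all: degrade to 'unknown'.
        return replace(fact, verdict="unknown")
    # Assumption: two sources count as independent when their URLs differ.
    independent_sources = {p.source_url for p in fact.provenance}
    if fact.verdict == "confirmed" and len(independent_sources) < 2:
        return replace(fact, verdict="unconfirmed")
    return fact


one = Provenance("https://a.example/report", "EVID-00000001")
two = Provenance("https://b.example/post", "EVID-00000002")
```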

Changelog

docs/ops/runbooks/osint-hallucination-mitigation.md
- Added a new runbook detailing the objective, execution, interpretation of verification output, and triage checklist for OSINT hallucination mitigation.
docs/roadmap/STATUS.json
- Updated the roadmap to include the 'osint-hallucination-mitigation' initiative with its current status, owner, and notes.
docs/security/data-handling/osint-hallucination-mitigation.md
- Added a new document outlining data handling policies for OSINT hallucination mitigation, covering scope, never-log list, retention, determinism rules, and verification requirements.
docs/standards/osint-hallucination-mitigation.md
- Added a new standard document defining the purpose, non-negotiables, required fact and evidence fields, deterministic artifacts, import/export matrix, non-goals, and security alignment for OSINT hallucination mitigation.
packages/osint/src/hallucination/__init__.py
- Added an initialization file for the new 'hallucination' package.
packages/osint/src/hallucination/evidence_id.py
- Added utilities for deterministic evidence ID generation, including functions to normalize source URLs and canonicalize snippets.
packages/osint/src/hallucination/facts.py
- Added data classes for Provenance and Fact, along with functions to check for missing provenance fields and apply provenance and two-source policies.
packages/osint/src/hallucination/verifier.py
- Added a verifier agent module with functions to identify gap statements, determine claim candidates, extract claims from report text, and verify reports against facts to detect unsupported claims.
prompts/osint/osint-hallucination-mitigation@v1.md
- Added a new prompt definition for the OSINT hallucination mitigation task, detailing its objective, scope, non-goals, and required outputs.
prompts/registry.yaml
- Registered the new 'osint-hallucination-mitigation' prompt, including its version, path, SHA256 hash, description, scope, verification requirements, and allowed operations.
tests/test_deterministic_evidence_ids.py
- Added a test to ensure the compute_evidence_id function generates consistent and deterministic evidence IDs.
tests/test_provenance_required.py
- Added a test to verify that facts with missing provenance fields are correctly identified and degraded to an 'unknown' verdict.
tests/test_two_source_promotion.py
- Added a test to confirm that facts marked as 'confirmed' without at least two independent sources are downgraded to 'unconfirmed'.
tests/test_unknown_degradation.py
- Added a test to ensure that facts without any provenance are correctly degraded to an 'unknown' verdict.
tests/test_verifier_flags_unsupported.py
- Added a test to confirm the verifier correctly flags unsupported claims in a report, even when explicit gap statements are present.
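
For readers unfamiliar with deterministic evidence IDs, the following sketch shows one way such an ID could be derived. Only the name compute_evidence_id appears in the changelog; its signature, the canonicalization rules, the SHA-256 choice, and the "EVID-" prefix are assumptions made for illustration.

```python
import hashlib


def canonicalize_snippet(snippet: str) -> str:
    """Collapse runs of whitespace and lowercase the text (assumed rules)."""
    return " ".join(snippet.split()).lower()


def compute_evidence_id(source_url: str, snippet: str) -> str:
    """Hash the URL and canonical snippet so equal inputs give equal IDs."""
    payload = f"{source_url.strip()}\n{canonicalize_snippet(snippet)}"
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"EVID-{digest[:12]}"  # hypothetical prefix and length


id_a = compute_evidence_id("https://example.com/post", "Server  seen in\nLisbon")
id_b = compute_evidence_id("https://example.com/post ", "server seen in lisbon")
id_c = compute_evidence_id("https://example.com/post", "server seen in Porto")
```

Because the ID depends only on the normalized inputs, re-running ingestion on the same source produces the same ID, which is what a test like tests/test_deterministic_evidence_ids.py would presumably check.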

Activity

The targeted pytest suite (tests/test_verifier_flags_unsupported.py, tests/test_provenance_required.py, tests/test_unknown_degradation.py, tests/test_two_source_promotion.py, tests/test_deterministic_evidence_ids.py) was executed, and all 5 tests passed.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

coderabbitai · 2026-03-07T02:16:00Z

Warning

Rate limit exceeded

@BrianCLong has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 5 minutes and 27 seconds before requesting another review.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ffd3e58f-9c92-48f6-9557-f3bbae79b4bc

📥 Commits

Reviewing files that changed from the base of the PR and between 68c99c3 and 00d59b0.

📒 Files selected for processing (15)

docs/ops/runbooks/osint-hallucination-mitigation.md
docs/roadmap/STATUS.json
docs/security/data-handling/osint-hallucination-mitigation.md
docs/standards/osint-hallucination-mitigation.md
packages/osint/src/hallucination/__init__.py
packages/osint/src/hallucination/evidence_id.py
packages/osint/src/hallucination/facts.py
packages/osint/src/hallucination/verifier.py
prompts/osint/osint-hallucination-mitigation@v1.md
prompts/registry.yaml
tests/test_deterministic_evidence_ids.py
tests/test_provenance_required.py
tests/test_two_source_promotion.py
tests/test_unknown_degradation.py
tests/test_verifier_flags_unsupported.py

gemini-code-assist

Code Review

This pull request introduces a comprehensive suite of tools for OSINT hallucination mitigation, including deterministic evidence ID generation, fact/provenance schemas, and a verifier to flag unsupported claims. The changes are well-structured, with new modules, documentation, and tests. My review focuses on refining the implementation of the verifier to improve its accuracy and robustness. I've identified a critical issue in the claim verification logic that could allow hallucinated evidence IDs to pass, a high-severity issue with an overly broad claim detection heuristic, and a few medium-severity suggestions to improve code clarity and follow Python idioms.

gemini-code-assist · 2026-03-07T02:16:28Z

packages/osint/src/hallucination/verifier.py

+        has_evidence_id = bool(EVIDENCE_ID_PATTERN.search(claim))
+        if not has_evidence_id and not any(evidence_id in claim for evidence_id in evidence_ids):


There is a critical flaw in the claim verification logic. The current implementation only checks if a claim contains a string that looks like an evidence ID, but it does not validate this ID against the list of known evidence IDs from the provided facts. This means a claim with a hallucinated (i.e., fake but well-formed) evidence ID will incorrectly pass verification. The logic must be changed to ensure that a claim is only considered supported if it contains at least one of the known evidence IDs.

if not any(evidence_id in claim for evidence_id in evidence_ids):
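
The following sketch shows the behavior this comment asks for: a claim counts as supported only when it cites at least one ID that actually exists in the provided facts. It is a variation on the one-line suggestion above (it extracts IDs with a pattern and uses set membership instead of substring checks). The ID shape and the unsupported_claims helper are assumptions, not code from the PR.

```python
import re

EVIDENCE_ID_PATTERN = re.compile(r"\bEVID-[0-9a-f]{8}\b")  # hypothetical ID shape


def unsupported_claims(claims: list[str], known_ids: list[str]) -> list[str]:
    """Return claims that cite no evidence ID present in known_ids."""
    known = set(known_ids)
    flagged = []
    for claim in claims:
        cited = set(EVIDENCE_ID_PATTERN.findall(claim))
        # A well-formed but unknown ID gives an empty intersection, so it is flagged.
        if not (cited & known):
            flagged.append(claim)
    return flagged


known_ids = ["EVID-1a2b3c4d"]
claims = [
    "The domain was registered in 2021 [EVID-1a2b3c4d].",
    "The operator is based in Lisbon [EVID-deadbeef].",
    "The server hosts three other domains.",
]
flagged = unsupported_claims(claims, known_ids)
```

Here the second claim carries a well-formed but unknown ID and is flagged along with the third, which has no ID at all.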

gemini-code-assist · 2026-03-07T02:16:28Z

packages/osint/src/hallucination/verifier.py

+def _is_claim_candidate(sentence: str) -> bool:
+    if CLAIM_PATTERN.search(sentence):
+        return True
+    if re.search(r"\b\d{1,4}\b", sentence):
+        return True
+    if re.search(r"\b[A-Z][a-z]+\b", sentence):
+        return True
+    return False


The heuristic re.search(r"\b[A-Z][a-z]+\b", sentence) is too broad for identifying claim candidates. It will match the first word of most sentences in English, as well as any proper noun, leading to a high number of false positives. This would cause many ordinary sentences to be flagged as unsupported claims, which runs counter to the goal of reducing noise. The heuristic should be more specific or removed.

def _is_claim_candidate(sentence: str) -> bool:
    if CLAIM_PATTERN.search(sentence):
        return True
    if re.search(r"\b\d{1,4}\b", sentence):
        return True
    return False
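
A quick check makes the breadth of the capitalized-word heuristic concrete. The two regular expressions are taken from the diff above; the sample sentences are invented for illustration.

```python
import re

CAPITALIZED_WORD = re.compile(r"\b[A-Z][a-z]+\b")  # heuristic flagged in this review
NUMBER = re.compile(r"\b\d{1,4}\b")                # heuristic kept by the suggestion

sentences = [
    "This section is intentionally brief.",
    "Further review is pending.",
    "The server was registered in 2021.",
    "no claims here",
]

# Any sentence that starts with a capitalized word matches the first pattern.
capitalized_hits = [s for s in sentences if CAPITALIZED_WORD.search(s)]
number_hits = [s for s in sentences if NUMBER.search(s)]
```

Three of the four sample sentences match the capitalized-word pattern, while only the one containing a year matches the numeric pattern.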

gemini-code-assist · 2026-03-07T02:16:28Z

packages/osint/src/hallucination/evidence_id.py

+def _drop_tracking_params(params: Iterable[tuple[str, str]]) -> list[tuple[str, str]]:
+    cleaned: list[tuple[str, str]] = []
+    for key, value in params:
+        if key.startswith("utm_"):
+            continue
+        if key in TRACKING_QUERY_KEYS:
+            continue
+        cleaned.append((key, value))
+    return cleaned


For improved readability and conciseness, this function can be refactored to use a list comprehension. This is a more idiomatic Python approach for filtering and transforming lists.

def _drop_tracking_params(params: Iterable[tuple[str, str]]) -> list[tuple[str, str]]:
    return [
        (key, value)
        for key, value in params
        if not key.startswith("utm_") and key not in TRACKING_QUERY_KEYS
    ]
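
For context, here is a sketch of how a filter like this might sit inside URL normalization. The contents of TRACKING_QUERY_KEYS, the normalize_source_url signature, and the choices to sort parameters and drop the fragment are assumptions; evidence_id.py itself is not shown on this page.

```python
from collections.abc import Iterable
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

TRACKING_QUERY_KEYS = frozenset({"fbclid", "gclid"})  # assumed contents


def drop_tracking_params(params: Iterable[tuple[str, str]]) -> list[tuple[str, str]]:
    """Keep only query parameters that are not tracking parameters."""
    return [
        (key, value)
        for key, value in params
        if not key.startswith("utm_") and key not in TRACKING_QUERY_KEYS
    ]


def normalize_source_url(url: str) -> str:
    """Lowercase scheme and host, drop tracking params and fragment, sort the rest."""
    parts = urlsplit(url)
    kept = drop_tracking_params(parse_qsl(parts.query, keep_blank_values=True))
    query = urlencode(sorted(kept))
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), parts.path, query, ""))


normalized = normalize_source_url(
    "https://Example.com/a?utm_source=x&b=2&a=1&fbclid=zz#frag"
)
```

Sorting the remaining parameters is one way to make two links to the same page normalize to the same string, which matters if the normalized URL feeds a deterministic ID.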

gemini-code-assist · 2026-03-07T02:16:28Z

packages/osint/src/hallucination/verifier.py

+def _collect_evidence_ids(facts: Iterable[Fact]) -> List[str]:
+    evidence_ids: List[str] = []
+    for fact in facts:
+        for prov in fact.provenance:
+            if prov.evidence_id:
+                evidence_ids.append(prov.evidence_id)
+    return evidence_ids


This function can be made more concise and Pythonic by using a nested list comprehension to collect the evidence IDs. This improves readability by expressing the logic in a single statement.

def _collect_evidence_ids(facts: Iterable[Fact]) -> List[str]:
    return [
        prov.evidence_id
        for fact in facts
        for prov in fact.provenance
        if prov.evidence_id
    ]

github-actions

🤖 Auto-approved by Mega Merge Orchestrator

github-actions

🤖 Auto-approved by Mega Merge Orchestrator

BrianCLong · 2026-03-30T14:01:46Z

Temporarily closing to reduce Actions queue saturation and unblock #22241. Reopen after the golden-main convergence PR merges.

BrianCLong · 2026-03-30T14:01:47Z

Temporarily closing to reduce Actions queue saturation and unblock #22241. Reopen after the golden-main convergence PR merges.

fix(osint): refine verifier claim filtering

00d59b0

BrianCLong added the codex Codex-owned implementation work label Mar 7, 2026 — with ChatGPT Codex Connector

gemini-code-assist bot reviewed Mar 7, 2026

BrianCLong added the queue:needs-rebase label Mar 7, 2026

github-actions bot approved these changes Mar 8, 2026

BrianCLong added queue:blocked and removed queue:needs-rebase labels Mar 23, 2026

BrianCLong closed this Mar 30, 2026

BrianCLong wants to merge 1 commit into main from codex/add-osint-hallucination-mitigation-features-25gntw
