# fix(osint): refine verifier claim filtering #19509
# OSINT Hallucination Mitigation Runbook

## Objective

Execute OSINT runs with provenance-required facts, deterministic evidence IDs, and verifier gates that flag unsupported claims.

## How to Run

```bash
python scripts/osint_run.py --case <fixture> --out artifacts/osint/<run_id>
```

## Interpreting Verification Output

- `needs_human_review: true` means unsupported claims or missing provenance were detected.
- `unsupported_claims[]` lists claims without Evidence ID support.
- `missing_provenance_facts[]` lists fact IDs missing required fields.
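For orientation, a hypothetical `verification.json` might look like this; the claim text and fact ID are invented for illustration, while the field names follow the verifier output described in this PR:

```json
{
  "needs_human_review": true,
  "unsupported_claims": [
    {"claim": "The actor also controls a second domain.", "reason": "missing_evidence_id"}
  ],
  "missing_provenance_facts": ["fact-007"],
  "summary": {
    "facts_total": 12,
    "facts_missing_provenance": 1,
    "unsupported_claims_total": 1
  }
}
```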
## Triage Checklist

1. Confirm provenance fields are complete for every fact.
2. Validate two-source promotion for any `confirmed` fact.
3. Remove or downgrade unsupported narrative claims.
4. Re-run verification and ensure `needs_human_review` is cleared.

## Operational SLO (Initial)

- 95% of runs complete under the agreed CI runner time budget.

---

# OSINT Hallucination Mitigation Data Handling

## Scope

Applies to OSINT collection, retrieval, summarization, and verification artifacts.

## Never-Log List

Never log or persist the following without explicit approval and redaction:

- Auth tokens, session cookies, API keys
- Private keys or signing materials
- Emails or phone numbers unless explicitly required and redacted

## Retention

- Raw pages and blobs: short-lived retention; delete after extraction and verification windows close.
- Extracted facts: longer-lived retention to preserve auditability.

## Determinism Rules

- Deterministic artifacts must not embed wall-clock timestamps.
- `collected_at` is permitted in provenance fields but stored outside deterministic bundles when possible.
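One way to honor these rules is sketched below; the helper name and field layout are assumptions, not part of this policy. Keeping `collected_at` in a sidecar makes the deterministic bundle hash identically across collection times:

```python
# Illustrative sketch: split provenance into a deterministic part and a
# timestamp sidecar, so the deterministic bundle never embeds wall-clock time.
import hashlib
import json

def split_provenance(prov: dict) -> tuple[dict, dict]:
    """Return (stable_part, sidecar); only the stable part is hashed."""
    sidecar = {"collected_at": prov.get("collected_at")}
    stable = {k: v for k, v in prov.items() if k != "collected_at"}
    return stable, sidecar

p1 = {"source_url": "https://example.com/a", "collected_at": "2024-01-01T00:00:00Z"}
p2 = {"source_url": "https://example.com/a", "collected_at": "2024-06-01T12:34:56Z"}
h1 = hashlib.sha256(json.dumps(split_provenance(p1)[0], sort_keys=True).encode()).hexdigest()
h2 = hashlib.sha256(json.dumps(split_provenance(p2)[0], sort_keys=True).encode()).hexdigest()
print(h1 == h2)  # True: same stable content, different collection times
```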
## Verification Requirements

- Missing provenance must downgrade facts to `unknown`.
- Unsupported narrative claims trigger `needs_human_review`.

---

# OSINT Hallucination Mitigation Standard

## Purpose

Make hallucination resistance a first-class OSINT design goal by enforcing traceable, checkable, degradable-to-unknown facts with deterministic evidence artifacts.

## Non-Negotiables

1. **Provenance mandatory**: every assertion carries explicit source metadata.
2. **Degradable-to-unknown**: missing provenance downgrades facts to `unknown`.
3. **Retrieval-first**: collect raw → retrieval selects → summarizer references retrieved evidence IDs only.
4. **Extractive-first**: key fields (names, dates, IPs, IOCs) must prefer extractive resolution prior to LLM paraphrase.
5. **Two-source promotion**: `confirmed` requires ≥2 independent sources.
6. **Verifier required**: the final report is audited for unsupported claims.
7. **Human sign-off**: the final assessment requires human approval.

## Required Fact & Evidence Fields

Each fact MUST include provenance fields:

- `source_url`
- `source_type`
- `collected_at`
- `collector_tool`
- `verdict_confidence`

Evidence IDs are deterministic:

```
EVID:<source_type>:<sha256(normalized_source_url)>:<sha256(snippet_canonical)>
```

## Deterministic Artifacts

Artifacts must be produced per run with no unstable timestamps inside the deterministic files:

- `artifacts/osint/<run_id>/raw/…`
- `artifacts/osint/<run_id>/retrieved.json`
- `artifacts/osint/<run_id>/facts.jsonl`
- `artifacts/osint/<run_id>/report.md`
- `artifacts/osint/<run_id>/verification.json`
- `artifacts/osint/<run_id>/metrics.json`

## Import / Export Matrix

**Imports**

- Collector raw blobs (JSON/HTML/text)
- External tool ID + version in `collector_tool`

**Exports**

- `retrieved.json`: evidence selection list
- `facts.jsonl`: fact records with provenance
- `verification.json`: verifier outputs
- `report.md`: narrative with inline Evidence IDs

**Non-goals**

- No automatic truth adjudication without provenance
- No single-source confirmation
- No silent backfilling of missing fields

## MAESTRO Security Alignment

**MAESTRO Layers:** Data, Agents, Tools, Observability, Security.
**Threats Considered:** prompt injection, unsupported claims, single-source misinformation, evidence tampering.
**Mitigations:** provenance-required facts, deterministic Evidence IDs, two-source promotion gate, verifier audit, human approval.

## References

- Summit Readiness Assertion
- MAESTRO Threat Modeling Framework

---

```python
"""OSINT hallucination mitigation primitives."""
```

---

```python
"""Deterministic evidence ID generation utilities."""

from __future__ import annotations

import hashlib
import re
import unicodedata
from typing import Iterable
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

TRACKING_QUERY_KEYS = {
    "fbclid",
    "gclid",
    "igshid",
    "mc_cid",
    "mc_eid",
}


def _drop_tracking_params(params: Iterable[tuple[str, str]]) -> list[tuple[str, str]]:
    cleaned: list[tuple[str, str]] = []
    for key, value in params:
        if key.startswith("utm_"):
            continue
        if key in TRACKING_QUERY_KEYS:
            continue
        cleaned.append((key, value))
    return cleaned


def normalize_source_url(source_url: str) -> str:
    """Normalize URLs for deterministic evidence IDs."""
    parsed = urlparse(source_url.strip())
    query_params = _drop_tracking_params(parse_qsl(parsed.query, keep_blank_values=True))
    query_params.sort()
    normalized = parsed._replace(
        scheme=parsed.scheme.lower(),
        netloc=parsed.netloc.lower(),
        query=urlencode(query_params, doseq=True),
        fragment="",
    )
    return urlunparse(normalized)


def canonicalize_snippet(snippet: str) -> str:
    """Canonicalize snippets to reduce noise before hashing."""
    normalized = unicodedata.normalize("NFKC", snippet)
    normalized = re.sub(r"\s+", " ", normalized).strip()
    return normalized


def compute_evidence_id(source_type: str, source_url: str, snippet: str) -> str:
    """Compute deterministic evidence IDs."""
    normalized_url = normalize_source_url(source_url)
    normalized_snippet = canonicalize_snippet(snippet)
    url_hash = hashlib.sha256(normalized_url.encode("utf-8")).hexdigest()
    snippet_hash = hashlib.sha256(normalized_snippet.encode("utf-8")).hexdigest()
    return f"EVID:{source_type}:{url_hash}:{snippet_hash}"
```
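A standalone sketch of how these utilities behave — the logic is mirrored inline rather than imported, since the diff does not show the package path. Two URLs differing only in tracking parameters and host case, with snippets differing only in whitespace, yield the same evidence ID:

```python
# Mirrors normalize_source_url / compute_evidence_id from the module above.
import hashlib
import re
import unicodedata
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

TRACKING = {"fbclid", "gclid", "igshid", "mc_cid", "mc_eid"}

def normalize_source_url(url: str) -> str:
    # Lowercase scheme/host, drop fragment and tracking params, sort the rest.
    p = urlparse(url.strip())
    q = sorted(
        (k, v)
        for k, v in parse_qsl(p.query, keep_blank_values=True)
        if not k.startswith("utm_") and k not in TRACKING
    )
    return urlunparse(
        p._replace(
            scheme=p.scheme.lower(),
            netloc=p.netloc.lower(),
            query=urlencode(q, doseq=True),
            fragment="",
        )
    )

def compute_evidence_id(source_type: str, url: str, snippet: str) -> str:
    # NFKC-normalize and collapse whitespace before hashing the snippet.
    snip = re.sub(r"\s+", " ", unicodedata.normalize("NFKC", snippet)).strip()
    url_hash = hashlib.sha256(normalize_source_url(url).encode("utf-8")).hexdigest()
    snip_hash = hashlib.sha256(snip.encode("utf-8")).hexdigest()
    return f"EVID:{source_type}:{url_hash}:{snip_hash}"

a = compute_evidence_id("web", "https://Example.com/a?utm_source=x&b=2", "Hello  world")
b = compute_evidence_id("web", "https://example.com/a?b=2&gclid=y", "Hello world")
print(a == b)  # True: tracking params, host case, and whitespace differences collapse
```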

---

```python
"""Fact and provenance schema helpers for OSINT hallucination mitigation."""

from __future__ import annotations

from dataclasses import dataclass, replace
from typing import Dict, List, Literal, Optional

Verdict = Literal["confirmed", "unconfirmed", "unknown", "rejected"]

REQUIRED_PROVENANCE_FIELDS = [
    "source_url",
    "source_type",
    "collected_at",
    "collector_tool",
    "verdict_confidence",
]


@dataclass(frozen=True)
class Provenance:
    source_url: str
    source_type: str
    collected_at: str
    collector_tool: str
    verdict_confidence: float
    snippet: Optional[str] = None
    evidence_id: Optional[str] = None


@dataclass(frozen=True)
class Fact:
    fact_id: str
    predicate: str
    value: str
    verdict: Verdict
    confidence: float
    provenance: List[Provenance]
    notes: Optional[str] = None
    labels: Optional[Dict[str, str]] = None


def missing_provenance_fields(fact: Fact) -> List[str]:
    if not fact.provenance:
        return ["missing_provenance"]
    missing: List[str] = []
    for prov in fact.provenance:
        for field in REQUIRED_PROVENANCE_FIELDS:
            value = getattr(prov, field, None)
            if value in (None, "", []):
                missing.append(f"provenance_missing:{field}")
    return missing


def validate_fact(fact: Fact) -> List[str]:
    errors = missing_provenance_fields(fact)
    if fact.verdict == "confirmed":
        sources = {prov.source_url for prov in fact.provenance if prov.source_url}
        if len(sources) < 2:
            errors.append("confirmed_requires_two_sources")
    return errors


def apply_provenance_policy(fact: Fact) -> Fact:
    missing = missing_provenance_fields(fact)
    if not missing:
        return fact
    notes = "; ".join(missing)
    return replace(fact, verdict="unknown", notes=notes)


def apply_two_source_policy(fact: Fact) -> Fact:
    if fact.verdict != "confirmed":
        return fact
    sources = {prov.source_url for prov in fact.provenance if prov.source_url}
    if len(sources) >= 2:
        return fact
    notes = "confirmed_requires_two_sources"
    return replace(fact, verdict="unconfirmed", notes=notes)
```
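A self-contained sketch of the two-source promotion gate; the dataclasses are trimmed to the fields the policy touches, and the logic is mirrored inline rather than imported since the package path is not shown in this diff:

```python
# Mirrors apply_two_source_policy: `confirmed` survives only with >=2 distinct source URLs.
from dataclasses import dataclass, replace
from typing import List, Optional

@dataclass(frozen=True)
class Provenance:
    source_url: str

@dataclass(frozen=True)
class Fact:
    fact_id: str
    verdict: str
    provenance: List[Provenance]
    notes: Optional[str] = None

def apply_two_source_policy(fact: Fact) -> Fact:
    if fact.verdict != "confirmed":
        return fact
    sources = {p.source_url for p in fact.provenance if p.source_url}
    if len(sources) >= 2:
        return fact
    # Single-source "confirmed" is demoted, with the reason recorded in notes.
    return replace(fact, verdict="unconfirmed", notes="confirmed_requires_two_sources")

single = Fact("f1", "confirmed", [Provenance("https://a.example")])
dual = Fact("f2", "confirmed", [Provenance("https://a.example"), Provenance("https://b.example")])
print(apply_two_source_policy(single).verdict)  # unconfirmed
print(apply_two_source_policy(dual).verdict)    # confirmed
```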

---

```python
"""Verifier agent for unsupported claim detection."""

from __future__ import annotations

import re
from typing import Any, Dict, Iterable, List

from .facts import Fact, missing_provenance_fields

CLAIM_PATTERN = re.compile(r"\b(?:claim|fact)\b", re.IGNORECASE)
EVIDENCE_ID_PATTERN = re.compile(r"\bEVID:[A-Za-z0-9_-]+:[a-f0-9]{64}:[a-f0-9]{64}\b")
GAP_PREFIXES = ("unknown", "unanswered", "open question", "gap")


def _is_gap_statement(sentence: str) -> bool:
    normalized = sentence.strip().lower()
    return any(normalized.startswith(prefix) for prefix in GAP_PREFIXES)


def _is_claim_candidate(sentence: str) -> bool:
    if CLAIM_PATTERN.search(sentence):
        return True
    if re.search(r"\b\d{1,4}\b", sentence):
        return True
    if re.search(r"\b[A-Z][a-z]+\b", sentence):
        return True
    return False


def extract_claims(report_text: str) -> List[str]:
    sentences = re.split(r"(?<=[.!?])\s+", report_text.strip())
    claims = [
        sentence
        for sentence in sentences
        if sentence and not _is_gap_statement(sentence) and _is_claim_candidate(sentence)
    ]
    return claims


def _collect_evidence_ids(facts: Iterable[Fact]) -> List[str]:
    evidence_ids: List[str] = []
    for fact in facts:
        for prov in fact.provenance:
            if prov.evidence_id:
                evidence_ids.append(prov.evidence_id)
    return evidence_ids


def verify_report(report_text: str, facts: List[Fact]) -> Dict[str, Any]:
    claims = extract_claims(report_text)
    evidence_ids = _collect_evidence_ids(facts)
    unsupported_claims: List[Dict[str, str]] = []
    for claim in claims:
        has_evidence_id = bool(EVIDENCE_ID_PATTERN.search(claim))
        if not has_evidence_id and not any(evidence_id in claim for evidence_id in evidence_ids):
            unsupported_claims.append(
                {
                    "claim": claim,
                    "reason": "missing_evidence_id",
                }
            )

    missing_provenance = [
        fact.fact_id for fact in facts if missing_provenance_fields(fact)
    ]

    needs_human_review = bool(unsupported_claims or missing_provenance)
    return {
        "needs_human_review": needs_human_review,
        "unsupported_claims": unsupported_claims,
        "missing_provenance_facts": missing_provenance,
        "summary": {
            "facts_total": len(facts),
            "facts_missing_provenance": len(missing_provenance),
            "unsupported_claims_total": len(unsupported_claims),
        },
    }
```

**Reviewer comment (lines +20 to +27):** The heuristic `_is_claim_candidate` treats any sentence containing a capitalized word as a claim, which over-matches ordinary prose; the suggested change drops that branch:

```python
def _is_claim_candidate(sentence: str) -> bool:
    if CLAIM_PATTERN.search(sentence):
        return True
    if re.search(r"\b\d{1,4}\b", sentence):
        return True
    return False
```

**Reviewer comment (lines +40 to +46):** This function can be made more concise and Pythonic by using a nested list comprehension to collect the evidence IDs. This improves readability by expressing the logic in a single statement.

```python
def _collect_evidence_ids(facts: Iterable[Fact]) -> List[str]:
    return [
        prov.evidence_id for fact in facts for prov in fact.provenance if prov.evidence_id
    ]
```

**Reviewer comment (lines +54 to +55):** There is a critical flaw in the claim verification logic. The current implementation only checks whether a claim contains a string that looks like an evidence ID; it does not validate that ID against the list of known evidence IDs from the provided facts. This means a claim with a hallucinated (i.e., fake but well-formed) evidence ID will incorrectly pass verification. The logic must be changed so that a claim is only considered supported if it contains at least one of the known evidence IDs:

```python
if not any(evidence_id in claim for evidence_id in evidence_ids):
```

**Reviewer comment:** For improved readability and conciseness, this function can be refactored to use a list comprehension. This is a more idiomatic Python approach for filtering and transforming lists.
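To see the claim-extraction heuristics in action, here is a standalone sketch (logic mirrored inline rather than imported): a gap statement is skipped, while a sentence containing numerals is kept as a claim candidate.

```python
# Mirrors _is_gap_statement / _is_claim_candidate / extract_claims from the verifier above.
import re

GAP_PREFIXES = ("unknown", "unanswered", "open question", "gap")
CLAIM_PATTERN = re.compile(r"\b(?:claim|fact)\b", re.IGNORECASE)

def is_claim_candidate(sentence: str) -> bool:
    if CLAIM_PATTERN.search(sentence):
        return True
    if re.search(r"\b\d{1,4}\b", sentence):  # numerals suggest a factual assertion
        return True
    if re.search(r"\b[A-Z][a-z]+\b", sentence):  # capitalized words (see review comment)
        return True
    return False

report = "Unknown: actor attribution. The server resolved to 203.0.113.7 in March."
sentences = re.split(r"(?<=[.!?])\s+", report.strip())
claims = [
    s for s in sentences
    if s and not s.strip().lower().startswith(GAP_PREFIXES) and is_claim_candidate(s)
]
print(claims)  # ['The server resolved to 203.0.113.7 in March.']
```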