
fix: align agent data access policy checks #19507

Closed
BrianCLong wants to merge 1 commit into main from codex/implement-agent-data-access-policies-jxdqh9

@BrianCLong
Owner

Motivation

  • Harden the policy foundation by validating banned-operation patterns and ensuring registries are schema-checked to reduce policy drift and bypass risk.
  • Replace the Jest-based evaluator test (which failed due to repo-wide haste/ESM transform collisions) with an ESM-compatible Node runner to provide reliable, local verification.
  • Keep changes scoped to policy foundations and evidence/metadata so runtime behavior remains unchanged.

Description

  • Add banned-patterns artifacts and JSON Schema under .github/policies/agent-data-access/ and enforce validation in the loader (.github/policies/agent-data-access/banned_patterns.json, .github/policies/agent-data-access/banned_patterns.schema.json).
  • Extend the policy loader (src/agents/policy/policyLoad.ts) to validate the banned-patterns schema and perform registry id/pattern checks.
  • Introduce typed policy contracts and evaluator: src/agents/policy/policyTypes.ts and src/agents/policy/policyEval.ts implementing deny-by-default logic and banned-pattern checks.
  • Replace the Jest test with an ESM-compatible check runner tests/agents/policy/policyEval.check.mjs and update evidence, prompt metadata, agent example, and roadmap (evidence/*, prompts/registry.yaml, agents/examples/AGENT_DATA_ACCESS_PR1.json, docs/roadmap/STATUS.json).

Testing

  • Running the repository Jest invocation against the original test failed due to haste-map/module naming collisions and ESM/transform issues (environmental, not policy logic).
  • Verified policy evaluator behavior with an ESM check: running node --loader ts-node/esm tests/agents/policy/policyEval.check.mjs completed and printed policyEval.check: ok (pass).
  • Evidence artifacts and references were updated to point at the new check runner and included in evidence/index.json.

Codex Task

@BrianCLong BrianCLong added the codex Codex-owned implementation work label Mar 7, 2026 — with ChatGPT Codex Connector
@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly hardens the agent data access policy framework. It establishes a robust foundation for defining, validating, and evaluating agent interactions with data sources and tools, ensuring a deny-by-default posture and enforcing critical security patterns. The changes are confined to the policy definition and evaluation layers, without altering existing runtime behavior, and include a modernized testing approach for improved reliability.

Highlights

  • Enhanced Policy Foundation: Strengthened agent data access policies by introducing validation for banned operation patterns and ensuring all registries (tools, sources) are schema-checked to prevent policy drift and bypass risks.
  • Modernized Testing Framework: Replaced the problematic Jest-based evaluator test with an ESM-compatible Node.js runner, providing a more reliable and local verification method for policy logic.
  • Introduced Core Policy Components: Added new TypeScript files for defining policy types, a policy loader that includes schema and registry ID validation, and a policy evaluator implementing deny-by-default logic and banned-pattern checks.

Activity
  • The policy evaluator behavior was verified using an ESM-compatible Node.js check, confirming its functionality outside of the previous Jest environment.
  • Evidence artifacts and references were updated to reflect the new check runner and included in the evidence/index.json file.

@coderabbitai

coderabbitai bot commented Mar 7, 2026

Warning

Rate limit exceeded

@BrianCLong has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 13 minutes and 28 seconds before requesting another review.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8807f4e7-2af1-44de-9029-53c223ed4f3c

📥 Commits

Reviewing files that changed from the base of the PR and between 68c99c3 and e2967f9.

📒 Files selected for processing (21)
  • .github/policies/agent-data-access/banned_patterns.json
  • .github/policies/agent-data-access/banned_patterns.schema.json
  • .github/policies/agent-data-access/source_registry.json
  • .github/policies/agent-data-access/source_registry.schema.json
  • .github/policies/agent-data-access/tool_registry.json
  • .github/policies/agent-data-access/tool_registry.schema.json
  • agents/examples/AGENT_DATA_ACCESS_PR1.json
  • docs/roadmap/STATUS.json
  • evidence/EVD-OSINTLEGAL-DATAACCESS-001/metrics.json
  • evidence/EVD-OSINTLEGAL-DATAACCESS-001/report.json
  • evidence/EVD-OSINTLEGAL-DATAACCESS-001/stamp.json
  • evidence/EVD-OSINTLEGAL-DATAACCESS-002/metrics.json
  • evidence/EVD-OSINTLEGAL-DATAACCESS-002/report.json
  • evidence/EVD-OSINTLEGAL-DATAACCESS-002/stamp.json
  • evidence/index.json
  • prompts/agent-data-access/policy-foundation@v1.md
  • prompts/registry.yaml
  • src/agents/policy/policyEval.ts
  • src/agents/policy/policyLoad.ts
  • src/agents/policy/policyTypes.ts
  • tests/agents/policy/policyEval.check.mjs

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request lays a strong foundation for agent data access policies by introducing schema-validated registries, a typed policy evaluator, and an ESM-compatible test runner. However, there are significant security gaps in the evaluator logic that undermine the intended 'deny-by-default' posture. Specifically, the approvalId check for restricted sources is easily bypassed due to lack of validation, and the optional nature of operation and sourceId fields allows for bypassing banned pattern and source restriction checks. Additionally, the policy loader is vulnerable to a limited path traversal if the policyDir parameter is untrusted. Further suggestions include aligning JSON schemas with the specified draft version, improving policy loading and evaluation performance, and enhancing test coverage.

Comment on lines +40 to +42
if (!inv.approvalId) {
  return { allow: false, reason: 'APPROVAL_REQUIRED' };
}

security-high high

The policy evaluator allows access to restricted sources as long as an approvalId is present in the invocation. It does not perform any validation on the approvalId (e.g., checking against a registry of valid approvals, verifying the requester, or ensuring the approval is for the specific source). This allows an attacker to bypass the restriction by providing any non-empty string as an approvalId.
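
One way the evaluator could go beyond a presence check is sketched below. This is only an illustration under stated assumptions: the approval registry (`ApprovalRecord`, the `approvals` map), the `requester` field, and the `APPROVAL_INVALID` reason are hypothetical and do not appear in the PR; only `approvalId` and `APPROVAL_REQUIRED` come from the reviewed code.

```typescript
import { strict as assert } from "node:assert";

// Hypothetical shapes: only `approvalId` and the 'APPROVAL_REQUIRED' reason
// appear in the PR. The approval registry and 'APPROVAL_INVALID' are assumed.
interface ApprovalRecord {
  sourceId: string; // source the approval was granted for
  requester: string; // identity the approval was granted to
  expiresAt: number; // expiry as epoch milliseconds
}

interface Invocation {
  sourceId: string;
  requester: string;
  approvalId?: string;
}

type Decision = { allow: true } | { allow: false; reason: string };

function checkApproval(
  inv: Invocation,
  approvals: ReadonlyMap<string, ApprovalRecord>,
  now: number = Date.now(),
): Decision {
  if (!inv.approvalId) {
    return { allow: false, reason: "APPROVAL_REQUIRED" };
  }
  const record = approvals.get(inv.approvalId);
  // An unknown, mismatched, or expired approval is treated as a denial.
  if (
    record === undefined ||
    record.sourceId !== inv.sourceId ||
    record.requester !== inv.requester ||
    record.expiresAt <= now
  ) {
    return { allow: false, reason: "APPROVAL_INVALID" };
  }
  return { allow: true };
}

// Usage: an arbitrary non-empty string no longer passes.
const approvals = new Map<string, ApprovalRecord>([
  ["apr-1", { sourceId: "src-a", requester: "agent-1", expiresAt: 2000 }],
]);
const base = { sourceId: "src-a", requester: "agent-1" };
assert.deepEqual(checkApproval({ ...base, approvalId: "apr-1" }, approvals, 1000), { allow: true });
assert.deepEqual(checkApproval({ ...base, approvalId: "anything" }, approvals, 1000), {
  allow: false,
  reason: "APPROVAL_INVALID",
});
```

Where approvals are actually stored, and who may issue them, is outside what this PR shows, so the lookup is passed in as a parameter rather than assumed.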

Comment on lines +12 to +17
const op = (inv.operation || '').toLowerCase();
for (const pattern of policy.bannedPatterns) {
  if (op.includes(pattern.toLowerCase())) {
    return { allow: false, reason: 'BANNED_OPERATION' };
  }
}

security-medium medium

The policy evaluator only performs checks for operation and sourceId if those fields are provided in the ToolInvocation. Since these fields are optional, an attacker can bypass banned operation patterns and restricted source checks by simply omitting the corresponding fields from the invocation, undermining the 'deny-by-default' security posture. Furthermore, if banned patterns are normalized to lowercase at load time, the .toLowerCase() call here is redundant and can be removed for a minor performance improvement.

    if (op.includes(pattern)) {
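
A minimal sketch of how the omission bypass might be closed, assuming the evaluator is free to reject invocations that leave out `operation` or `sourceId`. The `MISSING_FIELD` reason and the `checkOperation` helper are hypothetical names, and the banned patterns are assumed to have been lowercased at load time.

```typescript
import { strict as assert } from "node:assert";

// `operation` and `sourceId` mirror the optional fields the review mentions;
// the 'MISSING_FIELD' reason is hypothetical.
interface ToolInvocation {
  operation?: string;
  sourceId?: string;
}

type Decision = { allow: true } | { allow: false; reason: string };

// Assumes `bannedPatterns` were already lowercased when the policy was loaded.
function checkOperation(inv: ToolInvocation, bannedPatterns: readonly string[]): Decision {
  // Deny-by-default: an absent field is a denial, not a skipped check.
  if (!inv.operation || !inv.sourceId) {
    return { allow: false, reason: "MISSING_FIELD" };
  }
  const op = inv.operation.toLowerCase();
  for (const pattern of bannedPatterns) {
    if (op.includes(pattern)) {
      return { allow: false, reason: "BANNED_OPERATION" };
    }
  }
  // Passing here only means this step found nothing to deny.
  return { allow: true };
}

assert.deepEqual(checkOperation({ sourceId: "src-a" }, ["bypass"]), {
  allow: false,
  reason: "MISSING_FIELD",
});
assert.deepEqual(checkOperation({ operation: "Bypass-Auth", sourceId: "src-a" }, ["bypass"]), {
  allow: false,
  reason: "BANNED_OPERATION",
});
```

An alternative would be to make both fields required in the `ToolInvocation` type itself, which moves the check to compile time for typed callers but still needs a runtime guard for untyped input.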

Comment on lines +56 to +76
export function loadPolicyBundle(
  policyDir = path.resolve(
    process.cwd(),
    '.github',
    'policies',
    'agent-data-access',
  ),
): PolicyBundle {
  const toolSchemaPath = path.join(policyDir, TOOL_SCHEMA);
  const sourceSchemaPath = path.join(policyDir, SOURCE_SCHEMA);
  const bannedSchemaPath = path.join(policyDir, BANNED_SCHEMA);

  const toolRegistry = readJson<ToolRegistry>(
    path.join(policyDir, 'tool_registry.json'),
  );
  const sourceRegistry = readJson<SourceRegistry>(
    path.join(policyDir, 'source_registry.json'),
  );
  const bannedRegistry = readJson<BannedPatternRegistry>(
    path.join(policyDir, 'banned_patterns.json'),
  );

security-medium medium

The loadPolicyBundle function accepts a policyDir parameter which is used to construct file paths for fs.readFileSync without validation. If this parameter is controlled by an untrusted user, it could lead to arbitrary file read (limited to specific filenames like tool_registry.json).

export function loadPolicyBundle(
  policyDir = path.resolve(
    process.cwd(),
    '.github',
    'policies',
    'agent-data-access',
  ),
): PolicyBundle {
  const resolvedPath = path.resolve(policyDir);
  if (!resolvedPath.startsWith(process.cwd())) {
    throw new Error('Invalid policy directory');
  }
  const toolSchemaPath = path.join(policyDir, TOOL_SCHEMA);
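
Note that a plain `startsWith` prefix test also accepts sibling directories (for a working directory of `/repo`, the path `/repo-evil` passes), and the suggestion above still joins file names against the unresolved `policyDir`. A possible stricter variant, sketched with Node's `path.relative`, follows. The `resolvePolicyDir` helper and its `root` parameter are hypothetical; the sketch does not resolve symlinks, which would need `fs.realpathSync` or similar.

```typescript
import { strict as assert } from "node:assert";
import * as path from "node:path";

// Hypothetical helper: returns the resolved directory, or throws when it
// falls outside `root`. Symlinks are not resolved in this sketch.
function resolvePolicyDir(policyDir: string, root: string = process.cwd()): string {
  const resolvedRoot = path.resolve(root);
  const resolved = path.resolve(resolvedRoot, policyDir);
  const rel = path.relative(resolvedRoot, resolved);
  // A relative path that climbs out of root starts with a '..' segment;
  // on Windows a different drive yields an absolute path instead.
  if (rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel)) {
    throw new Error("Invalid policy directory");
  }
  return resolved;
}

// Usage: the default location is accepted, a sibling directory is not.
const accepted = resolvePolicyDir(".github/policies/agent-data-access", "/repo");
assert.equal(accepted, path.resolve("/repo", ".github/policies/agent-data-access"));
assert.throws(() => resolvePolicyDir("../repo-evil", "/repo"));
```

Callers would then build every file path from the returned value rather than from the original argument.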

Comment on lines +15 to +19
      "additionalProperties": { "$ref": "#/definitions/source" }
    }
  },
  "definitions": {
    "source": {

medium

The JSON schema is declared as draft/2020-12, but it uses the definitions keyword which is from older drafts. For draft 2020-12, definitions is deprecated in favor of $defs. To align with the specified schema version, you should use $defs and update the $ref accordingly.

Suggested change
-      "additionalProperties": { "$ref": "#/definitions/source" }
-    }
-  },
-  "definitions": {
-    "source": {
+      "additionalProperties": { "$ref": "#/$defs/source" }
+    }
+  },
+  "$defs": {
+    "source": {

Comment on lines +15 to +19
      "additionalProperties": { "$ref": "#/definitions/tool" }
    }
  },
  "definitions": {
    "tool": {

medium

The JSON schema is declared as draft/2020-12, but it uses the definitions keyword which is from older drafts. For draft 2020-12, definitions is deprecated in favor of $defs. To align with the specified schema version, you should use $defs and update the $ref accordingly.

Suggested change
-      "additionalProperties": { "$ref": "#/definitions/tool" }
-    }
-  },
-  "definitions": {
-    "tool": {
+      "additionalProperties": { "$ref": "#/$defs/tool" }
+    }
+  },
+  "$defs": {
+    "tool": {

Comment on lines +22 to +23
const ajv = new Ajv2020({ allErrors: true, strict: false });
addFormats(ajv);

medium

Creating a new Ajv2020 instance on every call to validateSchema is inefficient. The AJV instance should be created once at the module level and reused. This avoids the overhead of re-initializing the validator and its formats for each schema validation. After applying the suggested schema fixes for $defs, you might also be able to enable strict mode (strict: true).
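
The reuse idea can be sketched independently of Ajv as a small cache that compiles each schema at most once. Everything here is hypothetical: `createValidatorCache` and the toy compiler are illustrations only, and `compile` stands in for a real compiler such as the `compile` method of a single module-level Ajv instance.

```typescript
import { strict as assert } from "node:assert";

type Validator = (data: unknown) => boolean;
type Compile = (schema: object) => Validator;

// Builds a lookup that compiles each schema at most once per key.
// `compile` is a stand-in for a real compiler, for example the `compile`
// method of one shared Ajv instance created at module level.
function createValidatorCache(compile: Compile): (key: string, schema: object) => Validator {
  const cache = new Map<string, Validator>();
  return (key, schema) => {
    let validator = cache.get(key);
    if (validator === undefined) {
      validator = compile(schema);
      cache.set(key, validator);
    }
    return validator;
  };
}

// Usage with a toy compiler that only checks required property names.
let compileCalls = 0;
const toyCompile: Compile = (schema) => {
  compileCalls += 1;
  const required = (schema as { required?: string[] }).required ?? [];
  return (data) =>
    typeof data === "object" && data !== null && required.every((k) => k in data);
};

const getValidator = createValidatorCache(toyCompile);
const schema = { required: ["patterns"] };
assert.equal(getValidator("banned", schema)({ patterns: [] }), true);
assert.equal(getValidator("banned", schema)({}), false);
assert.equal(compileCalls, 1); // the second lookup reused the compiled validator
```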

  return {
    tools: toolRegistry.tools,
    sources: sourceRegistry.sources,
    bannedPatterns: bannedRegistry.patterns,

medium

To improve performance in the policy evaluator, it's better to normalize the banned patterns to lowercase here during load time. This avoids repeated toLowerCase() calls inside the evaluation loop.

    bannedPatterns: bannedRegistry.patterns.map(p => p.toLowerCase()),
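
A slightly fuller sketch of load-time normalization is shown below, with the caveat that the helper names are hypothetical. Trimming, de-duplicating, and rejecting empty patterns are assumptions added here, not part of the suggestion above; the last one matters because an empty string is a substring of every operation and would ban everything.

```typescript
import { strict as assert } from "node:assert";

// Lowercases, trims, and de-duplicates patterns once at load time.
// Rejecting empty patterns is an added assumption: an empty string is a
// substring of every operation, so it would ban everything.
function normalizeBannedPatterns(patterns: readonly string[]): string[] {
  const seen = new Set<string>();
  for (const raw of patterns) {
    const pattern = raw.trim().toLowerCase();
    if (pattern.length === 0) {
      throw new Error("Banned pattern must not be empty");
    }
    seen.add(pattern);
  }
  return Array.from(seen);
}

// The evaluation loop then lowercases only the incoming operation.
function isBanned(operation: string, normalized: readonly string[]): boolean {
  const op = operation.toLowerCase();
  return normalized.some((pattern) => op.includes(pattern));
}

const patterns = normalizeBannedPatterns(["Bypass", " bypass ", "EXFIL"]);
assert.deepEqual(patterns, ["bypass", "exfil"]);
assert.equal(isBanned("Run-BYPASS-check", patterns), true);
assert.equal(isBanned("read", patterns), false);
```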

);
assert.deepEqual(allowDecision, { allow: true });

console.log('policyEval.check: ok');

medium

The test coverage for the policy evaluator is a bit sparse. It would be beneficial to add more test cases to cover other denial reasons handled by evaluateInvocation. For example:

  • Invocation with a banned operation pattern (e.g., 'bypass').
  • Invocation of a disabled tool or source.
  • Invocation with a scope not allowlisted for the tool.
  • Invocation of a source missing a lawful_basis.
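
The cases listed above could be expressed as a table-driven check along the following lines. This is a sketch, not the PR's code: `evaluate` is a stand-in so the example runs on its own, the policy shapes are guesses based on the review text, and every reason code except `BANNED_OPERATION` is hypothetical. A real check would import `evaluateInvocation` from src/agents/policy/policyEval.ts and use its actual reason codes.

```typescript
import { strict as assert } from "node:assert";

type Decision = { allow: true } | { allow: false; reason: string };
interface Tool { enabled: boolean; scopes: string[] }
interface Source { enabled: boolean; lawful_basis?: string }
interface Policy {
  tools: Record<string, Tool>;
  sources: Record<string, Source>;
  bannedPatterns: string[];
}
interface Invocation { toolId: string; sourceId: string; scope: string; operation: string }

const deny = (reason: string): Decision => ({ allow: false, reason });

// Stand-in evaluator (hypothetical) so the sketch is self-contained.
function evaluate(policy: Policy, inv: Invocation): Decision {
  const op = inv.operation.toLowerCase();
  if (policy.bannedPatterns.some((p) => op.includes(p.toLowerCase()))) return deny("BANNED_OPERATION");
  const tool = policy.tools[inv.toolId];
  if (!tool || !tool.enabled) return deny("TOOL_DENIED");
  if (!tool.scopes.includes(inv.scope)) return deny("SCOPE_DENIED");
  const source = policy.sources[inv.sourceId];
  if (!source || !source.enabled) return deny("SOURCE_DENIED");
  if (!source.lawful_basis) return deny("LAWFUL_BASIS_MISSING");
  return { allow: true };
}

const policy: Policy = {
  tools: {
    search: { enabled: true, scopes: ["read"] },
    scraper: { enabled: false, scopes: ["read"] },
  },
  sources: {
    registry: { enabled: true, lawful_basis: "public-record" },
    forum: { enabled: true }, // no lawful_basis recorded
  },
  bannedPatterns: ["bypass"],
};

const ok: Invocation = { toolId: "search", sourceId: "registry", scope: "read", operation: "query" };
const cases: Array<[string, Invocation, Decision]> = [
  ["allowed baseline", ok, { allow: true }],
  ["banned operation", { ...ok, operation: "bypass-captcha" }, deny("BANNED_OPERATION")],
  ["disabled tool", { ...ok, toolId: "scraper" }, deny("TOOL_DENIED")],
  ["scope not allowlisted", { ...ok, scope: "write" }, deny("SCOPE_DENIED")],
  ["missing lawful_basis", { ...ok, sourceId: "forum" }, deny("LAWFUL_BASIS_MISSING")],
];

for (const [name, inv, expected] of cases) {
  assert.deepEqual(evaluate(policy, inv), expected, name);
}
console.log("policyEval.check: ok");
```

Keeping the cases in one table makes it cheap to add a row for each new denial reason as the evaluator grows.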

@github-actions github-actions bot left a comment

🤖 Auto-approved by Mega Merge Orchestrator

@BrianCLong
Owner Author

Temporarily closing to reduce Actions queue saturation and unblock #22241. Reopen after the golden-main convergence PR merges.

@BrianCLong BrianCLong closed this Mar 30, 2026
Labels

codex (Codex-owned implementation work), queue:blocked
