Skip to content

🛡️ Sentinel: [CRITICAL] Fix Server-Side Request Forgery (SSRF) in bulk lookup API#207

Open
aicoder2009 wants to merge 1 commit into
mainfrom
sentinel/ssrf-bulk-lookup-7554644686263006690
Open

🛡️ Sentinel: [CRITICAL] Fix Server-Side Request Forgery (SSRF) in bulk lookup API#207
aicoder2009 wants to merge 1 commit into
mainfrom
sentinel/ssrf-bulk-lookup-7554644686263006690

Conversation

@aicoder2009

@aicoder2009 aicoder2009 commented Jul 3, 2026

Copy link
Copy Markdown
Owner

🚨 Severity: CRITICAL
💡 Vulnerability: The bulk lookup API relied on the user-controllable Host header via request.nextUrl.origin to execute a loopback fetch() for internal endpoints. This introduces a Server-Side Request Forgery (SSRF) vulnerability.
🎯 Impact: Attackers could manipulate the Host header to force the server to proxy requests to arbitrary internal services.
🔧 Fix: Refactored the bulk lookup route to directly import the internal Next.js POST handlers, bypassing network execution. Handlers are securely invoked using a synthetic NextRequest with only safe JSON headers.
Verification: Verified by checking that the SSRF tests pass successfully alongside the full test suite (pnpm test:run), and that pnpm lint confirms there are no new lint regressions. Also documented learning in Sentinel's journal.


PR created automatically by Jules for task 7554644686263006690 started by @aicoder2009

Summary by CodeRabbit

  • Bug Fixes

    • Improved bulk lookup requests to work more reliably and securely, reducing failures caused by internal request handling.
    • Preserved the same batch results and summary counts while making item lookups more robust.
  • Documentation

    • Added a security note describing the issue and the updated handling approach for future reference.

Removed loopback fetch in /api/lookup/bulk/route.ts that relied on request.nextUrl.origin and replaced it with secure direct NextRequest invocations. Also updated corresponding tests.

Co-authored-by: aicoder2009 <127642633+aicoder2009@users.noreply.github.com>
Copilot AI review requested due to automatic review settings July 3, 2026 06:30
@vercel

vercel Bot commented Jul 3, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
opencitation Ready Ready Preview, Comment Jul 3, 2026 6:31am

@google-labs-jules

Copy link
Copy Markdown
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@coderabbitai

coderabbitai Bot commented Jul 3, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

The bulk lookup API route was changed to invoke internal URL/DOI/ISBN lookup route handlers directly via synthetic NextRequest objects instead of making fetch calls to loopback endpoints, mitigating an SSRF risk. Tests were updated to mock the handlers instead of global.fetch, and a security note was added.

Changes

Bulk Lookup SSRF Mitigation

Layer / File(s) Summary
Direct handler invocation implementation
src/app/api/lookup/bulk/route.ts
Route now imports and directly invokes URL/DOI/ISBN lookup POST handlers via synthetic NextRequest objects instead of fetching ${baseUrl}${apiEndpoint} over HTTP.
Test updates for mocked handlers
src/app/api/lookup/bulk/route.test.ts
Tests mock lookupUrl/lookupDoi/lookupIsbn POST handlers and assert on their invocation and returned NextResponse data instead of mocking global.fetch.
SSRF documentation entry
.jules/sentinel.md
Adds a dated entry describing the SSRF risk from loopback fetch using request.nextUrl.origin and the mitigation approach.

Estimated code review effort: 3 (Moderate) | ~20 minutes

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant BulkRoute as bulk/route.ts
  participant Handler as lookupUrl/lookupDoi/lookupIsbn POST

  Client->>BulkRoute: POST items[]
  loop each item
    BulkRoute->>BulkRoute: detect type & build payload
    BulkRoute->>BulkRoute: create synthetic NextRequest
    BulkRoute->>Handler: await handler(syntheticReq)
    Handler-->>BulkRoute: NextResponse
  end
  BulkRoute-->>Client: results + summary
Loading
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: fixing an SSRF vulnerability in the bulk lookup API.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch sentinel/ssrf-bulk-lookup-7554644686263006690

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint install failed due to a network error.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR mitigates an SSRF risk in the bulk lookup API by removing the internal loopback fetch() behavior derived from request.nextUrl.origin and instead invoking internal Next.js route handlers directly with a synthetic NextRequest.

Changes:

  • Refactors /api/lookup/bulk to route items by directly calling the URL/DOI/ISBN POST handlers (no internal network call).
  • Updates the bulk route tests to mock the internal handlers rather than mocking global.fetch.
  • Documents the SSRF learning/prevention approach in .jules/sentinel.md.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
src/app/api/lookup/bulk/route.ts Replaces loopback fetch() with direct internal handler invocation using a synthetic NextRequest.
src/app/api/lookup/bulk/route.test.ts Switches tests to mocking internal route handlers instead of global.fetch.
.jules/sentinel.md Adds a journal entry documenting the SSRF issue and the chosen mitigation approach.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 39 to +43
if (trimmedItem.match(/^(https?:\/\/|www\.)/i)) {
apiEndpoint = "/api/lookup/url";
body = { url: trimmedItem };
handler = lookupUrl as unknown as (req: NextRequest) => Promise<NextResponse> | NextResponse;
payload = { url: trimmedItem };
} else if (trimmedItem.match(/^10\.\d{4,}/)) {
apiEndpoint = "/api/lookup/doi";
body = { doi: trimmedItem };
handler = lookupDoi as unknown as (req: NextRequest) => Promise<NextResponse> | NextResponse;
Comment on lines 1 to +3
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { POST } from './route';
import { NextRequest } from 'next/server';
import { NextRequest, NextResponse } from 'next/server';
Comment on lines 68 to 77
it('routes URLs to /api/lookup/url', async () => {
(global.fetch as ReturnType<typeof vi.fn>).mockResolvedValueOnce({
ok: true,
json: async () => ({ data: { title: 'Example Page' } }),
});
(lookupUrl as ReturnType<typeof vi.fn>).mockResolvedValueOnce(
NextResponse.json({ data: { title: 'Example Page' } }, { status: 200 })
);
const response = await POST(makeRequest({ items: ['https://example.com'] }));
const data = await response.json();
expect(data.results[0].success).toBe(true);
expect(data.results[0].data.title).toBe('Example Page');
const [url] = (global.fetch as ReturnType<typeof vi.fn>).mock.calls[0] as [string];
expect(url).toContain('/api/lookup/url');
expect(lookupUrl).toHaveBeenCalled();
});

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
src/app/api/lookup/bulk/route.test.ts (1)

99-107: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Consider a case exercising the synthetic-request path itself.

Mocking the URL/DOI/ISBN POST handlers correctly isolates routing/summary logic, but it also bypasses the actual SSRF-mitigation code (synthetic NextRequest construction + real handler invocation). Since that construction is the core of this PR, one integration-style test that lets a real handler parse the synthetic request (with network fetches stubbed at a lower layer) would guard against regressions in the request-building contract.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/app/api/lookup/bulk/route.test.ts` around lines 99 - 107, Add an
integration-style test around POST in bulk/route.test.ts that exercises the
synthetic NextRequest path instead of mocking the URL/DOI/ISBN handlers
directly. Keep the real handler invocation for one item, but stub the
lower-level network layer so the synthetic request construction and parsing are
actually covered. Use the existing POST and makeRequest flow as the entry point,
and verify the response summary still marks the item failed when the real
handler processes the synthetic request.
src/app/api/lookup/bulk/route.ts (1)

35-47: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Extract a shared handler type instead of as unknown as casts.

The triple as unknown as (req: NextRequest) => Promise<NextResponse> | NextResponse obscures that all three imports already share the signature (req: NextRequest) => Promise<NextResponse<APIResponse>>. A single aliased type improves readability and would surface real signature drift rather than silencing it.

♻️ Sketch
+type LookupHandler = (req: NextRequest) => Promise<NextResponse> | NextResponse;
+
     const lookupPromises = items.map(async (item) => {
       ...
-        let handler: (req: NextRequest) => Promise<NextResponse> | NextResponse;
+        let handler: LookupHandler;

Then cast each import once as lookupUrl as unknown as LookupHandler (or, better, align the handler return types so no cast is needed).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/app/api/lookup/bulk/route.ts` around lines 35 - 47, The bulk lookup route
is repeating a verbose `as unknown as` cast for `lookupUrl`, `lookupDoi`, and
`lookupIsbn`, which hides their shared function shape. Introduce a single shared
handler type near the `handler` variable in `route.ts` and use that alias for
all three imports so the cast happens once (or remove it entirely if the
signatures already align); this keeps `NextRequest`/`NextResponse` handling
consistent and makes signature drift visible.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/app/api/lookup/bulk/route.ts`:
- Around line 45-47: Bulk ISBN matching is using trimmedItem directly, so
hyphenated or spaced ISBNs are rejected even though lookupIsbn accepts
normalized values. In src/app/api/lookup/bulk/route.ts, normalize the candidate
ISBN with the same cleanISBN-style separator stripping used by /api/lookup/isbn
before the trimmedItem.match(/^(97[89])?\d{9}[\dXx]$/) check, and use that
normalized value for the isbn payload passed to lookupIsbn.
- Around line 39-41: Normalize bare www. inputs before calling lookupUrl in bulk
route handling: the trimmedItem branch in the bulk lookup route currently
forwards values like www.example.com unchanged, which later fails in lookup/url
when new URL(url) is parsed. Update the payload construction in the bulk route’s
URL-detection path so that when the input starts with www. but has no scheme, it
is prefixed with https:// before delegating to lookupUrl; keep already-schemed
URLs unchanged.

---

Nitpick comments:
In `@src/app/api/lookup/bulk/route.test.ts`:
- Around line 99-107: Add an integration-style test around POST in
bulk/route.test.ts that exercises the synthetic NextRequest path instead of
mocking the URL/DOI/ISBN handlers directly. Keep the real handler invocation for
one item, but stub the lower-level network layer so the synthetic request
construction and parsing are actually covered. Use the existing POST and
makeRequest flow as the entry point, and verify the response summary still marks
the item failed when the real handler processes the synthetic request.

In `@src/app/api/lookup/bulk/route.ts`:
- Around line 35-47: The bulk lookup route is repeating a verbose `as unknown
as` cast for `lookupUrl`, `lookupDoi`, and `lookupIsbn`, which hides their
shared function shape. Introduce a single shared handler type near the `handler`
variable in `route.ts` and use that alias for all three imports so the cast
happens once (or remove it entirely if the signatures already align); this keeps
`NextRequest`/`NextResponse` handling consistent and makes signature drift
visible.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: a49feb2f-07eb-4229-9a15-a9239d3c8f13

📥 Commits

Reviewing files that changed from the base of the PR and between b69285b and 868bf79.

📒 Files selected for processing (3)
  • .jules/sentinel.md
  • src/app/api/lookup/bulk/route.test.ts
  • src/app/api/lookup/bulk/route.ts

Comment on lines 39 to +41
if (trimmedItem.match(/^(https?:\/\/|www\.)/i)) {
apiEndpoint = "/api/lookup/url";
body = { url: trimmedItem };
handler = lookupUrl as unknown as (req: NextRequest) => Promise<NextResponse> | NextResponse;
payload = { url: trimmedItem };

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf '\n== route.ts files ==\n'
git ls-files 'src/app/api/lookup/**/route.ts'

printf '\n== bulk route excerpt ==\n'
sed -n '1,140p' src/app/api/lookup/bulk/route.ts

printf '\n== url route excerpt ==\n'
sed -n '1,160p' src/app/api/lookup/url/route.ts

printf '\n== search for URL normalization ==\n'
rg -n "new URL\\(|https://|www\\.|normalize.*url|scheme" src/app/api/lookup -S

Repository: aicoder2009/opencitation

Length of output: 14414


Normalize www. URLs before delegating in src/app/api/lookup/bulk/route.ts:39-41. lookup/url calls new URL(url), so inputs like www.example.com are rejected as Invalid URL format. Prepend https:// when the scheme is missing.

🐛 Proposed fix
         if (trimmedItem.match(/^(https?:\/\/|www\.)/i)) {
           handler = lookupUrl as unknown as (req: NextRequest) => Promise<NextResponse> | NextResponse;
-          payload = { url: trimmedItem };
+          payload = { url: /^www\./i.test(trimmedItem) ? `https://${trimmedItem}` : trimmedItem };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (trimmedItem.match(/^(https?:\/\/|www\.)/i)) {
apiEndpoint = "/api/lookup/url";
body = { url: trimmedItem };
handler = lookupUrl as unknown as (req: NextRequest) => Promise<NextResponse> | NextResponse;
payload = { url: trimmedItem };
if (trimmedItem.match(/^(https?:\/\/|www\.)/i)) {
handler = lookupUrl as unknown as (req: NextRequest) => Promise<NextResponse> | NextResponse;
payload = { url: /^www\./i.test(trimmedItem) ? `https://${trimmedItem}` : trimmedItem };
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/app/api/lookup/bulk/route.ts` around lines 39 - 41, Normalize bare www.
inputs before calling lookupUrl in bulk route handling: the trimmedItem branch
in the bulk lookup route currently forwards values like www.example.com
unchanged, which later fails in lookup/url when new URL(url) is parsed. Update
the payload construction in the bulk route’s URL-detection path so that when the
input starts with www. but has no scheme, it is prefixed with https:// before
delegating to lookupUrl; keep already-schemed URLs unchanged.

Comment on lines 45 to +47
} else if (trimmedItem.match(/^(97[89])?\d{9}[\dXx]$/)) {
apiEndpoint = "/api/lookup/isbn";
body = { isbn: trimmedItem };
handler = lookupIsbn as unknown as (req: NextRequest) => Promise<NextResponse> | NextResponse;
payload = { isbn: trimmedItem };

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate the bulk lookup route and ISBN handler files
git ls-files 'src/app/api/lookup/**' 'src/app/api/**/isbn/**' 'src/lib/**' | sed -n '1,200p'

echo '--- bulk route outline ---'
ast-grep outline src/app/api/lookup/bulk/route.ts --view expanded || true

echo '--- search for cleanISBN and ISBN-related validation ---'
rg -n "cleanISBN|isbn|Unrecognized format|lookupIsbn" src/app/api src/lib -g '!**/*.map'

echo '--- read relevant sections ---'
for f in \
  src/app/api/lookup/bulk/route.ts \
  src/app/api/lookup/isbn/route.ts \
  src/lib/*isbn* \
  src/lib/*ISBN*; do
  [ -f "$f" ] && { echo "### $f"; wc -l "$f"; cat -n "$f" | sed -n '1,220p'; }
done

Repository: aicoder2009/opencitation

Length of output: 21586


Normalize ISBNs before bulk matching src/app/api/lookup/bulk/route.ts:45-47trimmedItem.match(/^(97[89])?\d{9}[\dXx]$/) rejects hyphenated/spaced ISBNs that /api/lookup/isbn accepts after cleanISBN(), so pasted ISBNs fall into “Unrecognized format” in bulk. Strip separators before this check.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/app/api/lookup/bulk/route.ts` around lines 45 - 47, Bulk ISBN matching is
using trimmedItem directly, so hyphenated or spaced ISBNs are rejected even
though lookupIsbn accepts normalized values. In
src/app/api/lookup/bulk/route.ts, normalize the candidate ISBN with the same
cleanISBN-style separator stripping used by /api/lookup/isbn before the
trimmedItem.match(/^(97[89])?\d{9}[\dXx]$/) check, and use that normalized value
for the isbn payload passed to lookupIsbn.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants