Skip to content

feat(prerender): use organic+included urls when audit is triggered from Slack#2307

Open
ssilare-adobe wants to merge 14 commits intomainfrom
feat/prerender-organic-mode
Open

feat(prerender): use organic+included urls when audit is triggered from Slack#2307
ssilare-adobe wants to merge 14 commits intomainfrom
feat/prerender-organic-mode

Conversation

@ssilare-adobe
Copy link
Copy Markdown
Contributor

@ssilare-adobe ssilare-adobe commented Apr 6, 2026

Summary

  • When run {site} prerender Slack command triggers a prerender audit, the audit worker now detects this via auditContext.slackContext.channelId and uses a simplified URL strategy: organic + included URLs only, with no daily batching and no agentic URL sources
  • For scheduled (non-Slack) runs, the original behavior is preserved: agentic URLs + organic + included, with daily batching and recency filtering
  • Removes the MODE_ORGANIC constant in favor of Slack context detection
  • Restores getRecentlyProcessedPathnames and isNotRecentUrl helpers for scheduled runs
  • No changes to spacecat-api-service — the existing run {site} prerender command is used as-is

Test plan

  • All 213 prerender handler tests pass
  • Slack-triggered tests verify agentic URLs are NOT fetched and recency filter is bypassed
  • Scheduled run tests verify batching and agentic URLs are still used
  • Existing batching/citability tests restored and passing

🤖 Generated with Claude Code

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 6, 2026

This PR will trigger a minor release when merged.

@ssilare-adobe ssilare-adobe changed the title feat(prerender): add organic-only mode to skip agentic URL sources feat(prerender): use organic+included urls when audit is triggered from Slack Apr 6, 2026
@ssilare-adobe ssilare-adobe requested a review from anuj-adobe April 6, 2026 10:14
Copy link
Copy Markdown
Contributor

@anuj-adobe anuj-adobe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Overview

When run {site} prerender is triggered via Slack, the prerender audit now detects this via auditContext.slackContext.channelId and uses a simplified strategy: organic + included URLs only, skipping agentic URL sources and daily batching. Scheduled runs are unaffected.


Code Quality & Readability

src/prerender/handler.js

  1. Variable pre-declaration block is a readability concern (around the isSlackTriggered branch):

    let finalUrls;
    let filteredCount;
    let agenticUrlsCount = 0;
    let currentAgentic = 0;
    let currentOrganic;
    let currentIncludedUrls;
    let isFirstRunOfCycle;

    Six let declarations before an if/else is a code smell. The Slack path and scheduled path are divergent enough to warrant extraction into separate helper functions (e.g., buildSlackTriggeredBatch(...) / buildScheduledBatch(...)), which would eliminate these shared mutable vars and make each path self-contained.

  2. isNotRecentUrl moved up — relocated from after importTopPages to just after getRecentlyProcessedPathnames. Improves logical proximity. No correctness issue.

  3. Minor: cleaner siteIdsiteId: site.getId()siteId (already destructured above). Clean.

  4. Log message formatting fixed — the multi-line log had a leading newline before prerender_submit_scraping_metrics:. Good cleanup.


Potential Issues / Regressions

🚨 Critical gap: fallback path in processContentAndGenerateOpportunities not guarded

The spec explicitly states:

Step 3 fallback path: skips agentic fetch when mode === 'organic'

But the diff only changes a comment in step 3 — the actual getTopAgenticUrls() call in the fallback is not gated on isSlackTriggered. If a Slack-triggered run produces zero scrape results, the fallback will still fetch agentic URLs — contradicting the intent. This is a bug.

Removed metric: agenticNewThisCycle

The previous log included agenticNewThisCycle=${filteredAgenticUrls.length} — the count of agentic URLs new in this cycle. This was replaced with isSlackTriggered. If anyone is parsing this log field for monitoring/alerting, it's a silent regression worth noting.

Spec/implementation mismatch

The design doc (2026-04-06-prerender-organic-mode-design.md) describes using MODE_ORGANIC = 'organic' driven by explicit {"mode":"organic"} in the SQS message data (like MODE_AI_ONLY). The actual implementation diverged to use Slack context detection instead. The spec is now stale/misleading and should be updated to reflect the actual approach, or removed.


Test Coverage

Tests added cover:

  • ✅ Slack-triggered: organic URLs included even when "recently processed"
  • ✅ Slack-triggered: agentic URLs NOT fetched
  • ✅ Scheduled: agentic URLs still fetched

Missing test: The fallback path in processContentAndGenerateOpportunities for Slack-triggered runs is not tested (and as noted above, the code doesn't guard it either).


Summary

Area Status
Correctness ⚠️ Step 3 fallback not guarded for Slack runs — potential regression
Readability Acceptable, but pre-declared let block warrants refactoring into helpers
Test coverage Missing fallback path test; matches the gap in the implementation
Observability agenticNewThisCycle metric silently removed
Docs Spec is stale relative to implementation

Recommendation: Block on the step 3 fallback gap. The rest can be addressed or accepted as follow-up.

@ssilare-adobe
Copy link
Copy Markdown
Contributor Author

Thanks for the thorough review @anuj-adobe!

Critical fix — step 3 fallback not guarded (addressed in 5659973)

The getTopAgenticUrls() call in the processContentAndGenerateOpportunities fallback path is now gated on isSlackTriggered. I destructured auditContext at the top of the function (reusing the same detection pattern as step 1) and wrapped the agentic fetch in if (!isSlackTriggered). Also added a dedicated test: "should not fetch agentic URLs in fallback path when triggered from Slack" that asserts the Athena stub is not called when the fallback path is hit with a Slack context.

Other points from the review

  • let pre-declaration block / helper extraction — agreed it's a smell; I'll track this as a follow-up refactor rather than bundling it here to keep the PR focused on the behaviour fix.
  • agenticNewThisCycle metric removed — noted as a silent regression. Worth monitoring if anything parses that field; I'll add a note in the PR description.
  • Spec staleness — the design doc diverged from the implementation (Slack context detection vs. explicit mode field). I'll update or remove the stale plan doc as a follow-up.

@ssilare-adobe ssilare-adobe requested a review from anuj-adobe April 6, 2026 15:30
@anuj-adobe
Copy link
Copy Markdown
Contributor

anuj-adobe commented Apr 7, 2026

  1. Stale docs — Both docs/superpowers/plans/2026-04-06-prerender-organic-mode.md and the design spec describe the MODE_ORGANIC/SQS approach that was abandoned in favor of Slack context detection. They're misleading as committed. Consider removing them in this PR rather than leaving them stale from day one.
  2. why is agenticNewThisCycle removed?

@ssilare-adobe ssilare-adobe force-pushed the feat/prerender-organic-mode branch from 8149afe to b550078 Compare April 7, 2026 12:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants