A Cloudflare Worker that generates RSS feeds from websites that don't have them.
It scrapes listing pages, uses Workers AI to identify articles, then fetches and parses each article into a proper RSS 2.0 feed — complete with images, excerpts, and full content.
- Cron trigger runs hourly (configurable)
- Fetches the listing page for each configured feed
- Workers AI extracts article URLs, titles, and dates from the page text
- Scrapes new articles using HTMLRewriter to pull titles, content, and Open Graph metadata
- Stores articles in KV and serves them as RSS 2.0 feeds
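Under assumed names (the real code lives in `src/`), the pipeline above can be sketched as a single refresh function with each stage injected — a minimal sketch, not the repo's actual API:

```typescript
// Illustrative sketch of the hourly refresh pipeline; every name here is an
// assumption, not the repo's actual code.
interface FeedConfig {
  slug: string;
  listUrl: string;
  maxArticles: number;
}

interface Article {
  title: string;
  url: string;
  date?: string;
}

async function refreshFeed(
  config: FeedConfig,
  fetchText: (url: string) => Promise<string>,             // fetch the listing page
  extract: (pageText: string) => Promise<Article[]>,       // Workers AI step
  scrape: (url: string) => Promise<Article>,               // HTMLRewriter step
  store: (slug: string, items: Article[]) => Promise<void> // KV step
): Promise<void> {
  const pageText = await fetchText(config.listUrl);
  const entries = await extract(pageText);
  const articles = await Promise.all(entries.map((e) => scrape(e.url)));
  // Trim to the configured maximum before persisting.
  await store(config.slug, articles.slice(0, config.maxArticles));
}
```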
- Node.js (v18+)
- A Cloudflare account (free tier works)
```
git clone https://github.com/tamm/rss-anything.git
cd rss-anything
npm install

# Copy the example configs
cp wrangler.toml.example wrangler.toml
cp .env.example .env
```

Edit `.env` with your Cloudflare API token.

Create a KV namespace and update `wrangler.toml`:

```
npx wrangler kv namespace create KV
# Copy the output id into wrangler.toml
```

```
npm run dev
# Visit http://localhost:8787
```

Feeds are defined in `src/config.ts`. Each feed needs:
| Field | Description |
|---|---|
| `slug` | URL-safe identifier (e.g. `"anthropic-news"`) |
| `title` | Human-readable feed title |
| `description` | Short description |
| `listUrl` | The listing/index page to monitor |
| `baseUrl` | Base URL for resolving relative links |
| `titleSelector` | CSS selector for the article title on article pages |
| `contentSelector` | CSS selector for article content on article pages |
| `maxArticles` | Maximum number of articles to keep in the feed |
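The `baseUrl` field exists because listing pages often link to articles with relative hrefs; a sketch of the resolution step using the standard `URL` constructor (the helper name is hypothetical):

```typescript
// Resolve a relative href from the listing page into an absolute article URL.
function resolveArticleUrl(href: string, baseUrl: string): string {
  return new URL(href, baseUrl).toString();
}
```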
```ts
{
  slug: "anthropic-news",
  title: "Anthropic News",
  description: "Latest news and announcements from Anthropic",
  listUrl: "https://www.anthropic.com/news",
  baseUrl: "https://www.anthropic.com",
  titleSelector: "h1",
  contentSelector: "main",
  maxArticles: 20,
}
```

The listing page is fetched and all text content and links are extracted using HTMLRewriter. This data is sent to a Workers AI model (Llama 3.1 8B by default) with a prompt asking it to identify the chronological article entries and return structured JSON with titles, URLs, and dates.
This approach means you don't need to write fragile CSS selectors for every listing page — the AI handles the varying layouts.
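The model's reply still has to be parsed defensively, since LLMs sometimes wrap JSON in a markdown fence or return malformed entries. A hedged sketch of that parsing step — the function name and entry shape are assumptions, not the repo's actual schema:

```typescript
// Assumed shape of what the prompt asks the model to return.
interface ListingEntry {
  title: string;
  url: string;
  date?: string;
}

// Strip an optional markdown code fence, parse the JSON, and keep only
// entries that have the required string fields.
function parseAiResponse(raw: string): ListingEntry[] {
  const stripped = raw
    .replace(/^```(?:json)?\s*/i, "")
    .replace(/\s*```$/, "")
    .trim();
  const parsed = JSON.parse(stripped);
  if (!Array.isArray(parsed)) throw new Error("expected a JSON array of entries");
  return parsed.filter(
    (e): e is ListingEntry =>
      typeof e?.title === "string" && typeof e?.url === "string"
  );
}
```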
```
npm run deploy
```

The worker will be available at `https://rss-anything.<your-subdomain>.workers.dev`. Feeds are accessible at `/feed/<slug>`.
To serve feeds from a custom domain, uncomment and edit the routes section in wrangler.toml.
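For example, a routes entry might look like this (domain is a placeholder; the `/rss/*` path matches the default `MOUNT_PREFIX`):

```toml
routes = [
  { pattern = "example.com/rss/*", zone_name = "example.com" }
]
```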
Tuneable defaults are in `src/defaults.ts`:

| Constant | Default | Description |
|---|---|---|
| `USER_AGENT` | `"rss-anything/1.0"` | User-Agent for outgoing requests |
| `AI_TEXT_LIMIT` | `4000` | Max characters of page text sent to the AI |
| `AI_MAX_TOKENS` | `2000` | Max tokens the AI may generate |
| `AI_MODEL` | `"@cf/meta/llama-3.1-8b-instruct"` | Workers AI model |
| `FEED_CACHE_MAX_AGE` | `900` | `Cache-Control` max-age (seconds) |
| `MOUNT_PREFIX` | `"/rss"` | Path prefix for custom domain routing |
| `FETCH_TIMEOUT` | `15000` | Timeout (ms) for outgoing requests |
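As an illustration of how the request-related defaults might be applied (a sketch under assumed names — the wrapper is not the repo's actual code), `FETCH_TIMEOUT` pairs naturally with the standard `AbortSignal.timeout`:

```typescript
const USER_AGENT = "rss-anything/1.0";
const FETCH_TIMEOUT = 15000;

// Build the options for an outgoing fetch: identify ourselves and
// abort the request if it exceeds the configured timeout.
function requestInit(): RequestInit {
  return {
    headers: { "User-Agent": USER_AGENT },
    signal: AbortSignal.timeout(FETCH_TIMEOUT),
  };
}
```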
```
npm run dev          # Start local dev server
npm test             # Run tests
npm run test:watch   # Run tests in watch mode
npm run typecheck    # Type-check without emitting
```