fix(auth): detect HTTPS behind TLS-terminating reverse proxy #313
etiquet wants to merge 65 commits into linagora:dev
… + harden gitignore
Add OpenWebUI/Keycloak integration layer, Google Drive connector, notification channels (email, Tchap, webhook), admin router, QA override, eval module, and OIDC/integration migrations. Harden .gitignore to exclude PostgreSQL data directory (db/) and additional macOS artifacts.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- MyRAG (beta) v0.1.0 FastAPI app with health, config, root endpoints
- Pydantic settings for OpenRAG, Keycloak, Legifrance, GraphRAG config
- Dockerfile (python:3.12-slim, port 8200)
- docker-compose.yaml with owui-net network alias
- TDD: 6 unit tests passing (health, config, app title with beta)
- Directory structure: routers/, services/, models/, templates/, static/
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- article: splits legal codes by Article Lxxx-x with hierarchy and cross-refs
- section: splits reports by markdown headers
- qr: splits FAQs by question/answer pairs
- length: fixed-length chunks with overlap
- auto-detection: regex-based strategy selection
- TDD: 28 unit tests passing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
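For reference, the "length" strategy above can be sketched as follows. This is a minimal illustration, not the chunker from the PR: the function name `chunk_by_length` and its default sizes are assumptions.

```python
from typing import List


def chunk_by_length(text: str, size: int = 1000, overlap: int = 100) -> List[str]:
    """Split text into fixed-length chunks that share `overlap` characters.

    Hypothetical helper: name and defaults are illustrative only.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

With `size=4` and `overlap=1`, each chunk starts on the last character of the previous one, so no boundary text is lost between chunks.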
- OpenRAG API client: create partition, upload chunks, search, list models
- Ingest router: POST /api/ingest/{collection} with file upload + auto-chunking
- Strategy selection via form param (auto, article, section, qr, length)
- TDD: 40 unit tests passing (health + chunker + openrag client)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Article content now prefixed with "Article Lxxx — Livre X, Titre Y, Chapitre Z"
- Added metadata fields: page, parent_path, referenced_by (placeholder), graph_ready flag
- parent_path enables hierarchy navigation (Livre-I/Titre-II/Chapitre-Ier)
- referenced_by + graph_ready prepared for post-processing graph build
- TDD: 46 unit tests passing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Sensitivity levels: public, internal, restricted, confidential, secret
- Sensitivity set at ingestion time, stored in every chunk's metadata
- Can be modified later per-chunk (for access control filtering)
- Article chunks prefixed with "Article Lxxx — Livre X, Titre Y" for LLM citation
- Ingest endpoint accepts sensitivity parameter
- TDD: 46 unit tests passing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Ingest returns immediately with job_id (no more browser timeout)
- Background upload via asyncio.create_task
- GET /api/ingest/jobs/{job_id}: real-time progress (uploaded/total, pct, ETA)
- GET /api/ingest/jobs: list all jobs (filterable by collection)
- watch-ingest.sh: terminal progress bar with ETA
- CESEDA v2: 2399 chunks with article headers + sensitivity metadata
- TDD: 46 unit tests passing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
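The "return a job_id immediately, upload in the background" pattern described in this commit could look roughly like the sketch below. It is an assumption-laden illustration: `JOBS`, `start_ingest` and `_upload_chunks` are hypothetical names, the in-memory dict stands in for whatever store the service uses, and the `asyncio.sleep(0)` is a placeholder for the real upload call.

```python
import asyncio
import time
import uuid
from typing import Dict, List, Set

JOBS: Dict[str, dict] = {}        # hypothetical in-memory job registry
_TASKS: Set[asyncio.Task] = set()  # keep references so tasks are not garbage-collected


async def _upload_chunks(job_id: str, chunks: List[str]) -> None:
    job = JOBS[job_id]
    started = time.monotonic()
    for done, _chunk in enumerate(chunks, start=1):
        await asyncio.sleep(0)  # placeholder for the real upload request
        elapsed = time.monotonic() - started
        job["uploaded"] = done
        job["pct"] = round(100 * done / job["total"], 1)
        # Naive ETA: average time per chunk times the chunks still to upload.
        job["eta_s"] = round(elapsed / done * (job["total"] - done), 1)
    job["status"] = "done"


def start_ingest(chunks: List[str]) -> str:
    """Register a job and schedule the upload; must run inside an event loop."""
    job_id = uuid.uuid4().hex
    JOBS[job_id] = {"status": "running", "uploaded": 0,
                    "total": len(chunks), "pct": 0.0, "eta_s": None}
    task = asyncio.create_task(_upload_chunks(job_id, chunks))
    _TASKS.add(task)
    task.add_done_callback(_TASKS.discard)
    return job_id
```

A progress endpoint such as GET /api/ingest/jobs/{job_id} would then only need to return the matching `JOBS` entry. A later commit in this PR replaces the in-memory tracker with a DB-backed store, so the dict here is purely illustrative.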
- Titre/Chapitre regex now captures only roman numerals + ordinals (Ier, bis, ter, préliminaire)
- Before: "Titre du séjour" → "du" / "Chapitre IV du titre" → "IV du"
- After: "Titre II : LES CARTES" → "II" / "Chapitre Ier" → "Ier"
- TDD: 46 tests passing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
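One way to get the behaviour described in this commit is sketched below. The pattern is an assumption written from the before/after examples in the message, not the regex actually committed; `heading_number` is a hypothetical helper name.

```python
import re
from typing import Optional

# Accept only "préliminaire" or a roman numeral, optionally followed by
# "er" (Ier) and/or a "bis"/"ter" suffix. Case-sensitive on purpose, so
# lowercase words such as "du" are never captured.
_HEADING_RE = re.compile(
    r"\b(Titre|Chapitre)\s+"
    r"(préliminaire|[IVXLCDM]+(?:er)?(?:\s+(?:bis|ter))?)\b"
)


def heading_number(line: str) -> Optional[str]:
    """Return the numbering part of a Titre/Chapitre heading, if any."""
    match = _HEADING_RE.search(line)
    return match.group(2) if match else None
```

The trailing `\b` is what rejects ordinary words that merely start with a roman-numeral letter, for example "Dispositions".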
…cking
- Collection model with configurable system_prompt per collection
- Default prompt forces article citation (juridique)
- CRUD: POST /api/collections, GET, PATCH /{name}/system-prompt
- Async ingest: returns job_id immediately, tracks progress
- GET /api/ingest/jobs/{job_id}: uploaded/total, pct, ETA
- watch-ingest.sh: terminal progress bar
- TDD: 55 unit tests passing (health + chunker + openrag + collections)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Explicit rules for citing article numbers (L, R, D) with Livre/Titre/Chapitre
- Filter out technical articles (AGDREF, data processing, transitional provisions)
- Prefer legislative articles (L) over regulatory (R, D)
- Structured response format: direct answer then article-by-article details
- Tested: correctly cites L423-1, L423-14, L423-15 for "vie privee et familiale"
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…lates
Builtin templates:
- generic: general-purpose document search (default)
- juridique: codes and laws with article citations
- ceseda: specialized for CESEDA (law on foreign nationals)
- multi_thematique: large multi-domain corpora
- faq: Q&A knowledge bases
- multimedia: images, video, audio, transcripts
- technique: technical documentation and specs
Features:
- Each collection references a template and can override the prompt
- Template CRUD API: GET/POST/PUT/DELETE /api/collections/_templates
- Builtins are read-only; admins can add custom templates
- Custom templates persisted in data/_config/prompt_templates.json
- Auto-loaded at startup
- TDD: 55 tests passing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Token management with client_credentials grant + cache
- Group CRUD: create, find, list members, add/remove user
- MyRAG-specific: create_collection_groups (user + admin groups)
- delete_collection_groups, list_collection_groups
- Ensure root group /myrag/ exists
- Paginated user listing
- TDD: 62 unit tests passing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- SyncService: maps KC groups to OpenRAG memberships
- myrag/{collection} members → editor role
- myrag/{collection}-admin members → owner role
- Auto-provisions users in OpenRAG from KC
- POST /api/sync: sync all collections
- POST /api/sync/{collection}: sync one collection
- OpenRAG client: added _upload_form for form-data endpoints
- TDD: 68 unit tests passing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
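The group-to-role mapping listed in this commit can be illustrated with a small sketch. The function name `role_for_group` and the exact path parsing are assumptions; only the two rules (members of myrag/{collection} become editors, members of myrag/{collection}-admin become owners) come from the commit message.

```python
from typing import Optional, Tuple

ROOT_GROUP = "myrag"
ADMIN_SUFFIX = "-admin"


def role_for_group(group_path: str) -> Optional[Tuple[str, str]]:
    """Map a Keycloak group path to (collection, role), or None if unrelated."""
    parts = [p for p in group_path.strip("/").split("/") if p]
    if len(parts) != 2 or parts[0] != ROOT_GROUP:
        return None  # not a per-collection MyRAG group
    name = parts[1]
    if name.endswith(ADMIN_SUFFIX) and len(name) > len(ADMIN_SUFFIX):
        return name[: -len(ADMIN_SUFFIX)], "owner"
    return name, "editor"
```

Note that under this naive parsing a collection whose own name ends in "-admin" would be ambiguous; a real implementation would need a rule for that case.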
- 11 commits, 68 unit tests passing
- All Phase 1 modules implemented: skeleton, chunker, OpenRAG client, async ingest, collections, prompt templates, Keycloak client, sync service
- Real-world test: CESEDA 2399 articles indexed with citations working
- Updated with all UX findings from testing session
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- build_graph_from_chunks: creates directed graph from chunk metadata
- Nodes: articles with livre/titre/chapitre/content_preview
- Edges: directional "cite" references between articles
- referenced_by populated on target nodes
- GraphBuilder class: build, save (JSON), load, get_subgraph (N-hop)
- to_graph_data_response: format compatible with grafragexp Cytoscape.js viewer
- Nodes sized by degree, grouped by Livre
- Query filtering: subgraph around matching articles
- TDD: 84 unit tests passing (16 new for graph)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
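The graph construction and N-hop subgraph described above can be sketched as follows. This is not the GraphBuilder from the PR: the metadata keys `article_id` and `references`, and the function names `build_graph` and `subgraph`, are assumptions chosen for illustration.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


def build_graph(chunks: Iterable[dict]) -> Dict[str, dict]:
    """Build a directed 'cite' graph and fill referenced_by on target nodes."""
    chunks = list(chunks)
    nodes: Dict[str, dict] = {
        c["article_id"]: {"references": [], "referenced_by": []} for c in chunks
    }
    for chunk in chunks:
        src = chunk["article_id"]
        for dst in chunk.get("references", []):
            # Skip unknown targets, self-references and duplicate edges.
            if dst in nodes and dst != src and dst not in nodes[src]["references"]:
                nodes[src]["references"].append(dst)
                nodes[dst]["referenced_by"].append(src)
    return nodes


def subgraph(nodes: Dict[str, dict], start: str, hops: int) -> Set[str]:
    """Return the articles within `hops` edges of `start`, in either direction."""
    if start not in nodes:
        return set()
    seen: Set[str] = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == hops:
            continue  # do not expand beyond the requested radius
        neighbours: List[str] = nodes[node]["references"] + nodes[node]["referenced_by"]
        for nxt in neighbours:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return seen
```

The traversal follows edges in both directions so that a viewer centred on one article shows both what it cites and what cites it.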
Graph API:
- GET /graph: Cytoscape.js viewer (HTML, copied from grafragexp)
- GET /graph/data: GraphDataResponse format (compatible grafragexp)
- GET /graph/{collection}/related: subgraph around an article
- POST /graph/{collection}/build: build graph from indexed chunks
- GET /graph/config: viewer configuration
Article views:
- GET /articles/{collection}/{article_id}: DSFR HTML view (iframe-friendly)
- GET /articles/{collection}/{article_id}/json: JSON data
- Jinja2 template with breadcrumb, hierarchy, references, cited-by links
- PostMessage iframe-resize for OWUI embedding
- Sensitivity badge display
TDD: 84 unit tests passing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tool MyRAG (owui/tool_myrag.py):
- search_collection: RAG + graph context → HTMLResponse iframe
- view_article: full DSFR article → HTMLResponse iframe
- explore_graph: Cytoscape.js viewer → HTMLResponse iframe
- browse_collection: table of contents → HTMLResponse iframe
- Pattern: (HTMLResponse, context) compatible owuitools-websnap
Pipe filter (owui/pipe_myrag_filter.py):
- Detects #collection in user message
- Searches OpenRAG, injects context into prompt
- Loads collection system prompt from MyRAG
- Type: filter (inlet)
Plugin declaration (owui-plugin.yaml):
- Tool: myrag with 4 methods
- Filter: myrag-collection-filter
TDD: 84 unit tests passing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Moved /templates routes before /{name} to prevent FastAPI from
matching "templates" as a collection name
- Renamed /_templates to /templates (cleaner URL)
- TDD: 84 tests passing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- build-graph.py: CLI script to build graph from local file
- CESEDA v3 graph: 2399 articles, 9293 cross-references
- Most connected: R931-5 (132 links), L445-1 (109), L446-1 (109)
- Grouped by Livre (I-VI)
- Viewer: http://localhost:8200/graph?corpus_id=ceseda-v3
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fragments in graph viewer were truncated at 200/300 chars
- Now show up to 2000 chars (full article content for most articles)
- Rebuilt ceseda-v3 graph with full content
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ai_summary_enabled: toggle in collection config (default: off)
- ai_summary_threshold: chars threshold for triggering LLM summary (default: 1000)
- POST /graph/{collection}/summarize: generates summaries via LLM
- Short articles: 500-char raw preview (no LLM)
- Long articles: "Resume par l'IA" badge + 3-5 sentence summary
- Graph viewer shows "[... tronque — N caracteres]" for unsummarized long articles
- Endpoint respects collection config (disabled returns early)
- TDD: 84 tests passing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- LegifranceClient: OAuth2 token, search, get_article, get_code_toc
- parse_legifrance_url: extract type + ID from any Legifrance URL
(codes, articles, lois, JO)
- Sources router:
- POST /api/sources/legifrance/parse-url: parse and validate URL
- POST /api/sources/legifrance/search: search PISTE API
- POST /api/sources/legifrance/add: register source on collection
- GET /api/sources/legifrance/status/{collection}: check source config
- TDD: 95 unit tests passing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- namespace, deployment, service, ingress (TLS), configmap, secret, PVC
- Liveness/readiness probes on /health
- ConfigMap: OpenRAG, Keycloak, GraphRAG viewer URLs
- Secrets: admin tokens, Legifrance credentials
- Ingress: myrag.mirai.gouv.fr with Let's Encrypt
- PVC: 5Gi for graph data persistence
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- QRCache: per-collection Q&R cache with JSON persistence
- CRUD: add, list, update, delete entries
- Search: exact match + fuzzy matching (SequenceMatcher, threshold 0.7)
- Import/export JSON for sharing between environments
- Hit/miss stats tracking
- Sources: manual, feedback, import
- TDD: 106 unit tests passing (11 new)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
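The "exact match, then fuzzy match at 0.7" lookup mentioned above can be sketched with the standard library's `difflib.SequenceMatcher`. The function name `lookup`, the normalisation step and the plain-dict storage are assumptions; only the matcher and the threshold come from the commit message.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional


def _normalize(question: str) -> str:
    # Lowercase and collapse whitespace so trivial variations match exactly.
    return " ".join(question.lower().split())


def lookup(question: str, entries: Dict[str, str],
           threshold: float = 0.7) -> Optional[str]:
    """Return the cached answer for the closest question, or None on a miss."""
    query = _normalize(question)
    normalized = {_normalize(q): a for q, a in entries.items()}

    if query in normalized:  # exact hit, no scoring needed
        return normalized[query]

    best_score, best_answer = 0.0, None
    for cached_question, answer in normalized.items():
        score = SequenceMatcher(None, query, cached_question).ratio()
        if score > best_score:
            best_score, best_answer = score, answer
    return best_answer if best_score >= threshold else None
```

A linear scan like this is adequate for a small per-collection cache; a large cache would likely need an index or embedding-based search instead.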
- EvalService: manage Q&R datasets per collection
- Score similarity (SequenceMatcher 50%) + citation correctness (50%)
- must_cite / must_not_cite for precise article validation
- Detects missing citations and unwanted pollution (AGDREF)
- Eval runs: create, list, update with results
- Import/export JSON datasets
- TDD: 117 unit tests passing (11 new)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
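A possible reading of the 50/50 scoring above is sketched below. The commit does not say how citation correctness is computed, so the rule used here (fraction of must_cite articles found, dropping to zero if any must_not_cite article appears) is an assumption, as are the names `score_answer` and `_cites`.

```python
import re
from difflib import SequenceMatcher
from typing import Dict, Sequence


def _cites(answer: str, article: str) -> bool:
    # Whole-identifier match, so "L423-1" is not found inside "L423-14".
    pattern = rf"(?<![\w-]){re.escape(article)}(?![\w-])"
    return re.search(pattern, answer) is not None


def score_answer(expected: str, answer: str,
                 must_cite: Sequence[str] = (),
                 must_not_cite: Sequence[str] = ()) -> Dict[str, object]:
    """Blend text similarity and citation correctness, 50% each."""
    similarity = SequenceMatcher(None, expected.lower(), answer.lower()).ratio()
    missing = [a for a in must_cite if not _cites(answer, a)]
    unwanted = [a for a in must_not_cite if _cites(answer, a)]

    if unwanted:
        citation = 0.0  # assumed policy: any forbidden citation fails the check
    elif must_cite:
        citation = 1 - len(missing) / len(must_cite)
    else:
        citation = 1.0
    return {
        "score": 0.5 * similarity + 0.5 * citation,
        "similarity": similarity,
        "citation": citation,
        "missing": missing,
        "unwanted": unwanted,
    }
```

Returning the `missing` and `unwanted` lists alongside the score makes it easier to see why a run failed, for example when an AGDREF article pollutes the answer.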
- FeedbackService: ingest, list, review, promote, stats
- Idempotent on owui_message_id (no duplicates)
- Feedback router:
- POST /api/feedback/ingest (called by OWUI outlet)
- GET /api/feedback/{collection} (filter by status, rating)
- GET /api/feedback/{collection}/stats (satisfaction rate)
- PATCH /{id}/review (reviewed | ignored)
- POST /{id}/promote (→ Q&R cache or eval dataset)
- OWUI outlet (feedback_outlet.py): fire-and-forget capture
- Promotion loop:
- promote_to='qr' → adds to QRCache (R1)
- promote_to='eval' → adds to EvalService (R2)
- R5 virtuous loop: feedback → review → promote → improve
- TDD: 127 unit tests passing (10 new)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pages implemented:
- / (dashboard): collection list with quality indicators
- /c/{id}: collection detail with tabs (prompt, feedback, Q&R)
- /c/{id}/playground: RAG test with debug panel (sources, graph)
- /c/{id}/graph: Cytoscape.js viewer in iframe
- /c/{id}/upload: file upload with strategy + sensitivity selection
- /c/{id}/config: collection settings (strategy, sensitivity, scope, graph, AI summary)
- /c/{id}/prompt: system prompt editor with template selector + test playground
- /admin: dashboard with collections table, sync button, jobs list
- /admin/create: create collection form with all options
Stack: Nuxt 4 + @gouvfr/dsfr + @gouvminint/vue-dsfr
Layout: DSFR header, nav, footer, breadcrumbs
API composable: useApi() for centralized fetch calls
Build: SPA mode, isolated tsconfig
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Green dot: service UP and reachable
- Red pulsing dot: service DOWN with error banner
- Orange pulsing dot: checking...
- Grey dot: unknown (internal network, not verifiable from browser)
- Checks MyRAG /health and OpenRAG /health_check every 30s
- Error banner when OpenRAG is down
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… links
- DSFR CSS was not loading because head link paths were wrong
- Moved to nuxt.config css array which resolves from node_modules
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Header logo: Mes collections (beta)
- Header service title: Mes collections
- Footer: Mes collections (beta)
- Page title: Mes collections (beta)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…zard
- Sources with soon:true are grayed out (opacity 0.5, cursor not-allowed)
- Badge 'Bientot disponible' replaces strategy/refresh badges
- Cannot be selected (click disabled)
- Applied to Resana as example
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Single 'Type de collection' selector replaces separate strategy + prompt fields
- 6 profiles: juridique, FAQ, rapport, corpus, multimedia, generique
- Each profile sets: strategy, prompt template, graph enabled
- Pre-selected based on source (legifrance → juridique, directory → corpus, etc.)
- Hint text shows profile description
- Source cards also carry prompt_template for consistency
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- AI summary checkbox disabled and grayed when graph is off
- Unchecking graph auto-unchecks AI summary
- Label shows "(necessite le graph)" when disabled
- "En savoir plus" expandable section explains what graph is and when to use it
- Checklist: useful for legal codes, technical docs with cross-refs; not for FAQs
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- AI summary checkbox only visible when graph is enabled (v-if instead of grayed)
- Removed confusing "(necessite le graph)" inline text
- "En savoir plus" explains: summary is only for graph viewer display
- Clarifies: original article is never modified, RAG uses full text
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- contact_name: pre-filled from profile.name or preferred_username
- contact_email: pre-filled from profile.email
- Only pre-fills if fields are empty (user can override)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Name field:
- Placeholder: "un-nom-clair-et-unique" (was ceseda-v4)
- Hint: format explanation (lowercase, no spaces)
- "Verifier" button checks uniqueness against existing collections
- Green valid text when available, red error when taken
- Auto-suggestion (appends -v2, -v3...) with "Utiliser" button
- Auto-normalizes input (lowercase, replace spaces with dashes)
Description field:
- Hint explains it appears in the catalog
- Better placeholder with concrete example
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Portee: "Tout le ministere" / "Un ou plusieurs groupes" / "Prive (pour evaluation)"
- Group scope: searchable group picker from Keycloak session groups
- Selected groups as dismissible tags
- Private scope: explanation text
- Options fieldset moved under Type de collection
- Label: "decoupage + prompt systeme" (was "adaptes")
- Pre-fill user groups from Keycloak JWT profile
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Donnees ouvertes, Interne au ministeriel, Donnees personnelles, Confidentiel, Diffusion restreinte
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add database foundation for MyRAG persistent storage:
- SQLAlchemy async models: collections, publications, ingest_jobs, feedback, eval_datasets, eval_runs, source_files (R7)
- SQLite for dev (default), PostgreSQL for prod via DATABASE_URL
- Auto-create tables on startup via lifespan event
- Dependencies: sqlalchemy[asyncio], aiosqlite, asyncpg, alembic
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…y DB
Replace file-based CollectionConfig with SQLAlchemy-backed collection_store:
- New collection_store.py service (CRUD async via DB)
- Migrate collections.py router to use DB queries
- Migrate publication.py router to use Publication/PublicationHistory models
- Update playground.py, graph.py, sources.py to use DB store
- Remove all CollectionConfig.load/save references from routers
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace in-memory job_tracker with job_store.py (SQLAlchemy DB)
- Jobs persist across restarts, queryable via DB
- R7: save source files to /app/data/_sources/{collection}/ before
chunking, record in source_files table with checksum
- Refactor ingest router: shared _ingest_content() for file upload
and URL fetch
- Source files enable future re-indexation when strategy changes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace file-based FeedbackService with feedback_store.py:
- Feedback CRUD via SQLAlchemy async (ingest, list, review, promote)
- Stats computed from DB queries
- Idempotent ingest on owui_message_id
- All routers now use DB-backed services
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- POST /api/ingest/{collection}/reindex?strategy=xxx re-chunks all
stored source files with the new strategy and re-indexes in OpenRAG
- GET /api/ingest/{collection}/sources lists stored source files
with metadata (filename, size, checksum, chunks produced)
- Enables strategy changes without re-uploading documents
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Major refactoring of MyRAG backend and frontend:
Backend (R6 — SQLite/PostgreSQL):
- All data persisted in SQLAlchemy DB (collections, publications,
jobs, feedback, source files)
- SQLite for dev, PostgreSQL for prod via DATABASE_URL
- Pod stateless: no more JSON files for config
Backend (R7 — source file storage + reindex):
- Source files saved to /app/data/_sources/ before chunking
- POST /api/ingest/{collection}/reindex re-indexes with new strategy
- GET /api/ingest/{collection}/sources lists stored files
Frontend improvements:
- Step 3: unified upload card, URL verification via backend, GitHub
URL normalization, preview via backend proxy
- Step 4: markdown preview, auto-generated eval dataset, run tests
with scoring, guardrail prompt suggestion for out-of-scope
- Step 5: DSFR cards, Keycloak group selector, create group button
- Config page: aligned with wizard, reindex button when strategy changes
- Homepage: collections from OpenRAG + file counts
- Keycloak: list/create groups endpoints, admin password fallback
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…lections
The integrations/ directory (MyRAG, OpenWebUI integrations, Keycloak scripts) has been extracted into its own repository with full git history preserved: https://github.com/IA-Generative/mycollections
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When OpenRAG runs behind Traefik/Nginx that terminates TLS, the internal request arrives as HTTP. This causes the OIDC state cookie to be set without the Secure flag, which prevents the browser from sending it back on the HTTPS callback, breaking the OIDC flow.
Fix: _is_request_secure() now checks:
1. PREFERRED_URL_SCHEME env var (explicit config)
2. X-Forwarded-Proto header (set by reverse proxies)
3. request.url.scheme (existing check)
Without this fix, OIDC login fails with "OIDC code exchange failed" when deployed behind any TLS-terminating proxy.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
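The three-step check described in this commit can be sketched as below. This is not the code from the PR: to stay self-contained it takes the scheme and headers as plain values instead of a request object, and the name `is_request_secure`, the `env` parameter and the handling of comma-separated header values are assumptions.

```python
import os
from typing import Mapping, Optional


def is_request_secure(scheme: str,
                      headers: Mapping[str, str],
                      env: Optional[Mapping[str, str]] = None) -> bool:
    """Decide whether the client-facing connection should be treated as HTTPS."""
    env = os.environ if env is None else env

    # 1. Explicit configuration wins.
    if env.get("PREFERRED_URL_SCHEME", "").strip().lower() == "https":
        return True

    # 2. Header set by the reverse proxy. Header names are case-insensitive,
    #    and a chain of proxies may send a comma-separated list; the first
    #    entry is assumed here to describe the original client connection.
    lowered = {name.lower(): value for name, value in headers.items()}
    forwarded = lowered.get("x-forwarded-proto", "")
    if forwarded.split(",")[0].strip().lower() == "https":
        return True

    # 3. Fall back to the scheme of the connection the app actually received.
    return scheme.lower() == "https"
```

The result would then drive the Secure attribute of the OIDC state cookie. Because X-Forwarded-Proto is supplied by the client unless a proxy overwrites it, this check is only safe when the application is reachable exclusively through a trusted proxy.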
Hello @etiquet, Before putting more time into this PR, would you mind clearing up some qualms I have with the scope of your modifications and the relevance of certain changes?
What is the point of providing this file in the PR? With the new age of AI, there may be some use for this file I don't grasp.
Unless these changes are core to the auth fix in some way, please resubmit them in other PRs.
Summary
When OpenRAG runs behind a TLS-terminating reverse proxy (Traefik, Nginx, etc.), the OIDC state cookie is set without the Secure flag because request.url.scheme is http (the proxy handles TLS). The browser then does not send the cookie back on the HTTPS callback, which breaks the OIDC Authorization Code + PKCE flow with an "OIDC code exchange failed" error.
Root cause
_is_request_secure() only checks request.url.scheme, which reflects the internal HTTP connection from the proxy, not the client-facing HTTPS.
Fix
_is_request_secure() now checks three indicators:
1. PREFERRED_URL_SCHEME=https env var (explicit config, already used elsewhere in OpenRAG)
2. X-Forwarded-Proto: https header (standard reverse proxy header)
3. request.url.scheme (existing fallback)
Test plan
- … Secure flag when accessed via HTTPS
🤖 Generated with Claude Code