
fix: only enforce token limit in direct LLM mode #307

Merged
paultranvan merged 1 commit into dev from fix/token_check on Apr 20, 2026

Conversation


@Ahmath-Gadji Ahmath-Gadji commented Apr 16, 2026

Move the token-limit check so it only runs in direct LLM mode. In RAG mode the pipeline trims chat history to chat_history_depth (default 4) before calling the LLM, so the pre-trim check was falsely rejecting long conversations that would fit once truncated. Direct LLM mode still enforces the limit since no trimming happens there.
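The change described above can be sketched as follows. This is a hypothetical illustration, not the actual code from openrag/routers/openai.py: the real signature of check_tokens_limit(), the token counter, the limit value, and the handler shape are all assumptions here; only the control-flow change (check inside the direct-LLM branch, trimming to chat_history_depth in RAG mode) comes from the PR.

```python
TOKEN_LIMIT = 4096        # assumed limit, for illustration only
CHAT_HISTORY_DEPTH = 4    # default trim depth mentioned in the PR description

def count_tokens(messages):
    # Crude stand-in tokenizer: one token per whitespace-separated word.
    return sum(len(m["content"].split()) for m in messages)

def check_tokens_limit(messages, limit=TOKEN_LIMIT):
    # Hypothetical equivalent of the real check_tokens_limit() helper.
    if count_tokens(messages) > limit:
        raise ValueError("token limit exceeded")

def handle_request(messages, direct_llm):
    if direct_llm:
        # Direct LLM mode: no trimming happens downstream, so the
        # limit must be enforced here, before calling the model.
        check_tokens_limit(messages)
        return messages
    # RAG mode: the pipeline trims history before calling the LLM, so a
    # pre-trim check would falsely reject conversations that fit once
    # truncated. The fix moves the check out of this path entirely.
    return messages[-CHAT_HISTORY_DEPTH:]
```

Before the fix, check_tokens_limit() ran unconditionally at the top of handle_request(), which is exactly what rejected long RAG conversations that would have fit after trimming.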


coderabbitai Bot commented Apr 16, 2026

No actionable comments were generated in the recent review. 🎉


📥 Commits

Reviewing files that changed from the base of the PR and between a3f5070 and 2c04fd4.

📒 Files selected for processing (1)
  • openrag/routers/openai.py

📝 Walkthrough

The token-limit validation in the OpenAI router now runs only for direct LLM model requests: it has moved from a universal pre-flight check into the direct-model handling branch, so RAG-partitioned models skip it.

Changes

Cohort / File(s): Token Limit Validation Restructure — openrag/routers/openai.py
Summary: Moved the check_tokens_limit() call from unconditional pre-flight execution to conditional execution within the direct LLM model branch; RAG-partitioned models now bypass token-limit validation.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 A token check hops to new ground,
Now direct models keep it around,
While RAG-partitioned paths skip ahead,
Control flow branches where logic is led! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Docstring Coverage — ⚠️ Warning. Docstring coverage is 0.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
✅ Passed checks (2 passed)
Description Check — ✅ Passed (check skipped; CodeRabbit's high-level summary is enabled).
Title Check — ✅ Passed. The title accurately summarizes the main change: the token limit check is now only enforced in direct LLM mode, matching the code modification that moved the check inside the direct-model branch.


@Ahmath-Gadji Ahmath-Gadji marked this pull request as ready for review April 20, 2026 12:43
@paultranvan paultranvan merged commit da655cf into dev Apr 20, 2026
4 checks passed
@paultranvan paultranvan deleted the fix/token_check branch April 20, 2026 13:00
@Ahmath-Gadji Ahmath-Gadji added the fix Fix issue label Apr 20, 2026
