feat: support renderer stop token IDs#9141
Draft
AmeenP wants to merge 1 commit intoai-dynamo:bis/dynamo-rlfrom
Draft
feat: support renderer stop token IDs#9141AmeenP wants to merge 1 commit intoai-dynamo:bis/dynamo-rlfrom
AmeenP wants to merge 1 commit intoai-dynamo:bis/dynamo-rlfrom
Conversation
Contributor
|
👋 Hi AmeenP! Thank you for contributing to ai-dynamo/dynamo. Just a reminder: The 🚀 |
Signed-off-by: AmeenP <ameenp360@gmail.com>
ec45e57 to
bf00fb2
Compare
biswapanda
added a commit
that referenced
this pull request
May 5, 2026
…clusion Three improvements borrowed from PR #9141 (Ameen Patel) on top of the existing PASSTHROUGH_EXTRA_FIELDS expansion: 1. get_stop_token_ids returns Result<Option<Vec<TokenIdType>>>, not Option<Vec<TokenIdType>>. Malformed payloads (e.g. stop_token_ids: "not-an-array") now surface as a typed 400 with the diagnostic 'stop_token_ids must be an array of unsigned token IDs: {err}'. extract_stop_conditions propagates the Result via ?. Replaces the prior silent-fallback Option<> variant which dropped malformed inputs without telling the caller. Silent drops on RL correctness primitives (stop conditions affect what tokens the engine emits) is the bug class CR-9 was about — same principle applies here. 2. Mutual-exclusion between messages and pre-tokenized input is now scoped to the canonical TOP-LEVEL prompt_token_ids extension only. The legacy nvext.token_data channel — which the verifiers dynamo_chat_nvext renderer transport (#1287) uses with placeholder messages 'role: user, content: (token-in mode)' — is allowed to coexist with non-empty messages. validate_messages still gates the empty-messages-with-no-tokens case. Without this relaxation, the renderer transport's placeholder pattern would 400 on every request. 3. Two new tests in test_common_ext.rs: - test_chat_completions_stop_token_ids_extraction: positive case with nvext.token_data + top-level stop_token_ids (lifted from PR #9141 verbatim). - test_chat_completions_stop_token_ids_malformed_returns_400: verifies the typed-error path on bad input. Pre-existing test struct-init sites in test_common_ext.rs were missing required fields (return_token_ids, tokens) added to the NvCreateChatCompletionRequest struct since the tests were written. Three sites updated to construct cleanly. cargo test test_common_ext: 15 tests, 15 passes.
biswapanda
added a commit
that referenced
this pull request
May 8, 2026
Cleanup-only — no behavior change. Strips review-tracker noise that accumulated on top of PR-added text during iteration: - "Closes hhzhang16 HH-19/HH-21/HH-22/HH-23/HH-25/HH-26/HH-27" - "CR-8 / CR-9 / CR-10 closure" prefixes on serde-error / doc-attach fixes - Branch-name references: bis/parity-tokenize-tcp, bis/prime-rl-merged - Internal PR numbers: #6094, #7699, #8197, #9141 - Phase numbers from internal design docs (rl-support.md Phase 1/4/5) - "prime-rl" mentions in narrative copy and mermaid diagrams → generic "RL trainer / RL orchestrator / external client" Technical content (semantics, invariants, why-this-exists rationale) preserved everywhere; only the internal-process scaffolding is removed. Scope verification: every removed line is one this branch ADDED (diff main..HEAD shows the removed text on a `+` line). No edits land on pre-existing main-branch comments. Specifically reverted the nvext.rs cleanup attempt — its target lines (GAIE Stage 1/2, SGLang-specific) live on main, not in this PR's diff. Files touched: components/src/dynamo/vllm/handlers.py +9 -10 components/src/dynamo/vllm/worker_factory.py +6 -4 docs/dynamo-RL-api.md +19 -32 lib/llm/src/http/service/openai.rs +32 -34 lib/llm/src/protocols/openai/chat_completions/delta.rs +4 -4 lib/llm/src/protocols/openai/completions/delta.rs +3 -3 lib/llm/src/protocols/openai/validate.rs +20 -20 cargo check -p dynamo-llm: clean (1 pre-existing benign warning).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds renderer-style
stop_token_idssupport to the Dynamo chat completions path on top ofbis/dynamo-rl.stop_token_idsas an RL/verifiers passthrough fieldNvCreateChatCompletionRequestStopConditions.stop_token_ids_hidden, the existing decoder path that hides stop tokens from outputnvext.token_data+stop_token_idsextractionThis keeps the transport on
/v1/chat/completionswithnvext.token_data; it does not add/generateor revive/v1/chat/completions/tokens.Compatibility
This is the Dynamo-side piece for Prime-RL/verifiers renderer parity. Companion verifiers PR: PrimeIntellect-ai/verifiers#1287
Validation
git diff --checkAttempted locally:
cargo fmt --checkis blocked by unrelated formatting diffs already present inbis/dynamo-rl(lib/llm/src/http/service/openai.rs,lib/llm/src/protocols/openai/chat_completions/delta.rs).RUSTC=$(rustup which --toolchain 1.90.0 rustc) $(rustup which --toolchain 1.90.0 cargo) test -p dynamo-llm --test test_common_ext test_chat_completions_stop_token_ids_extractionstarts compiling but fails on macOS-only build incompatibilities in existingdynamo-llmLinux storage paths (dynamo_memory::numa,DiskStorage,fallocate,O_DIRECT).