feat(inference): add inferenceClass option to inference.LLM by adrian-cowham · Pull Request #1282 · livekit/agents-js

adrian-cowham · 2026-04-21T22:24:57Z

Description

Adds an optional inferenceClass?: 'priority' | 'standard' option to inference.LLM, letting callers opt into the LiveKit Agent Gateway's priority/standard routing tiers. When set, the outbound request includes an X-LiveKit-Inference-Priority header; when unset, nothing is emitted (behavior unchanged).

The option is available in two places, mirroring the existing precedence pattern for parallelToolCalls/toolChoice in the same file:

- Constructor: applies to every chat() call made by that LLM instance.
- Per-call on LLM.chat(): overrides the constructor value for a single request.
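The two entry points above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: `SketchLLM`, `SketchLLMOptions`, `SketchChatArgs`, and `resolveForCall` are hypothetical names, and only the `InferenceClass` union and the precedence rule come from the description.

```typescript
// Hypothetical sketch of the constructor/per-call precedence; not the actual inference.LLM class.
type InferenceClass = 'priority' | 'standard';

interface SketchLLMOptions {
  model: string;
  // No default is applied: undefined means "do not emit a header".
  inferenceClass?: InferenceClass;
}

interface SketchChatArgs {
  inferenceClass?: InferenceClass;
}

class SketchLLM {
  private readonly opts: SketchLLMOptions;

  constructor(opts: SketchLLMOptions) {
    this.opts = opts;
  }

  // Returns the value that would be forwarded for a single call:
  // an explicit per-call value wins, otherwise the constructor value (possibly undefined).
  resolveForCall(args: SketchChatArgs = {}): InferenceClass | undefined {
    return args.inferenceClass !== undefined ? args.inferenceClass : this.opts.inferenceClass;
  }
}
```

Comparing against `undefined` explicitly, rather than relying on truthiness, keeps "not provided" distinct from any provided value, which is the property the precedence rule depends on.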

Changes Made

- InferenceLLMOptions gets a new optional inferenceClass?: InferenceClass field; InferenceClass is a new exported type alias ('priority' | 'standard').
- The LLM constructor accepts inferenceClass and stores it on this.opts without applying a default.
- LLM.chat() accepts inferenceClass, resolves it as callArg !== undefined ? callArg : this.opts.inferenceClass, and forwards the resolved value to LLMStream.
- LLMStream.run() sets extraHeaders[INFERENCE_PRIORITY_HEADER] = inferenceClass immediately after the existing provider-header block, and only when the resolved value is defined.
- The header name literals (X-LiveKit-Inference-Provider, X-LiveKit-Inference-Priority) are consolidated into the INFERENCE_PROVIDER_HEADER / INFERENCE_PRIORITY_HEADER constants in agents/src/inference/utils.ts, co-located with the other gateway-related helpers.
- The InferenceClass type is re-exported from agents/src/inference/index.ts.
- Adds agents/src/inference/llm.test.ts with 7 explicit it(...) tests covering every observable precedence state (no-value, constructor-only × 2, per-call-only × 2, per-call-overrides-constructor × 2), per the project's "explicit tests, no loop-over-cases" convention.
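The header emission described for LLMStream.run() can be approximated like this. The constant names and header strings are taken from the description above; `buildExtraHeaders` and its `provider` parameter are hypothetical stand-ins, and the shape of the existing provider-header block is an assumption.

```typescript
// Header names as given in the PR description.
const INFERENCE_PROVIDER_HEADER = 'X-LiveKit-Inference-Provider';
const INFERENCE_PRIORITY_HEADER = 'X-LiveKit-Inference-Priority';

// Hypothetical helper: the real logic lives inline in LLMStream.run().
function buildExtraHeaders(
  provider: string | undefined,
  inferenceClass: 'priority' | 'standard' | undefined,
): Record<string, string> {
  const extraHeaders: Record<string, string> = {};

  // Assumed shape of the existing provider-header block.
  if (provider !== undefined) {
    extraHeaders[INFERENCE_PROVIDER_HEADER] = provider;
  }

  // New behavior: emit the priority header only when a value was resolved.
  if (inferenceClass !== undefined) {
    extraHeaders[INFERENCE_PRIORITY_HEADER] = inferenceClass;
  }

  return extraHeaders;
}
```

Because the key is never written when the resolved value is undefined, callers that do not opt in send exactly the headers they sent before.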

Usage

// Default applied to every chat() call
const llm = new inference.LLM({ model: 'openai/gpt-4o-mini', inferenceClass: 'priority' });
await llm.chat({ chatCtx });                             // header = 'priority'

// Per-call override
await llm.chat({ chatCtx, inferenceClass: 'standard' }); // header = 'standard'

// No default, opt into priority for one call
const llm2 = new inference.LLM({ model: 'openai/gpt-4o-mini' });
await llm2.chat({ chatCtx, inferenceClass: 'priority' }); // header = 'priority'
await llm2.chat({ chatCtx });                             // no header emitted

Pre-Review Checklist

- Build passes: pnpm build, pnpm format:check, and pnpm lint (no new warnings) pass locally.
- AI-generated code reviewed: minimal, narrowly scoped diff; no throwaway comments.
- Changes explained: scoped to a single gateway-routing option.
- Scope appropriate: only the inference.LLM path is touched; inference.STT, inference.TTS, and FallbackAdapter are intentionally out of scope for this PR.

Testing

- Automated tests added: agents/src/inference/llm.test.ts (7/7 pass).
- All inference/voice/LLM unit tests pass. (Unrelated pre-existing plugin-test failures on main, e.g. cerebras/mistral, are not affected by this change.)

Adds an optional `inferenceClass?: 'priority' | 'standard'` option on both the `inference.LLM` constructor and `LLM.chat()`. When set, the outbound gateway request includes an `X-LiveKit-Inference-Priority` header, letting callers opt into the LiveKit Agent Gateway's priority/standard routing tiers. Precedence follows the existing pattern for `parallelToolCalls`/`toolChoice`: per-call value beats constructor default; when neither is set, no header is emitted (behavior unchanged). Header name literals are consolidated in inference/utils.ts as `INFERENCE_PROVIDER_HEADER` and `INFERENCE_PRIORITY_HEADER`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

changeset-bot · 2026-04-21T22:25:02Z

⚠️ No Changeset found

Latest commit: 9124309

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 3 additional findings.

adrian-cowham requested a review from toubatbrian April 21, 2026 22:27

devin-ai-integration Bot reviewed Apr 21, 2026

adrian-cowham wants to merge 1 commit into main from ac/interesting-chaum-47b60c
