
feat(inference): add inferenceClass option to inference.LLM #1282

Open
adrian-cowham wants to merge 1 commit into main from ac/interesting-chaum-47b60c

Conversation

Contributor

adrian-cowham commented Apr 21, 2026

Description

Adds an optional inferenceClass?: 'priority' | 'standard' option to inference.LLM, letting callers opt into the LiveKit Agent Gateway's priority/standard routing tiers. When set, the outbound request includes an X-LiveKit-Inference-Priority header; when unset, nothing is emitted (behavior unchanged).

The option is available in two places, mirroring the existing precedence pattern for parallelToolCalls/toolChoice in the same file:

  1. Constructor — applies to every chat() call made by that LLM instance.
  2. Per-call on LLM.chat() — overrides the constructor value for a single request.
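The precedence rule can be sketched as follows. This is a minimal, hypothetical illustration assuming a simplified class shape; only the option names (`inferenceClass`, `InferenceLLMOptions`) and the resolution expression come from the PR description, and the `resolveInferenceClass` helper is invented here for clarity:

```typescript
// Sketch of the precedence rule: per-call value beats the constructor
// default; when neither is set, the result stays undefined.
type InferenceClass = 'priority' | 'standard';

interface InferenceLLMOptions {
  model: string;
  inferenceClass?: InferenceClass;
}

class LLM {
  constructor(private opts: InferenceLLMOptions) {}

  // Hypothetical helper mirroring the resolution done inside chat():
  // an explicit per-call argument (even one repeating the default) wins.
  resolveInferenceClass(callArg?: InferenceClass): InferenceClass | undefined {
    return callArg !== undefined ? callArg : this.opts.inferenceClass;
  }
}
```

Note that the `!== undefined` check (rather than a truthiness check) matters: it lets a per-call value override the constructor default without treating any future falsy option values as "unset".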

Changes Made

  • InferenceLLMOptions gets a new optional inferenceClass?: InferenceClass field; InferenceClass is a new exported type alias ('priority' | 'standard').
  • LLM constructor accepts inferenceClass and stores it on this.opts without a default.
  • LLM.chat() accepts inferenceClass, resolves it as callArg !== undefined ? callArg : this.opts.inferenceClass, and forwards the resolved value to LLMStream.
  • LLMStream.run() sets extraHeaders[INFERENCE_PRIORITY_HEADER] = inferenceClass immediately after the existing provider-header block, only when the resolved value is defined.
  • Header name literals (X-LiveKit-Inference-Provider, X-LiveKit-Inference-Priority) are consolidated as INFERENCE_PROVIDER_HEADER / INFERENCE_PRIORITY_HEADER constants in agents/src/inference/utils.ts, co-located with the other gateway-concerned helpers.
  • type InferenceClass is re-exported from agents/src/inference/index.ts.
  • Adds agents/src/inference/llm.test.ts with 7 explicit it(...) tests covering every observable precedence state (no-value, constructor-only × 2, per-call-only × 2, per-call-overrides-constructor × 2), per the project's "explicit tests, no loop-over-cases" convention.
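The header-emission logic described above can be sketched like this. The constant names and header literals come from the PR; the standalone `buildExtraHeaders` function and its provider argument are simplifying assumptions, since the real code runs inside `LLMStream.run()`:

```typescript
// Header names consolidated in agents/src/inference/utils.ts per the PR.
const INFERENCE_PROVIDER_HEADER = 'X-LiveKit-Inference-Provider';
const INFERENCE_PRIORITY_HEADER = 'X-LiveKit-Inference-Priority';

type InferenceClass = 'priority' | 'standard';

// Hypothetical extraction of the extra-header assembly: the priority
// header is added only when a value was resolved, so unset callers see
// exactly the same outbound request as before this change.
function buildExtraHeaders(
  provider?: string,
  inferenceClass?: InferenceClass,
): Record<string, string> {
  const extraHeaders: Record<string, string> = {};
  if (provider !== undefined) {
    extraHeaders[INFERENCE_PROVIDER_HEADER] = provider;
  }
  if (inferenceClass !== undefined) {
    extraHeaders[INFERENCE_PRIORITY_HEADER] = inferenceClass;
  }
  return extraHeaders;
}
```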

Usage

// Default applied to every chat() call
const llm = new inference.LLM({ model: 'openai/gpt-4o-mini', inferenceClass: 'priority' });
await llm.chat({ chatCtx });                             // header = 'priority'

// Per-call override
await llm.chat({ chatCtx, inferenceClass: 'standard' }); // header = 'standard'

// No default, opt into priority for one call
const llm2 = new inference.LLM({ model: 'openai/gpt-4o-mini' });
await llm2.chat({ chatCtx, inferenceClass: 'priority' }); // header = 'priority'
await llm2.chat({ chatCtx });                             // no header emitted

Pre-Review Checklist

  • Build passes: pnpm build, pnpm format:check, pnpm lint (no new warnings) pass locally.
  • AI-generated code reviewed: minimal, narrowly-scoped diff; no throwaway comments.
  • Changes explained: scoped to a single gateway-routing option.
  • Scope appropriate: only the inference.LLM path is touched — inference.STT, inference.TTS, and FallbackAdapter are intentionally out of scope for this PR.

Testing

  • Automated tests added: agents/src/inference/llm.test.ts (7/7 pass).
  • All inference/voice/LLM unit tests pass. (Unrelated pre-existing plugin-test failures on main — e.g. cerebras/mistral — are not affected by this change.)

Adds an optional `inferenceClass?: 'priority' | 'standard'` option on both
the `inference.LLM` constructor and `LLM.chat()`. When set, the outbound
gateway request includes an `X-LiveKit-Inference-Priority` header, letting
callers opt into the LiveKit Agent Gateway's priority/standard routing tiers.

Precedence follows the existing pattern for `parallelToolCalls`/`toolChoice`:
per-call value beats constructor default; when neither is set, no header is
emitted (behavior unchanged).

Header name literals are consolidated in inference/utils.ts as
`INFERENCE_PROVIDER_HEADER` and `INFERENCE_PRIORITY_HEADER`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

changeset-bot Bot commented Apr 21, 2026

⚠️ No Changeset found

Latest commit: 9124309

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


Contributor

devin-ai-integration Bot left a comment


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 3 additional findings.

