feat(providers): add DeepInfra as a built-in inference provider by mmilutinovic371 · Pull Request #1902 · NVIDIA/OpenShell

mmilutinovic371 · 2026-06-14T16:48:37Z

Summary

DeepInfra is one of the top open-source LLM providers, and its low cost and high performance make it a good fit for agent frameworks. This PR promotes it from a documented workaround to a core built-in provider in OpenShell using Providers v2.

Adds deepinfra as a built-in Providers v2 profile with DEEPINFRA_API_KEY discovery
Adds deepinfra as a built-in inference provider alongside nvidia, openai, and anthropic
DEEPINFRA_API_KEY is now discovered automatically via --from-existing (through the v2 profile discovery section)
openshell provider list-profiles shows DeepInfra in the INFERENCE section
Fixes build_backend_url to correctly strip /v1 from request paths when the provider base URL contains /v1/ as an internal path segment (e.g. https://api.deepinfra.com/v1/openai) — without this fix, requests were routed to .../v1/openai/v1/chat/completions (404) instead of .../v1/openai/chat/completions
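
To illustrate the failure mode in the last item: the doubled version segment falls out of plain string concatenation. The sketch below is not the OpenShell `build_backend_url` implementation; `naive_join`, `strip_v1_prefix`, and `join_without_request_v1` are hypothetical helpers used only to reproduce the two URLs quoted above.

```rust
/// Hypothetical helper: joins a provider base URL and a request path with
/// no version handling at all.
fn naive_join(base: &str, request_path: &str) -> String {
    format!("{}{}", base.trim_end_matches('/'), request_path)
}

/// Hypothetical helper: removes a leading `/v1` segment from a request path.
/// Paths such as `/v1beta/...` are left untouched because `/v1` must be a
/// whole segment.
fn strip_v1_prefix(request_path: &str) -> &str {
    match request_path.strip_prefix("/v1") {
        Some(rest) if rest.is_empty() || rest.starts_with('/') => rest,
        _ => request_path,
    }
}

/// Hypothetical helper: the join this base URL needs, with the request
/// path's `/v1` removed first.
fn join_without_request_v1(base: &str, request_path: &str) -> String {
    format!("{}{}", base.trim_end_matches('/'), strip_v1_prefix(request_path))
}
```

With base `https://api.deepinfra.com/v1/openai` and request path `/v1/chat/completions`, `naive_join` produces the 404 URL and `join_without_request_v1` produces the intended one.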

Related Issue

N/A

Changes

providers/deepinfra.yaml — new built-in Providers v2 profile (inference category, api.deepinfra.com:443, Bearer auth, DEEPINFRA_API_KEY)
crates/openshell-core/src/inference.rs — DEEPINFRA_PROFILE, normalization, profile_for entries + tests; deepinfra added to openai_compatible_profiles_include_embeddings
crates/openshell-router/src/backend.rs — URL construction fix narrowed to path-rooted /v1 check; regression test for nested proxy path added
crates/openshell-providers/src/profiles.rs — registration of deepinfra.yaml in built-in profile catalog
crates/openshell-providers/src/providers/deepinfra.rs — new provider discovery plugin (DEEPINFRA_API_KEY, discovery test)
crates/openshell-providers/src/providers/mod.rs — module declaration for deepinfra
crates/openshell-providers/src/lib.rs — registers deepinfra in ProviderRegistry so known_types() and TUI include it
crates/openshell-server/src/inference.rs — adds deepinfra to unsupported-type error message
docs/sandboxes/providers-v2.mdx — DeepInfra row in built-in profiles table
docs/sandboxes/manage-providers.mdx — DeepInfra rows (provider types + inference providers); removes old v1 workaround row that used openai type with OPENAI_API_KEY
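
For readers unfamiliar with environment-based credential discovery, a rough sketch of the idea follows. Only the variable name `DEEPINFRA_API_KEY` comes from this PR; `discover_credential` and `discover_from_env` are hypothetical names and do not reflect the actual plugin interface in `openshell-providers`.

```rust
use std::env;

/// Environment variable named in the PR description.
const DEEPINFRA_API_KEY: &str = "DEEPINFRA_API_KEY";

/// Hypothetical discovery routine: returns the credential when the lookup
/// yields a non-empty value. The lookup is injected so that tests do not
/// need to mutate the process environment.
fn discover_credential<F>(lookup: F) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    lookup(DEEPINFRA_API_KEY)
        .map(|value| value.trim().to_string())
        .filter(|value| !value.is_empty())
}

/// Hypothetical convenience wrapper that reads the real process environment.
fn discover_from_env() -> Option<String> {
    discover_credential(|name| env::var(name).ok())
}
```

Injecting the lookup keeps a discovery test deterministic; whether the real plugin is structured this way is an assumption.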

Testing

mise run pre-commit passes (rust, helm, markdown, license; python:proto is a pre-existing failure unrelated to this PR)
294 Rust unit tests pass across openshell-core, openshell-providers, openshell-router (cargo test -p openshell-core -p openshell-providers -p openshell-router)
openshell provider list-profiles shows deepinfra in INFERENCE section
openshell provider create --name di --type deepinfra --from-existing discovers DEEPINFRA_API_KEY
openshell inference set --provider di --model <model> --no-verify configures route
curl https://inference.local/v1/chat/completions from inside sandbox returns a valid completion from DeepInfra

Unit test results

test result: ok. 176 passed; 0 failed; 0 ignored  (openshell-core)
test result: ok. 47 passed;  0 failed; 0 ignored  (openshell-providers)
test result: ok. 54 passed;  0 failed; 0 ignored  (openshell-router)
test result: ok. 17 passed;  0 failed; 0 ignored  (openshell-router integration)

Includes inference::tests::profile_for_deepinfra, inference::tests::openai_compatible_profiles_include_embeddings (covers deepinfra), backend::tests::build_backend_url_dedupes_v1_for_base_with_v1_subpath, backend::tests::build_backend_url_preserves_v1_for_nested_proxy_path, and providers::deepinfra::tests::discovers_deepinfra_env_credentials.

Checklist

Follows Conventional Commits
Commits are signed off (DCO)
Architecture docs updated (docs/sandboxes/providers-v2.mdx, docs/sandboxes/manage-providers.mdx)

…nly)

- Adds `deepinfra` as a built-in Providers v2 profile (`providers/deepinfra.yaml`) with inference category, Bearer auth, and `DEEPINFRA_API_KEY` discovery
- Adds `DEEPINFRA_PROFILE` to inference routing so `inference.local` works with the `deepinfra` provider type
- Fixes `build_backend_url` to strip `/v1` from request paths when the base URL contains `/v1/` as an internal segment (e.g. `api.deepinfra.com/v1/openai`), preventing double-versioned paths like `.../v1/openai/v1/chat/completions`
- Updates `docs/sandboxes/providers-v2.mdx` and `docs/sandboxes/manage-providers.mdx` with DeepInfra entries; removes the old v1 workaround row that used `openai` type with `OPENAI_API_KEY`

Signed-off-by: Milos Milutinovic <codemastermilos@gmail.com>

copy-pr-bot · 2026-06-14T16:48:40Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

johntmyers · 2026-06-15T18:05:28Z

gator-agent

PR Review Status

Validation: This PR is project-valid for OpenShell because it is concentrated Providers v2/inference work with a clear user path, dedicated DeepInfra credentials, provider policy metadata, tests, and docs updates.
Head SHA: b9a714f4685e3d062f3b8d31acb45e7b22cffb00

Review findings:

crates/openshell-router/src/backend.rs: the new /v1 dedupe check now matches any base URL containing /v1/. That can regress custom/proxy endpoints such as https://proxy.example/api/v1/openai, where /v1/chat/completions may intentionally need to remain appended. Please narrow the check to the intended cases, such as a base path that starts with v1 or ends with v1, and add a regression test that preserves /v1/chat/completions for a nested proxy path.
crates/openshell-providers/src/lib.rs / crates/openshell-providers/src/providers/mod.rs: DeepInfra is added to the built-in profile catalog, but not to the provider discovery plugin registry. ProviderRegistry::known_types() will omit deepinfra, so the TUI create-provider modal will not list it, and legacy --from-existing fallback discovery will report it as unsupported. Please add a deepinfra provider plugin with DEEPINFRA_API_KEY, register it, and include the standard discovery test.
crates/openshell-server/src/inference.rs: the unsupported-inference-provider error still lists openai, anthropic, nvidia, google-vertex-ai; please add deepinfra so users get accurate debugging guidance.
crates/openshell-core/src/inference.rs: please include deepinfra in the openai_compatible_profiles_include_embeddings test so the OpenAI-compatible protocol contract is locked in.
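
The first finding can be reproduced in a few lines. This is a sketch under assumptions: the `contains("/v1/")` condition is taken from the review thread, while `strip_v1_prefix` and `build_with_contains_check` are hypothetical stand-ins for the surrounding router code.

```rust
/// Hypothetical helper: removes a leading `/v1` segment from a request path.
fn strip_v1_prefix(request_path: &str) -> &str {
    match request_path.strip_prefix("/v1") {
        Some(rest) if rest.is_empty() || rest.starts_with('/') => rest,
        _ => request_path,
    }
}

/// Hypothetical stand-in for the reviewed revision: dedupe whenever the
/// base URL contains `/v1/` anywhere.
fn build_with_contains_check(base: &str, request_path: &str) -> String {
    let path = if base.contains("/v1/") {
        strip_v1_prefix(request_path)
    } else {
        request_path
    };
    format!("{}{}", base.trim_end_matches('/'), path)
}
```

Under this check, `https://proxy.example/api/v1/openai` loses the request path's `/v1` just as the DeepInfra base does, which is the regression the finding describes.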

Docs: Updated on existing Fern pages under docs/; no docs/index.yml navigation change is needed.

Next state: gator:in-review

- Narrow build_backend_url /v1 dedupe to URLs whose path component is exactly /v1 or starts with /v1/, which prevents regression on proxy endpoints where /v1 is buried deeper (e.g. /api/v1/openai); add regression test for the nested proxy path case
- Add deepinfra provider plugin with DEEPINFRA_API_KEY discovery, registered in ProviderRegistry so known_types() and TUI include it
- Add deepinfra to unsupported-inference-provider error message in openshell-server for accurate user-facing debugging guidance
- Add deepinfra to openai_compatible_profiles_include_embeddings test to lock in the OpenAI-compatible protocol contract

Signed-off-by: Milos Milutinovic <codemastermilos@gmail.com>

mmilutinovic371 · 2026-06-15T18:47:23Z

All four gator review findings addressed in 1c3fc5b:

build_backend_url /v1 dedupe narrowed — changed from base.contains("/v1/") to extracting the URL's path component and checking path == "/v1" || path.starts_with("/v1/"). This preserves the full path for proxy endpoints like https://proxy.example/api/v1/openai (path is /api/v1/..., not rooted at /v1) while still deduplicating for DeepInfra (/v1/openai) and OpenAI/Nvidia (/v1). Regression test build_backend_url_preserves_v1_for_nested_proxy_path added.
DeepInfra provider plugin added — new providers/deepinfra.rs with DEEPINFRA_API_KEY discovery spec, registered in ProviderRegistry::new() and providers/mod.rs. known_types() now returns deepinfra; discovery test included.
Error message updated — deepinfra added to the supported providers list in openshell-server/src/inference.rs.
Embeddings test extended — deepinfra added to openai_compatible_profiles_include_embeddings in openshell-core/src/inference.rs.
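
A minimal sketch of the path-rooted check described in the first item, assuming a hand-rolled `url_path` helper in place of whatever URL parsing the router actually uses. Both function names are hypothetical; only the condition `path == "/v1" || path.starts_with("/v1/")` comes from the comment above.

```rust
/// Hypothetical stand-in for a URL parser: returns the path component of an
/// absolute URL such as `https://host/a/b`, or "/" when there is none.
/// Query strings and fragments are not handled in this sketch.
fn url_path(base: &str) -> &str {
    let after_scheme = base.split_once("://").map_or(base, |(_, rest)| rest);
    match after_scheme.find('/') {
        Some(idx) => &after_scheme[idx..],
        None => "/",
    }
}

/// The path-rooted condition quoted in the comment above.
fn is_rooted_at_v1(base: &str) -> bool {
    let path = url_path(base);
    path == "/v1" || path.starts_with("/v1/")
}
```

With this condition, `https://api.deepinfra.com/v1/openai` and `https://api.openai.com/v1` match, while `https://proxy.example/api/v1/openai` does not.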

johntmyers · 2026-06-15T18:56:00Z

gator-agent

Re-check After Author Update

I re-evaluated latest head 1c3fc5bd47f91f22038eddafb49026ae675f9295 after @mmilutinovic371's 2026-06-15 18:47 UTC comment saying the prior gator findings were addressed.

Disposition: partially resolved.

Remaining items:

crates/openshell-router/src/backend.rs: the updated /v1 dedupe logic now preserves nested proxy paths such as https://proxy.example/api/v1/openai, but it regresses documented OpenAI-compatible endpoints whose base URL ends in /v1 below a non-root prefix, for example https://api.groq.com/openai/v1. Those would route to .../openai/v1/v1/chat/completions. Please dedupe when the base URL path has v1 as the first or final path segment, while still preserving the new nested proxy case, and add a regression test for https://api.groq.com/openai/v1 plus /v1/chat/completions.

Resolved from the prior review:

DeepInfra provider discovery is registered with DEEPINFRA_API_KEY.
The unsupported inference provider error includes deepinfra.
The OpenAI-compatible embeddings protocol test includes deepinfra.
Docs were updated on the existing Fern provider pages; no navigation change is needed.

Next state: gator:in-review

Extends the /v1 deduplication logic to also strip /v1 from request paths when the base URL's path ends with /v1 (e.g. https://api.groq.com/openai/v1). The previous fix only matched paths starting with /v1/, which regressed providers like Groq whose base path has /v1 as the last segment rather than the first. The nested-proxy exclusion (e.g. /api/v1/openai) is preserved since /v1 appears in the middle, as neither the first nor the last segment. Adds a regression test for the Groq-style base URL.

Signed-off-by: Milos Milutinovic <codemastermilos@gmail.com>

mmilutinovic371 · 2026-06-15T18:59:39Z

Addressed in ce0ccda.

The root cause: the path-rooted check (starts_with("/v1/")) only matched providers where /v1 is the first path segment. Groq-style base URLs (/openai/v1) have /v1 as the last segment, so dedup was skipped and /v1 doubled.

Fix: dedup when the base URL's path starts with /v1/ (deepinfra: /v1/openai) or ends with /v1 (openai/nvidia: /v1, groq: /openai/v1). The nested-proxy exclusion (/api/v1/openai) still holds — /v1 there is neither first nor last segment so neither condition fires.

Regression test build_backend_url_dedupes_v1_for_base_ending_with_v1 added for https://api.groq.com/openai/v1 + /v1/chat/completions → https://api.groq.com/openai/v1/chat/completions.
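
The first-or-last-segment rule can be sketched as follows. This is an illustration under assumptions, not the code in `crates/openshell-router/src/backend.rs`: `url_path`, `strip_v1_prefix`, `should_dedupe_v1`, and `build_backend_url_sketch` are hypothetical names, and URL parsing is done by hand rather than with a URL library.

```rust
/// Hypothetical stand-in for a URL parser: returns the path component of an
/// absolute URL, or "/" when there is none. Queries/fragments are ignored.
fn url_path(base: &str) -> &str {
    let after_scheme = base.split_once("://").map_or(base, |(_, rest)| rest);
    match after_scheme.find('/') {
        Some(idx) => &after_scheme[idx..],
        None => "/",
    }
}

/// Hypothetical helper: removes a leading `/v1` segment from a request path.
fn strip_v1_prefix(request_path: &str) -> &str {
    match request_path.strip_prefix("/v1") {
        Some(rest) if rest.is_empty() || rest.starts_with('/') => rest,
        _ => request_path,
    }
}

/// Dedupe when `v1` is the first or the last segment of the base path.
fn should_dedupe_v1(base: &str) -> bool {
    let path = url_path(base).trim_end_matches('/');
    path.starts_with("/v1/") || path.ends_with("/v1")
}

/// Hypothetical sketch of the resulting URL construction.
fn build_backend_url_sketch(base: &str, request_path: &str) -> String {
    let path = if should_dedupe_v1(base) {
        strip_v1_prefix(request_path)
    } else {
        request_path
    };
    format!("{}{}", base.trim_end_matches('/'), path)
}
```

For the three base URLs discussed in this thread, the sketch dedupes for `/v1/openai` and `/openai/v1` and leaves `/api/v1/openai` alone, so the nested proxy case still receives `/v1/chat/completions` appended in full.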

johntmyers · 2026-06-15T19:17:10Z

gator-agent

Re-check After Author Update

I re-evaluated latest head ce0ccda6aa14b8ee33e1fa0c9709e7695bbe4df1 after @mmilutinovic371's 2026-06-15 18:59 UTC comment saying the remaining /v1 routing feedback was addressed.

Disposition: partially resolved.

Resolved items:

The DeepInfra and Groq-style /v1 dedupe cases are now covered while preserving the nested proxy-path case.
DeepInfra provider discovery is registered with DEEPINFRA_API_KEY.
The unsupported inference provider error includes deepinfra.
The OpenAI-compatible embeddings protocol test includes deepinfra.
Fern docs under docs/ were updated; no docs/index.yml navigation change is needed.

Remaining items:

crates/openshell-server/src/grpc/provider.rs: list_provider_profiles_returns_built_in_profile_categories still expects the old built-in profile inventory. Because providers/deepinfra.yaml is now included in the default profile catalog, this assertion should include deepinfra between cursor and github; otherwise CI should fail when that test runs.
crates/openshell-server/src/grpc/provider.rs / crates/openshell-core/src/telemetry.rs: DeepInfra now normalizes as a built-in provider, but provider lifecycle telemetry still falls through to Custom. Please either add a deepinfra telemetry bucket and mapping, or add a test/comment that documents why this built-in provider intentionally remains in the custom bucket.

Non-blocking docs note:

architecture/gateway.md still lists supported cluster inference providers as openai, anthropic, nvidia, and google-vertex-ai; please update that overview if architecture docs are expected to stay current with this provider addition.

Next state: gator:in-review

…t test

- Add DeepInfra variant to ProviderProfile telemetry enum and from_raw() mapping so deepinfra providers are tracked in their own bucket rather than falling through to Custom
- Map deepinfra in telemetry_provider_profile() in openshell-server
- Add deepinfra to list_provider_profiles_returns_built_in_profile_categories test (sorted between cursor and github)
- Update architecture/gateway.md inference provider list to include deepinfra

Signed-off-by: Milos Milutinovic <codemastermilos@gmail.com>

mmilutinovic371 · 2026-06-15T19:33:19Z

Addressed in d4e5e98:

Telemetry: Added Deepinfra variant to ProviderProfile enum in openshell-core/src/telemetry.rs with as_str() → "deepinfra" and from_raw("deepinfra") → Deepinfra. Mapped it in telemetry_provider_profile() in openshell-server so deepinfra providers get their own bucket rather than falling through to Custom.
Profile list test: Added "deepinfra" between "cursor" and "github" in list_provider_profiles_returns_built_in_profile_categories.
Architecture docs: Updated architecture/gateway.md to include deepinfra in the supported cluster inference provider types.
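
The shape of the telemetry mapping can be sketched roughly as below. The names `ProviderProfile`, `Deepinfra`, `Custom`, `as_str()`, and `from_raw()` come from the comment above; the `Openai` variant, the `"custom"` and `"openai"` strings, and the exact signatures are assumptions made for illustration and may differ from `openshell-core/src/telemetry.rs`.

```rust
/// Sketch of a telemetry bucket enum; the real enum has more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ProviderProfile {
    Openai, // assumed variant, included only to give the match a second arm
    Deepinfra,
    Custom,
}

impl ProviderProfile {
    /// Stable string used as the telemetry label.
    fn as_str(self) -> &'static str {
        match self {
            ProviderProfile::Openai => "openai",
            ProviderProfile::Deepinfra => "deepinfra",
            ProviderProfile::Custom => "custom",
        }
    }

    /// Maps a raw provider type to its bucket; unknown types fall through
    /// to `Custom`, which is where deepinfra landed before this change.
    fn from_raw(raw: &str) -> Self {
        match raw {
            "openai" => ProviderProfile::Openai,
            "deepinfra" => ProviderProfile::Deepinfra,
            _ => ProviderProfile::Custom,
        }
    }
}
```

A focused assertion in this style would check that `from_raw("deepinfra")` yields the `Deepinfra` bucket and that its label round-trips through `as_str()`.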

johntmyers · 2026-06-15T19:36:10Z

gator-agent

Re-check After Author Update

I re-evaluated latest head d4e5e980c697a45ae1023af98c7bf058fc717464 after @mmilutinovic371's 2026-06-15 19:33 UTC comment saying the remaining telemetry/profile-list feedback was addressed.

Disposition: resolved.

Remaining items:

No blocking items remain.

Review: The independent reviewer found no blocking findings at this head. The only follow-up was a non-blocking suggestion to add a focused DeepInfra telemetry assertion.

Docs: Fern docs under docs/sandboxes/ and the architecture overview were updated; no docs/index.yml navigation change is needed.

E2E: I applied test:e2e because this touches provider credential discovery and inference routing behavior.

Next state: gator:watch-pipeline

johntmyers · 2026-06-15T19:36:17Z

/ok to test d4e5e98

github-actions · 2026-06-15T19:36:23Z

Label test:e2e applied for d4e5e98. Open Branch E2E Checks, find the run for commit d4e5e98, and click Re-run all jobs to execute with the label set. The run will execute the standard E2E suite after building the required gateway and supervisor images once. The matching required CI gate status on this PR will flip green automatically once the run finishes.

johntmyers · 2026-06-15T19:53:17Z

gator-agent

Pipeline Failure

Head SHA: d4e5e980c697a45ae1023af98c7bf058fc717464

OpenShell / Branch Checks failed because both Rust jobs failed cargo fmt --all -- --check in crates/openshell-router/src/backend.rs around the new Groq-style /v1 regression test assertion.

Next action: @mmilutinovic371, please run cargo fmt --all or otherwise apply the formatter output, push the formatting-only fix, and the pipeline can be rechecked. The test:e2e label remains appropriate, but gator is moving this back to gator:in-review until the required branch check is green.

mmilutinovic371 requested review from a team, derekwaynecarr and mrunalp as code owners June 14, 2026 16:48

mmilutinovic371 mentioned this pull request Jun 14, 2026

feat(providers): add DeepInfra as a built-in inference provider #1773

Closed

9 tasks

johntmyers self-assigned this Jun 15, 2026

johntmyers added the gator:in-review Gator is reviewing or awaiting PR review feedback label Jun 15, 2026

johntmyers added test:e2e Requires end-to-end coverage gator:watch-pipeline Gator is monitoring PR CI/CD status and removed gator:in-review Gator is reviewing or awaiting PR review feedback labels Jun 15, 2026

johntmyers added gator:in-review Gator is reviewing or awaiting PR review feedback and removed gator:watch-pipeline Gator is monitoring PR CI/CD status labels Jun 15, 2026

mmilutinovic371 wants to merge 4 commits into NVIDIA:main from mmilutinovic371:feat/providers-deepinfra-v2-only
