Migrate OpenAI provider to Responses API #50

Merged
msmakouz merged 14 commits into master from bugfix/thinking-effort on Apr 30, 2026

msmakouz (Contributor) commented on Apr 28, 2026

OpenAI provider on Responses API; split out OpenAI-compatible backends; rework error handling

Requests to GPT-5.4 that combine tools with reasoning_effort are rejected on /v1/chat/completions; OpenAI now directs that combination to /v1/responses. Rather than branch the OpenAI provider per model family, this PR moves wippy.llm.openai entirely onto the Responses API and pulls everything that still needs Chat Completions (Ollama, vLLM, llama.cpp, OpenRouter, Together, Groq, Fireworks, Mistral, etc.) into a separate wippy.llm.openai_compat provider.

Error handling across all providers (OpenAI, OpenAI-compatible, Claude, Bedrock, Google) was reworked.

Provider split

  • wippy.llm.openai → /v1/responses only. Required for tools + reasoning_effort on GPT-5.4; also unlocks previous_response_id, encrypted reasoning persistence, minimal / xhigh effort, and the new SSE event model.
  • wippy.llm.openai_compat → /v1/chat/completions, with its own OPENAI_COMPAT_* env vars. Drop-in for any OpenAI-compatible endpoint. Keeps OpenRouter reasoning_details passthrough and the full sampling parameter set (frequency_penalty, presence_penalty, stop, seed).

Embeddings keep their own endpoint and are unaffected.

Responses API mapping

  • messages → input[] items; system / developer extracted into a top-level instructions field.
  • function_call and function_call_output are now top-level items rather than nested under an assistant message.
  • Tool definitions carry name and parameters at the top level; strict defaults to false to keep existing schemas working.
  • max_tokens → max_output_tokens; reasoning_effort → reasoning.effort with the full range (minimal / low / medium / high / xhigh).
  • Structured output uses text.format.json_schema instead of response_format.
  • SSE parser rewritten for named events: response.output_text.delta, response.function_call_arguments.{delta,done}, response.reasoning_summary_text.delta, response.completed, response.failed.
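
The request-side mapping listed above can be sketched roughly as follows. This is an illustration only, not the mapper shipped in this PR: the function name to_responses_request, the shape of the incoming req table (Chat Completions-style messages, tool_calls, and tools), and the choice to join multiple system / developer messages with a blank line are all assumptions made for the example.

```lua
-- Hypothetical helper, not the PR's actual mapper code.
-- Converts a Chat Completions-style request table into a Responses-style body.
local function to_responses_request(req)
  local instructions = {}
  local input = {}

  for _, msg in ipairs(req.messages or {}) do
    if msg.role == "system" or msg.role == "developer" then
      -- system / developer text moves to the top-level instructions field
      instructions[#instructions + 1] = msg.content
    elseif msg.role == "tool" then
      -- tool results become top-level function_call_output items
      input[#input + 1] = {
        type = "function_call_output",
        call_id = msg.tool_call_id,
        output = msg.content,
      }
    else
      if msg.content ~= nil and msg.content ~= "" then
        input[#input + 1] = { role = msg.role, content = msg.content }
      end
      -- assistant tool calls become top-level function_call items
      for _, call in ipairs(msg.tool_calls or {}) do
        input[#input + 1] = {
          type = "function_call",
          call_id = call.id,
          name = call["function"].name,
          arguments = call["function"].arguments,
        }
      end
    end
  end

  local body = { model = req.model, input = input }
  if #instructions > 0 then
    -- assumption: multiple instruction messages are joined with a blank line
    body.instructions = table.concat(instructions, "\n\n")
  end
  if req.max_tokens then
    body.max_output_tokens = req.max_tokens
  end
  if req.reasoning_effort then
    body.reasoning = { effort = req.reasoning_effort }
  end
  if req.tools then
    body.tools = {}
    for i, tool in ipairs(req.tools) do
      local fn = tool["function"]
      -- name and parameters sit at the top level of each tool definition
      body.tools[i] = {
        type = "function",
        name = fn.name,
        description = fn.description,
        parameters = fn.parameters,
        strict = false, -- default described above
      }
    end
  end
  return body
end
```

Structured output and the SSE event handling are left out of the sketch to keep it short.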

Error handling

A single fluent path replaces output.to_structured_error(...) and per-mapper map_error_response:

local err = output.errors.generate(contract_args):classifier(mapper.classify_error)

-- HTTP / transport error:
return nil, err:from(http_err):build()

-- inline error:
return nil, err:kind(K):message(M):details(D):build()
  • mapper.classify_error(http_err) -> (kind, message, details) is now a pure function in every mapper. No errors.new(...) calls outside output.lua.
  • output.errors.<op>(contract_args) reads _provider_id and model directly from contract_args — handlers no longer pass literal strings.
  • build_error in output.lua is the single observability point: logger:named("llm"):error(...) for every provider error, with kind, retryable, provider, operation, model, and the calling user (resolved automatically from security.actor()).
  • ERROR_KIND_MAP uses stdlib errors.RATE_LIMITED / errors.UNAVAILABLE / etc. instead of hardcoded kind strings.
  • Provider mappers and handlers return (nil, structured_error) on the error path; the public llm.* API still returns (result, err_string) via a format_error shim.
  • Type aliases (ErrorBuilder, ClassifyError, ErrorContract, ErrorBuilderFactory) added so the linter follows the fluent chain.
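
To make the shape of the fluent path concrete, here is a minimal stand-in written for illustration. It is not the code in output.lua: new_builder, the status_code field on http_err, the plain-string kinds (the PR uses stdlib constants such as errors.RATE_LIMITED), the "UNKNOWN" fallback, and the string layout produced by format_error are all assumptions. The real structured error also exposes err:kind() / err:message() accessors, whereas this sketch returns a plain table.

```lua
-- Illustrative only: a toy version of the builder chain described above.

-- Pure classifier: maps a transport error to (kind, message, details).
local function classify_error(http_err)
  local status = http_err.status_code -- hypothetical field name
  local kind = "UNKNOWN"              -- placeholder fallback kind
  if status == 429 then
    kind = "RATE_LIMITED"
  elseif status and status >= 500 then
    kind = "UNAVAILABLE"
  end
  return kind, http_err.message or "request failed", { status_code = status }
end

local Builder = {}
Builder.__index = Builder

-- Provider and model come from contract_args, so handlers pass no literals.
local function new_builder(operation, contract_args)
  return setmetatable({
    _operation = operation,
    _provider = contract_args._provider_id,
    _model = contract_args.model,
  }, Builder)
end

function Builder:classifier(fn) self._classify = fn; return self end
function Builder:kind(k) self._kind = k; return self end
function Builder:message(m) self._message = m; return self end
function Builder:details(d) self._details = d; return self end

-- HTTP / transport path: delegate to the mapper's classifier.
function Builder:from(http_err)
  self._kind, self._message, self._details = self._classify(http_err)
  return self
end

-- Single exit point; a real implementation would also log here.
function Builder:build()
  return {
    kind = self._kind,
    message = self._message,
    details = self._details,
    provider = self._provider,
    operation = self._operation,
    model = self._model,
  }
end

-- Shim in the spirit of format_error: structured error -> string.
local function format_error(err)
  return string.format("[%s] %s/%s: %s", err.kind, err.provider, err.model, err.message)
end
```

Keeping classify_error free of side effects means it can be unit-tested with plain tables, while logging and error construction stay in one place.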

Backwards compatibility

  • Public llm.generate / structured_output / embed / status API is unchanged.
  • Migration required: anyone using wippy.llm.openai with OPENAI_BASE_URL pointed at Ollama, vLLM, OpenRouter, Together, etc. needs to switch the provider to wippy.llm.openai_compat:provider and use OPENAI_COMPAT_BASE_URL. Same wire format as before for those backends.

Tests

Mapper unit tests cover input mapping, tool format, options (including minimal / xhigh effort), output items, refusal, incomplete→length, encrypted_reasoning round-trip, and error classification. Handler tests adapted to (response, err). SSE stream tests rewritten to the named-event format. Status-handler tests are unchanged — those still use the data-driven { success, status, message } shape on purpose.
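
For readers unfamiliar with the named-event format, the sketch below shows how such a stream could be split into events in a test. It is a simplified illustration, not the parser from this PR: parse_sse and dispatch are hypothetical names, and the JSON payloads in any fixture fed to it would be trimmed-down stand-ins for real Responses API events.

```lua
-- Hypothetical sketch: split a raw SSE payload into { event, data } records.
local function parse_sse(raw)
  local events = {}
  local current = { data = {} }
  raw = raw:gsub("\r\n", "\n")
  -- the trailing blank line guarantees the last event is flushed
  for line in (raw .. "\n\n"):gmatch("(.-)\n") do
    if line == "" then
      -- a blank line terminates the current event
      if current.event or #current.data > 0 then
        events[#events + 1] = {
          event = current.event or "message",
          data = table.concat(current.data, "\n"),
        }
      end
      current = { data = {} }
    else
      -- comment lines (starting with ":") do not match and are ignored
      local field, value = line:match("^([^:]+):%s?(.*)$")
      if field == "event" then
        current.event = value
      elseif field == "data" then
        current.data[#current.data + 1] = value
      end
    end
  end
  return events
end

-- Route each event to a handler keyed by its event name.
local function dispatch(events, handlers)
  for _, ev in ipairs(events) do
    local handler = handlers[ev.event]
    if handler then
      handler(ev.data)
    end
  end
end
```

A test can then register handlers for names such as response.output_text.delta and response.completed and assert on the order in which they fire.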

output.to_structured_error and mapper.map_error_response are gone from production code; integration tests' error-path assertions migrated to err:kind() / err:message().

msmakouz requested a review from wolfy-j on April 28, 2026 20:02
msmakouz self-assigned this on Apr 28, 2026
msmakouz merged commit 3fb273b into master on Apr 30, 2026
18 checks passed
msmakouz deleted the bugfix/thinking-effort branch on April 30, 2026 07:22