feat(group): add a force fast mode toggle to codex groups (bypassing the client-side anthropic-beta limitation) #1654
HaoYan-A wants to merge 1 commit into Wei-Shaw:main
Introduce a group-level "force fast mode" toggle that, when enabled on an openai (codex) group, unconditionally rewrites the upstream request body to service_tier="priority", regardless of the client's inbound value or the anthropic-beta: fast-mode-* header. Applies to all three forward paths: /v1/responses, /v1/chat/completions, and Claude /v1/messages (including the OpenAI passthrough branch).

The design reuses applyCodexOAuthTransform's call sites rather than changing its signature (which would touch 20+ existing tests). A single helper, getForceFastModeFromContext, reads the already-loaded apiKey.Group.ForceFastMode from the gin context, so handlers need no changes. Billing automatically flows through the existing service_tier path and charges at priority pricing.

Notable subtleties:
- The api_key eager-load in api_key_repo uses an explicit Select field allowlist, so the new column must be added there (otherwise ent reads false even when the DB column is true).
- The api_key auth cache snapshot serializes/deserializes through a separate DTO, also updated with the new field.
- omitempty on the JSON field makes the addition backward-compatible: old sub2api versions sharing the same Redis simply read false for the missing key, and new versions reading old cached snapshots behave correctly.

Verified end-to-end against a real codex group:
- off -> usage_logs.service_tier empty, baseline cost
- on -> usage_logs.service_tier=priority, ~1.5x cost (gpt-5.4 priority)
- Covers the /v1/chat/completions and /v1/messages paths
- An 8-case unit test in openai_gateway_service_codex_cli_only_test.go covers every nil/type-mismatch/non-openai/off/on branch of the helper.

Migration numbered 903 (test-space) to avoid colliding with main-branch additions in the 10x range; rename to an appropriate number before any upstream PR.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Background

The project already supports automatic Claude-protocol → Codex Fast mapping: when a Claude Code client sends a request to /v1/messages carrying the anthropic-beta: fast-mode-2026-02-01 header, ForwardAsAnthropic sets responsesReq.ServiceTier to "priority" (openai_gateway_messages.go:58-61). In practice, however, clients cannot reliably send this header, so the existing chain never triggers: even when a user wants the fast (priority) tier, the whole "Claude → Codex Fast" feature is unusable in real scenarios.
Approach

Add an admin-level toggle, force_fast_mode, on openai-type groups. When enabled, every request handled by the group (whether it arrives via /v1/responses, /v1/chat/completions, or /v1/messages) is forcibly written with service_tier="priority", unconditionally overriding the client's inbound service_tier and any anthropic-beta: fast-mode-* header. This is a group-level override fallback: the admin configures it once, and all API keys under the group automatically use the fast tier, regardless of whether the client can send the header.
Design notes

Single injection entry point, no signature changes

All three forward paths (Forward / ForwardAsAnthropic / ForwardAsChatCompletions) call applyCodexOAuthTransform in their OAuth branch. To avoid changing applyCodexOAuthTransform's signature (which would touch 20+ existing test stubs), a new helper, getForceFastModeFromContext, is added. It reuses the apiKey.Group that middleware has already eager-loaded into the gin context, so the handler layer needs no changes at all.

Three single-point injections:
- openai_gateway_service.go Forward (/v1/responses, including the passthrough and WSv2 branches): set reqBody["service_tier"] = "priority" after applyCodexOAuthTransform.
- openai_gateway_messages.go ForwardAsAnthropic (Claude /v1/messages): extend the existing BetaFastMode check to force_fast_mode || beta.
- openai_gateway_chat_completions.go ForwardAsChatCompletions (/v1/chat/completions): set both responsesReq.ServiceTier and the body bytes before the final marshal of responsesBody.
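A minimal, self-contained sketch of the helper's branch logic. The real getForceFastModeFromContext reads from *gin.Context and the ent-generated Group model; the context key "api_key", the stand-in types, and mapCtx below are illustrative assumptions, not the project's actual definitions.

```go
package main

import "fmt"

// Stand-ins for the real ent models (field names are assumptions).
type Group struct {
	Platform      string
	ForceFastMode bool
}

type APIKey struct {
	Group *Group
}

// ctxGetter models the one gin.Context method the helper needs.
type ctxGetter interface {
	Get(key string) (any, bool)
}

type mapCtx map[string]any

func (m mapCtx) Get(key string) (any, bool) { v, ok := m[key]; return v, ok }

// getForceFastModeFromContext mirrors the branch structure described in the PR:
// every nil / missing-key / type-mismatch / non-openai / off case returns false.
func getForceFastModeFromContext(c ctxGetter) bool {
	if c == nil {
		return false
	}
	v, ok := c.Get("api_key")
	if !ok {
		return false
	}
	apiKey, ok := v.(*APIKey)
	if !ok || apiKey == nil || apiKey.Group == nil {
		return false
	}
	if apiKey.Group.Platform != "openai" {
		return false
	}
	return apiKey.Group.ForceFastMode
}

func main() {
	on := mapCtx{"api_key": &APIKey{Group: &Group{Platform: "openai", ForceFastMode: true}}}
	off := mapCtx{"api_key": &APIKey{Group: &Group{Platform: "anthropic", ForceFastMode: true}}}
	fmt.Println(getForceFastModeFromContext(on), getForceFastModeFromContext(off)) // true false
}
```

Because every failure mode collapses to false, the helper is safe to call unconditionally at each of the three injection points.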
Billing follows automatically

No billing code needs touching. extractOpenAIServiceTier(reqBody) already reads service_tier from the final body, and CalculateCostWithServiceTier bills at the priority tier (already covered by billing_service_test.go:503).
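To illustrate why billing needs no changes, here is a hedged sketch of what reading service_tier from the final body looks like; extractServiceTier is a hypothetical stand-in, not the project's actual extractOpenAIServiceTier implementation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// extractServiceTier reads service_tier from the final (post-injection) request
// body, returning "" when the key is absent or not a string.
func extractServiceTier(body []byte) string {
	var m map[string]any
	if err := json.Unmarshal(body, &m); err != nil {
		return ""
	}
	tier, _ := m["service_tier"].(string)
	return tier
}

func main() {
	// After the force-fast injection, the body carries the priority tier,
	// so billing picks it up with no extra wiring.
	fmt.Println(extractServiceTier([]byte(`{"model":"gpt-5","service_tier":"priority"}`))) // priority
	fmt.Println(extractServiceTier([]byte(`{"model":"gpt-5"}`)) == "")                     // true
}
```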
api_key_auth_cache updated in sync

APIKeyAuthGroupSnapshot (api_key_auth_cache.go) must serialize the new field as well, otherwise the Group read back from the cache is always the default false. The field uses omitempty, so an old version reading a snapshot without the field defaults to false — backward compatible.
api_key_repo Select allowlist

GetByKeyForAuth in api_key_repo.go uses an explicit q.Select(...) field allowlist. group.FieldForceFastMode must be added to it, otherwise ent does not SELECT the column and the struct field reads false even when the DB value is true (already hit this pitfall once).
Backward compatibility

Migration 107 adds the column force_fast_mode BOOLEAN NOT NULL DEFAULT false. Old code does not crash on the new column; it simply never reads it. The APIKeyAuthGroupSnapshot JSON field uses omitempty, so mixed old/new deployments sharing the same Redis are safe for reads and writes in both directions. With force_fast_mode off, the original chain is fully preserved; when on, the client's explicit value is overridden, which is the intended behavior.
Testing

- TestGetForceFastModeFromContext covers every branch: nil context, missing api_key, wrong type, nil APIKey, nil Group, toggle off, non-openai platform, and openai with the toggle on.
- go test -tags=unit ./internal/service/... ./internal/handler/... ./internal/repository/... all pass.
- golangci-lint run ./... --new-from-rev=upstream/main: 0 issues.
- pnpm run typecheck passes.
Checklist

- go test -tags=unit ./... passes.
- go test -tags=integration ./... — postgres/redis not running locally, relying on CI (the change has zero impact on integration tests: only one field added plus the matching allowlist update, no existing interfaces changed).
- golangci-lint run ./...: no new issues.
- pnpm-lock.yaml in sync (package.json unchanged).
Usage

In the admin UI, edit an openai-platform group; the "Force fast mode" toggle appears below "Allow /v1/messages scheduling". Once enabled, it takes effect for all API keys in the group (the auth cache reloads on the next request or expires automatically).

If a user combines force_fast_mode=true with the client's anthropic-beta: fast-mode header, the behavior is identical (both go priority). With force_fast_mode=false plus the client header, the original chain still yields priority; with force_fast_mode=false and no header, the default tier is used.
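The behavior matrix above reduces to a simple OR, sketched here; effectiveTier is an illustrative function, not an identifier from the codebase.

```go
package main

import "fmt"

// effectiveTier combines the group-level force flag with the client's
// anthropic-beta fast-mode header, as described in the usage notes:
// either signal alone is enough to select the priority tier.
func effectiveTier(forceFastMode, clientBetaFastMode bool) string {
	if forceFastMode || clientBetaFastMode {
		return "priority"
	}
	return "" // default tier: service_tier left unset
}

func main() {
	fmt.Println(effectiveTier(true, true))   // priority
	fmt.Println(effectiveTier(true, false))  // priority
	fmt.Println(effectiveTier(false, true))  // priority
	fmt.Println(effectiveTier(false, false) == "") // true (default tier)
}
```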