The hardcoded codex CLI version (0.104.0) causes upstream rejection
when using gpt-5.5 with compact, as the server treats the request
as an outdated client and returns 400/502.
Update codexCLIVersion, codexCLIUserAgent, and openAICodexProbeVersion
to 0.125.0 to match the current Codex CLI release.
Fixes#1933, #1887, #1865
Related: #1609, #1298, #849
In reconstructResponseOutputFromSSE, text content Marshal/Unmarshal
failure previously caused an early return that silently discarded
already-extracted image_generation_call outputs. Now serialization
errors are tolerated so image results still reach the client.
Consolidates 9 call sites of resolveUpstreamResponseReadLimit + readUpstreamResponseBodyLimited + ErrUpstreamResponseBodyTooLarge error handling into a single ReadUpstreamResponseBody function with TooLargeWriter callback for API-format-specific error responses (Anthropic, OpenAI, countTokens).
Priority was wrong:
- Before: custom rules → LiteLLM (when ApplyPricingToAccountStats) → nil
- After: custom rules → totalCost (when ApplyPricingToAccountStats) → LiteLLM → nil
When ApplyPricingToAccountStats is enabled, use the request's actual
client billing cost (before multiplier) as account_stats_cost, instead
of recalculating from LiteLLM per-token prices which produced incorrect
values for per-request billing mode.
LiteLLM model pricing is now the final fallback (priority 3), used only
when neither custom rules nor ApplyPricingToAccountStats apply.
- resolveAccountStatsCost now uses the final upstream model (after
account-level mapping) to match custom pricing rules, fixing the
issue where requested model (e.g. claude-sonnet-4-5) didn't match
rules configured for upstream model (e.g. claude-opus-4-6)
- Remove tryChannelPricing fallback — only custom rules are applied,
unmatched requests use default formula (total_cost × rate)
- Remove unused billingService and serviceTier parameters
- Update description: "启用后将支持自定义账号统计的模型价格"
Allow channels to configure independent model pricing for account
statistics cost calculation, decoupled from user billing.
Backend:
- Migration 101: channels.apply_pricing_to_account_stats toggle,
channel_account_stats_pricing_rules/model_pricing tables,
usage_logs.account_stats_cost column
- resolveAccountStatsCost: match rules by group/account, then channel
pricing, fallback to original formula when unconfigured
- Integrate into both GatewayService.recordUsageCore and
OpenAIGatewayService.RecordUsage
- Update 8 account stats SQL queries to use
COALESCE(account_stats_cost, total_cost) * account_rate_multiplier
- 23 unit tests for matching, pricing lookup, and cost calculation
Frontend:
- Channel edit dialog: toggle + custom rules UI with group/account
multi-select and pricing entry cards
- API types and i18n (zh/en)
When a channel has no model mapping for the requested model, ChannelMappedModel
equals OriginalModel (the user's arbitrary input). Combined with the default
BillingModelSource="channel_mapped", this incorrectly overrides the BillingModel
set by the OpenAI format conversion layer (e.g., gpt-5.4 from DefaultMappedModel)
back to the unmapped original model (e.g., glm) which has no pricing — resulting
in zero-cost billing.
Add guard condition so the channel_mapped override only fires when the channel
actually changed the model (ChannelMappedModel != OriginalModel).
Eliminates unnecessary indirection layer. The wrapper function only
called normalizeCodexModel with a special case for "gpt 5.3 codex spark"
(space-separated variant) that is no longer needed.
All call sites now use normalizeCodexModel directly.
P0-1: Credits degraded response retry + fail-open
- Add isAntigravityDegradedResponse() to detect transient API failures
- Retry up to 3 times with exponential backoff (500ms/1s/2s)
- Invalidate singleflight cache between retries
- Fail-open after exhausting retries instead of 5h circuit break
P1-1: Fix channel restriction pre-check timing conflict
- Swap checkClaudeCodeRestriction before checkChannelPricingRestriction
- Ensures channel restriction is checked against final fallback groupID
P1-2: Add interval pricing validation (frontend + backend)
- Backend: ValidateIntervals() with boundary, price, overlap checks
- Frontend: validateIntervals() with Chinese error messages
- Rules: MinTokens>=0, MaxTokens>MinTokens, prices>=0, no overlap
P2: Fix cross-platform same-model pricing/mapping override
- Store cache keys using original platform instead of group platform
- Lookup across matching platforms (antigravity→anthropic→gemini)
- Prevents anthropic/gemini same-name models from overwriting each other
- Fix 7 stale comments still mentioning "限制检查" in handlers/services
- Make billingModelForRestriction explicitly list channel_mapped case
- Add slog.Warn for error swallowing in ResolveChannelMapping and
needsUpstreamChannelRestrictionCheck
- Document sticky session upstream check exemption
- PricingSourceChannel/LiteLLM/Fallback for resolver source
- MediaTypeImage/Video/Prompt for result.MediaType
- Reuse BillingModeToken/BillingModeImage for billing mode
- Reuse BillingModelSourceChannelMapped/PlatformAnthropic in handler
- Fix errcheck: defer rows.Close() with nolint
- Fix errcheck: type assertion with ok check in channel cache
- Fix staticcheck ST1005: lowercase error string
- Fix staticcheck SA5011: nil check cost before use in openai gateway
- Fix gofmt: format chatcompletions_to_responses.go
- Parse candidatesTokensDetails from Gemini API to separate image/text output tokens
- Add image_output_tokens and image_output_cost to usage_log (migration 089)
- Support per-image-token pricing via output_cost_per_image_token from model pricing data
- Channel pricing ImageOutputPrice override works in token billing mode
- Auto-fill image_output_price in channel pricing form from model defaults
- Add "channel_mapped" billing model source as new default (migration 088)
- Bills by model name after channel mapping, before account mapping
- Fix channel cache error TTL sign error (115s → 5s)
- Fix Update channel only invalidating new groups, not removed groups
- Fix frontend model_mapping clearing sending undefined instead of {}
- Credits balance precheck via shared AccountUsageService cache before injection
- Skip credits injection for accounts with insufficient balance
- Don't mark credits exhausted for "exhausted your capacity on this model" 429s
When no explicit session signals (session_id, conversation_id, prompt_cache_key)
are provided, derive a stable session seed from the request body content
(model + tools + system prompt + first user message) to enable sticky routing
and prompt caching for non-Codex clients using the Chat Completions API.
This mirrors the content-based fallback already present in GatewayService.
GenerateSessionHash, adapted for the OpenAI gateway's request formats (both
Chat Completions messages and Responses API input).
JSON fragments are canonicalized via normalizeCompatSeedJSON to ensure
semantically identical requests produce the same seed regardless of
whitespace or key ordering.
Closes#1421