- Add int64(0) param to SelectAccountWithLoadAwareness callers (signature change from channel scheduling refactor)
- Add UsageMapHook type and struct field to StreamingProcessor
- Revert Claude Max cache billing code to upstream/main (not part of channel feature)
- Revert credits overages logic to upstream/main (non-channel change)
- Remove Instructions field reference (non-channel OpenAI feature)
- Restore sora_client_handler_test.go from upstream + add channel service nil params
Move the model pricing restriction check from 8 handler entry points
to the account scheduling phase (SelectAccountForModelWithExclusions /
SelectAccountWithLoadAwareness), aligning restriction with billing:
- requested: check original request model against pricing list
- channel_mapped: check channel-mapped model against pricing list
- upstream: per-account check using account-mapped model
Handler layer now only resolves channel mapping (no restriction).
Scheduling layer performs pre-check for requested/channel_mapped,
and per-account filtering for upstream billing source.
The forward_failed error log only included account_id, making it
difficult to identify which account and proxy caused the failure
without querying the database. Add account_name, account_platform,
and proxy details (id, name, host, port) to the log fields.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add maximum Claude Code version limit to complement the existing minimum
version check. Refactor the version cache from single-value to unified
bounds struct (min+max) with a single atomic.Value and singleflight group.
- Backend: new constant, struct field, cache refactor, validation (semver
format + cross-validation max >= min), gateway enforcement, audit diff
- Frontend: settings UI input, TypeScript types, zh/en i18n
- Add CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 to all Claude Code
tutorials on /keys page (unix/cmd/powershell/vscode settings.json)
When all failover accounts are exhausted, handleFailoverExhausted maps
the upstream status code (e.g. 403) to a client-facing code (e.g. 502)
but did not write the original code to the gin context. This caused ops
error logs to show the mapped code instead of the real upstream code.
Call SetOpsUpstreamError before mapUpstreamError in all failover-
exhausted paths so that ops_error_logger captures the true upstream
status code and message.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Apply InboundEndpointMiddleware to all gateway route groups
- Replace normalizedOpenAIInboundEndpoint/normalizedOpenAIUpstreamEndpoint and normalizedGatewayInboundEndpoint/normalizedGatewayUpstreamEndpoint with GetInboundEndpoint/GetUpstreamEndpoint
- Remove 4 old constants and 4 old normalization functions (-70 lines)
- Migrate existing endpoint normalization test to new API
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Claude's output_config.effort parameter (low/medium/high/max) was not
being extracted from requests or logged in the reasoning_effort column
of usage logs. Only the OpenAI path populated this field.
Changes:
- Extract output_config.effort in ParseGatewayRequest
- Add ReasoningEffort field to ForwardResult
- Populate reasoning_effort in both RecordUsage and RecordUsageWithLongContext
- Guard against overwriting service-set effort values in handler
- Update stale comments that described reasoning_effort as OpenAI-only
- Add unit tests for extraction, normalization, and persistence
- Use TxPipeline (MULTI/EXEC) instead of Pipeline for atomic INCR+EXPIRE
- Filter negative values in GetBaseRPM(), update test expectation
- Add RPM batch query (GetRPMBatch) to account List API
- Add warn logs for RPM increment failures in gateway handler
- Reset enableRpmLimit on BulkEditAccountModal close
- Use union type 'tiered' | 'sticky_exempt' for rpmStrategy refs
- Add design decision comments for rdb.Time() RTT trade-off
- Remove Gemini platform exclusion from model restriction UI in
Create/Edit account modals (Gemini now supports model_mapping)
- Remove outdated Gemini model passthrough info cards
- Add model_mapping field to GeminiCredentials type
- Extend warmup request interception toggle to Antigravity platform
- Remove redundant try/catch in API key account creation
- Remove noisy gateway.request_completed debug log
- Reorganize Gemini model mapping sections in constants.go
For retryable transient errors (Google 400 "invalid project resource name"
and empty stream responses), retry on the same account up to 2 times
(with 500ms delay) before switching to another account.
- Add RetryableOnSameAccount field to UpstreamFailoverError
- Add same-account retry loop in both Gemini and Claude/OpenAI handler paths
- Move temp-unschedule from service layer to handler layer (only after
all same-account retries exhausted)
- Reduce temp-unschedule cooldown from 30 minutes to 1 minute