Add maximum Claude Code version limit to complement the existing minimum
version check. Refactor the version cache from single-value to unified
bounds struct (min+max) with a single atomic.Value and singleflight group.
- Backend: new constant, struct field, cache refactor, validation (semver
format + cross-validation max >= min), gateway enforcement, audit diff
- Frontend: settings UI input, TypeScript types, zh/en i18n
- Add CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 to all Claude Code
tutorials on /keys page (unix/cmd/powershell/vscode settings.json)
shouldMarkCreditsExhausted was blocked by isURLLevelRateLimit check when
credit overages retry returned "Resource has been exhausted (e.g. check quota).",
causing credits to never be marked as exhausted. This led to an infinite loop
where each request injected credits, bypassed model rate limits, and failed again.
- Remove isURLLevelRateLimit guard from shouldMarkCreditsExhausted (only called
for credit retry responses — if credits retry fails, mark exhausted)
- Add "resource has been exhausted" to creditsExhaustedKeywords
- Update tests to match corrected behavior
## Problem
When a proxy is unreachable, token refresh retries up to 4 times with
30s timeout each, causing requests to hang for ~2 minutes before
failing with a generic 502 error. The failed account is not marked,
so subsequent requests keep hitting it.
## Changes
### Proxy connection fast-fail
- Set TCP dial timeout to 5s and TLS handshake timeout to 5s on
antigravity client, so proxy connectivity issues fail within 5s
instead of 30s
- Reduce overall HTTP client timeout from 30s to 10s
- Export `IsConnectionError` for service-layer use
- Detect proxy connection errors in `RefreshToken` and return
immediately with "proxy unavailable" error (no retries)
### Token refresh temp-unschedulable
- Add 8s context timeout for token refresh on request path
- Mark account as temp-unschedulable for 10min when refresh fails
(both background `TokenRefreshService` and request-path
`GetAccessToken`)
- Sync temp-unschedulable state to Redis cache for immediate
scheduler effect
- Inject `TempUnschedCache` into `AntigravityTokenProvider`
### Account failover
- Return `UpstreamFailoverError` on `GetAccessToken` failure in
`Forward`/`ForwardGemini` to trigger handler-level account switch
instead of returning 502 directly
### Proxy probe alignment
- Apply same 5s dial/TLS timeout to shared `httpclient` pool
- Reduce proxy probe timeout from 30s to 10s
When all failover accounts are exhausted, handleFailoverExhausted maps
the upstream status code (e.g. 403) to a client-facing code (e.g. 502)
but did not write the original code to the gin context. This caused ops
error logs to show the mapped code instead of the real upstream code.
Call SetOpsUpstreamError before mapUpstreamError in all failover-
exhausted paths so that ops_error_logger captures the true upstream
status code and message.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move 529 overload cooldown configuration from config file to admin
settings UI. Adds an enable/disable toggle and configurable cooldown
duration (1-120 min) under /admin/settings gateway tab, stored as
JSON in the settings table.
When disabled, 529 errors are logged but accounts are no longer
paused from scheduling. Falls back to config file value when DB
is unreachable or settingService is nil.
Update model mapping target for claude-haiku-4-5 and
claude-haiku-4-5-20251001 from claude-sonnet-4-5 to claude-sonnet-4-6.
Includes migration script, default constants, and test updates.
Empty text blocks ({"type":"text","text":""}) cause Anthropic upstream to
return 400: "text content blocks must be non-empty". This was not caught
by the existing error detection pattern in isThinkingBlockSignatureError,
nor handled by FilterThinkingBlocksForRetry.
- Add empty text block stripping to FilterThinkingBlocksForRetry
- Fix isThinkingBlockSignatureError to match new Anthropic error format
- Add fast-path byte patterns to avoid unnecessary JSON parsing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add a platform filter dropdown to the admin subscriptions view, allowing
filtering subscriptions by platform (Anthropic, OpenAI, Gemini, etc.)
through the group association.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cover IsValidModelSource/NormalizeModelSource, resolveModelDimensionExpression SQL expressions, invalid model_source 400 responses on both GetModelStats and GetUserBreakdown, upstream_model in scan/insert SQL mock expectations, and updated passthrough/billing test signatures.
Add model_source query parameter to GetModelStats and GetUserBreakdown handlers with explicit IsValidModelSource validation. Include model_source in cache key to prevent cross-source cache hits. Expose upstream_model in usage log DTO with omitempty semantics.
Support querying model statistics by 'requested', 'upstream', or 'mapping' dimension. Add resolveModelDimensionExpression for safe SQL expression generation, IsValidModelSource whitelist validator, and NormalizeModelSource fallback. Repository persists and scans upstream_model in all insert/select paths.
Propagate UpstreamModel through ForwardResult and OpenAIForwardResult in Anthropic direct, API-key passthrough, Bedrock, and OpenAI gateway flows. Extract optionalNonEqualStringPtr and optionalTrimmedStringPtr into usage_log_helpers.go. Store upstream_model only when it differs from the requested model.
Also introduces anthropicPassthroughForwardInput struct to reduce parameter count.
Make setup password requirements consistent with backend rules and show API-provided error messages so install failures are actionable. Trim admin email before validation to avoid false invalid-email rejections from surrounding whitespace.
- Add compile-time interface assertion for sessionWindowMockRepo
- Fix flaky fallback test by capturing time.Now() before calling UpdateSessionWindow
- Replace stale hardcoded timestamps with dynamic future values
- Add millisecond detection and bounds validation for reset header timestamp
- Use pause/resume pattern for interval in UsageProgressBar to avoid idle timers on large lists
- Fix gofmt comment alignment
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>