The quota reset mechanism is lazy — quota_daily_used/quota_weekly_used
in the database are only reset on the next IncrementQuotaUsed call.
The scheduling layer (IsQuotaExceeded) correctly checks period expiry
before enforcing limits, so the account remains usable. However, the
API response mapper reads the raw DB value without checking expiry,
causing the frontend to display cumulative usage (e.g. 110%) even
after the reset period has passed.
Add IsDailyQuotaPeriodExpired/IsWeeklyQuotaPeriodExpired methods and
use them in the mapper to return used=0 when the period has expired.
Add maximum Claude Code version limit to complement the existing minimum
version check. Refactor the version cache from single-value to unified
bounds struct (min+max) with a single atomic.Value and singleflight group.
- Backend: new constant, struct field, cache refactor, validation (semver
format + cross-validation max >= min), gateway enforcement, audit diff
- Frontend: settings UI input, TypeScript types, zh/en i18n
- Add CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 to all Claude Code
tutorials on /keys page (unix/cmd/powershell/vscode settings.json)
shouldMarkCreditsExhausted was blocked by isURLLevelRateLimit check when
credit overages retry returned "Resource has been exhausted (e.g. check quota).",
causing credits to never be marked as exhausted. This led to an infinite loop
where each request injected credits, bypassed model rate limits, and failed again.
- Remove isURLLevelRateLimit guard from shouldMarkCreditsExhausted (only called
for credit retry responses — if credits retry fails, mark exhausted)
- Add "resource has been exhausted" to creditsExhaustedKeywords
- Update tests to match corrected behavior
## Problem
When a proxy is unreachable, token refresh retries up to 4 times with
30s timeout each, causing requests to hang for ~2 minutes before
failing with a generic 502 error. The failed account is not marked,
so subsequent requests keep hitting it.
## Changes
### Proxy connection fast-fail
- Set TCP dial timeout to 5s and TLS handshake timeout to 5s on
antigravity client, so proxy connectivity issues fail within 5s
instead of 30s
- Reduce overall HTTP client timeout from 30s to 10s
- Export `IsConnectionError` for service-layer use
- Detect proxy connection errors in `RefreshToken` and return
immediately with "proxy unavailable" error (no retries)
### Token refresh temp-unschedulable
- Add 8s context timeout for token refresh on request path
- Mark account as temp-unschedulable for 10min when refresh fails
(both background `TokenRefreshService` and request-path
`GetAccessToken`)
- Sync temp-unschedulable state to Redis cache for immediate
scheduler effect
- Inject `TempUnschedCache` into `AntigravityTokenProvider`
### Account failover
- Return `UpstreamFailoverError` on `GetAccessToken` failure in
`Forward`/`ForwardGemini` to trigger handler-level account switch
instead of returning 502 directly
### Proxy probe alignment
- Apply same 5s dial/TLS timeout to shared `httpclient` pool
- Reduce proxy probe timeout from 30s to 10s
When all failover accounts are exhausted, handleFailoverExhausted maps
the upstream status code (e.g. 403) to a client-facing code (e.g. 502)
but did not write the original code to the gin context. This caused ops
error logs to show the mapped code instead of the real upstream code.
Call SetOpsUpstreamError before mapUpstreamError in all failover-
exhausted paths so that ops_error_logger captures the true upstream
status code and message.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move 529 overload cooldown configuration from config file to admin
settings UI. Adds an enable/disable toggle and configurable cooldown
duration (1-120 min) under /admin/settings gateway tab, stored as
JSON in the settings table.
When disabled, 529 errors are logged but accounts are no longer
paused from scheduling. Falls back to config file value when DB
is unreachable or settingService is nil.
Update model mapping target for claude-haiku-4-5 and
claude-haiku-4-5-20251001 from claude-sonnet-4-5 to claude-sonnet-4-6.
Includes migration script, default constants, and test updates.
Empty text blocks ({"type":"text","text":""}) cause Anthropic upstream to
return 400: "text content blocks must be non-empty". This was not caught
by the existing error detection pattern in isThinkingBlockSignatureError,
nor handled by FilterThinkingBlocksForRetry.
- Add empty text block stripping to FilterThinkingBlocksForRetry
- Fix isThinkingBlockSignatureError to match new Anthropic error format
- Add fast-path byte patterns to avoid unnecessary JSON parsing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add a platform filter dropdown to the admin subscriptions view, allowing
filtering subscriptions by platform (Anthropic, OpenAI, Gemini, etc.)
through the group association.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>