## Problem
When a proxy is unreachable, token refresh retries up to 4 times with
30s timeout each, causing requests to hang for ~2 minutes before
failing with a generic 502 error. The failed account is not marked,
so subsequent requests keep hitting it.
## Changes
### Proxy connection fast-fail
- Set TCP dial timeout to 5s and TLS handshake timeout to 5s on
antigravity client, so proxy connectivity issues fail within 5s
instead of 30s
- Reduce overall HTTP client timeout from 30s to 10s
- Export `IsConnectionError` for service-layer use
- Detect proxy connection errors in `RefreshToken` and return
immediately with "proxy unavailable" error (no retries)
### Token refresh temp-unschedulable
- Add 8s context timeout for token refresh on request path
- Mark account as temp-unschedulable for 10min when refresh fails
(both background `TokenRefreshService` and request-path
`GetAccessToken`)
- Sync temp-unschedulable state to Redis cache for immediate
scheduler effect
- Inject `TempUnschedCache` into `AntigravityTokenProvider`
### Account failover
- Return `UpstreamFailoverError` on `GetAccessToken` failure in
`Forward`/`ForwardGemini` to trigger handler-level account switch
instead of returning 502 directly
### Proxy probe alignment
- Apply same 5s dial/TLS timeout to shared `httpclient` pool
- Reduce proxy probe timeout from 30s to 10s
Cover IsValidModelSource/NormalizeModelSource, resolveModelDimensionExpression SQL expressions, invalid model_source 400 responses on both GetModelStats and GetUserBreakdown, upstream_model in scan/insert SQL mock expectations, and updated passthrough/billing test signatures.
Support querying model statistics by 'requested', 'upstream', or 'mapping' dimension. Add resolveModelDimensionExpression for safe SQL expression generation, IsValidModelSource whitelist validator, and NormalizeModelSource fallback. Repository persists and scans upstream_model in all insert/select paths.
Click on a group name, model name, or endpoint name in the distribution
tables to expand and show per-user usage breakdown (requests, tokens,
actual cost, standard cost).
Backend: new GET /admin/dashboard/user-breakdown API with group_id,
model, endpoint, endpoint_type filters.
Frontend: clickable rows with expand/collapse sub-table in all three
distribution charts.
- Fix gofmt alignment in admin_service.go and trailing newline in
antigravity_credits_overages.go
- Suppress errcheck for fmt.Sscanf in client.go GetMinimumAmount
- apply default mapped model only when scheduling fallback is actually used
- preserve reasoning in OpenAI-compatible output via reasoning_content and avoid invalid input function_call ids
Backend:
- Detect and classify 403 responses into three types:
validation (account needs Google verification),
violation (terms of service / banned),
forbidden (generic 403)
- Extract verification/appeal URLs from 403 response body
(structured JSON parsing with regex fallback)
- Add needs_verify, is_banned, needs_reauth, error_code fields
to UsageInfo (omitempty for zero impact on other platforms)
- Handle 403 in request path: classify and permanently set account error
- Save validation_url in error_message for degraded path recovery
- Enrich usage with account error on both success and degraded paths
- Add singleflight dedup for usage requests with independent context
- Differentiate cache TTL: success/403 → 3min, errors → 1min
- Return degraded UsageInfo instead of HTTP 500 on quota fetch errors
Frontend:
- Display forbidden status badges with color coding (red for banned,
amber for needs verification, gray for generic)
- Show clickable verification/appeal URL links
- Display needs_reauth and degraded error states in usage cell
- Add Antigravity tier label badge next to platform type
Tests:
- Comprehensive unit tests for classifyForbiddenType (7 cases)
- Unit tests for extractValidationURL (8 cases including unicode escapes)
- Integration test for FetchQuota forbidden path
The SSE stream termination marker string was incorrectly included in
DefaultStopSequences, causing Gemini to prematurely stop generating
output whenever the model produced text containing that marker.
The SSE-level protocol filtering in stream_transformer.go already
handles this marker correctly; it should not be a stop sequence for
the model's text generation.
Add Anthropic Messages API support for OpenAI platform groups, enabling
clients using Claude-style /v1/messages format to access OpenAI accounts
through automatic protocol conversion.
- Add apicompat package with type definitions and bidirectional converters
(Anthropic ↔ Chat, Chat ↔ Responses, Anthropic ↔ Responses)
- Implement /v1/messages endpoint for OpenAI gateway with streaming support
- Add model mapping UI for OpenAI OAuth accounts (whitelist + mapping modes)
- Support prompt caching fields and codex OAuth transforms
- Fix tool call ID conversion for Responses API (fc_ prefix)
- Ensure function_call_output has non-empty output field
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>