v0.1.74 merged upstream accounts into the OAuth path, causing requests
to hit the wrong protocol and endpoint. Add three upstream-specific
methods (testUpstreamConnection, ForwardUpstream, ForwardUpstreamGemini)
that use base_url + apiKey auth and passthrough the original body, while
reusing the existing response handling and error/retry logic.
1. Frontend: replace hardcoded antigravityDefaultMappings with async
fetch from GET /admin/accounts/antigravity/default-model-mapping,
eliminating the duplicate data source that caused frontend/backend
mapping inconsistency.
2. Backend: convert handleSmartRetry and antigravityRetryLoop from
standalone functions to AntigravityGatewayService methods, enabling
Redis cache sync (updateAccountModelRateLimitInCache) after both
rate-limit write paths — long-delay branch and retry-exhausted branch.
- GetAccessToken: add upstream branch to read api_key from credentials
- shouldTriggerAntigravitySmartRetry: relax check from IsOAuth to Platform-based
- isModelSupportedByAccount/WithContext: replace IsAntigravityModelSupported
whitelist with mapAntigravityModel for unified scheduling/forwarding logic
- mapAntigravityModel: fix edge case where wildcard target equals request model
- Update tests for new behavior and add custom model_mapping test cases
Change formatTime() to include seconds (HH:MM:SS) instead of only
hours and minutes (HH:MM). This gives users more precise information
about when rate limits will reset.
The default fallback cooldown when rate limit reset time cannot be
parsed was 5 minutes, which is too aggressive and causes accounts
to be unnecessarily locked out. Reduce to 30 seconds for faster
recovery. Config override still works (unit remains minutes).
When extended thinking is enabled, Claude API requires max_tokens >
thinking.budget_tokens. If misconfigured, this auto-adjusts max_tokens
to budget_tokens + 1000 instead of returning a 400 error.
- Add ensureMaxTokensGreaterThanBudget helper function
- Extract Gemini25FlashThinkingBudgetLimit constant (24576)
- Log adjustment for debugging
Ensures consistent line endings for SQL migration files, Go source,
shell scripts, YAML configs, and Dockerfiles. Fixes checksum mismatches
on Windows where CRLF line endings cause migration hash differences.
Kimi 等 Claude 兼容 API 返回缓存信息使用 OpenAI 风格的 cached_tokens 字段,
而非 Claude 标准的 cache_read_input_tokens,导致客户端收不到缓存命中信息且
内部计费缓存折扣为 0。
新增 reconcileCachedTokens 辅助函数,在 cache_read_input_tokens == 0 且
cached_tokens > 0 时自动填充,覆盖流式(message_start/message_delta)和
非流式两种响应路径。对 Claude 原生上游无影响。
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Kimi 等 Claude 兼容 API 返回缓存信息使用 OpenAI 风格的 cached_tokens 字段,
而非 Claude 标准的 cache_read_input_tokens,导致客户端收不到缓存命中信息且
内部计费缓存折扣为 0。
新增 reconcileCachedTokens 辅助函数,在 cache_read_input_tokens == 0 且
cached_tokens > 0 时自动填充,覆盖流式(message_start/message_delta)和
非流式两种响应路径。对 Claude 原生上游无影响。
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>