- Add `sort_order` field to groups table with migration
- Add `PUT /api/v1/admin/groups/sort-order` API for batch update
- Implement drag-and-drop UI using vue-draggable-plus
- All queries now order groups by sort_order
- Add i18n support (en/zh) for sort-related UI text
- Update test stubs to satisfy new interface methods
Upstream accounts now use the standard APIKey type instead of a dedicated
upstream type. GetBaseURL() and new GetGeminiBaseURL() automatically append
/antigravity for Antigravity platform APIKey accounts, eliminating the need
for separate upstream forwarding methods.
- Remove ForwardUpstream, ForwardUpstreamGemini, testUpstreamConnection
- Remove upstream branch guards in Forward/ForwardGemini/TestConnection
- Add migration 052 to convert existing upstream accounts to apikey
- Update frontend CreateAccountModal to create apikey type
- Add unit tests for GetBaseURL and GetGeminiBaseURL
Replace manual header setting (Content-Type, anthropic-version, anthropic-beta)
with full client header passthrough in ForwardUpstream/ForwardUpstreamGemini.
Only authentication headers (Authorization, x-api-key) are overridden with
upstream account credentials. Hop-by-hop headers are excluded.
Add unit tests covering header passthrough, auth override, and hop-by-hop filtering.
v0.1.74 merged upstream accounts into the OAuth path, causing requests
to hit the wrong protocol and endpoint. Add three upstream-specific
methods (testUpstreamConnection, ForwardUpstream, ForwardUpstreamGemini)
that use base_url + apiKey auth and passthrough the original body, while
reusing the existing response handling and error/retry logic.
- Change antigravitySmartRetryMaxAttempts from 3 to 1 to prevent
repeated rate limiting and long waits
- Clear sticky session binding (DeleteSessionAccountID) after smart
retry exhaustion, so subsequent requests don't hit the same
rate-limited account
- Add flow diagrams to Forward/ForwardGemini doc comments
- Add comprehensive unit tests covering:
- Sticky session cleared on retry failure (429, 503, network error)
- Sticky session NOT cleared on retry success
- Sticky session NOT cleared for non-sticky requests (empty hash)
- Sticky session NOT cleared on long delay path (handled by handler)
- Nil cache safety (no panic)
- MaxAttempts constant verification
- End-to-end retryLoop → switchError propagation with session clear
The previous fallback (step 3) in GenerateSessionHash hashed system +
all messages together, producing a different hash each round as the
conversation grew ([a] -> [a,b] -> [a,b,c]). This made fallback sticky
sessions ineffective for multi-turn conversations.
Implement per-message Trie digest chain matching (reusing Gemini's Trie
infrastructure) so that the previous round's chain is always a prefix
of the current round's chain, enabling reliable session affinity.
Remove threshold-based waiting in both sticky session and antigravity
pre-check paths. When a model is rate-limited, immediately clear the
sticky session and switch accounts instead of waiting for short durations.
1. Frontend: replace hardcoded antigravityDefaultMappings with async
fetch from GET /admin/accounts/antigravity/default-model-mapping,
eliminating the duplicate data source that caused frontend/backend
mapping inconsistency.
2. Backend: convert handleSmartRetry and antigravityRetryLoop from
standalone functions to AntigravityGatewayService methods, enabling
Redis cache sync (updateAccountModelRateLimitInCache) after both
rate-limit write paths — long-delay branch and retry-exhausted branch.
- GetAccessToken: add upstream branch to read api_key from credentials
- shouldTriggerAntigravitySmartRetry: relax check from IsOAuth to Platform-based
- isModelSupportedByAccount/WithContext: replace IsAntigravityModelSupported
whitelist with mapAntigravityModel for unified scheduling/forwarding logic
- mapAntigravityModel: fix edge case where wildcard target equals request model
- Update tests for new behavior and add custom model_mapping test cases
The default fallback cooldown when rate limit reset time cannot be
parsed was 5 minutes, which is too aggressive and causes accounts
to be unnecessarily locked out. Reduce to 30 seconds for faster
recovery. Config override still works (unit remains minutes).
Kimi 等 Claude 兼容 API 返回缓存信息使用 OpenAI 风格的 cached_tokens 字段,
而非 Claude 标准的 cache_read_input_tokens,导致客户端收不到缓存命中信息且
内部计费缓存折扣为 0。
新增 reconcileCachedTokens 辅助函数,在 cache_read_input_tokens == 0 且
cached_tokens > 0 时自动填充,覆盖流式(message_start/message_delta)和
非流式两种响应路径。对 Claude 原生上游无影响。
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Kimi 等 Claude 兼容 API 返回缓存信息使用 OpenAI 风格的 cached_tokens 字段,
而非 Claude 标准的 cache_read_input_tokens,导致客户端收不到缓存命中信息且
内部计费缓存折扣为 0。
新增 reconcileCachedTokens 辅助函数,在 cache_read_input_tokens == 0 且
cached_tokens > 0 时自动填充,覆盖流式(message_start/message_delta)和
非流式两种响应路径。对 Claude 原生上游无影响。
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>