182 Commits

Author SHA1 Message Date
erio
fc095bf054 refactor: replace scope-level rate limiting with model-level rate limiting
Merge functional changes from develop branch:
- Remove AntigravityQuotaScope system (claude/gemini_text/gemini_image)
- Replace with per-model rate limiting using resolveAntigravityModelKey
- Remove model load statistics (IncrModelCallCount/GetModelLoadBatch)
- Simplify account selection to unified priority→load→LRU algorithm
- Remove SetAntigravityQuotaScopeLimit from AccountRepository
- Clean up scope-related UI indicators and API fields
2026-02-09 08:19:01 +08:00
erio
1af06aed96 feat: shuffle accounts within same sort group to prevent thundering herd
Add post-sort shuffle for accounts with identical (priority, loadRate,
lastUsedAt) to break deterministic ordering when concurrent requests
read the same scheduler snapshot. Applies to both Antigravity and
OpenAI scheduling paths, plus the sortAccountsByPriorityAndLastUsed
helper.

Keeps upstream CallCount/ModelLoadInfo scheduling intact; shuffle is
additive and only randomises within equivalent-rank groups.
2026-02-09 07:33:17 +08:00
erio
b889d5017b refactor: replace Trie-based digest session store with flat cache 2026-02-09 07:02:12 +08:00
erio
35598d5648 fix: parse Gemini native request format in ParseGatewayRequest for correct session hash generation
ParseGatewayRequest only parsed Anthropic format (system/messages),
ignoring Gemini native format (systemInstruction/contents). This caused
GenerateSessionHash to produce identical hashes for all Gemini sessions.

Add protocol parameter to ParseGatewayRequest to branch between
Anthropic and Gemini parsing. Update GenerateSessionHash message
traversal to extract text from both formats.
2026-02-09 06:47:22 +08:00
erio
5c76b9e45a fix: prevent sessionHash collision for different users with same messages
Mix SessionContext (ClientIP, UserAgent, APIKeyID) into
GenerateSessionHash 3rd-level fallback to differentiate requests
from different users sending identical content.

Also switch hashContent from SHA256-truncated to XXHash64 for
better performance, and optimize Trie Lua script to match from
longest prefix first.
2026-02-09 06:46:32 +08:00
erio
e3748da860 fix(lint): handle errcheck for strings.Builder.WriteString 2026-02-07 18:18:15 +08:00
erio
50a783ff01 feat: add Anthropic sticky session digest chain matching via Trie
The previous fallback (step 3) in GenerateSessionHash hashed system +
all messages together, producing a different hash each round as the
conversation grew ([a] -> [a,b] -> [a,b,c]). This made fallback sticky
sessions ineffective for multi-turn conversations.

Implement per-message Trie digest chain matching (reusing Gemini's Trie
infrastructure) so that the previous round's chain is always a prefix
of the current round's chain, enabling reliable session affinity.
2026-02-07 18:00:56 +08:00
erio
e1a68497d6 refactor: simplify sticky session rate limit handling — switch immediately on any rate limit
Remove threshold-based waiting in both sticky session and antigravity
pre-check paths. When a model is rate-limited, immediately clear the
sticky session and switch accounts instead of waiting for short durations.
2026-02-07 17:06:49 +08:00
erio
b4f6c4f9d5 style: fix gofmt formatting in gateway_service.go
Remove extra blank line that caused golangci-lint gofmt check to fail.
2026-02-07 14:51:20 +08:00
erio
14c6c9321a refactor: remove unused IsAntigravityModelSupported function and its tests 2026-02-07 14:42:28 +08:00
erio
de0927289e fix(antigravity): support upstream accounts and custom model_mapping in scheduling
- GetAccessToken: add upstream branch to read api_key from credentials
- shouldTriggerAntigravitySmartRetry: relax check from IsOAuth to Platform-based
- isModelSupportedByAccount/WithContext: replace IsAntigravityModelSupported
  whitelist with mapAntigravityModel for unified scheduling/forwarding logic
- mapAntigravityModel: fix edge case where wildcard target equals request model
- Update tests for new behavior and add custom model_mapping test cases
2026-02-07 14:32:08 +08:00
erio
edb0937024 fix: restore non-failover error passthrough from 7b156489 2026-02-07 14:24:55 +08:00
erio
5e98445b22 feat(antigravity): comprehensive enhancements - model mapping, rate limiting, scheduling & ops
Key changes:
- Upgrade model mapping: Opus 4.5 → Opus 4.6-thinking with precise matching
- Unified rate limiting: scope-level → model-level with Redis snapshot sync
- Load-balanced scheduling by call count with smart retry mechanism
- Force cache billing support
- Model identity injection in prompts with leak prevention
- Thinking mode auto-handling (max_tokens/budget_tokens fix)
- Frontend: whitelist mode toggle, model mapping validation, status indicators
- Gemini session fallback with Redis Trie O(L) matching
- Ops: enhanced concurrency monitoring, account availability, retry logic
- Migration scripts: 049-051 for model mapping unification
2026-02-07 12:31:10 +08:00
shaw
7b1564898b fix: make error passthrough effective for non-failover upstream errors 2026-02-07 10:25:56 +08:00
shaw
d182ef0391 fix(gateway): 移除 PR #316 引入的工具名转换逻辑
移除响应阶段的工具名/schema/description 转换逻辑,修复第三方工具调用时
工具名被错误转换的问题(如 Task → task)。

移除内容:
- 工具名相关正则变量(toolPrefixRe, toolNameBoundaryRe 等)
- openCodeToolOverrides 和 claudeToolNameOverrides 映射表
- 工具名转换函数(normalizeToolNameForClaude, normalizeToolNameForOpenCode 等)
- 响应体工具名替换函数(replaceToolNamesInText, replaceToolNamesInResponseBody 等)
- 参数名转换函数(normalizeParamNameForOpenCode, rewriteParamKeysInValue)
- 工具描述清理函数(sanitizeToolDescription)
- 输入 schema 转换函数(normalizeToolInputSchema)
- 模型 ID 正则替换函数(replaceModelIDInText)

保留内容:
- 系统提示词清理(sanitizeSystemText)
- Claude Code 指纹 headers 处理
- 模型 ID 映射(通过 JSON 对象操作)
2026-02-06 16:09:58 +08:00
yangjianbo
f33a950103 fix(兼容): 将 Kimi cached_tokens 映射到 Claude 标准 cache_read_input_tokens
Kimi 等 Claude 兼容 API 返回缓存信息使用 OpenAI 风格的 cached_tokens 字段,
而非 Claude 标准的 cache_read_input_tokens,导致客户端收不到缓存命中信息且
内部计费缓存折扣为 0。

新增 reconcileCachedTokens 辅助函数,在 cache_read_input_tokens == 0 且
cached_tokens > 0 时自动填充,覆盖流式(message_start/message_delta)和
非流式两种响应路径。对 Claude 原生上游无影响。

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 09:27:42 +08:00
shaw
39e05a2dad feat: 新增全局错误透传规则功能
支持管理员配置上游错误如何返回给客户端:
- 新增 ErrorPassthroughRule 数据模型和 Ent Schema
- 实现规则的 CRUD API(/admin/error-passthrough-rules)
- 支持按错误码、关键词匹配,支持 any/all 匹配模式
- 支持按平台过滤(anthropic/openai/gemini/antigravity)
- 支持透传或自定义响应状态码和错误消息
- 实现两级缓存(Redis + 本地内存)和多实例同步
- 集成到 gateway_handler 的错误处理流程
- 新增前端管理界面组件
- 新增单元测试覆盖核心匹配逻辑

优化:
- 移除 refreshLocalCache 中的冗余排序(数据库已排序)
- 后端 Validate() 增加匹配条件非空校验
2026-02-05 21:52:54 +08:00
shaw
2b192f7dca feat: 支持用户专属分组倍率配置 2026-02-05 16:05:42 +08:00
shaw
05af95dade fix(gateway): 修复工具名转换破坏 Anthropic 特殊工具的问题
未知工具名不再进行 PascalCase/snake_case 转换,保持原样透传。
修复 text_editor_20250728 等 Anthropic 特殊工具被错误转换的问题。
2026-02-05 09:53:20 +08:00
shaw
8f39754812 fix(gateway): 修复模型前缀映射逻辑错误
问题:normalizeClaudeModelForAnthropic 函数错误地将长模型ID截断为短ID,
导致 APIKey 账号的模型名被错误修改。

修复:
- 删除错误的 normalizeClaudeModelForAnthropic 函数和 anthropicPrefixMappings 变量
- 直接使用 claude.NormalizeModelID(正确的短ID->长ID扩展)
- APIKey 账号无显式映射时透传原始模型名
2026-02-04 17:50:05 +08:00
Wesley Liddick
804b6f2282 Merge pull request #468 from s-Joshua-s/fix/thinking-block-modification-error
fix(api): 修复 thinking 块被意外修改导致的 400 错误
2026-02-03 22:21:06 +08:00
Wesley Liddick
4cce21b125 Merge branch 'main' into main 2026-02-03 21:43:41 +08:00
bayma888
3fed478e4d fix(lint): format gateway_service.go with gofmt 2026-02-03 20:55:17 +08:00
bayma888
6146be1474 feat(api-key): add independent quota and expiration support
This feature allows API Keys to have their own quota limits and expiration
times, independent of the user's balance.

Backend:
- Add quota, quota_used, expires_at fields to api_key schema
- Implement IsExpired() and IsQuotaExhausted() checks in middleware
- Add ResetQuota and ClearExpiration API endpoints
- Integrate quota billing in gateway handlers (OpenAI, Anthropic, Gemini)
- Include quota/expiration fields in auth cache for performance
- Expiration check returns 403, quota exhausted returns 429

Frontend:
- Add quota and expiration inputs to key create/edit dialog
- Add quick-select buttons for expiration (+7, +30, +90 days)
- Add reset quota confirmation dialog
- Add expires_at column to keys list
- Add i18n translations for new features (en/zh)

Migration:
- Add 045_add_api_key_quota.sql for new columns
2026-02-03 19:49:31 +08:00
JIA-ss
ad90bb4645 fix(api): 修复 thinking 块被意外修改导致的 400 错误
问题描述:
使用扩展思考功能时,偶现以下错误:
"thinking or redacted_thinking blocks in the latest assistant message cannot be modified"

根因分析:
当代理服务修改请求体中的某些字段时(如 metadata.user_id、model),
使用 map[string]any 解析整个 JSON 后重新序列化,导致:
1. 字段顺序改变(Go map 序列化按字母排序)
2. 数字格式变化(如 1.0 → 1)
3. Unicode 转义变化

Claude API 对 thinking 块进行字节级验证,任何变化都会触发错误。

修复内容:
1. identity_service.go - RewriteUserID/RewriteUserIDWithMasking
   使用 json.RawMessage 保留其他字段的原始字节

2. gateway_service.go - replaceModelInBody
   使用 json.RawMessage 保留其他字段的原始字节

3. gateway_service.go - normalizeClaudeOAuthRequestBody
   保留 messages 的原始字节,跳过包含 thinking 块的消息修改

4. gateway_service.go - isThinkingBlockSignatureError
   添加 "cannot be modified" 错误检测,触发自动重试

5. antigravity_gateway_service.go - isSignatureRelatedError
   添加 "cannot be modified" 错误检测

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 16:15:37 +08:00
song
2220fd18ca merge upstream main 2026-02-03 15:36:17 +08:00
Wesley Liddick
a09478f374 Merge pull request #316 from cyhhao/fix/claude-oauth-compat
fix(网关): 完善 Claude OAuth/Claude Code 兼容
2026-02-03 14:26:19 +08:00
song
3ecadf4aad chore: apply stashed changes 2026-02-02 22:20:08 +08:00
song
0170d19fa7 merge upstream main 2026-02-02 22:13:50 +08:00
liuxiongfeng
45e1429ae8 feat(billing): 添加 Gemini 200K 长上下文双倍计费功能
- 新增 CalculateCostWithLongContext 方法支持阈值双倍计费
- 新增 RecordUsageWithLongContext 方法专用于 Gemini 计费
- Gemini 超过 200K token 的部分按 2 倍费率计算
- 其他平台(Claude/OpenAI)完全不受影响
2026-02-02 21:47:02 +08:00
liuxiongfeng
bbc7b4aeed feat(gateway): Gemini API Key 账户跳过模型映射检查,直接透传
Gemini API Key 账户通常代理上游服务,模型支持由上游判断,
本地不需要预先配置模型映射。
2026-02-02 13:40:29 +08:00
cyhhao
adb77af1d9 fix: satisfy golangci-lint (nil checks, remove unused helpers) 2026-01-31 02:07:57 +08:00
cyhhao
3a34746668 refactor: stop rewriting tool descriptions; keep only system sentence rewrite 2026-01-31 02:01:51 +08:00
cyhhao
fe17058700 refactor: limit OpenCode keyword replacement to tool descriptions 2026-01-31 01:40:38 +08:00
cyhhao
fa454b1b99 fix: align Claude Code system banner with opencode latest 2026-01-29 15:37:07 +08:00
cyhhao
8375094c69 fix(oauth): match Claude CLI accept header and beta set 2026-01-29 15:31:29 +08:00
cyhhao
91079d3f15 chore(debug): emit Claude mimic fingerprint on credential-scope error 2026-01-29 15:17:46 +08:00
cyhhao
63412a9fcc chore(debug): log Claude mimic fingerprint 2026-01-29 03:13:14 +08:00
cyhhao
d98648f03b fix: rewrite OpenCode identity sentence to Claude Code 2026-01-29 03:03:40 +08:00
cyhhao
4d40fb6b60 fix(oauth): merge anthropic-beta and force Claude Code headers in mimic mode 2026-01-29 02:36:28 +08:00
cyhhao
be3b788b8f fix: also prefix next system block with Claude Code banner 2026-01-29 02:03:54 +08:00
cyhhao
723e54013a fix(oauth): mimic Claude Code metadata and beta headers 2026-01-29 01:49:51 +08:00
cyhhao
4d566f68b6 chore: gofmt 2026-01-29 01:34:58 +08:00
cyhhao
31f817d189 fix: add newline separation for Claude Code system prompt 2026-01-29 01:28:43 +08:00
cyhhao
59231668c5 Merge branch 'main' of github.com:Wei-Shaw/sub2api 2026-01-29 01:16:36 +08:00
shaw
cadca752c4 修复SSE流式响应中usage数据被覆盖的问题 2026-01-28 18:36:21 +08:00
cyhhao
ffe43f6098 Merge branch 'main' of github.com:Wei-Shaw/sub2api 2026-01-27 11:09:11 +08:00
shaw
56a1e29cdd fix(gateway): 修复 SSE 流式响应 usage 统计错误
message_delta 应完全覆盖 message_start 的 usage 数据,
而非仅在值为 0 时才更新。
2026-01-27 09:16:34 +08:00
cyhhao
a161fcc89b Merge branch 'main' of github.com:Wei-Shaw/sub2api 2026-01-26 10:44:38 +08:00
ianshaw
839975b0cf feat(gemini): 支持 Gemini CLI 粘性会话与跨账号 thoughtSignature 清理
## 问题背景

1. Gemini CLI 没有明确的会话标识(如 Claude Code 的 metadata.user_id)
2. thoughtSignature 与具体上游账号强绑定,跨账号使用会导致 400 错误
3. 粘性会话切换账号或 cache 丢失时,旧签名会导致请求失败

## 解决方案

### 1. Gemini CLI 会话标识提取

- 从 `x-gemini-api-privileged-user-id` header 和请求体中的 tmp 目录哈希生成会话标识
- 组合策略:SHA256(privileged-user-id + ":" + tmp_dir_hash)
- 正则提取:`/\.gemini/tmp/([A-Fa-f0-9]{64})`

### 2. 跨账号 thoughtSignature 清理

实现三种场景的智能清理:

1. **Cache 命中 + 账号切换**
   - 粘性会话绑定的账号与当前选择的账号不同时清理

2. **同一请求内 failover 切换**
   - 通过 sessionBoundAccountID 跟踪,检测重试时的账号切换

3. **Gemini CLI + Cache 未命中 + 含签名**
   - 预防性清理,避免 cache 丢失后首次转发就 400
   - 仅对 Gemini CLI 请求且请求体包含 thoughtSignature 时触发

## 修改内容

### backend/internal/handler/gemini_v1beta_handler.go
- 添加 `extractGeminiCLISessionHash` 函数提取 Gemini CLI 会话标识
- 添加 `isGeminiCLIRequest` 函数识别 Gemini CLI 请求
- 实现账号切换检测与 thoughtSignature 清理逻辑
- 添加 `geminiCLITmpDirRegex` 正则表达式

### backend/internal/service/gateway_service.go
- 添加 `GetCachedSessionAccountID` 方法查询粘性会话绑定的账号 ID

### backend/internal/service/gemini_native_signature_cleaner.go (新增)
- 实现 `CleanGeminiNativeThoughtSignatures` 函数
- 递归清理 JSON 中的所有 thoughtSignature 字段
- 支持任意 JSON 顶层类型(object/array)

### backend/internal/handler/gemini_cli_session_test.go (新增)
- 测试 Gemini CLI 会话哈希提取逻辑
- 测试 tmp 目录正则匹配
- 覆盖有/无 privileged-user-id 的场景

## 影响范围

- 修复 Gemini CLI 多轮对话时账号切换导致的 400 错误
- 提高粘性会话的稳定性和容错能力
- 不影响其他客户端(Claude Code 等)的会话标识生成

## 测试

- 单元测试:go test -tags=unit ./internal/handler -run TestExtractGeminiCLISessionHash
- 单元测试:go test -tags=unit ./internal/handler -run TestGeminiCLITmpDirRegex
- 编译验证:go build ./cmd/server
2026-01-26 04:40:38 +08:00