Edric Li
378e476e48
fix: 修复 CI 检查失败
...
- gofmt: 修复 error_passthrough_service.go 格式问题
- errcheck: 修复 error_passthrough_runtime_test.go 类型断言未检查
- staticcheck: if-else 改为 switch (gateway_service.go)
- test: 修复两个测试用例错误使用 MODEL_CAPACITY_EXHAUSTED 导致走错路径
2026-02-10 22:08:49 +08:00
Edric Li
2a1067c82b
Merge remote-tracking branch 'upstream/main'
2026-02-10 21:52:33 +08:00
Edric Li
a54b81cf74
perf: 错误处理性能优化
...
- MatchRule 延迟/限制 body ToLower,先用 statusCode 短路,只在需要关键词匹配时转换且限制 8KB
- 预计算规则的小写关键词/平台和 error code set,消除运行时重复 ToLower 和线性扫描
- MODEL_CAPACITY_EXHAUSTED 全局去重,避免并发请求重复重试同一模型
- 503 重试 body 读取限制从 2MB 降至 8KB
- time.After 替换为 time.NewTimer,防止 context 取消时 timer 泄漏
2026-02-10 21:40:31 +08:00
Edric Li
2d4236f76e
fix: 修复错误透传规则 skip_monitoring 未生效的问题
...
- ops_error_logger: status < 400 分支增加 OpsSkipPassthroughKey 检查
- ops_upstream_context: 新增 checkSkipMonitoringForUpstreamEvent,中间重试/故障转移事件也能触发跳过标记
- gateway_handler/openai_gateway_handler/gemini_v1beta_handler: handleFailoverExhausted 匹配规则后设置 OpsSkipPassthroughKey
- antigravity_gateway_service: writeMappedClaudeError 增加 applyErrorPassthroughRule 调用
2026-02-10 20:56:01 +08:00
yangjianbo
166080b29c
chore: 更新版本号至 0.1.74.3
2026-02-10 18:02:02 +08:00
yangjianbo
3b0910f664
Merge branch 'main' into test-sora
2026-02-10 18:01:17 +08:00
yangjianbo
e489996713
test(backend): 补充改动代码单元测试覆盖率至 85%+
...
新增 48 个测试用例覆盖修复代码的各分支路径:
- subscription_maintenance_queue: nil receiver/task、Stop 幂等、零值参数 (+6)
- billing_service: CalculateCostWithConfig、错误传播、SoraImageCost 等 (+12)
- timing_wheel_service: Schedule/ScheduleRecurring after Stop (+3)
- sora_media_cleanup_service: nil guard、Start/Stop 各分支、timezone (+10)
- sora_gateway_service: normalizeSoraMediaURLs、buildSoraContent 等辅助函数 (+17)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com >
2026-02-10 17:52:10 +08:00
yangjianbo
54fe363257
fix(backend): 修复代码审核发现的 8 个确认问题
...
- P0-1: subscription_maintenance_queue 使用 RWMutex 防止 channel close/send 竞态
- P0-2: billing_service CalculateCostWithLongContext 修复被吞没的 out-range 错误
- P1-1: timing_wheel_service Schedule/ScheduleRecurring 添加 SetTimer 错误日志
- P1-2: sora_gateway_service StoreFromURLs 失败时降级使用原始 URL
- P1-3: concurrency_cache 用 Pipeline 替代 Lua 脚本兼容 Redis Cluster
- P1-6: sora_media_cleanup_service runCleanup 添加 nil cfg/storage 防护
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com >
2026-02-10 17:51:49 +08:00
song
b161312183
test(antigravity): 更新单URL策略下的重试断言
2026-02-10 14:36:09 +08:00
程序猿MT
1dd3158c7e
Merge branch 'Wei-Shaw:main' into main
2026-02-10 13:55:51 +08:00
song
1f647b120a
feat(antigravity): 转发与测试支持daily/prod单URL切换
2026-02-10 13:51:29 +08:00
Edric Li
7d0a30fa8f
merge: sync upstream main (antigravity single-account 503 retry)
...
合并上游新增的 Antigravity 单账号 503 退避重试机制,
解决与本地 MODEL_CAPACITY_EXHAUSTED 逻辑的冲突,两者共存。
2026-02-10 12:00:21 +08:00
Edric Li
d95e04fd1f
feat: 错误透传规则支持 skip_monitoring 跳过运维监控记录
...
在每条错误透传规则上新增 skip_monitoring 选项,开启后匹配该规则的错误
不会被记录到 ops_error_logs,减少监控噪音。默认关闭,不影响现有规则。
2026-02-10 11:42:39 +08:00
shaw
5dd83d3cf2
fix: 移除特定system以适配新版cc客户端缓存失效的bug
2026-02-10 10:28:34 +08:00
Wesley Liddick
14e1aac9b5
Merge pull request #533 from GuangYiDing/feat/antigravity-single-account-503-retry
...
feat: Antigravity 单账号分组 503 退避重试机制
2026-02-10 09:59:48 +08:00
yangjianbo
5d1c51a37f
fix(handler): 修复 gjson 迁移后的请求校验语义回退
...
- OpenAI handler: 添加 gjson.ValidBytes 校验 JSON 合法性;model 校验改为
检查 gjson.String 类型而非仅判断非空(拒绝 model:123 等非法类型);stream
字段添加 True/False 类型检查;sjson.SetBytes 返回值显式处理错误
- Sora handler: 添加 gjson.ValidBytes 校验;model 校验同上改为类型检查;
messages 校验从 Exists+Type==JSON 改为 IsArray+len>0(拒绝空数组和对象)
- 补充 TestOpenAIHandler_GjsonValidation 和更新 TestSoraHandler_ValidationExtraction
覆盖新增的边界校验场景
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com >
2026-02-10 09:13:20 +08:00
yangjianbo
58912d4ac5
perf(backend): 使用 gjson/sjson 优化热路径 JSON 处理
...
将 API 网关热路径中的 json.Unmarshal+json.Marshal 替换为 gjson 零拷贝查询和 sjson 精准写入:
- unwrapV1InternalResponse 性能提升 22x(4009ns→182ns),内存分配减少 28.5x
- unwrapGeminiResponse、extractGeminiUsage、estimateGeminiCountTokens、ParseGeminiRateLimitResetTime 改为接收 []byte 使用 gjson 提取
- ParseGatewayRequest 的 model/stream/metadata/thinking/max_tokens 改用 gjson 类型安全提取
- Handler 层(sora/openai)改用 gjson 提取字段、sjson 注入/修改字段,移除 map[string]any 中间变量
- Sora Client 响应解析改用 gjson ForEach 遍历,减少内存分配
- 新增约 100 个单元测试用例,所有改动函数覆盖率 >85%
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com >
2026-02-10 08:59:30 +08:00
Edric Li
6114f69cca
feat: MODEL_CAPACITY_EXHAUSTED 使用固定1s间隔重试60次,不切换账号
...
MODEL_CAPACITY_EXHAUSTED (503) 表示模型容量不足,所有账号共享同一容量池,
切换账号无意义。改为固定1s间隔重试最多60次,重试耗尽后直接返回上游错误。
- 新增 antigravityModelCapacityRetryMaxAttempts=60 和 antigravityModelCapacityRetryWait=1s
- shouldTriggerAntigravitySmartRetry 新增 isModelCapacityExhausted 返回值
- handleSmartRetry 对 MODEL_CAPACITY_EXHAUSTED 使用独立重试策略
- handleModelRateLimit 对 MODEL_CAPACITY_EXHAUSTED 仅标记 Handled,不设限流
- 重试耗尽后不设置模型限流、不清除粘性会话、不切换账号
2026-02-10 02:03:06 +08:00
Edric Li
d6c2921f2b
feat: same-account retry before failover for transient errors
...
For retryable transient errors (Google 400 "invalid project resource name"
and empty stream responses), retry on the same account up to 2 times
(with 500ms delay) before switching to another account.
- Add RetryableOnSameAccount field to UpstreamFailoverError
- Add same-account retry loop in both Gemini and Claude/OpenAI handler paths
- Move temp-unschedule from service layer to handler layer (only after
all same-account retries exhausted)
- Reduce temp-unschedule cooldown from 30 minutes to 1 minute
2026-02-10 00:53:54 +08:00
yangjianbo
29ca1290b3
chore(test): 清理测试用例与类型导入
2026-02-10 00:37:56 +08:00
yangjianbo
3fcb0cc37c
feat(subscription): 有界队列执行维护并改进鉴权解析
2026-02-10 00:37:47 +08:00
Edric Li
61c73287dc
feat: failover and temp-unschedule on empty stream response
...
- Empty stream responses now return UpstreamFailoverError instead of
plain 502, triggering automatic account switching (up to 10 retries)
- Add tempUnscheduleEmptyResponse: accounts returning empty responses
are temp-unscheduled for 30 minutes
- Apply to both Claude and Gemini non-streaming paths
- Align googleConfigErrorCooldown from 60m to 30m for consistency
2026-02-09 23:25:30 +08:00
Edric Li
89905ec43d
feat: failover and temp-unschedule on Google "Invalid project resource name" 400
...
Google 后端间歇性返回 400 "Invalid project resource name" 错误,
此前该错误直接透传给客户端且不触发账号切换,导致请求失败。
- 在 Antigravity 和 Gemini 两个平台的所有转发路径中,
精确匹配该错误消息后触发 failover 自动换号重试
- 命中后将账号临时封禁 1 小时,避免反复调度到同一故障账号
- 提取共享函数 isGoogleProjectConfigError / tempUnscheduleGoogleConfigError
消除跨 Service 的代码重复
2026-02-09 22:48:32 +08:00
Rose Ding
e4bc35151f
test: 添加单账号 503 退避重试机制的单元测试
...
覆盖 Service 层和 Handler 层的所有新增逻辑:
- isSingleAccountRetry context 标记检查
- handleSmartRetry 中 503 + SingleAccountRetry 分支
- handleSingleAccountRetryInPlace 原地重试逻辑
- antigravityRetryLoop 预检查跳过限流
- sleepAntigravitySingleAccountBackoff 固定延迟退避
- 端到端集成场景验证
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com >
2026-02-09 22:06:06 +08:00
yangjianbo
2bfb16291f
fix(unit): 修复 unit tag 测试编译与账号选择用例
2026-02-09 21:35:41 +08:00
Wesley Liddick
56da498b7e
Merge pull request #532 from touwaeriol/fix/clear-model-rate-limits
...
fix: support clearing model-level rate limits from action menu and temp-unsched reset
2026-02-09 20:52:44 +08:00
yangjianbo
d367d1cde6
Merge branch 'main' into test-sora
2026-02-09 20:40:09 +08:00
erio
4a84ca9a02
fix: support clearing model-level rate limits from action menu and temp-unsched reset
2026-02-09 20:37:30 +08:00
yangjianbo
16131c3d3f
Merge branch 'main' of https://github.com/mt21625457/aicodex2api
2026-02-09 20:26:03 +08:00
erio
a70d37a676
fix: Gemini error policy check should precede retry logic
2026-02-09 19:55:17 +08:00
erio
6892e84ad2
fix: skip rate limiting when custom error codes don't match upstream status
...
Add ShouldHandleErrorCode guard at the entry of handleGeminiUpstreamError
and AntigravityGatewayService.handleUpstreamError so that accounts with
custom error codes (e.g. [599]) are not rate-limited when the upstream
returns a non-matching status (e.g. 429).
2026-02-09 19:55:05 +08:00
erio
73f455745c
feat: ErrorPolicySkipped returns 500 instead of upstream status code
...
When custom error codes are enabled and the upstream error code is NOT
in the configured list, return HTTP 500 to the client instead of
transparently forwarding the original status code.
Also adds integration test TestCustomErrorCode599 verifying that 429,
500, 503, 401, 403 all return 500 without triggering SetRateLimited
or SetError.
2026-02-09 19:54:54 +08:00
Rose Ding
021abfca18
fix: 单账号分组首次 503 不设模型限流标记,避免后续请求雪崩
...
单账号 antigravity 分组收到 503 (MODEL_CAPACITY_EXHAUSTED) 时,
原逻辑会设置 ~29s 模型限流标记。由于只有一个账号无法切换,
后续所有新请求在预检查时命中限流 → 几毫秒内直接返回 503,
导致约 30 秒的雪崩窗口。
修复:在 Handler 入口处检查分组是否只有单个 antigravity 账号,
如果是则提前设置 SingleAccountRetry context 标记,让 Service 层
首次 503 就走原地重试逻辑(不设限流标记),避免污染后续请求。
2026-02-09 17:25:36 +08:00
Rose Ding
f6cfab9901
feat: 添加 Antigravity 单账号 503 退避重试机制
...
当分组内只有一个可用账号且上游返回 503 (MODEL_CAPACITY_EXHAUSTED) 时,
不再设置模型限流+切换账号(因为切换回来还是同一个账号),而是在 Service 层
原地等待+重试,避免双重等待问题。
主要变更:
- Handler 层:检测单账号 503 场景,清除排除列表并设置 SingleAccountRetry 标记
- Service 层:新增 handleSingleAccountRetryInPlace 原地重试逻辑
- Service 层:预检查跳过单账号模式下的限流检查
- 新增 ctxkey.SingleAccountRetry 上下文标记
2026-02-09 14:26:01 +08:00
shaw
51572b5da0
chore: update version
2026-02-09 12:00:03 +08:00
QTom
04cedce9a1
test: 为 stubAccountRepo 添加 ListCRSAccountIDs 方法实现
2026-02-09 11:40:37 +08:00
QTom
5e0d789440
feat(admin): 新增 CRS 同步预览和账号选择功能
...
- 后端新增 PreviewFromCRS 接口,允许用户先预览 CRS 中的账号
- 后端支持在同步时选择特定账号,不选中的账号将被跳过
- 前端重构 SyncFromCrsModal 为三步向导:输入凭据 → 预览账号 → 执行同步
- 改进表单无障碍性:添加 for/id 关联和 required 属性
- 修复 Back 按钮返回时的状态清理
- 新增 buildSelectedSet 和 shouldCreateAccount 的单元测试
- 完整的向后兼容性:旧客户端不发送 selected_account_ids 时行为不变
2026-02-09 10:39:09 +08:00
yangjianbo
d7011163b8
fix: 修复代码审核发现的安全和质量问题
...
安全修复(P0):
- 移除硬编码的 OAuth client_secret(Antigravity、Gemini CLI),
改为通过环境变量注入(ANTIGRAVITY_OAUTH_CLIENT_SECRET、
GEMINI_CLI_OAUTH_CLIENT_SECRET)
- 新增 logredact.RedactText() 对非结构化文本做敏感信息脱敏,
覆盖 GOCSPX-*/AIza* 令牌和常见 key=value 模式
- 日志中不再打印 org_uuid、account_uuid、email_address 等敏感值
安全修复(P1):
- URL 验证增强:新增 ValidateHTTPURL 统一入口,支持 allowlist 和
私网地址阻断(localhost/内网 IP)
- 代理回退安全:代理初始化失败时默认阻止直连回退,防止 IP 泄露,
可通过 security.proxy_fallback.allow_direct_on_error 显式开启
- Gemini OAuth 配置校验:client_id 与 client_secret 必须同时
设置或同时留空
其他改进:
- 新增 tools/secret_scan.py 密钥扫描工具和 Makefile secret-scan 目标
- 更新所有 docker-compose 和部署配置,传递 OAuth secret 环境变量
- google_one OAuth 类型使用固定 redirectURI,与 code_assist 对齐
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com >
2026-02-09 09:58:13 +08:00
yangjianbo
fc8a39e0f5
test: 删除CI工作流,大幅提升后端单元测试覆盖率至50%+
...
删除因GitHub计费锁定而失败的CI工作流。
为6个核心Go源文件补充单元测试,全部达到50%以上覆盖率:
- response/response.go: 97.6%
- antigravity/oauth.go: 90.1%
- antigravity/client.go: 88.6% (新增27个HTTP客户端测试)
- geminicli/oauth.go: 91.8%
- service/oauth_service.go: 61.2%
- service/gemini_oauth_service.go: 51.9%
新增/增强8个测试文件,共计5600+行测试代码。
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com >
2026-02-09 09:07:58 +08:00
erio
9a479d1b55
fix: resolve CI failures from scope removal refactor
...
- Fix gofmt alignment in ops_realtime_models.go
- Remove SetAntigravityQuotaScopeLimit mock from api_contract_test.go
- Add UpdateSortOrders mock to mockGroupRepoForGateway
2026-02-09 08:27:14 +08:00
erio
fc095bf054
refactor: replace scope-level rate limiting with model-level rate limiting
...
Merge functional changes from develop branch:
- Remove AntigravityQuotaScope system (claude/gemini_text/gemini_image)
- Replace with per-model rate limiting using resolveAntigravityModelKey
- Remove model load statistics (IncrModelCallCount/GetModelLoadBatch)
- Simplify account selection to unified priority→load→LRU algorithm
- Remove SetAntigravityQuotaScopeLimit from AccountRepository
- Clean up scope-related UI indicators and API fields
2026-02-09 08:19:01 +08:00
erio
1af06aed96
feat: shuffle accounts within same sort group to prevent thundering herd
...
Add post-sort shuffle for accounts with identical (priority, loadRate,
lastUsedAt) to break deterministic ordering when concurrent requests
read the same scheduler snapshot. Applies to both Antigravity and
OpenAI scheduling paths, plus the sortAccountsByPriorityAndLastUsed
helper.
Keeps upstream CallCount/ModelLoadInfo scheduling intact; shuffle is
additive and only randomises within equivalent-rank groups.
2026-02-09 07:33:17 +08:00
erio
9236936a55
feat: route AccountTypeUpstream to ForwardUpstream in Forward() entry
...
Without this routing guard, ForwardUpstream is never called because
Forward() always proceeds with the standard OAuth/cookie flow.
2026-02-09 07:27:10 +08:00
erio
125152460f
fix: use upstream retryDelay for rate limit duration instead of fixed default
...
- In handleSmartRetry, use the actual upstream retryDelay to set model
rate limit duration instead of always using the 30s default
- Return info.RetryDelay from shouldTriggerAntigravitySmartRetry when
shouldRateLimitModel=true, so callers know the actual delay
- Extract getDefaultRateLimitDuration() and resolveResetTime() helpers
to reduce duplication in handleUpstreamError 429 handling
- Improve debug logging with upstream_retry_delay and response body
2026-02-09 07:11:29 +08:00
erio
6d90fb0bc3
feat: detect client disconnect during streaming and continue draining upstream for billing
2026-02-09 07:06:26 +08:00
erio
b889d5017b
refactor: replace Trie-based digest session store with flat cache
2026-02-09 07:02:12 +08:00
erio
72b08f9cc5
fix: ensure sticky session failover triggers cache billing exemption
2026-02-09 06:57:07 +08:00
erio
681950dadd
feat: add linear delay between Antigravity account failover switches
2026-02-09 06:56:29 +08:00
erio
a67d9337b8
feat: integrate CheckErrorPolicy into Gemini error handling paths
2026-02-09 06:55:45 +08:00
erio
2f1182e8a9
feat: unified error policy for Antigravity + enable custom error codes for Gemini accounts
2026-02-09 06:54:42 +08:00