Commit Graph

808 Commits

Author SHA1 Message Date
yangjianbo
bb5a5dd65e test: 完善自动化测试体系(7个模块,73个任务)
系统性地修复、补充和强化项目的自动化测试能力:

1. 测试基础设施修复
   - 修复 stubConcurrencyCache 缺失方法和构造函数参数不匹配
   - 创建 testutil 共享包(stubs.go, fixtures.go, httptest.go)
   - 为所有 Stub 添加编译期接口断言

2. 中间件测试补充
   - 新增 JWT 认证中间件测试(有效/过期/篡改/缺失 Token)
   - 补充 rate_limiter 和 recovery 中间件测试场景

3. 网关核心路径测试
   - 新增账户选择、等待队列、流式响应、并发控制、计费、Claude Code 检测测试
   - 覆盖负载均衡、粘性会话、SSE 转发、槽位管理等关键逻辑

4. 前端测试体系(11个新测试文件,163个测试用例)
   - Pinia stores: auth, app, subscriptions
   - API client: 请求拦截器、响应拦截器、401 刷新
   - Router guards: 认证重定向、管理员权限、简易模式限制
   - Composables: useForm, useTableLoader, useClipboard
   - Components: LoginForm, ApiKeyCreate, Dashboard

5. CI/CD 流水线重构
   - 重构 backend-ci.yml 为统一的 ci.yml
   - 前后端 4 个并行 Job + Postgres/Redis services
   - Race 检测、覆盖率收集与门禁、Docker 构建验证

6. E2E 自动化测试
   - e2e-test.sh 自动化脚本(Docker 启动→健康检查→测试→清理)
   - 用户注册→登录→API Key→网关调用完整链路测试
   - Mock 模式和 API Key 脱敏支持

7. 修复预存问题
   - tlsfingerprint dialer_test.go 缺失 build tag 导致集成测试编译冲突

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 12:05:39 +08:00
erio
6ab77f5eb5 fix(upstream): passthrough response body directly instead of parsing SSE
ForwardUpstream/ForwardUpstreamGemini should pipe the upstream response
directly to the client (headers + body), not parse it as SSE stream.
2026-02-08 08:49:43 +08:00
erio
4f57d7f761 fix: add nil guard for gin.Context in header passthrough to satisfy staticcheck SA5011 2026-02-08 08:36:35 +08:00
erio
1563bd3dda feat(upstream): passthrough all client headers instead of manual header setting
Replace manual header setting (Content-Type, anthropic-version, anthropic-beta)
with full client header passthrough in ForwardUpstream/ForwardUpstreamGemini.
Only authentication headers (Authorization, x-api-key) are overridden with
upstream account credentials. Hop-by-hop headers are excluded.

Add unit tests covering header passthrough, auth override, and hop-by-hop filtering.
2026-02-08 08:33:09 +08:00
erio
77b66653ed fix(gateway): restore upstream account forwarding with dedicated methods
v0.1.74 merged upstream accounts into the OAuth path, causing requests
to hit the wrong protocol and endpoint. Add three upstream-specific
methods (testUpstreamConnection, ForwardUpstream, ForwardUpstreamGemini)
that use base_url + apiKey auth and passthrough the original body, while
reusing the existing response handling and error/retry logic.
2026-02-08 01:21:02 +08:00
yangjianbo
53e1c8b268 perf(日志): 降噪优化,将常规成功日志降级为 Debug 级别
- GIN Logger 中间件跳过 /health 和 /setup/status 的请求日志
- UsageCleanup 空闲轮询(no_task)日志降级为 slog.Debug
- Scheduler 常规 rebuild ok 日志降级为 slog.Debug
- DashboardAggregation 常规聚合完成日志降级为 slog.Debug
- TokenRefresh 无刷新活动时周期日志降级为 slog.Debug

生产环境(Info 级别)下自动静默,debug 模式下仍可见。
错误、警告类日志保持原有级别不变。

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 23:29:24 +08:00
yangjianbo
00caf0bcd8 test: 为代码审核修复添加详细单元测试(7个测试文件,50+测试用例)
新增测试文件:
- cors_test.go: CORS 条件化头部测试(12个测试,覆盖白名单/黑名单/通配符/凭证/多源/Vary)
- gateway_helper_backoff_test.go: nextBackoff 退避测试(6个测试+基准,验证指数增长/边界/抖动/收敛)
- billing_cache_jitter_test.go: jitteredTTL 抖动测试(5个测试+基准,验证范围/上界/方差/均值)
- subscription_calculate_progress_test.go: calculateProgress 纯函数测试(9个测试,覆盖日/周/月限额/超限截断/过期)
- openai_gateway_handler_test.go: SSE JSON 转义测试(7个子用例,验证双引号/反斜杠/换行符安全)

更新测试文件:
- response_transformer_test.go: 增强 generateRandomID 测试(7个测试,含并发/字符集/降级计数器)
- security_headers_test.go: 适配 GenerateNonce 新签名
- api_key_auth_test.go: 适配 NewSubscriptionService 新参数

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 22:14:07 +08:00
yangjianbo
9634494ba9 fix: 修复代码审核发现的10个问题(P0安全+P1数据一致性+P2性能优化)
P0: OpenAI SSE 错误消息 JSON 注入 — 使用 json.Marshal 替代 fmt.Sprintf
P1: subscription 续期包裹 Ent 事务确保原子性
P1: CSP nonce 生成处理 crypto/rand 错误,失败降级为 unsafe-inline
P1: singleflight 透传数据库真实错误,不再吞没为 not found
P1: GetUserSubscriptionsWithProgress 提取 calculateProgress 消除 N+1
P2: billing_cache/gateway_helper 迁移到 math/rand/v2 消除全局锁争用
P2: generateRandomID 降级分支增加原子计数器防碰撞
P2: CORS 非白名单 origin 不再设置 Allow-Headers/Methods/Max-Age
P2: Turnstile 验证移除 VerifyCode 空值跳过条件防绕过
P2: Redis Cluster Lua 脚本空 KEYS 添加兼容性警告注释

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 22:13:45 +08:00
yangjianbo
4a20a2a8ba fix: 修复批量更新凭证明细与缓存TTL抖动
- BatchUpdateCredentials 返回 success/failed/results 及 success_ids/failed_ids

- billing jitteredTTL 改为只减不增,确保TTL不超上界

- crypto/rand 失败时随机ID降级避免 panic

- OpenAI SelectAccount 失败日志去重并补充字段

- 修复两处类型断言以通过 errcheck
2026-02-07 21:18:03 +08:00
yangjianbo
fd43be8d0b merge: 合并 main 分支到 test,解决 config 和 modelWhitelist 冲突
- config.go: 保留 Sora 配置,合入 SubscriptionCache 配置
- useModelWhitelist.ts: 同时保留 soraModels 和 antigravityModels

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 20:18:07 +08:00
yangjianbo
836ba14b70 fix: 修复函数签名变更后的调用参数不匹配
- handleUpstreamError 补齐新增的三个参数 (0, "", false)
- handleStreamingResponse 移除已删除的 nil 参数

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 20:05:29 +08:00
yangjianbo
a14dfb769a Merge branch 'dev-release' 2026-02-07 19:58:00 +08:00
yangjianbo
2588fa6a8f fix(audit): 第二批审计修复 — P0 生产 Bug、安全加固、性能优化、缓存一致性、代码质量
基于 backend-code-audit 审计报告,修复剩余 P0/P1/P2 共 34 项问题:

P0 生产 Bug:
- 修复 time.Since(time.Now()) 计时逻辑错误 (P0-03)
- generateRandomID 改用 crypto/rand 替代固定索引 (P0-04)
- IncrementQuotaUsed 重写为 Ent 原子操作消除 TOCTOU 竞态 (P0-05)

安全加固:
- gateway/openai handler 错误响应替换为泛化消息,防止内部信息泄露 (P1-14)
- usage_log_repo dateFormat 参数改用白名单映射,防止 SQL 注入 (P1-16)
- 默认配置安全加固:sslmode=prefer、response_headers=true、mode=release (P1-18/19, P2-15)

性能优化:
- gateway handler 循环内 defer 替换为显式 releaseWait 闭包 (P1-02)
- group_repo/promo_code_repo Count 前 Clone 查询避免状态污染 (P1-03)
- usage_log_repo 四个查询添加 LIMIT 10000 防止 OOM (P1-07)
- GetBatchUsageStats 添加时间范围参数,默认最近 30 天 (P1-10)
- ip.go CIDR 预编译为包级变量 (P1-11)
- BatchUpdateCredentials 重构为先验证后更新 (P1-13)

缓存一致性:
- billing_cache 添加 jitteredTTL 防止缓存雪崩 (P2-10)
- DeductUserBalance/UpdateSubscriptionUsage 错误传播修复 (P2-12)
- UserService.UpdateBalance 成功后异步失效 billingCache (P2-13)

代码质量:
- search 截断改为按 rune 处理,支持多字节字符 (P2-01)
- TLS Handshake 改为 HandshakeContext 支持 context 取消 (P2-07)
- CORS 预检添加 Access-Control-Max-Age: 86400 (P2-16)

测试覆盖:
- 新增 user_service_test.go(UpdateBalance 缓存失效 6 个用例)
- 新增 batch_update_credentials_test.go(fail-fast + 类型验证 7 个用例)
- 新增 response_transformer_test.go、ip_test.go、usage_log_repo_unit_test.go、search_truncate_test.go
- 集成测试:IncrementQuotaUsed 并发测试、billing_cache 错误传播测试
- config_test.go 补充 server.mode/sslmode 默认值断言

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 19:46:42 +08:00
erio
3077fd279d feat: smart retry max 1 attempt + clear sticky session on failure
- Change antigravitySmartRetryMaxAttempts from 3 to 1 to prevent
  repeated rate limiting and long waits
- Clear sticky session binding (DeleteSessionAccountID) after smart
  retry exhaustion, so subsequent requests don't hit the same
  rate-limited account
- Add flow diagrams to Forward/ForwardGemini doc comments
- Add comprehensive unit tests covering:
  - Sticky session cleared on retry failure (429, 503, network error)
  - Sticky session NOT cleared on retry success
  - Sticky session NOT cleared for non-sticky requests (empty hash)
  - Sticky session NOT cleared on long delay path (handled by handler)
  - Nil cache safety (no panic)
  - MaxAttempts constant verification
  - End-to-end retryLoop → switchError propagation with session clear
2026-02-07 19:30:58 +08:00
shaw
6aaa4aee6a fix: 收敛 Claude Code 探测拦截并补齐回归测试 2026-02-07 19:04:08 +08:00
erio
e3748da860 fix(lint): handle errcheck for strings.Builder.WriteString 2026-02-07 18:18:15 +08:00
erio
50a783ff01 feat: add Anthropic sticky session digest chain matching via Trie
The previous fallback (step 3) in GenerateSessionHash hashed system +
all messages together, producing a different hash each round as the
conversation grew ([a] -> [a,b] -> [a,b,c]). This made fallback sticky
sessions ineffective for multi-turn conversations.

Implement per-message Trie digest chain matching (reusing Gemini's Trie
infrastructure) so that the previous round's chain is always a prefix
of the current round's chain, enabling reliable session affinity.
2026-02-07 18:00:56 +08:00
shaw
1439eb39a9 fix(gateway): harden digest logging and align antigravity ops
- avoid panic by using safe UUID prefix truncation in Gemini digest fallback logs\n- remove unconditional Antigravity 429 full-body debug logs and honor log truncation config\n- align Antigravity quick preset mappings to opus 4.6-thinking targets only\n- restore scope rate-limit aggregation/output in ops availability stats
2026-02-07 17:12:15 +08:00
yangjianbo
8226a4ce4d perf(service): 优化 model 替换函数,用 gjson/sjson 替代全量 JSON 序列化
SSE 热路径中 replaceModelInSSELine 和 replaceModelInResponseBody 原来
使用 json.Unmarshal/Marshal 对每个事件做全量反序列化再序列化,现改为
gjson.Get/sjson.Set 精确字段操作,消除 O(n) 中间 map 分配,保持 JSON
字段顺序不变。涉及 OpenAIGatewayService 和 GatewayService 两个服务。

新增 23 个单元测试覆盖:顶层/嵌套 model 替换、不匹配跳过、空行/[DONE]/
非法 JSON 等边界情况。

Fixes: P1-08

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 17:09:55 +08:00
erio
e1a68497d6 refactor: simplify sticky session rate limit handling — switch immediately on any rate limit
Remove threshold-based waiting in both sticky session and antigravity
pre-check paths. When a model is rate-limited, immediately clear the
sticky session and switch accounts instead of waiting for short durations.
2026-02-07 17:06:49 +08:00
yangjianbo
a9e256ce8c fix(openai): 修复 usage 为空导致 panic(P0-02) 2026-02-07 16:15:30 +08:00
erio
fa28dcbf32 fix(test): update test calls to match method receivers on handleSmartRetry and antigravityRetryLoop 2026-02-07 16:05:09 +08:00
erio
2656320d04 fix(antigravity): fetch default mapping from API and sync Redis on rate limit
1. Frontend: replace hardcoded antigravityDefaultMappings with async
   fetch from GET /admin/accounts/antigravity/default-model-mapping,
   eliminating the duplicate data source that caused frontend/backend
   mapping inconsistency.

2. Backend: convert handleSmartRetry and antigravityRetryLoop from
   standalone functions to AntigravityGatewayService methods, enabling
   Redis cache sync (updateAccountModelRateLimitInCache) after both
   rate-limit write paths — long-delay branch and retry-exhausted branch.
2026-02-07 15:59:27 +08:00
erio
b4f6c4f9d5 style: fix gofmt formatting in gateway_service.go
Remove extra blank line that caused golangci-lint gofmt check to fail.
2026-02-07 14:51:20 +08:00
yangjianbo
0e514ed80b perf(middleware): 优化订阅模式认证中间件,5次串行调用降至2步同步+1步异步
- 为 GetActiveSubscription 添加 ristretto L1 缓存 + singleflight 防击穿
- 合并 ValidateSubscription + CheckUsageLimits 为纯内存 ValidateAndCheckLimits
- 窗口维护操作(激活/重置)异步化,不再阻塞首字节
- 缓存返回浅拷贝,避免并发 data race 和缓存污染
- 所有管理操作(分配/续期/撤销/扩展/窗口重置)同步失效 L1 缓存
- 新增 SubscriptionCacheConfig 可配置 L1 缓存大小/TTL/抖动

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 14:43:12 +08:00
erio
14c6c9321a refactor: remove unused IsAntigravityModelSupported function and its tests 2026-02-07 14:42:28 +08:00
erio
386126b1b2 test(antigravity): add missing unit tests for upstream and custom model_mapping
- Add GetAccessToken upstream branch tests (success/failure/empty/nil)
- Add mapAntigravityModel wildcard-target-equals-request edge case tests
- Add upstream account smart retry test case
- Add GeminiMessagesCompatService custom model_mapping and empty model tests
2026-02-07 14:39:25 +08:00
erio
de0927289e fix(antigravity): support upstream accounts and custom model_mapping in scheduling
- GetAccessToken: add upstream branch to read api_key from credentials
- shouldTriggerAntigravitySmartRetry: relax check from IsOAuth to Platform-based
- isModelSupportedByAccount/WithContext: replace IsAntigravityModelSupported
  whitelist with mapAntigravityModel for unified scheduling/forwarding logic
- mapAntigravityModel: fix edge case where wildcard target equals request model
- Update tests for new behavior and add custom model_mapping test cases
2026-02-07 14:32:08 +08:00
erio
edb0937024 fix: restore non-failover error passthrough from 7b156489 2026-02-07 14:24:55 +08:00
erio
43a4840daf fix: restore error passthrough service improvements from 7b156489 2026-02-07 14:16:19 +08:00
erio
5e98445b22 feat(antigravity): comprehensive enhancements - model mapping, rate limiting, scheduling & ops
Key changes:
- Upgrade model mapping: Opus 4.5 → Opus 4.6-thinking with precise matching
- Unified rate limiting: scope-level → model-level with Redis snapshot sync
- Load-balanced scheduling by call count with smart retry mechanism
- Force cache billing support
- Model identity injection in prompts with leak prevention
- Thinking mode auto-handling (max_tokens/budget_tokens fix)
- Frontend: whitelist mode toggle, model mapping validation, status indicators
- Gemini session fallback with Redis Trie O(L) matching
- Ops: enhanced concurrency monitoring, account availability, retry logic
- Migration scripts: 049-051 for model mapping unification
2026-02-07 12:31:10 +08:00
erio
8917afab2a fix(antigravity): reduce 429 fallback cooldown from 5min to 30s
The default fallback cooldown when rate limit reset time cannot be
parsed was 5 minutes, which is too aggressive and causes accounts
to be unnecessarily locked out. Reduce to 30 seconds for faster
recovery. Config override still works (unit remains minutes).
2026-02-07 11:54:00 +08:00
shaw
39a5b17d31 fix: 账号测试根据类型使用不同的 beta header
- OAuth 账号:使用完整的 DefaultBetaHeader 和 Claude Code 客户端 headers
- API Key 账号:使用 APIKeyBetaHeader(不含 oauth beta)
2026-02-07 11:33:06 +08:00
shaw
5299f3dcf6 fix: ix: antigravity 添加 aude-opus-4-6-thinking 模型支持 2026-02-07 10:38:10 +08:00
shaw
7b1564898b fix: make error passthrough effective for non-failover upstream errors 2026-02-07 10:25:56 +08:00
yangjianbo
4e01126ff2 test(codex): 清理无用的 opencode 缓存测试
移除不再需要的 setupCodexCache 调用与辅助函数(已不再回源/读写缓存)
2026-02-07 09:53:01 +08:00
yangjianbo
55b56328da feat(codex): 移除 opencode 指令回源与缓存
- 不再从 GitHub 拉取 opencode codex_header.txt\n- 删除 ~/.opencode 缓存与异步刷新逻辑\n- 所有 instructions 统一使用内置 codex_cli_instructions.md
2026-02-07 09:28:32 +08:00
yangjianbo
ce764bf2d9 feat(gateway): 支持强制 Codex CLI 模式并伪装 UA
- Codex CLI 请求仅使用内置 instructions,不再读取 opencode 缓存/回源\n- 新增 gateway.force_codex_cli(环境变量 GATEWAY_FORCE_CODEX_CLI)\n- ForceCodexCLI=true 时转发上游强制 User-Agent=codex_cli_rs/0.0.0\n- 更新 deploy 示例配置
2026-02-07 09:21:15 +08:00
yangjianbo
d71537d431 perf(service): SSE Scanner buffer 改用 sync.Pool 复用,减少高并发 GC 压力
将流式响应中 bufio.Scanner 的 64KB buffer 从每次 make 分配改为
sync.Pool 复用,统一切片表达式为 [:0]、变量命名为 scanBuf,
并补充对应的单元测试。

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 22:55:12 +08:00
yangjianbo
ae1ba45350 perf(service): jitterTTL 改用 rand/v2 并移除锁 2026-02-06 21:22:38 +08:00
yangjianbo
c4182f8c33 perf(service): 移除 jitter 随机数全局锁 2026-02-06 21:20:25 +08:00
yangjianbo
2d4bbbf49d feat: 优化codex冷启动, 还有连接池数据库配置信息 2026-02-06 20:31:42 +08:00
shaw
9f4c1ef9f9 fix(ops): 添加 token 相关字段白名单避免误脱敏
在敏感字段检测中添加白名单,排除 API 参数和用量统计字段:
- max_tokens, max_completion_tokens, max_output_tokens
- completion_tokens, prompt_tokens, total_tokens
- input_tokens, output_tokens
- cache_creation_input_tokens, cache_read_input_tokens

这些字段名虽然包含 "token" 但只是数值参数,不应被脱敏处理。
2026-02-06 19:47:14 +08:00
Wesley Liddick
a381910e86 Merge pull request #489 from LLLLLLiulei/feat/import-export-bundle
feat: implement account & proxy import/export with migration-ready JSON bundles
2026-02-06 16:29:52 +08:00
shaw
d182ef0391 fix(gateway): 移除 PR #316 引入的工具名转换逻辑
移除响应阶段的工具名/schema/description 转换逻辑,修复第三方工具调用时
工具名被错误转换的问题(如 Task → task)。

移除内容:
- 工具名相关正则变量(toolPrefixRe, toolNameBoundaryRe 等)
- openCodeToolOverrides 和 claudeToolNameOverrides 映射表
- 工具名转换函数(normalizeToolNameForClaude, normalizeToolNameForOpenCode 等)
- 响应体工具名替换函数(replaceToolNamesInText, replaceToolNamesInResponseBody 等)
- 参数名转换函数(normalizeParamNameForOpenCode, rewriteParamKeysInValue)
- 工具描述清理函数(sanitizeToolDescription)
- 输入 schema 转换函数(normalizeToolInputSchema)
- 模型 ID 正则替换函数(replaceModelIDInText)

保留内容:
- 系统提示词清理(sanitizeSystemText)
- Claude Code 指纹 headers 处理
- 模型 ID 映射(通过 JSON 对象操作)
2026-02-06 16:09:58 +08:00
LLLLLLiulei
7319122e92 merge upstream/main 2026-02-06 11:33:45 +08:00
yangjianbo
792bef615c Merge branch 'main' into test 2026-02-06 09:59:15 +08:00
yangjianbo
ee01f80dc1 test(backend): 修复 usage 类型断言未检查 2026-02-06 09:54:29 +08:00
yangjianbo
98671a73f4 Merge branch 'main' of https://github.com/mt21625457/aicodex2api
# Conflicts:
#	backend/internal/service/gateway_cached_tokens_test.go
2026-02-06 09:35:46 +08:00
yangjianbo
f33a950103 fix(兼容): 将 Kimi cached_tokens 映射到 Claude 标准 cache_read_input_tokens
Kimi 等 Claude 兼容 API 返回缓存信息使用 OpenAI 风格的 cached_tokens 字段,
而非 Claude 标准的 cache_read_input_tokens,导致客户端收不到缓存命中信息且
内部计费缓存折扣为 0。

新增 reconcileCachedTokens 辅助函数,在 cache_read_input_tokens == 0 且
cached_tokens > 0 时自动填充,覆盖流式(message_start/message_delta)和
非流式两种响应路径。对 Claude 原生上游无影响。

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 09:27:42 +08:00