yangjianbo
abf5de69fb
Merge branch 'main' into test
2026-02-12 23:43:47 +08:00
程序猿MT
174d7c774d
Merge branch 'Wei-Shaw:main' into main
2026-02-12 23:12:41 +08:00
sususu98
d21d70a5cf
fix: include Gemini thoughtsTokenCount in output token billing
...
Gemini 2.5 Pro/Flash thinking models return thoughtsTokenCount separately
from candidatesTokenCount in usageMetadata, but this field was not parsed
or included in billing calculations, causing thinking tokens to be
unbilled.
- Add ThoughtsTokenCount field to GeminiUsageMetadata struct
- Include thoughtsTokenCount in OutputTokens across all 3 Gemini usage
parsing paths (non-streaming, streaming, compat layer)
- Add tests covering thinking token scenarios
Closes #554
2026-02-11 15:41:54 +08:00
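A minimal sketch of the billing change described in the commit above, assuming a simplified usage struct. The type `usageMetadata` and the helpers `parseUsage` and `billableOutputTokens` are hypothetical names for illustration; the repository's actual `GeminiUsageMetadata` struct and parsing paths may look different.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// usageMetadata is a hypothetical mirror of the JSON fields named in the
// commit message; it is not the repository's real struct.
type usageMetadata struct {
	PromptTokenCount     int `json:"promptTokenCount"`
	CandidatesTokenCount int `json:"candidatesTokenCount"`
	ThoughtsTokenCount   int `json:"thoughtsTokenCount"`
}

// parseUsage decodes the usageMetadata object from a response body.
func parseUsage(body []byte) (usageMetadata, error) {
	var resp struct {
		UsageMetadata usageMetadata `json:"usageMetadata"`
	}
	err := json.Unmarshal(body, &resp)
	return resp.UsageMetadata, err
}

// billableOutputTokens counts thinking tokens as output tokens, so they are
// no longer left out of billing.
func billableOutputTokens(u usageMetadata) int {
	return u.CandidatesTokenCount + u.ThoughtsTokenCount
}

func main() {
	body := []byte(`{"usageMetadata":{"promptTokenCount":10,"candidatesTokenCount":20,"thoughtsTokenCount":30}}`)
	u, err := parseUsage(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(billableOutputTokens(u)) // prints 50
}
```

A response without `thoughtsTokenCount` decodes to zero for that field, so non-thinking models would be billed as before under this sketch.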
yangjianbo
58912d4ac5
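perf(backend): optimize hot-path JSON handling with gjson/sjson
...
Replace json.Unmarshal+json.Marshal in the API gateway hot path with gjson zero-copy queries and sjson targeted writes:
- unwrapV1InternalResponse is 22x faster (4009ns→182ns), with memory allocation reduced 28.5x
- unwrapGeminiResponse, extractGeminiUsage, estimateGeminiCountTokens, and ParseGeminiRateLimitResetTime now accept []byte and extract values with gjson
- ParseGatewayRequest now extracts model/stream/metadata/thinking/max_tokens with gjson type-safe accessors
- Handler layer (sora/openai) now uses gjson to extract fields and sjson to inject/modify fields, removing the map[string]any intermediate variables
- Sora Client response parsing now iterates with gjson ForEach, reducing memory allocations
- Add about 100 unit test cases; coverage of all changed functions is >85%
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>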
2026-02-10 08:59:30 +08:00
yangjianbo
16131c3d3f
Merge branch 'main' of https://github.com/mt21625457/aicodex2api
2026-02-09 20:26:03 +08:00
erio
6d90fb0bc3
feat: detect client disconnect during streaming and continue draining upstream for billing
2026-02-09 07:06:26 +08:00
yangjianbo
a14dfb769a
Merge branch 'dev-release'
2026-02-07 19:58:00 +08:00
erio
5e98445b22
feat(antigravity): comprehensive enhancements - model mapping, rate limiting, scheduling & ops
...
Key changes:
- Upgrade model mapping: Opus 4.5 → Opus 4.6-thinking with precise matching
- Unified rate limiting: scope-level → model-level with Redis snapshot sync
- Load-balanced scheduling by call count with smart retry mechanism
- Force cache billing support
- Model identity injection in prompts with leak prevention
- Thinking mode auto-handling (max_tokens/budget_tokens fix)
- Frontend: whitelist mode toggle, model mapping validation, status indicators
- Gemini session fallback with Redis Trie O(L) matching
- Ops: enhanced concurrency monitoring, account availability, retry logic
- Migration scripts: 049-051 for model mapping unification
2026-02-07 12:31:10 +08:00
yangjianbo
d71537d431
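perf(service): reuse the SSE Scanner buffer via sync.Pool to reduce GC pressure under high concurrency
...
Change the 64KB bufio.Scanner buffer used in streaming responses from a fresh make
allocation on every call to sync.Pool reuse, standardize the slice expression as [:0]
and the variable name as scanBuf, and add the corresponding unit tests.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>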
2026-02-06 22:55:12 +08:00
song
2220fd18ca
merge upstream main
2026-02-03 15:36:17 +08:00
song
f761afb1ef
antigravity: distinguish retry counts after switching
2026-01-28 00:01:03 +08:00
song
fd0370c07a
Add invalid-request fallback routing
2026-01-23 22:24:46 +08:00
IanShaw027
06216aad53
fix(backend): fix CI failures
...
Fixes:
1. Fix 6 golangci-lint errors
- 3 errcheck errors: add type assertion checks in gateway_request_test.go
- 3 gofmt formatting issues: fix code formatting
2. Fix API contract test failures
- Add the missing fields to the tests: enable_identity_patch and identity_patch_prompt
All tests and linter checks now pass.
2026-01-05 00:56:48 +08:00
IanShaw027
87426e5dda
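fix(backend): improve thinking/tool block signature handling and retry strategy
...
Main changes:
- request_transformer: when a thinking block lacks a signature, downgrade it to text instead of dropping it, preserving the content and disabling thinking mode at the upper layer
- antigravity_gateway_service: add a two-stage downgrade strategy that handles thinking blocks first and, if the request still fails with a tool signature error, further downgrades tool blocks
- gateway_request: add the FilterSignatureSensitiveBlocksForRetry function, which supports downgrading tool_use/tool_result to text
- gateway_request: improve FilterThinkingBlocksForRetry to disable the top-level thinking config and avoid structural constraint conflicts
- gateway_service: implement conservative two-stage retry logic that prioritizes preserving content and downgrades tool calls only when necessary
- Add antigravity_gateway_service_test.go to test the signature block stripping logic
- Update related test cases to verify the downgrade behavior
This fix resolves request failures caused by invalidated signatures on historical messages when switching across platforms/accounts.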
2026-01-04 22:32:36 +08:00
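The first stage of the downgrade described in the commit above (unsigned thinking blocks become text rather than being dropped) can be sketched as follows. The `block` type and `downgradeUnsignedThinking` are hypothetical simplifications, not the repository's `FilterThinkingBlocksForRetry`; the returned flag stands in for the signal the caller would use to disable thinking mode.

```go
package main

import "fmt"

// block is a simplified, hypothetical content block; real messages carry
// more fields than shown here.
type block struct {
	Type      string // "thinking" or "text" in this sketch
	Text      string
	Thinking  string
	Signature string
}

// downgradeUnsignedThinking converts thinking blocks that lack a signature
// into text blocks, keeping their content. It reports whether any block was
// changed, so the caller can decide to disable thinking mode for the retry.
func downgradeUnsignedThinking(blocks []block) ([]block, bool) {
	out := make([]block, 0, len(blocks))
	changed := false
	for _, b := range blocks {
		if b.Type == "thinking" && b.Signature == "" {
			out = append(out, block{Type: "text", Text: b.Thinking})
			changed = true
			continue
		}
		out = append(out, b)
	}
	return out, changed
}

func main() {
	blocks := []block{
		{Type: "thinking", Thinking: "plan"},
		{Type: "text", Text: "hi"},
	}
	out, changed := downgradeUnsignedThinking(blocks)
	fmt.Println(out[0].Type, out[0].Text, changed) // prints: text plan true
}
```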