Commit Graph

9 Commits

Author SHA1 Message Date
Elysia
697c41a3f6 fix: create fresh context per watermark write retry attempt
Each retry in the SetOutboxWatermark loop now gets its own 5s context.
Previously a shared context could already be expired when the second or
third attempt ran, making the retries pointless.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 20:41:40 +08:00
Elysia
e44baa1094 fix: fix outbox watermark context expiry and add in-batch group rebuild dedup
Fixes #1691

- pollOutbox() reused a 10s context for SetOutboxWatermark after event
  processing could take much longer, causing "outbox watermark write
  failed: context deadline exceeded". The watermark never advanced so
  the same 200 events were reprocessed every poll cycle, spiking CPU.
  Now uses an independent 5s context with up to 3 retries (200ms apart).

- When multiple Codex accounts sharing the same 21-22 groups are all
  rate-limited in quick succession, each account_changed event triggered
  redundant bucket rebuild attempts for the same groups. Introduce
  batchSeenKey{groupID, platform} and thread a seen map through the
  handler chain; rebuildBucketsForPlatform skips (group, platform) pairs
  already rebuilt within the same poll batch (~80% fewer rebuild calls in
  the 5-accounts-same-groups scenario).

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 19:09:40 +08:00
QTom
aeed2eb9ad feat(group-filter): 分组账号过滤控制 — require_oauth_only + require_privacy_set
为 OpenAI/Antigravity/Anthropic/Gemini 分组新增两个布尔控制字段:
- require_oauth_only: 创建/更新账号绑定分组时拒绝 apikey 类型加入
- require_privacy_set: 调度选号时跳过 privacy 未成功设置的账号并标记 error

后端:Ent schema 新增字段 + 迁移、Group CRUD 全链路透传、
      gateway_service 与 openai_account_scheduler 两套调度路径过滤
前端:创建/编辑表单 toggle 开关(OpenAI/Antigravity/Anthropic/Gemini 平台可见)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 13:04:55 +08:00
QTom
530a16291c fix(gateway): 分组隔离 — 禁止未分组账号被跨组调度
当 API Key 无分组时,调度仅从未分组账号池中选取。
修复 isAccountInGroup 在 groupID==nil 时的逻辑,
同时补全 scheduler_snapshot_service 和 gemini_compat_service
中的 SimpleMode 保护,确保分组隔离在所有调度路径生效。

新增 ListSchedulableUngroupedByPlatform/s 方法,
使用 Ent 的 Not(HasAccountGroups()) 谓词实现未分组账号隔离。
新增 17 个单元和端到端隔离测试,覆盖所有分支和边界条件。
2026-03-03 13:20:58 +08:00
yangjianbo
bb664d9bbf feat(sync): full code sync from release 2026-02-28 15:01:20 +08:00
yangjianbo
584cfc3db2 chore(logging): 完成后端日志审计与结构化迁移
- 将高密度服务与处理器日志迁移到新日志系统(LegacyPrintf/结构化日志)
- 增加 stdlog bridge 与兼容测试,保留旧日志捕获能力
- 将 OpenAI 断流告警改为结构化 Warn 并改造对应测试为 sink 捕获
- 补齐后端相关文件 logger 引用并通过全量 go test
2026-02-12 19:01:09 +08:00
yangjianbo
53e1c8b268 perf(日志): 降噪优化,将常规成功日志降级为 Debug 级别
- GIN Logger 中间件跳过 /health 和 /setup/status 的请求日志
- UsageCleanup 空闲轮询(no_task)日志降级为 slog.Debug
- Scheduler 常规 rebuild ok 日志降级为 slog.Debug
- DashboardAggregation 常规聚合完成日志降级为 slog.Debug
- TokenRefresh 无刷新活动时周期日志降级为 slog.Debug

生产环境(Info 级别)下自动静默,debug 模式下仍可见。
错误、警告类日志保持原有级别不变。

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 23:29:24 +08:00
erio
5e98445b22 feat(antigravity): comprehensive enhancements - model mapping, rate limiting, scheduling & ops
Key changes:
- Upgrade model mapping: Opus 4.5 → Opus 4.6-thinking with precise matching
- Unified rate limiting: scope-level → model-level with Redis snapshot sync
- Load-balanced scheduling by call count with smart retry mechanism
- Force cache billing support
- Model identity injection in prompts with leak prevention
- Thinking mode auto-handling (max_tokens/budget_tokens fix)
- Frontend: whitelist mode toggle, model mapping validation, status indicators
- Gemini session fallback with Redis Trie O(L) matching
- Ops: enhanced concurrency monitoring, account availability, retry logic
- Migration scripts: 049-051 for model mapping unification
2026-02-07 12:31:10 +08:00
yangjianbo
3141aa5144 feat(scheduler): 引入调度快照缓存与 outbox 回放
- 调度热路径优先读 Redis 快照,保留分组排序语义
- outbox 回放 + 全量重建纠偏,失败重试不推进水位
- 自动 Atlas 基线对齐并同步调度配置示例
2026-01-12 14:19:06 +08:00