Commit Graph

3316 Commits

Author SHA1 Message Date
alfadb
4c474616b9 fix(gateway): emit Anthropic-standard SSE error events and failover body
Two follow-ups to PR #2066's failover-wrap fix:

1. Failover ResponseBody (`UpstreamFailoverError.ResponseBody`) was encoded
   as `{"error": "<msg>"}` (string field). `ExtractUpstreamErrorMessage`
   probes for `error.message`, `detail`, or top-level `message` only — so
   `handleFailoverExhausted` and downstream passthrough rules saw an empty
   message, losing the EOF root cause in ops logs. Re-encode as the
   Anthropic standard shape `{"type":"error","error":{"type":"upstream_disconnected","message":"..."}}`.
   (Addresses the inline review comment from copilot-pull-request-reviewer
   on Wei-Shaw/sub2api#2066.)

2. The streaming `event: error` SSE frame for `response_too_large`,
   `stream_read_error`, and `stream_timeout` was non-standard
   (`{"error":"<reason>"}`). Anthropic SDKs (and Claude Code) expect
   `{"type":"error","error":{"type":"...","message":"..."}}` and parse
   `error.type`/`error.message` accordingly. Refactor `sendErrorEvent` to
   take both reason and message, and emit the standard frame so client
   SDKs surface a real diagnostic message instead of a generic stream error.

This does not by itself prevent task interruption on long-stream EOF
(SSE has no resume; client-side retry remains the only complete fix), but
it gives both server-side ops logs and client-side error UIs a meaningful
upstream message so users know the next step is to retry.

Tests updated to assert the new body shape on both branches plus a new
assertion that `ExtractUpstreamErrorMessage` returns a non-empty string.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 20:24:17 +08:00
alfadb
6327573534 fix(gateway): wrap Anthropic stream EOF as failover error before client output
Anthropic streaming path (gateway_service.go) returned a plain error on
upstream SSE read failure, so the handler-level UpstreamFailoverError check
never fired and the client received a bare `stream_read_error` event,
breaking long-running tasks even when no bytes had been written yet.

The most common trigger is HTTP/2 GOAWAY from api.anthropic.com edge
backends doing graceful rotation: Go's http.Transport surfaces this as
`unexpected EOF` and never auto-retries.

Mirror what the OpenAI and antigravity gateways already do: when the read
error happens before any byte has reached the client (`!c.Writer.Written()`),
return `*UpstreamFailoverError{StatusCode: 502, RetryableOnSameAccount: true}`
so the handler can retry on the same or another account. After client
output has begun, SSE has no resume protocol — keep the existing passthrough
behavior.

Tests cover both branches via streamReadCloser-based fixtures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 19:12:48 +08:00
ivanvolt
04b2866f65 fix: use Responses-compatible function tool_choice format 2026-04-28 16:26:09 +08:00
Wesley Liddick
b0a2252ed1 Merge pull request #2051 from DaydreamCoding/openai-fast-flex-policy
feat(openai): OpenAI Fast/Flex Policy 完整实现(HTTP + WebSocket + Admin)
2026-04-28 12:14:43 +08:00
DaydreamCoding
30f55a1f72 feat(openai): OpenAI Fast/Flex Policy 完整实现(HTTP + WebSocket + Admin)
对称参照 Claude BetaPolicy 的 fast-mode 过滤实现,新增针对 OpenAI 上游
service_tier 字段(priority / flex,含客户端 "fast" → "priority" 归一化)的
pass / filter / block 三态策略,覆盖全部 OpenAI 入口 + admin 配置入口。

后端核心
- 新增 SettingKeyOpenAIFastPolicySettings、OpenAIFastPolicyRule、
  OpenAIFastPolicySettings 配置模型,含规则的 service_tier × action × scope
  × 模型白名单 × fallback action 维度。
- SettingService.Get/SetOpenAIFastPolicySettings;缺失时返回内置默认策略
  (所有模型的 priority 走 filter,whitelist 为空,fallback=pass)。设计
  依据:service_tier=fast 是用户级开关,与 model 字段正交,默认锁定特定
  model slug 会留下"用 gpt-4 + fast 透传 priority 上游"的绕过路径。JSON
  解析失败不再静默 fallback,slog.Warn 记录脏数据,便于运维定位。
- service_tier 归一化(trim + ToLower + fast→priority + 白名单 priority/flex)
  与策略评估(evaluateOpenAIFastPolicy)作为唯一真实来源,HTTP / WS 共用。
  抽出纯函数 evaluateOpenAIFastPolicyWithSettings,配合 ctx-bound settings
  快照(withOpenAIFastPolicyContext / openAIFastPolicySettingsFromContext),
  WS 长会话入口预取一次后所有帧复用,避免每帧打到 settingService。

HTTP 入口(4 个)
- Chat Completions、Anthropic 兼容(Messages,含 BetaFastMode→priority 二次
  命中)、原生 Responses、Passthrough Responses 全部接入
  applyOpenAIFastPolicyToBody,filter 走 sjson 顶层删除 service_tier,block
  返回 403 forbidden_error JSON。
- 4 入口统一使用 upstream 视角的 model(GetMappedModel +
  normalizeOpenAIModelForUpstream + Codex OAuth normalize 后的 slug),
  避免 chat/messages/native /responses/passthrough 因为 model 维度不同
  造成 whitelist 命中差异。
- 在 pass 路径也把客户端 "fast" 别名归一化为 "priority" 写回 body,
  否则 native /responses 与 passthrough 入口会把 "fast" 原样透传给上游
  导致 400/拒绝(chat-completions 入口的 normalizeResponsesBodyServiceTier
  此前已具备同等行为)。

WebSocket 入口
- 新增 applyOpenAIFastPolicyToWSResponseCreate:严格匹配
  type="response.create",仅处理顶层 service_tier;filter 用 sjson 删字段,
  block 返回 typed *OpenAIFastBlockedError。
- ingress 路径在 parseClientPayload 内调用,block 命中先 Write Realtime
  风格 error event 再返回 OpenAIWSClientCloseError(StatusPolicyViolation
  =1008),依赖底层 WebSocket Conn.Write 的同步 flush 保证 error 先于
  close。
- passthrough 路径在 RunEntry 前对 firstClientMessage 应用策略,并通过
  openAIWSPolicyEnforcingFrameConn 包装 ReadFrame 对每个 client→upstream
  帧执行策略;后续帧无 model 字段时回退到 capturedSessionModel。
  filter 闭包内同时侦测 session.update / session.created 帧的 session.model
  字段刷新 capturedSessionModel,封堵"首帧 model=gpt-4o(pass)→
  session.update 改为 gpt-5.5 → 不带 model 的 response.create fallback
  到 gpt-4o"的 mid-session 绕过路径。
- passthrough billing:requestServiceTier 在策略 filter 之后再从
  firstClientMessage 提取,filter 命中时 OpenAIForwardResult.ServiceTier
  上报 nil(default tier),与 HTTP 入口(reqBody 来自 post-filter map)
  / WS ingress(payload 来自 post-filter bytes)的语义一致。
- 错误事件 schema:{event_id: "evt_<32hex>", type: "error",
  error: {type: "forbidden_error", code: "policy_violation", message}},
  与 OpenAI codex 客户端 error event 解析兼容。

Admin / Frontend
- dto.SystemSettings / UpdateSettingsRequest 新增
  openai_fast_policy_settings 字段(omitempty),bulk GET/PUT 接入。
- Settings 页 Gateway 页签新增 Fast/Flex Policy 表单卡片:
  service_tier × action × scope × 模型白名单 × fallback action 全字段配置。
- 前端守门:openaiFastPolicyLoaded 标志仅在 GET 真带回字段时才允许回写,
  避免 rollout/错误把默认规则覆盖成空;saveSettings 回写循环 skip 该字段,
  由专用刷新逻辑处理;仅 action=block 时发送 error_message,匹配后端
  omitempty 行为。

测试
- HTTP 路径:openai_fast_policy_test.go 覆盖默认配置(whitelist=[],所有
  模型 priority filter)/ block 自定义错误 / scope 区分 / filter 删字段 /
  block 不改 body / block 短路上游 / Anthropic BetaFastMode 触发 OpenAI
  fast policy 等场景。
- WebSocket 路径:openai_fast_policy_ws_test.go 覆盖
    helper 单元(filter / fast→priority 归一化 / flex 透传 / block typed
    error / 无 service_tier 字节不变 / 非 response.create 帧不动 / 空 type
    帧不动 / event_id+code 字段断言 / 非字符串 service_tier 容错)+
    pass 路径 fast 别名归一化回归 +
    ingress 端到端(filter 后上游不含 service_tier / block 后客户端先收
    error event 再收 close 1008 且上游 0 写)+
    passthrough capturedSessionModel fallback 用例(whitelist 策略下首帧
    建立、缺 model 命中 fallback、缺少 fallback 时的 leak 文档化)+
    passthrough session.update / session.created 旋转 capturedSessionModel
    的 mid-session 绕过回归 +
    passthrough billing post-filter ServiceTier 与 idempotent filter 回归。

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 11:15:09 +08:00
Zven
3d4ca5e8d1 fix(openai): preserve current Codex compact payload fields 2026-04-28 10:55:29 +08:00
Oliver Li
0537a490f0 Merge branch 'Wei-Shaw:main' into vertex 2026-04-27 20:25:11 -04:00
VitalyR
ca5d029e7c fix(openai): honor versioned image base URLs 2026-04-28 04:53:29 +08:00
KnowSky404
1eca03432a fix: format bulk update account request 2026-04-27 18:36:05 +08:00
KnowSky404
53b24bc2d8 fix: tighten account bulk edit target typing 2026-04-27 18:20:36 +08:00
KnowSky404
a161f9d045 feat: align OpenAI bulk edit compact settings 2026-04-27 18:15:23 +08:00
KnowSky404
c5a1a82223 test: cover missing OpenAI bulk edit fields 2026-04-27 18:13:14 +08:00
KnowSky404
2ab6b34fd1 feat: add filtered-result account bulk edit 2026-04-27 18:12:24 +08:00
KnowSky404
764afbe37a test: cover account bulk edit target scopes 2026-04-27 18:08:22 +08:00
KnowSky404
25c7b0d9f4 feat: support filter-target account bulk update 2026-04-27 17:59:49 +08:00
KnowSky404
f422ac6dcc test: cover filter-target account bulk update 2026-04-27 17:32:34 +08:00
KnowSky404
54de4e008c docs: add account bulk edit implementation plan 2026-04-27 17:26:57 +08:00
KnowSky404
65c27d2c69 docs: add account bulk edit scope design 2026-04-27 17:21:11 +08:00
hansnow
53f919f8f0 fix(api-key): reset rate limit usage cache 2026-04-27 16:47:44 +08:00
Wesley Liddick
c92b88e34a Merge pull request #1996 from Cloud370/fix/claude-code-read-empty-pages
fix(anthropic): drop empty Read.pages in responses-to-anthropic tool input
2026-04-27 08:47:13 +08:00
Wesley Liddick
ed0c85a17e Merge pull request #2006 from gaoren002/pr/openai-images-explicit-session
fix(openai): avoid implicit image sticky sessions
2026-04-27 08:43:40 +08:00
gaoren002
9fe02bba7e fix(openai): strip unsupported passthrough fields 2026-04-27 00:39:06 +00:00
gaoren002
615557ec20 fix(openai): avoid implicit image sticky sessions 2026-04-26 17:09:41 +00:00
Oliver Li
3f05ef2ae3 Merge branch 'Wei-Shaw:main' into vertex 2026-04-26 08:39:41 -04:00
Cloud370
3022090365 fix(anthropic): drop empty Read.pages in responses-to-anthropic tool input 2026-04-26 20:21:38 +08:00
Hai Chang
798fd673e9 feat(httputil): decode compressed request bodies (zstd/gzip/deflate)
Codex CLI 0.125+ defaults to sending request bodies with
Content-Encoding: zstd. Without server-side decompression the gateway
returns 'Failed to parse request body' on /v1/responses (and any other
JSON endpoint) because gjson sees raw zstd bytes.

ReadRequestBodyWithPrealloc now inspects Content-Encoding and
transparently decodes zstd, gzip/x-gzip, and deflate bodies before
returning them, then strips the encoding headers and updates
ContentLength so downstream code can reuse the bytes safely.
Unsupported encodings produce a clear error.

Adds unit tests covering identity, zstd, gzip, deflate, unsupported
encoding, corrupt zstd payloads, nil bodies, and explicit identity.
2026-04-26 20:52:45 +10:00
github-actions[bot]
c056db740d chore: sync VERSION to 0.1.119 [skip ci] 2026-04-26 05:24:11 +00:00
Wesley Liddick
a0b5e5bfa0 Merge pull request #1973 from Nobody-Zhang/main
fix(payment): 修复 Zpay 退款接口调用
2026-04-26 13:11:42 +08:00
Wesley Liddick
41d0657330 Merge pull request #1970 from deqiying/fix-1754-claude-openai-cache-usage
fix(anthropic): 修正缓存 token 的 Anthropic 用量语义
2026-04-26 13:08:18 +08:00
Nobody-Zhang
1a0cabbfd6 Fix Zpay refund endpoint handling 2026-04-26 04:57:34 +00:00
shaw
9b6dcc57bd feat(affiliate): 完善邀请返利系统
- 修复返利不到账的根因:tryClaimAffiliateRebateAudit 中 PostgreSQL 参数类型推断冲突
  - 补全 OAuth 注册路径(LinuxDo/OIDC/WeChat/Pending Flow)的邀请码绑定
  - 前端 OAuth 注册页面传递 aff_code 参数
  - 新增返利冻结期机制:可配置冻结时间,到期后自动解冻(懒解冻)
  - 新增返利有效期:绑定后 N 天内有效,过期不再产生返利
  - 新增单人返利上限:超出上限部分精确截断
  - 增强返利流程 slog 结构化日志,便于排查问题
  - 已邀请用户列表增加返利明细列
2026-04-26 12:42:35 +08:00
Oliver
6d11f9ed77 Add Vertex service account support 2026-04-25 20:39:58 -04:00
Oliver
489a4d934e Show today stats for Vertex usage window 2026-04-25 19:46:32 -04:00
deqiying
b17704d6ef fix(anthropic): 修正缓存 token 的 Anthropic 用量语义 2026-04-26 01:14:59 +08:00
shaw
496469ac4e fix(gateway): skip body mimicry for real Claude Code clients to restore prompt caching
PR #1914 unconditionally applied the full mimicry pipeline to all OAuth
accounts, including real Claude Code CLI clients. This replaced the
client's long system prompt (~10K+ tokens with stable cache_control
breakpoints) with a short ~45 token [billing, CC prompt] pair, which
falls below Anthropic's 1024-token minimum cacheable prefix threshold.
The result: every request created a new cache but never hit an existing
one.

Fix: restore the Claude Code client detection gate so that real CC
clients bypass body-level mimicry (system rewrite, message cache
management, tool name obfuscation). Non-CC third-party clients
(opencode, etc.) continue to receive full mimicry.

Also harden the detection logic:
- Make UA regex case-insensitive (align with claude_code_validator.go)
- Validate metadata.user_id format via ParseMetadataUserID() instead of
  just checking non-empty, preventing third-party tools from spoofing
  a claude-cli/* UA with an arbitrary user_id string to bypass mimicry
2026-04-25 22:50:35 +08:00
shaw
c1b52615be fix(payment): allow Stripe payment pages to bypass router auth guard
Stripe payment routes (/payment/stripe, /payment/stripe-popup) are
reached via hard navigation (window.location.href), which caused
the router guard to block access before the page could load.
Set requiresAuth and requiresPayment to false, consistent with
/payment/result. Backend API still enforces authentication.
2026-04-25 21:38:40 +08:00
shaw
3af9940b85 style: fix gofmt and ineffassign lint errors
- gofmt: realign AffiliateDetail struct tags in affiliate_service.go
- ineffassign: remove dead seenCompleted assignment before return in account_test_service.go
2026-04-25 20:37:42 +08:00
Wesley Liddick
22b1277572 Merge pull request #1948 from hungryboy1025/fix/openai-account-test-responses-stream
fix(openai): tighten responses stream account tests
2026-04-25 20:31:07 +08:00
Wesley Liddick
aff98d5ae1 Merge pull request #1960 from gaoren002/fix/openai-stream-keepalive-downstream-idle
fix(openai): keep responses stream alive during pre-output failover
2026-04-25 20:24:25 +08:00
shaw
4e1bb2b445 feat(affiliate): add feature toggle and per-user custom invite settings
- 在系统设置「功能开关」中新增邀请返利总开关,默认关闭;
  关闭态:菜单隐藏、注册忽略 aff、新充值不返利,但已有 quota 仍可转余额
- 支持管理员为指定用户设置专属邀请码(覆盖随机码,全局唯一)
- 支持管理员为指定用户设置专属返利比例(覆盖全局比例,可单条/批量调整)
- 在系统设置邀请返利卡片内嵌入专属用户管理表格(搜索/编辑/批量/删除),
  删除采用项目通用 ConfirmDialog,会同时清除专属比例并把邀请码重置为系统随机码
- /affiliate 用户页新增「我的返利比例」卡片与动态使用说明,让用户直观看到
  分享后能拿到多少(同源 resolveRebateRatePercent 计算,与实际充值一致)
- 新增数据库迁移 132 添加 aff_rebate_rate_percent 与 aff_code_custom 列
- 新增 admin 路由组 /api/v1/admin/affiliates/users/* 共 5 个端点
- AffiliateService 改为只依赖 *SettingService,去除冗余的 SettingRepository
- 邀请码格式校验放宽到 [A-Z0-9_-]{4,32},兼容旧 12 位系统码与新自定义码
- 补充单元测试与集成测试覆盖新方法、冲突路径与边界值
2026-04-25 20:22:07 +08:00
gaoren002
dac6e52091 fix(openai): keep responses stream alive during pre-output failover 2026-04-25 12:11:27 +00:00
hungryboy1025
8987e0ba67 fix(openai): tighten responses stream account tests 2026-04-25 16:56:50 +08:00
github-actions[bot]
9d1751ec57 chore: sync VERSION to 0.1.118 [skip ci] 2026-04-25 08:06:21 +00:00
Wesley Liddick
5d1c12e60e Merge pull request #1943 from AyeSt0/fix/openai-responses-preoutput-failover
fix(openai): 修复 Responses 流式失败前置事件导致无法 failover
2026-04-25 15:43:00 +08:00
AyeSt0
5b63a9b02d fix(openai): fail over before responses stream output 2026-04-25 15:09:40 +08:00
Wesley Liddick
641e61073f Merge pull request #1940 from 4fuu/fix/bump-codex-cli-version-to-0.125.0
fix(openai): bump codex CLI version from 0.104.0 to 0.125.0
2026-04-25 14:57:51 +08:00
shaw
095f457c57 feat(openai): port /responses/compact account support flow (PR #1555)
vansour/sub2api#1555 的 OpenAI compact 能力建模手工移植到当前 main:账号
级 compact 状态/auto-force_on-force_off 模式、compact-only 模型映射、调度器
tier 分层(已支持 > 未知 > 已知不支持)、管理后台 compact 主动探测,以及对应
i18n/状态徽章。普通 /responses 流量行为不变,无数据库迁移。
2026-04-25 14:52:58 +08:00
4fuu
1e57e88e43 fix(openai): bump codex CLI version from 0.104.0 to 0.125.0
The hardcoded codex CLI version (0.104.0) causes upstream rejection
when using gpt-5.5 with compact, as the server treats the request
as an outdated client and returns 400/502.

Update codexCLIVersion, codexCLIUserAgent, and openAICodexProbeVersion
to 0.125.0 to match the current Codex CLI release.

Fixes #1933, #1887, #1865
Related: #1609, #1298, #849
2026-04-25 05:26:33 +00:00
Wesley Liddick
b95ffce244 Merge pull request #1772 from KnowSky404/fix/openai-test-state-reconciliation
[codex] reconcile OpenAI admin test rate-limit state
2026-04-25 10:02:21 +08:00
shaw
8f28a834f8 fix(payment): 同时启用易支付和 Stripe 时显示 Stripe 按钮
VISIBLE_METHOD_ALIASES 漏了 stripe,导致 getVisibleMethods 把后端返回
的 stripe 过滤掉。点 Stripe 按钮时省略 method 查询参数,让落地页渲染
完整的 Payment Element。
2026-04-25 09:46:27 +08:00