Commit Graph

5357 Commits

Author SHA1 Message Date
Seefs
45649249b2 fix: 在Vertex Adapter过滤content[].part[].functionResponse.id 2025-12-21 17:22:04 +08:00
Seefs
219b13af70 fix: 模型设置增加针对Vertex渠道过滤content[].part[].functionResponse.id的选项,默认启用 2025-12-21 17:09:49 +08:00
comeback01
b9d78515f8 Merge branch 'main' into french-translation 2025-12-20 11:08:07 +01:00
长安
6e3bc06fa6 fix: 修复 Anthropic 渠道缓存计费错误
## 问题描述

当使用 Anthropic 渠道通过 `/v1/chat/completions` 端点调用且启用缓存功能时,
计费逻辑错误地减去了缓存 tokens,导致严重的收入损失(94.5%)。

## 根本原因

不同 API 的 `prompt_tokens` 定义不同:

- **Anthropic API**: `input_tokens` 字段已经是纯输入 tokens(不包含缓存)
- **OpenAI API**: `prompt_tokens` 字段包含所有 tokens(包含缓存)
- **OpenRouter API**: `prompt_tokens` 字段包含所有 tokens(包含缓存)

当前 `postConsumeQuota` 函数对所有渠道都减去缓存 tokens,这对 Anthropic
渠道是错误的,因为其 `input_tokens` 已经不包含缓存。

## 修复方案

在 `relay/compatible_handler.go` 的 `postConsumeQuota` 函数中,添加渠道类型判断:

```go
if relayInfo.ChannelType != constant.ChannelTypeAnthropic {
    baseTokens = baseTokens.Sub(dCacheTokens)
}
```

只对非 Anthropic 渠道减去缓存 tokens。

## 影响分析

###  不受影响的场景

1. **无缓存调用**(所有渠道)
   - cache_tokens = 0
   - 减去 0 = 不减去
   - 结果:完全一致

2. **OpenAI/OpenRouter 渠道 + 缓存**
   - 继续减去缓存(因为 ChannelType != Anthropic)
   - 结果:完全一致

3. **Anthropic 渠道 + /v1/messages 端点**
   - 使用 PostClaudeConsumeQuota(不修改)
   - 结果:完全不受影响

###  修复的场景

4. **Anthropic 渠道 + /v1/chat/completions + 缓存**
   - 修复前:错误地减去缓存,导致 94.5% 收入损失
   - 修复后:不减去缓存,计费正确

## 验证数据

以实际记录 143509 为例:

| 项目 | 修复前 | 修复后 | 差异 |
|------|--------|--------|------|
| Quota | 10,489 | 191,330 | +180,841 |
| 费用 | ¥0.020978 | ¥0.382660 | +¥0.361682 |
| 收入恢复 | - | - | **+1724.1%** |

## 测试建议

1. 测试 Anthropic 渠道 + 缓存场景
2. 测试 OpenAI 渠道 + 缓存场景(确保不受影响)
3. 测试无缓存场景(确保不受影响)

## 相关 Issue

修复 Anthropic 渠道使用 prompt caching 时的计费错误。
2025-12-20 14:17:12 +08:00
CaIon
c2a6193497 feat(gin): improve request body handling and error reporting 2025-12-20 13:34:10 +08:00
CaIon
3523acfc2c feat(init): increase MaxRequestBodyMB to enhance request handling 2025-12-20 13:27:55 +08:00
CaIon
f2d2b6e7fc feat(channel): add error handling for SaveWithoutKey when channel ID is 0 2025-12-20 13:26:40 +08:00
Seefs
7a9cfa38ff Merge pull request #2476 from TinsFox/chore/code-inspector-plugin 2025-12-20 11:04:40 +08:00
Seefs
a78fd2dae6 docs: document pyroscope env var 2025-12-19 23:16:56 +08:00
TinsFox
c06a216a14 chore: add code-inspector-plugin integration 2025-12-19 23:04:53 +08:00
Seefs
fb0ffe8c95 docs: document pyroscope env var 2025-12-19 23:03:04 +08:00
Seefs
b49bb48ed1 fix: systemname 2025-12-19 22:27:35 +08:00
Seefs
6d0cb5df75 Merge pull request #2474 from TinsFox/main 2025-12-19 21:39:56 +08:00
TinsFox
530b3eff11 style: add card spacing 2025-12-19 21:00:31 +08:00
Seefs
39df47486c fix(gemini): handle minimal reasoning effort budget
- Add minimal case to clampThinkingBudgetByEffort to avoid defaulting to full thinking budget
2025-12-18 08:10:46 +08:00
comeback01
5aaf006642 Refine French translations for UI conciseness
Updated web/src/i18n/locales/fr.json to improve French translations for the user interface.

Removed verbose prefixes like 'Gestion des...' and 'Paramètres de...' to prevent truncation in sidebars and menus.

Harmonized terms for consistency (e.g., 'Tâches', 'Journaux', 'Dessins').

Renamed 'Place du marché' to 'Marché des modèles'.
2025-12-17 12:10:36 +01:00
Seefs
2f28f265a9 Merge pull request #2452 from QuantumNous/fix/oom-request-body-limit 2025-12-16 18:21:59 +08:00
t0ng7u
fa814b80fe 🧹 fix: harden request-body size handling and error unwrapping
Tighten oversized request handling across relay paths and make error matching reliable.

- Align `MAX_REQUEST_BODY_MB` fallback to `32` in request body reader and decompression middleware
- Stop ignoring `GetRequestBody` errors in relay retry paths; return consistent **413** on oversized bodies (400 for other read errors)
- Add `Unwrap()` to `types.NewAPIError` so `errors.Is/As` can match wrapped underlying errors
- `go test ./...` passes
2025-12-16 18:10:00 +08:00
t0ng7u
c2ed76ddfd 🛡️ fix: prevent OOM on large/decompressed requests; skip heavy prompt meta when token count is disabled
Clamp request body size (including post-decompression) to avoid memory exhaustion caused by huge payloads/zip bombs, especially with large-context Claude requests. Add a configurable `MAX_REQUEST_BODY_MB` (default `32`) and document it.

- Enforce max request body size after gzip/br decompression via `http.MaxBytesReader`
- Add a secondary size guard in `common.GetRequestBody` and cache-safe handling
- Return **413 Request Entity Too Large** on oversized bodies in relay entry
- Avoid building large `TokenCountMeta.CombineText` when both token counting and sensitive check are disabled (use lightweight meta for pricing)
- Update READMEs (CN/EN/FR/JA) with `MAX_REQUEST_BODY_MB`
- Fix a handful of vet/formatting issues encountered during the change
- `go test ./...` passes
2025-12-16 17:00:19 +08:00
Seefs
3eee8c7a21 fix: 支持传入system_instruction和systemInstruction两种风格系统提示词参数名 2025-12-16 13:08:58 +08:00
Calcium-Ion
87195a07b0 Merge pull request #2445 from QuantumNous/feat/token-ip-whitelist-cidr
feat(auth): enhance IP restriction handling with CIDR support
2025-12-15 20:14:09 +08:00
CaIon
692b5ff5ac feat(auth): refactor IP restriction handling to use clearer variable naming 2025-12-15 20:13:09 +08:00
旃蒙
5d0e66b39e fix(task): 修复渠道配置多个key时无法获取任务的问题 2025-12-15 18:15:35 +08:00
CaIon
947a763a1a feat(auth): enhance IP restriction handling with CIDR support 2025-12-15 17:24:09 +08:00
CaIon
1bd0d8de02 Revert "feat(audio): replace SysLog with logger for improved logging in GetAudioDuration"
This reverts commit 2a980bbcf5.
2025-12-14 00:04:40 +08:00
CaIon
2a980bbcf5 feat(audio): replace SysLog with logger for improved logging in GetAudioDuration 2025-12-13 23:59:58 +08:00
CaIon
da0d3ea93c fix(audio): improve WAV duration calculation with enhanced PCM size handling 2025-12-13 23:57:32 +08:00
CaIon
06d1bd404b feat(model_ratio): add default ratios for gpt-4o-mini-tts 2025-12-13 19:14:27 +08:00
CaIon
284ce42c88 refactor(channel_select): improve retry logic with reset functionality 2025-12-13 18:09:10 +08:00
Calcium-Ion
ef5e4a056c Merge pull request #2434 from QuantumNous/feat/gpt-4o-mini-tts
feat: support gpt tts series model quota calculate
2025-12-13 17:55:16 +08:00
CaIon
3822f4577c fix(audio): correct TotalTokens calculation for accurate usage reporting 2025-12-13 17:49:57 +08:00
CaIon
be2a863b9b feat(audio): enhance audio request handling with token type detection and streaming support 2025-12-13 17:24:23 +08:00
CaIon
29565c837f feat(token): add CrossGroupRetry field to token insertion 2025-12-13 16:45:42 +08:00
CaIon
a1299114a6 refactor(error): replace dto.OpenAIError with types.OpenAIError for consistency 2025-12-13 16:43:57 +08:00
CaIon
6175f254a2 refactor(channel_select): enhance retry logic and context key usage for channel selection 2025-12-13 16:43:38 +08:00
Seefs
bdfc875775 feat: pyroscope integrate 2025-12-13 13:49:38 +08:00
CaIon
7d586ef507 fix(helper): improve error handling in FlushWriter and related functions 2025-12-13 13:29:21 +08:00
CaIon
e0a79e853d refactor(auth): replace direct token group setting with context key retrieval 2025-12-13 01:38:12 +08:00
Calcium-Ion
6e998eaf82 Merge pull request #2430 from QuantumNous/fix/cross-group-retry
fix(channel_select): adjust priority retry logic for cross-group
2025-12-13 01:05:40 +08:00
CaIon
51f3a936e4 fix(channel_select): adjust priority retry logic for cross-group channel selection 2025-12-13 01:04:10 +08:00
Calcium-Ion
2a01d1c996 Merge pull request #2429 from QuantumNous/feat/xhigh
feat(adaptor): add '-xhigh' suffix to reasoning effort options
2025-12-12 22:06:19 +08:00
CaIon
413968a0fd refactor(relay): update channel retrieval to use RelayInfo structure 2025-12-12 22:04:38 +08:00
Calcium-Ion
281ebacb8c Merge pull request #2424 from ion1ze/main
fix: correct sender format issues fix #1347
2025-12-12 20:55:22 +08:00
CaIon
27dd42718b feat(adaptor): add '-xhigh' suffix to reasoning effort options for model parsing 2025-12-12 20:53:48 +08:00
Calcium-Ion
3c5edc54b7 Merge pull request #2426 from QuantumNous/feat/auto-cross-group-retry
feat(token): add cross-group retry option for token processing
2025-12-12 20:45:54 +08:00
Calcium-Ion
8b0e710053 Merge pull request #2428 from seefs001/fix/health-check
fix: health check
2025-12-12 20:45:34 +08:00
Seefs
ae30b4d15f fix: health check 2025-12-12 20:37:32 +08:00
CaIon
ffb1931906 feat: implement cross-group retry functionality and update translations 2025-12-12 18:28:33 +08:00
CaIon
c87deaa7d9 feat(token): add cross-group retry option for token processing 2025-12-12 17:59:21 +08:00
hackerxiao
8257438499 feat: 支持仅使用x-api-key获取anthropic格式的模型列表 注释增加 2025-12-12 17:27:24 +08:00