## Problem Description
When requests are routed through an Anthropic channel via the `/v1/chat/completions` endpoint with caching enabled, the billing logic incorrectly subtracts cache tokens, causing severe revenue loss (94.5%).

## Root Cause
Different APIs define `prompt_tokens` differently:
- **Anthropic API**: the `input_tokens` field already contains only non-cached input tokens
- **OpenAI API**: the `prompt_tokens` field contains all input tokens (including cached ones)
- **OpenRouter API**: the `prompt_tokens` field contains all input tokens (including cached ones)

The current `postConsumeQuota` function subtracts cache tokens for every channel. This is wrong for Anthropic channels, because their `input_tokens` value already excludes cached tokens.
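The difference can be made concrete with a small sketch. The helper below is hypothetical (the real accounting lives in `postConsumeQuota`); it only illustrates how each API's usage fields map to the same quantity, non-cached input tokens:

```go
package main

import "fmt"

// nonCachedInputTokens is a hypothetical helper illustrating the semantic
// difference: OpenAI/OpenRouter report prompt_tokens inclusive of cached
// tokens, while Anthropic's input_tokens already excludes them.
func nonCachedInputTokens(channel string, promptTokens, cacheTokens int) int {
	if channel == "anthropic" {
		// input_tokens is already cache-exclusive; subtracting again
		// would under-bill (the bug this fix addresses).
		return promptTokens
	}
	// OpenAI / OpenRouter: prompt_tokens includes cached tokens.
	return promptTokens - cacheTokens
}

func main() {
	// The same request seen through both conventions: 200 fresh input
	// tokens plus 800 tokens served from cache.
	fmt.Println(nonCachedInputTokens("openai", 1000, 800))   // 200
	fmt.Println(nonCachedInputTokens("anthropic", 200, 800)) // 200
}
```

Both calls describe the same request, and both should bill the same 200 non-cached tokens; subtracting the cache twice on the Anthropic side is what produced the loss.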
## Fix
In the `postConsumeQuota` function in `relay/compatible_handler.go`, add a channel-type check:
```go
if relayInfo.ChannelType != constant.ChannelTypeAnthropic {
    baseTokens = baseTokens.Sub(dCacheTokens)
}
```
Cache tokens are now subtracted only for non-Anthropic channels.
## Impact Analysis
### ✅ Scenarios Not Affected
1. **Calls without caching** (all channels)
   - cache_tokens = 0
   - Subtracting 0 changes nothing
   - Result: identical behavior
2. **OpenAI/OpenRouter channels + caching**
   - Cache tokens are still subtracted (since ChannelType != Anthropic)
   - Result: identical behavior
3. **Anthropic channel + `/v1/messages` endpoint**
   - Uses `PostClaudeConsumeQuota`, which is not modified
   - Result: completely unaffected
### ✅ Scenario Fixed
4. **Anthropic channel + `/v1/chat/completions` + caching**
   - Before the fix: cache tokens were wrongly subtracted, causing 94.5% revenue loss
   - After the fix: cache tokens are not subtracted and billing is correct
## Verification Data
Using actual log record 143509 as an example:

| Item | Before Fix | After Fix | Difference |
|------|--------|--------|------|
| Quota | 10,489 | 191,330 | +180,841 |
| Cost | ¥0.020978 | ¥0.382660 | +¥0.361682 |
| Revenue recovered | - | - | **+1724.1%** |
## Testing Suggestions
1. Test Anthropic channels with caching
2. Test OpenAI channels with caching (verify they are unaffected)
3. Test the no-cache path (verify it is unaffected)
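These scenarios lend themselves to a table-driven test. The sketch below inlines a hypothetical `billableTokens` helper that mirrors the fixed branch (the real code paths are `postConsumeQuota` and the unchanged `PostClaudeConsumeQuota`); the name and signature are illustrative only:

```go
package main

import "fmt"

// billableTokens mirrors the fixed branch in postConsumeQuota
// (hypothetical signature, for illustration only).
func billableTokens(channelIsAnthropic bool, baseTokens, cacheTokens int) int {
	if !channelIsAnthropic {
		baseTokens -= cacheTokens
	}
	return baseTokens
}

func main() {
	cases := []struct {
		name        string
		anthropic   bool
		base, cache int
		want        int
	}{
		{"no cache, any channel", false, 1000, 0, 1000},
		{"OpenAI + cache (unchanged)", false, 1000, 800, 200},
		{"Anthropic + cache (fixed)", true, 200, 800, 200},
	}
	for _, c := range cases {
		got := billableTokens(c.anthropic, c.base, c.cache)
		if got != c.want {
			panic(fmt.Sprintf("%s: got %d, want %d", c.name, got, c.want))
		}
		fmt.Printf("%s: %d ok\n", c.name, got)
	}
}
```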
## Related Issue
Fixes the billing error when Anthropic channels use prompt caching.
# New API
🍥 Next-Generation Large Model Gateway and AI Asset Management System

Quick Start • Key Features • Deployment • Documentation • Help
## 📝 Project Description

> **Note**
>
> This is an open-source project developed based on One API

> **Important**
>
> - This project is for personal learning purposes only, with no guarantee of stability or technical support
> - Users must comply with OpenAI's Terms of Use and applicable laws and regulations, and must not use it for illegal purposes
> - According to the "Interim Measures for the Management of Generative Artificial Intelligence Services", please do not provide any unregistered generative AI services to the public in China.
## 🤝 Trusted Partners
In no particular order

## 🙏 Special Thanks
Thanks to JetBrains for providing a free open-source development license for this project
## 🚀 Quick Start
### Using Docker Compose (Recommended)

```shell
# Clone the project
git clone https://github.com/QuantumNous/new-api.git
cd new-api

# Edit docker-compose.yml configuration
nano docker-compose.yml

# Start the service
docker-compose up -d
```
### Using Docker Commands

```shell
# Pull the latest image
docker pull calciumion/new-api:latest

# Using SQLite (default)
docker run --name new-api -d --restart always \
  -p 3000:3000 \
  -e TZ=Asia/Shanghai \
  -v ./data:/data \
  calciumion/new-api:latest

# Using MySQL
docker run --name new-api -d --restart always \
  -p 3000:3000 \
  -e SQL_DSN="root:123456@tcp(localhost:3306)/oneapi" \
  -e TZ=Asia/Shanghai \
  -v ./data:/data \
  calciumion/new-api:latest
```
> 💡 Tip: `-v ./data:/data` will save data in the `data` folder of the current directory; you can also change it to an absolute path like `-v /your/custom/path:/data`

🎉 After deployment is complete, visit http://localhost:3000 to start using!

📖 For more deployment methods, please refer to the Deployment Guide
## 📚 Documentation
Quick Navigation:
| Category | Link |
|---|---|
| 🚀 Deployment Guide | Installation Documentation |
| ⚙️ Environment Configuration | Environment Variables |
| 📡 API Documentation | API Documentation |
| ❓ FAQ | FAQ |
| 💬 Community Interaction | Communication Channels |
## ✨ Key Features
For detailed features, please refer to the Features Introduction

### 🎨 Core Functions
| Feature | Description |
|---|---|
| 🎨 New UI | Modern user interface design |
| 🌍 Multi-language | Supports Chinese, English, French, Japanese |
| 🔄 Data Compatibility | Fully compatible with the original One API database |
| 📈 Data Dashboard | Visual console and statistical analysis |
| 🔒 Permission Management | Token grouping, model restrictions, user management |
### 💰 Payment and Billing
- ✅ Online recharge (EPay, Stripe)
- ✅ Pay-per-use model pricing
- ✅ Cache billing support (OpenAI, Azure, DeepSeek, Claude, Qwen and all supported models)
- ✅ Flexible billing policy configuration
### 🔐 Authorization and Security
- 😈 Discord authorization login
- 🤖 LinuxDO authorization login
- 📱 Telegram authorization login
- 🔑 OIDC unified authentication
### 🚀 Advanced Features
API Format Support:
- ⚡ OpenAI Responses
- ⚡ OpenAI Realtime API (including Azure)
- ⚡ Claude Messages
- ⚡ Google Gemini
- 🔄 Rerank Models (Cohere, Jina)
Intelligent Routing:
- ⚖️ Channel weighted random
- 🔄 Automatic retry on failure
- 🚦 User-level model rate limiting
Format Conversion:
- 🔄 OpenAI ⇄ Claude Messages
- 🔄 OpenAI ⇄ Gemini Chat
- 🔄 Thinking-to-content functionality
Reasoning Effort Support:
View detailed configuration
OpenAI series models:
- `o3-mini-high` - High reasoning effort
- `o3-mini-medium` - Medium reasoning effort
- `o3-mini-low` - Low reasoning effort
- `gpt-5-high` - High reasoning effort
- `gpt-5-medium` - Medium reasoning effort
- `gpt-5-low` - Low reasoning effort

Claude thinking models:
- `claude-3-7-sonnet-20250219-thinking` - Enable thinking mode

Google Gemini series models:
- `gemini-2.5-flash-thinking` - Enable thinking mode
- `gemini-2.5-flash-nothinking` - Disable thinking mode
- `gemini-2.5-pro-thinking` - Enable thinking mode
- `gemini-2.5-pro-thinking-128` - Enable thinking mode with a thinking budget of 128 tokens
- You can also append `-low`, `-medium`, or `-high` to any Gemini model name to request the corresponding reasoning effort (no extra thinking-budget suffix needed).
## 🤖 Model Support
For details, please refer to API Documentation - Relay Interface
| Model Type | Description | Documentation |
|---|---|---|
| 🤖 OpenAI GPTs | gpt-4-gizmo-* series | - |
| 🎨 Midjourney-Proxy | Midjourney-Proxy(Plus) | Documentation |
| 🎵 Suno-API | Suno API | Documentation |
| 🔄 Rerank | Cohere, Jina | Documentation |
| 💬 Claude | Messages format | Documentation |
| 🌐 Gemini | Google Gemini format | Documentation |
| 🔧 Dify | ChatFlow mode | - |
| 🎯 Custom | Supports complete call address | - |
## 📡 Supported Interfaces
View the complete interface list
## 🚢 Deployment

> **Tip**
>
> Latest Docker image: `calciumion/new-api:latest`

### 📋 Deployment Requirements
| Component | Requirement |
|---|---|
| Local database | SQLite (Docker must mount /data directory) |
| Remote database | MySQL ≥ 5.7.8 or PostgreSQL ≥ 9.6 |
| Container engine | Docker / Docker Compose |
### ⚙️ Environment Variable Configuration
Common environment variables:

| Variable Name | Description | Default Value |
|---|---|---|
| `SESSION_SECRET` | Session secret (required for multi-machine deployment) | - |
| `CRYPTO_SECRET` | Encryption secret (required for Redis) | - |
| `SQL_DSN` | Database connection string | - |
| `REDIS_CONN_STRING` | Redis connection string | - |
| `STREAMING_TIMEOUT` | Streaming timeout (seconds) | 300 |
| `STREAM_SCANNER_MAX_BUFFER_MB` | Max per-line buffer (MB) for the stream scanner; increase when upstream sends huge image/base64 payloads | 64 |
| `MAX_REQUEST_BODY_MB` | Max request body size (MB, counted after decompression; prevents huge requests/zip bombs from exhausting memory). Exceeding it returns 413 | 32 |
| `AZURE_DEFAULT_API_VERSION` | Azure API version | 2025-04-01-preview |
| `ERROR_LOG_ENABLED` | Error log switch | false |
📖 Complete configuration: Environment Variables Documentation
### 🔧 Deployment Methods
#### Method 1: Docker Compose (Recommended)

```shell
# Clone the project
git clone https://github.com/QuantumNous/new-api.git
cd new-api

# Edit configuration
nano docker-compose.yml

# Start service
docker-compose up -d
```
#### Method 2: Docker Commands
Using SQLite:

```shell
docker run --name new-api -d --restart always \
  -p 3000:3000 \
  -e TZ=Asia/Shanghai \
  -v ./data:/data \
  calciumion/new-api:latest
```

Using MySQL:

```shell
docker run --name new-api -d --restart always \
  -p 3000:3000 \
  -e SQL_DSN="root:123456@tcp(localhost:3306)/oneapi" \
  -e TZ=Asia/Shanghai \
  -v ./data:/data \
  calciumion/new-api:latest
```
> 💡 Path explanation:
> - `./data:/data` - relative path; data is saved in the `data` folder of the current directory
> - You can also use an absolute path, e.g. `/your/custom/path:/data`
#### Method 3: BaoTa Panel
1. Install BaoTa Panel (version ≥ 9.2.0)
2. Search for New-API in the application store
3. Install with one click
### ⚠️ Multi-machine Deployment Considerations

> **Warning**
>
> - `SESSION_SECRET` must be set; otherwise login state will be inconsistent across machines
> - When Redis is shared, `CRYPTO_SECRET` must be set; otherwise cached data cannot be decrypted
### 🔄 Channel Retry and Cache
Retry configuration: Settings → Operation Settings → General Settings → Failure Retry Count

Cache configuration:
- `REDIS_CONN_STRING`: Redis cache (recommended)
- `MEMORY_CACHE_ENABLED`: Memory cache
## 🔗 Related Projects
### Upstream Projects
| Project | Description |
|---|---|
| One API | Original project base |
| Midjourney-Proxy | Midjourney interface support |
### Supporting Tools
| Project | Description |
|---|---|
| neko-api-key-tool | Key quota query tool |
| new-api-horizon | New API high-performance optimized version |
## 💬 Help Support
### 📖 Documentation Resources
| Resource | Link |
|---|---|
| 📘 FAQ | FAQ |
| 💬 Community Interaction | Communication Channels |
| 🐛 Issue Feedback | Issue Feedback |
| 📚 Complete Documentation | Official Documentation |
## 🤝 Contribution Guide
Welcome all forms of contribution!
- 🐛 Report Bugs
- 💡 Propose New Features
- 📝 Improve Documentation
- 🔧 Submit Code
## 🌟 Star History

💖 Thank you for using New API!
If this project is helpful to you, please give us a ⭐️ Star!

Official Documentation • Issue Feedback • Latest Release

Built with ❤️ by QuantumNous
