diff --git a/README.en.md b/README.en.md
index 3885003f..c3be8381 100644
--- a/README.en.md
+++ b/README.en.md
@@ -65,6 +65,14 @@
     - Add suffix `-low` to set low reasoning effort
 17. 🔄 Thinking to content option `thinking_to_content` in `Channel->Edit->Channel Extra Settings`, default is `false`, when `true`, the `reasoning_content` of the thinking content will be converted to `<think>` tags and concatenated to the content returned.
 18. 🔄 Model rate limit, support setting total request limit and successful request limit in `System Settings->Rate Limit Settings`
+19. 💰 Cache billing support, when enabled can charge a configurable ratio for cache hits:
+    1. Set `Prompt Cache Ratio` in `System Settings -> Operation Settings`
+    2. Set `Prompt Cache Ratio` in channel settings, range 0-1 (e.g., 0.5 means 50% charge on cache hits)
+    3. Supported channels:
+        - [x] OpenAI
+        - [x] Azure
+        - [x] DeepSeek
+        - [ ] Claude
 
 ## Model Support
 This version additionally supports:
diff --git a/README.md b/README.md
index cefecbd6..0a0ff71b 100644
--- a/README.md
+++ b/README.md
@@ -74,6 +74,14 @@
     - 添加后缀 `-thinking` 启用思考模式 (例如: `claude-3-7-sonnet-20250219-thinking`)
 18. 🔄 思考转内容,支持在 `渠道-编辑-渠道额外设置` 中设置 `thinking_to_content` 选项,默认`false`,开启后会将思考内容`reasoning_content`转换为`<think>`标签拼接到内容中返回。
 19. 🔄 模型限流,支持在 `系统设置-速率限制设置` 中设置模型限流,支持设置总请求数限制和成功请求数限制
+20. 💰 缓存计费支持,开启后可以在缓存命中时按照设定的比例计费:
+    1. 在 `系统设置-运营设置` 中设置 `提示缓存倍率` 选项
+    2. 在渠道中设置 `提示缓存倍率`,范围 0-1,例如设置为 0.5 表示缓存命中时按照 50% 计费
+    3. 支持的渠道:
+        - [x] OpenAI
+        - [x] Azure
+        - [x] DeepSeek
+        - [ ] Claude
 
 ## 模型支持
 此版本额外支持以下模型:
diff --git a/web/src/i18n/locales/en.json b/web/src/i18n/locales/en.json
index 89b2bcbb..9951534e 100644
--- a/web/src/i18n/locales/en.json
+++ b/web/src/i18n/locales/en.json
@@ -1338,5 +1338,9 @@
   "0.1-1之间的小数": "Decimal between 0.1 and 1",
   "模型相关设置": "Model related settings",
   "收起侧边栏": "Collapse sidebar",
-  "展开侧边栏": "Expand sidebar"
+  "展开侧边栏": "Expand sidebar",
+  "提示缓存倍率": "Prompt cache ratio",
+  "缓存:${{price}} * {{ratio}} = ${{total}} / 1M tokens (缓存倍率: {{cacheRatio}})": "Cache: ${{price}} * {{ratio}} = ${{total}} / 1M tokens (cache ratio: {{cacheRatio}})",
+  "提示 {{nonCacheInput}} tokens + 缓存 {{cacheInput}} tokens * {{cacheRatio}} / 1M tokens * ${{price}} + 补全 {{completion}} tokens / 1M tokens * ${{compPrice}} * 分组 {{ratio}} = ${{total}}": "Prompt {{nonCacheInput}} tokens + cache {{cacheInput}} tokens * {{cacheRatio}} / 1M tokens * ${{price}} + completion {{completion}} tokens / 1M tokens * ${{compPrice}} * group {{ratio}} = ${{total}}",
+  "缓存 Tokens": "Cache Tokens"
 }
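
Review note: the billing string added to `en.json` describes the charge as a formula, but its grouping is ambiguous when read as plain text. The sketch below shows one plausible reading, not the project's actual implementation. It assumes that cached prompt tokens are scaled by the cache ratio before the per-1M price is applied, and that the group ratio multiplies the sum of the prompt and completion costs. The names `Usage` and `estimate_cost` are hypothetical. Only the placeholder names (`nonCacheInput`, `cacheInput`, `cacheRatio`, `price`, `compPrice`, `ratio`) and the 0-1 range come from the diff.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class Usage:
    """Token counts for one request (hypothetical container, not from the diff)."""
    non_cache_input: int  # prompt tokens that missed the cache
    cache_input: int      # prompt tokens served from the cache
    completion: int       # completion tokens


def estimate_cost(usage: Usage, price: float, comp_price: float,
                  cache_ratio: float, group_ratio: float = 1.0) -> float:
    """Estimate the charge in dollars under the assumed reading of the formula.

    price and comp_price are dollars per 1M tokens. cache_ratio must lie in
    the 0-1 range stated in the README (0.5 bills cache hits at 50%).
    """
    if not 0.0 <= cache_ratio <= 1.0:
        raise ValueError("cache_ratio must be between 0 and 1")

    # Cached prompt tokens are discounted before pricing (assumption).
    billable_input = usage.non_cache_input + usage.cache_input * cache_ratio
    input_cost = billable_input / TOKENS_PER_MILLION * price
    completion_cost = usage.completion / TOKENS_PER_MILLION * comp_price

    # The group ratio is assumed to scale the whole sum.
    return (input_cost + completion_cost) * group_ratio


if __name__ == "__main__":
    usage = Usage(non_cache_input=1000, cache_input=4000, completion=500)
    cost = estimate_cost(usage, price=2.0, comp_price=8.0, cache_ratio=0.5)
    print(f"${cost:.6f}")
```

With a cache ratio of 1.0 the function charges cached and uncached prompt tokens identically, so that setting effectively disables the discount under this reading. If the real implementation applies the group ratio differently, the tooltip text would benefit from explicit parentheses.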