feat: 区分 Anthropic 5m/1h 缓存创建 token 的差异化计费

Anthropic API 的 cache_creation 对象区分了 ephemeral_5m 和 ephemeral_1h
两种缓存创建 token,1h 单价远高于 5m(如 claude-3-5-haiku: 5m=$1/MTok,
1h=$6/MTok)。此前系统统一按 5m 单价计费,导致计费偏低。

后端:
- pricing_service: 加载 LiteLLM 的 cache_creation_input_token_cost_above_1hr
- billing_service: GetModelPricing 启用分类计费(安全守卫 1h>5m),
  CalculateCost 按 5m/1h 分别计费,无明细时回退到 5m 单价
- gateway_service: parseSSEUsage/handleNonStreamingResponse 用 gjson
  提取嵌套 cache_creation 对象的 ephemeral_5m/1h_input_tokens
- antigravity_gateway_service: extractSSEUsage/extractClaudeUsage 同步提取
- usage_log: 修复 GORM column tag 确保写入正确的数据库列
- 新增迁移 054: 删除 GORM 自动生成的重复列

前端:
- 使用记录 tooltip 展示 5m/1h 缓存创建明细(带彩色 badge 区分)
- 表格单元格缓存写入数值旁显示 1h 标识
This commit is contained in:
shaw
2026-02-14 18:15:35 +08:00
parent 2857fa2ef7
commit a817cafe3d
10 changed files with 174 additions and 42 deletions

View File

@@ -70,6 +70,7 @@
<div v-if="row.cache_creation_tokens > 0" class="inline-flex items-center gap-1">
<svg class="h-3.5 w-3.5 text-amber-500" fill="none" stroke="currentColor" viewBox="0 0 24 24"><path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M11 5H6a2 2 0 00-2 2v11a2 2 0 002 2h11a2 2 0 002-2v-5m-1.414-9.414a2 2 0 112.828 2.828L11.828 15H9v-2.828l8.586-8.586z" /></svg>
<span class="font-medium text-amber-600 dark:text-amber-400">{{ formatCacheTokens(row.cache_creation_tokens) }}</span>
<span v-if="row.cache_creation_1h_tokens > 0" class="inline-flex items-center rounded px-1 py-px text-[10px] font-medium leading-tight bg-orange-100 text-orange-600 ring-1 ring-inset ring-orange-200 dark:bg-orange-500/20 dark:text-orange-400 dark:ring-orange-500/30">1h</span>
</div>
</div>
</div>
@@ -157,9 +158,29 @@
<span class="text-gray-400">{{ t('admin.usage.outputTokens') }}</span>
<span class="font-medium text-white">{{ tokenTooltipData.output_tokens.toLocaleString() }}</span>
</div>
<div v-if="tokenTooltipData && tokenTooltipData.cache_creation_tokens > 0" class="flex items-center justify-between gap-4">
<span class="text-gray-400">{{ t('admin.usage.cacheCreationTokens') }}</span>
<span class="font-medium text-white">{{ tokenTooltipData.cache_creation_tokens.toLocaleString() }}</span>
<div v-if="tokenTooltipData && tokenTooltipData.cache_creation_tokens > 0">
<!-- 5m/1h 明细时展开显示 -->
<template v-if="tokenTooltipData.cache_creation_5m_tokens > 0 || tokenTooltipData.cache_creation_1h_tokens > 0">
<div v-if="tokenTooltipData.cache_creation_5m_tokens > 0" class="flex items-center justify-between gap-4">
<span class="text-gray-400 flex items-center gap-1.5">
{{ t('admin.usage.cacheCreation5mTokens') }}
<span class="inline-flex items-center rounded px-1 py-px text-[10px] font-medium leading-tight bg-amber-500/20 text-amber-400 ring-1 ring-inset ring-amber-500/30">5m</span>
</span>
<span class="font-medium text-white">{{ tokenTooltipData.cache_creation_5m_tokens.toLocaleString() }}</span>
</div>
<div v-if="tokenTooltipData.cache_creation_1h_tokens > 0" class="flex items-center justify-between gap-4">
<span class="text-gray-400 flex items-center gap-1.5">
{{ t('admin.usage.cacheCreation1hTokens') }}
<span class="inline-flex items-center rounded px-1 py-px text-[10px] font-medium leading-tight bg-orange-500/20 text-orange-400 ring-1 ring-inset ring-orange-500/30">1h</span>
</span>
<span class="font-medium text-white">{{ tokenTooltipData.cache_creation_1h_tokens.toLocaleString() }}</span>
</div>
</template>
<!-- 无明细时只显示聚合值 -->
<div v-else class="flex items-center justify-between gap-4">
<span class="text-gray-400">{{ t('admin.usage.cacheCreationTokens') }}</span>
<span class="font-medium text-white">{{ tokenTooltipData.cache_creation_tokens.toLocaleString() }}</span>
</div>
</div>
<div v-if="tokenTooltipData && tokenTooltipData.cache_read_tokens > 0" class="flex items-center justify-between gap-4">
<span class="text-gray-400">{{ t('admin.usage.cacheReadTokens') }}</span>

View File

@@ -2360,6 +2360,8 @@ export default {
inputTokens: 'Input Tokens',
outputTokens: 'Output Tokens',
cacheCreationTokens: 'Cache Creation Tokens',
cacheCreation5mTokens: 'Cache Write',
cacheCreation1hTokens: 'Cache Write',
cacheReadTokens: 'Cache Read Tokens',
failedToLoad: 'Failed to load usage records',
billingType: 'Billing Type',

View File

@@ -2527,6 +2527,8 @@ export default {
inputTokens: '输入 Token',
outputTokens: '输出 Token',
cacheCreationTokens: '缓存创建 Token',
cacheCreation5mTokens: '缓存创建',
cacheCreation1hTokens: '缓存创建',
cacheReadTokens: '缓存读取 Token',
failedToLoad: '加载使用记录失败',
billingType: '计费类型',

View File

@@ -233,6 +233,7 @@
<span class="font-medium text-amber-600 dark:text-amber-400">{{
formatCacheTokens(row.cache_creation_tokens)
}}</span>
<span v-if="row.cache_creation_1h_tokens > 0" class="inline-flex items-center rounded px-1 py-px text-[10px] font-medium leading-tight bg-orange-100 text-orange-600 ring-1 ring-inset ring-orange-200 dark:bg-orange-500/20 dark:text-orange-400 dark:ring-orange-500/30">1h</span>
</div>
</div>
</div>
@@ -350,9 +351,29 @@
<span class="text-gray-400">{{ t('admin.usage.outputTokens') }}</span>
<span class="font-medium text-white">{{ tokenTooltipData.output_tokens.toLocaleString() }}</span>
</div>
<div v-if="tokenTooltipData && tokenTooltipData.cache_creation_tokens > 0" class="flex items-center justify-between gap-4">
<span class="text-gray-400">{{ t('admin.usage.cacheCreationTokens') }}</span>
<span class="font-medium text-white">{{ tokenTooltipData.cache_creation_tokens.toLocaleString() }}</span>
<div v-if="tokenTooltipData && tokenTooltipData.cache_creation_tokens > 0">
<!-- 5m/1h 明细时展开显示 -->
<template v-if="tokenTooltipData.cache_creation_5m_tokens > 0 || tokenTooltipData.cache_creation_1h_tokens > 0">
<div v-if="tokenTooltipData.cache_creation_5m_tokens > 0" class="flex items-center justify-between gap-4">
<span class="text-gray-400 flex items-center gap-1.5">
{{ t('admin.usage.cacheCreation5mTokens') }}
<span class="inline-flex items-center rounded px-1 py-px text-[10px] font-medium leading-tight bg-amber-500/20 text-amber-400 ring-1 ring-inset ring-amber-500/30">5m</span>
</span>
<span class="font-medium text-white">{{ tokenTooltipData.cache_creation_5m_tokens.toLocaleString() }}</span>
</div>
<div v-if="tokenTooltipData.cache_creation_1h_tokens > 0" class="flex items-center justify-between gap-4">
<span class="text-gray-400 flex items-center gap-1.5">
{{ t('admin.usage.cacheCreation1hTokens') }}
<span class="inline-flex items-center rounded px-1 py-px text-[10px] font-medium leading-tight bg-orange-500/20 text-orange-400 ring-1 ring-inset ring-orange-500/30">1h</span>
</span>
<span class="font-medium text-white">{{ tokenTooltipData.cache_creation_1h_tokens.toLocaleString() }}</span>
</div>
</template>
<!-- 无明细时只显示聚合值 -->
<div v-else class="flex items-center justify-between gap-4">
<span class="text-gray-400">{{ t('admin.usage.cacheCreationTokens') }}</span>
<span class="font-medium text-white">{{ tokenTooltipData.cache_creation_tokens.toLocaleString() }}</span>
</div>
</div>
<div v-if="tokenTooltipData && tokenTooltipData.cache_read_tokens > 0" class="flex items-center justify-between gap-4">
<span class="text-gray-400">{{ t('admin.usage.cacheReadTokens') }}</span>