Ops monitoring system security hardening and feature optimization (#21)

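* fix(ops): Fix critical security and stability issues in the ops monitoring system

## Fixes

### P0 (critical)

1. **DNS rebinding protection** (ops_alert_service.go)
   - Implement IP pinning to prevent DNS rebinding attacks after validation
   - Use a custom Transport.DialContext that only allows dialing validated public IPs
   - Extend the IP blocklist to cover cloud metadata addresses (169.254.169.254)
   - Add full unit test coverage
2. **OpsAlertService lifecycle management** (wire.go)
   - Call opsAlertService.Start() in ProvideOpsMetricsCollector
   - Ensure stopCtx is initialized correctly to avoid nil pointer issues
   - Start defensively to guarantee service startup order
3. **Database query ordering** (ops_repo.go)
   - Add an explicit ORDER BY updated_at DESC, id DESC to ListRecentSystemMetrics
   - Add an ordering guarantee to GetLatestSystemMetric
   - Avoid false alerts caused by nondeterministic row order from the database

### P1 (important)

4. **Concurrency safety** (ops_metrics_collector.go)
   - Protect the lastGCPauseTotal field with a sync.Mutex
   - Prevent data races
5. **Goroutine leak** (ops_error_logger.go)
   - Use a worker pool to limit the number of concurrent goroutines
   - Use a buffered queue with capacity 256 and 10 fixed workers
   - Enqueue without blocking; drop tasks when the queue is full
6. **Lifecycle control** (ops_alert_service.go)
   - Add Start/Stop methods for graceful shutdown
   - Use a context to control goroutine lifetimes
   - Use a WaitGroup to wait for background tasks to finish
7. **Webhook URL validation** (ops_alert_service.go)
   - Prevent SSRF: validate the scheme and forbid internal IPs
   - Validate DNS resolution and reject domains that resolve to private IPs
   - Add 8 unit tests covering various attack scenarios
8. **Resource leaks** (ops_repo.go)
   - Fix several defer rows.Close() issues
   - Simplify redundant defer func() wrappers
9. **HTTP timeout control** (ops_alert_service.go)
   - Create an http.Client with a 10-second timeout
   - Add the buildWebhookHTTPClient helper
   - Prevent HTTP requests from hanging indefinitely
10. **Database query optimization** (ops_repo.go)
    - Merge the 4 separate queries in GetWindowStats into 1 CTE query
    - Reduce network round trips and table scans
    - Significantly improve performance
11. **Retry mechanism** (ops_alert_service.go)
    - Retry email delivery: up to 3 attempts with exponential backoff (1s/2s/4s)
    - Add a webhook fallback channel
    - Implement complete error handling and logging
12. **Magic numbers** (ops_repo.go, ops_metrics_collector.go)
    - Extract hard-coded numbers into meaningful constants
    - Improve readability and maintainability

## Test verification

- ✅ go test ./internal/service -tags opsalert_unit passes
- ✅ All webhook validation tests pass
- ✅ Retry mechanism tests pass

## Impact

- Significantly improved security of the ops monitoring system
- Improved system stability and performance
- No breaking changes; backward compatible

* feat(ops): Ops monitoring system V2 - full implementation

## Core features

- Ops monitoring dashboard V2 (real-time monitoring, historical trends, alert management)
- Real-time QPS/TPS monitoring over WebSocket (30s heartbeat, automatic reconnect)
- System metrics collection (CPU, memory, latency, error rate, etc.)
- Multi-dimensional statistics (by provider, model, user, and other dimensions)
- Alert rule management (threshold configuration, notification channels)
- Error log tracing (detailed error information, stack traces)

## Database schema (Migration 025)

### Extended existing tables

- ops_system_metrics: add RED metrics, error classification, latency metrics, resource metrics, business metrics
- ops_alert_rules: add JSONB fields (dimension_filters, notify_channels, notify_config)

### New tables

- ops_dimension_stats: multi-dimensional statistics
- ops_data_retention_config: data retention policy configuration

### New views and functions

- ops_latest_metrics: metrics for the latest 1-minute window (field names and window filter fixed)
- ops_active_alerts: currently active alerts (field names and status values fixed)
- calculate_health_score: health score calculation function

## Consistency fixes (score 98/100)

### P0 (blocks the migration)

- ✅ Fix field names in the ops_latest_metrics view (latency_p99→p99_latency_ms, cpu_usage→cpu_usage_percent)
- ✅ Fix field names in the ops_active_alerts view (metric→metric_type, triggered_at→fired_at, trigger_value→metric_value, threshold→threshold_value)
- ✅ Unify the alert history table name (drop ops_alert_history, use ops_alert_events)
- ✅ Unify API parameter limits (limit for ListMetricsHistory and ListErrorLogs changed to 5000)

### P1 (functional completeness)

- ✅ Fix the ops_latest_metrics view not filtering on window_minutes (add WHERE m.window_minutes = 1)
- ✅ Fix the data backfill UPDATE logic (QPS is now computed as request_count/(window_minutes*60.0))
- ✅ Add backend support for the ops_alert_rules JSONB fields (Go structs + serialization)

### P2 (optimization)

- ✅ Frontend WebSocket auto-reconnect (exponential backoff 1s→2s→4s→8s→16s, at most 5 attempts)
- ✅ Backend WebSocket heartbeat detection (30s ping, 60s pong timeout)

## Technical implementation

### Backend (Go)

- Handler layer: ops_handler.go (REST API), ops_ws_handler.go (WebSocket)
- Service layer: ops_service.go (core logic), ops_cache.go (cache), ops_alerts.go (alerts)
- Repository layer: ops_repo.go (data access), ops.go (model definitions)
- Routing: admin.go (new ops routes)
- Dependency injection: wire_gen.go (auto-generated)

### Frontend (Vue3 + TypeScript)

- Component: OpsDashboardV2.vue (main dashboard component)
- API: ops.ts (REST API + WebSocket wrapper)
- Routing: index.ts (new /admin/ops route)
- i18n: en.ts, zh.ts (English and Chinese)

## Test verification

- ✅ All Go tests pass
- ✅ Migration runs successfully
- ✅ WebSocket connection is stable
- ✅ Frontend and backend data structures are aligned

* refactor: Code cleanup and test optimization

## Test file optimization

- Simplify integration test fixtures and assertions
- Improve test helper functions
- Unify test data formats

## Code cleanup

- Remove unused code and comments
- Simplify the concurrency_cache implementation
- Improve middleware error handling

## Minor fixes

- Fix minor issues in gateway_handler and openai_gateway_handler
- Unify code style and formatting

Change statistics: 27 files, 292 lines added, 322 lines deleted (net reduction of 30 lines)

* fix(ops): Ops monitoring system security hardening and feature optimization

## Security enhancements

- feat(security): WebSocket log redaction to prevent token/api_key leaks
- feat(security): X-Forwarded-Host allowlist validation to prevent CSRF bypass
- feat(security): Configurable Origin policy with strict/permissive modes
- feat(auth): WebSocket authentication accepts a token passed as a query parameter

## Configuration

- feat(config): Support environment variables for proxy trust and Origin policy
  - OPS_WS_TRUST_PROXY
  - OPS_WS_TRUSTED_PROXIES
  - OPS_WS_ORIGIN_POLICY
- fix(ops): Lower the error log query limit from 5000 to 500 to reduce memory usage

## Architecture improvements

- refactor(ops): Decouple the alert service so it runs its own evaluation timer
- refactor(ops): Unify OpsDashboard into a single version, removing the separate V2

## Tests and docs

- test(ops): Add WebSocket security validation unit tests (8 test cases)
- test(ops): Add alert service integration tests
- docs(api): Update the API docs and note the limit change
- docs: Add a CHANGELOG recording breaking changes

## Files changed

Backend:
- backend/internal/server/middleware/logger.go
- backend/internal/handler/admin/ops_handler.go
- backend/internal/handler/admin/ops_ws_handler.go
- backend/internal/server/middleware/admin_auth.go
- backend/internal/service/ops_alert_service.go
- backend/internal/service/ops_metrics_collector.go
- backend/internal/service/wire.go

Frontend:
- frontend/src/views/admin/ops/OpsDashboard.vue
- frontend/src/router/index.ts
- frontend/src/api/admin/ops.ts

Tests:
- backend/internal/handler/admin/ops_ws_handler_test.go (new)
- backend/internal/service/ops_alert_service_integration_test.go (new)

Docs:
- CHANGELOG.md (new)
- docs/API-运维监控中心2.0.md (updated)

* fix(migrations): Fix a type mismatch in the calculate_health_score function

Add explicit type casts in the ops_latest_metrics view so that argument types match the function signature

* fix(lint): Fix all issues reported by golangci-lint

- Move the Redis dependency from the service layer to the repository layer
- Add error checks (WebSocket connection and read timeouts)
- Run gofmt to format the code
- Add nil pointer checks
- Remove the unused alertService field

Issues fixed:
- depguard: 3 (the service layer should not import redis directly)
- errcheck: 3 (unchecked error return values)
- gofmt: 2 (formatting issues)
- staticcheck: 4 (nil pointer dereferences)
- unused: 1 (unused field)

Code statistics:
- Files changed: 11
- Lines deleted: 490
- Lines added: 105
- Net reduction: 385 lines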
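
The webhook URL validation described in the commit message rejects hosts that resolve to private or cloud metadata addresses. The sketch below illustrates that kind of range check in TypeScript. The commit itself implements the check in Go inside ops_alert_service.go, so the function name `isBlockedIPv4` and the exact set of ranges are assumptions made for illustration, and only IPv4 literals are handled.

```typescript
// Hypothetical helper, not taken from the commit (which does this in Go).
// Returns true when an IPv4 literal falls in a range a webhook should never reach.
function isBlockedIPv4(ip: string): boolean {
  const parts = ip.split('.')
  if (parts.length !== 4) return true // malformed input is rejected, not trusted

  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN))
  if (octets.some((o) => Number.isNaN(o) || o > 255)) return true

  const [a, b] = octets
  if (a === 0) return true // 0.0.0.0/8, "this network"
  if (a === 10) return true // 10.0.0.0/8, private
  if (a === 127) return true // 127.0.0.0/8, loopback
  if (a === 169 && b === 254) return true // link-local, includes 169.254.169.254
  if (a === 172 && b >= 16 && b <= 31) return true // 172.16.0.0/12, private
  if (a === 192 && b === 168) return true // 192.168.0.0/16, private
  if (a >= 224) return true // multicast and reserved space
  return false
}
```

A range check on its own does not stop DNS rebinding, because a hostname can resolve to a different address at connect time. That is why the commit also pins the validated IP in Transport.DialContext.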
2026-01-02 20:01:12 +08:00
parent 7fdc2b2d29
commit 45bd9ac705
171 changed files with 10618 additions and 2965 deletions
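
The commit message states that the backfill now computes QPS as request_count/(window_minutes*60.0). A small worked example of that arithmetic is sketched below in TypeScript. The commit applies the formula in a SQL UPDATE, so the helper name `averageQPS` and the guard against non-positive windows are assumptions made for illustration.

```typescript
// Hypothetical helper; the commit applies this formula in SQL, not TypeScript.
// Average requests per second over a window measured in minutes.
function averageQPS(requestCount: number, windowMinutes: number): number {
  if (windowMinutes <= 0) {
    throw new RangeError('windowMinutes must be positive')
  }
  return requestCount / (windowMinutes * 60)
}

const oneMinute = averageQPS(1200, 1) // 1200 requests / 60 s = 20
const fiveMinutes = averageQPS(1200, 5) // 1200 requests / 300 s = 4
console.log(oneMinute, fiveMinutes)
```

The `60.0` literal in the SQL form presumably forces floating-point division, so that low-traffic windows are not truncated to zero by integer division.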
--- a/frontend/src/api/admin/dashboard.ts
+++ b/frontend/src/api/admin/dashboard.ts
@@ -8,7 +8,7 @@ import type {
  DashboardStats,
  TrendDataPoint,
  ModelStat,
-  ApiKeyUsageTrendPoint,
+  APIKeyUsageTrendPoint,
  UserUsageTrendPoint
 } from '@/types'

@@ -93,7 +93,7 @@ export interface ApiKeyTrendParams extends TrendParams {
 }

 export interface ApiKeyTrendResponse {
-  trend: ApiKeyUsageTrendPoint[]
+  trend: APIKeyUsageTrendPoint[]
  start_date: string
  end_date: string
  granularity: string
--- a/frontend/src/api/admin/index.ts
+++ b/frontend/src/api/admin/index.ts
@@ -15,6 +15,7 @@ import subscriptionsAPI from './subscriptions'
 import usageAPI from './usage'
 import geminiAPI from './gemini'
 import antigravityAPI from './antigravity'
+import opsAPI from './ops'
 import userAttributesAPI from './userAttributes'

 /**
@@ -33,6 +34,7 @@ export const adminAPI = {
  usage: usageAPI,
  gemini: geminiAPI,
  antigravity: antigravityAPI,
+  ops: opsAPI,
  userAttributes: userAttributesAPI
 }

@@ -49,6 +51,7 @@ export {
  usageAPI,
  geminiAPI,
  antigravityAPI,
+  opsAPI,
  userAttributesAPI
 }

--- a/frontend/src/api/admin/ops.ts
+++ b/frontend/src/api/admin/ops.ts
@@ -0,0 +1,324 @@
+/**
+ * Admin Ops API endpoints
+ * Provides stability metrics and error logs for ops dashboard
+ */
+
+import { apiClient } from '../client'
+
+export type OpsSeverity = 'P0' | 'P1' | 'P2' | 'P3'
+export type OpsPhase =
+  | 'auth'
+  | 'concurrency'
+  | 'billing'
+  | 'scheduling'
+  | 'network'
+  | 'upstream'
+  | 'response'
+  | 'internal'
+export type OpsPlatform = 'gemini' | 'openai' | 'anthropic' | 'antigravity'
+
+export interface OpsMetrics {
+  window_minutes: number
+  request_count: number
+  success_count: number
+  error_count: number
+  success_rate: number
+  error_rate: number
+  p95_latency_ms: number
+  p99_latency_ms: number
+  http2_errors: number
+  active_alerts: number
+  cpu_usage_percent?: number
+  memory_used_mb?: number
+  memory_total_mb?: number
+  memory_usage_percent?: number
+  heap_alloc_mb?: number
+  gc_pause_ms?: number
+  concurrency_queue_depth?: number
+  updated_at?: string
+}
+
+export interface OpsErrorLog {
+  id: number
+  created_at: string
+  phase: OpsPhase
+  type: string
+  severity: OpsSeverity
+  status_code: number
+  platform: OpsPlatform
+  model: string
+  latency_ms: number | null
+  request_id: string
+  message: string
+  user_id?: number | null
+  api_key_id?: number | null
+  account_id?: number | null
+  group_id?: number | null
+  client_ip?: string
+  request_path?: string
+  stream?: boolean
+}
+
+export interface OpsErrorListParams {
+  start_time?: string
+  end_time?: string
+  platform?: OpsPlatform
+  phase?: OpsPhase
+  severity?: OpsSeverity
+  q?: string
+  /**
+   * Max 500 (legacy endpoint uses a hard cap); use paginated /admin/ops/errors for larger result sets.
+   */
+  limit?: number
+}
+
+export interface OpsErrorListResponse {
+  items: OpsErrorLog[]
+  total?: number
+}
+
+export interface OpsMetricsHistoryParams {
+  window_minutes?: number
+  minutes?: number
+  start_time?: string
+  end_time?: string
+  limit?: number
+}
+
+export interface OpsMetricsHistoryResponse {
+  items: OpsMetrics[]
+}
+
+/**
+ * Get latest ops metrics snapshot
+ */
+export async function getMetrics(): Promise<OpsMetrics> {
+  const { data } = await apiClient.get<OpsMetrics>('/admin/ops/metrics')
+  return data
+}
+
+/**
+ * List metrics history for charts
+ */
+export async function listMetricsHistory(params?: OpsMetricsHistoryParams): Promise<OpsMetricsHistoryResponse> {
+  const { data } = await apiClient.get<OpsMetricsHistoryResponse>('/admin/ops/metrics/history', { params })
+  return data
+}
+
+/**
+ * List recent error logs with optional filters
+ */
+export async function listErrors(params?: OpsErrorListParams): Promise<OpsErrorListResponse> {
+  const { data } = await apiClient.get<OpsErrorListResponse>('/admin/ops/error-logs', { params })
+  return data
+}
+
+export interface OpsDashboardOverview {
+  timestamp: string
+  health_score: number
+  sla: {
+    current: number
+    threshold: number
+    status: string
+    trend: string
+    change_24h: number
+  }
+  qps: {
+    current: number
+    peak_1h: number
+    avg_1h: number
+    change_vs_yesterday: number
+  }
+  tps: {
+    current: number
+    peak_1h: number
+    avg_1h: number
+  }
+  latency: {
+    p50: number
+    p95: number
+    p99: number
+    p999: number
+    avg: number
+    max: number
+    threshold_p99: number
+    status: string
+  }
+  errors: {
+    total_count: number
+    error_rate: number
+    '4xx_count': number
+    '5xx_count': number
+    timeout_count: number
+    top_error?: {
+      code: string
+      message: string
+      count: number
+    }
+  }
+  resources: {
+    cpu_usage: number
+    memory_usage: number
+    disk_usage: number
+    goroutines: number
+    db_connections: {
+      active: number
+      idle: number
+      waiting: number
+      max: number
+    }
+  }
+  system_status: {
+    redis: string
+    database: string
+    background_jobs: string
+  }
+}
+
+export interface ProviderHealthData {
+  name: string
+  request_count: number
+  success_rate: number
+  error_rate: number
+  latency_avg: number
+  latency_p99: number
+  status: string
+  errors_by_type: {
+    '4xx': number
+    '5xx': number
+    timeout: number
+  }
+}
+
+export interface ProviderHealthResponse {
+  providers: ProviderHealthData[]
+  summary: {
+    total_requests: number
+    avg_success_rate: number
+    best_provider: string
+    worst_provider: string
+  }
+}
+
+export interface LatencyHistogramResponse {
+  buckets: {
+    range: string
+    count: number
+    percentage: number
+  }[]
+  total_requests: number
+  slow_request_threshold: number
+}
+
+export interface ErrorDistributionResponse {
+  items: {
+    code: string
+    message: string
+    count: number
+    percentage: number
+  }[]
+}
+
+/**
+ * Get realtime ops dashboard overview
+ */
+export async function getDashboardOverview(timeRange = '1h'): Promise<OpsDashboardOverview> {
+  const { data } = await apiClient.get<OpsDashboardOverview>('/admin/ops/dashboard/overview', {
+    params: { time_range: timeRange }
+  })
+  return data
+}
+
+/**
+ * Get provider health comparison
+ */
+export async function getProviderHealth(timeRange = '1h'): Promise<ProviderHealthResponse> {
+  const { data } = await apiClient.get<ProviderHealthResponse>('/admin/ops/dashboard/providers', {
+    params: { time_range: timeRange }
+  })
+  return data
+}
+
+/**
+ * Get latency histogram
+ */
+export async function getLatencyHistogram(timeRange = '1h'): Promise<LatencyHistogramResponse> {
+  const { data } = await apiClient.get<LatencyHistogramResponse>('/admin/ops/dashboard/latency-histogram', {
+    params: { time_range: timeRange }
+  })
+  return data
+}
+
+/**
+ * Get error distribution
+ */
+export async function getErrorDistribution(timeRange = '1h'): Promise<ErrorDistributionResponse> {
+  const { data } = await apiClient.get<ErrorDistributionResponse>('/admin/ops/dashboard/errors/distribution', {
+    params: { time_range: timeRange }
+  })
+  return data
+}
+
+/**
+ * Subscribe to realtime QPS updates via WebSocket
+ */
+export function subscribeQPS(onMessage: (data: any) => void): () => void {
+  let ws: WebSocket | null = null
+  let reconnectAttempts = 0
+  const maxReconnectAttempts = 5
+  let reconnectTimer: ReturnType<typeof setTimeout> | null = null
+  let shouldReconnect = true
+
+  const connect = () => {
+    const protocol = window.location.protocol === 'https:' ? 'wss:' : 'ws:'
+    const host = window.location.host
+    ws = new WebSocket(`${protocol}//${host}/api/v1/admin/ops/ws/qps`)
+
+    ws.onopen = () => {
+      console.log('[OpsWS] Connected')
+      reconnectAttempts = 0
+    }
+
+    ws.onmessage = (e) => {
+      const data = JSON.parse(e.data)
+      onMessage(data)
+    }
+
+    ws.onerror = (error) => {
+      console.error('[OpsWS] Connection error:', error)
+    }
+
+    ws.onclose = () => {
+      console.log('[OpsWS] Connection closed')
+      if (shouldReconnect && reconnectAttempts < maxReconnectAttempts) {
+        const delay = Math.min(1000 * Math.pow(2, reconnectAttempts), 30000)
+        console.log(`[OpsWS] Reconnecting in ${delay}ms...`)
+        reconnectTimer = setTimeout(() => {
+          reconnectAttempts++
+          connect()
+        }, delay)
+      }
+    }
+  }
+
+  connect()
+
+  return () => {
+    shouldReconnect = false
+    if (reconnectTimer) clearTimeout(reconnectTimer)
+    if (ws) ws.close()
+  }
+}
+
+export const opsAPI = {
+  getMetrics,
+  listMetricsHistory,
+  listErrors,
+  getDashboardOverview,
+  getProviderHealth,
+  getLatencyHistogram,
+  getErrorDistribution,
+  subscribeQPS
+}
+
+export default opsAPI
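
The reconnect logic in subscribeQPS above derives each delay from `Math.min(1000 * Math.pow(2, reconnectAttempts), 30000)` and stops after `maxReconnectAttempts = 5`. The sketch below isolates that expression to show the resulting schedule. The helper name `reconnectDelayMs` and its default parameters are assumptions made for illustration and are not part of ops.ts.

```typescript
// Hypothetical helper mirroring the delay expression used in subscribeQPS.
// attempt is zero-based: the first reconnect uses attempt = 0.
function reconnectDelayMs(attempt: number, baseMs = 1000, capMs = 30000): number {
  return Math.min(baseMs * Math.pow(2, attempt), capMs)
}

// Delays for the five attempts permitted by maxReconnectAttempts = 5.
const schedule = Array.from({ length: 5 }, (_, attempt) => reconnectDelayMs(attempt))
console.log(schedule.join(', ')) // 1000, 2000, 4000, 8000, 16000
```

With five attempts the 30-second cap is never reached. It would only matter if the attempt limit were raised.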
--- a/frontend/src/components/layout/AppSidebar.vue
+++ b/frontend/src/components/layout/AppSidebar.vue
@@ -183,6 +183,21 @@ const DashboardIcon = {
    )
 }

+const ActivityIcon = {
+  render: () =>
+    h(
+      'svg',
+      { fill: 'none', viewBox: '0 0 24 24', stroke: 'currentColor', 'stroke-width': '1.5' },
+      [
+        h('path', {
+          'stroke-linecap': 'round',
+          'stroke-linejoin': 'round',
+          d: 'M3 12h4l3 6 4-12 3 6h4'
+        })
+      ]
+    )
+}
+
 const KeyIcon = {
  render: () =>
    h(
@@ -442,6 +457,7 @@ const personalNavItems = computed(() => {
 const adminNavItems = computed(() => {
  const baseItems = [
    { path: '/admin/dashboard', label: t('nav.dashboard'), icon: DashboardIcon },
+    { path: '/admin/ops', label: t('nav.ops'), icon: ActivityIcon },
    { path: '/admin/users', label: t('nav.users'), icon: UsersIcon, hideInSimpleMode: true },
    { path: '/admin/groups', label: t('nav.groups'), icon: FolderIcon, hideInSimpleMode: true },
    { path: '/admin/subscriptions', label: t('nav.subscriptions'), icon: CreditCardIcon, hideInSimpleMode: true },
--- a/frontend/src/i18n/locales/en.ts
+++ b/frontend/src/i18n/locales/en.ts
@@ -127,6 +127,8 @@ export default {
    total: 'Total',
    balance: 'Balance',
    available: 'Available',
+    copy: 'Copy',
+    details: 'Details',
    copiedToClipboard: 'Copied to clipboard',
    copyFailed: 'Failed to copy',
    contactSupport: 'Contact Support',
@@ -147,6 +149,7 @@ export default {
  // Navigation
  nav: {
    dashboard: 'Dashboard',
+    ops: 'Ops Center',
    apiKeys: 'API Keys',
    usage: 'Usage',
    redeem: 'Redeem',
@@ -546,6 +549,123 @@ export default {
      recentUsage: 'Recent Usage',
      failedToLoad: 'Failed to load dashboard statistics'
    },
+    ops: {
+      title: 'Ops Monitoring Center 2.0',
+      description: 'Stability metrics, error distribution, and system health',
+      status: {
+        title: 'System Health Snapshot',
+        subtitle: 'Real-time metrics and error visibility',
+        systemNormal: 'System Normal',
+        systemDegraded: 'System Degraded',
+        systemDown: 'System Down',
+        noData: 'No Data',
+        monitoring: 'Monitoring',
+        lastUpdated: 'Last Updated',
+        live: 'Live',
+        waiting: 'Waiting for data',
+        realtime: 'Connected',
+        disconnected: 'Disconnected'
+      },
+      charts: {
+        errorTrend: 'Error Trend',
+        errorDistribution: 'Error Distribution',
+        errorRate: 'Error Rate',
+        requestCount: 'Request Count',
+        rateLimits: 'Rate Limits (429)',
+        serverErrors: 'Server Errors (5xx)',
+        clientErrors: 'Client Errors (4xx)',
+        otherErrors: 'Other',
+        latencyDist: 'Latency Distribution',
+        providerSla: 'Upstream SLA Comparison',
+        errorDist: 'Error Type Distribution',
+        systemStatus: 'System Resources'
+      },
+      metrics: {
+        successRate: 'Success Rate',
+        errorRate: 'Error Rate',
+        p95: 'P95 Latency',
+        p99: 'P99 Latency',
+        http2Errors: 'HTTP/2 Errors',
+        activeAlerts: 'Active Alerts',
+        cpuUsage: 'CPU Usage',
+        queueDepth: 'Queue Depth',
+        healthScore: 'Health Score',
+        sla: 'Availability (SLA)',
+        qps: 'Real-time QPS',
+        tps: 'Real-time TPS',
+        errorCount: 'Error Count'
+      },
+      errors: {
+        title: 'Recent Errors',
+        subtitle: 'Inspect failures across platforms and phases',
+        count: '{n} errors'
+      },
+      filters: {
+        allSeverities: 'All severities',
+        allPlatforms: 'All platforms',
+        allPhases: 'All phases',
+        p0: 'P0 (Critical)',
+        p1: 'P1 (High)',
+        p2: 'P2 (Medium)',
+        p3: 'P3 (Low)'
+      },
+      searchPlaceholder: 'Search by request ID, model, or message',
+      range: {
+        '15m': 'Last 15 minutes',
+        '1h': 'Last 1 hour',
+        '24h': 'Last 24 hours',
+        '7d': 'Last 7 days'
+      },
+      platform: {
+        anthropic: 'Anthropic',
+        openai: 'OpenAI',
+        gemini: 'Gemini',
+        antigravity: 'Antigravity'
+      },
+      phase: {
+        auth: 'Auth',
+        concurrency: 'Concurrency',
+        billing: 'Billing',
+        scheduling: 'Scheduling',
+        network: 'Network',
+        upstream: 'Upstream',
+        response: 'Response',
+        internal: 'Internal'
+      },
+      severity: {
+        p0: 'P0',
+        p1: 'P1',
+        p2: 'P2',
+        p3: 'P3'
+      },
+      table: {
+        time: 'Time',
+        severity: 'Severity',
+        phase: 'Phase',
+        statusCode: 'Status',
+        platform: 'Platform',
+        model: 'Model',
+        latency: 'Latency',
+        requestId: 'Request ID',
+        message: 'Message'
+      },
+      details: {
+        title: 'Error Details',
+        requestId: 'Request ID',
+        errorMessage: 'Error Message',
+        requestPath: 'Request path',
+        clientIp: 'Client IP',
+        userId: 'User ID',
+        apiKeyId: 'API Key ID',
+        groupId: 'Group ID',
+        stream: 'Stream'
+      },
+      empty: {
+        title: 'No ops data yet',
+        subtitle: 'Enable error logging and metrics to populate this view'
+      },
+      failedToLoad: 'Failed to load ops data'
+    },

    // Users
    users: {
--- a/frontend/src/i18n/locales/zh.ts
+++ b/frontend/src/i18n/locales/zh.ts
@@ -124,6 +124,8 @@ export default {
    total: '总计',
    balance: '余额',
    available: '可用',
+    copy: '复制',
+    details: '详情',
    copiedToClipboard: '已复制到剪贴板',
    copyFailed: '复制失败',
    contactSupport: '联系客服',
@@ -144,6 +146,7 @@ export default {
  // Navigation
  nav: {
    dashboard: '仪表盘',
+    ops: '运维监控',
    apiKeys: 'API 密钥',
    usage: '使用记录',
    redeem: '兑换',
@@ -559,6 +562,123 @@ export default {
      configureSystem: '配置系统设置',
      failedToLoad: '加载仪表盘数据失败'
    },
+    ops: {
+      title: '运维监控中心 2.0',
+      description: '稳定性指标、错误分布与系统健康',
+      status: {
+        title: '系统健康快照',
+        subtitle: '实时指标与错误可见性',
+        systemNormal: '系统正常',
+        systemDegraded: '系统降级',
+        systemDown: '系统异常',
+        noData: '无数据',
+        monitoring: '监控中',
+        lastUpdated: '最后更新',
+        live: '实时',
+        waiting: '等待数据',
+        realtime: '实时连接中',
+        disconnected: '连接已断开'
+      },
+      charts: {
+        errorTrend: '错误趋势',
+        errorDistribution: '错误分布',
+        errorRate: '错误率',
+        requestCount: '请求数',
+        rateLimits: '限流 (429)',
+        serverErrors: '服务端错误 (5xx)',
+        clientErrors: '客户端错误 (4xx)',
+        otherErrors: '其他',
+        latencyDist: '请求延迟分布',
+        providerSla: '上游供应商健康度 (SLA)',
+        errorDist: '错误类型分布',
+        systemStatus: '系统运行状态'
+      },
+      metrics: {
+        successRate: '成功率',
+        errorRate: '错误率',
+        p95: 'P95 延迟',
+        p99: 'P99 延迟',
+        http2Errors: 'HTTP/2 错误',
+        activeAlerts: '活跃告警',
+        cpuUsage: 'CPU 使用率',
+        queueDepth: '排队深度',
+        healthScore: '健康评分',
+        sla: '服务可用率 (SLA)',
+        qps: '实时 QPS',
+        tps: '实时 TPS',
+        errorCount: '周期错误数'
+      },
+      errors: {
+        title: '最近错误',
+        subtitle: '按平台与阶段定位失败原因',
+        count: '{n} 条错误'
+      },
+      filters: {
+        allSeverities: '全部级别',
+        allPlatforms: '全部平台',
+        allPhases: '全部阶段',
+        p0: 'P0（致命）',
+        p1: 'P1（高）',
+        p2: 'P2（中）',
+        p3: 'P3（低）'
+      },
+      searchPlaceholder: '按请求ID、模型或错误信息搜索',
+      range: {
+        '15m': '近 15 分钟',
+        '1h': '近 1 小时',
+        '24h': '近 24 小时',
+        '7d': '近 7 天'
+      },
+      platform: {
+        anthropic: 'Anthropic',
+        openai: 'OpenAI',
+        gemini: 'Gemini',
+        antigravity: 'Antigravity'
+      },
+      phase: {
+        auth: '认证',
+        concurrency: '并发',
+        billing: '计费',
+        scheduling: '调度',
+        network: '网络',
+        upstream: '上游',
+        response: '响应',
+        internal: '内部'
+      },
+      severity: {
+        p0: 'P0',
+        p1: 'P1',
+        p2: 'P2',
+        p3: 'P3'
+      },
+      table: {
+        time: '时间',
+        severity: '级别',
+        phase: '阶段',
+        statusCode: '状态码',
+        platform: '平台',
+        model: '模型',
+        latency: '延迟',
+        requestId: '请求ID',
+        message: '错误信息'
+      },
+      details: {
+        title: '错误详情',
+        requestId: '请求ID',
+        errorMessage: '错误信息',
+        requestPath: '请求路径',
+        clientIp: '客户端IP',
+        userId: '用户ID',
+        apiKeyId: 'API Key ID',
+        groupId: '分组ID',
+        stream: '流式'
+      },
+      empty: {
+        title: '暂无运维数据',
+        subtitle: '启用错误日志与指标采集后将展示在此处'
+      },
+      failedToLoad: '加载运维数据失败'
+    },

    // Users Management
    users: {
--- a/frontend/src/router/index.ts
+++ b/frontend/src/router/index.ts
@@ -163,6 +163,18 @@ const routes: RouteRecordRaw[] = [
      descriptionKey: 'admin.dashboard.description'
    }
  },
+  {
+    path: '/admin/ops',
+    name: 'AdminOps',
+    component: () => import('@/views/admin/ops/OpsDashboard.vue'),
+    meta: {
+      requiresAuth: true,
+      requiresAdmin: true,
+      title: 'Ops Dashboard',
+      titleKey: 'admin.ops.title',
+      descriptionKey: 'admin.ops.description'
+    }
+  },
  {
    path: '/admin/users',
    name: 'AdminUsers',
--- a/frontend/src/types/index.ts
+++ b/frontend/src/types/index.ts
@@ -619,7 +619,7 @@ export interface UserUsageTrendPoint {
  actual_cost: number // 实际扣除
 }

-export interface ApiKeyUsageTrendPoint {
+export interface APIKeyUsageTrendPoint {
  date: string
  api_key_id: number
  key_name: string
--- a/frontend/src/views/admin/ops/OpsDashboard.vue
+++ b/frontend/src/views/admin/ops/OpsDashboard.vue
@@ -0,0 +1,417 @@
+<script setup lang="ts">
+import { ref, computed, onMounted, onUnmounted, watch } from 'vue'
+import { useI18n } from 'vue-i18n'
+import { Bar, Doughnut } from 'vue-chartjs'
+import {
+  Chart as ChartJS,
+  Title,
+  Tooltip,
+  Legend,
+  LineElement,
+  LinearScale,
+  PointElement,
+  CategoryScale,
+  BarElement,
+  ArcElement
+} from 'chart.js'
+import { useIntervalFn } from '@vueuse/core'
+import AppLayout from '@/components/layout/AppLayout.vue'
+import { opsAPI, type OpsDashboardOverview, type ProviderHealthData, type LatencyHistogramResponse, type ErrorDistributionResponse } from '@/api/admin/ops'
+import { useAuthStore } from '@/stores/auth'
+
+ChartJS.register(
+  Title,
+  Tooltip,
+  Legend,
+  LineElement,
+  LinearScale,
+  PointElement,
+  CategoryScale,
+  BarElement,
+  ArcElement
+)
+
+const { t } = useI18n()
+const authStore = useAuthStore()
+const loading = ref(false)
+const errorMessage = ref('')
+const timeRange = ref('1h')
+const lastUpdated = ref(new Date())
+
+const overview = ref<OpsDashboardOverview | null>(null)
+const providers = ref<ProviderHealthData[]>([])
+const latencyData = ref<LatencyHistogramResponse | null>(null)
+const errorDistribution = ref<ErrorDistributionResponse | null>(null)
+
+// WebSocket for real-time QPS
+const realTimeQPS = ref(0)
+const realTimeTPS = ref(0)
+const wsConnected = ref(false)
+let ws: WebSocket | null = null
+let reconnectTimer: ReturnType<typeof setTimeout> | null = null
+
+const connectWS = () => {
+  const protocol = window.location.protocol === 'https:' ? 'wss:' : 'ws:'
+  const wsBaseUrl = import.meta.env.VITE_WS_BASE_URL || window.location.host
+  const wsURL = new URL(`${protocol}//${wsBaseUrl}/api/v1/admin/ops/ws/qps`)
+  const token = authStore.token || localStorage.getItem('auth_token')
+  if (token) {
+    wsURL.searchParams.set('token', token)
+  }
+  ws = new WebSocket(wsURL.toString())
+
+  ws.onopen = () => {
+    wsConnected.value = true
+  }
+
+  ws.onmessage = (event) => {
+    try {
+      const payload = JSON.parse(event.data)
+      if (payload && typeof payload === 'object' && payload.type === 'qps_update' && payload.data) {
+        realTimeQPS.value = payload.data.qps || 0
+        realTimeTPS.value = payload.data.tps || 0
+      }
+    } catch (e) {
+      console.error('WS parse error', e)
+    }
+  }
+
+  ws.onclose = () => {
+    wsConnected.value = false
+    if (reconnectTimer) clearTimeout(reconnectTimer)
+    reconnectTimer = setTimeout(connectWS, 5000)
+  }
+}
+
+const fetchData = async () => {
+  loading.value = true
+  errorMessage.value = ''
+  try {
+    const [ov, pr, lt, er] = await Promise.all([
+      opsAPI.getDashboardOverview(timeRange.value),
+      opsAPI.getProviderHealth(timeRange.value),
+      opsAPI.getLatencyHistogram(timeRange.value),
+      opsAPI.getErrorDistribution(timeRange.value)
+    ])
+    overview.value = ov
+    providers.value = pr.providers
+    latencyData.value = lt
+    errorDistribution.value = er
+    lastUpdated.value = new Date()
+  } catch (err) {
+    console.error('Failed to fetch ops data', err)
+    errorMessage.value = '数据加载失败，请稍后重试'
+  } finally {
+    loading.value = false
+  }
+}
+
+// Refresh data every 30 seconds (fallback for L2/L3)
+useIntervalFn(fetchData, 30000)
+
+onMounted(() => {
+  fetchData()
+  connectWS()
+})
+
+onUnmounted(() => {
+  if (reconnectTimer) clearTimeout(reconnectTimer)
+  if (ws) { ws.onclose = null; ws.close() } // detach onclose first so teardown does not schedule a reconnect
+})
+
+watch(timeRange, () => {
+  fetchData()
+})
+
+// Chart Data: Latency Distribution
+const latencyChartData = computed(() => {
+  if (!latencyData.value) return null
+  return {
+    labels: latencyData.value.buckets.map(b => b.range),
+    datasets: [
+      {
+        label: t('admin.ops.charts.requestCount'),
+        data: latencyData.value.buckets.map(b => b.count),
+        backgroundColor: '#3b82f6',
+        borderRadius: 4
+      }
+    ]
+  }
+})
+
+// Chart Data: Error Distribution
+const errorChartData = computed(() => {
+  if (!errorDistribution.value) return null
+  return {
+    labels: errorDistribution.value.items.map(i => i.code),
+    datasets: [
+      {
+        data: errorDistribution.value.items.map(i => i.count),
+        backgroundColor: [
+          '#ef4444', '#f59e0b', '#3b82f6', '#10b981', '#8b5cf6', '#ec4899'
+        ]
+      }
+    ]
+  }
+})
+
+// Chart Data: Provider SLA
+const providerChartData = computed(() => {
+  if (!providers.value.length) return null
+  return {
+    labels: providers.value.map(p => p.name),
+    datasets: [
+      {
+        label: 'SLA (%)',
+        data: providers.value.map(p => p.success_rate),
+        backgroundColor: providers.value.map(p => p.success_rate > 99.5 ? '#10b981' : p.success_rate > 98 ? '#f59e0b' : '#ef4444'),
+        borderRadius: 4
+      }
+    ]
+  }
+})
+
+const chartOptions = {
+  responsive: true,
+  maintainAspectRatio: false,
+  plugins: {
+    legend: {
+      display: false
+    }
+  },
+  scales: {
+    y: {
+      beginAtZero: true,
+      grid: {
+        display: false
+      }
+    },
+    x: {
+      grid: {
+        display: false
+      }
+    }
+  }
+}
+
+const healthScoreClass = computed(() => {
+  const score = overview.value?.health_score || 0
+  if (score >= 90) return 'text-green-500 border-green-500'
+  if (score >= 70) return 'text-yellow-500 border-yellow-500'
+  return 'text-red-500 border-red-500'
+})
+
+</script>
+
+<template>
+  <AppLayout>
+    <div class="space-y-6 pb-12">
+      <!-- Error Message -->
+      <div v-if="errorMessage" class="rounded-2xl bg-red-50 p-4 text-sm text-red-600 dark:bg-red-900/20 dark:text-red-400">
+        {{ errorMessage }}
+      </div>
+
+      <!-- L1: Header & Realtime Stats -->
+      <div class="flex flex-wrap items-center justify-between gap-4 rounded-2xl bg-white p-6 shadow-sm ring-1 ring-gray-900/5 dark:bg-dark-800 dark:ring-dark-700">
+        <div class="flex items-center gap-6">
+          <!-- Health Score Gauge -->
+          <div class="flex h-20 w-20 flex-col items-center justify-center rounded-full border-4 bg-gray-50 dark:bg-dark-900" :class="healthScoreClass">
+            <span class="text-2xl font-black">{{ overview?.health_score || '--' }}</span>
+            <span class="text-[10px] font-bold opacity-60">HEALTH</span>
+          </div>
+          
+          <div>
+            <h1 class="text-xl font-black text-gray-900 dark:text-white">运维监控中心 2.0</h1>
+            <div class="mt-1 flex items-center gap-3">
+              <span class="flex items-center gap-1.5">
+                <span class="h-2 w-2 rounded-full bg-green-500 animate-pulse" v-if="wsConnected"></span>
+                <span class="h-2 w-2 rounded-full bg-red-500" v-else></span>
+                <span class="text-xs font-medium text-gray-500">{{ wsConnected ? '实时连接中' : '连接已断开' }}</span>
+              </span>
+              <span class="text-xs text-gray-400">最后更新: {{ lastUpdated.toLocaleTimeString() }}</span>
+            </div>
+          </div>
+        </div>
+
+        <div class="flex items-center gap-4">
+          <div class="hidden items-center gap-6 border-r border-gray-100 pr-6 dark:border-dark-700 lg:flex">
+            <div class="text-center">
+              <div class="text-sm font-black text-gray-900 dark:text-white">{{ realTimeQPS.toFixed(1) }}</div>
+              <div class="text-[10px] font-bold text-gray-400 uppercase">实时 QPS</div>
+            </div>
+            <div class="text-center">
+              <div class="text-sm font-black text-gray-900 dark:text-white">{{ (realTimeTPS / 1000).toFixed(1) }}K</div>
+              <div class="text-[10px] font-bold text-gray-400 uppercase">实时 TPS</div>
+            </div>
+          </div>
+          
+          <select v-model="timeRange" class="rounded-lg border-gray-200 bg-gray-50 py-1.5 pl-3 pr-8 text-sm font-medium text-gray-700 focus:border-blue-500 focus:ring-blue-500 dark:border-dark-700 dark:bg-dark-900 dark:text-gray-300">
+            <option value="5m">5 minutes</option>
+            <option value="30m">30 minutes</option>
+            <option value="1h">1 hour</option>
+            <option value="24h">24 hours</option>
+          </select>
+          
+          <button @click="fetchData" :disabled="loading" title="Refresh" aria-label="Refresh" class="flex h-9 w-9 items-center justify-center rounded-lg bg-gray-100 text-gray-500 hover:bg-gray-200 dark:bg-dark-700 dark:text-gray-400">
+            <svg class="h-5 w-5" :class="{ 'animate-spin': loading }" fill="none" viewBox="0 0 24 24" stroke="currentColor">
+              <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M4 4v5h.582m15.356 2A8.001 8.001 0 004.582 9m0 0H9m11 11v-5h-.581m0 0a8.003 8.003 0 01-15.357-2m15.357 2H15" />
+            </svg>
+          </button>
+        </div>
+      </div>
+
+      <!-- L1: Core Metrics Grid -->
+      <div class="grid grid-cols-1 gap-4 sm:grid-cols-2 lg:grid-cols-4">
+        <div class="rounded-2xl bg-white p-5 shadow-sm ring-1 ring-gray-900/5 dark:bg-dark-800 dark:ring-dark-700">
+          <div class="flex items-center justify-between">
+            <span class="text-xs font-bold text-gray-400 uppercase tracking-wider">Service Availability (SLA)</span>
+            <span class="rounded-full bg-green-50 px-2 py-0.5 text-[10px] font-bold text-green-600 dark:bg-green-900/30">{{ overview?.sla.status }}</span>
+          </div>
+          <div class="mt-2 flex items-baseline gap-2">
+            <span class="text-2xl font-black text-gray-900 dark:text-white">{{ overview?.sla.current.toFixed(2) }}%</span>
+            <span class="text-xs font-bold" :class="overview?.sla.change_24h != null && overview.sla.change_24h >= 0 ? 'text-green-500' : 'text-red-500'">
+              {{ overview?.sla.change_24h != null && overview.sla.change_24h >= 0 ? '+' : '' }}{{ overview?.sla.change_24h }}%
+            </span>
+          </div>
+          <div class="mt-3 h-1 w-full overflow-hidden rounded-full bg-gray-100 dark:bg-dark-700">
+            <div class="h-full bg-green-500" :style="{ width: `${overview?.sla.current ?? 0}%` }"></div>
+          </div>
+        </div>
+
+        <div class="rounded-2xl bg-white p-5 shadow-sm ring-1 ring-gray-900/5 dark:bg-dark-800 dark:ring-dark-700">
+          <div class="flex items-center justify-between">
+            <span class="text-xs font-bold text-gray-400 uppercase tracking-wider">P99 Response Latency</span>
+            <span class="rounded-full bg-blue-50 px-2 py-0.5 text-[10px] font-bold text-blue-600 dark:bg-blue-900/30">Target 1s</span>
+          </div>
+          <div class="mt-2 flex items-baseline gap-2">
+            <span class="text-2xl font-black text-gray-900 dark:text-white">{{ overview?.latency.p99 }}ms</span>
+            <span class="text-xs font-bold text-gray-400">Avg: {{ overview?.latency.avg }}ms</span>
+          </div>
+          <div class="mt-3 flex gap-1">
+            <div v-for="i in 10" :key="i" class="h-1 flex-1 rounded-full" :class="i <= (overview?.latency.p99 || 0) / 200 ? 'bg-blue-500' : 'bg-gray-100 dark:bg-dark-700'"></div>
+          </div>
+        </div>
+
+        <div class="rounded-2xl bg-white p-5 shadow-sm ring-1 ring-gray-900/5 dark:bg-dark-800 dark:ring-dark-700">
+          <div class="flex items-center justify-between">
+            <span class="text-xs font-bold text-gray-400 uppercase tracking-wider">Average Request Rate (1h)</span>
+            <svg class="h-4 w-4 text-gray-300" fill="none" viewBox="0 0 24 24" stroke="currentColor"><path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M13 7h8m0 0v8m0-8l-8 8-4-4-6 6" /></svg>
+          </div>
+          <div class="mt-2 flex items-baseline gap-2">
+            <span class="text-2xl font-black text-gray-900 dark:text-white">{{ overview?.qps.avg_1h.toFixed(1) }}</span>
+            <span class="text-xs font-bold text-gray-400">req/s</span>
+          </div>
+          <div class="mt-1 text-[10px] font-bold text-gray-400 uppercase">vs. yesterday: {{ overview?.qps.change_vs_yesterday }}%</div>
+        </div>
+
+        <div class="rounded-2xl bg-white p-5 shadow-sm ring-1 ring-gray-900/5 dark:bg-dark-800 dark:ring-dark-700">
+          <div class="flex items-center justify-between">
+            <span class="text-xs font-bold text-gray-400 uppercase tracking-wider">Errors in Period</span>
+            <span class="rounded-full bg-red-50 px-2 py-0.5 text-[10px] font-bold text-red-600 dark:bg-red-900/30">{{ overview?.errors.error_rate.toFixed(2) }}%</span>
+          </div>
+          <div class="mt-2 flex items-baseline gap-2">
+            <span class="text-2xl font-black text-gray-900 dark:text-white">{{ overview?.errors.total_count }}</span>
+            <span class="text-xs font-bold text-red-500">5xx: {{ overview?.errors['5xx_count'] }}</span>
+          </div>
+          <div class="mt-1 text-[10px] font-bold text-gray-400 uppercase">Top error code: {{ overview?.errors.top_error?.code || 'N/A' }}</div>
+        </div>
+      </div>
+
+      <!-- L2: Visual Analysis -->
+      <div class="grid grid-cols-1 gap-6 lg:grid-cols-2">
+        <!-- Latency Distribution -->
+        <div class="rounded-2xl bg-white p-6 shadow-sm ring-1 ring-gray-900/5 dark:bg-dark-800 dark:ring-dark-700">
+          <div class="mb-6 flex items-center justify-between">
+            <h3 class="text-sm font-black text-gray-900 dark:text-white uppercase tracking-wider">Request Latency Distribution</h3>
+          </div>
+          <div class="h-64">
+            <Bar v-if="latencyChartData" :data="latencyChartData" :options="chartOptions" />
+            <div v-else class="flex h-full items-center justify-center text-gray-400">Loading...</div>
+          </div>
+        </div>
+
+        <!-- Provider Health -->
+        <div class="rounded-2xl bg-white p-6 shadow-sm ring-1 ring-gray-900/5 dark:bg-dark-800 dark:ring-dark-700">
+          <div class="mb-6 flex items-center justify-between">
+            <h3 class="text-sm font-black text-gray-900 dark:text-white uppercase tracking-wider">Upstream Provider Health (SLA)</h3>
+          </div>
+          <div class="h-64">
+            <Bar v-if="providerChartData" :data="providerChartData" :options="chartOptions" />
+            <div v-else class="flex h-full items-center justify-center text-gray-400">Loading...</div>
+          </div>
+        </div>
+
+        <!-- Error Distribution -->
+        <div class="rounded-2xl bg-white p-6 shadow-sm ring-1 ring-gray-900/5 dark:bg-dark-800 dark:ring-dark-700">
+          <div class="mb-6 flex items-center justify-between">
+            <h3 class="text-sm font-black text-gray-900 dark:text-white uppercase tracking-wider">Error Type Distribution</h3>
+          </div>
+          <div class="flex h-64 gap-6">
+            <div class="relative w-1/2">
+              <Doughnut v-if="errorChartData" :data="errorChartData" :options="{ ...chartOptions, cutout: '70%' }" />
+            </div>
+            <div class="flex flex-1 flex-col justify-center space-y-3">
+              <div v-for="(item, idx) in errorDistribution?.items.slice(0, 5)" :key="item.code" class="flex items-center justify-between">
+                <div class="flex items-center gap-2">
+                  <div class="h-2 w-2 rounded-full" :style="{ backgroundColor: ['#ef4444', '#f59e0b', '#3b82f6', '#10b981', '#8b5cf6'][idx] }"></div>
+                  <span class="text-xs font-bold text-gray-700 dark:text-gray-300">{{ item.code }}</span>
+                </div>
+                <span class="text-xs font-black text-gray-900 dark:text-white">{{ item.percentage }}%</span>
+              </div>
+            </div>
+          </div>
+        </div>
+
+        <!-- System Resources -->
+        <div class="rounded-2xl bg-white p-6 shadow-sm ring-1 ring-gray-900/5 dark:bg-dark-800 dark:ring-dark-700">
+          <div class="mb-6 flex items-center justify-between">
+            <h3 class="text-sm font-black text-gray-900 dark:text-white uppercase tracking-wider">System Status</h3>
+          </div>
+          <div class="grid grid-cols-2 gap-6">
+            <div class="space-y-4">
+              <div>
+                <div class="mb-1 flex justify-between text-[10px] font-bold text-gray-400 uppercase">CPU Usage</div>
+                <div class="h-2 w-full rounded-full bg-gray-100 dark:bg-dark-700">
+                  <div class="h-full rounded-full bg-purple-500" :style="{ width: `${overview?.resources.cpu_usage ?? 0}%` }"></div>
+                </div>
+                <div class="mt-1 text-right text-xs font-bold text-gray-900 dark:text-white">{{ overview?.resources.cpu_usage }}%</div>
+              </div>
+              <div>
+                <div class="mb-1 flex justify-between text-[10px] font-bold text-gray-400 uppercase">Memory Usage</div>
+                <div class="h-2 w-full rounded-full bg-gray-100 dark:bg-dark-700">
+                  <div class="h-full rounded-full bg-indigo-500" :style="{ width: `${overview?.resources.memory_usage ?? 0}%` }"></div>
+                </div>
+                <div class="mt-1 text-right text-xs font-bold text-gray-900 dark:text-white">{{ overview?.resources.memory_usage }}%</div>
+              </div>
+            </div>
+            <div class="flex flex-col justify-center space-y-4 rounded-xl bg-gray-50 p-4 dark:bg-dark-900">
+              <div class="flex items-center justify-between">
+                <span class="text-[10px] font-bold text-gray-400 uppercase">Redis Status</span>
+                <span class="text-xs font-bold text-green-500 uppercase">{{ overview?.system_status.redis }}</span>
+              </div>
+              <div class="flex items-center justify-between">
+                <span class="text-[10px] font-bold text-gray-400 uppercase">DB Connections</span>
+                <span class="text-xs font-bold text-gray-900 dark:text-white">{{ overview?.resources.db_connections.active }} / {{ overview?.resources.db_connections.max }}</span>
+              </div>
+              <div class="flex items-center justify-between">
+                <span class="text-[10px] font-bold text-gray-400 uppercase">Goroutines</span>
+                <span class="text-xs font-bold text-gray-900 dark:text-white">{{ overview?.resources.goroutines }}</span>
+              </div>
+            </div>
+          </div>
+        </div>
+      </div>
+    </div>
+  </AppLayout>
+</template>
+
+<style scoped>
+/* Custom select styling */
+select {
+  appearance: none;
+  background-image: url("data:image/svg+xml,%3csvg xmlns='http://www.w3.org/2000/svg' fill='none' viewBox='0 0 20 20'%3e%3cpath stroke='%236b7280' stroke-linecap='round' stroke-linejoin='round' stroke-width='1.5' d='M6 8l4 4 4-4'/%3e%3c/svg%3e");
+  background-repeat: no-repeat;
+  background-position: right 0.5rem center;
+  background-size: 1.5em 1.5em;
+}
+</style>