* fix(ops): 修复运维监控系统的关键安全和稳定性问题
## 修复内容
### P0 严重问题
1. **DNS Rebinding防护** (ops_alert_service.go)
- 实现IP钉住机制防止验证后的DNS rebinding攻击
- 自定义Transport.DialContext强制只允许拨号到验证过的公网IP
- 扩展IP黑名单,包括云metadata地址(169.254.169.254)
- 添加完整的单元测试覆盖
2. **OpsAlertService生命周期管理** (wire.go)
- 在ProvideOpsMetricsCollector中添加opsAlertService.Start()调用
- 确保stopCtx正确初始化,避免nil指针问题
- 实现防御式启动,保证服务启动顺序
3. **数据库查询排序** (ops_repo.go)
- 在ListRecentSystemMetrics中添加显式ORDER BY updated_at DESC, id DESC
- 在GetLatestSystemMetric中添加排序保证
- 避免数据库返回顺序不确定导致告警误判
### P1 重要问题
4. **并发安全** (ops_metrics_collector.go)
- 为lastGCPauseTotal字段添加sync.Mutex保护
- 防止数据竞争
5. **Goroutine泄漏** (ops_error_logger.go)
- 实现worker pool模式限制并发goroutine数量
- 使用256容量缓冲队列和10个固定worker
- 非阻塞投递,队列满时丢弃任务
6. **生命周期控制** (ops_alert_service.go)
- 添加Start/Stop方法实现优雅关闭
- 使用context控制goroutine生命周期
- 实现WaitGroup等待后台任务完成
7. **Webhook URL验证** (ops_alert_service.go)
- 防止SSRF攻击:验证scheme、禁止内网IP
- DNS解析验证,拒绝解析到私有IP的域名
- 添加8个单元测试覆盖各种攻击场景
8. **资源泄漏** (ops_repo.go)
- 修复多处defer rows.Close()问题
- 简化冗余的defer func()包装
9. **HTTP超时控制** (ops_alert_service.go)
- 创建带10秒超时的http.Client
- 添加buildWebhookHTTPClient辅助函数
- 防止HTTP请求无限期挂起
10. **数据库查询优化** (ops_repo.go)
- 将GetWindowStats的4次独立查询合并为1次CTE查询
- 减少网络往返和表扫描次数
- 显著提升性能
11. **重试机制** (ops_alert_service.go)
- 实现邮件发送重试:最多3次,指数退避(1s/2s/4s)
- 添加webhook备用通道
- 实现完整的错误处理和日志记录
12. **魔法数字** (ops_repo.go, ops_metrics_collector.go)
- 提取硬编码数字为有意义的常量
- 提高代码可读性和可维护性
## 测试验证
- ✅ go test ./internal/service -tags opsalert_unit 通过
- ✅ 所有webhook验证测试通过
- ✅ 重试机制测试通过
## 影响范围
- 运维监控系统安全性显著提升
- 系统稳定性和性能优化
- 无破坏性变更,向后兼容
* feat(ops): 运维监控系统V2 - 完整实现
## 核心功能
- 运维监控仪表盘V2(实时监控、历史趋势、告警管理)
- WebSocket实时QPS/TPS监控(30s心跳,自动重连)
- 系统指标采集(CPU、内存、延迟、错误率等)
- 多维度统计分析(按provider、model、user等维度)
- 告警规则管理(阈值配置、通知渠道)
- 错误日志追踪(详细错误信息、堆栈跟踪)
## 数据库Schema (Migration 025)
### 扩展现有表
- ops_system_metrics: 新增RED指标、错误分类、延迟指标、资源指标、业务指标
- ops_alert_rules: 新增JSONB字段(dimension_filters, notify_channels, notify_config)
### 新增表
- ops_dimension_stats: 多维度统计数据
- ops_data_retention_config: 数据保留策略配置
### 新增视图和函数
- ops_latest_metrics: 最新1分钟窗口指标(已修复字段名和window过滤)
- ops_active_alerts: 当前活跃告警(已修复字段名和状态值)
- calculate_health_score: 健康分数计算函数
## 一致性修复(98/100分)
### P0级别(阻塞Migration)
- ✅ 修复ops_latest_metrics视图字段名(latency_p99→p99_latency_ms, cpu_usage→cpu_usage_percent)
- ✅ 修复ops_active_alerts视图字段名(metric→metric_type, triggered_at→fired_at, trigger_value→metric_value, threshold→threshold_value)
- ✅ 统一告警历史表名(删除ops_alert_history,使用ops_alert_events)
- ✅ 统一API参数限制(ListMetricsHistory和ListErrorLogs的limit改为5000)
### P1级别(功能完整性)
- ✅ 修复ops_latest_metrics视图未过滤window_minutes(添加WHERE m.window_minutes = 1)
- ✅ 修复数据回填UPDATE逻辑(QPS计算改为request_count/(window_minutes*60.0))
- ✅ 添加ops_alert_rules JSONB字段后端支持(Go结构体+序列化)
### P2级别(优化)
- ✅ 前端WebSocket自动重连(指数退避1s→2s→4s→8s→16s,最大5次)
- ✅ 后端WebSocket心跳检测(30s ping,60s pong超时)
## 技术实现
### 后端 (Go)
- Handler层: ops_handler.go(REST API), ops_ws_handler.go(WebSocket)
- Service层: ops_service.go(核心逻辑), ops_cache.go(缓存), ops_alerts.go(告警)
- Repository层: ops_repo.go(数据访问), ops.go(模型定义)
- 路由: admin.go(新增ops相关路由)
- 依赖注入: wire_gen.go(自动生成)
### 前端 (Vue3 + TypeScript)
- 组件: OpsDashboardV2.vue(仪表盘主组件)
- API: ops.ts(REST API + WebSocket封装)
- 路由: index.ts(新增/admin/ops路由)
- 国际化: en.ts, zh.ts(中英文支持)
## 测试验证
- ✅ 所有Go测试通过
- ✅ Migration可正常执行
- ✅ WebSocket连接稳定
- ✅ 前后端数据结构对齐
* refactor: 代码清理和测试优化
## 测试文件优化
- 简化integration test fixtures和断言
- 优化test helper函数
- 统一测试数据格式
## 代码清理
- 移除未使用的代码和注释
- 简化concurrency_cache实现
- 优化middleware错误处理
## 小修复
- 修复gateway_handler和openai_gateway_handler的小问题
- 统一代码风格和格式
变更统计: 27个文件,292行新增,322行删除(净减少30行)
* fix(ops): 运维监控系统安全加固和功能优化
## 安全增强
- feat(security): WebSocket日志脱敏机制,防止token/api_key泄露
- feat(security): X-Forwarded-Host白名单验证,防止CSRF绕过
- feat(security): Origin策略配置化,支持strict/permissive模式
- feat(auth): WebSocket认证支持query参数传递token
## 配置优化
- feat(config): 支持环境变量配置代理信任和Origin策略
- OPS_WS_TRUST_PROXY
- OPS_WS_TRUSTED_PROXIES
- OPS_WS_ORIGIN_POLICY
- fix(ops): 错误日志查询限流从5000降至500,优化内存使用
## 架构改进
- refactor(ops): 告警服务解耦,独立运行评估定时器
- refactor(ops): OpsDashboard统一版本,移除V2分离
## 测试和文档
- test(ops): 添加WebSocket安全验证单元测试(8个测试用例)
- test(ops): 添加告警服务集成测试
- docs(api): 更新API文档,标注限流变更
- docs: 添加CHANGELOG记录breaking changes
## 修复文件
Backend:
- backend/internal/server/middleware/logger.go
- backend/internal/handler/admin/ops_handler.go
- backend/internal/handler/admin/ops_ws_handler.go
- backend/internal/server/middleware/admin_auth.go
- backend/internal/service/ops_alert_service.go
- backend/internal/service/ops_metrics_collector.go
- backend/internal/service/wire.go
Frontend:
- frontend/src/views/admin/ops/OpsDashboard.vue
- frontend/src/router/index.ts
- frontend/src/api/admin/ops.ts
Tests:
- backend/internal/handler/admin/ops_ws_handler_test.go (新增)
- backend/internal/service/ops_alert_service_integration_test.go (新增)
Docs:
- CHANGELOG.md (新增)
- docs/API-运维监控中心2.0.md (更新)
* fix(migrations): 修复calculate_health_score函数类型匹配问题
在ops_latest_metrics视图中添加显式类型转换,确保参数类型与函数签名匹配
* fix(lint): 修复golangci-lint检查发现的所有问题
- 将Redis依赖从service层移到repository层
- 添加错误检查(WebSocket连接和读取超时)
- 运行gofmt格式化代码
- 添加nil指针检查
- 删除未使用的alertService字段
修复问题:
- depguard: 3个(service层不应直接import redis)
- errcheck: 3个(未检查错误返回值)
- gofmt: 2个(代码格式问题)
- staticcheck: 4个(nil指针解引用)
- unused: 1个(未使用字段)
代码统计:
- 修改文件:11个
- 删除代码:490行
- 新增代码:105行
- 净减少:385行
836 lines
27 KiB
Go
836 lines
27 KiB
Go
package service
|
||
|
||
import (
|
||
"context"
|
||
"encoding/json"
|
||
"errors"
|
||
"fmt"
|
||
"io"
|
||
"net/http"
|
||
"regexp"
|
||
"strconv"
|
||
"strings"
|
||
"time"
|
||
|
||
"github.com/Wei-Shaw/sub2api/internal/config"
|
||
"github.com/Wei-Shaw/sub2api/internal/pkg/geminicli"
|
||
"github.com/Wei-Shaw/sub2api/internal/pkg/httpclient"
|
||
)
|
||
|
||
const (
|
||
TierAIPremium = "AI_PREMIUM"
|
||
TierGoogleOneStandard = "GOOGLE_ONE_STANDARD"
|
||
TierGoogleOneBasic = "GOOGLE_ONE_BASIC"
|
||
TierFree = "FREE"
|
||
TierGoogleOneUnknown = "GOOGLE_ONE_UNKNOWN"
|
||
TierGoogleOneUnlimited = "GOOGLE_ONE_UNLIMITED"
|
||
)
|
||
|
||
const (
|
||
GB = 1024 * 1024 * 1024
|
||
TB = 1024 * GB
|
||
|
||
StorageTierUnlimited = 100 * TB // 100TB
|
||
StorageTierAIPremium = 2 * TB // 2TB
|
||
StorageTierStandard = 200 * GB // 200GB
|
||
StorageTierBasic = 100 * GB // 100GB
|
||
StorageTierFree = 15 * GB // 15GB
|
||
)
|
||
|
||
type GeminiOAuthService struct {
|
||
sessionStore *geminicli.SessionStore
|
||
proxyRepo ProxyRepository
|
||
oauthClient GeminiOAuthClient
|
||
codeAssist GeminiCliCodeAssistClient
|
||
cfg *config.Config
|
||
}
|
||
|
||
type GeminiOAuthCapabilities struct {
|
||
AIStudioOAuthEnabled bool `json:"ai_studio_oauth_enabled"`
|
||
RequiredRedirectURIs []string `json:"required_redirect_uris"`
|
||
}
|
||
|
||
func NewGeminiOAuthService(
|
||
proxyRepo ProxyRepository,
|
||
oauthClient GeminiOAuthClient,
|
||
codeAssist GeminiCliCodeAssistClient,
|
||
cfg *config.Config,
|
||
) *GeminiOAuthService {
|
||
return &GeminiOAuthService{
|
||
sessionStore: geminicli.NewSessionStore(),
|
||
proxyRepo: proxyRepo,
|
||
oauthClient: oauthClient,
|
||
codeAssist: codeAssist,
|
||
cfg: cfg,
|
||
}
|
||
}
|
||
|
||
func (s *GeminiOAuthService) GetOAuthConfig() *GeminiOAuthCapabilities {
|
||
// AI Studio OAuth is only enabled when the operator configures a custom OAuth client.
|
||
clientID := strings.TrimSpace(s.cfg.Gemini.OAuth.ClientID)
|
||
clientSecret := strings.TrimSpace(s.cfg.Gemini.OAuth.ClientSecret)
|
||
enabled := clientID != "" && clientSecret != "" &&
|
||
(clientID != geminicli.GeminiCLIOAuthClientID || clientSecret != geminicli.GeminiCLIOAuthClientSecret)
|
||
|
||
return &GeminiOAuthCapabilities{
|
||
AIStudioOAuthEnabled: enabled,
|
||
RequiredRedirectURIs: []string{geminicli.AIStudioOAuthRedirectURI},
|
||
}
|
||
}
|
||
|
||
type GeminiAuthURLResult struct {
|
||
AuthURL string `json:"auth_url"`
|
||
SessionID string `json:"session_id"`
|
||
State string `json:"state"`
|
||
}
|
||
|
||
func (s *GeminiOAuthService) GenerateAuthURL(ctx context.Context, proxyID *int64, redirectURI, projectID, oauthType string) (*GeminiAuthURLResult, error) {
|
||
state, err := geminicli.GenerateState()
|
||
if err != nil {
|
||
return nil, fmt.Errorf("failed to generate state: %w", err)
|
||
}
|
||
codeVerifier, err := geminicli.GenerateCodeVerifier()
|
||
if err != nil {
|
||
return nil, fmt.Errorf("failed to generate code verifier: %w", err)
|
||
}
|
||
codeChallenge := geminicli.GenerateCodeChallenge(codeVerifier)
|
||
sessionID, err := geminicli.GenerateSessionID()
|
||
if err != nil {
|
||
return nil, fmt.Errorf("failed to generate session ID: %w", err)
|
||
}
|
||
|
||
var proxyURL string
|
||
if proxyID != nil {
|
||
proxy, err := s.proxyRepo.GetByID(ctx, *proxyID)
|
||
if err == nil && proxy != nil {
|
||
proxyURL = proxy.URL()
|
||
}
|
||
}
|
||
|
||
// OAuth client selection:
|
||
// - code_assist: always use built-in Gemini CLI OAuth client (public), regardless of configured client_id/secret.
|
||
// - google_one: same as code_assist, uses built-in client for personal Google accounts.
|
||
// - ai_studio: requires a user-provided OAuth client.
|
||
oauthCfg := geminicli.OAuthConfig{
|
||
ClientID: s.cfg.Gemini.OAuth.ClientID,
|
||
ClientSecret: s.cfg.Gemini.OAuth.ClientSecret,
|
||
Scopes: s.cfg.Gemini.OAuth.Scopes,
|
||
}
|
||
if oauthType == "code_assist" || oauthType == "google_one" {
|
||
oauthCfg.ClientID = ""
|
||
oauthCfg.ClientSecret = ""
|
||
}
|
||
|
||
session := &geminicli.OAuthSession{
|
||
State: state,
|
||
CodeVerifier: codeVerifier,
|
||
ProxyURL: proxyURL,
|
||
RedirectURI: redirectURI,
|
||
ProjectID: strings.TrimSpace(projectID),
|
||
OAuthType: oauthType,
|
||
CreatedAt: time.Now(),
|
||
}
|
||
s.sessionStore.Set(sessionID, session)
|
||
|
||
effectiveCfg, err := geminicli.EffectiveOAuthConfig(oauthCfg, oauthType)
|
||
if err != nil {
|
||
return nil, err
|
||
}
|
||
|
||
isBuiltinClient := effectiveCfg.ClientID == geminicli.GeminiCLIOAuthClientID &&
|
||
effectiveCfg.ClientSecret == geminicli.GeminiCLIOAuthClientSecret
|
||
|
||
// AI Studio OAuth requires a user-provided OAuth client (built-in Gemini CLI client is scope-restricted).
|
||
if oauthType == "ai_studio" && isBuiltinClient {
|
||
return nil, fmt.Errorf("AI Studio OAuth requires a custom OAuth Client (GEMINI_OAUTH_CLIENT_ID / GEMINI_OAUTH_CLIENT_SECRET). If you don't want to configure an OAuth client, please use an AI Studio API Key account instead")
|
||
}
|
||
|
||
// Redirect URI strategy:
|
||
// - code_assist: use Gemini CLI redirect URI (codeassist.google.com/authcode)
|
||
// - ai_studio: use localhost callback for manual copy/paste flow
|
||
if oauthType == "code_assist" {
|
||
redirectURI = geminicli.GeminiCLIRedirectURI
|
||
} else {
|
||
redirectURI = geminicli.AIStudioOAuthRedirectURI
|
||
}
|
||
session.RedirectURI = redirectURI
|
||
s.sessionStore.Set(sessionID, session)
|
||
|
||
authURL, err := geminicli.BuildAuthorizationURL(effectiveCfg, state, codeChallenge, redirectURI, session.ProjectID, oauthType)
|
||
if err != nil {
|
||
return nil, err
|
||
}
|
||
|
||
return &GeminiAuthURLResult{
|
||
AuthURL: authURL,
|
||
SessionID: sessionID,
|
||
State: state,
|
||
}, nil
|
||
}
|
||
|
||
type GeminiExchangeCodeInput struct {
|
||
SessionID string
|
||
State string
|
||
Code string
|
||
ProxyID *int64
|
||
OAuthType string // "code_assist" 或 "ai_studio"
|
||
}
|
||
|
||
type GeminiTokenInfo struct {
|
||
AccessToken string `json:"access_token"`
|
||
RefreshToken string `json:"refresh_token"`
|
||
ExpiresIn int64 `json:"expires_in"`
|
||
ExpiresAt int64 `json:"expires_at"`
|
||
TokenType string `json:"token_type"`
|
||
Scope string `json:"scope,omitempty"`
|
||
ProjectID string `json:"project_id,omitempty"`
|
||
OAuthType string `json:"oauth_type,omitempty"` // "code_assist" 或 "ai_studio"
|
||
TierID string `json:"tier_id,omitempty"` // Gemini Code Assist tier: LEGACY/PRO/ULTRA
|
||
Extra map[string]any `json:"extra,omitempty"` // Drive metadata
|
||
}
|
||
|
||
// validateTierID validates tier_id format and length
|
||
func validateTierID(tierID string) error {
|
||
if tierID == "" {
|
||
return nil // Empty is allowed
|
||
}
|
||
if len(tierID) > 64 {
|
||
return fmt.Errorf("tier_id exceeds maximum length of 64 characters")
|
||
}
|
||
// Allow alphanumeric, underscore, hyphen, and slash (for tier paths)
|
||
if !regexp.MustCompile(`^[a-zA-Z0-9_/-]+$`).MatchString(tierID) {
|
||
return fmt.Errorf("tier_id contains invalid characters")
|
||
}
|
||
return nil
|
||
}
|
||
|
||
// extractTierIDFromAllowedTiers extracts tierID from LoadCodeAssist response
|
||
// Prioritizes IsDefault tier, falls back to first non-empty tier
|
||
func extractTierIDFromAllowedTiers(allowedTiers []geminicli.AllowedTier) string {
|
||
tierID := "LEGACY"
|
||
// First pass: look for default tier
|
||
for _, tier := range allowedTiers {
|
||
if tier.IsDefault && strings.TrimSpace(tier.ID) != "" {
|
||
tierID = strings.TrimSpace(tier.ID)
|
||
break
|
||
}
|
||
}
|
||
// Second pass: if still LEGACY, take first non-empty tier
|
||
if tierID == "LEGACY" {
|
||
for _, tier := range allowedTiers {
|
||
if strings.TrimSpace(tier.ID) != "" {
|
||
tierID = strings.TrimSpace(tier.ID)
|
||
break
|
||
}
|
||
}
|
||
}
|
||
return tierID
|
||
}
|
||
|
||
// inferGoogleOneTier infers Google One tier from Drive storage limit
|
||
func inferGoogleOneTier(storageBytes int64) string {
|
||
if storageBytes <= 0 {
|
||
return TierGoogleOneUnknown
|
||
}
|
||
|
||
if storageBytes > StorageTierUnlimited {
|
||
return TierGoogleOneUnlimited
|
||
}
|
||
if storageBytes >= StorageTierAIPremium {
|
||
return TierAIPremium
|
||
}
|
||
if storageBytes >= StorageTierStandard {
|
||
return TierGoogleOneStandard
|
||
}
|
||
if storageBytes >= StorageTierBasic {
|
||
return TierGoogleOneBasic
|
||
}
|
||
if storageBytes >= StorageTierFree {
|
||
return TierFree
|
||
}
|
||
return TierGoogleOneUnknown
|
||
}
|
||
|
||
// FetchGoogleOneTier fetches Google One tier from Drive API
|
||
func (s *GeminiOAuthService) FetchGoogleOneTier(ctx context.Context, accessToken, proxyURL string) (string, *geminicli.DriveStorageInfo, error) {
|
||
driveClient := geminicli.NewDriveClient()
|
||
|
||
storageInfo, err := driveClient.GetStorageQuota(ctx, accessToken, proxyURL)
|
||
if err != nil {
|
||
// Check if it's a 403 (scope not granted)
|
||
if strings.Contains(err.Error(), "status 403") {
|
||
fmt.Printf("[GeminiOAuth] Drive API scope not available: %v\n", err)
|
||
return TierGoogleOneUnknown, nil, err
|
||
}
|
||
// Other errors
|
||
fmt.Printf("[GeminiOAuth] Failed to fetch Drive storage: %v\n", err)
|
||
return TierGoogleOneUnknown, nil, err
|
||
}
|
||
|
||
tierID := inferGoogleOneTier(storageInfo.Limit)
|
||
return tierID, storageInfo, nil
|
||
}
|
||
|
||
// RefreshAccountGoogleOneTier 刷新单个账号的 Google One Tier
|
||
func (s *GeminiOAuthService) RefreshAccountGoogleOneTier(
|
||
ctx context.Context,
|
||
account *Account,
|
||
) (tierID string, extra map[string]any, credentials map[string]any, err error) {
|
||
if account == nil {
|
||
return "", nil, nil, fmt.Errorf("account is nil")
|
||
}
|
||
|
||
// 验证账号类型
|
||
oauthType, ok := account.Credentials["oauth_type"].(string)
|
||
if !ok || oauthType != "google_one" {
|
||
return "", nil, nil, fmt.Errorf("not a google_one OAuth account")
|
||
}
|
||
|
||
// 获取 access_token
|
||
accessToken, ok := account.Credentials["access_token"].(string)
|
||
if !ok || accessToken == "" {
|
||
return "", nil, nil, fmt.Errorf("missing access_token")
|
||
}
|
||
|
||
// 获取 proxy URL
|
||
var proxyURL string
|
||
if account.ProxyID != nil && account.Proxy != nil {
|
||
proxyURL = account.Proxy.URL()
|
||
}
|
||
|
||
// 调用 Drive API
|
||
tierID, storageInfo, err := s.FetchGoogleOneTier(ctx, accessToken, proxyURL)
|
||
if err != nil {
|
||
return "", nil, nil, err
|
||
}
|
||
|
||
// 构建 extra 数据(保留原有 extra 字段)
|
||
extra = make(map[string]any)
|
||
for k, v := range account.Extra {
|
||
extra[k] = v
|
||
}
|
||
if storageInfo != nil {
|
||
extra["drive_storage_limit"] = storageInfo.Limit
|
||
extra["drive_storage_usage"] = storageInfo.Usage
|
||
extra["drive_tier_updated_at"] = time.Now().Format(time.RFC3339)
|
||
}
|
||
|
||
// 构建 credentials 数据
|
||
credentials = make(map[string]any)
|
||
for k, v := range account.Credentials {
|
||
credentials[k] = v
|
||
}
|
||
credentials["tier_id"] = tierID
|
||
|
||
return tierID, extra, credentials, nil
|
||
}
|
||
|
||
func (s *GeminiOAuthService) ExchangeCode(ctx context.Context, input *GeminiExchangeCodeInput) (*GeminiTokenInfo, error) {
|
||
session, ok := s.sessionStore.Get(input.SessionID)
|
||
if !ok {
|
||
return nil, fmt.Errorf("session not found or expired")
|
||
}
|
||
if strings.TrimSpace(input.State) == "" || input.State != session.State {
|
||
return nil, fmt.Errorf("invalid state")
|
||
}
|
||
|
||
proxyURL := session.ProxyURL
|
||
if input.ProxyID != nil {
|
||
proxy, err := s.proxyRepo.GetByID(ctx, *input.ProxyID)
|
||
if err == nil && proxy != nil {
|
||
proxyURL = proxy.URL()
|
||
}
|
||
}
|
||
|
||
redirectURI := session.RedirectURI
|
||
|
||
// Resolve oauth_type early (defaults to code_assist for backward compatibility).
|
||
oauthType := session.OAuthType
|
||
if oauthType == "" {
|
||
oauthType = "code_assist"
|
||
}
|
||
|
||
// If the session was created for AI Studio OAuth, ensure a custom OAuth client is configured.
|
||
if oauthType == "ai_studio" {
|
||
effectiveCfg, err := geminicli.EffectiveOAuthConfig(geminicli.OAuthConfig{
|
||
ClientID: s.cfg.Gemini.OAuth.ClientID,
|
||
ClientSecret: s.cfg.Gemini.OAuth.ClientSecret,
|
||
Scopes: s.cfg.Gemini.OAuth.Scopes,
|
||
}, "ai_studio")
|
||
if err != nil {
|
||
return nil, err
|
||
}
|
||
isBuiltinClient := effectiveCfg.ClientID == geminicli.GeminiCLIOAuthClientID &&
|
||
effectiveCfg.ClientSecret == geminicli.GeminiCLIOAuthClientSecret
|
||
if isBuiltinClient {
|
||
return nil, fmt.Errorf("AI Studio OAuth requires a custom OAuth Client. Please use an AI Studio API Key account, or configure GEMINI_OAUTH_CLIENT_ID / GEMINI_OAUTH_CLIENT_SECRET and re-authorize")
|
||
}
|
||
}
|
||
|
||
// code_assist always uses the built-in client and its fixed redirect URI.
|
||
if oauthType == "code_assist" {
|
||
redirectURI = geminicli.GeminiCLIRedirectURI
|
||
}
|
||
|
||
tokenResp, err := s.oauthClient.ExchangeCode(ctx, oauthType, input.Code, session.CodeVerifier, redirectURI, proxyURL)
|
||
if err != nil {
|
||
return nil, fmt.Errorf("failed to exchange code: %w", err)
|
||
}
|
||
sessionProjectID := strings.TrimSpace(session.ProjectID)
|
||
s.sessionStore.Delete(input.SessionID)
|
||
|
||
// 计算过期时间:减去 5 分钟安全时间窗口(考虑网络延迟和时钟偏差)
|
||
// 同时设置下界保护,防止 expires_in 过小导致过去时间(引发刷新风暴)
|
||
const safetyWindow = 300 // 5 minutes
|
||
const minTTL = 30 // minimum 30 seconds
|
||
expiresAt := time.Now().Unix() + tokenResp.ExpiresIn - safetyWindow
|
||
minExpiresAt := time.Now().Unix() + minTTL
|
||
if expiresAt < minExpiresAt {
|
||
expiresAt = minExpiresAt
|
||
}
|
||
|
||
projectID := sessionProjectID
|
||
var tierID string
|
||
|
||
// 对于 code_assist 模式,project_id 是必需的,需要调用 Code Assist API
|
||
// 对于 google_one 模式,使用个人 Google 账号,不需要 project_id,配额由 Google 网关自动识别
|
||
// 对于 ai_studio 模式,project_id 是可选的(不影响使用 AI Studio API)
|
||
switch oauthType {
|
||
case "code_assist":
|
||
if projectID == "" {
|
||
var err error
|
||
projectID, tierID, err = s.fetchProjectID(ctx, tokenResp.AccessToken, proxyURL)
|
||
if err != nil {
|
||
// 记录警告但不阻断流程,允许后续补充 project_id
|
||
fmt.Printf("[GeminiOAuth] Warning: Failed to fetch project_id during token exchange: %v\n", err)
|
||
}
|
||
} else {
|
||
// 用户手动填了 project_id,仍需调用 LoadCodeAssist 获取 tierID
|
||
_, fetchedTierID, err := s.fetchProjectID(ctx, tokenResp.AccessToken, proxyURL)
|
||
if err != nil {
|
||
fmt.Printf("[GeminiOAuth] Warning: Failed to fetch tierID: %v\n", err)
|
||
} else {
|
||
tierID = fetchedTierID
|
||
}
|
||
}
|
||
if strings.TrimSpace(projectID) == "" {
|
||
return nil, fmt.Errorf("missing project_id for Code Assist OAuth: please fill Project ID (optional field) and regenerate the auth URL, or ensure your Google account has an ACTIVE GCP project")
|
||
}
|
||
// tierID 缺失时使用默认值
|
||
if tierID == "" {
|
||
tierID = "LEGACY"
|
||
}
|
||
case "google_one":
|
||
// Attempt to fetch Drive storage tier
|
||
tierID, storageInfo, err := s.FetchGoogleOneTier(ctx, tokenResp.AccessToken, proxyURL)
|
||
if err != nil {
|
||
// Log warning but don't block - use fallback
|
||
fmt.Printf("[GeminiOAuth] Warning: Failed to fetch Drive tier: %v\n", err)
|
||
tierID = TierGoogleOneUnknown
|
||
}
|
||
|
||
// Store Drive info in extra field for caching
|
||
if storageInfo != nil {
|
||
tokenInfo := &GeminiTokenInfo{
|
||
AccessToken: tokenResp.AccessToken,
|
||
RefreshToken: tokenResp.RefreshToken,
|
||
TokenType: tokenResp.TokenType,
|
||
ExpiresIn: tokenResp.ExpiresIn,
|
||
ExpiresAt: expiresAt,
|
||
Scope: tokenResp.Scope,
|
||
ProjectID: projectID,
|
||
TierID: tierID,
|
||
OAuthType: oauthType,
|
||
Extra: map[string]any{
|
||
"drive_storage_limit": storageInfo.Limit,
|
||
"drive_storage_usage": storageInfo.Usage,
|
||
"drive_tier_updated_at": time.Now().Format(time.RFC3339),
|
||
},
|
||
}
|
||
return tokenInfo, nil
|
||
}
|
||
}
|
||
// ai_studio 模式不设置 tierID,保持为空
|
||
|
||
return &GeminiTokenInfo{
|
||
AccessToken: tokenResp.AccessToken,
|
||
RefreshToken: tokenResp.RefreshToken,
|
||
TokenType: tokenResp.TokenType,
|
||
ExpiresIn: tokenResp.ExpiresIn,
|
||
ExpiresAt: expiresAt,
|
||
Scope: tokenResp.Scope,
|
||
ProjectID: projectID,
|
||
TierID: tierID,
|
||
OAuthType: oauthType,
|
||
}, nil
|
||
}
|
||
|
||
func (s *GeminiOAuthService) RefreshToken(ctx context.Context, oauthType, refreshToken, proxyURL string) (*GeminiTokenInfo, error) {
|
||
var lastErr error
|
||
|
||
for attempt := 0; attempt <= 3; attempt++ {
|
||
if attempt > 0 {
|
||
backoff := time.Duration(1<<uint(attempt-1)) * time.Second
|
||
if backoff > 30*time.Second {
|
||
backoff = 30 * time.Second
|
||
}
|
||
time.Sleep(backoff)
|
||
}
|
||
|
||
tokenResp, err := s.oauthClient.RefreshToken(ctx, oauthType, refreshToken, proxyURL)
|
||
if err == nil {
|
||
// 计算过期时间:减去 5 分钟安全时间窗口(考虑网络延迟和时钟偏差)
|
||
// 同时设置下界保护,防止 expires_in 过小导致过去时间(引发刷新风暴)
|
||
const safetyWindow = 300 // 5 minutes
|
||
const minTTL = 30 // minimum 30 seconds
|
||
expiresAt := time.Now().Unix() + tokenResp.ExpiresIn - safetyWindow
|
||
minExpiresAt := time.Now().Unix() + minTTL
|
||
if expiresAt < minExpiresAt {
|
||
expiresAt = minExpiresAt
|
||
}
|
||
return &GeminiTokenInfo{
|
||
AccessToken: tokenResp.AccessToken,
|
||
RefreshToken: tokenResp.RefreshToken,
|
||
TokenType: tokenResp.TokenType,
|
||
ExpiresIn: tokenResp.ExpiresIn,
|
||
ExpiresAt: expiresAt,
|
||
Scope: tokenResp.Scope,
|
||
}, nil
|
||
}
|
||
|
||
if isNonRetryableGeminiOAuthError(err) {
|
||
return nil, err
|
||
}
|
||
lastErr = err
|
||
}
|
||
|
||
return nil, fmt.Errorf("token refresh failed after retries: %w", lastErr)
|
||
}
|
||
|
||
func isNonRetryableGeminiOAuthError(err error) bool {
|
||
msg := err.Error()
|
||
nonRetryable := []string{
|
||
"invalid_grant",
|
||
"invalid_client",
|
||
"unauthorized_client",
|
||
"access_denied",
|
||
}
|
||
for _, needle := range nonRetryable {
|
||
if strings.Contains(msg, needle) {
|
||
return true
|
||
}
|
||
}
|
||
return false
|
||
}
|
||
|
||
func (s *GeminiOAuthService) RefreshAccountToken(ctx context.Context, account *Account) (*GeminiTokenInfo, error) {
|
||
if account.Platform != PlatformGemini || account.Type != AccountTypeOAuth {
|
||
return nil, fmt.Errorf("account is not a Gemini OAuth account")
|
||
}
|
||
|
||
refreshToken := account.GetCredential("refresh_token")
|
||
if strings.TrimSpace(refreshToken) == "" {
|
||
return nil, fmt.Errorf("no refresh token available")
|
||
}
|
||
|
||
// Preserve oauth_type from the account (defaults to code_assist for backward compatibility).
|
||
oauthType := strings.TrimSpace(account.GetCredential("oauth_type"))
|
||
if oauthType == "" {
|
||
oauthType = "code_assist"
|
||
}
|
||
|
||
var proxyURL string
|
||
if account.ProxyID != nil {
|
||
proxy, err := s.proxyRepo.GetByID(ctx, *account.ProxyID)
|
||
if err == nil && proxy != nil {
|
||
proxyURL = proxy.URL()
|
||
}
|
||
}
|
||
|
||
tokenInfo, err := s.RefreshToken(ctx, oauthType, refreshToken, proxyURL)
|
||
// Backward compatibility:
|
||
// Older versions could refresh Code Assist tokens using a user-provided OAuth client when configured.
|
||
// If the refresh token was originally issued to that custom client, forcing the built-in client will
|
||
// fail with "unauthorized_client". In that case, retry with the custom client (ai_studio path) when available.
|
||
if err != nil && oauthType == "code_assist" && strings.Contains(err.Error(), "unauthorized_client") && s.GetOAuthConfig().AIStudioOAuthEnabled {
|
||
if alt, altErr := s.RefreshToken(ctx, "ai_studio", refreshToken, proxyURL); altErr == nil {
|
||
tokenInfo = alt
|
||
err = nil
|
||
}
|
||
}
|
||
if err != nil {
|
||
// Provide a more actionable error for common OAuth client mismatch issues.
|
||
if strings.Contains(err.Error(), "unauthorized_client") {
|
||
return nil, fmt.Errorf("%w (OAuth client mismatch: the refresh_token is bound to the OAuth client used during authorization; please re-authorize this account or restore the original GEMINI_OAUTH_CLIENT_ID/SECRET)", err)
|
||
}
|
||
return nil, err
|
||
}
|
||
|
||
tokenInfo.OAuthType = oauthType
|
||
|
||
// Preserve account's project_id when present.
|
||
existingProjectID := strings.TrimSpace(account.GetCredential("project_id"))
|
||
if existingProjectID != "" {
|
||
tokenInfo.ProjectID = existingProjectID
|
||
}
|
||
|
||
// 尝试从账号凭证获取 tierID(向后兼容)
|
||
existingTierID := strings.TrimSpace(account.GetCredential("tier_id"))
|
||
|
||
// For Code Assist, project_id is required. Auto-detect if missing.
|
||
// For AI Studio OAuth, project_id is optional and should not block refresh.
|
||
switch oauthType {
|
||
case "code_assist":
|
||
// 先设置默认值或保留旧值,确保 tier_id 始终有值
|
||
if existingTierID != "" {
|
||
tokenInfo.TierID = existingTierID
|
||
} else {
|
||
tokenInfo.TierID = "LEGACY" // 默认值
|
||
}
|
||
|
||
// 尝试自动探测 project_id 和 tier_id
|
||
needDetect := strings.TrimSpace(tokenInfo.ProjectID) == "" || existingTierID == ""
|
||
if needDetect {
|
||
projectID, tierID, err := s.fetchProjectID(ctx, tokenInfo.AccessToken, proxyURL)
|
||
if err != nil {
|
||
fmt.Printf("[GeminiOAuth] Warning: failed to auto-detect project/tier: %v\n", err)
|
||
} else {
|
||
if strings.TrimSpace(tokenInfo.ProjectID) == "" && projectID != "" {
|
||
tokenInfo.ProjectID = projectID
|
||
}
|
||
// 只有当原来没有 tier_id 且探测成功时才更新
|
||
if existingTierID == "" && tierID != "" {
|
||
tokenInfo.TierID = tierID
|
||
}
|
||
}
|
||
}
|
||
|
||
if strings.TrimSpace(tokenInfo.ProjectID) == "" {
|
||
return nil, fmt.Errorf("failed to auto-detect project_id: empty result")
|
||
}
|
||
case "google_one":
|
||
// Check if tier cache is stale (> 24 hours)
|
||
needsRefresh := true
|
||
if account.Extra != nil {
|
||
if updatedAtStr, ok := account.Extra["drive_tier_updated_at"].(string); ok {
|
||
if updatedAt, err := time.Parse(time.RFC3339, updatedAtStr); err == nil {
|
||
if time.Since(updatedAt) <= 24*time.Hour {
|
||
needsRefresh = false
|
||
// Use cached tier
|
||
if existingTierID != "" {
|
||
tokenInfo.TierID = existingTierID
|
||
}
|
||
}
|
||
}
|
||
}
|
||
}
|
||
|
||
if needsRefresh {
|
||
tierID, storageInfo, err := s.FetchGoogleOneTier(ctx, tokenInfo.AccessToken, proxyURL)
|
||
if err == nil && storageInfo != nil {
|
||
tokenInfo.TierID = tierID
|
||
tokenInfo.Extra = map[string]any{
|
||
"drive_storage_limit": storageInfo.Limit,
|
||
"drive_storage_usage": storageInfo.Usage,
|
||
"drive_tier_updated_at": time.Now().Format(time.RFC3339),
|
||
}
|
||
} else {
|
||
// Fallback to cached or unknown
|
||
if existingTierID != "" {
|
||
tokenInfo.TierID = existingTierID
|
||
} else {
|
||
tokenInfo.TierID = TierGoogleOneUnknown
|
||
}
|
||
}
|
||
}
|
||
}
|
||
|
||
return tokenInfo, nil
|
||
}
|
||
|
||
func (s *GeminiOAuthService) BuildAccountCredentials(tokenInfo *GeminiTokenInfo) map[string]any {
|
||
creds := map[string]any{
|
||
"access_token": tokenInfo.AccessToken,
|
||
"expires_at": strconv.FormatInt(tokenInfo.ExpiresAt, 10),
|
||
}
|
||
if tokenInfo.RefreshToken != "" {
|
||
creds["refresh_token"] = tokenInfo.RefreshToken
|
||
}
|
||
if tokenInfo.TokenType != "" {
|
||
creds["token_type"] = tokenInfo.TokenType
|
||
}
|
||
if tokenInfo.Scope != "" {
|
||
creds["scope"] = tokenInfo.Scope
|
||
}
|
||
if tokenInfo.ProjectID != "" {
|
||
creds["project_id"] = tokenInfo.ProjectID
|
||
}
|
||
if tokenInfo.TierID != "" {
|
||
// Validate tier_id before storing
|
||
if err := validateTierID(tokenInfo.TierID); err == nil {
|
||
creds["tier_id"] = tokenInfo.TierID
|
||
}
|
||
// Silently skip invalid tier_id (don't block account creation)
|
||
}
|
||
if tokenInfo.OAuthType != "" {
|
||
creds["oauth_type"] = tokenInfo.OAuthType
|
||
}
|
||
// Store extra metadata (Drive info) if present
|
||
if len(tokenInfo.Extra) > 0 {
|
||
for k, v := range tokenInfo.Extra {
|
||
creds[k] = v
|
||
}
|
||
}
|
||
return creds
|
||
}
|
||
|
||
func (s *GeminiOAuthService) Stop() {
|
||
s.sessionStore.Stop()
|
||
}
|
||
|
||
func (s *GeminiOAuthService) fetchProjectID(ctx context.Context, accessToken, proxyURL string) (string, string, error) {
|
||
if s.codeAssist == nil {
|
||
return "", "", errors.New("code assist client not configured")
|
||
}
|
||
|
||
loadResp, loadErr := s.codeAssist.LoadCodeAssist(ctx, accessToken, proxyURL, nil)
|
||
|
||
// Extract tierID from response (works whether CloudAICompanionProject is set or not)
|
||
tierID := "LEGACY"
|
||
if loadResp != nil {
|
||
tierID = extractTierIDFromAllowedTiers(loadResp.AllowedTiers)
|
||
}
|
||
|
||
// If LoadCodeAssist returned a project, use it
|
||
if loadErr == nil && loadResp != nil && strings.TrimSpace(loadResp.CloudAICompanionProject) != "" {
|
||
return strings.TrimSpace(loadResp.CloudAICompanionProject), tierID, nil
|
||
}
|
||
|
||
req := &geminicli.OnboardUserRequest{
|
||
TierID: tierID,
|
||
Metadata: geminicli.LoadCodeAssistMetadata{
|
||
IDEType: "ANTIGRAVITY",
|
||
Platform: "PLATFORM_UNSPECIFIED",
|
||
PluginType: "GEMINI",
|
||
},
|
||
}
|
||
|
||
maxAttempts := 5
|
||
for attempt := 1; attempt <= maxAttempts; attempt++ {
|
||
resp, err := s.codeAssist.OnboardUser(ctx, accessToken, proxyURL, req)
|
||
if err != nil {
|
||
// If Code Assist onboarding fails (e.g. INVALID_ARGUMENT), fallback to Cloud Resource Manager projects.
|
||
fallback, fbErr := fetchProjectIDFromResourceManager(ctx, accessToken, proxyURL)
|
||
if fbErr == nil && strings.TrimSpace(fallback) != "" {
|
||
return strings.TrimSpace(fallback), tierID, nil
|
||
}
|
||
return "", tierID, err
|
||
}
|
||
if resp.Done {
|
||
if resp.Response != nil && resp.Response.CloudAICompanionProject != nil {
|
||
switch v := resp.Response.CloudAICompanionProject.(type) {
|
||
case string:
|
||
return strings.TrimSpace(v), tierID, nil
|
||
case map[string]any:
|
||
if id, ok := v["id"].(string); ok {
|
||
return strings.TrimSpace(id), tierID, nil
|
||
}
|
||
}
|
||
}
|
||
|
||
fallback, fbErr := fetchProjectIDFromResourceManager(ctx, accessToken, proxyURL)
|
||
if fbErr == nil && strings.TrimSpace(fallback) != "" {
|
||
return strings.TrimSpace(fallback), tierID, nil
|
||
}
|
||
return "", tierID, errors.New("onboardUser completed but no project_id returned")
|
||
}
|
||
time.Sleep(2 * time.Second)
|
||
}
|
||
|
||
fallback, fbErr := fetchProjectIDFromResourceManager(ctx, accessToken, proxyURL)
|
||
if fbErr == nil && strings.TrimSpace(fallback) != "" {
|
||
return strings.TrimSpace(fallback), tierID, nil
|
||
}
|
||
if loadErr != nil {
|
||
return "", tierID, fmt.Errorf("loadCodeAssist failed (%v) and onboardUser timeout after %d attempts", loadErr, maxAttempts)
|
||
}
|
||
return "", tierID, fmt.Errorf("onboardUser timeout after %d attempts", maxAttempts)
|
||
}
|
||
|
||
type googleCloudProject struct {
|
||
ProjectID string `json:"projectId"`
|
||
DisplayName string `json:"name"`
|
||
LifecycleState string `json:"lifecycleState"`
|
||
}
|
||
|
||
type googleCloudProjectsResponse struct {
|
||
Projects []googleCloudProject `json:"projects"`
|
||
}
|
||
|
||
func fetchProjectIDFromResourceManager(ctx context.Context, accessToken, proxyURL string) (string, error) {
|
||
req, err := http.NewRequestWithContext(ctx, http.MethodGet, "https://cloudresourcemanager.googleapis.com/v1/projects", nil)
|
||
if err != nil {
|
||
return "", fmt.Errorf("failed to create resource manager request: %w", err)
|
||
}
|
||
|
||
req.Header.Set("Authorization", "Bearer "+accessToken)
|
||
req.Header.Set("User-Agent", geminicli.GeminiCLIUserAgent)
|
||
|
||
client, err := httpclient.GetClient(httpclient.Options{
|
||
ProxyURL: strings.TrimSpace(proxyURL),
|
||
Timeout: 30 * time.Second,
|
||
})
|
||
if err != nil {
|
||
client = &http.Client{Timeout: 30 * time.Second}
|
||
}
|
||
|
||
resp, err := client.Do(req)
|
||
if err != nil {
|
||
return "", fmt.Errorf("resource manager request failed: %w", err)
|
||
}
|
||
defer func() { _ = resp.Body.Close() }()
|
||
|
||
bodyBytes, err := io.ReadAll(resp.Body)
|
||
if err != nil {
|
||
return "", fmt.Errorf("failed to read resource manager response: %w", err)
|
||
}
|
||
|
||
if resp.StatusCode != http.StatusOK {
|
||
return "", fmt.Errorf("resource manager HTTP %d: %s", resp.StatusCode, string(bodyBytes))
|
||
}
|
||
|
||
var projectsResp googleCloudProjectsResponse
|
||
if err := json.Unmarshal(bodyBytes, &projectsResp); err != nil {
|
||
return "", fmt.Errorf("failed to parse resource manager response: %w", err)
|
||
}
|
||
|
||
active := make([]googleCloudProject, 0, len(projectsResp.Projects))
|
||
for _, p := range projectsResp.Projects {
|
||
if p.LifecycleState == "ACTIVE" && strings.TrimSpace(p.ProjectID) != "" {
|
||
active = append(active, p)
|
||
}
|
||
}
|
||
if len(active) == 0 {
|
||
return "", errors.New("no ACTIVE projects found from resource manager")
|
||
}
|
||
|
||
// Prefer likely companion projects first.
|
||
for _, p := range active {
|
||
id := strings.ToLower(strings.TrimSpace(p.ProjectID))
|
||
name := strings.ToLower(strings.TrimSpace(p.DisplayName))
|
||
if strings.Contains(id, "cloud-ai-companion") || strings.Contains(name, "cloud ai companion") || strings.Contains(name, "code assist") {
|
||
return strings.TrimSpace(p.ProjectID), nil
|
||
}
|
||
}
|
||
// Then prefer "default".
|
||
for _, p := range active {
|
||
id := strings.ToLower(strings.TrimSpace(p.ProjectID))
|
||
name := strings.ToLower(strings.TrimSpace(p.DisplayName))
|
||
if strings.Contains(id, "default") || strings.Contains(name, "default") {
|
||
return strings.TrimSpace(p.ProjectID), nil
|
||
}
|
||
}
|
||
|
||
return strings.TrimSpace(active[0].ProjectID), nil
|
||
}
|