This commit introduces a major architectural refactoring to improve quota management, centralize logging, and streamline the relay handling logic.
Key changes:
- **Pre-consume Quota:** Implements a new mechanism to check and reserve user quota *before* making the request to the upstream provider. This ensures more accurate quota deduction and prevents users from exceeding their limits due to concurrent requests.
- **Unified Relay Handlers:** Refactors the relay logic to use generic handlers (e.g., `ChatHandler`, `ImageHandler`) instead of provider-specific implementations. This significantly reduces code duplication and simplifies adding new channels.
- **Centralized Logger:** A new dedicated `logger` package is introduced, and all system logging calls are migrated to use it, moving this responsibility out of the `common` package.
- **Code Reorganization:** DTOs are generalized (e.g., `dalle.go` -> `openai_image.go`) and utility code is moved to more appropriate packages (e.g., `common/http.go` -> `service/http.go`) for better code structure.
This commit refactors the application's error handling mechanism by introducing a new standardized error type, `types.NewAPIError`. It also renames common JSON utility functions for better clarity.
Previously, internal error handling was tightly coupled to the `dto.OpenAIError` format. This change decouples the internal logic from the external API representation.
Key changes:
- A new `types.NewAPIError` struct is introduced to serve as a canonical internal representation for all API errors.
- All relay adapters (OpenAI, Claude, Gemini, etc.) are updated to return `*types.NewAPIError`.
- Controllers now convert the internal `NewAPIError` to the client-facing `OpenAIError` format at the API boundary, ensuring backward compatibility.
- Channel auto-disable/enable logic is updated to use the new standardized error type.
- JSON utility functions are renamed to align with Go's standard library conventions (e.g., `UnmarshalJson` -> `Unmarshal`, `EncodeJson` -> `Marshal`).
- Introduced `isNoThinkingRequest` and `trimModelThinking` functions to manage model names and thinking configurations.
- Updated `GeminiHelper` to conditionally adjust the model name based on the thinking budget and request settings.
- Refactored `ThinkingAdaptor` to streamline the integration of thinking capabilities into Gemini requests.
- Cleaned up commented-out code in `FetchUpstreamModels` for clarity.
These changes improve the handling of model configurations and enhance the adaptability of the Gemini relay system.
- Added new handlers: AudioHelper, ImageHelper, EmbeddingHelper, and ResponsesHelper to manage respective requests.
- Updated ModelMappedHelper to accept request parameters for better model mapping.
- Enhanced error handling and validation across new handlers to ensure robust request processing.
- Introduced support for new relay formats in relay_info and updated relevant functions accordingly.