Mirror the existing /v1/messages platform split pattern:
- OpenAI groups → OpenAIGateway handlers (existing, unchanged)
- Non-OpenAI groups → Gateway handlers (new Anthropic-upstream path)
Updated both /v1 prefixed routes and non-prefixed alias routes
(/responses, /chat/completions). WebSocket route (/v1/responses GET)
remains OpenAI-only as Anthropic has no WebSocket equivalent.
New HTTP handlers for Anthropic platform groups accepting OpenAI-format
endpoints:
- GatewayHandler.Responses: /v1/responses for non-OpenAI groups
- GatewayHandler.ChatCompletions: /v1/chat/completions for non-OpenAI groups
Both handlers include:
- Claude Code only restriction (403 reject when claude_code_only enabled,
since these endpoints are never Claude Code clients)
- Full auth → billing → user/account concurrency → failover loop
- Ops error/endpoint context propagation
- Async usage recording via worker pool
Error responses use each endpoint's native format (Responses API format
for /v1/responses, CC format for /v1/chat/completions).
New forwarding methods on GatewayService for Anthropic platform groups:
- ForwardAsResponses: accept Responses body → convert to Anthropic →
forward to upstream → convert response back to Responses format.
Supports both streaming (SSE event-by-event conversion) and buffered
(accumulate then convert) response modes.
- ForwardAsChatCompletions: chain CC→Responses→Anthropic for request,
Anthropic→Responses→CC for response. Streaming uses dual state machine
chain with [DONE] marker.
Both methods reuse existing GatewayService infrastructure:
buildUpstreamRequest, Claude Code mimicry, cache control enforcement,
model mapping, and return UpstreamFailoverError for handler-level retry.
Empty text blocks inside tool_result.content were not being filtered,
causing upstream 400 errors: 'text content blocks must be non-empty'.
Changes:
- Add stripEmptyTextBlocksFromSlice helper for recursive content filtering
- FilterThinkingBlocksForRetry now recurses into tool_result nested content
- Add StripEmptyTextBlocks pre-filter on initial request path to avoid
unnecessary 400+retry round-trips
- Add unit tests for nested empty text block scenarios
LegacyPrintf uses inferStdLogLevel() to infer log level from message
text. Any message containing the word "error" is classified as ERROR
level, causing the entire signature-retry recovery flow (which succeeds)
to produce spurious ERROR log entries.
Changes:
- Remove noisy [SignatureCheck] debug logs inside isThinkingBlockSignatureError
that were logging every detected signature check as ERROR
- Change retry-start log to WARN level via [warn] prefix
- Change retry-success log to INFO level by removing "error" from message
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The quota reset mechanism is lazy — quota_daily_used/quota_weekly_used
in the database are only reset on the next IncrementQuotaUsed call.
The scheduling layer (IsQuotaExceeded) correctly checks period expiry
before enforcing limits, so the account remains usable. However, the
API response mapper reads the raw DB value without checking expiry,
causing the frontend to display cumulative usage (e.g. 110%) even
after the reset period has passed.
Add IsDailyQuotaPeriodExpired/IsWeeklyQuotaPeriodExpired methods and
use them in the mapper to return used=0 when the period has expired.