Before: the OpenAI-compat forwarders only called injectClaudeCodePrompt,
which prepends the Claude Code banner but leaves the rest of the body
in its original non-Claude-Code shape. The codebase already admits this
is insufficient (see the comment on rewriteSystemForNonClaudeCode in
gateway_service.go: "仅前置追加 Claude Code 提示词无法通过检测").
Effect: OAuth accounts served through /v1/chat/completions or /v1/responses
were detected as third-party apps and bled plan quota with:
Third-party apps now draw from your extra usage, not your plan limits.
Fix:
- apicompat.AnthropicRequest: add Metadata json.RawMessage so metadata
survives the OpenAI->Anthropic->Marshal round trip; without it the
downstream rewrite has no user_id to work with.
- service: extract applyClaudeCodeOAuthMimicryToBody, a ParsedRequest-free
variant of the /v1/messages mimicry pipeline
(rewriteSystemForNonClaudeCode + normalizeClaudeOAuthRequestBody +
metadata.user_id injection) so the OpenAI-compat forwarders can reuse it.
- service: add buildOAuthMetadataUserIDFromBody + hashBodyForSessionSeed
for the same reason (no ParsedRequest at the call site).
- ForwardAsChatCompletions / ForwardAsResponses: replace the 3-line
prompt-prepend with the full mimicry pipeline.
- applyClaudeCodeMimicHeaders: set x-client-request-id per-request
(real Claude CLI always does); missing/duplicated values are one more
third-party fingerprint signal.
No change to the native /v1/messages path: it already called the full
pipeline, we only lift those helpers into a reusable function.
Tests:
- go build ./... passes
- go test ./internal/service/... ./internal/pkg/apicompat/... passes
- lsp_diagnostics clean on all touched files
- pre-existing failures in internal/config are unrelated (env-sensitive
tests that also fail on upstream main)
New forwarding methods on GatewayService for Anthropic platform groups:
- ForwardAsResponses: accept Responses body → convert to Anthropic →
forward to upstream → convert response back to Responses format.
Supports both streaming (SSE event-by-event conversion) and buffered
(accumulate then convert) response modes.
- ForwardAsChatCompletions: chain CC→Responses→Anthropic for request,
Anthropic→Responses→CC for response. Streaming uses dual state machine
chain with [DONE] marker.
Both methods reuse existing GatewayService infrastructure:
buildUpstreamRequest, Claude Code mimicry, cache control enforcement,
model mapping, and return UpstreamFailoverError for handler-level retry.