Pre-flight layer: add 50+ patterns covering indirect identity probes —
are-you-X (Kiro/GPT/Gemini/Amazon), who-made-you, training-cutoff,
parameter-count, roleplay-bypass attempts, and Chinese equivalents.
Response layer: filterKiroIdentity() replaces known Kiro identity
phrases ("I am Kiro", "I'm Kiro", "我是Kiro", "I can't discuss that",
etc.) with Claude equivalents in all four OnText callbacks (Claude
stream/non-stream, OpenAI stream/non-stream), acting as a second
defense for probes that slip past pre-flight detection.
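
A minimal sketch of the response-layer idea, assuming a simple phrase-for-phrase substitution. The phrase table and replacement wording below are illustrative assumptions, not the actual contents of filterKiroIdentity(); the refusal string "I can't discuss that" is left out because its replacement text is not given here.

```go
package main

import (
	"fmt"
	"strings"
)

// identityReplacer maps known Kiro identity phrases to Claude equivalents.
// Both the phrase list and the replacements are assumptions for illustration.
var identityReplacer = strings.NewReplacer(
	"I am Kiro", "I am Claude",
	"I'm Kiro", "I'm Claude",
	"我是Kiro", "我是Claude",
)

// filterKiroIdentity rewrites identity phrases in one piece of response text.
// Text without any listed phrase is returned unchanged.
func filterKiroIdentity(text string) string {
	return identityReplacer.Replace(text)
}

func main() {
	fmt.Println(filterKiroIdentity("Hello, I'm Kiro, an AI assistant."))
}
```

In a streaming callback a phrase can be split across two chunks, so a per-chunk filter like this one would miss it unless the caller buffers across chunk boundaries.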
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Kiro's upstream system prompt overrides all user-provided system
prompts and returns "I can't discuss that" for identity questions.
This pre-flight interceptor detects identity questions (Chinese and
English patterns) in the last user message and returns a Claude-style
response directly, bypassing Kiro entirely.
Response language matches the question language; model name reflects
the requested model (Claude Opus 4.7, Claude Sonnet 4.5, etc.).
Applied to both /v1/messages (Claude) and /v1/chat/completions (OpenAI).
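
The interception described above could look roughly like the following sketch. The three patterns, the helper names (isIdentityQuestion, containsHan, identityReply) and the reply wording are all hypothetical; the real pattern list is much longer.

```go
package main

import (
	"fmt"
	"regexp"
	"unicode"
)

// identityPatterns is a tiny illustrative subset of identity probes.
var identityPatterns = []*regexp.Regexp{
	regexp.MustCompile(`(?i)\bwho\s+(are\s+you|made\s+you|created\s+you)\b`),
	regexp.MustCompile(`(?i)\bare\s+you\s+(kiro|gpt|gemini|claude)\b`),
	regexp.MustCompile(`你是谁|你是什么模型|谁(开发|创造|训练)了你`),
}

// isIdentityQuestion reports whether any pattern matches the message.
func isIdentityQuestion(msg string) bool {
	for _, re := range identityPatterns {
		if re.MatchString(msg) {
			return true
		}
	}
	return false
}

// containsHan reports whether the message contains a Chinese character,
// used here as a crude language detector.
func containsHan(s string) bool {
	for _, r := range s {
		if unicode.Is(unicode.Han, r) {
			return true
		}
	}
	return false
}

// identityReply returns a canned reply and true when the last user message
// is an identity probe; otherwise the request should go upstream as usual.
func identityReply(lastUserMsg, modelName string) (string, bool) {
	if !isIdentityQuestion(lastUserMsg) {
		return "", false
	}
	if containsHan(lastUserMsg) {
		return "我是 " + modelName + "，由 Anthropic 开发的 AI 助手。", true
	}
	return "I am " + modelName + ", an AI assistant made by Anthropic.", true
}

func main() {
	if reply, ok := identityReply("Who are you?", "Claude Sonnet 4.5"); ok {
		fmt.Println(reply)
	}
}
```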
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Kiro's upstream model is trained to identify and resist
--- SYSTEM PROMPT --- marker blocks as injection attempts, causing it
to actively reject the user's system prompt and self-correct its
identity. Switch the Claude path to the same plain-prepend approach
already used by the OpenAI path: system content is joined directly
before the user message without any marker, matching natural context.
The sanitizer (reSysPromptBlock) still strips the old marker format
from conversation history until existing contamination clears out.
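
As a rough illustration of the plain-prepend approach, assuming a blank line as the separator (the actual join string is not stated here) and a hypothetical helper name:

```go
package main

import (
	"fmt"
	"strings"
)

// prependSystem joins system content directly before the user message with
// no marker block. The "\n\n" separator is an assumption for this sketch.
func prependSystem(system, user string) string {
	if strings.TrimSpace(system) == "" {
		return user
	}
	return system + "\n\n" + user
}

func main() {
	fmt.Println(prependSystem("You are a terse assistant.", "List three Go linters."))
}
```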
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Writes request body + response body of failed upstream calls to
kiro_errors.log in the working directory. File is capped at 10MB;
when the next write would exceed that, the file is truncated so
only the most recent records are kept.
Helps diagnose 400 "Improperly formed request" errors where both
CW and Q reject the same payload.
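
One simple way to get the capping behaviour is sketched below, under the assumption that "truncated" means the file is emptied and restarted with the newest record. appendCapped and demoSize are hypothetical names, and the demo uses a 10-byte cap so the effect is easy to see.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

const maxLogBytes = 10 << 20 // 10MB cap, as described above

// appendCapped appends record to path. If the write would push the file past
// limit, the file is truncated first so it restarts with this record.
func appendCapped(path string, record []byte, limit int64) error {
	flags := os.O_CREATE | os.O_WRONLY | os.O_APPEND
	if info, err := os.Stat(path); err == nil && info.Size()+int64(len(record)) > limit {
		flags = os.O_CREATE | os.O_WRONLY | os.O_TRUNC
	}
	f, err := os.OpenFile(path, flags, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = f.Write(record)
	return err
}

// demoSize writes three 4-byte records under a 10-byte cap in a temporary
// directory and returns the final file size.
func demoSize() (int64, error) {
	dir, err := os.MkdirTemp("", "kiro-log")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "kiro_errors.log")
	for i := 0; i < 3; i++ {
		if err := appendCapped(path, []byte("abc\n"), 10); err != nil {
			return 0, err
		}
	}
	info, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	return info.Size(), nil
}

func main() {
	size, err := demoSize()
	fmt.Println(size, err)
}
```

The third write in the demo would take the file to 12 bytes, so the file is reset and ends up holding only that last record. Production code would pass maxLogBytes as the limit and would need a mutex around the call if several requests can fail concurrently.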
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Kiro backend does not support Anthropic prompt cache protocol.
The local cache tracker simulates cache hits/creation for Claude Code
compatibility, but subtracting those values from input_tokens caused
the reported input_tokens to drop to single digits.
input_tokens now reflects the real value; cache_creation_input_tokens
and cache_read_input_tokens are still reported for protocol compliance.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When the proxy's own --- SYSTEM PROMPT --- wrapper or Claude Code's
<system-reminder> blocks appear in conversation history (e.g. echoed
back by Kiro and included in the next request), strip them from user
and assistant message content before building the Kiro payload.
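
A minimal sketch of such a sanitizer follows. The closing marker "--- END SYSTEM PROMPT ---" and the helper name sanitizeHistory are assumptions; only the opening marker and the <system-reminder> tag are taken from the description above.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// Matches one proxy-generated wrapper block; the end marker is assumed.
	reSysPromptBlock = regexp.MustCompile(`(?s)--- SYSTEM PROMPT ---.*?--- END SYSTEM PROMPT ---\s*`)
	// Matches one <system-reminder> block, including multi-line bodies.
	reSystemReminder = regexp.MustCompile(`(?s)<system-reminder>.*?</system-reminder>\s*`)
)

// sanitizeHistory removes echoed wrapper blocks from a user or assistant
// message before the upstream payload is built.
func sanitizeHistory(content string) string {
	content = reSysPromptBlock.ReplaceAllString(content, "")
	content = reSystemReminder.ReplaceAllString(content, "")
	return strings.TrimSpace(content)
}

func main() {
	msg := "--- SYSTEM PROMPT ---\nbe nice\n--- END SYSTEM PROMPT ---\nhello <system-reminder>x</system-reminder>world"
	fmt.Println(sanitizeHistory(msg))
}
```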
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Old format had 9+ lines per request with key=value noise, and concurrent
requests interleaved without any way to tell which line belongs to which
call. The new format:
- Each line starts with the status code/outcome (200, 400, 429, FAIL,
TIMEOUT, ERR) so success/failure is visible at a glance.
- Every request gets a 6-char hex req_id; all lines for that request
share it, disambiguating interleaved concurrent traffic.
- Endpoint abbreviated to 2 chars (CW/Q), model stripped of "claude-"
prefix, attempt compacted to "a1"/"a2".
- Successful requests collapse to 2 lines (REQ start + 200 done with
first_byte and total elapsed). Retries/errors add one line each.
- Durations use fmtMs: <1s -> "235ms", >=1s -> "2.8s" (one decimal place).
Sample successful request:
[KiroAPI] REQ a3f2b1 model=opus-4.7 account=x@y endpoints=CW,Q
[KiroAPI] 200 a3f2b1 CW/a1 first_byte=1.2s total=2.8s
Sample fallback chain:
[KiroAPI] REQ b8e3c4 model=opus-4.6 account=x@y endpoints=Q,CW
[KiroAPI] 400 b8e3c4 Q /a1 INVALID_MODEL_ID 325ms retry 1/3
[KiroAPI] 400 b8e3c4 Q /a2 INVALID_MODEL_ID 242ms retry 2/3
[KiroAPI] 400 b8e3c4 Q /a4 INVALID_MODEL_ID 216ms exhausted -> fallback
[KiroAPI] 400 b8e3c4 CW/a1 INVALID_MODEL_ID 452ms retry 1/3
...
[KiroAPI] FAIL b8e3c4 all endpoints failed 2.1s last=400
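
The duration and request-id conventions above can be sketched as follows. fmtMs is named in this message, but its body here is a guess at the behaviour; newReqID and formatDone are hypothetical helpers.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"time"
)

// fmtMs renders sub-second durations as whole milliseconds and longer ones
// as seconds with one decimal place.
func fmtMs(d time.Duration) string {
	if d < time.Second {
		return fmt.Sprintf("%dms", d.Milliseconds())
	}
	return fmt.Sprintf("%.1fs", d.Seconds())
}

// newReqID returns a 6-character hex id (3 random bytes).
func newReqID() string {
	var b [3]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "000000" // fallback id if the system RNG is unavailable
	}
	return hex.EncodeToString(b[:])
}

// formatDone builds the final line of a successful request.
func formatDone(status int, reqID, endpoint string, attempt int, firstByte, total time.Duration) string {
	return fmt.Sprintf("[KiroAPI] %d %s %s/a%d first_byte=%s total=%s",
		status, reqID, endpoint, attempt, fmtMs(firstByte), fmtMs(total))
}

func main() {
	id := newReqID()
	fmt.Println(formatDone(200, id, "CW", 1, 1200*time.Millisecond, 2800*time.Millisecond))
}
```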
Upstream sometimes accepts a request (HTTP 200 headers) but stalls without
sending any event-stream packet. Add a configurable timeout that counts
from request dispatch until the first AWS event-stream prelude is read,
and retry on the same endpoint before falling back.
- Config: FirstByteTimeoutSec (default 10s, 0=disabled, range 0-300),
FirstByteRetries (default 1, range 0-10), with Get/Update helpers.
- kiro.go: parseEventStream signature gains onFirstByte callback, fired
once when the first 12-byte prelude reads successfully. CallKiroAPI
wraps each attempt in a context.WithCancel + time.AfterFunc timer that
cancels the HTTP request if no event arrives before the deadline.
Separate retry budgets for INVALID_MODEL_ID and first-byte timeout,
tracked on the same attempt loop; maxAttempts = max(both)+1.
- handler.go: /admin/api/general extended to read/write the two new
fields with validation (timeout 0-300, retries 0-10).
- web/index.html: General Settings card gains two numeric inputs plus
CN/EN i18n and the corresponding load/save JS.
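
The timer mechanics can be sketched in isolation. This is not the CallKiroAPI code: withFirstByteTimeout, attemptFunc, the two fake attempts and runDemo are hypothetical, and a real attempt would issue the HTTP request with ctx and call onFirstByte from the event-stream parser.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

var errFirstByteTimeout = errors.New("first-byte timeout")

// attemptFunc performs one upstream call and must invoke onFirstByte once
// the first event-stream prelude has been read.
type attemptFunc func(ctx context.Context, onFirstByte func()) error

// withFirstByteTimeout cancels the attempt's context if onFirstByte has not
// been called within timeout. A timeout of 0 disables the timer.
func withFirstByteTimeout(parent context.Context, timeout time.Duration, attempt attemptFunc) error {
	ctx, cancel := context.WithCancel(parent)
	defer cancel()

	var timedOut atomic.Bool
	onFirstByte := func() {}
	if timeout > 0 {
		timer := time.AfterFunc(timeout, func() {
			timedOut.Store(true)
			cancel()
		})
		defer timer.Stop()
		onFirstByte = func() { timer.Stop() }
	}

	err := attempt(ctx, onFirstByte)
	if err != nil && timedOut.Load() {
		return errFirstByteTimeout
	}
	return err
}

// stalled simulates an upstream that accepts the request but never sends data.
func stalled(ctx context.Context, _ func()) error {
	<-ctx.Done()
	return ctx.Err()
}

// prompt simulates an upstream that answers at once and then streams for a while.
func prompt(ctx context.Context, onFirstByte func()) error {
	onFirstByte()
	select {
	case <-ctx.Done():
		return ctx.Err()
	case <-time.After(40 * time.Millisecond):
		return nil
	}
}

// runDemo runs both fake attempts with a 20ms first-byte deadline.
func runDemo() (stalledErr, promptErr error) {
	stalledErr = withFirstByteTimeout(context.Background(), 20*time.Millisecond, stalled)
	promptErr = withFirstByteTimeout(context.Background(), 20*time.Millisecond, prompt)
	return
}

func main() {
	s, p := runDemo()
	fmt.Println(s, p)
}
```

Stopping the timer in onFirstByte is what lets a slow but live stream run past the deadline; the caller can treat errFirstByteTimeout as the signal to spend one unit of the first-byte retry budget.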
Brought in 9 upstream commits:
- 221348b thinking routing: ClaudeRequest.Thinking + Signature + includeEmptyThinkingBlock
- 0203357 + 31aa6aa accurate input_tokens via contextUsageEvent
- 404e242 + 50f1a7e outbound proxy (socks5/http) + UI
- 940dc78 version bump to 1.0.6
- 3 CI workflow changes
Strategy: took upstream base for the 4 conflicting files, then re-applied
our local changes on top:
- config.go: InvalidModelRetries field + GetInvalidModelRetries/UpdateInvalidModelRetries
- kiro.go: AmazonQ origin CLI->AI_EDITOR, attempt-level retry loop for
INVALID_MODEL_ID, detailed log.Printf (account/model/attempt/elapsed),
log import; adopted upstream's kiroHttpStore atomic pointer for Do()
- handler.go: /admin/api/general GET/POST + apiGetGeneralConfig +
apiUpdateGeneralConfig
- web/index.html: General Settings card (invalid-model-retries),
CN/EN i18n, loadGeneralConfig/saveGeneralConfig, call from initSettings
Build + full test suite green on Go 1.24.3.
With origin=CLI, q.us-east-1.amazonaws.com returns only 3 base models
(sonnet-4.5, sonnet-4, haiku-4.5) and rejects everything else with
INVALID_MODEL_ID. With origin=AI_EDITOR it returns the full catalog
(opus-4.5/4.6/4.7, sonnet-4.6, haiku-4.5, deepseek, minimax, glm, qwen,
auto).
Verified via direct curl to /ListAvailableModels on both origin values
with two different tokens.
- Add missing claude-sonnet-4-7/4.7 and claude-haiku-4-7/4.7 mappings;
previously claude-sonnet-4.7 was substring-matched by the bare
"claude-sonnet-4" key and silently downgraded to claude-sonnet-4.
- Introduce modelMapping.boundary flag and modelKeyMatches() helper.
Bare digit-ending keys (like claude-sonnet-4) now require the next
character to NOT be a digit, dot, or dash-digit, so future versions
(4.8, 5.x) also pass through without silent downgrade.
- Add 8 regression tests in TestParseModelAndThinkingNoSilentDowngrade
covering the 4.7 family, hypothetical 4.8, Bedrock-style names, and
thinking-suffix variants.
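
The boundary rule can be illustrated with the sketch below. It follows the description literally and is not the real helper: the three-argument signature and boundaryOK are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

func isDigit(b byte) bool { return b >= '0' && b <= '9' }

// boundaryOK reports whether the text following a matched key is allowed:
// it must not start with a digit, a dot, or a dash followed by a digit.
func boundaryOK(rest string) bool {
	if rest == "" {
		return true
	}
	if isDigit(rest[0]) || rest[0] == '.' {
		return false
	}
	if rest[0] == '-' && len(rest) > 1 && isDigit(rest[1]) {
		return false
	}
	return true
}

// modelKeyMatches reports whether key occurs in name. With boundary set,
// an occurrence only counts when boundaryOK accepts what follows it.
func modelKeyMatches(name, key string, boundary bool) bool {
	if key == "" {
		return false
	}
	for start := 0; ; {
		i := strings.Index(name[start:], key)
		if i < 0 {
			return false
		}
		end := start + i + len(key)
		if !boundary || boundaryOK(name[end:]) {
			return true
		}
		start += i + 1 // keep searching after this rejected occurrence
	}
}

func main() {
	for _, name := range []string{"claude-sonnet-4", "claude-sonnet-4.7", "claude-sonnet-4-thinking"} {
		fmt.Println(name, modelKeyMatches(name, "claude-sonnet-4", true))
	}
}
```

Under this literal reading a date-suffixed name such as claude-sonnet-4-20250514 is also rejected by the bare key, so it would need its own mapping entry.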
- Config: new InvalidModelRetries field (default 3, range 0-20)
- Admin API: /admin/api/general GET/POST for general settings
- Admin UI: new "通用设置" (General Settings) card with retry count input
- CallKiroAPI: same-endpoint retry on HTTP 400 INVALID_MODEL_ID
before falling back to next endpoint
- CallKiroAPI: switched to log.Printf with timestamp, account,
model, attempt counter, elapsed time, error body truncation
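
The retry-then-fallback control flow might be sketched as below. The types and callWithRetry are hypothetical, and treating any other failure as an immediate fallback is an assumption; only the INVALID_MODEL_ID retry rule comes from the list above.

```go
package main

import "fmt"

const invalidModelID = "INVALID_MODEL_ID"

// result is a simplified view of one upstream response.
type result struct {
	Status int
	Reason string
}

// callFunc performs one call against an endpoint; attempt starts at 1.
type callFunc func(endpoint string, attempt int) result

// callWithRetry tries each endpoint in order. A 400 INVALID_MODEL_ID is
// retried on the same endpoint up to retries extra times; any other failure
// falls through to the next endpoint (an assumption of this sketch).
func callWithRetry(endpoints []string, retries int, call callFunc) (string, bool) {
	for _, ep := range endpoints {
		for attempt := 1; attempt <= retries+1; attempt++ {
			res := call(ep, attempt)
			if res.Status == 200 {
				return ep, true
			}
			if res.Status != 400 || res.Reason != invalidModelID {
				break
			}
		}
	}
	return "", false
}

func main() {
	// Fake upstream: Q always rejects the model, CW succeeds on its second try.
	fake := func(ep string, attempt int) result {
		if ep == "CW" && attempt == 2 {
			return result{Status: 200}
		}
		return result{Status: 400, Reason: invalidModelID}
	}
	ep, ok := callWithRetry([]string{"Q", "CW"}, 3, fake)
	fmt.Println(ep, ok)
}
```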
* feat: Add validation and account management functionality
- Add validation for clientID and clientSecret in refreshOIDCToken function
- Add weight field for load balancing priority in Account struct
- Implement a weighted round-robin strategy that assigns selection probability according to account weight.
- Add batch account management functionality including enabling, disabling, refreshing, and retrieving account details.
- Update Kiro API version and adjust user agent strings to reflect new version numbers.
- Update Kiro version and modify user agent strings and header settings.
- Refactor model mapping to an ordered list for precise key matching.
- Add account bulk actions and filtering toolbar to index.html
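
Weight-proportional selection, as mentioned in the list above, can be sketched as follows. This Account struct is a cut-down stand-in for the real one, and treating a missing or non-positive weight as 1 is an assumption.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Account is a reduced, hypothetical version of the account struct.
type Account struct {
	Email  string
	Weight int
}

// effectiveWeight treats unset or non-positive weights as 1 (assumed default).
func effectiveWeight(a Account) int {
	if a.Weight <= 0 {
		return 1
	}
	return a.Weight
}

// totalWeight sums the effective weights of all accounts.
func totalWeight(accounts []Account) int {
	sum := 0
	for _, a := range accounts {
		sum += effectiveWeight(a)
	}
	return sum
}

// pickByWeight maps n in [0, totalWeight) onto an account so that each
// account owns a slice of that range proportional to its weight.
func pickByWeight(accounts []Account, n int) (Account, bool) {
	for _, a := range accounts {
		w := effectiveWeight(a)
		if n < w {
			return a, true
		}
		n -= w
	}
	return Account{}, false
}

func main() {
	accounts := []Account{
		{Email: "a@example.com", Weight: 3},
		{Email: "b@example.com", Weight: 1},
	}
	if total := totalWeight(accounts); total > 0 {
		picked, _ := pickByWeight(accounts, rand.Intn(total))
		fmt.Println(picked.Email)
	}
}
```

With weights 3 and 1 the first account is chosen about three times out of four. Feeding n from a rotating counter instead of rand.Intn gives a deterministic weighted rotation with the same proportions.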
* feat: Add logic to skip accounts with exhausted usage limits
- Add logic to skip accounts with exhausted usage limits when selecting the next account.
* fix: stabilize multimodal image compatibility across OpenCode flows
Advertise vision-capable metadata in /v1/models and make model matching deterministic so OpenCode does not downgrade image support or route 4.6 models incorrectly. Expand request translation to accept OpenCode/OpenAI attachment shapes, sanitize [Image N] placeholders safely, keep image-only follow-up turns non-empty, and improve token accounting so base64 image bytes no longer inflate prompt token usage and trigger premature compaction.
* fix: deduplicate thinking streams and trim injected prompt noise
* fix: align /v1/messages thinking blocks and message_start usage
* fix: reduce repetitive thinking across tool turns
Select a single reasoning stream source, prevent chunk replay, and preserve structured tool-loop context so the model keeps continuity instead of re-planning each turn.
* fix: unify token counting on existing API endpoints
Compute usage deterministically on /v1/messages and /v1/chat/completions even when upstream omits tokenUsage.
- remove roo-only token path and keep behavior on existing endpoints
- add proxy/token_estimator.go with shared Claude/OpenAI estimators (input/system/messages/tools + output/thinking/tool calls)
- wire stream/non-stream handlers to use estimator-derived input/output usage
- update /v1/messages/count_tokens to reuse the same estimator
- keep robust upstream usage parsing/normalization in proxy/kiro.go while dropping parser-level estimate fallback
Why: direct upstream tests show metering/context events frequently arrive without tokenUsage in this environment; this made usage zero or inconsistent. Local deterministic accounting keeps reported usage stable and explicit.
- Add claude-sonnet-4.6 (dot and dash variants) to modelMap in translator.go
- Add claude-sonnet-4.6 and claude-opus-4.6 (plus -thinking variants) to the
static fallback model list in handler.go
- Realign existing opus-4.6 entries for consistency
* feat: Add JSON copy functionality with success animation
- Add functionality to copy account data as JSON and show success animation.
* feat: Add endpoints for account details and error handling
- Add endpoint to retrieve full account details including sensitive information
- Add error handling for fetching and copying full account JSON data
- Add ban status and reason fields to account configuration
- Add account ban status and details handling in API refresh account function.
- Add logic to handle account suspension and authentication errors, updating ban status accordingly.
- Add and style badge classes for different account statuses and modify account status display logic.