Commit Graph

63 Commits

Author SHA1 Message Date
huangzhenpc
6b73571f5b feat: expand identity interception to cover reverse-engineering probes
Some checks failed
Build Docker Image / build (push) Has been cancelled
Pre-flight layer: add 50+ patterns covering indirect identity probes —
are-you-X (Kiro/GPT/Gemini/Amazon), who-made-you, training-cutoff,
parameter-count, roleplay-bypass attempts, and Chinese equivalents.

Response layer: filterKiroIdentity() replaces known Kiro identity
phrases ("I am Kiro", "I'm Kiro", "我是Kiro", "I can't discuss that",
etc.) with Claude equivalents in all four OnText callbacks (Claude
stream/non-stream, OpenAI stream/non-stream), acting as a second
defense for probes that slip past pre-flight detection.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 14:20:39 +08:00
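A minimal sketch of the response-layer filter described in the commit above. Only the name filterKiroIdentity and the quoted Kiro phrases come from the commit message; the replacement strings and the use of strings.NewReplacer are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// identityReplacer maps Kiro identity phrases quoted in the commit message
// to Claude equivalents. The replacement texts are illustrative assumptions.
var identityReplacer = strings.NewReplacer(
	"I am Kiro", "I am Claude",
	"I'm Kiro", "I'm Claude",
	"我是Kiro", "我是Claude",
)

// filterKiroIdentity rewrites known identity phrases in one chunk of text.
func filterKiroIdentity(text string) string {
	return identityReplacer.Replace(text)
}

func main() {
	fmt.Println(filterKiroIdentity("Hi, I'm Kiro, an AI assistant."))
}
```

A per-chunk filter like this would miss a phrase that is split across two streamed chunks; whether the real implementation buffers for that case is not stated in the commit.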
huangzhenpc
1c2edd5f0d feat: intercept identity questions and return consistent Claude identity
Some checks failed
Build Docker Image / build (push) Has been cancelled
Kiro's upstream system prompt overrides all user-provided system
prompts and returns "I can't discuss that" for identity questions.
This pre-flight interceptor detects identity questions (Chinese and
English patterns) in the last user message and returns a Claude-style
response directly, bypassing Kiro entirely.

Response language matches the question language; model name reflects
the requested model (Claude Opus 4.7, Claude Sonnet 4.5, etc.).
Applied to both /v1/messages (Claude) and /v1/chat/completions (OpenAI).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 14:14:54 +08:00
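The pre-flight interceptor in the commit above might look roughly like this. The pattern list, the function names isIdentityQuestion and identityReply, and the reply wording are all hypothetical; the commit only says that Chinese and English patterns are matched against the last user message and that the reply follows the question language and the requested model name.

```go
package main

import (
	"fmt"
	"regexp"
)

// identityPatterns is a small hypothetical subset of probes;
// the real pattern list is not shown in the commit.
var identityPatterns = []*regexp.Regexp{
	regexp.MustCompile(`(?i)\bwho\s+are\s+you\b`),
	regexp.MustCompile(`(?i)\bwhat\s+model\s+are\s+you\b`),
	regexp.MustCompile(`你是谁`),
	regexp.MustCompile(`你是什么模型`),
}

// hanRe detects Chinese text so the reply can match the question language.
var hanRe = regexp.MustCompile(`\p{Han}`)

// isIdentityQuestion reports whether the last user message matches a probe.
func isIdentityQuestion(lastUserMsg string) bool {
	for _, re := range identityPatterns {
		if re.MatchString(lastUserMsg) {
			return true
		}
	}
	return false
}

// identityReply builds a canned answer in the language of the question.
func identityReply(lastUserMsg, modelName string) string {
	if hanRe.MatchString(lastUserMsg) {
		return "我是 " + modelName + "。"
	}
	return "I am " + modelName + "."
}

func main() {
	msg := "Who are you?"
	if isIdentityQuestion(msg) {
		fmt.Println(identityReply(msg, "Claude Opus 4.7"))
	}
}
```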
huangzhenpc
a6e11c6d22 fix: drop --- SYSTEM PROMPT --- wrapper in Claude path to avoid Kiro injection detection
Some checks failed
Build Docker Image / build (push) Has been cancelled
Kiro's upstream model is trained to identify and resist
--- SYSTEM PROMPT --- marker blocks as injection attempts, causing it
to actively reject the user's system prompt and self-correct its
identity. Switch the Claude path to the same plain-prepend approach
already used by the OpenAI path: system content is joined directly
before the user message without any marker, matching natural context.

The sanitizer (reSysPromptBlock) still strips the old marker format
from conversation history until existing contamination clears out.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 11:41:44 +08:00
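The plain-prepend approach from the commit above can be sketched as below. The function name prependSystem and the blank-line separator are assumptions; the commit only states that system content is joined directly before the user message without any marker.

```go
package main

import (
	"fmt"
	"strings"
)

// prependSystem joins system content directly before the user message,
// with no marker block around it. The "\n\n" separator is an assumption.
func prependSystem(system, user string) string {
	system = strings.TrimSpace(system)
	if system == "" {
		return user
	}
	return system + "\n\n" + user
}

func main() {
	fmt.Println(prependSystem("You are a helpful assistant.", "Summarize this file."))
}
```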
huangzhenpc
64df2d6083 feat: log non-200/non-429 kiro errors to rolling file for debugging
Some checks failed
Build Docker Image / build (push) Has been cancelled
Writes request body + response body of failed upstream calls to
kiro_errors.log in the working directory. File is capped at 10MB;
when the next write would exceed that, the file is truncated so
only the most recent records are kept.

Helps diagnose 400 "Improperly formed request" errors where both
CW and Q reject the same payload.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 11:33:27 +08:00
huangzhenpc
e8ab5b11e7 fix: stop deducting simulated cache tokens from input_tokens
Some checks failed
Build Docker Image / build (push) Has been cancelled
Kiro backend does not support Anthropic prompt cache protocol.
The local cache tracker simulates cache hits/creation for Claude Code
compatibility, but subtracting those values from input_tokens caused
the reported input_tokens to drop to single digits.

input_tokens now reflects the real value; cache_creation_input_tokens
and cache_read_input_tokens are still reported for protocol compliance.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 11:14:51 +08:00
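The accounting change in the commit above amounts to reporting the simulated cache counters alongside the real input count instead of subtracting them. A sketch, assuming a hypothetical Usage struct and buildUsage helper; only the three JSON field names come from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Usage mirrors the three usage fields named in the commit message.
type Usage struct {
	InputTokens              int `json:"input_tokens"`
	CacheCreationInputTokens int `json:"cache_creation_input_tokens"`
	CacheReadInputTokens     int `json:"cache_read_input_tokens"`
}

// buildUsage reports the real input token count unchanged. The simulated
// cache values are passed through for protocol compliance only and are
// never deducted from InputTokens.
func buildUsage(realInput, simCacheCreate, simCacheRead int) Usage {
	return Usage{
		InputTokens:              realInput,
		CacheCreationInputTokens: simCacheCreate,
		CacheReadInputTokens:     simCacheRead,
	}
}

func main() {
	out, _ := json.Marshal(buildUsage(1200, 100, 1095))
	fmt.Println(string(out))
}
```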
huangzhenpc
c87517c0bf feat: sanitize injection blocks from conversation history before forwarding upstream
Some checks failed
Build Docker Image / build (push) Has been cancelled
When the proxy's own --- SYSTEM PROMPT --- wrapper or Claude Code's
<system-reminder> blocks appear in conversation history (e.g. echoed
back by Kiro and included in the next request), strip them from user
and assistant message content before building the Kiro payload.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 10:58:19 +08:00
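A sketch of the history sanitizer from the commit above, covering only the <system-reminder> case. The regular expression and the name stripSystemReminders are assumptions; the exact shape of the proxy's own wrapper block is not shown in the commit, so it is left out.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// reSystemReminder matches a whole <system-reminder> block, including
// newlines inside it, plus any whitespace that follows.
var reSystemReminder = regexp.MustCompile(`(?s)<system-reminder>.*?</system-reminder>\s*`)

// stripSystemReminders removes injected reminder blocks from message content
// before the content is copied into the upstream payload.
func stripSystemReminders(content string) string {
	return strings.TrimSpace(reSystemReminder.ReplaceAllString(content, ""))
}

func main() {
	history := "<system-reminder>\ninternal note\n</system-reminder>\nPlease continue."
	fmt.Println(stripSystemReminders(history))
}
```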
huangzhenpc
2b29616723 refactor: compact kiro log format (status-first, per-request tracing)
Some checks failed
Build Docker Image / build (push) Has been cancelled
Old format had 9+ lines per request with key=value noise, and concurrent
requests interleaved without any way to tell which line belongs to which
call. The new format:

- Each line starts with the status code/outcome (200, 400, 429, FAIL,
  TIMEOUT, ERR) so success/failure is visible at a glance.
- Every request gets a 6-char hex req_id; all lines for that request
  share it, disambiguating interleaved concurrent traffic.
- Endpoint abbreviated to 2 chars (CW/Q), model stripped of "claude-"
  prefix, attempt compacted to "a1"/"a2".
- Successful requests collapse to 2 lines (REQ start + 200 done with
  first_byte and total elapsed). Retries/errors add one line each.
- Durations use fmtMs: <1s -> "235ms", >=1s -> "2.8s" (one decimal place).

Sample successful request:
  [KiroAPI] REQ  a3f2b1  model=opus-4.7  account=x@y  endpoints=CW,Q
  [KiroAPI] 200  a3f2b1  CW/a1  first_byte=1.2s  total=2.8s

Sample fallback chain:
  [KiroAPI] REQ  b8e3c4  model=opus-4.6  account=x@y  endpoints=Q,CW
  [KiroAPI] 400  b8e3c4  Q /a1  INVALID_MODEL_ID  325ms  retry 1/3
  [KiroAPI] 400  b8e3c4  Q /a2  INVALID_MODEL_ID  242ms  retry 2/3
  [KiroAPI] 400  b8e3c4  Q /a4  INVALID_MODEL_ID  216ms  exhausted -> fallback
  [KiroAPI] 400  b8e3c4  CW/a1  INVALID_MODEL_ID  452ms  retry 1/3
  ...
  [KiroAPI] FAIL b8e3c4  all endpoints failed  2.1s  last=400
2026-05-12 10:42:32 +08:00
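The duration format and request ID described in the commit above could be produced as follows. fmtMs is named in the commit; its signature here, and the helper newReqID, are assumptions.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"time"
)

// fmtMs renders sub-second durations as whole milliseconds ("235ms")
// and longer ones as seconds with one decimal place ("2.8s").
func fmtMs(d time.Duration) string {
	if d < time.Second {
		return fmt.Sprintf("%dms", d.Milliseconds())
	}
	return fmt.Sprintf("%.1fs", d.Seconds())
}

// newReqID returns a 6-character hex ID (3 random bytes) used to tie
// together all log lines of one request.
func newReqID() string {
	b := make([]byte, 3)
	if _, err := rand.Read(b); err != nil {
		return "000000" // fallback if the random source is unavailable
	}
	return hex.EncodeToString(b)
}

func main() {
	fmt.Printf("[KiroAPI] 200  %s  CW/a1  first_byte=%s  total=%s\n",
		newReqID(), fmtMs(1200*time.Millisecond), fmtMs(2800*time.Millisecond))
}
```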
huangzhenpc
89f731cb19 feat: first-byte timeout with same-endpoint retry
Some checks failed
Build Docker Image / build (push) Has been cancelled
Upstream sometimes accepts a request (HTTP 200 headers) but stalls without
sending any event-stream packet. Add a configurable timeout that counts
from request dispatch until the first AWS event-stream prelude is read,
and retry on the same endpoint before falling back.

- Config: FirstByteTimeoutSec (default 10s, 0=disabled, range 0-300),
  FirstByteRetries (default 1, range 0-10), with Get/Update helpers.
- kiro.go: parseEventStream signature gains onFirstByte callback, fired
  once when the first 12-byte prelude reads successfully. CallKiroAPI
  wraps each attempt in a context.WithCancel + time.AfterFunc timer that
  cancels the HTTP request if no event arrives before the deadline.
  Separate retry budgets for INVALID_MODEL_ID and first-byte timeout,
  tracked on the same attempt loop; maxAttempts = max(both)+1.
- handler.go: /admin/api/general extended to read/write the two new
  fields with validation (timeout 0-300, retries 0-10).
- web/index.html: General Settings card gains two numeric inputs plus
  CN/EN i18n and the corresponding load/save JS.
2026-05-12 09:04:11 +08:00
huangzhenpc
de4524ad19 Merge upstream Quorinex/Kiro-Go v1.0.6 with local features preserved
Some checks failed
Build Docker Image / build (push) Has been cancelled
Brought in 9 upstream commits:
- 221348b thinking routing: ClaudeRequest.Thinking + Signature + includeEmptyThinkingBlock
- 0203357 + 31aa6aa accurate input_tokens via contextUsageEvent
- 404e242 + 50f1a7e outbound proxy (socks5/http) + UI
- 940dc78 version bump to 1.0.6
- 3 CI workflow changes

Strategy: took upstream base for the 4 conflicting files, then re-applied
our local changes on top:
- config.go: InvalidModelRetries field + GetInvalidModelRetries/UpdateInvalidModelRetries
- kiro.go: AmazonQ origin CLI->AI_EDITOR, attempt-level retry loop for
  INVALID_MODEL_ID, detailed log.Printf (account/model/attempt/elapsed),
  log import; adopted upstream's kiroHttpStore atomic pointer for Do()
- handler.go: /admin/api/general GET/POST + apiGetGeneralConfig +
  apiUpdateGeneralConfig
- web/index.html: General Settings card (invalid-model-retries),
  CN/EN i18n, loadGeneralConfig/saveGeneralConfig, call from initSettings

Build + full test suite green on Go 1.24.3.
2026-05-12 00:09:33 +08:00
Quorinex
940dc782cb chore: bump version to 1.0.6 2026-05-11 22:31:31 +08:00
Quorinex
5cf2cce1d1 ci: use Go cross-compilation to eliminate slow arm64 runner 2026-05-11 22:31:31 +08:00
Quorinex
fdbf511b11 ci: fix image name must be lowercase for ghcr.io 2026-05-11 22:31:30 +08:00
Quorinex
0e03808b0d ci: parallel native arm64/amd64 builds, add Go BuildKit cache mounts 2026-05-11 22:31:30 +08:00
Quorinex
50f1a7e5ad refactor: improve proxy settings UI with type selector and structured fields 2026-05-11 22:31:30 +08:00
Quorinex
404e2425fa feat: add outbound proxy support (socks5/http) for restricted networks 2026-05-11 22:31:30 +08:00
huangzhenpc
2ca248175f fix(kiro): AmazonQ endpoint origin CLI -> AI_EDITOR
Some checks failed
Build Docker Image / build (push) Has been cancelled
With origin=CLI, q.us-east-1.amazonaws.com returns only 3 base models
(sonnet-4.5, sonnet-4, haiku-4.5) and rejects everything else with
INVALID_MODEL_ID. With origin=AI_EDITOR it returns the full catalog
(opus-4.5/4.6/4.7, sonnet-4.6, haiku-4.5, deepseek, minimax, glm, qwen,
auto).

Verified via direct curl to /ListAvailableModels on both origin values
with two different tokens.
2026-05-11 21:39:30 +08:00
Henry Yang
221348b975 fix: support Claude thinking config routing (#40) 2026-05-11 21:01:54 +08:00
Quorinex
0203357b34 refactor: remove buffered stream mode, keep contextUsageEvent for accurate input tokens 2026-05-11 19:47:39 +08:00
huangzhenpc
74babe3133 Merge upstream Quorinex/Kiro-Go: cache tracker fixes + license + version 1.0.5
Some checks failed
Build Docker Image / build (push) Has been cancelled
2026-05-11 19:16:35 +08:00
huangzhenpc
6d1d1c68a9 fix(translator): prevent silent model downgrade with boundary-aware matching
- Add missing claude-sonnet-4-7/4.7 and claude-haiku-4-7/4.7 mappings;
  previously claude-sonnet-4.7 was substring-matched by the bare
  "claude-sonnet-4" key and silently downgraded to claude-sonnet-4.
- Introduce modelMapping.boundary flag and modelKeyMatches() helper.
  Bare digit-ending keys (like claude-sonnet-4) now require the next
  character to NOT be a digit, dot, or dash-digit, so future versions
  (4.8, 5.x) also pass through without silent downgrade.
- Add 8 regression tests in TestParseModelAndThinkingNoSilentDowngrade
  covering the 4.7 family, hypothetical 4.8, Bedrock-style names, and
  thinking-suffix variants.
2026-05-11 19:16:05 +08:00
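The boundary rule from the commit above, sketched under assumptions. modelKeyMatches is named in the commit, but this signature and the substring search are guesses at its shape; the rule itself (next character must not be a digit, a dot, or a dash followed by a digit) is taken from the message.

```go
package main

import (
	"fmt"
	"strings"
)

func isDigit(c byte) bool { return c >= '0' && c <= '9' }

// modelKeyMatches reports whether key occurs in name. With boundary set,
// the character after the key must not be a digit, a dot, or a dash
// followed by a digit, so "claude-sonnet-4" no longer swallows
// "claude-sonnet-4.7" or "claude-sonnet-4-7".
func modelKeyMatches(name, key string, boundary bool) bool {
	idx := strings.Index(name, key)
	if idx < 0 {
		return false
	}
	if !boundary {
		return true
	}
	rest := name[idx+len(key):]
	if rest == "" {
		return true
	}
	if isDigit(rest[0]) || rest[0] == '.' {
		return false
	}
	if rest[0] == '-' && len(rest) > 1 && isDigit(rest[1]) {
		return false
	}
	return true
}

func main() {
	for _, name := range []string{"claude-sonnet-4", "claude-sonnet-4.7", "claude-sonnet-4-thinking"} {
		fmt.Println(name, modelKeyMatches(name, "claude-sonnet-4", true))
	}
}
```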
huangzhenpc
3b791a6926 feat: add INVALID_MODEL_ID retry config + detailed request logging
- Config: new InvalidModelRetries field (default 3, range 0-20)
- Admin API: /admin/api/general GET/POST for general settings
- Admin UI: new "通用设置" (General Settings) card with retry count input
- CallKiroAPI: same-endpoint retry on HTTP 400 INVALID_MODEL_ID
  before falling back to next endpoint
- CallKiroAPI: switched to log.Printf with timestamp, account,
  model, attempt counter, elapsed time, error body truncation
2026-05-11 19:15:49 +08:00
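The same-endpoint retry described in the commit above might be structured as below. The function callWithRetries, the sentinel error, and the choice to fall back immediately on other errors are assumptions; the commit only specifies that INVALID_MODEL_ID is retried on the same endpoint before moving to the next one.

```go
package main

import (
	"errors"
	"fmt"
)

var errInvalidModel = errors.New("INVALID_MODEL_ID")

// callWithRetries tries each endpoint in order. An INVALID_MODEL_ID error is
// retried on the same endpoint up to retries extra times; any other error
// moves straight to the next endpoint (an assumption, not from the commit).
func callWithRetries(endpoints []string, retries int,
	call func(endpoint string, attempt int) error) error {
	var last error
	for _, ep := range endpoints {
		for attempt := 1; attempt <= retries+1; attempt++ {
			last = call(ep, attempt)
			if last == nil {
				return nil
			}
			if !errors.Is(last, errInvalidModel) {
				break
			}
		}
	}
	if last == nil {
		return errors.New("no endpoints configured")
	}
	return fmt.Errorf("all endpoints failed: %w", last)
}

func main() {
	err := callWithRetries([]string{"CW", "Q"}, 3, func(ep string, attempt int) error {
		fmt.Printf("%s/a%d\n", ep, attempt)
		if ep == "Q" && attempt == 2 {
			return nil // second attempt on the fallback endpoint succeeds
		}
		return errInvalidModel
	})
	fmt.Println("result:", err)
}
```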
Naive YH
31aa6aa421 fix: accurate input_tokens via contextUsageEvent + smart routing for SDK clients 2026-05-11 17:23:21 +08:00
Quorinex
acc5fe45ce fix: improve prompt cache tracking 2026-05-11 16:13:38 +08:00
Quorinex
496b14df3f fix: improve prompt cache tracking 2026-05-11 15:58:21 +08:00
Quorinex
9dbe0cb55f docs: simplify README and add contributing notes 2026-05-10 22:04:57 +08:00
Quorinex
834890f4be docs: simplify README and add contributing notes 2026-05-10 22:03:18 +08:00
Quorinex
3089d028d2 chore: sync config version 2026-05-10 21:22:10 +08:00
Quorinex
e20b2a8816 chore: sync config version 2026-05-10 21:21:24 +08:00
Quorinex
f853d0544b Merge branch 'dev' (#32)
* chore: optimize model handling

* chore: update version metadata

---------

Co-authored-by: Quorinex <quorinex@users.noreply.github.com>
2026-05-10 21:16:36 +08:00
Quorinex
140492e6c7 chore: update version metadata 2026-05-10 21:14:13 +08:00
Quorinex
74a959260e chore: optimize model handling 2026-05-10 20:57:40 +08:00
Quorinex
f1351c3ef4 Merge branch 'dev' 2026-05-10 19:47:03 +08:00
Quorinex
bdc9c7babc chore: update dev branch model aggregation and naming 2026-05-10 19:22:34 +08:00
Quorinex
a24529d783 chore: sync dev branch proxy and workflow updates 2026-05-10 18:57:40 +08:00
luka7620
a063efd494 v1.1: adapt to Opus 4.7 calls 2026-05-10 12:53:00 +08:00
hkxiaoyao
ad7aabd554 feat: Add validation and account management functionality (#21)
* feat: Add validation and account management functionality

- Add validation for clientID and clientSecret in refreshOIDCToken function
- Add weight field for load balancing priority in Account struct
- Implement a weighted round-robin strategy that allocates selection probability according to account weight.
- Add batch account management functionality including enabling, disabling, refreshing, and retrieving account details.
- Update Kiro API version and adjust user agent strings to reflect new version numbers.
- Update Kiro version and modify user agent strings and header settings.
- Refactor model mapping to an ordered list for precise key matching.
- Add account bulk actions and filtering toolbar to index.html

* feat: Add logic to skip accounts with exhausted usage limits

- Add logic to skip accounts with exhausted usage limits when selecting the next account.
2026-02-23 21:47:17 +08:00
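The weight-based selection mentioned in the commit above could be approximated by a weighted random pick, sketched here. This is a swapped-in illustration, not the project's algorithm: the Account fields other than the weight, and the helpers pickAt and pickWeighted, are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Account carries the weight field described in the commit; ID is illustrative.
type Account struct {
	ID     string
	Weight int
}

// pickAt returns the account that owns position point on a line where each
// account occupies Weight consecutive slots. Non-positive weights are skipped.
func pickAt(accounts []Account, point int) (Account, bool) {
	for _, a := range accounts {
		if a.Weight <= 0 {
			continue
		}
		if point < a.Weight {
			return a, true
		}
		point -= a.Weight
	}
	return Account{}, false
}

// pickWeighted chooses an account with probability proportional to its weight.
func pickWeighted(accounts []Account, rng *rand.Rand) (Account, bool) {
	total := 0
	for _, a := range accounts {
		if a.Weight > 0 {
			total += a.Weight
		}
	}
	if total == 0 {
		return Account{}, false
	}
	return pickAt(accounts, rng.Intn(total))
}

func main() {
	accounts := []Account{{"a", 1}, {"b", 3}}
	rng := rand.New(rand.NewSource(1))
	counts := map[string]int{}
	for i := 0; i < 4000; i++ {
		if acc, ok := pickWeighted(accounts, rng); ok {
			counts[acc.ID]++
		}
	}
	fmt.Println(counts) // "b" should be picked about three times as often as "a"
}
```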
Quorinex
d71bf09dde chore: bump version to 1.0.3 and refactor model mapping 2026-02-23 21:46:42 +08:00
edxeth
6151888df5 fix: stabilize thinking streams, multimodal parsing, and token accounting (#20)
* fix: stabilize multimodal image compatibility across OpenCode flows

Advertise vision-capable metadata in /v1/models and make model matching deterministic so OpenCode does not downgrade image support or route 4.6 models incorrectly. Expand request translation to accept OpenCode/OpenAI attachment shapes, sanitize [Image N] placeholders safely, keep image-only follow-up turns non-empty, and improve token accounting so base64 image bytes no longer inflate prompt token usage and trigger premature compaction.

* fix: deduplicate thinking streams and trim injected prompt noise

* fix: align /v1/messages thinking blocks and message_start usage

* fix: reduce repetitive thinking across tool turns

Select a single reasoning stream source, prevent chunk replay, and preserve structured tool-loop context so the model keeps continuity instead of re-planning each turn.

* fix: unify token counting on existing API endpoints

Compute usage deterministically on /v1/messages and /v1/chat/completions even when upstream omits tokenUsage.

- remove roo-only token path and keep behavior on existing endpoints
- add proxy/token_estimator.go with shared Claude/OpenAI estimators (input/system/messages/tools + output/thinking/tool calls)
- wire stream/non-stream handlers to use estimator-derived input/output usage
- update /v1/messages/count_tokens to reuse the same estimator
- keep robust upstream usage parsing/normalization in proxy/kiro.go while dropping parser-level estimate fallback

Why: direct upstream tests show metering/context events frequently arrive without tokenUsage in this environment; this made usage zero or inconsistent. Local deterministic accounting keeps reported usage stable and explicit.
2026-02-23 20:33:53 +08:00
edxeth
f4049948f1 feat: add Claude Sonnet 4.6 and Opus 4.6 to model list and mapping (#18)
- Add claude-sonnet-4.6 (dot and dash variants) to modelMap in translator.go
- Add claude-sonnet-4.6 and claude-opus-4.6 (plus -thinking variants) to the
  static fallback model list in handler.go
- Realign existing opus-4.6 entries for consistency
2026-02-21 14:33:41 +08:00
hkxiaoyao
84998a0672 feat: Extract and export client credentials securely (#17)
- Extract and export client credentials and access tokens securely.
2026-02-19 20:07:27 +08:00
hkxiaoyao
f080fe3d54 feat: Add endpoints for account details and error handling (#16)
* feat: Add JSON copy functionality with success animation

- Add functionality to copy account data as JSON and show success animation.

* feat: Add endpoints for account details and error handling

- Add endpoint to retrieve full account details including sensitive information
- Add error handling for fetching and copying full account JSON data
2026-02-13 16:59:03 +08:00
Quorinex
60cf204823 chore: bump version to 1.0.2 2026-02-10 12:36:23 +08:00
hkxiaoyao
1afc82c29c feat: Add account ban handling and UI updates (#11)
- Add ban status and reason fields to account configuration
- Add account ban status and details handling in API refresh account function.
- Add logic to handle account suspension and authentication errors, updating ban status accordingly.
- Add and style badge classes for different account statuses and modify account status display logic.
2026-02-10 12:23:39 +08:00
Quorinex
306f49f9ac fix: add usage field to OpenAI streaming response final chunk (#10) 2026-02-10 09:42:31 +08:00
Quorinex
a308630156 feat: add admin logout, 72h session expiry, /v1/stats endpoint, and UI fixes 2026-02-08 19:23:00 +08:00
hkxiaoyao
99ce5c9c39 feat: Add privacy mode toggle and update email masking (#6)
- Add privacy mode toggle switch and update email masking logic.
2026-02-08 18:21:58 +08:00
Quorinex
3e7cca04ba feat: add versioning, account export, and dynamic models list 2026-02-08 01:48:24 +08:00
hkxiaoyao
9aad3dec7e refactor: Refactor authentication method based on presence of client credentials (#5)
- Refactor authentication method based on presence of client credentials in payload.
2026-02-08 00:02:35 +08:00
hkxiaoyao
5332a5381e Optimize loading model display, add credit (#4) 2026-02-07 21:17:21 +08:00
Quorinex
d6fa49f24e feat: add i18n support and batch JSON credentials import 2026-02-06 21:54:04 +08:00