Releases: nnemirovsky/sluice
v0.19.1
Bug Fixes
- derive pool member cooldown from the 429 reset window. When no Retry-After / x-ratelimit-reset* header is present, sluice now parses the JSON body (resets_in_seconds / resets_at, retry_after / reset_after, top-level and nested under error), and the header set is broadened to cover RateLimit-Reset, X-RateLimit-Reset, X-RateLimit-Reset-After, and Anthropic ISO-8601 resets. This fixes the ~60s exhausted/recovered cycle on OpenAI Codex usage-limit 429s, whose reset window is carried only in the body. Header windows clamp to 6h, body windows to 24h #50 @nnemirovsky
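As a rough illustration of the body fallback described above, the following Go sketch looks for a reset hint at the top level of a 429 JSON body and then under error, and clamps it to 24h. It is not sluice's implementation: the type and function names are hypothetical, and reading resets_at as Unix seconds and the other three fields as relative seconds is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// maxBodyWindow is the 24h clamp the note gives for body-derived windows.
const maxBodyWindow = 24 * time.Hour

// ResetHints lists the body fields named in the note. Reading resets_at as
// Unix seconds and the other three as relative seconds is an assumption.
type ResetHints struct {
	ResetsInSeconds *float64 `json:"resets_in_seconds"`
	ResetsAt        *float64 `json:"resets_at"`
	RetryAfter      *float64 `json:"retry_after"`
	ResetAfter      *float64 `json:"reset_after"`
}

// window returns the first hint that is present, as a duration from nowUnix.
func (h ResetHints) window(nowUnix int64) (time.Duration, bool) {
	seconds := func(s float64) time.Duration { return time.Duration(s * float64(time.Second)) }
	switch {
	case h.ResetsInSeconds != nil:
		return seconds(*h.ResetsInSeconds), true
	case h.RetryAfter != nil:
		return seconds(*h.RetryAfter), true
	case h.ResetAfter != nil:
		return seconds(*h.ResetAfter), true
	case h.ResetsAt != nil:
		return seconds(*h.ResetsAt - float64(nowUnix)), true
	}
	return 0, false
}

// CooldownFromBody checks the top level of a 429 JSON body, then the object
// under "error", and clamps the result to maxBodyWindow.
func CooldownFromBody(raw []byte, nowUnix int64) (time.Duration, bool) {
	var env struct {
		ResetHints
		Error json.RawMessage `json:"error"`
	}
	if err := json.Unmarshal(raw, &env); err != nil {
		return 0, false
	}
	d, ok := env.ResetHints.window(nowUnix)
	if !ok && len(env.Error) > 0 {
		var nested ResetHints
		// "error" may be a plain string; in that case the decode just fails.
		if json.Unmarshal(env.Error, &nested) == nil {
			d, ok = nested.window(nowUnix)
		}
	}
	if !ok || d <= 0 {
		return 0, false
	}
	if d > maxBodyWindow {
		d = maxBodyWindow
	}
	return d, true
}

func main() {
	body := []byte(`{"error":{"resets_in_seconds":7200}}`)
	d, ok := CooldownFromBody(body, time.Now().Unix())
	fmt.Println(d, ok)
}
```

A header-derived window would go through the same clamp step with the 6h limit instead.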
v0.19.0
New Features
- pool exhaustion handling: cooldowns now derive from the upstream Retry-After and x-ratelimit-reset hints, exhaustion is detected when no pool member is healthy, and the exhausted and recovered notices fire once on the edge via a server-side recovery monitor #49 @nnemirovsky
- opt-in per-pool auth_reset_target that resets the agent's auth on the exhausted-to-recovered edge, settable from CLI, REST, and Telegram #49 @nnemirovsky
v0.18.0
Bug Fixes
- sticky pool failover: there is no longer a "main" account. The pool stays on whichever member is currently active until that member itself exhausts, then advances forward to the next member and stays there. A lower-position member recovering from cooldown no longer causes a switch back to it. Previously a position-0 member with an exhausted upstream quota would be re-probed every 60s (when its short rate-limit cooldown lapsed), 429 again, and fail over again, producing an endless stream of identical "failed over" Telegram notices while the agent kept working on the other account. A failover, and its single audit event and Telegram notice, now happens only on a genuine exhaustion transition #48 @nnemirovsky
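The sticky behavior described above can be sketched in a few lines of Go. This is only an illustration under stated assumptions: the Pool type, its fields, and the wrap-around when advancing past the last member are hypothetical, not sluice's actual data model.

```go
package main

import "fmt"

// Pool is a hypothetical stand-in for a credential pool. active is the index
// of the member currently in use; cooling marks members in cooldown.
type Pool struct {
	Members []string
	active  int
	cooling map[string]bool
}

// SetCooling marks or clears a member's cooldown.
func (p *Pool) SetCooling(name string, cooling bool) {
	if p.cooling == nil {
		p.cooling = make(map[string]bool)
	}
	p.cooling[name] = cooling
}

// Resolve stays on the active member while it is healthy and only advances
// forward when that member is cooling. A recovered earlier member never
// pulls the selection back. failedOver is true only on a real switch.
func (p *Pool) Resolve() (member string, failedOver bool) {
	n := len(p.Members)
	if n == 0 {
		return "", false
	}
	for i := 0; i < n; i++ {
		idx := (p.active + i) % n // wrap-around is an assumption of this sketch
		if !p.cooling[p.Members[idx]] {
			failedOver = idx != p.active
			p.active = idx
			return p.Members[idx], failedOver
		}
	}
	// Every member is cooling: degrade to the current one instead of failing.
	return p.Members[p.active], false
}

func main() {
	p := &Pool{Members: []string{"acct-a", "acct-b"}}
	fmt.Println(p.Resolve()) // starts on acct-a

	p.SetCooling("acct-a", true)
	fmt.Println(p.Resolve()) // acct-a exhausted: switch to acct-b

	p.SetCooling("acct-a", false)
	fmt.Println(p.Resolve()) // acct-a recovered, selection stays on acct-b
}
```

Because the switch is reported only when the index actually changes, a caller that emits one audit event per failedOver == true gets one notice per genuine transition.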
Improvements
- the sticky current-active selection lives on the swap-surviving shared pool health state, is epoch-scoped so a removed-and-recreated pool with overlapping member names cannot inherit a stale hold, and uses a read-mostly fast path so the per-request resolve no longer serializes on a write lock. The all-cooling degrade behavior and the audit Reason and pool_exhausted formats are unchanged. A selectable position-vs-sticky strategy mode is a possible follow-up #48 @nnemirovsky
v0.17.0
New Features
- credential pools now reachable from all channels: new REST /api/pools endpoints (list, create, status, rotate, remove) and Telegram /pool create|list|status|rotate|remove, alongside the existing CLI. Pool operation logic lifted into a channel-agnostic internal/poolops package so CLI, REST, and Telegram cannot drift #47 @nnemirovsky
Bug Fixes
- pool token-host phantom expansion is scoped to grant_type=refresh_token, so a fresh in-container codex login --device-auth (a device_code grant) is no longer corrupted into a 400 token_exchange_user_error #47 @nnemirovsky
- REST pool-create error mapping corrected: client validation returns 400, name or member conflict returns 409 via a dedicated pool-referenced schema, internal and DB faults return 500 instead of a misleading 400 #47 @nnemirovsky
- Telegram pool replies are HTML-safe: pool names and the bind hint placeholder are escaped so a created-pool success message is never rejected by the Bot API #47 @nnemirovsky
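One common way to get the 400 / 409 / 500 split described in the REST error-mapping fix is to classify sentinel errors with errors.Is. The sketch below assumes that approach; the error variables and statusFor are hypothetical names, not identifiers from the sluice codebase.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// Hypothetical sentinel errors a pool-create operation might return.
var (
	ErrInvalidRequest = errors.New("invalid pool request")
	ErrPoolConflict   = errors.New("pool name or member conflict")
	errDBUnavailable  = errors.New("database unavailable")
)

// statusFor maps a non-nil pool-create error to an HTTP status. Anything
// that is not a known client-side error is treated as an internal fault.
func statusFor(err error) int {
	switch {
	case errors.Is(err, ErrInvalidRequest):
		return http.StatusBadRequest
	case errors.Is(err, ErrPoolConflict):
		return http.StatusConflict
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	wrapped := fmt.Errorf("create pool %q: %w", "codex", ErrPoolConflict)
	fmt.Println(statusFor(ErrInvalidRequest)) // validation failure
	fmt.Println(statusFor(wrapped))           // conflict, even when wrapped
	fmt.Println(statusFor(errDBUnavailable))  // internal fault
}
```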
Improvements
- friendlier pool failover Telegram notification with humanized reason text instead of a bare HTTP code, while the audit Reason format is unchanged #47 @nnemirovsky
- the request-side grant probe is gated to token-host POSTs and bounded, keeping the proxy hot path cheap #47 @nnemirovsky
Docs
- channel feature-parity principle codified in CLAUDE.md and CONTRIBUTING.md #45 @nnemirovsky
- condensed CLAUDE.md without dropping facts #46 @nnemirovsky
v0.16.0
Credential pools with automatic failover
One phantom identity the agent holds can now be backed by N real OAuth credentials, with transparent auto-failover when one is rate-limited or its auth fails.
- Pool model: sluice pool create|list|status|rotate|remove. A pool maps a single pool-stable phantom (byte-identical synthetic JWT, R3) to the currently active member; refreshed tokens are attributed back to the issuing member via the injected refresh token (R1, fail-closed).
- Auto-failover (Phase 2): 429 / 403 insufficient_quota → rate-limited; 401 / token-body invalid_grant / invalid_token → auth-failure. The active member is cooled and switched synchronously in-memory before the response returns, so the agent's own retry lands on the next member. Monotonic cooldown extension; lazy recovery; degrade-never-hard-fail.
- QUIC request-side pool awareness + the R3 pool-stable phantom (response-side R1/failover is HTTP-only by capability).
- Data model: migrations 000006_credential_pools, 000007_pool_membership_epoch.
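The auto-failover classification listed above can be illustrated with a small Go function. This is a sketch under assumptions: the names are hypothetical, the body checks use plain substring matching for brevity, and the isTokenResponse flag stands in for however sluice recognizes a token-endpoint response.

```go
package main

import (
	"bytes"
	"fmt"
)

// FailureClass is a hypothetical classification of an upstream response.
type FailureClass int

const (
	NoFailure FailureClass = iota
	RateLimited
	AuthFailure
)

// Classify follows the mapping in the release note: 429 or a 403 carrying
// insufficient_quota is a rate limit; 401, or a token response whose body
// carries invalid_grant / invalid_token, is an auth failure.
func Classify(status int, body []byte, isTokenResponse bool) FailureClass {
	has := func(s string) bool { return bytes.Contains(body, []byte(s)) }
	switch {
	case status == 429:
		return RateLimited
	case status == 403 && has("insufficient_quota"):
		return RateLimited
	case status == 401:
		return AuthFailure
	case isTokenResponse && (has("invalid_grant") || has("invalid_token")):
		return AuthFailure
	}
	return NoFailure
}

func main() {
	fmt.Println(Classify(429, nil, false) == RateLimited)
	fmt.Println(Classify(400, []byte(`{"error":"invalid_grant"}`), true) == AuthFailure)
	fmt.Println(Classify(403, []byte("forbidden"), false) == NoFailure)
}
```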
Approval-prompt coalescing
Concurrent approvals to the same dest:port collapse into one Telegram prompt; one tap dismisses the whole burst, and the final coalesced count is folded into the existing resolve/cancel edit (zero extra Telegram API calls). MCP tool calls opt out (arg-sensitive). QUIC keeps its own packet-dedup.
Failover hardening (fixes found in live operation)
- Operator-park stranding fixed: pool rotate parks the displaced member with reason manual rotate; that member is healthy, just deprioritized. The all-cooled degrade now prefers an operator-parked-but-healthy member over a genuinely-failed one, so a rotate onto an exhausted account fails over to the healthy peer instead of self-looping (which previously hard-failed the agent). Normal position-order rotate semantics are unchanged.
- Self-failover spam fixed: when there is no distinct failover target (to == from), it is classified as pool exhaustion: a distinct pool_exhausted audit action and an honest "pool exhausted" operator notice instead of a meaningless self-referential cred_failover. FailoverEvent.Exhausted carries the distinction.
- Notice dedup: identical (pool, from, to, tag) signals are deduplicated within a 30s window, so an agent retry storm yields one audit row + one notice instead of N.
Fail-before/pass-after tests cover the degrade preference, the pool-exhausted suppression + dedup, and real failover to an operator-parked-but-healthy peer.
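The notice dedup above amounts to remembering when each (pool, from, to, tag) tuple was last emitted. A minimal Go sketch follows, assuming the 30s window is measured from the last emitted notice rather than from the last suppressed one; all type and function names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// dedupWindow is the 30s window given in the release note.
const dedupWindow = 30 * time.Second

// signalKey identifies a failover signal by the tuple named in the note.
type signalKey struct{ Pool, From, To, Tag string }

// Deduper remembers when each signal was last emitted. A long-running
// process would also need to prune old entries; that is omitted here.
type Deduper struct {
	mu   sync.Mutex
	last map[signalKey]time.Time
}

func NewDeduper() *Deduper {
	return &Deduper{last: make(map[signalKey]time.Time)}
}

// Allow reports whether a notice for k should be emitted at time now.
func (d *Deduper) Allow(k signalKey, now time.Time) bool {
	d.mu.Lock()
	defer d.mu.Unlock()
	if t, ok := d.last[k]; ok && now.Sub(t) < dedupWindow {
		return false
	}
	d.last[k] = now
	return true
}

// at builds a timestamp from whole seconds, for readable examples.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

func main() {
	d := NewDeduper()
	k := signalKey{Pool: "codex", From: "acct-a", To: "acct-b", Tag: "429"}
	for _, s := range []int64{0, 1, 2, 31} {
		fmt.Println(s, d.Allow(k, at(s)))
	}
}
```

Passing the clock in as an argument keeps the window logic testable without sleeping.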
Known limitation
sluice's pooled token-host phantom expansion currently also rewrites non-refresh OAuth grants (e.g. device_code) to the pool's shared token host, which breaks a fresh in-container OAuth login for a pooled provider. Perform the initial login outside the proxy (or before binding the pool). Tracked for a follow-up.
v0.15.1
Bug Fixes
- proxy: SSH jump host close race that dropped the agent's exec reply (#41)
When an upstream SSH server replied + wrote data + sent exit-status + closed the channel in one burst, sluice's wait on the three upstream-to-agent goroutines completed and closed srcChan while the agent-to-upstream forwarder was still mid-reply for the agent's exec request. The agent's session.SendRequest("exec", true, ...) observed SSH_MSG_CHANNEL_CLOSE before the SUCCESS reply landed on ch.msg, gossh surfaced the closed channel as io.EOF, and session.Output("whoami") failed with EOF even though the upstream succeeded. The fix tracks in-flight agent-to-upstream requests with a mutex+cond barrier so the close path drains any pending reply before srcChan.CloseWrite() / srcChan.Close(). Symptom was visible on the CI e2e-linux runners since d27b05e narrowed the close timing window; production SSH clients have enough natural latency between reply and close that the race window almost never opens.
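The mutex+cond barrier mentioned in the fix can be illustrated in isolation. The sketch below is not the sluice code: the inflight type and its method names are hypothetical, and a production close path would likely bound the wait with a timeout.

```go
package main

import (
	"fmt"
	"sync"
)

// inflight counts requests whose replies have not been written yet.
type inflight struct {
	mu   sync.Mutex
	cond *sync.Cond
	n    int
}

func newInflight() *inflight {
	b := &inflight{}
	b.cond = sync.NewCond(&b.mu)
	return b
}

// begin is called before a request is forwarded.
func (b *inflight) begin() {
	b.mu.Lock()
	b.n++
	b.mu.Unlock()
}

// done is called once the reply has been written back.
func (b *inflight) done() {
	b.mu.Lock()
	b.n--
	if b.n == 0 {
		b.cond.Broadcast()
	}
	b.mu.Unlock()
}

// wait blocks the close path until no reply is pending.
func (b *inflight) wait() {
	b.mu.Lock()
	for b.n > 0 {
		b.cond.Wait()
	}
	b.mu.Unlock()
}

func main() {
	b := newInflight()
	replies := make(chan string, 1)

	b.begin() // the agent's exec request is now in flight
	go func() {
		defer b.done()
		replies <- "exec reply delivered"
	}()

	b.wait() // the close path drains pending replies first
	fmt.Println(<-replies)
	fmt.Println("channel closed")
}
```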
v0.15.0
Bug Fixes
- proxy: URL-encoded phantom tokens now swap correctly in application/x-www-form-urlencoded bodies, URL query strings, URL paths, request headers, streaming bodies, QUIC/HTTP3 paths, and WebSocket text frames (#40)
OAuth refresh round-trips for providers that POST grant_type=refresh_token as form-urlencoded data (Anthropic Claude Code, Google) now go through the phantom swap cleanly. Previously the colon in SLUICE_PHANTOM:<name> got percent-encoded to %3A on the wire and the scanner missed it, so the upstream received the phantom verbatim and returned invalid_grant. The fix matches both casings (%3A and %3a) per RFC 3986 §2.1, uses path-correct escaping (PathEscape vs QueryEscape) so secrets containing spaces don't corrupt URL paths, and keeps secrets in byte slices that SecureBytes.Release() can zero.
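To make the %3A / %3a point concrete, here is a small Go sketch that swaps a phantom in its raw form and in both percent-encoded casings within a form-urlencoded body. It deliberately uses strings for brevity, so it does not reproduce the byte-slice handling or zeroing described above; swapPhantom and lowerEscapes are hypothetical names.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// lowerEscapes rewrites the hex digits of every %XX escape in lowercase.
// Input is expected to be url.QueryEscape output, where '%' only ever
// introduces an escape.
func lowerEscapes(s string) string {
	b := []byte(s)
	for i := 0; i+2 < len(b); i++ {
		if b[i] != '%' {
			continue
		}
		for j := i + 1; j <= i+2; j++ {
			if b[j] >= 'A' && b[j] <= 'F' {
				b[j] += 'a' - 'A'
			}
		}
		i += 2
	}
	return string(b)
}

// swapPhantom replaces the phantom with the secret. Encoded occurrences get
// the query-escaped secret; raw occurrences get the raw secret.
func swapPhantom(body, phantom, secret string) string {
	encPhantom := url.QueryEscape(phantom)
	encSecret := url.QueryEscape(secret)
	r := strings.NewReplacer(
		encPhantom, encSecret,
		lowerEscapes(encPhantom), encSecret,
		phantom, secret,
	)
	return r.Replace(body)
}

func main() {
	phantom := "SLUICE_PHANTOM:openai"
	for _, body := range []string{
		"grant_type=refresh_token&refresh_token=SLUICE_PHANTOM%3Aopenai",
		"grant_type=refresh_token&refresh_token=SLUICE_PHANTOM%3aopenai",
	} {
		fmt.Println(swapPhantom(body, phantom, "real token"))
	}
}
```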
Internals
- Each phantomPair now carries precomputed encodedPhantom and encodedPhantomLower byte slices populated once at pair construction time, so the hot-path swap reads precomputed bytes instead of recomputing url.QueryEscape on every request, header, and stream chunk
- Added byte-in/byte-out queryEscapeBytes and pathEscapeBytes helpers (RFC 3986 §2.3 unreserved sets) that replace url.QueryEscape(string(secret.Bytes())) patterns and avoid leaving immutable string copies of secrets on the heap
- swapPhantomBytes selects between query and path escaping via an explicit pathContext bool parameter instead of comparing the human-readable location label, so the type system enforces the encoding choice
- Allocation-free fast paths in encodePhantomForPair and encodePhantomLowerForPair skip the byte/string copy when the input contains no characters that would change under escape
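A byte-in/byte-out escaper with a no-copy fast path might look like the sketch below. It is a guess at the general shape, not the real helpers: escapeBytes is a hypothetical name, it escapes everything outside the RFC 3986 §2.3 unreserved set, and in path context it is therefore stricter than url.PathEscape.

```go
package main

import "fmt"

const upperHex = "0123456789ABCDEF"

// isUnreserved reports whether c is in the RFC 3986 §2.3 unreserved set.
func isUnreserved(c byte) bool {
	switch {
	case c >= 'a' && c <= 'z', c >= 'A' && c <= 'Z', c >= '0' && c <= '9':
		return true
	case c == '-', c == '.', c == '_', c == '~':
		return true
	}
	return false
}

// escapeBytes percent-encodes in without converting it to a string. When
// nothing needs escaping it returns in itself, so no copy is made (the
// caller must not assume the result is a fresh slice). pathContext selects
// %20 for spaces; otherwise spaces become '+', as in query strings.
func escapeBytes(in []byte, pathContext bool) []byte {
	clean := true
	for _, c := range in {
		if !isUnreserved(c) {
			clean = false
			break
		}
	}
	if clean {
		return in // fast path: input is unchanged under escaping
	}
	out := make([]byte, 0, len(in)+8)
	for _, c := range in {
		switch {
		case isUnreserved(c):
			out = append(out, c)
		case c == ' ' && !pathContext:
			out = append(out, '+')
		default:
			out = append(out, '%', upperHex[c>>4], upperHex[c&0x0F])
		}
	}
	return out
}

func main() {
	secret := []byte("a b:c")
	fmt.Printf("%s\n", escapeBytes(secret, false)) // query context
	fmt.Printf("%s\n", escapeBytes(secret, true))  // path context
}
```

Taking the context as a bool, as the note describes for swapPhantomBytes, means a caller cannot select the wrong encoding by mistyping a label string.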
v0.14.0
New Features
- CIDR rule destinations: rules whose destination contains a / are now interpreted as CIDR (e.g. 192.168.0.0/16, 2001:db8::/32) and matched via IP containment instead of being treated as literal glob patterns (#39)
- HTTP Host header peeking on port 80 / 8080: SOCKS5 CONNECT requests that arrive with a bare IP and a non-Allow / non-Deny verdict now defer policy evaluation, peek the request's Host header, and re-evaluate against the recovered hostname. Mirrors the existing TLS SNI peek path. Eliminates the need for one approval rule per IP behind a hostname rule (e.g. tailscale's DERP probes hitting dozens of derp[N].tailscale.com IPs) (#39)
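The CIDR rule interpretation above can be sketched with the standard net/netip package. The function name is hypothetical, and path.Match is used only as a stand-in for the glob branch, since the notes do not say which glob implementation sluice uses.

```go
package main

import (
	"fmt"
	"net/netip"
	"path"
	"strings"
)

// matchDestination treats a rule containing "/" as a CIDR prefix and matches
// it by IP containment; any other rule is matched as a glob pattern.
func matchDestination(rule, host string) bool {
	if strings.Contains(rule, "/") {
		prefix, err := netip.ParsePrefix(rule)
		if err != nil {
			return false
		}
		addr, err := netip.ParseAddr(host)
		if err != nil {
			return false // a hostname can never be inside a CIDR rule
		}
		// Unmap so an IPv4-mapped IPv6 address still matches an IPv4 prefix.
		return prefix.Contains(addr.Unmap())
	}
	ok, err := path.Match(rule, host)
	return err == nil && ok
}

func main() {
	fmt.Println(matchDestination("192.168.0.0/16", "192.168.4.20"))
	fmt.Println(matchDestination("2001:db8::/32", "2001:db8::1"))
	fmt.Println(matchDestination("192.168.0.0/16", "10.0.0.1"))
	fmt.Println(matchDestination("*.example.com", "api.example.com"))
}
```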
Security hardening for the new HTTP Host path
- Spoofing guard verifies the recovered Host actually binds to the destination IP via the DNS interceptor's reverse cache or a forward DNS lookup. A claim like Host: api.openai.com to an arbitrary IP is rejected before the verdict is upgraded (#39)
- Peek failure on a deferred port-80 connection attaches a per-request policy checker bound to the IP destination so the broker still gets to ask, instead of silently upgrading the original Ask verdict to an allow (#39)
- HTTP-host deferral is gated on broker presence so Ask-without-broker continues to collapse to Deny via the IP-based path before SOCKS5 success goes out, avoiding success-then-reset on the client side (#39)
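The spoofing guard's two-step check (reverse cache first, forward lookup second) can be sketched as follows. Everything here is an assumption for illustration: the function name, the map-based reverse cache, and the injected lookup function, which in real code could wrap net.DefaultResolver.LookupHost. Stripping a port from the Host value is left out.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// hostBindsToIP reports whether the claimed Host plausibly belongs to destIP.
// reverse maps an IP string to hostnames previously seen resolving to it;
// lookup performs a forward resolution of a hostname to IP strings.
func hostBindsToIP(host, destIP string, reverse map[string][]string,
	lookup func(host string) ([]string, error)) bool {

	dest, err := netip.ParseAddr(destIP)
	if err != nil {
		return false
	}
	dest = dest.Unmap()
	host = strings.ToLower(strings.TrimSuffix(host, "."))

	// Step 1: cheap check against the reverse cache.
	for _, name := range reverse[dest.String()] {
		if strings.EqualFold(name, host) {
			return true
		}
	}
	// Step 2: forward lookup; any failure rejects the claim.
	addrs, err := lookup(host)
	if err != nil {
		return false
	}
	for _, a := range addrs {
		if ip, err := netip.ParseAddr(a); err == nil && ip.Unmap() == dest {
			return true
		}
	}
	return false
}

func main() {
	reverse := map[string][]string{"203.0.113.7": {"derp1.tailscale.com"}}
	noAnswer := func(string) ([]string, error) { return nil, nil }

	fmt.Println(hostBindsToIP("derp1.tailscale.com", "203.0.113.7", reverse, noAnswer))
	fmt.Println(hostBindsToIP("api.openai.com", "203.0.113.7", reverse, noAnswer))
}
```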
v0.13.2
Bug Fixes
- env-file ownership: chown the agent env file back to the runtime user after docker exec writes it as root, so hermes claw migrate and other agent-side writes keep working (#38)
- panic recovery in MITM response and stream paths: deferred recovers in Response and StreamResponseModifier log the stack and fall back to safe defaults so an OAuth handler panic no longer abandons the response body and triggers a JSONDecodeError in the agent (#38)
- OAuth-vs-static header dispatch: header bindings on OAuth credentials no longer substitute the full JSON envelope into the request header. A new metadata-driven helper (extractInjectableSecret) reads from OAuthIndex to extract just access_token for OAuth credentials and pass static credentials through unchanged. Mirrored into the QUIC proxy so HTTP/3 follows the same dispatch as HTTP/1 and HTTP/2 (#38)
- stream OAuth body leak guard: when a panic fires after io.ReadAll but before swapOAuthTokens returns, the recover no longer hands the agent the raw upstream bytes (which would contain real access and refresh tokens). The fallback is now http.NoBody until a successful swap produces a phantom-only buffer (#38)
- nil-input guard ordering: the StreamResponseModifier nil-input check now runs before the flow-nil early return, so a call with both f == nil and in == nil returns http.NoBody instead of a nil reader (#38)
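The leak guard and nil-input ordering above follow a general pattern: default the result to http.NoBody and replace it only after the swap has succeeded. A minimal Go sketch of that pattern, with hypothetical names and a swap callback standing in for the real token rewrite:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log"
	"net/http"
	"runtime/debug"
	"strings"
)

// modifyStream returns a reader over the swapped body. Until swap has
// returned successfully the result stays http.NoBody, so a panic or read
// error can never hand the raw upstream bytes to the caller.
func modifyStream(in io.Reader, swap func([]byte) []byte) (out io.Reader) {
	out = http.NoBody
	if in == nil { // nil-input guard runs before anything else
		return out
	}
	defer func() {
		if r := recover(); r != nil {
			log.Printf("stream modifier panic: %v\n%s", r, debug.Stack())
			out = http.NoBody
		}
	}()
	raw, err := io.ReadAll(in)
	if err != nil {
		return out
	}
	out = bytes.NewReader(swap(raw))
	return out
}

// fromString and drain are small helpers for the demo below.
func fromString(s string) io.Reader { return strings.NewReader(s) }

func drain(r io.Reader) string {
	b, _ := io.ReadAll(r)
	return string(b)
}

func main() {
	good := func(b []byte) []byte { return []byte("phantom-only") }
	bad := func(b []byte) []byte { panic("handler bug") }

	fmt.Printf("%q\n", drain(modifyStream(fromString("real-token"), good)))
	fmt.Printf("%q\n", drain(modifyStream(fromString("real-token"), bad)))
}
```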
v0.13.1
Bug Fixes
Hermes deployment fixes that emerged from running v0.13.0 in production. All six are real correctness or compatibility issues, not just polish.
- OAuth response handler: now decompresses gzip/br/deflate before parsing the token JSON, and is wrapped in a deferred recover with snapshot/rollback so a malformed body cannot panic the proxy or leave a half-rewritten response with stripped encoding headers. Reproduced live against auth.openai.com which returns gzip by default.
- Env-file marker block: sluice now writes phantom tokens into a fenced BEGIN sluice-managed / END sluice-managed block and replaces only that block on each call. Foreign keys (set by hermes claw migrate, the agent's own auth flow, or an operator) are preserved across both incremental updates and full reconciliation runs. Values are written single-quoted so the file is safe under both shell source and dotenv parsing.
- MCP gateway always mounts: the /mcp endpoint used to mount only when a sluice MCP upstream was registered. Agents that registered sluice as an MCP server (the documented setup) hit a 404 before the operator could add the first upstream. Now the gateway always starts; with zero upstreams it exposes an empty tool list.
- HermesProfile WireMCPCmd uses the bundled venv: a sh wrapper activates /opt/hermes/.venv when present so PyYAML is on the import path inside the official Hermes image. Native installs without the venv keep working via the system python3.
- SSH proxy exit-status race: sshHandleChannel previously called srcChan.CloseWrite from the upstream→agent data-copy goroutine the moment it saw EOF, racing the request-forwarder writing exit-status on the same channel. Fix holds the agent-side stdout EOF until every upstream→agent goroutine has drained, then issues CloseWrite followed by Close. Stdin direction is unchanged so upstream commands like cat still terminate correctly.
- Configurable Telegram agent label: approval messages used to read "OpenClaw wants to connect to..." regardless of the active profile. New SetAgentDisplayName is wired from the --agent flag at startup. Hermes deployments now read "Hermes wants to connect to...". The display name is HTML-escaped at render time.
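The env-file marker block above is essentially a replace-in-place of a fenced region. The Go sketch below shows one way to do that while leaving foreign keys untouched. The "# " comment prefix on the markers, the function names, and the example key are assumptions; only the BEGIN sluice-managed / END sluice-managed wording and the single-quoting come from the notes.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Marker lines; the "# " prefix is an assumption of this sketch.
const (
	beginMarker = "# BEGIN sluice-managed"
	endMarker   = "# END sluice-managed"
)

// shellQuote single-quotes v, escaping embedded single quotes shell-style.
func shellQuote(v string) string {
	return "'" + strings.ReplaceAll(v, "'", `'\''`) + "'"
}

// replaceManagedBlock rewrites only the fenced block in file, or appends one
// if none exists. Lines outside the block are preserved byte for byte.
func replaceManagedBlock(file string, vars map[string]string) string {
	keys := make([]string, 0, len(vars))
	for k := range vars {
		keys = append(keys, k)
	}
	sort.Strings(keys) // stable output across runs

	var block strings.Builder
	block.WriteString(beginMarker + "\n")
	for _, k := range keys {
		fmt.Fprintf(&block, "%s=%s\n", k, shellQuote(vars[k]))
	}
	block.WriteString(endMarker + "\n")

	start := strings.Index(file, beginMarker)
	end := strings.Index(file, endMarker)
	if start >= 0 && end > start {
		rest := strings.TrimPrefix(file[end+len(endMarker):], "\n")
		return file[:start] + block.String() + rest
	}
	if file != "" && !strings.HasSuffix(file, "\n") {
		file += "\n"
	}
	return file + block.String()
}

func main() {
	existing := "FOO=bar\n" + beginMarker + "\nOLD='x'\n" + endMarker + "\nBAZ=qux\n"
	vars := map[string]string{"OPENAI_API_KEY": "SLUICE_PHANTOM:openai"}
	fmt.Print(replaceManagedBlock(existing, vars))
}
```

Running the function twice with the same variables yields the same file, which is what makes full reconciliation runs safe.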
Deploy files
The repo's compose.yml, compose.dev.yml, and Caddyfile switch to the Hermes stack as the supported deployment. A new bootstrap.sh runs hermes claw migrate against an existing OpenClaw home volume one time, then patches mcp_servers.sluice.url into ~/.hermes/config.yaml. Caddy cert paths moved off provider-specific /etc/cloudflare/... to standard FHS /etc/ssl/certs/agent.pem + /etc/ssl/private/agent.key.
Operators on OpenClaw who want to keep the v0.12.x deployment shape can pin to ghcr.io/nnemirovsky/sluice:0.12 and use the compose / Caddyfile from any v0.12.x tag.