Releases · nnemirovsky/sluice

23 May 04:09

nnemirovsky

v0.19.1

7b66bdf

v0.19.1 Latest

Bug Fixes

derive pool member cooldown from the 429 reset window. When no Retry-After / x-ratelimit-reset* header is present, sluice now parses the JSON body (resets_in_seconds / resets_at, retry_after / reset_after, top-level and nested under error), and the header set is broadened to cover RateLimit-Reset, X-RateLimit-Reset, X-RateLimit-Reset-After, and Anthropic ISO-8601 resets. This fixes the ~60s exhausted/recovered cycle on OpenAI Codex usage-limit 429s, whose reset window is carried only in the body. Header windows clamp to 6h, body windows to 24h #50 @nnemirovsky

Contributors

nnemirovsky

23 May 03:27

nnemirovsky

v0.19.0

ccc7255

v0.19.0

New Features

pool exhaustion handling: cooldowns now derive from the upstream Retry-After and x-ratelimit-reset hints, exhaustion is detected when no pool member is healthy, and the exhausted and recovered notices fire once on the edge via a server-side recovery monitor #49 @nnemirovsky
opt-in per-pool auth_reset_target that resets the agent's auth on the exhausted-to-recovered edge, settable from CLI, REST, and Telegram #49 @nnemirovsky

Contributors

nnemirovsky

18 May 13:30

nnemirovsky

v0.18.0

e5223ce

v0.18.0

Bug Fixes

sticky pool failover: there is no longer a "main" account. The pool stays on whichever member is currently active until that member itself exhausts, then advances forward to the next member and stays there. A lower-position member recovering from cooldown no longer causes a switch back to it. Previously a position-0 member with an exhausted upstream quota would be re-probed every 60s (when its short rate-limit cooldown lapsed), 429 again, and fail over again, producing an endless stream of identical "failed over" Telegram notices while the agent kept working on the other account. A failover, and its single audit event and Telegram notice, now happens only on a genuine exhaustion transition #48 @nnemirovsky
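The sticky behavior can be sketched in a few lines of Go. This is a simplified model, not sluice's implementation: the pool type, its fields, and the wrap-around when advancing are assumptions made for illustration.

```go
package main

import "fmt"

// pool is a hypothetical, simplified stand-in for the shared pool health state.
type pool struct {
	members []string        // members in position order
	cooling map[string]bool // members currently in cooldown
	active  int             // index of the sticky current-active member
}

func newPool(members ...string) *pool {
	return &pool{members: members, cooling: map[string]bool{}}
}

// resolve returns the member serving the next request. It never looks at
// lower positions, so a recovered earlier member does not pull traffic back.
func (p *pool) resolve() string { return p.members[p.active] }

// markExhausted cools a member. Only when that member is the active one does
// the pool advance (wrapping, by assumption) to the next non-cooling member.
// It reports whether a genuine failover happened.
func (p *pool) markExhausted(name string) bool {
	p.cooling[name] = true
	if p.members[p.active] != name {
		return false
	}
	for step := 1; step < len(p.members); step++ {
		next := (p.active + step) % len(p.members)
		if !p.cooling[p.members[next]] {
			p.active = next
			return true
		}
	}
	return false // every member is cooling: stay put, no failover event
}

// markRecovered clears a cooldown without changing the active member.
func (p *pool) markRecovered(name string) { delete(p.cooling, name) }

func main() {
	p := newPool("acct-a", "acct-b")
	fmt.Println(p.markExhausted("acct-a"), p.resolve())
	p.markRecovered("acct-a")
	fmt.Println(p.resolve()) // still the member selected by the failover
}
```

Because markRecovered never touches the active index, a member whose short cooldown lapses cannot trigger the re-probe, 429, fail-over-again loop described above.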

Improvements

the sticky current-active selection lives on the swap-surviving shared pool health state, is epoch-scoped so a removed-and-recreated pool with overlapping member names cannot inherit a stale hold, and uses a read-mostly fast path so the per-request resolve no longer serializes on a write lock. The all-cooling degrade behavior and the audit Reason and pool_exhausted formats are unchanged. A selectable position-vs-sticky strategy mode is a possible follow-up #48 @nnemirovsky

Contributors

nnemirovsky

18 May 09:09

nnemirovsky

v0.17.0

6d5e680

v0.17.0

New Features

credential pools are now reachable from all channels: new REST /api/pools endpoints (list, create, status, rotate, remove) and Telegram /pool create|list|status|rotate|remove commands, alongside the existing CLI. Pool operation logic has been lifted into a channel-agnostic internal/poolops package so the CLI, REST, and Telegram implementations cannot drift #47 @nnemirovsky

Bug Fixes

pool token-host phantom expansion is scoped to grant_type=refresh_token, so a fresh in-container codex login --device-auth (a device_code grant) is no longer corrupted into a 400 token_exchange_user_error #47 @nnemirovsky
REST pool-create error mapping corrected: client validation returns 400, name or member conflict returns 409 via a dedicated pool-referenced schema, internal and DB faults return 500 instead of a misleading 400 #47 @nnemirovsky
Telegram pool replies are HTML-safe: pool names and the bind hint placeholder are escaped so a created-pool success message is never rejected by the Bot API #47 @nnemirovsky

Improvements

friendlier pool failover Telegram notification with humanized reason text instead of a bare HTTP code, while the audit Reason format is unchanged #47 @nnemirovsky
the request-side grant probe is gated to token-host POSTs and bounded, keeping the proxy hot path cheap #47 @nnemirovsky

Docs

channel feature-parity principle codified in CLAUDE.md and CONTRIBUTING.md #45 @nnemirovsky
condensed CLAUDE.md without dropping facts #46 @nnemirovsky

Contributors

nnemirovsky

17 May 02:26

nnemirovsky

v0.16.0

70fcff3

v0.16.0

Credential pools with automatic failover

One phantom identity the agent holds can now be backed by N real OAuth credentials, with transparent auto-failover when one is rate-limited or its auth fails.

Pool model: sluice pool create|list|status|rotate|remove. A pool maps a single pool-stable phantom (byte-identical synthetic JWT, R3) to the currently active member; refreshed tokens are attributed back to the issuing member via the injected refresh token (R1, fail-closed).
Auto-failover (Phase 2): 429 / 403 insufficient_quota → rate-limited; 401 / token-body invalid_grant/invalid_token → auth-failure. The active member is cooled and switched synchronously in-memory before the response returns, so the agent's own retry lands on the next member. Monotonic cooldown extension; lazy recovery; degrade-never-hard-fail.
QUIC request-side pool awareness + the R3 pool-stable phantom (response-side R1/failover is HTTP-only by capability).
Data model: migrations 000006_credential_pools, 000007_pool_membership_epoch.

Approval-prompt coalescing

Concurrent approvals to the same dest:port collapse into one Telegram prompt; one tap dismisses the whole burst, and the final coalesced count is folded into the existing resolve/cancel edit (zero extra Telegram API calls). MCP tool calls opt out (arg-sensitive). QUIC keeps its own packet-dedup.

Failover hardening (fixes found in live operation)

Operator-park stranding fixed: pool rotate parks the displaced member with reason manual rotate — that member is healthy, just deprioritized. The all-cooled degrade now prefers an operator-parked-but-healthy member over a genuinely-failed one, so a rotate onto an exhausted account fails over to the healthy peer instead of self-looping (which previously hard-failed the agent). Normal position-order rotate semantics are unchanged.
Self-failover spam fixed: when there is no distinct failover target (to == from), it is classified as pool exhaustion — a distinct pool_exhausted audit action and an honest "pool exhausted" operator notice instead of a meaningless self-referential cred_failover. FailoverEvent.Exhausted carries the distinction.
Notice dedup: identical (pool, from, to, tag) signals are deduplicated within a 30s window, so an agent retry storm yields one audit row + one notice instead of N.

Fail-before/pass-after tests cover the degrade preference, the pool-exhausted suppression + dedup, and real failover to an operator-parked-but-healthy peer.

Known limitation

sluice's pooled token-host phantom expansion currently also rewrites non-refresh OAuth grants (e.g. device_code) to the pool's shared token host, which breaks a fresh in-container OAuth login for a pooled provider. Perform the initial login outside the proxy (or before binding the pool). Tracked for a follow-up.

12 May 04:14

nnemirovsky

v0.15.1

a8317a5

v0.15.1

Bug Fixes

proxy: SSH jump host close race that dropped the agent's exec reply (#41)

When an upstream SSH server sent its reply, wrote data, sent exit-status, and closed the channel in one burst, sluice's wait on the three upstream-to-agent goroutines completed and closed srcChan while the agent-to-upstream forwarder was still mid-reply for the agent's exec request. The agent's session.SendRequest("exec", true, ...) observed SSH_MSG_CHANNEL_CLOSE before the SUCCESS reply landed on ch.msg, gossh surfaced the closed channel as io.EOF, and session.Output("whoami") failed with EOF even though the upstream command succeeded. The fix tracks in-flight agent-to-upstream requests with a mutex+cond barrier so the close path drains any pending reply before calling srcChan.CloseWrite() / srcChan.Close(). The symptom has been visible on the CI e2e-linux runners since d27b05e, which narrowed the close timing window; production SSH clients have enough natural latency between reply and close that the race window almost never opens.
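A mutex+cond barrier of the kind mentioned in the fix might look like the Go sketch below. The barrier type and its method names are hypothetical; the sketch only shows the general pattern of counting in-flight requests and blocking the close path until the count reaches zero.

```go
package main

import (
	"fmt"
	"sync"
)

// barrier counts in-flight requests; wait blocks until none remain.
type barrier struct {
	mu   sync.Mutex
	cond *sync.Cond
	n    int
}

func newBarrier() *barrier {
	b := &barrier{}
	b.cond = sync.NewCond(&b.mu)
	return b
}

// begin marks a request as in flight (call before forwarding it).
func (b *barrier) begin() {
	b.mu.Lock()
	b.n++
	b.mu.Unlock()
}

// end marks a request's reply as delivered and wakes any waiter at zero.
func (b *barrier) end() {
	b.mu.Lock()
	b.n--
	if b.n == 0 {
		b.cond.Broadcast()
	}
	b.mu.Unlock()
}

// wait blocks the close path until every pending reply has been delivered.
func (b *barrier) wait() {
	b.mu.Lock()
	for b.n > 0 {
		b.cond.Wait()
	}
	b.mu.Unlock()
}

func main() {
	b := newBarrier()
	b.begin() // the agent's exec request is now in flight
	go func() {
		defer b.end()
		fmt.Println("reply forwarded to agent")
	}()
	b.wait() // the close path drains here before closing the channel
	fmt.Println("channel closed")
}
```

Calling begin before the forwarding goroutine starts is what guarantees the close path cannot slip past a request that has been accepted but not yet answered.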

12 May 02:30

nnemirovsky

v0.15.0

55092f4

v0.15.0

Bug Fixes

proxy: URL-encoded phantom tokens now swap correctly in application/x-www-form-urlencoded bodies, URL query strings, URL paths, request headers, streaming bodies, QUIC/HTTP3 paths, and WebSocket text frames (#40)

OAuth refresh round-trips for providers that POST grant_type=refresh_token as form-urlencoded data (Anthropic Claude Code, Google) now go through the phantom swap cleanly. Previously the colon in SLUICE_PHANTOM:<name> got percent-encoded to %3A on the wire and the scanner missed it, so the upstream received the phantom verbatim and returned invalid_grant. The fix matches both casings (%3A and %3a) per RFC 3986 §2.1, uses path-correct escaping (PathEscape vs QueryEscape) so secrets containing spaces don't corrupt URL paths, and keeps secrets in byte slices that SecureBytes.Release() can zero.
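To make the casing issue concrete, here is a simplified Go sketch of swapping a percent-encoded phantom in a form body. The swapEncodedPhantom function is hypothetical, it assumes the credential name contains only unreserved characters, and it converts the secret to a string, so it deliberately does not reproduce the byte-slice hygiene described above.

```go
package main

import (
	"bytes"
	"fmt"
	"net/url"
	"strings"
)

// swapEncodedPhantom replaces the percent-encoded phantom for name with the
// query-escaped secret, matching both %3A and %3a for the encoded colon.
// Assumption: name has no characters that need escaping themselves.
func swapEncodedPhantom(body []byte, name string, secret []byte) []byte {
	upper := url.QueryEscape("SLUICE_PHANTOM:" + name) // colon becomes %3A
	lower := strings.Replace(upper, "%3A", "%3a", 1)
	// Simplification: this string conversion leaves a copy of the secret
	// on the heap, which the real fix avoids.
	repl := []byte(url.QueryEscape(string(secret)))
	out := bytes.ReplaceAll(body, []byte(upper), repl)
	return bytes.ReplaceAll(out, []byte(lower), repl)
}

func main() {
	body := []byte("grant_type=refresh_token&refresh_token=SLUICE_PHANTOM%3aopenai")
	fmt.Println(string(swapEncodedPhantom(body, "openai", []byte("rt-123"))))
}
```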

Internals

Each phantomPair now carries precomputed encodedPhantom and encodedPhantomLower byte slices populated once at pair construction time, so the hot-path swap reads precomputed bytes instead of recomputing url.QueryEscape on every request, header, and stream chunk
Added byte-in/byte-out queryEscapeBytes and pathEscapeBytes helpers (RFC 3986 §2.3 unreserved sets) that replace url.QueryEscape(string(secret.Bytes())) patterns and avoid leaving immutable string copies of secrets on the heap
swapPhantomBytes selects between query and path escaping via an explicit pathContext bool parameter instead of comparing the human-readable location label, so the type system enforces the encoding choice
Allocation-free fast paths in encodePhantomForPair and encodePhantomLowerForPair skip the byte/string copy when the input contains no characters that would change under escape

08 May 13:31

nnemirovsky

v0.14.0

399807c

v0.14.0

New Features

CIDR rule destinations: rules whose destination contains a / are now interpreted as CIDR (e.g. 192.168.0.0/16, 2001:db8::/32) and matched via IP containment instead of being treated as literal glob patterns (#39)
HTTP Host header peeking on port 80 / 8080: SOCKS5 CONNECT requests that arrive with a bare IP and a non-Allow / non-Deny verdict now defer policy evaluation, peek the request's Host header, and re-evaluate against the recovered hostname. Mirrors the existing TLS SNI peek path. Eliminates the need for one approval rule per IP behind a hostname rule (e.g. tailscale's DERP probes hitting dozens of derp[N].tailscale.com IPs) (#39)
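The CIDR-versus-glob decision can be sketched with the standard net/netip package. The matchDestination function is hypothetical, and path.Match is only a stand-in for whatever glob matcher sluice actually uses.

```go
package main

import (
	"fmt"
	"net/netip"
	"path"
	"strings"
)

// matchDestination treats a rule containing "/" as a CIDR prefix and matches
// by IP containment; any other rule is matched as a glob pattern.
func matchDestination(rule, dest string) bool {
	if strings.Contains(rule, "/") {
		prefix, err := netip.ParsePrefix(rule)
		if err != nil {
			return false // malformed CIDR matches nothing
		}
		addr, err := netip.ParseAddr(dest)
		if err != nil {
			return false // a hostname never matches a CIDR rule
		}
		// Unmap lets an IPv4-mapped IPv6 address match an IPv4 prefix.
		return prefix.Contains(addr.Unmap())
	}
	ok, err := path.Match(rule, dest) // stand-in glob matcher (assumption)
	return err == nil && ok
}

func main() {
	fmt.Println(matchDestination("192.168.0.0/16", "192.168.4.7"))
	fmt.Println(matchDestination("2001:db8::/32", "2001:db8::1"))
	fmt.Println(matchDestination("192.168.0.0/16", "10.0.0.1"))
}
```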

Security hardening for the new HTTP Host path

Spoofing guard verifies the recovered Host actually binds to the destination IP via the DNS interceptor's reverse cache or a forward DNS lookup. A claim like Host: api.openai.com to an arbitrary IP is rejected before the verdict is upgraded (#39)
Peek failure on a deferred port-80 connection attaches a per-request policy checker bound to the IP destination so the broker still gets to ask, instead of silently upgrading the original Ask verdict to an allow (#39)
HTTP-host deferral is gated on broker presence so Ask-without-broker continues to collapse to Deny via the IP-based path before SOCKS5 success goes out, avoiding success-then-reset on the client side (#39)

07 May 11:02

nnemirovsky

v0.13.2

61f4c00

v0.13.2

Bug Fixes

env-file ownership: chown the agent env file back to the runtime user after docker exec writes it as root, so hermes claw migrate and other agent-side writes keep working (#38)
panic recovery in MITM response and stream paths: deferred recovers in Response and StreamResponseModifier log the stack and fall back to safe defaults so an OAuth handler panic no longer abandons the response body and triggers a JSONDecodeError in the agent (#38)
OAuth-vs-static header dispatch: header bindings on OAuth credentials no longer substitute the full JSON envelope into the request header. A new metadata-driven helper (extractInjectableSecret) reads from OAuthIndex to extract just access_token for OAuth credentials and pass static credentials through unchanged. Mirrored into the QUIC proxy so HTTP/3 follows the same dispatch as HTTP/1 and HTTP/2 (#38)
stream OAuth body leak guard: when a panic fires after io.ReadAll but before swapOAuthTokens returns, the recover no longer hands the agent the raw upstream bytes (which would contain real access and refresh tokens). The fallback is now http.NoBody until a successful swap produces a phantom-only buffer (#38)
nil-input guard ordering: the StreamResponseModifier nil-input check now runs before the flow-nil early return, so a call with both f == nil and in == nil returns http.NoBody instead of a nil reader (#38)

07 May 07:59

nnemirovsky

v0.13.1

d27b05e

v0.13.1

Bug Fixes

Hermes deployment fixes that emerged from running v0.13.0 in production. All six are real correctness or compatibility issues, not just polish.

OAuth response handler: now decompresses gzip/br/deflate before parsing the token JSON, and is wrapped in a deferred recover with snapshot/rollback so a malformed body cannot panic the proxy or leave a half-rewritten response with stripped encoding headers. Reproduced live against auth.openai.com which returns gzip by default.
Env-file marker block: sluice now writes phantom tokens into a fenced BEGIN sluice-managed / END sluice-managed block and replaces only that block on each call. Foreign keys (set by hermes claw migrate, the agent's own auth flow, or an operator) are preserved across both incremental updates and full reconciliation runs. Values are written single-quoted so the file is safe under both shell source and dotenv parsing.
MCP gateway always mounts: the /mcp endpoint used to mount only when a sluice MCP upstream was registered. Agents that registered sluice as an MCP server (the documented setup) hit a 404 before the operator could add the first upstream. Now the gateway always starts; with zero upstreams it exposes an empty tool list.
HermesProfile WireMCPCmd uses the bundled venv: a sh wrapper activates /opt/hermes/.venv when present so PyYAML is on the import path inside the official Hermes image. Native installs without the venv keep working via the system python3.
SSH proxy exit-status race: sshHandleChannel previously called srcChan.CloseWrite from the upstream→agent data-copy goroutine the moment it saw EOF, racing the request-forwarder that writes exit-status on the same channel. The fix holds the agent-side stdout EOF until every upstream→agent goroutine has drained, then issues CloseWrite followed by Close. The stdin direction is unchanged, so upstream commands like cat still terminate correctly.
Configurable Telegram agent label: approval messages used to read "OpenClaw wants to connect to..." regardless of the active profile. New SetAgentDisplayName is wired from the --agent flag at startup. Hermes deployments now read "Hermes wants to connect to...". The display name is HTML-escaped at render time.
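The env-file marker block from the list above can be illustrated with a short Go sketch. The exact marker lines (including the leading "# "), the replaceManagedBlock name, and the assumption that values contain no single quotes are all illustrative choices, not details confirmed by these notes.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Marker text follows the notes; the "# " comment prefix is an assumption.
const (
	beginMarker = "# BEGIN sluice-managed"
	endMarker   = "# END sluice-managed"
)

// replaceManagedBlock rewrites only the fenced block and leaves every other
// line of the env file untouched. Values are written single-quoted.
// Assumption: values contain no single quotes, so no escaping is attempted.
func replaceManagedBlock(file string, vars map[string]string) string {
	keys := make([]string, 0, len(vars))
	for k := range vars {
		keys = append(keys, k)
	}
	sort.Strings(keys) // stable output regardless of map order

	var block strings.Builder
	block.WriteString(beginMarker + "\n")
	for _, k := range keys {
		fmt.Fprintf(&block, "%s='%s'\n", k, vars[k])
	}
	block.WriteString(endMarker + "\n")

	start := strings.Index(file, beginMarker)
	end := strings.Index(file, endMarker)
	if start < 0 || end < start {
		// No existing block: append one, keeping foreign keys above it.
		if file != "" && !strings.HasSuffix(file, "\n") {
			file += "\n"
		}
		return file + block.String()
	}
	rest := strings.TrimPrefix(file[end+len(endMarker):], "\n")
	return file[:start] + block.String() + rest
}

func main() {
	file := "FOO=bar\n" + beginMarker + "\nOLD='x'\n" + endMarker + "\nBAZ=qux\n"
	fmt.Print(replaceManagedBlock(file, map[string]string{"API_KEY": "SLUICE_PHANTOM:openai"}))
}
```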

Deploy files

The repo's compose.yml, compose.dev.yml, and Caddyfile switch to the Hermes stack as the supported deployment. A new bootstrap.sh runs hermes claw migrate against an existing OpenClaw home volume one time, then patches mcp_servers.sluice.url into ~/.hermes/config.yaml. Caddy cert paths moved off provider-specific /etc/cloudflare/... to standard FHS /etc/ssl/certs/agent.pem + /etc/ssl/private/agent.key.

Operators on OpenClaw who want to keep the v0.12.x deployment shape can pin to ghcr.io/nnemirovsky/sluice:0.12 and use the compose / Caddyfile from any v0.12.x tag.

#37 @nnemirovsky

Contributors

nnemirovsky
