Skip to content

Draft: Bundle relay performance and reliability improvements#131

Open
poulcarlsen53 wants to merge 1 commit into
Kianmhz:mainfrom
poulcarlsen53:optimization-reliability-bundle
Open

Draft: Bundle relay performance and reliability improvements#131
poulcarlsen53 wants to merge 1 commit into
Kianmhz:mainfrom
poulcarlsen53:optimization-reliability-bundle

Conversation

@poulcarlsen53
Copy link
Copy Markdown
Contributor

Summary

This draft PR bundles performance, reliability, diagnostics, transport, and configuration improvements from an optimized GooseRelayVPN fork.

Major areas:

  • Apps Script relay improvements and safer examples.
  • Frame/crypto allocation reductions.
  • Quota rotation and endpoint recovery improvements.
  • Exit server request hardening and session caps.
  • Direct POST and WebSocket stream transport options.
  • Diagnostics bundle support.
  • Additional tests, benchmarks, and documentation.

Verification

  • go test -count=1 ./...
  • go vet ./...
  • go build ./cmd/client ./cmd/server ./bench/harness ./bench/sink ./bench/diff

Notes

This is intentionally opened as a draft because it is a large change set. I am happy to split pieces into smaller PRs if that is easier to review.

@Kianmhz
Copy link
Copy Markdown
Owner

Kianmhz commented May 17, 2026

Thanks for putting this together, and for offering to split it up front. I'd like to take you up on that. At ~7.8k additions across 62 files touching crypto, transport, session, and config, it's not realistically reviewable as one PR, and it's currently conflicting with main (likely the frame allocations work in 86ffe94 that already landed).

Before splitting, a few product questions worth settling so the split PRs aren't chasing a moving target:

  1. performance_mode presets vs. raw knobs. Do you want latency / throughput / balanced as a first-class config, or just expose the individual *_ms fields and let users tune directly? Presets are friendlier but lock in opinions that may not survive contact with real deployments.
  2. auto_tune. Is this in scope, or scope creep? An auto-tuner adds a lot of surface area (and a lot of "why did my throughput change overnight?" support load) for unclear win over good defaults.
  3. direct_stream_urls vs. existing relay_urls. Today relay_urls already bypasses Google fronting (see internal/config/client.go:337). Is direct_stream_urls doing something different (WebSocket-only?), or should the two be unified?

Happy to discuss any of these in a separate issue if easier.

Once those are settled, suggested split (each rebased onto main, each independently reviewable):

  • A. Quota rotation + endpoint recovery: internal/carrier/quota*.go, internal/carrier/client.go.
  • B. Exit hardening + session caps: internal/exit/exit.go, internal/exit/exit_timing_test.go, internal/session/*.
  • C. New transports (direct POST + WS stream): internal/carrier/stream*.go, internal/exit/stream.go, plus the related client/server config fields they need.
  • D. Autotune + performance_mode: internal/{carrier,exit}/autotune*.go and the preset wiring. Only if (1) and (2) above land as "yes".
  • E. Diagnostics bundle: cmd/client/diagnostics*.go.
  • F. Docs + benchmarks: bench/*, docs/*, README updates. Can go last or piggyback on whichever PR introduces the feature being documented.

The Apps Script + example-config pieces overlap with #129, which I've left review comments on. Once that lands, please rebase this branch on top to avoid re-litigating those files.

Frame/crypto: 86ffe94 already reduced marshaling allocations on internal/frame/*. Before re-doing that work in a split PR, worth diffing against current main to see what's still needed.

Thanks again, looking forward to the smaller PRs.

@poulcarlsen53
Copy link
Copy Markdown
Contributor Author

Thanks for the detailed review and for laying out the split. I agree that this PR is too large to review safely as one bundle, especially because it mixes config, transport, session behavior, diagnostics, benchmarks, and docs.

I’ll treat this PR as superseded and will open smaller PRs instead. Please feel free to close this one.

On the product questions:

  1. performance_mode presets vs raw knobs

For upstream, I think the safer direction is to prioritize raw knobs first, because they are explicit and easier to reason about in review. Presets are useful for users, but they encode opinions that may need real deployment data before becoming stable defaults.

If performance_mode stays, I think it should be only a convenience layer:

  • explicit config fields should always override the preset
  • preset behavior should be documented clearly
  • no hidden changes to payload limits, transport selection, or compatibility-sensitive behavior

So my current preference for upstream is: land the individual knobs with conservative defaults first, then discuss performance_mode separately once the behavior is proven.

  1. auto_tune

I agree this is easy to turn into scope creep. Auto-tune can be useful, but it adds support/debugging complexity because users may not understand why latency or throughput changes over time.

For upstream, I think auto_tune should not be part of the first split PRs. It should be deferred until after the static knobs, metrics, and diagnostics are in place. If it comes back later, it should be:

  • off by default
  • bounded by explicit min/max caps
  • limited to latency-window tuning only
  • visible in logs/diagnostics when it changes anything

So for now I’ll keep auto_tune out of the next PRs unless you want to discuss it in a separate issue.

  1. direct_stream_urls vs relay_urls

My intent was:

  • relay_urls = direct HTTP POST /tunnel path, bypassing Apps Script but still using the request/response batch model
  • direct_stream_urls = WebSocket /stream path, meant for a persistent direct-to-VPS transport

So yes, direct_stream_urls is WebSocket-only in intent and does something different from existing relay_urls.

That said, I agree the naming can be confusing because both are “direct relay” paths. I’m open to whichever naming you prefer before I split that work. Possible options:

  • keep relay_urls for direct POST and direct_stream_urls for WS
  • rename direct_stream_urls to stream_urls or websocket_urls
  • eventually move toward a more structured transport config, but that may be too much for now

For the split PRs, I’ll avoid introducing stream config until we settle the naming/design.

Planned split:

  1. Finish/land Improve Apps Script relay and example configs #129 first, then rebase future branches on current main.
  2. Quota rotation + endpoint recovery as its own PR.
  3. Exit/session hardening as its own PR.
  4. Diagnostics bundle as its own PR.
  5. Direct POST / WebSocket stream transport separately, only after the relay_urls/direct_stream_urls naming is agreed.
  6. Docs and benchmarks either attached to the feature PR they validate or sent after the code PRs.
  7. Autotune/performance_mode last, only if we decide they belong upstream.

On frame/crypto: agreed. Since 86ffe94 already landed allocation improvements, I’ll diff against current main before proposing any frame/crypto work and avoid redoing already-merged changes.

I’ll stop updating this large bundle and move the work into smaller reviewable PRs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants