feat(tunnel): IPv6 support — prefix delegation + PCP pinholes (WIP) by helix-nine · Pull Request #3388 · Start9Labs/start-technologies

helix-nine · 2026-07-02T23:19:45Z

Status: ready for review. This is PR1 of a staged rollout (per @dr-bonez: 1a+2b, 3b, 4a). It lands the IPv6 addressing foundation — allocation, WireGuard config generation, proxy-NDP dataplane, DNS, CLI/API, UI, and docs. PCP pinholes are PR2. Commits: aa0603a (foundation), 6fea6f3 (proxy-NDP), c060ce1 (docs/changelog/version), 6201a62 (UI), d21dace (self-review fixes).

Decision (research-backed)

The addressing model hinges on what VPS providers actually delegate. I surveyed the providers the tunnel docs recommend:

| Provider | Default IPv6 |
| --- | --- |
| Hetzner Cloud | /64 (routed, gw `fe80::1`) |
| Vultr | /64 |
| BuyVM | /64 (optional routed /48) |
| DigitalOcean | /124 (16 addrs) |
| Linode/Akamai | /128 default; /64 or /56 on request |
| OVH VPS | /128 (shared /64); dedicated → /64, High-Grade → /56 |

So the common case is a single /64, sometimes less. Per the rule "if it's just a /64 → 1a+2b": the tunnel takes a statically configured routed prefix (1a) and assigns each client a single /128 out of it (2b) — while scaling up for free: a prefix shorter than /64 (Linode /56, dedicated, BuyVM /48) delegates a whole /64 per client (2a, real PD, so a StartOS box can sub-address its containers).
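As a rough illustration of the size-adaptive rule described above (the scheme as first proposed in this PR), the choice can be expressed as a match on the prefix length. This is a sketch only: `ClientAlloc` and `classify` are hypothetical names, not the contents of `wg6.rs`.

```rust
/// How a client's IPv6 allocation is carved out of the routed prefix.
/// Hypothetical type for illustration; not the PR's actual API.
#[derive(Debug, PartialEq, Eq)]
enum ClientAlloc {
    /// Prefix shorter than /64: delegate a whole /64 per client (2a).
    PerClientSlash64,
    /// Exactly a /64: one /128 per client out of the shared /64 (2b).
    SharedSlash64,
    /// Longer than /64 (e.g. a /124): one /128 per client from a small block.
    NarrowBlock { total_addrs: u128 },
}

/// Pick the allocation mode from the prefix length alone.
/// Returns `None` for lengths that are not valid for IPv6.
fn classify(prefix_len: u8) -> Option<ClientAlloc> {
    match prefix_len {
        0..=63 => Some(ClientAlloc::PerClientSlash64),
        64 => Some(ClientAlloc::SharedSlash64),
        65..=128 => Some(ClientAlloc::NarrowBlock {
            // Number of addresses in the block, e.g. 16 for a /124.
            total_addrs: 1u128 << (128 - prefix_len),
        }),
        _ => None,
    }
}
```

For example, `classify(124)` yields a narrow block of 16 addresses, which matches the DigitalOcean row in the table.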

What's in this PR

- tunnel/wg6.rs: the allocation module (server_addr, client_v6), unit-tested across shared-/64, short-prefix (per-client /64), and narrow (/124) cases. For a shared /64 the client's /128 host bits are its tunnel IPv4 (stable, allocation-free); ::1 is reserved for the tunnel.
- WgServer.ipv6 prefix field + WireGuard config generation: the server interface carries <prefix>::1, each peer's AllowedIPs routes its delegated /64 or /128 back over wg, and client configs get their v6 Address, v6 DNS (= the tunnel), and ::/0 for v6.
  - Routing note: IPv6 is full-tunnel (AllowedIPs = …, ::/0). Replies sourced from a VPS-delegated GUA must return through the tunnel, and a plain WireGuard peer can't source-route otherwise. IPv4 stays split.
- set-ipv6 API + CLI to configure/clear the prefix (+ i18n across 5 locales, TS bindings export).
- forward chain now accepts client-originated IPv6 (nft_rule_v6); inbound-to-client is deliberately a pinhole (see PR2).

Tested: cargo test -p start-core --features test tunnel::wg → 9 passing (allocation math + rendered configs).

PR1 checklist

- Dataplane: proxy-NDP for on-link /64s (Hetzner/Vultr). The VM answers ND for client /128s on the WAN; routed prefixes / per-client /64s need none (6fea6f3, resync_v6). Guarded by a v6_lock (d21dace).
- AAAA DNS: already served by the existing RFC 2136 injection path (parse_rdata is record-type-generic), so a device can publish an AAAA for its GUA today. Documented; no new code needed.
- Angular UI: prefix config + Disable on the settings page (6201a62). Each device's v6 appears in its generated config (device show-config); I left it out of the devices list rather than duplicate the allocation math in TS. Happy to expose it via the API for the list if you'd prefer a column.
- Docs (ipv6.md + cli-reference.md + generated manpage), CHANGELOG.md, version 1.1.0 → 1.2.0 (c060ce1).
- Self-review: adversarial review of the full diff → 7 findings fixed (2 major) in d21dace (see comment).
- End-to-end verification on a real VPS with a routed prefix: needs real global IPv6, so it can't run on a local VM. Worth a manual pass before release.

PR2 (follow-up, `3b`)

- Extend the tunnel's PCP server for IPv6 GUA pinholes (ip6 nft accept, no NAT), which is what StartOS's PCP client already requests.
- Manual pinhole management in the UI/CLI.

Feedback welcome on the allocation scheme and the full-tunnel-v6 AllowedIPs choice.

helix-nine · 2026-07-03T00:06:34Z

PR1 is now feature-complete (addressing foundation). Four commits on the branch:

- aa0603a: IPv6 address-delegation foundation (wg6 allocation module + WG config generation), unit-tested.
- 6fea6f3: proxy-NDP dataplane for on-link /64s (Hetzner/Vultr).
- c060ce1: docs (ipv6.md + CLI reference + manpage), CHANGELOG, version 1.1.0 → 1.2.0.
- 6201a62: IPv6 settings UI card (api service live+mock, mock db seed, regenerated TS bindings).

Verification (all pass): cargo test tunnel::wg (9), cargo check -p start-core, check:tunnel, build:tunnel (full Angular template type-check), prettier.

Not automated: end-to-end reachability on a real VPS with a routed IPv6 prefix — needs real global IPv6, which a local libvirt VM can't provide. Worth a manual pass before release.

Deferred to PR2 (per the 3b decision): extend the tunnel's PCP server for IPv6 GUA pinholes (ip6 nft accept, no NAT) + a manual pinhole path.

Still a draft — happy to adjust the allocation scheme or the full-tunnel-v6 AllowedIPs choice on review before I start PR2.

helix-nine · 2026-07-03T02:08:01Z

Before marking this ready I ran an adversarial self-review over the whole PR1 diff (6 dimensions — allocation math, proxy-NDP dataplane, WG config generation, API/CLI, UI, docs — each finding verified against the code). It surfaced 7 real issues, all now fixed in d21dace:

Major

- resync_v6 did the same non-atomic read→diff→apply→overwrite as resync_egress but without a lock: two concurrent config-change RPCs could leave the tracked proxy-NDP map out of sync with the kernel neighbor table (leaked entries). Added a dedicated v6_lock mirroring egress_lock.
- The client config advertised an IPv6 DNS server (<prefix>::1) that nothing binds: the DNS proxy is IPv4-only. Dropped the v6 DNS entry (AAAA resolves fine over the v4 resolver); v6 DNS can come with the proxy binding v6 later.

Minor / nit

- set-ipv6 now also rejects link-local prefixes (a fat-fingered fe80::/64 would have persisted and broken all client v6).
- UI validator now requires an explicit /prefix in [0,128] (IpNet.parse doesn't enforce it).
- Docs: corrected the CLI clearing syntax (omit --prefix, not --prefix null); added a note that inbound IPv6 hosting isn't supported yet (PR2); dropped the v6-DNS claim; moved the changelog entry under a ## [1.2.0] heading.

Re-verified after the fixes: cargo test tunnel::wg (9 pass), cargo check, check:tunnel, full build:tunnel, prettier — all green.
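The two input-validation fixes listed above (reject link-local prefixes, require an explicit prefix length in [0,128]) could look roughly like the following. This is a minimal sketch using only the standard library; `parse_prefix` is a hypothetical helper, and the real checks live in the `set-ipv6` handler and the UI validator, whose code is not shown here.

```rust
use std::net::Ipv6Addr;

/// Parse "addr/len", requiring an explicit length and rejecting link-local.
/// Hypothetical helper for illustration only.
fn parse_prefix(input: &str) -> Result<(Ipv6Addr, u8), String> {
    // A bare address with no "/len" is rejected up front.
    let (addr, len) = input
        .split_once('/')
        .ok_or_else(|| format!("{input}: missing /prefix length"))?;
    let addr: Ipv6Addr = addr.parse().map_err(|e| format!("{input}: {e}"))?;
    let len: u8 = len.parse().map_err(|e| format!("{input}: {e}"))?;
    if len > 128 {
        return Err(format!("{input}: prefix length must be in [0,128]"));
    }
    // fe80::/10 is link-local: the top 10 bits are 1111 1110 10.
    if (addr.segments()[0] & 0xffc0) == 0xfe80 {
        return Err(format!("{input}: link-local prefixes are not routable"));
    }
    Ok((addr, len))
}
```

With this sketch, `parse_prefix("fe80::/64")` and `parse_prefix("2001:db8::")` both return an error, while `parse_prefix("2001:db8::/64")` succeeds.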

helix-nine · 2026-07-04T01:13:54Z

Implemented the per-subnet IPv6 redesign (handoff §6) — 14a07a1. Global WgServer.ipv6 → per-subnet WgSubnetConfig.ipv6; allocation collapsed to one /128 per host (host_v6 = prefix.network() | tunnel_ipv4, server + clients alike); subnet <net> set-ipv6 replaces the global command (carrying the egress validation, keyed on the subnet); UI moved from the settings card to the subnet dialog + a computed IPv6 column in the subnet/device tables.
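The `host_v6 = prefix.network() | tunnel_ipv4` rule can be illustrated with plain integer arithmetic. The sketch below is an assumption about the shape of the computation, not the actual `wg6` code, and the tunnel IPv4 `10.59.0.2` is an arbitrary example value. Note that a plain OR only stays inside the prefix when the prefix is /96 or shorter.

```rust
use std::net::{Ipv4Addr, Ipv6Addr};

/// Network address of `prefix`/`len` as a 128-bit integer (host bits cleared).
/// Assumes `len <= 128`.
fn network(prefix: Ipv6Addr, len: u8) -> u128 {
    // A shift by 128 would overflow, so /0 is handled separately.
    let mask = if len == 0 { 0 } else { u128::MAX << (128 - len as u32) };
    u128::from(prefix) & mask
}

/// Embed the 32-bit tunnel IPv4 in the low bits of the prefix's network address.
fn host_v6(prefix: Ipv6Addr, len: u8, v4: Ipv4Addr) -> Ipv6Addr {
    Ipv6Addr::from(network(prefix, len) | u128::from(u32::from(v4)))
}
```

For the documentation prefix `2001:db8::/64` and tunnel IPv4 `10.59.0.2` (0x0a3b0002), this yields `2001:db8::a3b:2`. Because the address is a pure function of the prefix and the IPv4, the server and the UI can compute it independently without storing anything.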

Ran an adversarial review over the diff before pushing; it caught 3 real issues, all fixed in fd650b2:

- Library bug (root-caused, not worked around). @start9labs/start-core's IpAddress rendered IPv6 by joining raw decimal octets ("32:1:13:184:…" not "2001:db8::…"), and fromOctets's 16-octet path spun forever on a no-op unshift() whenever the 9th octet was 0, so zero()/fromOctets/.address on any computed v6 hung the browser or produced garbage. Fixed both the .address getter and fromOctets with one correct renderIpv6 (8 hex groups, longest zero-run → ::). Verified against RFC 5952 edge cases and across all web projects (npm run check); no consumer regressed.
- Out-of-prefix addresses. host_v6 only stays in-prefix for prefix lengths of /96 or shorter; a /124 escaped the block silently. set_subnet_ipv6 now rejects prefixes longer than /96, the web validator mirrors it, and a wg6 boundary test covers /48–/96.
- Stale comments from the old /64-delegation model removed.

Verified: cargo test (backend), npm run check (all projects), check:tunnel, full build:tunnel, prettier, and a runtime check of the render fix.

Add an operator-configured routed IPv6 prefix to the tunnel and carve per-client global addresses out of it, adapting to the prefix size: a /64 per client when the prefix is shorter than /64 (real prefix delegation), a single /128 per client for a shared /64 (the common budget-VPS case), or an indexed /128 for a longer-than-/64 prefix (e.g. DigitalOcean /124).

- wg6: address-allocation module (server_addr + client_v6), unit-tested.
- WgServer.ipv6 prefix field + server/peer/client WireGuard config generation carrying v6 addresses, gateway/DNS, and AllowedIPs (v6 is full-tunnel so replies from the delegated GUA return through the tunnel).
- set-ipv6 API + CLI to configure/clear the prefix (+ i18n, bindings).
- forward chain: accept client-originated IPv6.

Foundation for the tunnel IPv6 work; runtime dataplane (proxy-NDP, AAAA DNS), UI, docs, and PCP v6 pinholes follow.

When the configured prefix is a /64 held on-link by a WAN interface (Hetzner, Vultr), the VPS gateway resolves each client's global address via Neighbor Discovery on the WAN link. The tunnel host now answers for those addresses (net.ipv6.conf.all.proxy_ndp + `ip -6 neigh … proxy`) and forwards to the client over WireGuard. Routed prefixes and per-client /64s are delivered without ND, so they get no proxy entry. Reconciled on every network sync and at startup; installed entries are tracked so stale ones (client removed, prefix cleared) are withdrawn.
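The "tracked entries, stale ones withdrawn" reconciliation described above amounts to a set difference between what is installed and what is currently desired. The following is a minimal sketch under that assumption; `NdpOp` and `plan` are hypothetical names, and applying the resulting operations to the kernel is deliberately left out.

```rust
use std::collections::BTreeSet;
use std::net::Ipv6Addr;

/// One change to apply to the kernel's proxy-NDP table.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq, Eq)]
enum NdpOp {
    Add(Ipv6Addr),
    Del(Ipv6Addr),
}

/// Compute the operations that turn `installed` into `desired`.
/// Withdrawals come first, then additions; each group is in address order.
fn plan(installed: &BTreeSet<Ipv6Addr>, desired: &BTreeSet<Ipv6Addr>) -> Vec<NdpOp> {
    let mut ops: Vec<NdpOp> = installed
        .difference(desired)
        .map(|addr| NdpOp::Del(*addr))
        .collect();
    ops.extend(desired.difference(installed).map(|addr| NdpOp::Add(*addr)));
    ops
}
```

When the prefix is cleared, `desired` is empty and every tracked entry is withdrawn; when nothing changed, the plan is empty and the sync is a no-op.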

Document the new IPv6 delegation: a docs/src/ipv6.md page (linked in SUMMARY), the `set-ipv6` command in the CLI reference, its generated manpage, and a CHANGELOG entry. Bump start-tunnel 1.1.0 -> 1.2.0.

Add an IPv6 card to the tunnel settings page: shows the current routed prefix, validates and saves a new one (`set-ipv6`), and can disable it. Wires `setIpv6` through the api service (live + mock) and seeds the mock db with `wg.ipv6`. Regenerates the tunnel TS bindings (SetIpv6Params, WgServer.ipv6). Also syncs Cargo.lock for the 1.2.0 bump.

- resync_v6: guard with a dedicated v6_lock, mirroring egress_lock, so concurrent config changes can't leave the tracked proxy-NDP map out of sync with the kernel neighbor table (major).
- client config: stop advertising an IPv6 DNS server (<prefix>::1). The DNS proxy binds IPv4 only, and AAAA resolves fine over it; a dead v6 DNS entry caused latency/failures for v6-preferring stub resolvers (major).
- set-ipv6: also reject link-local prefixes (fe80::/10); a fat-fingered fe80::/64 would otherwise persist and break all client IPv6.
- UI validator: require an explicit /prefix in [0,128] (IpNet.parse does not enforce it, so a bare address slipped through to a backend error).
- docs: correct the CLI clearing syntax (omit --prefix, not "--prefix null"); note inbound IPv6 hosting is not yet supported; drop the v6 DNS claim; move the changelog entry under a [1.2.0] heading.

…r-PSK persistence

StartOS applied policy routing only to IPv4, so NetworkManager's forced full-tunnel `::/0` captured the host's entire IPv6 default route into any imported WireGuard gateway. A tunnel that carried an IPv6 address (e.g. a StartTunnel with a delegated prefix) but couldn't route IPv6 blackholed all of the box's IPv6, and a v4-only commercial VPN selected as the default outbound leaked IPv6 straight out the ISP link.

Mirror the IPv4 policy-routing layer for IPv6 (NAT/reply-routing omitted; IPv6 has no NAT here):

- wifi.rs: `ip -6 rule` pref 1000/1100 (main/default) above NM's per-tunnel `::/0` rules, plus a pref-1200 terminal blackhole so v6 with no usable route is dropped instead of falling through to NM's capture.
- apply_policy_routing_v6: populate each managed interface's v6 table (`1000 + ifindex`) with main's non-default routes plus a default: a real route when the interface can carry v6, else `blackhole default` so a non-v6 gateway selected as the default outbound drops v6 (leak guard).
- apply_default_outbound: install the v6 priority-74/75 rules (the desired set is family-agnostic, reconciled per family via new snapshot/reconcile helpers).
- gc_policy_routing: flush the v6 table for removed interfaces.

A gateway carries the box's IPv6 only when selected as the outbound gateway, exactly like IPv4: no hijack, no leak.

Also fix the in-place WireGuard update path (`Update2` + `Reapply`), which persisted the interface private-key but silently dropped each peer's preshared-key, so a re-issued PSK-using tunnel failed its handshake and went dead (taking tunnel-routed DNS with it). Flag the peer PSK system-owned (`preshared-key-flags = 0`) so Update2 persists it, matching AddAndActivateConnection on the add path.

A device with an IPv6 assignment routes all its IPv6 full-tunnel (`AllowedIPs = ::/0`), so a prefix delegated on a server that can't actually route IPv6 just blackholes the device's IPv6. `set-ipv6` now hard-errors (leaving the config unchanged) when the server has no IPv6 default route, and logs an actionable warning when the prefix is neither on-link on a WAN interface nor otherwise verifiable — catching a misconfigured VPS at set-time instead of on the device. Adds a `has_ipv6_default_route` helper.

Rename LAN IP/WAN IP -> LAN IPv4/WAN IPv4 and IP Range -> IPv4 Range so the tables read unambiguously once per-subnet IPv6 columns are added.

Drop the single global `WgServer.ipv6` in favor of an optional per-subnet `WgSubnetConfig.ipv6`, so a server with multiple disjoint IPv6 allocations can point different subnets at different prefixes. Allocation simplifies to one `/128` per host with the tunnel IPv4 embedded (`prefix-network | v4`): uniform for the server and every client, stable, allocation-free, and UI-computable. No per-device /64 delegation (StartOS containers use NAT6).

Backend:

- wg6: replace the 3-case client_v6/server_addr/ClientV6 with one host_v6.
- wg/db: remove WgServer.ipv6; add WgSubnetConfig.ipv6 (serde default, no migration). Server/peer/client configs source v6 per subnet.
- api: replace the top-level `set-ipv6` with `subnet <net> set-ipv6`, carrying the egress + deliverability validation keyed on that subnet's prefix. show_config derives the client /128 from its subnet.
- context: resync_v6 iterates per subnet (drops the global running index).
- i18n: about.set-tunnel-ipv6 -> about.set-subnet-ipv6.

Frontend:

- Remove the Settings IPv6 card; add the prefix to the subnet Add/Edit dialog + a subnets-table column; setSubnetIpv6 in the api services.
- Devices tables gain an IPv6 column computed in the UI from the subnet's prefix + the device's v4 (mirrors host_v6).

Docs/bindings/manpage regenerated for the per-subnet surface.

…ixes

Adversarial review of the per-subnet IPv6 diff surfaced three real issues:

- **Library bug (root cause).** `IpAddress` rendered IPv6 by joining raw decimal octets (e.g. "32:1:13:184:…" instead of "2001:db8::…"), and `fromOctets`'s 16-octet path spun forever on a no-op `unshift()` when the 9th octet was 0, so `zero()`/`fromOctets`/`.address` on any computed v6 hung the browser or produced garbage. Replace both the `.address` getter and `fromOctets` v6 paths with one correct `renderIpv6` (eight hex groups, longest zero-run collapsed to `::`). The tunnel devices IPv6 column now uses the library directly. Verified across all web projects (npm run check); no consumer regressed.
- **Out-of-prefix addresses.** `host_v6` OR's the 32-bit IPv4 into the low bits, which only stays in-prefix for prefixes /96 or shorter; a /124 (or any >/96) escaped the delegated block silently. `set_subnet_ipv6` now rejects prefixes longer than /96, and the web validator mirrors the bound. Added a wg6 boundary test.
- **Stale comments.** Drop the "/64 delegation" / "per-client /64" asides left over from the old model in wg.rs and context.rs.
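The rendering rule named in the library fix (eight hex groups, longest zero-run collapsed to `::`) is small enough to sketch. The real `renderIpv6` is TypeScript in the web library and is not shown in this thread; the version below is a Rust transliteration of the rule as described, with `render_ipv6` as a hypothetical name. It follows the RFC 5952 conventions that a lone zero group is not collapsed and that the first run wins a tie.

```rust
/// Render 16 raw octets as compressed IPv6 text.
/// Illustrative sketch; not the library's implementation.
fn render_ipv6(octets: [u8; 16]) -> String {
    // Pair the octets into eight big-endian 16-bit groups.
    let mut groups = [0u16; 8];
    for (i, group) in groups.iter_mut().enumerate() {
        *group = u16::from_be_bytes([octets[2 * i], octets[2 * i + 1]]);
    }

    // Find the longest run of zero groups; the first run wins a tie.
    let (mut best_start, mut best_len) = (0usize, 0usize);
    let mut i = 0;
    while i < 8 {
        if groups[i] == 0 {
            let start = i;
            while i < 8 && groups[i] == 0 {
                i += 1;
            }
            if i - start > best_len {
                best_start = start;
                best_len = i - start;
            }
        } else {
            i += 1;
        }
    }

    // Lowercase hex without leading zeros, joined by ':'.
    let hex = |gs: &[u16]| {
        gs.iter()
            .map(|g| format!("{g:x}"))
            .collect::<Vec<_>>()
            .join(":")
    };

    // A single zero group is written as "0", never as "::".
    if best_len < 2 {
        return hex(&groups[..]);
    }
    format!(
        "{}::{}",
        hex(&groups[..best_start]),
        hex(&groups[best_start + best_len..])
    )
}
```

The octets `32, 1, 13, 184` from the bug report are 0x20, 0x01, 0x0d, 0xb8, so the first two groups render as `2001:db8` rather than `32:1:13:184`.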

- Clamp the tunnel IPv4 to the prefix's host space in `host_v6` instead of rejecting prefixes longer than /96. A /64 keeps the whole IPv4; a smaller block (e.g. a /124) keeps only its low host bits, so the address stays in-prefix. Drop the >/96 rejection in `set_subnet_ipv6` and the matching cap in the web validator; mirror the clamp in the UI's device-IPv6 computation. A /124 now validates and works.
- Replace the real host/prefix that had crept into tests and docs with documentation-range values (RFC 3849 `2001:db8::`, RFC 2606 example.com).

Verified: cargo test (22 tunnel tests, incl. a /124 case and every prefix length staying in-prefix), UI computation matches host_v6 at runtime, check:tunnel, build:tunnel, prettier.
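The clamp described in this commit can be sketched as masking the IPv4 with the prefix's host mask before OR-ing it in. This is an illustration of the idea rather than the real `host_v6`; the function names and the example tunnel IPv4 `10.59.0.2` are assumptions.

```rust
use std::net::{Ipv4Addr, Ipv6Addr};

/// Mask with the host bits of a /`len` prefix set (all zero for /128 and above).
fn host_mask(len: u8) -> u128 {
    u128::MAX.checked_shr(u32::from(len)).unwrap_or(0)
}

/// Embed the tunnel IPv4 in the prefix, keeping only the bits that fit
/// in the prefix's host space so the result never leaves the prefix.
fn host_v6_clamped(prefix: Ipv6Addr, len: u8, v4: Ipv4Addr) -> Ipv6Addr {
    let mask = host_mask(len);
    let net = u128::from(prefix) & !mask;
    let host = u128::from(u32::from(v4)) & mask;
    Ipv6Addr::from(net | host)
}

/// True when `addr` lies inside `prefix`/`len`.
fn in_prefix(addr: Ipv6Addr, prefix: Ipv6Addr, len: u8) -> bool {
    let mask = host_mask(len);
    (u128::from(addr) & !mask) == (u128::from(prefix) & !mask)
}
```

On `2001:db8::/64` the whole IPv4 survives (`10.59.0.2` gives `2001:db8::a3b:2`); on `2001:db8::10/124` only the low four bits survive, giving `2001:db8::12`, which is still inside the block.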

Every host on a subnet gets a /128 out of the subnet's prefix with its tunnel IPv4 clamped to the host space; on a block smaller than the IPv4 that can leave two devices sharing an address. Two devices must never share one, so enforce uniqueness:

- wg6: `v6_conflict` / `first_v6_collision` helpers (+ unit test).
- add_device: auto-assign skips any IP whose IPv6 collides with the server (.1) or an existing device; an explicit colliding IP is rejected with a message naming the conflict. All inside the mutate, so atomic.
- set_subnet_ipv6: reject a prefix that can't give every existing host a distinct address, checked inside the mutate so a concurrent add can't slip a colliding device in between check and write.
- UI getIp: the suggested IP is IPv6-aware, so it never proposes a colliding address; a hand-typed one surfaces the backend error.
- docs: note the uniqueness requirement.

A /64 never collides (full IPv4 fits); this only bites on small blocks.
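A uniqueness check of this kind can be sketched by deriving each host's address and remembering which IPv4 produced it. The signatures of the PR's `v6_conflict` / `first_v6_collision` helpers are not shown in this thread, so `find_collision` below is a hypothetical stand-in, and the tunnel IPv4s are example values.

```rust
use std::collections::HashMap;
use std::net::{Ipv4Addr, Ipv6Addr};

/// Derive a host's /128: prefix network bits plus the IPv4 clamped to the host space.
fn host_v6(prefix: Ipv6Addr, len: u8, v4: Ipv4Addr) -> Ipv6Addr {
    let mask = u128::MAX.checked_shr(u32::from(len)).unwrap_or(0);
    Ipv6Addr::from((u128::from(prefix) & !mask) | (u128::from(u32::from(v4)) & mask))
}

/// Return the first pair of hosts (earlier, later) whose derived IPv6 addresses coincide.
/// Hypothetical helper for illustration only.
fn find_collision(
    prefix: Ipv6Addr,
    len: u8,
    hosts: &[Ipv4Addr],
) -> Option<(Ipv4Addr, Ipv4Addr)> {
    let mut seen: HashMap<Ipv6Addr, Ipv4Addr> = HashMap::new();
    for &v4 in hosts {
        let v6 = host_v6(prefix, len, v4);
        if let Some(&earlier) = seen.get(&v6) {
            return Some((earlier, v4));
        }
        seen.insert(v6, v4);
    }
    None
}
```

On a /124, `10.59.0.1` and `10.59.0.17` share the low four bits (0x1), so they collide; on a /64 the same hosts stay distinct because the full IPv4 fits.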

Mark inbound v6 connections by ingress interface (nft mangle in table ip6 startos), restore the mark on replies, and route via a priority-50 fwmark rule — so a reply to an inbound IPv6 connection that arrived over a tunnel (terminated on the host or DNAT'd to a service container) routes back out that interface. The v6 reply-routing layer was previously omitted, so those replies had no route back and were blackholed: inbound IPv6 over a tunnel was dead. Remove the terminal pref-1200 v6 blackhole; the v6 default is chosen by metric like v4, and leak prevention stays per-gateway — a v6-incapable gateway selected as the default outbound gets a blackhole default in its own table, reached via the pref-75 catch-all. gc_policy_routing now cleans the pref-50 rule and per-interface table in both families. Validated live device<->tunnel: host-terminated and DNAT'd-container inbound replies route back; a marked packet routes to its own table authoritatively over NetworkManager's ::/0 capture.

1.1.0 has not shipped, so the per-subnet IPv6 work tagged 1.2.0 belongs in 1.1.0. Collapse the version (Cargo.toml + Cargo.lock) and merge the changelog section. Update the IPv6 entry: device-side inbound hosting over IPv6 now works with StartOS 0.4.0-beta.10.

helix-nine marked this pull request as ready for review July 3, 2026 02:07

dr-bonez force-pushed the helix/tunnel-ipv6 branch from d21dace to b4a3e4b Compare July 3, 2026 19:09

dr-bonez closed this Jul 3, 2026

dr-bonez force-pushed the helix/tunnel-ipv6 branch from 7a16ed2 to 77a97e1 Compare July 3, 2026 23:14

dr-bonez reopened this Jul 3, 2026

helix-nine and others added 12 commits July 4, 2026 06:25

docs(tunnel): IPv6 page, CLI reference, changelog, version bump

e71db4f


feat(tunnel): mark IPv4-specific columns in device and subnet tables

b59f70f


helix-nine force-pushed the helix/tunnel-ipv6 branch from d3535be to 4a3fa57 Compare July 4, 2026 06:28

dr-bonez added 2 commits July 4, 2026 09:17

dr-bonez approved these changes Jul 4, 2026


dr-bonez merged commit 48c4b0c into master Jul 4, 2026
19 checks passed

dr-bonez deleted the helix/tunnel-ipv6 branch July 4, 2026 15:37

helix-nine mentioned this pull request Jul 4, 2026

feat(tunnel): IPv6 GUA pinholes — PCP v6 + manual forwards (PR2) #3403

Draft

3 tasks

dr-bonez merged 14 commits into master from helix/tunnel-ipv6

2 participants
