Commit 89fa719

proposals(idaptik): server-authoritative-determinism architecture note + diagram
Why ReScript->AffineScript->wasm unlocks deterministic rollback netcode (the same vm.wasm on client + server), grounded with the multiplayer migratability tally, gamer-facing examples, and a language matrix (not Elixir-specific).

Threads, grounded in the named repos:

- SNIFS (github/hyperpolymath/snifs) = Safe NIFs (wasm-in-BEAM via wasmex) IS the safe server-side embedding, not a threat to it.
- Burble data channel (protobuf) as a partial P2P string-offload around the AS string gap (non-authoritative comms only).
- three-runtime debugging concrete anchor (cross-runtime causal replay by tick).

Includes server-authoritative-determinism.svg.

https://claude.ai/code/session_01WoKhFQePiRsAj7aqnxbG8s
1 parent 869a0b3 commit 89fa719

2 files changed

Lines changed: 339 additions & 0 deletions

Lines changed: 244 additions & 0 deletions
@@ -0,0 +1,244 @@
1+
// SPDX-License-Identifier: MPL-2.0
2+
// SPDX-FileCopyrightText: 2025-2026 Jonathan D.A. Jewell <j.d.a.jewell@open.ac.uk>
3+
= Server-authoritative determinism — the same wasm on both sides
4+
:toc: macro
5+
6+
[IMPORTANT]
7+
====
8+
*Status: ARCHITECTURE NOTE / PROPOSAL.* Analysis of why the
9+
ReScript→AffineScript→wasm migration *unlocks* a class of multiplayer
10+
netcode that ReScript→JS made impractical, why it is not Elixir-specific,
11+
and the open threads it raises (Burble string-offload, safe-NIF embedding,
12+
three-runtime debugging). Staged in affinescript (MPL); subject is idaptik
13+
(AGPL). A companion SVG (`server-authoritative-determinism.svg`) renders
14+
the two diagrams below.
15+
====
16+
17+
toc::[]
18+
19+
== The thesis in one line
20+
21+
The *server's* simulation is the single source of truth (authoritative);
22+
every machine runs the *same deterministic* simulation; so a client can
23+
*predict ahead* of the server and, when it guesses wrong, *rewind and
24+
replay* to land exactly on the server's result — no drift, no cheating, no
25+
laggy feel.
26+
27+
Three legs: **authority** (server decides), **prediction** (client
28+
simulates locally so input feels instant), **determinism** (same inputs +
29+
same start state → bit-identical result everywhere). Determinism is the
30+
load-bearing leg — and AffineScript→wasm makes it nearly free.
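
To make the determinism leg concrete, here is a minimal sketch in Python standing in for the real `vm.wasm`. The toy world `(x, vx)`, the input bit layout, and the `step`, `run`, and `state_hash` helpers are all hypothetical; the only point is that integer-only stepping from the same start state with the same inputs gives the same hash on every machine.

```python
import hashlib
import struct

MASK32 = 0xFFFFFFFF  # keep arithmetic in a fixed 32-bit range, as integer wasm would


def step(state, input_bits):
    """Advance a toy world (x, vx) by one tick using integer maths only."""
    x, vx = state
    if input_bits & 0b01:  # bit 0 (hypothetical layout): accelerate right
        vx = (vx + 1) & MASK32
    if input_bits & 0b10:  # bit 1 (hypothetical layout): accelerate left
        vx = (vx - 1) & MASK32
    x = (x + vx) & MASK32
    return (x, vx)


def run(start, inputs):
    """Replay a list of per-tick inputs from a start state."""
    state = start
    for bits in inputs:
        state = step(state, bits)
    return state


def state_hash(state):
    """Hash a canonical byte encoding of the state (two little-endian u32)."""
    return hashlib.sha256(struct.pack("<II", *state)).hexdigest()


inputs = [0b01, 0b01, 0b00, 0b10, 0b00]
client = run((0, 0), inputs)  # "client" and "server" are just two independent runs
server = run((0, 0), inputs)
assert state_hash(client) == state_hash(server)
```

Comparing hashes rather than whole states is what keeps the server-to-client confirmation small.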
31+
32+
== Diagram 1 — same binary, both sides
33+
34+
----
35+
┌──────────── CLIENT (browser) ───────────┐ ┌──────────── SERVER (Elixir / OTP) ──────────┐
36+
│ my input ─▶ [ vm.wasm ] ─▶ predict ─▶ Pixi│ │ collect ALL inputs ─▶ [ vm.wasm ] ─▶ step │
37+
│ ▲ │ │ (authoritative) via wasmex │
38+
│ rollback + replay │ │ per-entity GenServer + OTP supervision │
39+
└───────────────────────────┬───────────────┘ └───────────────┬─────────────────────────────┘
40+
│ inputs (tick, player, bits) │
41+
└──────────────▶ Phoenix Channel ◀─────┘
42+
authoritative state / hash @ tick T
43+
╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
44+
the IDENTICAL AffineScript-compiled vm.wasm runs in BOTH boxes
45+
----
46+
47+
Only *inputs* cross client→server (tiny), and *authoritative state (or a
48+
hash)* crosses server→client. Determinism is what lets you ship inputs
49+
instead of whole world-states.
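
As a rough illustration of how small an input message can be, the sketch below packs the `(tick, player, bits)` triple from the diagram into a fixed-width record. The field widths (u32 tick, u8 player, u16 input bits) are an assumption for this example, not the project's actual wire format.

```python
import struct

# Hypothetical wire layout: u32 tick, u8 player, u16 input bits, little-endian.
INPUT_FORMAT = "<IBH"


def encode_input(tick, player, bits):
    """Pack one player's input for one tick into a fixed-size record."""
    return struct.pack(INPUT_FORMAT, tick, player, bits)


def decode_input(payload):
    """Unpack a record back into (tick, player, bits)."""
    return struct.unpack(INPUT_FORMAT, payload)


packet = encode_input(11, 2, 0b0101)
assert len(packet) == struct.calcsize(INPUT_FORMAT)
assert decode_input(packet) == (11, 2, 0b0101)
```

Under this assumed layout each input is seven bytes, regardless of how large the simulated world is.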
50+
51+
== Diagram 2 — predict → confirm → reconcile
52+
53+
----
54+
ticks → 8 9 10 11 12 13 14 ← client's predicted "now"
55+
CLIENT: [C] [C] [C] [P] [P] [P] [P] C = confirmed P = predicted (my inputs only)
56+
57+
SERVER confirms tick 11 — but it includes PLAYER B's input the client never predicted
58+
59+
▼ predicted@11 ≠ authoritative@11 → MISMATCH
60+
61+
RECONCILE: 1. rewind to 10 (last confirmed)
62+
2. re-apply now-known inputs (mine + B's) for 11..14
63+
3. replay forward → corrected "now" @14
64+
(determinism ⇒ this lands exactly on the server's result)
65+
----
66+
67+
If the prediction was right (the common case) the compare matches and the
68+
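player never notices. A misprediction costs a rewind and a replay of the
unconfirmed ticks (11..14 here), done within a single frame, so the player
sees at most a one-frame correction, not a stall.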
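
The predict, confirm, reconcile loop above can be sketched as follows. This is a toy model under stated assumptions: the state is a single integer, `step` is a hypothetical deterministic tick, and the server confirms ticks one at a time in order. It is not the project's client code.

```python
MASK32 = 0xFFFFFFFF


def step(state, inputs):
    """Toy deterministic tick: inputs maps player id -> integer move."""
    return (state + sum(inputs[p] for p in sorted(inputs))) & MASK32


class Client:
    def __init__(self, confirmed_tick, confirmed_state):
        self.confirmed_tick = confirmed_tick
        self.confirmed_state = confirmed_state
        self.pending = {}  # tick -> {player: move} for ticks not yet confirmed

    def predict(self, tick, my_id, move):
        """Record my own input for a future tick (other players are unknown)."""
        self.pending.setdefault(tick, {})[my_id] = move

    def current(self):
        """Replay all unconfirmed ticks on top of the last confirmed state."""
        state = self.confirmed_state
        for tick in sorted(self.pending):
            state = step(state, self.pending[tick])
        return state

    def on_confirm(self, tick, all_inputs, server_state):
        """Apply the server's confirmation; return True if a rollback occurred."""
        assert tick == self.confirmed_tick + 1
        predicted = step(self.confirmed_state, self.pending.pop(tick, {}))
        replayed = step(self.confirmed_state, all_inputs)
        assert replayed == server_state  # same step function, same inputs
        self.confirmed_tick, self.confirmed_state = tick, replayed
        return predicted != server_state


client = Client(confirmed_tick=10, confirmed_state=100)
for t in range(11, 15):
    client.predict(t, "A", 1)          # ticks 11..14 predicted with my inputs only
before = client.current()
rolled_back = client.on_confirm(11, {"A": 1, "B": 5}, server_state=106)
after = client.current()               # ticks 12..14 replayed on the corrected base
```

Here `current()` performs the replay: after the confirmation it restarts from the corrected tick 11 and re-applies the still-predicted ticks 12..14.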
70+
71+
== Diagram 3 — why ReScript couldn't, and what the real requirement is
72+
73+
----
74+
CLASSIC netcode (brittle) AFFINESCRIPT → wasm (by construction)
75+
───────────────────────── ─────────────────────────────────────
76+
client sim server sim client server
77+
(JS / ReScript) (rewritten in Elixir) [ vm.wasm ] ===== [ vm.wasm ]
78+
└─ must match BIT-FOR-BIT, by hand same bytes → same result, always
79+
float rounding / map order / overflow compile ONCE, run both places
80+
→ silent divergence → DESYNC (browser + Elixir via wasmex/wasmtime NIF)
81+
----
82+
83+
In the classic world you maintain *two* implementations that must agree to
84+
the last bit forever; one float-rounding difference between V8 and the BEAM
85+
and you desync. AffineScript compiles to one `vm.wasm` you run in both
86+
places, so "the two sims match" is true *by construction*.
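
One way two hand-written sims drift apart is summation order, for example when each side iterates a map differently. The sketch below uses Python floats as a stand-in for any IEEE-754 double implementation and compares them with an integer fixed-point alternative; the values and the milli-unit scale are made up for illustration.

```python
def total_float(values):
    """Accumulate forces as doubles, in the order given."""
    acc = 0.0
    for v in values:
        acc += v
    return acc


def total_fixed(values_milli):
    """Accumulate the same forces as integers in milli-units."""
    acc = 0
    for v in values_milli:
        acc += v
    return acc


forces = [0.1, 0.2, 0.3]
forces_milli = [100, 200, 300]

# Same numbers, different iteration order on each side.
float_a = total_float(forces)
float_b = total_float(list(reversed(forces)))
fixed_a = total_fixed(forces_milli)
fixed_b = total_fixed(list(reversed(forces_milli)))

assert float_a != float_b   # float addition is not associative
assert fixed_a == fixed_b   # integer addition is
```

A difference in the last bit is enough for state hashes to disagree, which is why the integer path is attractive for the synced simulation.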
87+
88+
== Every gamer knows this (the examples)
89+
90+
[cols="1,2a",options="header"]
91+
|===
92+
| What gamers say | The concept it is
93+
94+
| *"GGPO / rollback netcode is so good"* (Guilty Gear Strive, Skullgirls, Street Fighter 6)
95+
| This *is* predict + rollback + replay. It feels like magic *because* the
96+
sim is deterministic — the praised netcode is literally this mechanism.
97+
98+
| *"I SHOT him, he was behind the wall!"* (CS, Valorant, Apex — peeker's advantage)
99+
| Prediction vs authority under latency: you acted on your predicted frame;
100+
the server's authoritative timeline disagreed. The eternal complaint is
101+
the reconciliation gap made visible.
102+
103+
| *Rubber-banding / teleporting players*
104+
| Reconciliation snapping a client to the authoritative position after a
105+
misprediction (or packet loss). The "warp" is the correction.
106+
107+
| *"This guy's on wifi"* (Smash Ultimate desync/lag jokes — "why is my Falco teleporting")
108+
| Latency with no prediction to hide it = the experience you are trying to *kill*.
Delay-based netcode, which holds every frame until the remote input arrives, is the thing gamers mock.
110+
111+
| *Dark Souls backstab / "phantom hit" desyncs* (P2P, no authoritative host)
112+
| The horror of *no authority + weak determinism*: two peers each think
113+
they're right, neither can reconcile. This is the architecture you are
114+
*avoiding*.
115+
|===
116+
117+
The punchline: determinism + same-binary + reversibility is the recipe
behind the netcode gamers *praise* (GGPO rollback), and its *absence* gives
the netcode they *mock* (delay-based, P2P backstab desync, wifi Falco).
120+
121+
== Is this Elixir-specific? (no)
122+
123+
It is a *determinism + shared-artifact* property, not an Elixir property.
124+
The requirement is: **a deterministic simulation that the same artifact can
125+
run on both the client and the authoritative host.**
126+
127+
[cols="2,1,3a",options="header"]
128+
|===
129+
| Stack | Works? | Why
130+
| ReScript → JS + Elixir | ✗ | JS float/GC nondeterminism; and you can't run the *same artifact* server-side — you'd reimplement the sim in Elixir (the two-sims problem).
131+
| *AffineScript → wasm* + Elixir | ✓✓ | Deterministic integer wasm both sides + affine no-aliasing + *reversible* VM (rollback = step backward).
132+
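| Rust → wasm + Elixir | ✓ | Deterministic if you avoid float and reliance on `HashMap` iteration order; same wasm both sides (Rust-native via a Rustler NIF also works server-side, but then the two sides no longer run the identical artifact). A very common, strong combo.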
133+
| C/C++/Zig → wasm + any host | ✓ | Deterministic if you avoid float / UB / map-order. wasm is the shared artifact.
134+
| "Elixir sim on both sides" | ✗ (browser) | BEAM doesn't run in the browser; Gleam→JS puts you back on JS-client nondeterminism.
135+
|===
136+
137+
**wasm is the lingua franca** that makes "same binary both sides" possible.
138+
Elixir's role is just a *great authoritative host* (per-entity processes,
139+
OTP supervision, `wasmex` to embed the module) — swap Elixir for a
140+
Rust/Go server and the determinism story is unchanged. So: *Rust+Elixir,
141+
Rust+Rust, AffineScript+Elixir* all work; *ReScript+anything* doesn't,
142+
because it can't give you the shared deterministic artifact.
143+
144+
== Honest caveats
145+
146+
* The synced sim must be **pure**: no wall-clock, unseeded RNG, or host
147+
effects in the stepped path (exactly the effect-codegen wall — networking
148+
and time stay *outside* the VM). AffineScript *enforces* this rather than
149+
hoping for it.
150+
* The VM's I/O ports must be **fed identically** server-side (the server's
151+
wasm consumes recorded inputs, it does not make live host calls).
152+
* `wasmex`/`wasmtime` embedding exists today, but you should **batch ticks**
across the NIF boundary to amortise call overhead, and validate the setup
under load.
154+
* Determinism removes *desync*, not *latency* — you still need input-delay /
155+
a rollback window.
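
The purity caveat can be illustrated with a step function whose only source of randomness is a PRNG word carried inside the state. This is a sketch: `xorshift32` is a standard generator chosen here for brevity, and the `(rng, hp)` state and the damage rule are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def xorshift32(x):
    """One round of the 32-bit xorshift generator (x must be non-zero)."""
    x &= MASK32
    x ^= (x << 13) & MASK32
    x ^= x >> 17
    x ^= (x << 5) & MASK32
    return x & MASK32


def step(state, input_bits):
    """Pure tick: the result depends only on (state, input_bits).

    No wall-clock, no global RNG, no host calls. The PRNG word travels in
    the state, so rewinding the state also rewinds the randomness.
    """
    rng, hp = state
    if input_bits & 0b1:  # hypothetical "take a hit" input bit
        rng = xorshift32(rng)
        hp = max(0, hp - (rng % 6 + 1))
    return (rng, hp)


start = (1, 100)
assert step(start, 0b1) == step(start, 0b1)  # replay gives the same result
assert step(start, 0b0) == start             # no input, no hidden change
```

Because the generator state is part of the snapshot, a rollback to tick T reproduces the same "random" outcomes on replay.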
156+
157+
== Open threads (raised, not yet resolved)
158+
159+
=== T1 — Burble's data channel to offload the string gap
160+
161+
CONFIRMED from `github.com/hyperpolymath/burble`: a *media plane* (WebRTC
162+
RTP/SRTP voice), an Elixir/OTP *control plane* (auth, rooms, presence,
163+
signaling), a P2P *data channel* (the burble proof-spec: a "bidirectional
164+
AI agent data channel" exchanging JSON over the same connection as voice),
165+
and a *Protobuf-defined wire protocol* shared by server and clients. (Your
166+
three-channel framing — voice/chat, an LLM channel, an "rtsm" real-time
167+
state channel — maps onto media-plane + two uses of the data channel; the
168+
exact `rtsm` name isn't in the public ARCHITECTURE.adoc I could read.)
169+
170+
Idea: route the game's *stringy* comms (chat, names, AI/LLM text) over
171+
Burble's data channel so the AffineScript sim never touches them — routing
172+
strings *around* the AS string-gap, not through it.
173+
174+
* *Strong fit:* the data channel is **Protobuf**, not ad-hoc string
175+
parsing — structured, length-delimited, integer-tagged. That is exactly
176+
the shape AffineScript likes at a boundary; the AS sim can read protobuf
177+
field tags/ints without needing variable-string ops.
178+
* *Works for:* non-authoritative peer strings (chat, the LLM channel),
alongside voice on the media plane. In brain/senses terms: AS is the integer
*brain* behind Phoenix, and Burble carries the peer string/voice *senses*.
181+
* *Does NOT replace the authoritative path:* game-*affecting* strings must
182+
still traverse Phoenix→Elixir→wasm (authority + determinism), not a P2P
183+
side-channel.
184+
* *Integration cost is real:* Burble needs signaling/discovery (its Elixir
185+
control plane, or Groove); adding it as a game dependency is a second
186+
transport. Verdict: a good *partial* offload for the comms layer, the
187+
protobuf wire is a bonus — but not a substitute for the variable-string
188+
backend on the authoritative path.
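
To show why a Protobuf wire suits an integer-only consumer, the sketch below walks the standard Protobuf wire encoding (varint keys, wire type 0 for varints, wire type 2 for length-delimited fields) and collects integer fields while stepping over strings by length alone. The message bytes and field numbers are invented for the example and are not Burble's schema.

```python
def read_varint(buf, pos):
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear marks the last byte
            return value, pos
        shift += 7


def integer_fields(buf):
    """Collect varint fields; skip length-delimited ones without decoding them."""
    pos = 0
    out = {}
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field, wire_type = key >> 3, key & 0x07
        if wire_type == 0:    # varint
            out[field], pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited: strings, bytes, sub-messages
            length, pos = read_varint(buf, pos)
            pos += length     # contents are never inspected
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
    return out


# Hypothetical message: field 1 = 150, field 2 = "hi", field 3 = 7.
message = bytes.fromhex("089601" "12026869" "1807")
fields = integer_fields(message)
```

The string in field 2 is skipped purely by its length prefix, so the consumer needs no variable-length string operations to reach the integers around it.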
189+
190+
=== T2 — "snifs instead of nifs": it doesn't dismantle this — it IS the server side
191+
192+
CONFIRMED from `github.com/hyperpolymath/snifs`: *SNIFS = Safe NIFs* —
193+
native (Zig) code compiled to WebAssembly and run via `wasmex`/`wasmtime`,
194+
so guest faults (out-of-bounds, overflow, divide-by-zero, crashes) become
195+
`{:error, reason}` tuples instead of taking down the BEAM. Tagline:
196+
"WebAssembly sandboxing provides genuine crash isolation for BEAM NIFs."
197+
198+
This is not a threat to server-authoritative determinism — *it is the
199+
recommended way to do the server side of it.* The diagram's "server-side
200+
`vm.wasm` via wasmex" literally IS a SNIF. So:
201+
202+
* *Determinism is unaffected:* it comes from the *wasm module being
203+
identical both sides*. SNIFS runs that same wasm; the computation is
204+
bit-identical.
205+
* *SNIFS improves the architecture:* a runaway or faulting authoritative
206+
tick yields `{:error, reason}` (the entity's GenServer rejects/rolls back that
207+
input) instead of crashing the lobby — exactly the OTP-shaped recovery
208+
you want. Raw-NIF embedding gives determinism but not containment; SNIFS
209+
gives both.
210+
* *A convergence worth naming:* the determinism argument *wants* wasm on
211+
the server (the same-binary property); SNIFS *independently* wants wasm
212+
on the server (crash isolation). Same choice, two reasons. SNIFS is the
213+
production substrate for "vm.wasm via wasmex."
214+
215+
Net: use SNIFs, not raw NIFs — *same determinism, crash-contained
216+
authority*. Far from dismantling it, SNIFS is the piece that makes the
217+
server side safe to ship.
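
The containment behaviour described above can be modelled without any wasm at all. In this Python sketch a guest trap is represented by an exception, `safe_step` plays the role of the sandbox boundary that turns it into a tagged value, and `Entity` stands in for the per-entity GenServer. All names are hypothetical and nothing here calls `wasmex` or the SNIFS API.

```python
class GuestTrap(Exception):
    """Stand-in for a wasm trap (out-of-bounds, divide-by-zero, ...)."""


def guest_step(state, divisor):
    """Toy guest tick that traps on a zero divisor."""
    if divisor == 0:
        raise GuestTrap("integer divide by zero")
    return state // divisor


def safe_step(state, divisor):
    """Boundary wrapper: a trap becomes a tagged value, not a host crash."""
    try:
        return ("ok", guest_step(state, divisor))
    except GuestTrap as trap:
        return ("error", str(trap))


class Entity:
    """Holds the authoritative state and rejects inputs whose tick faulted."""

    def __init__(self, state):
        self.state = state
        self.rejected = []

    def apply(self, tick, divisor):
        tag, result = safe_step(self.state, divisor)
        if tag == "ok":
            self.state = result
        else:
            self.rejected.append((tick, result))  # state stays at last good value
        return tag


door = Entity(100)
results = [door.apply(1, 5), door.apply(2, 0), door.apply(3, 2)]
```

The faulting tick 2 is recorded and skipped, and tick 3 continues from the last good state.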
218+
219+
=== T3 — three-runtime debugging (concrete anchor)
220+
221+
A desync spans three runtimes. Concrete bug: *"Player A sees the door open;
222+
Player B sees it closed."* The door bit lives in the AS-wasm VM; the input
223+
was marshalled by the JS host; relayed/authorised by Elixir. Suspects:
224+
225+
. *AS↔JS ABI* — A's input mis-marshalled (wrong integer crossed the wasm boundary).
226+
. *JS↔Elixir codec* — schema drift (a field dropped in JSON).
227+
. *Elixir ordering* — input applied at the wrong tick.
228+
. *determinism break* — B's wasm ≠ A's wasm (shouldn't happen with the same binary).
229+
230+
One symptom, three runtimes, four suspects, *no single stack trace*. The
231+
concrete handle: a **unified trace keyed by `(tick, entityId, traceId)`**
232+
that every runtime emits, so the cross-runtime causal chain can be
233+
reconstructed — and because the VM is *reversible + deterministic*, the
234+
exact desync can be **replayed from recorded inputs** across all three
235+
runtimes (record-and-replay debugging end to end). A generalised
236+
debugging idea is worth testing against this anchor: *does it help
237+
reconstruct/replay the cross-runtime causal chain keyed by tick?*
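
A minimal sketch of what the unified trace could look like once events from all three runtimes are collected in one place. The event shape (dicts with `tick`, `entityId`, `traceId`, `runtime`, `peer`, `value`) and the within-tick runtime ordering are assumptions made for this example.

```python
from collections import defaultdict

# Assumed causal order of runtimes within one tick (an illustration only).
RUNTIME_ORDER = {"js": 0, "elixir": 1, "wasm": 2}


def causal_chain(events, trace_id):
    """All events for one traceId, ordered by tick and then by runtime."""
    chain = [e for e in events if e["traceId"] == trace_id]
    return sorted(chain, key=lambda e: (e["tick"], RUNTIME_ORDER[e["runtime"]]))


def first_divergence(events):
    """Earliest (tick, entityId) where wasm instances reported different values."""
    seen = defaultdict(set)
    for e in events:
        if e["runtime"] == "wasm":
            seen[(e["tick"], e["entityId"])].add(e["value"])
    diverged = sorted(key for key, values in seen.items() if len(values) > 1)
    return diverged[0] if diverged else None


events = [
    {"tick": 11, "entityId": "door", "traceId": "t1", "runtime": "wasm", "peer": "A", "value": 1},
    {"tick": 11, "entityId": "door", "traceId": "t1", "runtime": "js", "peer": "A", "value": 1},
    {"tick": 11, "entityId": "door", "traceId": "t1", "runtime": "elixir", "peer": "server", "value": 1},
    {"tick": 12, "entityId": "door", "traceId": "t2", "runtime": "wasm", "peer": "A", "value": 1},
    {"tick": 12, "entityId": "door", "traceId": "t2", "runtime": "wasm", "peer": "B", "value": 0},
]

chain = causal_chain(events, "t1")
where = first_divergence(events)
```

With the divergent tick identified, the recorded inputs up to that tick are the natural starting point for a replay.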
238+
239+
== Provenance
240+
241+
2026-06-04 AffineScript co-development session. Companion: the multiplayer
242+
architecture analysis in `proposals/idaptik/README.adoc` and the migratability
243+
tally of `src/app/multiplayer/*.res` (6 MIGRATABLE NOW / 5 EFFECT-GATED /
244+
2 STRING-GATED) that empirically confirms the brain/senses cleavage.
