Skip to content

Commit 938dcaf

Browse files
hyperpolymathclaude
andcommitted
feat(formal): real-lift R0 + scoping plan (RealWasm.v + REAL-LIFT.adoc)
Begins the "real lift" (PROOF-NEEDS.adoc section 6 Wave 3): replacing the toy proof models with the real AffineScript AST + real typed-WASM semantics. - formal/REAL-LIFT.adoc: the engineering plan, grounded in lib/ast.ml (the real ~30-constructor expr) and lib/wasm.ml (the real WebAssembly-1.0 IR). Defines the Coq module structure (RealWasm / RealWasmSem / RealAst / RealAstSem / RealCompile / RealPreservation) and a strict-superset fragment ladder R0->R1->R2->R-mem->R-float->R-str->R-call->R-wrap->R-eff, each axiom-free before the next, with the strategy for each hard part (loops/termination via fuel, linear memory, abstract floats, the post-typecheck elaboration nodes, effects gated on #555). - formal/RealWasm.v: R0, the first stone. Re-targets the ACTUAL lib/wasm.ml instr/value_type names (vs the toy K-1's ad-hoc machine) for the pure i32 numeric core, with a stack-machine wexec, an arity checker wcheck, and wexec_sound: arity-checked code never gets stuck — the target-side invariant the real K-1 preservation proof rests on. i32 modelled as Z (wrap deferred to rung R-wrap). Axiom-free. Wired into _CoqProject / justfile / .hypatia-ignore / README.adoc. PROOF-NEEDS.adoc: K-1 row + section 6 Wave 3 note R0 landed; formal/ now 16 files, 27 axiom-free closure reports. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01KPG9mEQXFyA3k7NWAzMNMr
1 parent ee5000d commit 938dcaf

7 files changed

Lines changed: 450 additions & 12 deletions

File tree

.hypatia-ignore

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -73,3 +73,5 @@ cicd_rules/vlang_detected:formal/QttDynamic.v
7373
cicd_rules/banned_language_file:formal/QttDynamic.v
7474
cicd_rules/vlang_detected:formal/F5_RenderFaithful.v
7575
cicd_rules/banned_language_file:formal/F5_RenderFaithful.v
76+
cicd_rules/vlang_detected:formal/RealWasm.v
77+
cicd_rules/banned_language_file:formal/RealWasm.v

docs/PROOF-NEEDS.adoc

Lines changed: 17 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -70,7 +70,7 @@ than the prose corpus suggests:
7070
`TypeSafety.v` is example lemmas about list length, not AffineScript. Not
7171
core-metatheory.
7272
* **`formal/`** — the directory #513 names as the mechanized-proof target now
73-
**exists** (Coq/Rocq 8.18), axiom-free throughout — **15 files, 26 `Print
73+
**exists** (Coq/Rocq 8.18), axiom-free throughout — **16 files, 27 `Print
7474
Assumptions` closure reports**, all "Closed under the global context". Codegen
7575
keystone: `K1_CodegenPreservation.v` (K-1 minimal) +
7676
`K1Let_CodegenPreservation.v` (K-1 grown with variables/`let`/environment).
@@ -221,8 +221,13 @@ The work in flight changes which obligations are *load-bearing* and which are
221221
(was `prose`): a Wave-0 mechanized seed — `formal/K1_CodegenPreservation.v`
222222
(Coq, no axioms) — proves preservation for a minimal nat/bool · add/and
223223
fragment compiled to a stack machine — `K1Let_CodegenPreservation.v` grows it
224-
with de Bruijn variables, `let`, and an environment; the real AffineScript AST
225-
+ real typed-WASM operational semantics remain the open obligation.
224+
with de Bruijn variables, `let`, and an environment. The **real lift** onto the
225+
real AffineScript AST + real typed-WASM semantics is scoped in
226+
`formal/REAL-LIFT.adoc` (fragment ladder R0→R-eff); its first stone is
227+
**landed** — `formal/RealWasm.v` (R0) re-targets the *actual* `lib/wasm.ml`
228+
`instr`/`value_type` with a stack-arity soundness theorem (checked code never
229+
traps). R1 (real `RealAst`/`RealAstSem`/`RealCompile` for the int/bool/let
230+
fragment) is the first real `⟦compile p⟧ = ⟦p⟧`.
226231
* **K-2 — Effect-soundness is *blocked*, not merely unproven.** P-6 cannot be
227232
honestly *stated against the current backend* because #555 drops handler arms
228233
on three of the codegen targets. The obligation must be split: (a) prove
@@ -389,13 +394,16 @@ F-1 (full transformer preservation) is non-trivial rather than a formality.
389394
| Wave 1
390395

391396
| 3
392-
| **The real lift (open — the bulk of the remaining work).** Replace the toy
393-
objects with the real ones: formalise the **real AffineScript AST** and the
394-
**real typed-WASM operational semantics**, then re-prove K-1/F-1/F-2 against
395-
them (this is what makes K-1/F-1 `XL`, and what actually resolves #601). Then
397+
| **The real lift (the bulk of the remaining work) — scoped + started.** Replace
398+
the toy objects with the real ones: formalise the **real AffineScript AST** and
399+
the **real typed-WASM operational semantics**, then re-prove K-1/F-1/F-2 against
400+
them (this is what makes K-1/F-1 `XL`, and what actually resolves #601). Planned
401+
as a strict-superset *fragment ladder* in `formal/REAL-LIFT.adoc`
402+
(R0→R1→R2→R-mem→R-float→R-str→R-call→R-wrap→R-eff); **R0 landed**
403+
(`formal/RealWasm.v` — real `lib/wasm.ml` IR, i32 core, arity soundness). Then
396404
the still-`prose`/`absent` obligations: P-5 (inference), P-6 (effects — blocked
397-
on #555/K-2), P-7 (resolution), P-8 (parser), P-10 (coherence); faces F-6/F-7;
398-
full-language borrowing (P-3) pending #553.
405+
on #555/K-2, = rung R-eff), P-7 (resolution), P-8 (parser), P-10 (coherence);
406+
faces F-6/F-7; full-language borrowing (P-3) pending #553.
399407
| Wave 2; #555; #553
400408
|===
401409

formal/README.adoc

Lines changed: 21 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -94,6 +94,11 @@ cannot mistake it. Coq has no `v.mod` manifest, so `vmod_detected` never fires.
9494
| **F-5** — `render_ty` per-face type renaming is injective (no two canonical
9595
types collapse onto one displayed string), incl. the Js/Cafe "null"/"?" overlap
9696
| **mechanized**, axiom-free
97+
98+
| `RealWasm.v`
99+
| **Real lift R0** (`REAL-LIFT.adoc`) — the real `lib/wasm.ml` IR (i32 core):
100+
`wexec` + arity checker + soundness (`wexec_sound`: checked code never traps)
101+
| **mechanized**, axiom-free
97102
|===
98103

99104
All mechanized theorems report *Closed under the global context* under `Print
@@ -215,12 +220,26 @@ the same canonical AST emit identical code). The observational version for the
215220
trailing-statement/tail split is mechanized separately in
216221
`invariant-path/proofs/SameCube.agda` (F-2).
217222

223+
== The real lift (R0)
224+
225+
`RealWasm.v` is the first foundation stone of the *real lift* — replacing the
226+
toy stack machine with the actual `lib/wasm.ml` IR. R0 covers the pure i32
227+
numeric core (real `instr` / `value_type` names), a stack-machine `wexec`, an
228+
arity checker `wcheck`, and the soundness theorem `wexec_sound`: arity-checked
229+
code never gets stuck — the target-side invariant the real K-1 preservation
230+
proof rests on. The full plan — the fragment ladder R0 → R1 → R2 → R-mem →
231+
R-float → R-str → R-call → R-wrap → R-eff, the Coq module structure, and the
232+
strategy for each hard part (loops/termination, linear memory, floats, the
233+
elaboration nodes, effects) — is `formal/REAL-LIFT.adoc`.
234+
218235
== Scope (honest)
219236

220-
These are Wave-0 seeds, not the discharged obligations. The fragments are
237+
These are Wave-0/1 seeds, not the discharged obligations. The fragments are
221238
deliberately small; the full theorems need the real AffineScript AST and the
222239
real typed-WASM operational semantics, expanded the way solo-core's
223-
Duet/Ensemble tracks expand Solo. Tracked in `docs/PROOF-NEEDS.adoc` and #513.
240+
Duet/Ensemble tracks expand Solo. That expansion has **begun** — see the real
241+
lift above (`RealWasm.v`, R0) and the plan in `formal/REAL-LIFT.adoc`. Tracked
242+
in `docs/PROOF-NEEDS.adoc` and #513.
224243

225244
== Checking
226245

formal/REAL-LIFT.adoc

Lines changed: 277 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,277 @@
1+
// SPDX-License-Identifier: CC-BY-SA-4.0
2+
// SPDX-FileCopyrightText: 2026 Jonathan D.A. Jewell (hyperpolymath)
3+
= The Real Lift — from toy seeds to the real AffineScript AST + typed-WASM
4+
Jonathan D.A. Jewell
5+
:toc: left
6+
:toclevels: 3
7+
:sectnums:
8+
:sectnumlevels: 2
9+
10+
This is the engineering plan for the single largest remaining body of proof work
11+
in `docs/PROOF-NEEDS.adoc` §6 Wave 3: replacing the deliberately-small models the
12+
`formal/` track proves today with the *real* objects, so that K-1 / F-1 / F-2 and
13+
the dynamic halves of P-2 / P-4 become discharged obligations rather than
14+
meaning-pinning seeds.
15+
16+
It is grounded in the actual compiler: the source AST is `lib/ast.ml`, the target
17+
IR is `lib/wasm.ml`, the compiler is `lib/codegen.ml` (+ `lib/codegen_gc.ml`),
18+
and the reference dynamic semantics is `lib/interp.ml`.
19+
20+
[IMPORTANT]
21+
====
22+
The lift is **XL** (CompCert-scale for a substantial language). It is delivered
23+
as a *ladder* of strict-superset fragments, each fully mechanized and axiom-free
24+
before the next — the way `K1 → K1Let` already grew, and the way solo-core's
25+
`Solo → Duet → Ensemble` is meant to expand. No milestone lands with an
26+
`Admitted`.
27+
====
28+
29+
== 1. The theorem
30+
31+
For the real compiler `compile : Ast.program → Wasm.wasm_module` and the real
32+
operational semantics on each side:
33+
34+
[source]
35+
----
36+
⟦ compile(p) ⟧_wasm = ⟦ p ⟧_source (for every well-typed p)
37+
----
38+
39+
K-1 is the backend half; F-1 (`⟦T_F(p)⟧ = ⟦T_canonical(canon(p))⟧`) composes the
40+
face transformers on top of it; F-2 (#601) is the observational corollary across
41+
faces. All three collapse to "grow K-1 onto the real objects, then re-chain the
42+
existing F-1/F-2 scaffolding."
43+
44+
== 2. The two real objects (what we are formalising)
45+
46+
=== 2.1 Source — `lib/ast.ml`
47+
48+
The real `expr` has ~30 constructors. Grouped by the order the ladder tackles
49+
them:
50+
51+
[cols="1,3"]
52+
|===
53+
| Group | Constructors
54+
55+
| Core values/binding
56+
| `ExprLit` (Int/Float/Bool/Char/String/Unit), `ExprVar`, `ExprLet`
57+
(carries `el_quantity : quantity option` — the QTT multiplicity), `ExprBlock`,
58+
`stmt` (`StmtLet`/`StmtExpr`/`StmtAssign`/`StmtWhile`/`StmtFor`)
59+
60+
| Control
61+
| `ExprIf`, `ExprMatch` (+ `match_arm`, guards), `ExprReturn`, `ExprBreak`,
62+
`ExprContinue`
63+
64+
| Functions
65+
| `ExprLambda` (+ `param` with `p_quantity`/`p_ownership`), `ExprApp`,
66+
top-level `fn_decl` (`FnBlock`/`FnExpr`/`FnExtern`)
67+
68+
| Aggregates (heap)
69+
| `ExprTuple`/`ExprTupleIndex`, `ExprArray`/`ExprIndex`, `ExprRecord`/`ExprField`/`ExprRowRestrict`
70+
71+
| Operators
72+
| `ExprBinary` (`binary_op`), `ExprUnary` (`unary_op`, incl. `OpRef`/`OpMutRef`/`OpDeref`)
73+
74+
| Elaboration nodes (post-typecheck, backend-only — `Typecheck.elaborate_string_concat`)
75+
| `ExprStringConcat`/`ExprStringEq`/`ExprStringRel` (the "string wall"),
76+
`ExprFloatBinary`/`ExprFloatArray`/`ExprFloatIndex` (the "float wall"),
77+
`ExprCellTuple`/`ExprCellTupleIndex`/`ExprCellRecord`/`ExprCellField`
78+
(uniform-8 float-bearing layout)
79+
80+
| Effects
81+
| `ExprHandle` (+ `handler_arm`), `ExprResume`, `ExprTry`, effect rows on
82+
`TyArrow`
83+
84+
| Unsafe
85+
| `ExprUnsafe` (`UnsafeRead`/`Write`/`Offset`/`Transmute`/`Forget`)
86+
|===
87+
88+
The type layer (`type_expr`) carries the substructural data the proofs care
89+
about: `quantity` `{QZero,QOne,QOmega}` (→ already mechanized in `QttSemiring.v`),
90+
`ownership` `{Own,Ref,Mut}`, `origin_var` (ADR-022 Polonius regions, → P-3 /
91+
`P3_BorrowGraph.v`), effect rows (→ P-6, blocked on #555), and record rows.
92+
93+
=== 2.2 Target — `lib/wasm.ml`
94+
95+
A faithful WebAssembly-1.0 IR: `value_type {I32,I64,F32,F64}`, ~150 `instr`
96+
constructors (numeric i32/i64/f32/f64, `LocalGet/Set/Tee`, `GlobalGet/Set`, the
97+
load/store family with align+offset, `Block/Loop/If/Br/BrIf/Return/Call/CallIndirect`,
98+
`MemorySize/Grow`), and a `wasm_module` with `funcs`/`mems`/`tables`/`globals`/
99+
`exports`/`imports`/`elems`/`datas`/`custom_sections`. The
100+
`typedwasm.ownership` custom section carries `TyOwn/TyRef/TyMut` to the binary
101+
for typed-wasm Level-7/10 verification — the seam where P-3 (borrow soundness)
102+
connects to the target.
103+
104+
=== 2.3 The gap from the toy K-1
105+
106+
`K1_CodegenPreservation.v` invents an ad-hoc stack machine; `RealWasm.v`
107+
(this directory, milestone **R0**, below) re-targets the *actual* `instr` /
108+
`value_type` names with a stack-arity soundness theorem (`wexec_sound`:
109+
arity-checked code never gets stuck). That is the first real foundation stone.
110+
111+
== 3. Coq module structure
112+
113+
[cols="1,3,1"]
114+
|===
115+
| Module | Contents | Status
116+
117+
| `RealWasm.v`
118+
| Target IR (real `instr`/`value_type` names) + stack-machine `wexec` +
119+
arity checker `wcheck` + soundness `wexec_sound`.
120+
| **R0 landed**
121+
122+
| `RealWasmSem.v`
123+
| The full target operational semantics as the ladder reaches control/memory:
124+
fuel-indexed step relation, linear-memory model, frame/locals.
125+
| planned (R2+)
126+
127+
| `RealAst.v`
128+
| The real source `expr`/`stmt`/`program` inductives (faithful to `lib/ast.ml`),
129+
per-fragment.
130+
| planned (S0)
131+
132+
| `RealAstSem.v`
133+
| The reference dynamic semantics (mirrors `lib/interp.ml`), per-fragment.
134+
| planned (S0)
135+
136+
| `RealCompile.v`
137+
| The real `compile` (port of `lib/codegen.ml`), per-fragment.
138+
| planned (K-1/R1)
139+
140+
| `RealPreservation.v`
141+
| `⟦compile p⟧ = ⟦p⟧` — the real K-1, per-fragment; then re-chains
142+
`F1_TransformerPreservation.v` and the `SameCube` F-2 story.
143+
| planned
144+
|===
145+
146+
== 4. The fragment ladder
147+
148+
Each rung is a strict superset; each is fully proven (axiom-free) before the
149+
next. The rung names are stable references the `PROOF-NEEDS.adoc` rows will cite.
150+
151+
[cols="1,3,2"]
152+
|===
153+
| Rung | Adds | Retires / advances
154+
155+
| **R0** ✅
156+
| Pure i32 numeric stack core (`I32Const`, add/sub/mul/and/or/xor, `eqz`, drop);
157+
`wexec` + arity soundness. (`RealWasm.v`)
158+
| The toy stack machine's *target* — real names now.
159+
160+
| **R1**
161+
| Source side S0+S1: `ExprLit`/`ExprVar`/`ExprLet`/`ExprBinary` (int/bool) →
162+
`RealAst.v` + `RealAstSem.v`; `compile` for them (`RealCompile.v`); locals
163+
(`LocalGet/Set/Tee`) + i64 in `RealWasm.v`. First real `⟦compile p⟧=⟦p⟧`.
164+
| Subsumes **K1** and **K1Let** on real objects.
165+
166+
| **R2**
167+
| Structured control: `Block/Loop/If/Br/BrIf/Return`; `ExprIf`/`ExprMatch`/
168+
`while`/`for`/`break`/`continue`. Needs a **fuel-indexed** `wexec` (loops may
169+
diverge) and the matching source semantics.
170+
| Closes the toy K-1's "`if`/control is the next increment" gap. Lets **F-2**
171+
(#601) be settled *concretely*: the statement-tail vs expression-tail lowerings
172+
become two real `instr` sequences whose `wexec` agreement (unit) / divergence
173+
(value-returning) is a theorem, not a prose observation.
174+
175+
| **R-mem**
176+
| Linear memory: load/store family, `MemorySize/Grow`, a byte-addressed memory
177+
model; `ExprTuple`/`ExprArray`/`ExprRecord` + indexing/field access.
178+
| Heap layout — the precondition for strings and the "cell" elaboration nodes.
179+
180+
| **R-float**
181+
| `F64` (then `F32`) lane; the float-wall elaboration nodes
182+
(`ExprFloatBinary`/`FloatArray`/`FloatIndex`, `ExprCellTuple`/`CellRecord`/`CellField`,
183+
uniform-8 layout). Float model: an abstract IEEE carrier with deterministic ops
184+
first (defer Flocq/NaN-bit fidelity to R-float-exact).
185+
| The "float wall" / "float heap wall" become preservation lemmas.
186+
187+
| **R-str**
188+
| The string wall: `ExprStringConcat`/`StringEq`/`StringRel` over the
189+
`[len][utf8]` byte layout (lengths-prefixed compare/concat).
190+
| String-wall slices 8b/9/10 (#458) as preservation lemmas.
191+
192+
| **R-call**
193+
| `Call`/`CallIndirect` + `tables`/`elems`; multi-function `program`; `ExprApp`
194+
to top-level `fn_decl`; closures (the indirect-dispatch path).
195+
| Whole-program K-1.
196+
197+
| **R-wrap**
198+
| Replace `Z` i32 with mod-2^32 semantics (normalise at each op); i64 mod-2^64.
199+
| Bit-exact integer fidelity.
200+
201+
| **R-eff**
202+
| Effects/handlers (`ExprHandle`/`Resume`/`Try`, effect rows) via CPS lowering.
203+
**Gated on #555 / K-2**: prove against the interpreter's *intended* handler
204+
semantics, then show the lowered targets refine it (the part that currently
205+
fails on three backends).
206+
| **P-6** effect-row soundness; the #555 proof-blocking defect.
207+
|===
208+
209+
The source-side rungs S0…Sn track the target rungs one-for-one (each `RealAst`/
210+
`RealAstSem` increment matches the `RealWasm`/`RealCompile` increment it is
211+
proven against).
212+
213+
== 5. Hard parts and the strategy for each
214+
215+
* **Loops & termination.** Wasm `Loop` can diverge, so `wexec` cannot stay a
216+
structural fixpoint past R1. Strategy: **fuel-indexed** big-step
217+
(`wexec : nat → …`) with the standard "more fuel never changes a `Some`
218+
result" monotonicity lemma; preservation is stated up to fuel. (Same device
219+
the source `while`/`for` semantics will use.)
220+
* **Linear memory.** Model memory as `Z → byte` (or a finite map) + a bound;
221+
load/store mirror `lib/wasm.ml`'s align/offset. The allocator
222+
(`runtime/src/alloc.rs`) is modelled abstractly (a bump pointer in a global) —
223+
enough for the deterministic layout the elaboration nodes assume.
224+
* **Floats.** Start with an *abstract* IEEE carrier: opaque `f64` with the
225+
arithmetic/compare ops as deterministic uninterpreted functions, so R-float
226+
proves *layout/dispatch* preservation without a NaN-bit model. A later
227+
R-float-exact swaps in Flocq if bit-fidelity is ever load-bearing.
228+
* **The elaboration nodes.** `ExprStringConcat`/`Float*`/`Cell*` are introduced
229+
by `Typecheck.elaborate_string_concat` *after* typing. The proof must include
230+
that elaboration as part of `compile` (source semantics is on the
231+
pre-elaboration AST; the elaboration must be shown meaning-preserving — itself
232+
a clean lemma per node).
233+
* **Borrowing / QTT at the seam.** `origin_var` + the `typedwasm.ownership`
234+
custom section connect to typed-wasm Level-7/10. P-3 (`P3_BorrowGraph.v`) and
235+
P-4 (`QttTyping.v`/`QttDynamic.v`) are the *static+dynamic* discipline; the
236+
lift shows `compile` preserves the multiplicity/ownership facts into the custom
237+
section (a refinement, not a re-proof).
238+
* **Effects.** The genuinely hardest, and *blocked*: do it last (R-eff), only
239+
after #555 is resolved per K-2.
240+
241+
== 6. Reuse: typed-WASM, don't reinvent
242+
243+
`hyperpolymath/typed-wasm` and `ephapax` both have Coq (`Semantics.v`). The
244+
target operational semantics (`RealWasmSem.v`) and `value_type`/`instr` encoding
245+
should be *imported or aligned* with typed-wasm's, not independently reinvented —
246+
the compile target is the one genuinely shared surface across the estate
247+
(per `.claude/CLAUDE.md`'s disambiguation table). The `typedwasm.ownership`
248+
custom section is the contracted hand-off point; the `crates/typed-wasm-verify/`
249+
verifier crate is the executable companion to the Coq semantics.
250+
251+
== 7. Sequencing & effort
252+
253+
[cols="1,3,1"]
254+
|===
255+
| Phase | Rungs | Size
256+
257+
| First-order K-1 | R0 ✅ → R1 → R2 | M (multi-week)
258+
| Heap & data | R-mem → R-float → R-str | M–L
259+
| Whole program | R-call → R-wrap | M
260+
| The long tail | R-eff (after #555), generics/traits, `unsafe` | L–XL
261+
|===
262+
263+
"First-order K-1" (R0→R2) is the natural first *publishable* milestone: it
264+
discharges K-1 for the nat/bool/let/if/control fragment on the **real** AST and
265+
**real** wasm IR, retires `K1`/`K1Let`, and settles #601 concretely. The estate
266+
already has the front-end scaffolding (`F1_TransformerPreservation.v`,
267+
`SameCube.agda`) waiting to re-chain onto it.
268+
269+
== 8. Status
270+
271+
* **R0 landed** — `RealWasm.v`: real `instr`/`value_type` names, `wexec`,
272+
`wcheck`, `wexec_sound` (arity-checked code never traps), axiom-free.
273+
* Next: **R1** — stand up `RealAst.v` + `RealAstSem.v` + `RealCompile.v` for the
274+
int/bool/let/binary fragment and prove the first real `⟦compile p⟧ = ⟦p⟧`,
275+
adding locals + i64 to `RealWasm.v`.
276+
277+
Tracked against `docs/PROOF-NEEDS.adoc` §6 Wave 3 and #513.

0 commit comments

Comments
 (0)