|
| 1 | +// SPDX-License-Identifier: CC-BY-SA-4.0 |
| 2 | +// SPDX-FileCopyrightText: 2026 Jonathan D.A. Jewell (hyperpolymath) |
| 3 | += The Real Lift — from toy seeds to the real AffineScript AST + typed-WASM |
| 4 | +Jonathan D.A. Jewell |
| 5 | +:toc: left |
| 6 | +:toclevels: 3 |
| 7 | +:sectnums: |
| 8 | +:sectnumlevels: 2 |
| 9 | + |
| 10 | +This is the engineering plan for the single largest remaining body of proof work |
| 11 | +in `docs/PROOF-NEEDS.adoc` §6 Wave 3: replacing the deliberately-small models the |
| 12 | +`formal/` track proves today with the *real* objects, so that K-1 / F-1 / F-2 and |
| 13 | +the dynamic halves of P-2 / P-4 become discharged obligations rather than |
| 14 | +meaning-pinning seeds. |
| 15 | + |
| 16 | +It is grounded in the actual compiler: the source AST is `lib/ast.ml`, the target |
| 17 | +IR is `lib/wasm.ml`, the compiler is `lib/codegen.ml` (+ `lib/codegen_gc.ml`), |
| 18 | +and the reference dynamic semantics is `lib/interp.ml`. |
| 19 | + |
| 20 | +[IMPORTANT] |
| 21 | +==== |
| 22 | +The lift is **XL** (CompCert-scale for a substantial language). It is delivered |
| 23 | +as a *ladder* of strict-superset fragments, each fully mechanized and axiom-free |
| 24 | +before the next — the way `K1 → K1Let` already grew, and the way solo-core's |
| 25 | +`Solo → Duet → Ensemble` is meant to expand. No milestone lands with an |
| 26 | +`Admitted`. |
| 27 | +==== |
| 28 | + |
| 29 | +== 1. The theorem |
| 30 | + |
| 31 | +For the real compiler `compile : Ast.program → Wasm.wasm_module` and the real |
| 32 | +operational semantics on each side: |
| 33 | + |
| 34 | +[source] |
| 35 | +---- |
| 36 | +⟦ compile(p) ⟧_wasm = ⟦ p ⟧_source (for every well-typed p) |
| 37 | +---- |
| 38 | + |
| 39 | +K-1 is the backend half; F-1 (`⟦T_F(p)⟧ = ⟦T_canonical(canon(p))⟧`) composes the |
| 40 | +face transformers on top of it; F-2 (#601) is the observational corollary across |
| 41 | +faces. All three collapse to "grow K-1 onto the real objects, then re-chain the |
| 42 | +existing F-1/F-2 scaffolding." |
| 43 | + |
| 44 | +== 2. The two real objects (what we are formalising) |
| 45 | + |
| 46 | +=== 2.1 Source — `lib/ast.ml` |
| 47 | + |
| 48 | +The real `expr` has ~30 constructors. Grouped by the order the ladder tackles |
| 49 | +them: |
| 50 | + |
| 51 | +[cols="1,3"] |
| 52 | +|=== |
| 53 | +| Group | Constructors |
| 54 | + |
| 55 | +| Core values/binding |
| 56 | +| `ExprLit` (Int/Float/Bool/Char/String/Unit), `ExprVar`, `ExprLet` |
| 57 | + (carries `el_quantity : quantity option` — the QTT multiplicity), `ExprBlock`, |
| 58 | + `stmt` (`StmtLet`/`StmtExpr`/`StmtAssign`/`StmtWhile`/`StmtFor`) |
| 59 | + |
| 60 | +| Control |
| 61 | +| `ExprIf`, `ExprMatch` (+ `match_arm`, guards), `ExprReturn`, `ExprBreak`, |
| 62 | + `ExprContinue` |
| 63 | + |
| 64 | +| Functions |
| 65 | +| `ExprLambda` (+ `param` with `p_quantity`/`p_ownership`), `ExprApp`, |
| 66 | + top-level `fn_decl` (`FnBlock`/`FnExpr`/`FnExtern`) |
| 67 | + |
| 68 | +| Aggregates (heap) |
| 69 | +| `ExprTuple`/`ExprTupleIndex`, `ExprArray`/`ExprIndex`, `ExprRecord`/`ExprField`/`ExprRowRestrict` |
| 70 | + |
| 71 | +| Operators |
| 72 | +| `ExprBinary` (`binary_op`), `ExprUnary` (`unary_op`, incl. `OpRef`/`OpMutRef`/`OpDeref`) |
| 73 | + |
| 74 | +| Elaboration nodes (post-typecheck, backend-only — `Typecheck.elaborate_string_concat`) |
| 75 | +| `ExprStringConcat`/`ExprStringEq`/`ExprStringRel` (the "string wall"), |
| 76 | + `ExprFloatBinary`/`ExprFloatArray`/`ExprFloatIndex` (the "float wall"), |
| 77 | + `ExprCellTuple`/`ExprCellTupleIndex`/`ExprCellRecord`/`ExprCellField` |
| 78 | + (uniform-8 float-bearing layout) |
| 79 | + |
| 80 | +| Effects |
| 81 | +| `ExprHandle` (+ `handler_arm`), `ExprResume`, `ExprTry`, effect rows on |
| 82 | + `TyArrow` |
| 83 | + |
| 84 | +| Unsafe |
| 85 | +| `ExprUnsafe` (`UnsafeRead`/`Write`/`Offset`/`Transmute`/`Forget`) |
| 86 | +|=== |
| 87 | + |
| 88 | +The type layer (`type_expr`) carries the substructural data the proofs care |
| 89 | +about: `quantity` `{QZero,QOne,QOmega}` (→ already mechanized in `QttSemiring.v`), |
| 90 | +`ownership` `{Own,Ref,Mut}`, `origin_var` (ADR-022 Polonius regions, → P-3 / |
| 91 | +`P3_BorrowGraph.v`), effect rows (→ P-6, blocked on #555), and record rows. |
| 92 | + |
| 93 | +=== 2.2 Target — `lib/wasm.ml` |
| 94 | + |
| 95 | +A faithful WebAssembly-1.0 IR: `value_type {I32,I64,F32,F64}`, ~150 `instr` |
| 96 | +constructors (numeric i32/i64/f32/f64, `LocalGet/Set/Tee`, `GlobalGet/Set`, the |
| 97 | +load/store family with align+offset, `Block/Loop/If/Br/BrIf/Return/Call/CallIndirect`, |
| 98 | +`MemorySize/Grow`), and a `wasm_module` with `funcs`/`mems`/`tables`/`globals`/ |
| 99 | +`exports`/`imports`/`elems`/`datas`/`custom_sections`. The |
| 100 | +`typedwasm.ownership` custom section carries `TyOwn/TyRef/TyMut` to the binary |
| 101 | +for typed-wasm Level-7/10 verification — the seam where P-3 (borrow soundness) |
| 102 | +connects to the target. |
| 103 | + |
| 104 | +=== 2.3 The gap from the toy K-1 |
| 105 | + |
| 106 | +`K1_CodegenPreservation.v` invents an ad-hoc stack machine; `RealWasm.v` |
| 107 | +(this directory, milestone **R0**, below) re-targets the *actual* `instr` / |
| 108 | +`value_type` names with a stack-arity soundness theorem (`wexec_sound`: |
| 109 | +arity-checked code never gets stuck). That is the first real foundation stone. |
| 110 | + |
| 111 | +== 3. Coq module structure |
| 112 | + |
| 113 | +[cols="1,3,1"] |
| 114 | +|=== |
| 115 | +| Module | Contents | Status |
| 116 | + |
| 117 | +| `RealWasm.v` |
| 118 | +| Target IR (real `instr`/`value_type` names) + stack-machine `wexec` + |
| 119 | + arity checker `wcheck` + soundness `wexec_sound`. |
| 120 | +| **R0 landed** |
| 121 | + |
| 122 | +| `RealWasmSem.v` |
| 123 | +| The full target operational semantics as the ladder reaches control/memory: |
| 124 | + fuel-indexed step relation, linear-memory model, frame/locals. |
| 125 | +| planned (R2+) |
| 126 | + |
| 127 | +| `RealAst.v` |
| 128 | +| The real source `expr`/`stmt`/`program` inductives (faithful to `lib/ast.ml`), |
| 129 | + per-fragment. |
| 130 | +| planned (S0) |
| 131 | + |
| 132 | +| `RealAstSem.v` |
| 133 | +| The reference dynamic semantics (mirrors `lib/interp.ml`), per-fragment. |
| 134 | +| planned (S0) |
| 135 | + |
| 136 | +| `RealCompile.v` |
| 137 | +| The real `compile` (port of `lib/codegen.ml`), per-fragment. |
| 138 | +| planned (K-1/R1) |
| 139 | + |
| 140 | +| `RealPreservation.v` |
| 141 | +| `⟦compile p⟧ = ⟦p⟧` — the real K-1, per-fragment; then re-chains |
| 142 | + `F1_TransformerPreservation.v` and the `SameCube` F-2 story. |
| 143 | +| planned |
| 144 | +|=== |
| 145 | + |
| 146 | +== 4. The fragment ladder |
| 147 | + |
| 148 | +Each rung is a strict superset; each is fully proven (axiom-free) before the |
| 149 | +next. The rung names are stable references the `PROOF-NEEDS.adoc` rows will cite. |
| 150 | + |
| 151 | +[cols="1,3,2"] |
| 152 | +|=== |
| 153 | +| Rung | Adds | Retires / advances |
| 154 | + |
| 155 | +| **R0** ✅ |
| 156 | +| Pure i32 numeric stack core (`I32Const`, add/sub/mul/and/or/xor, `eqz`, drop); |
| 157 | + `wexec` + arity soundness. (`RealWasm.v`) |
| 158 | +| The toy stack machine's *target* — real names now. |
| 159 | + |
| 160 | +| **R1** |
| 161 | +| Source side S0+S1: `ExprLit`/`ExprVar`/`ExprLet`/`ExprBinary` (int/bool) → |
| 162 | + `RealAst.v` + `RealAstSem.v`; `compile` for them (`RealCompile.v`); locals |
| 163 | + (`LocalGet/Set/Tee`) + i64 in `RealWasm.v`. First real `⟦compile p⟧=⟦p⟧`. |
| 164 | +| Subsumes **K1** and **K1Let** on real objects. |
| 165 | + |
| 166 | +| **R2** |
| 167 | +| Structured control: `Block/Loop/If/Br/BrIf/Return`; `ExprIf`/`ExprMatch`/ |
| 168 | + `while`/`for`/`break`/`continue`. Needs a **fuel-indexed** `wexec` (loops may |
| 169 | + diverge) and the matching source semantics. |
| 170 | +| Closes the toy K-1's "`if`/control is the next increment" gap. Lets **F-2** |
| 171 | + (#601) be settled *concretely*: the statement-tail vs expression-tail lowerings |
| 172 | + become two real `instr` sequences whose `wexec` agreement (unit) / divergence |
| 173 | + (value-returning) is a theorem, not a prose observation. |
| 174 | + |
| 175 | +| **R-mem** |
| 176 | +| Linear memory: load/store family, `MemorySize/Grow`, a byte-addressed memory |
| 177 | + model; `ExprTuple`/`ExprArray`/`ExprRecord` + indexing/field access. |
| 178 | +| Heap layout — the precondition for strings and the "cell" elaboration nodes. |
| 179 | + |
| 180 | +| **R-float** |
| 181 | +| `F64` (then `F32`) lane; the float-wall elaboration nodes |
| 182 | + (`ExprFloatBinary`/`FloatArray`/`FloatIndex`, `ExprCellTuple`/`CellRecord`/`CellField`, |
| 183 | + uniform-8 layout). Float model: an abstract IEEE carrier with deterministic ops |
| 184 | + first (defer Flocq/NaN-bit fidelity to R-float-exact). |
| 185 | +| The "float wall" / "float heap wall" become preservation lemmas. |
| 186 | + |
| 187 | +| **R-str** |
| 188 | +| The string wall: `ExprStringConcat`/`StringEq`/`StringRel` over the |
| 189 | + `[len][utf8]` byte layout (lengths-prefixed compare/concat). |
| 190 | +| String-wall slices 8b/9/10 (#458) as preservation lemmas. |
| 191 | + |
| 192 | +| **R-call** |
| 193 | +| `Call`/`CallIndirect` + `tables`/`elems`; multi-function `program`; `ExprApp` |
| 194 | + to top-level `fn_decl`; closures (the indirect-dispatch path). |
| 195 | +| Whole-program K-1. |
| 196 | + |
| 197 | +| **R-wrap** |
| 198 | +| Replace `Z` i32 with mod-2^32 semantics (normalise at each op); i64 mod-2^64. |
| 199 | +| Bit-exact integer fidelity. |
| 200 | + |
| 201 | +| **R-eff** |
| 202 | +| Effects/handlers (`ExprHandle`/`Resume`/`Try`, effect rows) via CPS lowering. |
| 203 | + **Gated on #555 / K-2**: prove against the interpreter's *intended* handler |
| 204 | + semantics, then show the lowered targets refine it (the part that currently |
| 205 | + fails on three backends). |
| 206 | +| **P-6** effect-row soundness; the #555 proof-blocking defect. |
| 207 | +|=== |
| 208 | + |
| 209 | +The source-side rungs S0…Sn track the target rungs one-for-one (each `RealAst`/ |
| 210 | +`RealAstSem` increment matches the `RealWasm`/`RealCompile` increment it is |
| 211 | +proven against). |
| 212 | + |
| 213 | +== 5. Hard parts and the strategy for each |
| 214 | + |
| 215 | +* **Loops & termination.** Wasm `Loop` can diverge, so `wexec` cannot stay a |
| 216 | + structural fixpoint past R1. Strategy: **fuel-indexed** big-step |
| 217 | + (`wexec : nat → …`) with the standard "more fuel never changes a `Some` |
| 218 | + result" monotonicity lemma; preservation is stated up to fuel. (Same device |
| 219 | + the source `while`/`for` semantics will use.) |
| 220 | +* **Linear memory.** Model memory as `Z → byte` (or a finite map) + a bound; |
| 221 | + load/store mirror `lib/wasm.ml`'s align/offset. The allocator |
| 222 | + (`runtime/src/alloc.rs`) is modelled abstractly (a bump pointer in a global) — |
| 223 | + enough for the deterministic layout the elaboration nodes assume. |
| 224 | +* **Floats.** Start with an *abstract* IEEE carrier: opaque `f64` with the |
| 225 | + arithmetic/compare ops as deterministic uninterpreted functions, so R-float |
| 226 | + proves *layout/dispatch* preservation without a NaN-bit model. A later |
| 227 | + R-float-exact swaps in Flocq if bit-fidelity is ever load-bearing. |
| 228 | +* **The elaboration nodes.** `ExprStringConcat`/`Float*`/`Cell*` are introduced |
| 229 | + by `Typecheck.elaborate_string_concat` *after* typing. The proof must include |
| 230 | + that elaboration as part of `compile` (source semantics is on the |
| 231 | + pre-elaboration AST; the elaboration must be shown meaning-preserving — itself |
| 232 | + a clean lemma per node). |
| 233 | +* **Borrowing / QTT at the seam.** `origin_var` + the `typedwasm.ownership` |
| 234 | + custom section connect to typed-wasm Level-7/10. P-3 (`P3_BorrowGraph.v`) and |
| 235 | + P-4 (`QttTyping.v`/`QttDynamic.v`) are the *static+dynamic* discipline; the |
| 236 | + lift shows `compile` preserves the multiplicity/ownership facts into the custom |
| 237 | + section (a refinement, not a re-proof). |
| 238 | +* **Effects.** The genuinely hardest, and *blocked*: do it last (R-eff), only |
| 239 | + after #555 is resolved per K-2. |
| 240 | + |
| 241 | +== 6. Reuse: typed-WASM, don't reinvent |
| 242 | + |
| 243 | +`hyperpolymath/typed-wasm` and `ephapax` both have Coq (`Semantics.v`). The |
| 244 | +target operational semantics (`RealWasmSem.v`) and `value_type`/`instr` encoding |
| 245 | +should be *imported or aligned* with typed-wasm's, not independently reinvented — |
| 246 | +the compile target is the one genuinely shared surface across the estate |
| 247 | +(per `.claude/CLAUDE.md`'s disambiguation table). The `typedwasm.ownership` |
| 248 | +custom section is the contracted hand-off point; the `crates/typed-wasm-verify/` |
| 249 | +verifier crate is the executable companion to the Coq semantics. |
| 250 | + |
| 251 | +== 7. Sequencing & effort |
| 252 | + |
| 253 | +[cols="1,3,1"] |
| 254 | +|=== |
| 255 | +| Phase | Rungs | Size |
| 256 | + |
| 257 | +| First-order K-1 | R0 ✅ → R1 → R2 | M (multi-week) |
| 258 | +| Heap & data | R-mem → R-float → R-str | M–L |
| 259 | +| Whole program | R-call → R-wrap | M |
| 260 | +| The long tail | R-eff (after #555), generics/traits, `unsafe` | L–XL |
| 261 | +|=== |
| 262 | + |
| 263 | +"First-order K-1" (R0→R2) is the natural first *publishable* milestone: it |
| 264 | +discharges K-1 for the nat/bool/let/if/control fragment on the **real** AST and |
| 265 | +**real** wasm IR, retires `K1`/`K1Let`, and settles #601 concretely. The estate |
| 266 | +already has the front-end scaffolding (`F1_TransformerPreservation.v`, |
| 267 | +`SameCube.agda`) waiting to re-chain onto it. |
| 268 | + |
| 269 | +== 8. Status |
| 270 | + |
| 271 | +* **R0 landed** — `RealWasm.v`: real `instr`/`value_type` names, `wexec`, |
| 272 | + `wcheck`, `wexec_sound` (arity-checked code never traps), axiom-free. |
| 273 | +* Next: **R1** — stand up `RealAst.v` + `RealAstSem.v` + `RealCompile.v` for the |
| 274 | + int/bool/let/binary fragment and prove the first real `⟦compile p⟧ = ⟦p⟧`, |
| 275 | + adding locals + i64 to `RealWasm.v`. |
| 276 | + |
| 277 | +Tracked against `docs/PROOF-NEEDS.adoc` §6 Wave 3 and #513. |
0 commit comments