- 1. Status & rigour vocabulary
- 2. 1. Current mechanization status — the "afresh" baseline
- 3. 2. Proof obligations missed before (pre-faces, uncatalogued)
- 4. 3. Additional needs arising from outstanding work
- 5. 4. New proof needs from the faces work
- 6. 5. Adjacent repositories (same cube, wider lens)
- 7. 6. Recommended sequencing
- 8. Appendix A — relationship to #513 / #563
- 9. Appendix B — historical note (preserved)
This is the single document that enumerates what AffineScript must prove, at
what rigour, with what current status. It is deliberately honest about the
gap between the soundness arguments the project relies on (prose in
docs/academic/proofs/, the borrow-checker comments in lib/borrow.ml,
docs/CAPABILITY-MATRIX.adoc) and the soundness proofs that are actually
mechanized today: a growing axiom-free Coq core (18 files) that pins each
obligation on a deliberately small model — no full-language theorem is
discharged yet (see 1. Current mechanization status — the "afresh" baseline).
It supersedes the previous near-empty PROOF-NEEDS.md, which recorded only the
2026-03-29 template-ABI cleanup (preserved in Appendix B — historical note (preserved)).
|
Important
|
This inventory is a catalogue of obligations, not a relicensing or a code change. Filing it does not assert any obligation is discharged. Each row’s Status column is the ground truth; "stated", "prose", and "absent" all mean not proven. |
| Term | Meaning |
|---|---|
|
No artifact exists. Not even a statement. |
|
The theorem is written as a type/lemma signature, but the body is a hole ( |
|
A paper argument exists (markdown). Convincing-to-a-human, machine-unchecked. |
|
A restricted case is machine-checked; the general theorem is not. |
|
Fully machine-checked under a |
Rigour tiers (effort, mirroring the umbrella issue #513): S small ·
M medium · L large · XL extra-large.
A fresh sweep of the repo (2026-06-20) finds the mechanized core is thinner than the prose corpus suggests:
-
docs/academic/formal-verification/solo-core/— the only core-language formalisation. Covers the Solo fragment only: STLC + Unit + Product + Sum
let+ QTT quantities{0, 1, ω}.Syntax/Context/Quantity/Typingare populated;Soundness.idris statements-only —data Stephas no constructors ("intentionally omitted until week 3"), andprogress = ?todo_progress,preservation = ?todo_preservation. Status:stated. It proves nothing yet. Solo excludes traits, effects, rows, refinements, dependent types, ownership/borrowing, records, arrays, modules, andunsafe. -
docs/academic/proofs/*.md— seven prose proofs (type-soundness, ownership-soundness, effect-soundness, quantitative-types, dependent-types, row-polymorphism, coherence-parametricity) plus DB-theory notes. Status:prose. None mechanized. -
docs/academic/mechanized/{agda,coq,lean}/README.md— explicit "Status: Stub / Planned" placeholders. No proof content. -
affinescript-vite/verification/proofs/{agda,coq,lean4,idris2}/— RSR templates ("Replace with your project’s domain-specific proofs"); the CoqTypeSafety.vis example lemmas about list length, not AffineScript. Not core-metatheory. -
formal/— the directory #513 names as the mechanized-proof target now exists (Coq/Rocq 8.18), axiom-free throughout — 18 files, 31Print Assumptionsclosure reports, all "Closed under the global context". Codegen keystone:K1_CodegenPreservation.v(K-1 minimal)
K1Let_CodegenPreservation.v(K-1 grown with variables/let/environment). Faces front-end:F1_TransformerPreservation.v(F-1, composing on K-1).Siblings_Stated.vstates P-2/P-3/F-3/F-4, discharged for concrete models inP2_Progress.v(first-order) +P2_Stlc.v(STLC with functions + the substitution lemma, funext-free) /P3_BorrowSound.v(single-bit)
P3_BorrowGraph.v(multi-resource loan graph) /F3_PragmaDecidable.v/F4_ErrorFaithful.v/F5_RenderFaithful.v(render non-collision). Affine/QTT layer:QttSemiring.v({0,1,ω})
AffineUsage.v(λx.xaffine,λx. x xnot) +QttTyping.v(quantitative typing,usage x t = Γ xsound) +QttDynamic.v(the dynamic half: β-reduction preserves the usage profile). Real lift (REAL-LIFT.adoc):RealWasm.v(reallib/wasm.mlIR — i32 core + locals) +RealCompile.v(R1: the first real⟦compile p⟧ = ⟦p⟧). Record rows:Rows.v(P-11 — progress + preservation for extensible records). See 3. Additional needs arising from outstanding work, 4. New proof needs from the faces work. -
Research tracks (not core soundness):
docs/academic/tropical-session-types/(Lean),proposals/echo-types/EchoEncodingFaithfulness.agda,proposals/idaptik/migrated/*/*Boundary.agda(echo-types loss-with-residue, a *different project living inproposals/).
|
Warning
|
Known soundness holes (execution-verified, open). These are bugs that falsify the corresponding prose proof’s assumptions. Each is tracked as a defect but none has a stated proof obligation whose discharge would have caught it or would prevent regression — that linkage is itself a gap (see 2. Proof obligations missed before (pre-faces, uncatalogued)). These were the execution-verified implementation holes. As of 2026-06 their
implementation status is closed and now tracked, test-anchored, in
docs/SOUNDNESS.adoc — #554 fixed (UAM via callee-returned
borrow rejected), #555 fenced loud on every compiled backend (one pinned
interpreter residual), #556 fixed, #558 removed (refinements withdrawn, not
left unenforced), #559 fixed for concrete overlaps, #553 Polonius M1–M3 but
test-only/unwired. That does not discharge the proof obligations below.
Closing a hole means the implementation no longer exhibits the counterexample;
it does not mean the property is mechanically proven. The obligations these
holes motivate — most pointedly that the borrow proof must now be stated to
reject the (now-fixed) #554 program as a negative test — remain v1 readiness ledger: #563. Proof programme umbrella: #513. |
These predate the faces work; they were simply never written down as obligations. They are the "we might have missed" half of the brief.
| ID | Obligation | Rigour | Status | Notes / issue |
|---|---|---|---|---|
P-1 |
The inventory itself. No document enumerated proof needs; |
S |
|
— |
P-2 |
Solo progress + preservation. Discharge the two holes in |
XL |
|
#513 must-have 1; solo-core (Idris2) |
P-3 |
Borrow-graph soundness. A well-typed program never observes a moved/aliased value. Must exclude the #554 counterexample (callee-returned borrow). |
XL |
|
#513 must-have 2; mechanized for a single-bit model
( |
P-4 |
QTT affine usage. Quantities |
L |
|
#513 must-have 3; mechanized across four files: |
P-5 |
HM inference soundness + principality. Inferred types are well-typed and principal. |
L |
|
#513 must-have 4; prose |
P-6 |
Effect-row soundness. A function performs no effect outside its row; row unification is sound. Blocked — the backend (#555) does not implement the semantics this proof would assume (see 3. Additional needs arising from outstanding work). |
M |
|
#513 must-have 5; blocked on #555 |
P-7 |
Name-resolution determinism. Resolution is a deterministic total function of the program. |
M |
|
#513 must-have 6 |
P-8 |
Parser conformance. The grammar accepts exactly the documented language (property-tested). |
M |
|
#513 must-have 8 |
P-9 |
Refinement enforcement = proof obligation. #558 was simultaneously a bug and the missing obligation "refinement predicates are checked." Withdrawn: the refinement/dependent surface was removed in v1, so there is no unenforced predicate left to prove about (see SOUNDNESS.adoc). Retained for history; reopen only if refinements return. |
L |
|
#558 |
P-10 |
Trait coherence. At most one instance resolves per (trait, type). The implementation now rejects concrete overlaps (#559, see SOUNDNESS.adoc); generic-subsumption overlap is the open follow-up. The proof that resolution is unique remains unwritten. |
L |
|
#559; #513 high-priority |
P-11 |
Record-row soundness. Well-typed record operations never get stuck: a
selection of a present label always reduces, and extension/restriction
preserve typing over a |
M |
|
|
The work in flight changes which obligations are load-bearing and which are blocked.
-
K-1 — Codegen WASM semantic-preservation is the keystone (#513 must-have 7).
⟦compile(p)⟧_wasm = ⟦p⟧_sourcefor the operational semantics. This is the single obligation every face and every aLib conformer ultimately rests on: prove the backend preserves meaning once, and front-end face theorems (4. New proof needs from the faces work) compose into end-to-end correctness. RigourXL. Statuspartial(wasprose): a Wave-0 mechanized seed —formal/K1_CodegenPreservation.v(Coq, no axioms) — proves preservation for a minimal nat/bool · add/and fragment compiled to a stack machine —K1Let_CodegenPreservation.vgrows it with de Bruijn variables,let, and an environment. The real lift onto the real AffineScript AST + real typed-WASM semantics is scoped informal/REAL-LIFT.adoc(fragment ladder R0→R-eff); R0 + R1 landed.formal/RealWasm.v(R0) re-targets the actuallib/wasm.mlinstr/value_type(i32 core + locals) with a stack-arity soundness theorem;formal/RealCompile.v(R1) gives the first real⟦compile p⟧ = ⟦p⟧— resolved int/bool/let/binary source, referenceeval,compileto real wasm,compile_correct(env↔locals agreement) +compile_program_correct, retiringK1/K1Leton real objects. R2 (structured control, fuel-indexed) is next, and is the rung that settles #601 concretely. -
K-2 — Effect-soundness is blocked, not merely unproven. P-6 cannot be honestly stated against the current backend because #555 drops handler arms on three of the codegen targets. The obligation must be split: (a) prove soundness against the interpreter’s intended handler semantics, then (b) prove the lowered targets refine that semantics — which (b) currently fails. This makes #555 a proof-blocking defect, not just a runtime bug.
-
K-3 — Borrow soundness must be stated to reject #554. Any mechanized P-3 must take the #554 program as a negative test (must not typecheck). Done in the model:
P3_BorrowSound.vandP3_BorrowGraph.vreject the #554 witness ([OBorrow; OMove; OUseRef]); the full-language negative test still waits on Polonius (#553), which the real-checker statement needs to reject it in general. -
K-4 —
CAPABILITY-MATRIX.adocrows are latent obligations. Every row not markedenforced(effectspartial, refinementsdeclared-but-unwired, …) is a place where a prose proof presumes behaviour the implementation does not deliver. The matrix should cross-link each such row to its P-/F- obligation here.
The faces architecture (ADR-010: surface transformers lib/<face>_face.ml
error-vocabulary formatter lib/face.ml + pragma detector lib/face_pragma.ml;
"different faces, same cube") introduces obligations that are entirely absent
from #513’s 8 must-haves and 12 high-priority items. This is the core
discovery of the review: the proof programme predates faces and never grew to
cover them.
| ID | Obligation | Rigour | Status | Seed / issue |
|---|---|---|---|---|
F-1 |
Transformer semantics-preservation — the real same-cube theorem. For
each face |
XL |
|
mechanized fragment: |
F-2 |
Same-cube cross-face agreement (observational). Any two faces compiling
the same abstract program emit observationally-equivalent wasm (same effect
trace, same return). Partially mechanized: |
L |
|
|
F-3 |
Face pragma detection is total + deterministic + confluent.
|
M |
|
mechanized for a model: |
F-4 |
Error-vocabulary faithfulness (simulation). Each |
M |
|
mechanized for a model: |
F-5 |
|
S |
|
mechanized for the vocabulary model: |
F-6 |
Preview round-trip totality. The |
M |
|
new |
F-7 |
Face confluence with canonicalisation. An explicit pragma is never
silently overridden in a way that changes denotation; resolving a face and
then canonicalising commutes ( |
M |
|
new |
|
Note
|
Why F-2 is |
-
aggregate-library (aLib) — cross-cube conformance, not cross-face. Faces unify syntax over one cube; aLib unifies a library API over many cubes. Its YAML conformance vectors are a property-test obligation (each conformer satisfies every vector — the aLib analogue of P-8), and its spec "Properties" sections (commutativity, associativity, identity, …) are exactly the stdlib algebraic-law obligations #513 lists as high-priority. The AffineScript reference conformer (
aggregate-library/src/affinescript/ALib.affine) is where those laws become AffineScript proof obligations. -
invariant-path — the same-cube harness.
proofs/SameCube.agdais the first mechanized faces proof (seed of F-2);scripts/verify-same-cube.shcompiles every face to wasm and sha256-compares as the empirical (test-tier) companion. Treat these as the F-2 home until aformal/faces track exists in this repo.
| Wave | Items | Gates |
|---|---|---|
0 |
Done. |
— |
1 |
Done. Grew the models toward the real language: P-2 → STLC with functions
+ the substitution lemma ( |
Wave 0 |
2 |
Done. The affine/QTT layer — |
Wave 1 |
3 |
The real lift (the bulk of the remaining work) — scoped + started. Replace
the toy objects with the real ones: formalise the real AffineScript AST and
the real typed-WASM operational semantics, then re-prove K-1/F-1/F-2 against
them (this is what makes K-1/F-1 |
Wave 2; #555; #553 |
-
#513 (proof-programme umbrella) supplies P-2…P-10 and K-1. This document adds the faces obligations F-1…F-7 and the aLib law-conformance obligation, which #513 does not mention, and links the open soundness holes (#554/#555/#556/#558/#559) to the obligations they falsify.
-
#563 (v1 readiness ledger) tracks the holes as release blockers; the
Statuscolumn here should stay consistent with that ledger. -
When
formal/lands, eachP-/F-ID should become a module name or lemma label so the catalogue and the mechanization share one namespace.
Template ABI Cleanup (2026-03-29). The template ABI was removed — it was
creating a false impression of formal verification. The removed files
(Types.idr, Layout.idr, Foreign.idr) contained only RSR template
scaffolding with unresolved {{PROJECT}}/{Jonathan D.A. Jewell} placeholders and no
domain-specific proofs. When this project needs formal ABI verification, create
domain-specific Idris2 proofs following the pattern in repos like typed-wasm,
proven, echidna, or boj-server. (The same caution applies to the
affinescript-vite/verification/proofs/ templates surveyed in 1. Current mechanization status — the "afresh" baseline:
they are scaffolding, not metatheory.)