Demonstrata: from theorem statements to Lean-verified mathematical notes.
Mathematics, proved and reproduced.
Demonstrata is an open-source theorem proving workflow for Codex and Lean 4. It takes an informal mathematical problem or theorem statement and produces:
- a human-readable LaTeX proof,
- a Lean 4 formalization,
- reproducibility artifacts,
- axiom audit output,
- and a final PDF note.
The working principle is simple: Codex can propose proof strategies and drafts, but Lean and mathlib are the source of truth. A note is publishable only when the formal development compiles and its reproducibility evidence can be regenerated.
The repository includes local Codex configuration in .codex/config.toml for
gpt-5.5 with model_reasoning_effort = "high". Demonstrata deliberately uses
one high-reasoning Codex session for formal proof work, Lean diagnostics and
multi-step planning.
Multi-agent execution is disabled in the project configuration. With Codex CLI
0.125.0 or newer, use /status or /debug-config to confirm that the
repo-local configuration is loaded. For non-interactive evidence, codex exec --json can expose planning or reasoning-token telemetry, but it should not be
used here to spawn multiple agents.
- It turns mathematical statements into artifacts that are checked by a formal kernel, not only by informal plausibility.
- It keeps the human-readable proof, Lean source and final PDF coordinated.
- It records reproducibility evidence, including the Lean version, mathlib revision, build output and axiom audit output.
- It uses
mathlib4as the formal mathematical base without reducing the project to theorem search. - It treats the repository as a growing library of verified notes rather than a scratch directory.
| Component | Role |
|---|---|
Biblioteca/ |
Lean library root and technical namespace Biblioteca.*. |
Biblioteca/Demonstrations/ |
Timestamped Lean demonstration modules. |
blueprint/src/sections/ |
LaTeX sections paired with Lean demonstrations. |
blueprint/library/pdf/ |
Archived, publishable PDF notes. |
scripts/ |
Reproducible entrypoints for verification and artifact generation. |
tools/ |
Python support for demo naming, blueprint rendering and diagnostics. |
.agents/skills/ |
Repo-local Codex workflows for strategy, proving and verification. |
.github/workflows/ |
CI, dependency update and release automation. |
Demonstrata is the public project name. Biblioteca remains the Lean namespace
and directory name for this iteration, so existing imports and theorem
references stay stable.
Install the local workflow dependencies:
gitfor the repository and Lake dependencies.python3andvenvfor local automation.elanfor the Lean 4 toolchain.leanandlake, installed throughelan.latexmkandxelatexfor blueprint PDFs.nodeandnpmif you install or update Codex CLI locally.
Then, inside the repository:
.venv/bin/python -m pip install -r requirements.txt
scripts/get_mathlib_cache.sh
scripts/build_strict.shIf the Python environment does not exist yet:
python3 -m venv .venv
.venv/bin/python -m pip install --upgrade pip
.venv/bin/python -m pip install -r requirements.txtpytest: tests for the local automation.sympy: support for mathematical exploration before formalization.
Lean 4 v4.29.0: proof assistant, language and verification kernel.mathlib v4.29.0: the main formal mathematics dependency.Lake: Lean's package and build manager.
lake-manifest.json also records transitive Lean dependencies such as aesop,
batteries, proofwidgets, LeanSearchClient, importGraph, Cli, quote4,
Qq and plausible.
# Strict Lean build
scripts/build_strict.sh
# JSON diagnostics for one Lean file
scripts/check_lean_json.sh Biblioteca/Demonstrations/Demo_20260402_155130_sum_first_odds.lean
# Human-readable summary of JSON diagnostics
.venv/bin/python scripts/summarize_lean_json.py diagnostics.jsonl
# Create a new demonstration
scripts/new_demo.sh "odd numbers sum"
scripts/new_demo.sh --prefix IMO "least norwegian number"
# Check Lean references from the blueprint
scripts/check_blueprint_decls.sh
# Build the current PDF note
scripts/build_blueprint_pdf.sh
# Run Python tests
.venv/bin/pytest -q
# Build the local loogle index to avoid cold-start waits
scripts/build_loogle_index.sh
# Query Mathlib through the canonical persisted local index
scripts/loogle_local.sh --read-index .local-tools/loogle-indexes/Mathlib.extra --module Mathlib 'Fintype.card_subtype'
# Check that the local loogle service answers JSON
scripts/check_loogle_local.sh
# Start loogle, build, test and open Codex with an initial query
scripts/init.sh "Your Codex query"Ask Codex CLI for a complete olympiad-style flow:
$olympiad-formalize solve this problem:
Let \(p\) be a prime number. Prove that there exist integers
\(x,y,z,t\), not all divisible by \(p\), such that
\[
p \mid x^4 - 2y^4 + 3z^4 + 4t^4.
\]
From this input, Demonstrata generates a reproducible proof dossier:
- a human-readable mathematical proof;
- a Lean 4 formalization;
- LaTeX sources;
- reproducibility logs;
- axiom-audit output;
- and a final PDF note.
Example output: blueprint/library/pdf/Demo_20260501_213926_composite_terms_in_recurrence.pdf
$olympiad-formalize coordinates the full path from an informal olympiad-style
statement to a verified note:
- It normalizes the problem statement.
- It invokes
mimate-proof-strategyto compare proof approaches. - It chooses a structural proof route before writing Lean.
- It uses
lean-proveto create the formal development and helper lemmas. - It uses
lean-verifyto run file-level diagnostics and then the strict build. - After Lean accepts the proof, it revises only the paired LaTeX section so the exposition matches the accepted formal argument.
- It builds the final blueprint PDF.
The goal is a readable mathematical argument backed by Lean, not brute-force case enumeration or theorem lookup alone.
For a new demonstration, the workflow creates or updates:
Biblioteca/Demonstrations/<Prefix>_<YYYYMMDD_HHMMSS>_<slug>.leanwith the formal Lean proof.blueprint/src/sections/<stem>.texwith the matching LaTeX exposition.Biblioteca/Demonstrations.leanwith the aggregate import.blueprint/src/content.texwith the blueprint section entry.blueprint/.current_demowith the active demonstration marker.blueprint/build/<timestamp>_<stem>/with temporary PDF build files.blueprint/library/pdf/<stem>.pdfwith the archived publishable note.
The PDF builder creates temporary files such as paper.tex,
lean_glossary.tex, lean_reproducibility.tex, lean_appendix.tex and
selected_content.tex. Temporary build directories are not versioned; archived
PDF notes are.
- Create a demonstration with
scripts/new_demo.sh. - Write or refine the mathematical argument with the repo-local skills.
- Run
scripts/check_lean_json.sh <demo.lean>until the file is clean. - Run
scripts/build_strict.shafter file-level diagnostics pass. - Review the paired LaTeX section for consistency with the accepted Lean proof.
- Run
scripts/check_blueprint_decls.sh. - Build the PDF with
scripts/build_blueprint_pdf.sh. - Keep the
.lean,.texand archived PDF as the publishable output.
The blueprint uses amsart and \lean{...} references to connect prose with
Lean declarations. Each generated note includes:
- short Lean declaration names in the body;
- an automatically generated Lean glossary;
- reproducibility evidence for the selected formalization;
- an
Anexocontaining the full Lean source.
By default, scripts/build_blueprint_pdf.sh builds the current demonstration.
For a collection, use repeated --demo flags or --all:
scripts/build_blueprint_pdf.sh --demo demo_20260402_155831_cubic_increment_sum --demo IMO_20260403_085959_finite_sets_with_divisibility_b_plus_two_c
scripts/build_blueprint_pdf.sh --allFor deeper Mathlib4 navigation, see docs/mathlib-exploration.md. The
recommended order is:
- Lean built-ins such as
#check,#find,exact?,apply?,rw?, plus directrgsearches. - Local
looglethroughscripts/build_loogle_local.sh,scripts/build_loogle_index.sh,scripts/check_loogle_local.sh,scripts/loogle_local.shandscripts/start_loogle_local_server.sh. - NDJSON export with
lean4export. - Semantic exploration with LeanExplore.
For this workspace, the canonical persisted Mathlib index is:
.local-tools/loogle-indexes/Mathlib.extra
Search over Biblioteca is intentionally module-scoped today; the aggregate
Biblioteca import is not treated as a stable global loogle index.
This project is distributed under the MIT license. See LICENSE.
