Deterministic, production-grade C++ inference engine built around Boost.SML orchestration.
This repository is under active development. APIs, state machines, and formats will change. If you’re evaluating EMEL, expect fast iteration and breaking changes until the core loader, allocator, and execution pipelines stabilize.
This inference engine is being implemented by AI under human engineering and architecture direction.
EMEL exists to make inference behavior explicit and verifiable. Instead of ad-hoc control flow, orchestration is modeled as Boost.SML state machines with deterministic, testable transitions. That enables:
- Clear operational semantics and failure modes.
- Deterministic, reproducible inference paths.
- High-performance, C-compatible boundaries without dynamic dispatch in hot paths.
- Auditable parity work against reference implementations without copying their control flow.
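The idea behind the points above can be sketched without any library. The example below is a minimal illustration in plain C++, not Boost.SML and not EMEL's actual code: the `state` and `event` names, `step`, and `replay` are all hypothetical. It models transitions as a pure `constexpr` function, so the same event sequence always reaches the same state and the behavior can be checked at compile time.

```cpp
#include <array>
#include <cstddef>

// Hypothetical states and events, for illustration only (not EMEL's real names).
enum class state { idle, loading, ready, failed };
enum class event { load, load_ok, load_error, reset };

// Pure transition function: the same (state, event) pair always yields the
// same next state. Unhandled pairs leave the state unchanged.
constexpr state step(state s, event e) noexcept {
  switch (s) {
    case state::idle:
      return e == event::load ? state::loading : s;
    case state::loading:
      if (e == event::load_ok) return state::ready;
      if (e == event::load_error) return state::failed;
      return s;
    case state::ready:
    case state::failed:
      return e == event::reset ? state::idle : s;
  }
  return s;
}

// Replay a fixed event sequence from a starting state.
template <std::size_t N>
constexpr state replay(state s, const std::array<event, N>& events) noexcept {
  for (std::size_t i = 0; i < N; ++i) s = step(s, events[i]);
  return s;
}

// Compile-time checks: the success path, the failure path, and an ignored event.
static_assert(replay(state::idle, std::array<event, 2>{event::load, event::load_ok}) == state::ready, "success path");
static_assert(replay(state::idle, std::array<event, 2>{event::load, event::load_error}) == state::failed, "failure path");
static_assert(step(state::idle, event::load_ok) == state::idle, "unhandled event is ignored");
```

Because `step` has no hidden state and no virtual calls, each failure mode is an explicit state rather than an implicit branch, which is the property the state-machine approach is meant to provide. A Boost.SML transition table would express the same relation declaratively.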
“EMEL” is pronounced like “ML”. It’s a short, neutral name that doesn’t carry existing assumptions or baggage. It’s intentionally low-ceremony while we iterate on the core design.
scripts/quality_gates.sh

Individual gates live in scripts/build_with_zig.sh, scripts/test_with_coverage.sh,
scripts/test_with_sanitizers.sh, scripts/fuzz_smoke.sh, scripts/lint_snapshot.sh,
and scripts/bench.sh.
Zig’s C/C++ toolchain gives us consistent, fast, cross-platform builds without forcing a full dependency on any single system compiler or SDK. It keeps the default dev path reproducible, while still allowing native toolchains when needed.
Coverage and CI tooling are already standardized around CMake + CTest + llvm-cov/gcovr in this repo. Using CMake for test/coverage builds keeps gates deterministic and portable across CI environments, while Zig remains the default for day-to-day builds.
- Architecture (generated state-machine docs + Mermaid diagrams)
- Benchmarks (generated benchmark snapshot table)
- SML Conventions (Boost.SML conventions and usage)
- Parity Audit (parity audit status)
- docs/benchmarks.md
- docs/architecture/batch_sanitizer.md
- docs/architecture/batch_splitter.md
- docs/architecture/buffer_allocator.md
- docs/architecture/buffer_chunk_allocator.md
- docs/architecture/buffer_planner.md
- docs/architecture/buffer_realloc_analyzer.md
- docs/architecture/decoder_compute_executor.md
- docs/architecture/decoder.md
- docs/architecture/decoder_ubatch_executor.md
- docs/architecture/encoder_bpe.md
- docs/architecture/encoder_fallback.md
- docs/architecture/encoder_plamo2.md
- docs/architecture/encoder_rwkv.md
- docs/architecture/encoder.md
- docs/architecture/encoder_spm.md
- docs/architecture/encoder_ugm.md
- docs/architecture/encoder_wpm.md
- docs/architecture/gbnf_parser.md
- docs/architecture/generator.md
- docs/architecture/jinja_parser.md
- docs/architecture/jinja_renderer.md
- docs/architecture/kv_cache.md
- docs/architecture/memory_coordinator_hybrid.md
- docs/architecture/memory_coordinator_kv.md
- docs/architecture/memory_coordinator_recurrent.md
- docs/architecture/memory_coordinator.md
- docs/architecture/model_loader.md
- docs/architecture/model_weight_loader.md
- docs/architecture/parser_gguf.md
- docs/architecture/parser.md
- docs/architecture/sampler_candidate_builder.md
- docs/architecture/sampler_pipeline.md
- docs/architecture/sampler_token_selector.md
- docs/architecture/telemetry_exporter.md
- docs/architecture/telemetry_provider.md
- docs/architecture/tensor_allocator.md
- docs/architecture/tensor_lifetime_analyzer.md
- docs/architecture/tokenizer_preprocessor_bpe.md
- docs/architecture/tokenizer_preprocessor_fallback.md
- docs/architecture/tokenizer_preprocessor_plamo2.md
- docs/architecture/tokenizer_preprocessor_rwkv.md
- docs/architecture/tokenizer_preprocessor_spm.md
- docs/architecture/tokenizer_preprocessor_ugm.md
- docs/architecture/tokenizer_preprocessor_wpm.md
- docs/architecture/tokenizer.md
scripts/generate_docs.sh

Use scripts/generate_docs.sh --check in CI to validate generated artifacts.