Vex is a compact, statically typed, procedural toy programming language built for experimenting with compiler frontends and IR optimization. VexC is the compiler for that language.
The language design is documented in SYNTAX.md. The implementation
in src/ covers a narrow but working pipeline: lexing, parsing, semantic
analysis, custom IR lowering, a small optimization pipeline, restricted
SLP-style vectorization, and LLVM IR text emission.
- Canonical syntax specification for the first planned version of Vex
- Recursive-descent frontend in C++
- Custom SSA-style textual IR used for experimentation
- Small optimization pipeline with constant folding and copy cleanup
- Restricted SLP-style vectorization pass for straight-line arithmetic groups
- LLVM IR text emission stage for backend-oriented output
- Examples, tests, benchmarks, and design notes
fun add4(i32 a0, i32 a1, i32 b0, i32 b1) -> i32 {
    i32 x0 = a0 + b0;
    i32 x1 = a1 + b1;
    i32 x2 = a0 + b1;
    i32 x3 = a1 + b0;
    return x0 + x1 + x2 + x3;
}
Planned first-version language constructs include:
- i32, bool, and nil
- explicit typed declarations such as i32 total = ...;
- arithmetic, comparison, and logical expressions
- if / elif / else
- while
- inclusive range-based for
- fixed-size homogeneous arrays
- built-in read, print, println, and exit
- string literals for print / println only
- function definitions with fun
See SYNTAX.md for the full language reference.
Vex source
-> tokens
-> AST
-> semantic analysis
-> Vex IR
-> scalar optimization passes
-> restricted SLP vectorization
-> LLVM IR
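
The scalar optimization stage in the pipeline above is described in this README as constant folding plus copy cleanup. As a rough illustration of the folding half, here is a minimal C++ sketch over a hypothetical three-address instruction type. The names `Operand`, `Inst`, `lit`, `ref`, and `foldConstants` are invented for this example and are not taken from the VexC sources, and wrapping i32 overflow is an assumption rather than documented Vex behavior.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical operand: either a literal or the name of an SSA value.
struct Operand {
    std::optional<int32_t> literal;  // set when the operand is a known constant
    std::string name;                // SSA value name otherwise
};

enum class Op { Add, Sub, Mul };

// Hypothetical instruction of the form: dst = lhs <op> rhs.
struct Inst {
    std::string dst;
    Op op;
    Operand lhs, rhs;
    std::optional<int32_t> folded;   // set once the result is known at compile time
};

Operand lit(int32_t v) { return Operand{v, ""}; }
Operand ref(std::string n) { return Operand{std::nullopt, std::move(n)}; }

// Evaluate in unsigned arithmetic so overflow wraps (assumed semantics).
int32_t eval(Op op, int32_t a, int32_t b) {
    uint32_t x = static_cast<uint32_t>(a), y = static_cast<uint32_t>(b), r = 0;
    switch (op) {
        case Op::Add: r = x + y; break;
        case Op::Sub: r = x - y; break;
        case Op::Mul: r = x * y; break;
    }
    return static_cast<int32_t>(r);
}

// Single forward pass over one straight-line block. Each value is defined
// before it is used, so constants found earlier can be substituted later.
void foldConstants(std::vector<Inst>& block) {
    std::unordered_map<std::string, int32_t> known;
    auto resolve = [&](Operand& o) {
        if (!o.literal) {
            auto it = known.find(o.name);
            if (it != known.end()) o.literal = it->second;
        }
    };
    for (Inst& inst : block) {
        resolve(inst.lhs);
        resolve(inst.rhs);
        if (inst.lhs.literal && inst.rhs.literal) {
            inst.folded = eval(inst.op, *inst.lhs.literal, *inst.rhs.literal);
            known[inst.dst] = *inst.folded;
        }
    }
}
```

Under these assumptions, a block that computes `t0 = 2 + 3`, `t1 = t0 * 4`, and `t2 = a + t1` would have `t0` and `t1` folded to constants, while `t2` stays a runtime add whose right operand has been replaced by the folded constant. A later cleanup pass could then drop the instructions whose results are no longer referenced.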
VexC currently supports a narrow, demo-oriented subset of the language with a working end-to-end compiler pipeline.
Implemented today:
- lexing for the core Vex token set
- recursive-descent parsing for functions, declarations, calls, control flow, and arrays
- semantic checks for declarations, types, assignments, calls, loops, and builtins
- lowering into a small SSA-style textual IR
- simple scalar cleanup plus restricted SLP-style vectorization
- LLVM IR text emission as the final stage
Remaining gaps are mostly about breadth rather than the core pipeline:
- the full aspirational surface in SYNTAX.md
- richer type support beyond i32, bool, and nil
- deeper optimization beyond the current local passes
- full executable LLVM lowering instead of readable stage output
The vectorizer targets straight-line basic blocks. It looks for adjacent scalar instructions that share:
- the same opcode
- the same scalar type
- no internal dependencies
- compatible operand structure
- simple, unambiguous memory access patterns
Example:
%t0 = add i32 %a0, %b0
%t1 = add i32 %a1, %b1
%t2 = add i32 %a2, %b2
%t3 = add i32 %a3, %b3
can become:
%tv = vadd <4 x i32> <%a0, %a1, %a2, %a3>, <%b0, %b1, %b2, %b3>
This is deliberately not a full auto-vectorizer. It is a restricted pass with clear legality rules and measurable output.
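
The grouping rules listed above can be sketched in C++ as a scan for runs of adjacent instructions that share an opcode and type and do not read each other's results. This is only an illustration under stated assumptions: `ScalarInst`, `Group`, `canJoin`, and `findGroups` are hypothetical names, not the actual pass in src/passes/, and the operand-structure and memory-access checks are left out to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical scalar instruction: dst = opcode type lhs, rhs.
struct ScalarInst {
    std::string dst;       // e.g. "%t0"
    std::string opcode;    // e.g. "add"
    std::string type;      // e.g. "i32"
    std::string lhs, rhs;  // operand value names
};

// A candidate vector group: block[first .. first + width).
struct Group {
    std::size_t first;
    std::size_t width;
};

// True when `cand` may join the group that starts at `first` and currently
// holds `width` instructions.
bool canJoin(const std::vector<ScalarInst>& block, std::size_t first,
             std::size_t width, const ScalarInst& cand) {
    const ScalarInst& head = block[first];
    if (cand.opcode != head.opcode || cand.type != head.type) return false;
    for (std::size_t i = first; i < first + width; ++i) {
        // Internal dependency: the candidate reads a value produced in the group.
        if (cand.lhs == block[i].dst || cand.rhs == block[i].dst) return false;
    }
    return true;
}

// Collect non-overlapping groups of exactly `lanes` adjacent instructions.
std::vector<Group> findGroups(const std::vector<ScalarInst>& block,
                              std::size_t lanes = 4) {
    assert(lanes >= 2);
    std::vector<Group> groups;
    std::size_t i = 0;
    while (i < block.size()) {
        std::size_t width = 1;
        while (width < lanes && i + width < block.size() &&
               canJoin(block, i, width, block[i + width])) {
            ++width;
        }
        if (width == lanes) {
            groups.push_back({i, width});
            i += width;  // skip past the packed instructions
        } else {
            ++i;         // partial run stays scalar; retry from the next one
        }
    }
    return groups;
}
```

For the four independent `add i32` instructions in the example above, this sketch would report a single group starting at index 0 with width 4. If one of those instructions instead read `%t0`, no full-width group would be found and all four would stay scalar.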
The screenshots below show the compiler stages on the included example programs.
cmake -S . -B build
cmake --build build
./build/vexc path/to/program.vex --emit-ir
./build/vexc path/to/program.vex --emit-optimized-ir
./build/vexc path/to/program.vex --emit-llvm
The CLI exposes each pipeline stage directly so the lexer, AST, IR, optimized IR, and LLVM IR output can be inspected independently.
ctest --test-dir build
The repository is structured for:
- lexer and parser unit tests
- semantic error tests
- IR generation snapshots
- optimization pass tests
- integration tests for complete Vex programs
The test suite includes stage-level integration checks plus a handful of negative cases for lexer and semantic failures.
Benchmarks focus on small integer kernels where the vectorizer has clear, defensible opportunities:
- array addition
- dot-product variants
- straight-line arithmetic kernels
- small loop bodies with repeated independent scalar operations
Tracked metrics include scalar instruction count, vectorized group count, generated LLVM IR size, and runtime for selected kernels.
include/vex/ Public VexC headers
src/frontend/ Lexer and parser implementation
src/ast/ AST definitions and utilities
src/sema/ Semantic analysis
src/ir/ Custom Vex IR
src/passes/ Optimization and vectorization passes
src/llvm/ LLVM IR lowering
src/driver/ Command-line compiler driver
tests/ Unit and integration tests
examples/ Small Vex programs
benchmarks/ Benchmark kernels and measurement notes
docs/ Design notes and diagrams
VexC is organized as a compiler pipeline with a narrow implementation surface and a broader language target documented in SYNTAX.md. The main engineering goal is clarity: each stage has a small API, testable behavior, and readable IR output before and after optimization.







