vibhaas/vex

VexC: The Vex Compiler and IR Optimizer

Vex is a compact, statically typed, procedural toy programming language built for experimenting with compiler frontends and IR optimization. VexC is the compiler for that language.

The language design is documented in SYNTAX.md. The implementation in src/ covers a narrow but working pipeline: lexing, parsing, semantic analysis, custom IR lowering, a small optimization pipeline, restricted SLP-style vectorization, and LLVM IR text emission.

Features

  • Canonical syntax specification for the first planned version of Vex
  • Recursive-descent frontend in C++
  • Custom SSA-style textual IR used for experimentation
  • Small optimization pipeline with constant folding and copy cleanup
  • Restricted SLP-style vectorization pass for straight-line arithmetic groups
  • LLVM IR text emission stage for backend-oriented output
  • Examples, tests, benchmarks, and design notes

Language Snapshot

fun add4(i32 a0, i32 a1, i32 b0, i32 b1) -> i32 {
    i32 x0 = a0 + b0;
    i32 x1 = a1 + b1;
    i32 x2 = a0 + b1;
    i32 x3 = a1 + b0;
    return x0 + x1 + x2 + x3;
}

Planned first-version language constructs include:

  • i32, bool, and nil
  • explicit typed declarations such as i32 total = ...;
  • arithmetic, comparison, and logical expressions
  • the % operator
  • if / elif / else
  • while
  • inclusive range-based for
  • fixed-size homogeneous arrays
  • built-in read, print, println, and exit
  • string literals for print / println only
  • function definitions with fun

See SYNTAX.md for the full language reference.

Compiler Pipeline

Vex source
  -> tokens
  -> AST
  -> semantic analysis
  -> Vex IR
  -> scalar optimization passes
  -> restricted SLP vectorization
  -> LLVM IR

Implementation Status

VexC currently supports a narrow, demo-oriented subset of the language with a working end-to-end compiler pipeline.

Implemented today:

  • lexing for the core Vex token set
  • recursive-descent parsing for functions, declarations, calls, control flow, and arrays
  • semantic checks for declarations, types, assignments, calls, loops, and builtins
  • lowering into a small SSA-style textual IR
  • simple scalar cleanup plus restricted SLP-style vectorization
  • LLVM IR text emission as the final stage

Remaining gaps are mostly about breadth rather than the core pipeline:

  • the full aspirational surface in SYNTAX.md
  • richer type support beyond i32, bool, and nil
  • deeper optimization beyond the current local passes
  • full executable LLVM lowering instead of readable stage output
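
To make the "simple scalar cleanup" item concrete, here is a minimal sketch of constant folding combined with copy cleanup over one straight-line SSA block. It is not the VexC pass: the `Inst` record, the `copy` and `ret` opcode names, and the function `foldAndCleanup` are hypothetical, and the wraparound behavior on i32 overflow is an assumption rather than documented Vex semantics.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical three-address instruction; not VexC's actual IR classes.
struct Inst {
    std::string dest;  // e.g. "%t0" (empty for "ret")
    std::string op;    // "add", "mul", "copy", or "ret"
    std::string a, b;  // "%name" or an integer literal; b is unused for copy/ret
};

// Returns the integer value if the operand is a literal, otherwise nothing.
static std::optional<std::int32_t> asConst(const std::string& s) {
    if (s.empty() || s[0] == '%') return std::nullopt;
    return std::stoi(s);
}

// Evaluates add/mul with two's-complement wraparound (an assumed semantics).
static std::int32_t fold(const std::string& op, std::int32_t x, std::int32_t y) {
    assert(op == "add" || op == "mul");
    auto ux = static_cast<std::uint32_t>(x);
    auto uy = static_cast<std::uint32_t>(y);
    return static_cast<std::int32_t>(op == "add" ? ux + uy : ux * uy);
}

// One forward walk: rewrite operands through known values, drop copies and
// fully constant arithmetic, and keep everything else.
std::vector<Inst> foldAndCleanup(const std::vector<Inst>& block) {
    std::map<std::string, std::string> known;  // SSA name -> literal or forwarded name
    auto resolve = [&](const std::string& operand) -> std::string {
        auto it = known.find(operand);
        return it == known.end() ? operand : it->second;
    };
    std::vector<Inst> out;
    for (Inst inst : block) {
        inst.a = resolve(inst.a);
        inst.b = resolve(inst.b);
        if (inst.op == "copy") {  // copy cleanup: forward the source to later uses
            known[inst.dest] = inst.a;
            continue;
        }
        auto x = asConst(inst.a);
        auto y = asConst(inst.b);
        if ((inst.op == "add" || inst.op == "mul") && x && y) {  // constant folding
            known[inst.dest] = std::to_string(fold(inst.op, *x, *y));
            continue;
        }
        out.push_back(inst);
    }
    return out;
}
```

Given `%t0 = add 2, 3`, `%t1 = copy %t0`, `%t2 = mul %t1, %x`, `ret %t2`, the sketch keeps only `%t2 = mul 5, %x` and `ret %t2`. Because each SSA name is defined once and the walk is forward, a single pass is enough for a straight-line block; control flow would need more care.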

Restricted SLP Vectorization

The vectorizer targets straight-line basic blocks. It looks for adjacent scalar instructions that share:

  • the same opcode
  • the same scalar type
  • no internal dependencies
  • compatible operand structure
  • simple, unambiguous memory access patterns

Example:

%t0 = add i32 %a0, %b0
%t1 = add i32 %a1, %b1
%t2 = add i32 %a2, %b2
%t3 = add i32 %a3, %b3

can become:

%tv = vadd <4 x i32> <%a0, %a1, %a2, %a3>, <%b0, %b1, %b2, %b3>

This is deliberately not a full auto-vectorizer. It is a restricted pass with clear legality rules and measurable output.
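
The first three legality rules (same opcode, same scalar type, no internal dependencies) can be sketched as a check over a window of adjacent instructions. This is an illustrative approximation, not the pass in src/passes/: `ScalarInst`, `canPack`, and `packGroup` are hypothetical names, and the operand-structure and memory-access rules are not modeled here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical scalar instruction record; not VexC's real IR classes.
struct ScalarInst {
    std::string dest, opcode, type, lhs, rhs;
};

// True when `width` adjacent instructions starting at `start` share an opcode
// and type, and no lane reads a value defined by an earlier lane in the group.
bool canPack(const std::vector<ScalarInst>& block, std::size_t start, std::size_t width) {
    if (width < 2 || start + width > block.size()) return false;
    const ScalarInst& first = block[start];
    for (std::size_t i = start; i < start + width; ++i) {
        const ScalarInst& cur = block[i];
        if (cur.opcode != first.opcode || cur.type != first.type) return false;
        for (std::size_t j = start; j < i; ++j) {
            if (cur.lhs == block[j].dest || cur.rhs == block[j].dest) return false;
        }
    }
    return true;
}

// Renders the packed group in the textual form used in the example above.
std::string packGroup(const std::vector<ScalarInst>& block, std::size_t start,
                      std::size_t width, const std::string& dest) {
    assert(canPack(block, start, width));
    std::string lhs, rhs;
    for (std::size_t i = 0; i < width; ++i) {
        if (i != 0) { lhs += ", "; rhs += ", "; }
        lhs += block[start + i].lhs;
        rhs += block[start + i].rhs;
    }
    const ScalarInst& first = block[start];
    return dest + " = v" + first.opcode + " <" + std::to_string(width) + " x " +
           first.type + "> <" + lhs + ">, <" + rhs + ">";
}
```

Applied to the four `add i32` instructions above with `dest` set to `%tv`, `packGroup` reproduces the `vadd <4 x i32>` line. If `%t1` instead read `%t0`, `canPack` would reject the group, since the lanes could no longer execute in parallel.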

Pipeline Screenshots

The README includes one screenshot per compiler stage, taken on the included example programs:

  • Build: build output
  • Tokens: token stream
  • AST: AST output
  • IR (control flow): IR control flow output
  • IR (arrays): IR arrays output
  • Optimized IR: optimized IR output
  • IR (while and break): IR while/break output
  • LLVM IR: LLVM IR output

Build

cmake -S . -B build
cmake --build build

Usage

./build/vexc path/to/program.vex --emit-ir
./build/vexc path/to/program.vex --emit-optimized-ir
./build/vexc path/to/program.vex --emit-llvm

The CLI exposes each pipeline stage directly, so the token stream, AST, IR, optimized IR, and LLVM IR can each be inspected independently. The three commands above cover the IR, optimized IR, and LLVM IR stages.

Tests

ctest --test-dir build

The repository is structured for:

  • lexer and parser unit tests
  • semantic error tests
  • IR generation snapshots
  • optimization pass tests
  • integration tests for complete Vex programs

The test suite includes stage-level integration checks plus a handful of negative cases for lexer and semantic failures.

Benchmarks

Benchmarks focus on small integer kernels where the vectorizer has clear, defensible opportunities:

  • array addition
  • dot-product variants
  • straight-line arithmetic kernels
  • small loop bodies with repeated independent scalar operations

Tracked metrics include scalar instruction count, vectorized group count, generated LLVM IR size, and runtime for selected kernels.

Repository Layout

include/vex/       Public VexC headers
src/frontend/      Lexer and parser implementation
src/ast/           AST definitions and utilities
src/sema/          Semantic analysis
src/ir/            Custom Vex IR
src/passes/        Optimization and vectorization passes
src/llvm/          LLVM IR lowering
src/driver/        Command-line compiler driver
tests/             Unit and integration tests
examples/          Small Vex programs
benchmarks/        Benchmark kernels and measurement notes
docs/              Design notes and diagrams

Implementation Notes

VexC is organized as a compiler pipeline with a narrow implementation surface and a broader language target documented in SYNTAX.md. The main engineering goal is clarity: each stage has a small API, testable behavior, and readable IR output before and after optimization.
