A compiler for two toy languages: AC (an imperative C-like language) and DC (a stack-based postfix language).
AC/DC Compiler is a two-part compiler project developed for educational purposes. It supports two minimalistic languages:
- AC: An imperative toy language with expressions, assignments, control flow, and I/O.
- DC: A stack-based postfix language inspired by Unixβs
dc, using reverse Polish notation. To see the documentation, click this link.
- π Lexical Analysis with custom tokenizer
- π§Ύ Parser with recursive descent for grammar rules
- π§ Abstract Syntax Tree (AST) generation
- π οΈ Semantic checks for identifiers and types
- π Support for combined assignment operators (
+=,-=, etc.) - π¨ Print statement support
- β Custom exception handling for syntactic and lexical errors
- π§ͺ JUnit testing for core components
| Language | Description |
|---|---|
| AC | C-like imperative language with variables, operators, control flow |
| DC | Postfix (stack-based) expression language (like Unix dc) |
x = 3 + 4;
y = x * 2;
print y;
- Variables and assignments
- Arithmetic expressions:
+,-,*,/ - Print statements:
print x; - Semi-colon terminated statements
3 4 + p
- Postfix notation (RPN)
- Stack operations: push, pop, print
- Arithmetic operations:
+,-,*,/ p= print top of the stackn= pop and discard top
5 1 2 + 4 * + 3 - p
Interpreted as:
- Push
5 - Push
1,2β1 2 +β3 3 4 *β125 12 +β1717 3 -β14pβ prints14
To check if the input is valid, you can use the dc command in your terminal:
echo "5 1 2 + 4 * + 3 - p" | dcfor exmple.
To see the documentation, click this link.
| Token | Pattern | Class |
|---|---|---|
INT |
[0-9]+ |
Constant/Litteral |
FLOAT |
[0-9]+.([0-9]{0,5}) |
Constant/Literal |
ID |
[a-z][a-z0-9]* |
Identificator |
TYINT |
int |
Key Word |
TYFLOAT |
float |
Key Word |
PRINT |
print |
Key Word |
OP_ASSIGN |
+= | -= | *= | /= |
Operators |
ASSIGN |
= |
Operator |
PLUS |
+ |
Operator |
MINUS |
- |
Operator |
TIMES |
* |
Operator |
DIVIDE |
/ |
Operator |
SEMI |
; |
Delimiter |
EOF |
(char) -1 |
End Input |
Character to ignore: ' ', '\n', '\t', '\r'.
Token format example:
<TYINT,r:1><ID,r:1,tempa><SEMI,r:1>
<ID,r:2,tempa><ASSIGN,r:2><INT,r:2,5><SEMI,r:2>
<TYFLOAT,r:3><ID,r:3,tempb><ASSIGN,r:3><ID,r:3,tempa><DIVIDE,r:3>
<FLOAT,r:3,3.2><SEMI,r:2>
<ID,r:4,tempb><OP_ASSIGN,r:4,+=><INT,r:4,7><SEMI,r:4>
<PRINT,r:5><ID,r:5,tempb><SEMI,r:5><EOF,r:5>
This table outlines the predictive parsing rules for the AC language grammar.
| Num. | LHS | RHS | Predict Set |
|---|---|---|---|
| 0 | Prg | DSs $ | { TYFLOAT, TYINT, ID, PRINT, EOF } |
| 1 | DSs | Dcl DSs | { TYFLOAT, TYINT } |
| 2 | DSs | Stm DSs | { ID, PRINT } |
| 3 | DSs | Ο΅ | { EOF } |
| 4 | Dcl | Ty id DclP | { TYFLOAT, TYINT } |
| 5 | DclP | ; | { SEMI } |
| 6 | DclP | = Exp ; | { ASSIGN } |
| 7 | Stm | id opAss Exp ; | { ID } |
| 8 | Stm | print id ; | { PRINT } |
| 9 | Exp | Tr ExpP | { ID, FLOAT, INT } |
| 10 | ExpP | + Tr ExpP | { PLUS } |
| 11 | ExpP | - Tr ExpP | { MINUS } |
| 12 | ExpP | Ο΅ | { SEMI } |
| 13 | Tr | Val TrP | { ID, FLOAT, INT } |
| 14 | TrP | * Val TrP | { TIMES } |
| 15 | TrP | / Val TrP | { DIVIDE } |
| 16 | TrP | Ο΅ | { MINUS, PLUS, SEMI } |
| 17 | Ty | float | { TYFLOAT } |
| 18 | Ty | int | { TYINT } |
| 19 | Val | intVal | { INT } |
| 20 | Val | floatVal | { FLOAT } |
| 21 | Val | id | { ID } |
| 22 | Op | = | { ASSIGN } |
| 23 | Op | opAss | { OP_ASSIGN } |
To improve its readability:
Prg β DSs $
DSs β Dcl DSs | Stm DSs | Ξ΅
Dcl β Ty id DclP
DclP β ; | = Exp ;
Stm β id opAss Exp ; | print id ;
Exp β Tr ExpP
ExpP β + Tr ExpP | - Tr ExpP | Ξ΅
Tr β Val TrP
TrP β * Val TrP | / Val TrP | Ξ΅
Ty β float | int
Val β intVal | floatVal | id
Op β = | opAss
- Language: Java 17
- Build Tool: Maven
- Testing: JUnit 5
- IDE Recommended: IntelliJ IDEA / VS Code with Java plugins
minilang-compiler/
β
βββ src/
β βββ ast/ # AST node definitions
β βββ eception/ # Custom exception
β βββ parser/ # Parser and grammar implementation
β βββ scanner/ # Tokenizer and token types
β βββ symbolTable/ # Type checking, symbol table
β βββ test/ # Unit tests
β βββ token/ # Token structure and type enums
β βββ visitor/ # Visitor pattern for AST traversal and interpretation
β βββ Main # Main where you can test the program
β
βββ README.md
Follow these steps to run the project locally.
https://github.com/BeastOfShadow/FLT.git
cd FLTmvn clean installRun Main.java to test our program!
x = 5;
x += 2;
print x;Will produce a valid AST and simulate print output: 7.
5 1 2 + 4 * + 3 - p
Interpreted as:
- Push
5 - Push
1,2β1 2 +β3 3 4 *β125 12 +β1717 3 -β14pβ prints14
- The parser uses a recursive descent strategy.
- Statements are matched using a predictive lookahead with
peekToken(). - AST nodes like
NodeAssign,NodePrint, andNodeBinOpare used to represent program structure. - Compound operators like
+=are transformed internally to binary operations within assignments.
This compiler was developed as an academic project by:
This project is licensed under the Apache License 2.0.