A fully-functional Unix shell implementation in C that replicates core bash functionality. This project demonstrates advanced parsing techniques using Abstract Syntax Trees (AST) and BNF (Backus-Naur Form) Grammar to build a complete command interpreter with pipes, redirections, and process management.
Project: 42 School Rank 03 | Language: C | Status: Complete
- Overview
- Features
- Architecture
- BNF Grammar
- AST Implementation
- Build Instructions
- Usage
- Project Structure
- Implementation Highlights
Minishell is a command-line shell interpreter that parses and executes shell commands with full support for:
- Pipes and process chains (
|) - Input/Output redirection (
<,>,>>,<<) - Logical operators (
&&,||) - Subshells and command grouping
- Environment variable expansion
- Signal handling for proper shell behavior
- Built-in commands (cd, pwd, echo, export, unset, env, exit)
The shell uses a sophisticated three-stage pipeline:
- Tokenization - Breaks input into tokens respecting quotes and operators
- Parsing - Builds an Abstract Syntax Tree following BNF grammar rules
- Execution - Traverses the AST with fork/exec for process management
| Command | Description |
|---|---|
cd [path] |
Change directory; updates PWD/OLDPWD |
pwd |
Print working directory |
echo [-n] [args...] |
Print arguments (with optional -n flag) |
export [var=value] |
Create/modify environment variables |
unset [var...] |
Remove environment variables |
env |
Display all environment variables |
exit [code] |
Exit shell with optional exit code |
< # Input redirection (read from file)
> # Output redirection (overwrite file)
>> # Append output redirection
<<DELIM # Here-document (multi-line input until DELIM)Examples:
cat < input.txt # Redirect input
ls > output.txt # Redirect output
echo "text" >> file.txt # Append output
cat << EOF # Here-document
input here
EOF| # Pipe: pass stdout of left to stdin of right
&& # AND: execute right only if left succeeded
|| # OR: execute right only if left failed
; # Semicolon: sequential command executionExamples:
ls | grep .txt # Pipe chain
cd /tmp && ls # AND operator
invalid_cmd || echo "Error" # OR operator
echo "first"; echo "second" # Semicolon'string' # Single quotes: literal (no expansion)
"string" # Double quotes: variable expansion enabled
$VAR # Environment variable expansion
$? # Last command exit status
$$ # Current shell PIDExamples:
echo $HOME # Expands to home directory
echo "$USER" # Variable expansion in quotes
echo '$USER' # Literal $USER (no expansion)
echo "Exit code: $?" # Display last exit status- Subshells
(...)- Execute commands in child process - Here-documents - Multi-line input with optional variable expansion
- Command chaining - Complex pipelines with multiple operators
- Signal handling - Proper SIGINT/SIGQUIT behavior
- Exit status tracking - Proper propagation through pipes and operators
┌─────────────┐ ┌──────────┐ ┌───────────┐
│ INPUT │ → │ TOKENIZE │ → │ PARSE │ → ┌─────────┐
│ Raw Shell │ │ (Lexer) │ │ (BNF) │ │ AST │
│ Syntax │ │ │ │ │ │ Tree │
└─────────────┘ └──────────┘ └───────────┘ └─────────┘
│
↓
┌─────────────┐ ┌──────────┐ ┌───────────┐ ┌─────────┐
│ EXECUTE │ ← │ EVALUATE │ ← │ TRAVERSE │ ← │ Final │
│ (Fork/Exec) │ │ (Expand) │ │ (DFS) │ │ AST │
└─────────────┘ └──────────┘ └───────────┘ └─────────┘
minishell/
├── tokens/ # Tokenization (lexical analysis)
│ ├── tokenize.c # Main tokenizer
│ ├── fill_tokens.c # Token classification
│ └── convert_env_tokens.c # Variable expansion
│
├── parse/ # Parsing (syntactic analysis)
│ ├── parse_main.c # Entry point
│ ├── parse_cmd_line.c # Top-level command parsing
│ ├── parse_pipe_cmd.c # Pipe operator handling
│ ├── parse_compound.c # Logical operators (&&, ||)
│ ├── parse_redirection.c # Redirection parsing
│ └── parse_subshell.c # Subshell parsing
│
├── execute_cmd/ # Execution (semantic analysis & runtime)
│ ├── exe_ast.c # AST traversal entry point
│ ├── exe_pipe.c # Pipe execution
│ ├── exe_simple_cmd.c # Command execution
│ ├── exe_redirections.c # Redirection handling
│ ├── exe_built_in.c # Built-in command dispatch
│ └── exe_here_doc.c # Here-document processing
│
├── env/ # Environment variable management
│ ├── init_env.c # Environment initialization
│ ├── update_env.c # Variable updates
│ └── build_envp.c # Create envp[] array
│
├── evaluate/ # Expression evaluation & utilities
│ ├── evals_1.c/.c # Quote and escape handling
│ └── utils.c # Helper functions
│
├── minishell.h # Main header with data structures
├── libft/ # Custom library (from 42 School)
└── Makefile # Build configuration
The shell syntax is defined using Backus-Naur Form (BNF), which formally specifies the language structure. This enables predictable parsing and execution.
<program> ::= <command_line> | ε
<command_line> ::= <compound_command> (";" <compound_command>)*
<compound_command> ::= <or_expr>
<or_expr> ::= <and_expr> ("||" <and_expr>)*
<and_expr> ::= <pipe_expr> ("&&" <pipe_expr>)*
<pipe_expr> ::= <command> ("|" <command>)*
<command> ::= <simple_command> | <subshell>
<subshell> ::= "(" <command_line> ")"
<simple_command> ::= <word>+ <redirections>*
<redirections> ::= <redirection> | ε
<redirection> ::= (">" | ">>" | "<" | "<<") <filename>
<word> ::= <quoted_string> | <unquoted_string>
<quoted_string> ::= '"' <dquoted_content> '"' | "'" <squoted_content> "'"- Semicolon (
;) - Command separator (lowest precedence) - OR (
||) - Logical OR operator - AND (
&&) - Logical AND operator - Pipe (
|) - Process chaining - Redirections (
<,>,>>,<<) - I/O operations (highest precedence)
Input: echo hello > file.txt | grep e
PIPE
/ \
REDIR COMMAND
/ \ |
COMMAND file grep e
|
echo hello
The Abstract Syntax Tree (AST) is the internal representation of parsed commands, enabling clean separation between syntax analysis and execution logic.
typedef enum node_type_e {
NODE_COMMAND, // Simple command execution
NODE_PIPE, // Pipe operator (|)
NODE_SEQUENCE_AND, // AND operator (&&)
NODE_SEQUENCE_OR, // OR operator (||)
NODE_SEMI, // Semicolon separator (;)
NODE_REDIRECT_OUT, // Output redirect (>)
NODE_REDIRECT_OUT_APPEND, // Append redirect (>>)
NODE_REDIRECT_IN, // Input redirect (<)
NODE_REDIRECT_HEREDOC, // Here-document (<<)
NODE_SUBSHELL, // Subshell (...)
NODE_VAR_ASSIGNMENT, // Variable assignment
NODE_VAR_EXPANSION, // Variable expansion
NODE_QUOTED_STRING, // Quoted string
} t_node_type;
typedef struct ast_node {
t_node_type type;
struct ast_node *left; // Left subtree
struct ast_node *right; // Right subtree
char **args; // Command arguments
char **filename; // Redirection target
} t_ast;- Tokenize input into token stream
- Parse with recursive descent:
- Start with lowest precedence rule (
;) - Each production tries to parse higher precedence operators
- Terminal symbols become command nodes
- Start with lowest precedence rule (
- Build binary tree where:
- Internal nodes = operators (pipes, &&, ||, etc.)
- Leaf nodes = simple commands
The AST is executed via depth-first traversal with proper process management:
int exe_ast(t_ast **root, t_env_s *env_s) {
if (!root) return 0;
switch(root->type) {
case NODE_COMMAND:
return exe_simple_cmd(root, env_s); // Fork & exec
case NODE_PIPE:
return exe_pipe(root, env_s); // Create pipe, fork twice
case NODE_SEQUENCE_AND:
left_status = exe_ast(&root->left);
if (left_status == 0)
return exe_ast(&root->right); // Only exec right if left ok
case NODE_SEQUENCE_OR:
left_status = exe_ast(&root->left);
if (left_status != 0)
return exe_ast(&root->right); // Only exec right if left failed
// ... other node types
}
}- C compiler (gcc/clang)
- readline development library
- Unix-like environment (Linux, macOS)
- GNU Make
# Install readline (Ubuntu/Debian)
sudo apt-get install libreadline-dev
# Build the project
make
# Run minishell
./minishell# Install readline (Homebrew)
brew install readline
# Build with readline path
make
# Run minishell
./minishellmake # Compile minishell
make clean # Remove object files
make fclean # Remove objects and executable
make re # Full rebuild (fclean + all)
make valgrind # Run with memory leak detectionminishell> echo "Hello, World!"
Hello, World!
minishell> ls -la
total 120
drwxr-xr-x 13 user wheel 416 Jul 08 19:31 .
drwxr-xr-x 34 user wheel 1088 Jun 01 12:45 ..
...# Simple pipe
minishell> cat file1.txt | grep "search" | wc -l
# Output redirection
minishell> echo "Hello" > output.txt
minishell> cat output.txt
Hello
# Append redirection
minishell> echo "World" >> output.txt
minishell> cat output.txt
Hello
World
# Input redirection
minishell> sort < unsorted.txt
# Here-document
minishell> cat << EOF
This is a
multi-line
input
EOF
This is a
multi-line
input# AND operator
minishell> mkdir test_dir && cd test_dir
minishell> pwd
/path/to/test_dir
# OR operator
minishell> cd /nonexistent || echo "Failed"
Failed
# Semicolon (sequential execution)
minishell> echo "first"; echo "second"
first
second
# Complex chains
minishell> echo "data" | grep d && echo "Success" || echo "Fail"
data
Success# Subshell isolation
minishell> (cd /tmp && ls)
# Lists /tmp contents but returns to original directory
minishell> echo $SHELL; (export SHELL="/bin/dash"; echo $SHELL); echo $SHELL
/bin/bash # Original
/bin/dash # In subshell
/bin/bash # Back to original# View all variables
minishell> env
# Export new variable
minishell> export MY_VAR="my_value"
minishell> echo $MY_VAR
my_value
# Unset variable
minishell> unset MY_VAR
minishell> echo $MY_VAR
$MY_VAR # Not expanded (empty)
# Special variables
minishell> echo $$ # Current shell PID
12345
minishell> false; echo $? # Last exit status
1# cd with directory change
minishell> cd /tmp
minishell> pwd
/tmp
# echo with options
minishell> echo -n "No newline"
No newline%
# export and display
minishell> export VAR1=val1 VAR2=val2
minishell> export
declare -x HOME=/home/user
declare -x VAR1=val1
declare -x VAR2=val2
...
# exit with code
minishell> exit
$ echo $?
0minishell/
│
├── 📄 Main Files
│ ├── minishell.h # Main header file (data structures, prototypes)
│ ├── libft.h # Custom library header
│ ├── Makefile # Build configuration
│ └── README.md # This file
│
├── 🔤 tokens/ # Tokenization & Lexical Analysis
│ ├── tokenize.c # Main tokenizer - splits input by operators/spaces
│ ├── tokenize_2.c # Additional tokenization utilities
│ ├── fill_tokens.c # Classify tokens as operators, args, etc.
│ ├── fill_tokens_utils.c # Token classification helpers
│ ├── token_utils.c # General token utilities
│ ├── convert_env_tokens.c # Handle $VAR expansion in double quotes
│ ├── utils_env_tokens.c # Environment expansion utilities
│ ├── her_doc_token.c # Here-document token processing
│ ├── helper_here_doc_token.c # Here-document helpers
│ └── utils_fill_tokens.c # Additional token utilities
│
├── 🌳 parse/ # Parsing & Syntax Analysis (BNF Grammar)
│ ├── parse_main.c # Entry point, initializes parsing
│ ├── parse_cmd_line.c # Parses: command_line → compound_command*
│ ├── parse_compound.c # Parses logical operators (&&, ||)
│ ├── parse_pipe_cmd.c # Parses: pipe_expr → command ("|" command)*
│ ├── command.c # Parses: command → simple_command | subshell
│ ├── simple_cmd.c # Parses: simple_command → word* redirection*
│ ├── parse_redirection.c # Parses: >, >>, <, << redirections
│ ├── parse_subshell.c # Parses: subshell → "(" command_line ")"
│ ├── parse_doc.c # Here-document parsing
│ ├── parse_doc2.c # Additional here-document parsing
│ ├── parse_utils.c # Parsing utilities
│ ├── create_node.c # AST node creation and management
│ └── parse_utils.c # Additional parsing helpers
│
├── ⚙️ execute_cmd/ # Execution & Runtime
│ ├── main.c # Shell main loop, signal handling
│ ├── main_exe.c # Execution coordinator
│ ├── exe_ast.c # AST traversal & dispatch to handlers
│ ├── exe_pipe.c # Execute pipe chains (left | right)
│ ├── exe_simple_cmd.c # Execute simple commands (fork + exec)
│ ├── exe_simple_cmd_2.c # Additional simple command logic
│ ├── exe_subshell.c # Execute subshell (child process)
│ ├── exe_redirections.c # Handle >, >>, < redirections
│ ├── exe_redirections_2.c # Additional redirection logic
│ ├── exe_redirections_3.c # More redirection handling
│ ├── exe_redir_in.c # Input redirection handler
│ ├── exe_here_doc.c # Here-document execution
│ ├── exe_here_doc_2.c # Additional here-doc logic
│ ├── exe_built_in.c # Built-in command dispatcher
│ ├── exe_built_in_2.c # Additional built-in handlers
│ ├── exe_built_in_3.c # More built-in commands
│ ├── ft_echo.c # echo built-in implementation
│ ├── executable.c # Executable path resolution
│ ├── execute.c # Core execution logic
│ ├── free_funcs.c # Memory cleanup functions
│ ├── n_pmt_utils.c # New prompt utilities
│ ├── open_new_promt.c # Prompt display & input
│ └── open_new_promt.c # Additional prompt handling
│
├── 🔧 env/ # Environment Variable Management
│ ├── init_env.c # Initialize environment from envp
│ ├── create_node.c # Create environment list nodes
│ ├── add_env.c # Add environment variables
│ ├── add_modify_env.c # Add/modify environment entries
│ ├── update_env.c # Update existing variables
│ ├── del_env.c # Delete environment variables
│ └── build_envp.c # Build envp[] array for execve
│
├── 📊 evaluate/ # Expression & String Evaluation
│ ├── evals_1.c # Quote handling & expansion
│ ├── evals_2.c # Additional evaluation logic
│ ├── evals_3.c # Expression evaluation
│ ├── utils.c # General utilities
│ ├── utils2.c # Additional utils
│ └── utils3.c # More utilities
│
├── 📦 obj/ # Compiled object files (generated)
│
├── 📚 libft/ # 42 School custom library (not shown)
│
├── 🧪 testdir1/, testdir2/ # Test directories
│ └── mycmd # Test executable
│
└── ⚙️ Other Scripts
└── (Compilation artifacts)
- Handles quoted strings (single/double) with escape sequences
- Recognizes multi-character operators (
>>,&&,||,<<) - Validates token sequences (prevents invalid syntax)
- Preserves whitespace context for proper word boundaries
- Recursive descent parser implementing BNF grammar
- Operator precedence correctly enforced (semicolon < OR < AND < pipe)
- Error recovery with syntax validation
- Quote preservation through parsing phases for later evaluation
- Process management with fork/exec/waitpid
- Bidirectional pipes with proper file descriptor handling
- Subshell isolation via child processes
- Signal handling for SIGINT/SIGQUIT (Ctrl+C gracefully handled)
- Linked-list environment for efficient var lookups
- Variable expansion with special vars (
$?,$$,$_) - Export management with proper scoping
- Dynamic envp array built on-demand for execve
- Single quotes prevent ALL expansion (literal strings)
- Double quotes allow
$VARand$$expansion - Escape sequences properly handled (e.g.,
\$VARin double quotes) - Quote removal performed post-expansion for clean arguments
- Multi-line input capture until delimiter
- Variable expansion support within here-docs (when appropriate)
- Delimiter validation with optional quoting
- Heredoc chaining in complex pipelines
# Test 1: Basic command execution
./minishell
minishell> ls -la
# Test 2: Pipes and redirections
minishell> echo "test" | cat > output.txt
minishell> cat output.txt
# Test 3: Complex operators
minishell> (cd /tmp && ls) || echo "Failed"
# Test 4: Variable expansion
minishell> export VAR="hello"
minishell> echo "Value: $VAR in quotes, '$VAR' in single"
# Test 5: Here-documents
minishell> cat << EOF
> line 1
> line 2
> $VAR
> EOF
# Test 6: Exit status
minishell> true && echo "Success" || echo "Failed"
minishell> false || echo "Caught error"This project teaches:
- Compiler/Interpreter design - Multi-stage processing (tokenize → parse → execute)
- Formal grammars - BNF notation and recursive descent parsing
- Data structures - AST, linked lists for environment management
- System programming - Fork/exec/pipe/signal handling
- Process management - Parent-child process communication and synchronization
- String processing - Quote handling, variable expansion, escape sequences
- Memory management - Proper allocation/deallocation patterns in C
- This is an educational project from 42 School (French programming school)
- Implements core bash features; some advanced features (command history search, job control) are intentionally excluded
- All code follows 42 School coding standards (Norm)
- Custom library (libft) is used from previous 42 projects
- mkhavari - Main implementation
- asemykin - Co-developer
This project is part of the 42 School curriculum.
- ✅ Built-in commands (cd, pwd, echo, export, unset, env, exit)
- ✅ I/O redirections (<, >, >>, <<)
- ✅ Pipes (|) and command chaining
- ✅ Logical operators (&&, ||)
- ✅ Subshells and parentheses
- ✅ Environment variables and expansion (
$VAR, $ ?, $$) - ✅ Quote handling (single, double, escaping)
- ✅ Signal handling (SIGINT, SIGQUIT)
- ✅ Here-documents
- ✅ Proper exit status management
- ✅ Memory safety and cleanup
- ✅ Error handling and reporting
Happy shell scripting! 🎉