Skip to content

JavadJan/minishell

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

🐚 Minishell

A fully-functional Unix shell implementation in C that replicates core bash functionality. This project demonstrates advanced parsing techniques using Abstract Syntax Trees (AST) and BNF (Backus-Naur Form) Grammar to build a complete command interpreter with pipes, redirections, and process management.

Project: 42 School Rank 03 | Language: C | Status: Complete


📋 Table of Contents


🎯 Overview

Minishell is a command-line shell interpreter that parses and executes shell commands with full support for:

  • Pipes and process chains (|)
  • Input/Output redirection (<, >, >>, <<)
  • Logical operators (&&, ||)
  • Subshells and command grouping
  • Environment variable expansion
  • Signal handling for proper shell behavior
  • Built-in commands (cd, pwd, echo, export, unset, env, exit)

The shell uses a sophisticated three-stage pipeline:

  1. Tokenization - Breaks input into tokens respecting quotes and operators
  2. Parsing - Builds an Abstract Syntax Tree following BNF grammar rules
  3. Execution - Traverses the AST with fork/exec for process management

✨ Features

Built-in Commands

Command Description
cd [path] Change directory; updates PWD/OLDPWD
pwd Print working directory
echo [-n] [args...] Print arguments (with optional -n flag)
export [var=value] Create/modify environment variables
unset [var...] Remove environment variables
env Display all environment variables
exit [code] Exit shell with optional exit code

I/O Redirection

<          # Input redirection (read from file)
>          # Output redirection (overwrite file)
>>         # Append output redirection
<<DELIM    # Here-document (multi-line input until DELIM)

Examples:

cat < input.txt                    # Redirect input
ls > output.txt                    # Redirect output
echo "text" >> file.txt            # Append output
cat << EOF                         # Here-document
input here
EOF

Command Operators & Control Flow

|          # Pipe: pass stdout of left to stdin of right
&&         # AND: execute right only if left succeeded
||         # OR: execute right only if left failed
;          # Semicolon: sequential command execution

Examples:

ls | grep .txt                     # Pipe chain
cd /tmp && ls                      # AND operator
invalid_cmd || echo "Error"        # OR operator
echo "first"; echo "second"        # Semicolon

Quote Handling & Variable Expansion

'string'   # Single quotes: literal (no expansion)
"string"   # Double quotes: variable expansion enabled
$VAR       # Environment variable expansion
$?         # Last command exit status
$$         # Current shell PID

Examples:

echo $HOME                         # Expands to home directory
echo "$USER"                       # Variable expansion in quotes
echo '$USER'                       # Literal $USER (no expansion)
echo "Exit code: $?"               # Display last exit status

Advanced Features

  • Subshells (...) - Execute commands in child process
  • Here-documents - Multi-line input with optional variable expansion
  • Command chaining - Complex pipelines with multiple operators
  • Signal handling - Proper SIGINT/SIGQUIT behavior
  • Exit status tracking - Proper propagation through pipes and operators

🏗️ Architecture

Three-Stage Pipeline

┌─────────────┐      ┌──────────┐      ┌───────────┐
│   INPUT     │  →   │ TOKENIZE │  →   │  PARSE    │  →  ┌─────────┐
│  Raw Shell  │      │ (Lexer)  │      │  (BNF)    │     │   AST   │
│  Syntax     │      │          │      │           │     │  Tree   │
└─────────────┘      └──────────┘      └───────────┘     └─────────┘
                                                                │
                                                                ↓
┌─────────────┐      ┌──────────┐      ┌───────────┐     ┌─────────┐
│  EXECUTE    │  ←   │ EVALUATE │  ←   │ TRAVERSE  │  ←  │  Final  │
│ (Fork/Exec) │      │ (Expand) │      │   (DFS)   │     │  AST    │
└─────────────┘      └──────────┘      └───────────┘     └─────────┘

Module Organization

minishell/
├── tokens/              # Tokenization (lexical analysis)
│   ├── tokenize.c       # Main tokenizer
│   ├── fill_tokens.c    # Token classification
│   └── convert_env_tokens.c  # Variable expansion
│
├── parse/               # Parsing (syntactic analysis)
│   ├── parse_main.c     # Entry point
│   ├── parse_cmd_line.c # Top-level command parsing
│   ├── parse_pipe_cmd.c # Pipe operator handling
│   ├── parse_compound.c # Logical operators (&&, ||)
│   ├── parse_redirection.c  # Redirection parsing
│   └── parse_subshell.c # Subshell parsing
│
├── execute_cmd/         # Execution (semantic analysis & runtime)
│   ├── exe_ast.c        # AST traversal entry point
│   ├── exe_pipe.c       # Pipe execution
│   ├── exe_simple_cmd.c # Command execution
│   ├── exe_redirections.c   # Redirection handling
│   ├── exe_built_in.c   # Built-in command dispatch
│   └── exe_here_doc.c   # Here-document processing
│
├── env/                 # Environment variable management
│   ├── init_env.c       # Environment initialization
│   ├── update_env.c     # Variable updates
│   └── build_envp.c     # Create envp[] array
│
├── evaluate/            # Expression evaluation & utilities
│   ├── evals_1.c/.c     # Quote and escape handling
│   └── utils.c          # Helper functions
│
├── minishell.h          # Main header with data structures
├── libft/               # Custom library (from 42 School)
└── Makefile             # Build configuration

📖 BNF Grammar

The shell syntax is defined using Backus-Naur Form (BNF), which formally specifies the language structure. This enables predictable parsing and execution.

Grammar Rules

<program>          ::= <command_line> | ε
<command_line>     ::= <compound_command> (";" <compound_command>)*
<compound_command> ::= <or_expr>
<or_expr>          ::= <and_expr> ("||" <and_expr>)*
<and_expr>         ::= <pipe_expr> ("&&" <pipe_expr>)*
<pipe_expr>        ::= <command> ("|" <command>)*
<command>          ::= <simple_command> | <subshell>
<subshell>         ::= "(" <command_line> ")"
<simple_command>   ::= <word>+ <redirections>*
<redirections>     ::= <redirection> | ε
<redirection>      ::= (">" | ">>" | "<" | "<<") <filename>
<word>             ::= <quoted_string> | <unquoted_string>
<quoted_string>    ::= '"' <dquoted_content> '"' | "'" <squoted_content> "'"

Precedence & Associativity (Low to High)

  1. Semicolon (;) - Command separator (lowest precedence)
  2. OR (||) - Logical OR operator
  3. AND (&&) - Logical AND operator
  4. Pipe (|) - Process chaining
  5. Redirections (<, >, >>, <<) - I/O operations (highest precedence)

Example Parse Tree

Input: echo hello > file.txt | grep e

            PIPE
           /    \
       REDIR      COMMAND
       /   \         |
   COMMAND  file   grep e
     |
  echo hello

🌳 AST Implementation

The Abstract Syntax Tree (AST) is the internal representation of parsed commands, enabling clean separation between syntax analysis and execution logic.

AST Node Types

typedef enum node_type_e {
    NODE_COMMAND,           // Simple command execution
    NODE_PIPE,              // Pipe operator (|)
    NODE_SEQUENCE_AND,      // AND operator (&&)
    NODE_SEQUENCE_OR,       // OR operator (||)
    NODE_SEMI,              // Semicolon separator (;)
    NODE_REDIRECT_OUT,      // Output redirect (>)
    NODE_REDIRECT_OUT_APPEND,  // Append redirect (>>)
    NODE_REDIRECT_IN,       // Input redirect (<)
    NODE_REDIRECT_HEREDOC,  // Here-document (<<)
    NODE_SUBSHELL,          // Subshell (...)
    NODE_VAR_ASSIGNMENT,    // Variable assignment
    NODE_VAR_EXPANSION,     // Variable expansion
    NODE_QUOTED_STRING,     // Quoted string
} t_node_type;

typedef struct ast_node {
    t_node_type type;
    struct ast_node *left;      // Left subtree
    struct ast_node *right;     // Right subtree
    char **args;                // Command arguments
    char **filename;            // Redirection target
} t_ast;

AST Construction Algorithm

  1. Tokenize input into token stream
  2. Parse with recursive descent:
    • Start with lowest precedence rule (;)
    • Each production tries to parse higher precedence operators
    • Terminal symbols become command nodes
  3. Build binary tree where:
    • Internal nodes = operators (pipes, &&, ||, etc.)
    • Leaf nodes = simple commands

Execution Strategy

The AST is executed via depth-first traversal with proper process management:

int exe_ast(t_ast **root, t_env_s *env_s) {
    if (!root) return 0;
    
    switch(root->type) {
        case NODE_COMMAND:
            return exe_simple_cmd(root, env_s);   // Fork & exec
        case NODE_PIPE:
            return exe_pipe(root, env_s);         // Create pipe, fork twice
        case NODE_SEQUENCE_AND:
            left_status = exe_ast(&root->left);
            if (left_status == 0)
                return exe_ast(&root->right);     // Only exec right if left ok
        case NODE_SEQUENCE_OR:
            left_status = exe_ast(&root->left);
            if (left_status != 0)
                return exe_ast(&root->right);     // Only exec right if left failed
        // ... other node types
    }
}

🔨 Build Instructions

Requirements

  • C compiler (gcc/clang)
  • readline development library
  • Unix-like environment (Linux, macOS)
  • GNU Make

Compile on Linux

# Install readline (Ubuntu/Debian)
sudo apt-get install libreadline-dev

# Build the project
make

# Run minishell
./minishell

Compile on macOS

# Install readline (Homebrew)
brew install readline

# Build with readline path
make

# Run minishell
./minishell

Build Targets

make              # Compile minishell
make clean        # Remove object files
make fclean       # Remove objects and executable
make re           # Full rebuild (fclean + all)
make valgrind     # Run with memory leak detection

💻 Usage

Basic Command Execution

minishell> echo "Hello, World!"
Hello, World!

minishell> ls -la
total 120
drwxr-xr-x   13 user  wheel   416 Jul 08 19:31 .
drwxr-xr-x   34 user  wheel  1088 Jun 01 12:45 ..
...

Pipes & Redirection

# Simple pipe
minishell> cat file1.txt | grep "search" | wc -l

# Output redirection
minishell> echo "Hello" > output.txt
minishell> cat output.txt
Hello

# Append redirection
minishell> echo "World" >> output.txt
minishell> cat output.txt
Hello
World

# Input redirection
minishell> sort < unsorted.txt

# Here-document
minishell> cat << EOF
This is a
multi-line
input
EOF
This is a
multi-line
input

Command Operators

# AND operator
minishell> mkdir test_dir && cd test_dir
minishell> pwd
/path/to/test_dir

# OR operator
minishell> cd /nonexistent || echo "Failed"
Failed

# Semicolon (sequential execution)
minishell> echo "first"; echo "second"
first
second

# Complex chains
minishell> echo "data" | grep d && echo "Success" || echo "Fail"
data
Success

Subshells

# Subshell isolation
minishell> (cd /tmp && ls)
# Lists /tmp contents but returns to original directory

minishell> echo $SHELL; (export SHELL="/bin/dash"; echo $SHELL); echo $SHELL
/bin/bash          # Original
/bin/dash          # In subshell
/bin/bash          # Back to original

Environment Variables

# View all variables
minishell> env

# Export new variable
minishell> export MY_VAR="my_value"
minishell> echo $MY_VAR
my_value

# Unset variable
minishell> unset MY_VAR
minishell> echo $MY_VAR
$MY_VAR            # Not expanded (empty)

# Special variables
minishell> echo $$               # Current shell PID
12345
minishell> false; echo $?        # Last exit status
1

Built-in Commands

# cd with directory change
minishell> cd /tmp
minishell> pwd
/tmp

# echo with options
minishell> echo -n "No newline"
No newline%

# export and display
minishell> export VAR1=val1 VAR2=val2
minishell> export
declare -x HOME=/home/user
declare -x VAR1=val1
declare -x VAR2=val2
...

# exit with code
minishell> exit
$ echo $?
0

📁 Project Structure

minishell/
│
├── 📄 Main Files
│   ├── minishell.h              # Main header file (data structures, prototypes)
│   ├── libft.h                  # Custom library header
│   ├── Makefile                 # Build configuration
│   └── README.md                # This file
│
├── 🔤 tokens/                   # Tokenization & Lexical Analysis
│   ├── tokenize.c               # Main tokenizer - splits input by operators/spaces
│   ├── tokenize_2.c             # Additional tokenization utilities
│   ├── fill_tokens.c            # Classify tokens as operators, args, etc.
│   ├── fill_tokens_utils.c      # Token classification helpers
│   ├── token_utils.c            # General token utilities
│   ├── convert_env_tokens.c     # Handle $VAR expansion in double quotes
│   ├── utils_env_tokens.c       # Environment expansion utilities
│   ├── her_doc_token.c          # Here-document token processing
│   ├── helper_here_doc_token.c  # Here-document helpers
│   └── utils_fill_tokens.c      # Additional token utilities
│
├── 🌳 parse/                    # Parsing & Syntax Analysis (BNF Grammar)
│   ├── parse_main.c             # Entry point, initializes parsing
│   ├── parse_cmd_line.c         # Parses: command_line → compound_command*
│   ├── parse_compound.c         # Parses logical operators (&&, ||)
│   ├── parse_pipe_cmd.c         # Parses: pipe_expr → command ("|" command)*
│   ├── command.c                # Parses: command → simple_command | subshell
│   ├── simple_cmd.c             # Parses: simple_command → word* redirection*
│   ├── parse_redirection.c      # Parses: >, >>, <, << redirections
│   ├── parse_subshell.c         # Parses: subshell → "(" command_line ")"
│   ├── parse_doc.c              # Here-document parsing
│   ├── parse_doc2.c             # Additional here-document parsing
│   ├── parse_utils.c            # Parsing utilities
│   ├── create_node.c            # AST node creation and management
│   └── parse_utils.c            # Additional parsing helpers
│
├── ⚙️ execute_cmd/              # Execution & Runtime
│   ├── main.c                   # Shell main loop, signal handling
│   ├── main_exe.c               # Execution coordinator
│   ├── exe_ast.c                # AST traversal & dispatch to handlers
│   ├── exe_pipe.c               # Execute pipe chains (left | right)
│   ├── exe_simple_cmd.c         # Execute simple commands (fork + exec)
│   ├── exe_simple_cmd_2.c       # Additional simple command logic
│   ├── exe_subshell.c           # Execute subshell (child process)
│   ├── exe_redirections.c       # Handle >, >>, < redirections
│   ├── exe_redirections_2.c     # Additional redirection logic
│   ├── exe_redirections_3.c     # More redirection handling
│   ├── exe_redir_in.c           # Input redirection handler
│   ├── exe_here_doc.c           # Here-document execution
│   ├── exe_here_doc_2.c         # Additional here-doc logic
│   ├── exe_built_in.c           # Built-in command dispatcher
│   ├── exe_built_in_2.c         # Additional built-in handlers
│   ├── exe_built_in_3.c         # More built-in commands
│   ├── ft_echo.c                # echo built-in implementation
│   ├── executable.c             # Executable path resolution
│   ├── execute.c                # Core execution logic
│   ├── free_funcs.c             # Memory cleanup functions
│   ├── n_pmt_utils.c            # New prompt utilities
│   ├── open_new_promt.c         # Prompt display & input
│   └── open_new_promt.c         # Additional prompt handling
│
├── 🔧 env/                      # Environment Variable Management
│   ├── init_env.c               # Initialize environment from envp
│   ├── create_node.c            # Create environment list nodes
│   ├── add_env.c                # Add environment variables
│   ├── add_modify_env.c         # Add/modify environment entries
│   ├── update_env.c             # Update existing variables
│   ├── del_env.c                # Delete environment variables
│   └── build_envp.c             # Build envp[] array for execve
│
├── 📊 evaluate/                 # Expression & String Evaluation
│   ├── evals_1.c                # Quote handling & expansion
│   ├── evals_2.c                # Additional evaluation logic
│   ├── evals_3.c                # Expression evaluation
│   ├── utils.c                  # General utilities
│   ├── utils2.c                 # Additional utils
│   └── utils3.c                 # More utilities
│
├── 📦 obj/                      # Compiled object files (generated)
│
├── 📚 libft/                    # 42 School custom library (not shown)
│
├── 🧪 testdir1/, testdir2/      # Test directories
│   └── mycmd                    # Test executable
│
└── ⚙️ Other Scripts
    └── (Compilation artifacts)

💡 Implementation Highlights

1. Sophisticated Tokenization

  • Handles quoted strings (single/double) with escape sequences
  • Recognizes multi-character operators (>>, &&, ||, <<)
  • Validates token sequences (prevents invalid syntax)
  • Preserves whitespace context for proper word boundaries

2. Robust Parsing

  • Recursive descent parser implementing BNF grammar
  • Operator precedence correctly enforced (semicolon < OR < AND < pipe)
  • Error recovery with syntax validation
  • Quote preservation through parsing phases for later evaluation

3. Advanced Execution

  • Process management with fork/exec/waitpid
  • Bidirectional pipes with proper file descriptor handling
  • Subshell isolation via child processes
  • Signal handling for SIGINT/SIGQUIT (Ctrl+C gracefully handled)

4. Environment Handling

  • Linked-list environment for efficient var lookups
  • Variable expansion with special vars ($?, $$, $_)
  • Export management with proper scoping
  • Dynamic envp array built on-demand for execve

5. Quote & Escape Processing

  • Single quotes prevent ALL expansion (literal strings)
  • Double quotes allow $VAR and $$ expansion
  • Escape sequences properly handled (e.g., \$VAR in double quotes)
  • Quote removal performed post-expansion for clean arguments

6. Here-Document Support

  • Multi-line input capture until delimiter
  • Variable expansion support within here-docs (when appropriate)
  • Delimiter validation with optional quoting
  • Heredoc chaining in complex pipelines

🧪 Testing

Manual Testing Examples

# Test 1: Basic command execution
./minishell
minishell> ls -la

# Test 2: Pipes and redirections
minishell> echo "test" | cat > output.txt
minishell> cat output.txt

# Test 3: Complex operators
minishell> (cd /tmp && ls) || echo "Failed"

# Test 4: Variable expansion
minishell> export VAR="hello"
minishell> echo "Value: $VAR in quotes, '$VAR' in single"

# Test 5: Here-documents
minishell> cat << EOF
> line 1
> line 2
> $VAR
> EOF

# Test 6: Exit status
minishell> true && echo "Success" || echo "Failed"
minishell> false || echo "Caught error"

🎓 Learning Outcomes

This project teaches:

  1. Compiler/Interpreter design - Multi-stage processing (tokenize → parse → execute)
  2. Formal grammars - BNF notation and recursive descent parsing
  3. Data structures - AST, linked lists for environment management
  4. System programming - Fork/exec/pipe/signal handling
  5. Process management - Parent-child process communication and synchronization
  6. String processing - Quote handling, variable expansion, escape sequences
  7. Memory management - Proper allocation/deallocation patterns in C

📝 Notes

  • This is an educational project from 42 School (French programming school)
  • Implements core bash features; some advanced features (command history search, job control) are intentionally excluded
  • All code follows 42 School coding standards (Norm)
  • Custom library (libft) is used from previous 42 projects

👥 Authors

  • mkhavari - Main implementation
  • asemykin - Co-developer

📄 License

This project is part of the 42 School curriculum.


🔗 References


✅ Checklist

  • ✅ Built-in commands (cd, pwd, echo, export, unset, env, exit)
  • ✅ I/O redirections (<, >, >>, <<)
  • ✅ Pipes (|) and command chaining
  • ✅ Logical operators (&&, ||)
  • ✅ Subshells and parentheses
  • ✅ Environment variables and expansion ($VAR, $?, $$)
  • ✅ Quote handling (single, double, escaping)
  • ✅ Signal handling (SIGINT, SIGQUIT)
  • ✅ Here-documents
  • ✅ Proper exit status management
  • ✅ Memory safety and cleanup
  • ✅ Error handling and reporting

Happy shell scripting! 🎉

About

Minishell is a small Unix shell built using a BNF‑defined grammar and an AST-based execution model. It parses commands into structured AST nodes and supports pipes, redirections, built‑ins, and environment expansion with a clean, modular architecture

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors