A neural network implementation for handwritten digit recognition using the MNIST dataset. This project implements a Multi-Layer Perceptron (MLP) from scratch using NumPy for educational purposes.
DigitDecoder is a complete machine learning pipeline that:
- Loads and preprocesses the MNIST handwritten digit dataset
- Implements a 2-layer neural network with ReLU and Softmax activations
- Trains the model using mini-batch gradient descent
- Evaluates model performance on test data
- Visualizes predictions with sample images
- Saves trained model weights for future use
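As a rough illustration of the preprocessing step, the sketch below flattens 28×28 images, scales pixel values to [0, 1], and one-hot encodes the labels. The function name `preprocess`, the division by 255, and the one-hot encoding are assumptions made for illustration and may differ from what `data/mnist_loader.py` actually does. Synthetic arrays stand in for MNIST so the snippet runs without downloading the dataset.

```python
import numpy as np

def preprocess(images, labels, num_classes=10):
    """Flatten images, scale pixels to [0, 1], and one-hot encode labels.

    Hypothetical helper; the real loader may preprocess differently.
    """
    x = images.reshape(images.shape[0], -1).astype(np.float32) / 255.0
    y = np.eye(num_classes, dtype=np.float32)[labels]
    return x, y

# Synthetic stand-in for MNIST: five random 28x28 grayscale images.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)
fake_labels = np.array([3, 0, 9, 1, 7])

x, y = preprocess(fake_images, fake_labels)
print(x.shape, y.shape)
```

With the real dataset, the arrays returned by `tensorflow.keras.datasets.mnist.load_data()` could be passed in place of the synthetic ones.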
The neural network consists of:
- Input Layer: 784 neurons (28×28 flattened pixel values)
- Hidden Layer: 128 neurons with ReLU activation
- Output Layer: 10 neurons with Softmax activation (for 10 digit classes)
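The architecture above can be sketched as a forward pass in NumPy. This is a minimal sketch, not the code in `model/mlp.py`: the names `init_params` and `forward`, the parameter dictionary layout, and the He-style weight initialization are all assumptions made for illustration.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def softmax(z):
    # Subtract the row-wise max before exponentiating for numerical stability.
    shifted = z - z.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def init_params(input_size=784, hidden_size=128, output_size=10, seed=0):
    # He-style initialization is an assumption; the project may use another scheme.
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, np.sqrt(2.0 / input_size), (input_size, hidden_size)),
        "b1": np.zeros(hidden_size),
        "W2": rng.normal(0.0, np.sqrt(2.0 / hidden_size), (hidden_size, output_size)),
        "b2": np.zeros(output_size),
    }

def forward(params, x):
    # Hidden layer: linear transformation followed by ReLU.
    a1 = relu(x @ params["W1"] + params["b1"])
    # Output layer: linear transformation followed by Softmax.
    return softmax(a1 @ params["W2"] + params["b2"])

params = init_params()
batch = np.random.default_rng(1).random((4, 784))  # four fake flattened images
probs = forward(params, batch)
print(probs.shape)
```

Each row of `probs` holds the predicted probabilities for the 10 digit classes, so `probs.argmax(axis=1)` gives the predicted digit for each image.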
DigitDecoder/
├── data/
│ ├── __init__.py
│ └── mnist_loader.py # MNIST data loading and preprocessing
├── model/
│ ├── __init__.py
│ └── mlp.py # Neural network implementation
├── train/
│ ├── __init__.py
│ └── trainer.py # Training logic and loss function
├── evaluate/
│ ├── __init__.py
│ └── evaluator.py # Model evaluation utilities
├── utils/
│ ├── __init__.py
│ └── helpers.py # Visualization and helper functions
├── weights/ # Directory for saved model weights
├── main.py # Main execution script
├── LICENSE
└── README.md
pip install numpy tensorflow matplotlib
- Clone the repository:
git clone https://github.com/KartikAg13/DigitDecoder.git
cd DigitDecoder
- Create the weights directory:
mkdir weights
Run the complete training and evaluation pipeline:
python main.py
This will:
- Load and preprocess the MNIST dataset
- Create and train the neural network
- Display training and test accuracy
- Generate prediction visualizations
- Save trained weights to the weights/ directory
You can modify the following hyperparameters in main.py:
- hidden_size: Number of neurons in the hidden layer (default: 128)
- epochs: Maximum number of training epochs (default: 1000)
- learning_rate: Learning rate for gradient descent (default: 0.1)
- batch_size: Mini-batch size for training (default: 32)
- epsilon: Early stopping threshold (default: 1e-4)
The model typically achieves:
- Training Accuracy: ~98%
- Test Accuracy: ~97%
Training includes early stopping when the loss falls below the epsilon threshold.
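The early stopping rule can be sketched as a loop that ends once the epoch loss drops below `epsilon`. This reads the sentence above literally, and the actual check in `train/trainer.py` may differ. `train_with_early_stopping`, `run_epoch`, and the toy `fake_epoch` are hypothetical names used only for this illustration.

```python
def train_with_early_stopping(run_epoch, epochs=1000, epsilon=1e-4):
    """Call run_epoch() until its loss drops below epsilon or epochs run out."""
    history = []
    for _ in range(epochs):
        loss = run_epoch()
        history.append(loss)
        if loss < epsilon:
            break  # stop early: the loss is already below the threshold
    return history

# Toy stand-in for a real training epoch: the loss halves on every call.
state = {"loss": 1.0}

def fake_epoch():
    state["loss"] *= 0.5
    return state["loss"]

history = train_with_early_stopping(fake_epoch)
print(f"stopped after {len(history)} epochs, final loss {history[-1]:.2e}")
```

With the defaults listed above, training therefore runs for at most 1000 epochs and may finish much sooner.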
The project generates predictions.png showing sample test images with their predicted and true labels for visual verification of model performance.
Forward propagation:
- Linear transformation followed by ReLU activation in the hidden layer
- Linear transformation followed by Softmax activation in the output layer
Backward propagation:
- Computes gradients using the chain rule
- Updates weights and biases using gradient descent
Training:
- Mini-batch gradient descent
- Data shuffling each epoch
- Cross-entropy loss function
- Early stopping mechanism
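The items above can be combined into a single training epoch. The sketch below is one possible arrangement under stated assumptions, not the project's trainer: `train_epoch` and the parameter dictionary layout are hypothetical, and a small synthetic classification problem replaces MNIST so the example runs quickly. It relies on the standard result that, for Softmax combined with cross-entropy, the gradient of the loss with respect to the output logits is `probs - y`.

```python
import numpy as np

def softmax(z):
    shifted = z - z.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(probs, y_onehot):
    # Mean negative log-likelihood; the small constant guards against log(0).
    return -np.mean(np.sum(y_onehot * np.log(probs + 1e-12), axis=1))

def train_epoch(params, x, y_onehot, learning_rate=0.1, batch_size=32, rng=None):
    """Run one epoch of mini-batch gradient descent and return the full-data loss."""
    if rng is None:
        rng = np.random.default_rng()
    order = rng.permutation(x.shape[0])  # shuffle the data each epoch
    for start in range(0, x.shape[0], batch_size):
        idx = order[start:start + batch_size]
        xb, yb = x[idx], y_onehot[idx]
        # Forward pass: linear -> ReLU -> linear -> Softmax.
        z1 = xb @ params["W1"] + params["b1"]
        a1 = np.maximum(0.0, z1)
        probs = softmax(a1 @ params["W2"] + params["b2"])
        # Backward pass: chain rule, averaged over the mini-batch.
        dz2 = (probs - yb) / xb.shape[0]
        dW2, db2 = a1.T @ dz2, dz2.sum(axis=0)
        dz1 = (dz2 @ params["W2"].T) * (z1 > 0)  # ReLU derivative
        dW1, db1 = xb.T @ dz1, dz1.sum(axis=0)
        # Gradient descent update.
        params["W1"] -= learning_rate * dW1
        params["b1"] -= learning_rate * db1
        params["W2"] -= learning_rate * dW2
        params["b2"] -= learning_rate * db2
    a1 = np.maximum(0.0, x @ params["W1"] + params["b1"])
    return cross_entropy(softmax(a1 @ params["W2"] + params["b2"]), y_onehot)

# Synthetic 3-class problem (20 features) standing in for MNIST.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 20))
labels = np.argmax(x @ rng.normal(size=(20, 3)), axis=1)
y_onehot = np.eye(3)[labels]
params = {
    "W1": rng.normal(0.0, np.sqrt(2.0 / 20), (20, 16)),
    "b1": np.zeros(16),
    "W2": rng.normal(0.0, np.sqrt(2.0 / 16), (16, 3)),
    "b2": np.zeros(3),
}
losses = [train_epoch(params, x, y_onehot, rng=rng) for _ in range(20)]
print(f"loss after epoch 1: {losses[0]:.3f}, after epoch 20: {losses[-1]:.3f}")
```

Dividing `dz2` by the batch size keeps the gradient an average over the mini-batch, so the same learning rate behaves similarly across different `batch_size` values.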
- Pure NumPy Implementation: Educational focus on understanding neural networks
- Modular Design: Clean separation of concerns across modules
- Comprehensive Pipeline: From data loading to model evaluation
- Visualization: Visual feedback on model predictions
- Model Persistence: Save/load trained weights
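Model persistence could be handled with NumPy's `.npz` format, as in the sketch below. The helper names `save_weights` and `load_weights`, the file name `mlp_weights.npz`, and the dictionary of parameters are assumptions for illustration; the project may store its weights differently. A temporary directory is used here in place of `weights/` so the example leaves no files behind.

```python
import os
import tempfile

import numpy as np

def save_weights(params, path):
    # Store each array under its dictionary key in a single .npz archive.
    np.savez(path, **params)

def load_weights(path):
    # Read every array back into a plain dictionary.
    with np.load(path) as data:
        return {name: data[name] for name in data.files}

rng = np.random.default_rng(0)
params = {
    "W1": rng.normal(size=(784, 128)),
    "b1": np.zeros(128),
    "W2": rng.normal(size=(128, 10)),
    "b2": np.zeros(10),
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mlp_weights.npz")  # hypothetical file name
    save_weights(params, path)
    restored = load_weights(path)

print(sorted(restored))
```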
To contribute:
- Fork the repository
- Create a feature branch (git checkout -b feature/improvement)
- Commit your changes (git commit -am 'Add new feature')
- Push to the branch (git push origin feature/improvement)
- Create a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- MNIST dataset provided by Yann LeCun and Corinna Cortes
- TensorFlow/Keras for convenient dataset loading
- NumPy for numerical computations
This implementation is designed for learning and understanding the fundamentals of neural networks. For production use, consider an established framework such as TensorFlow or PyTorch, which offer optimized implementations and additional features.