MNIST classification implemented from scratch with three classifiers: Naïve Bayes, Decision Tree, and Linear SVM. Only numpy, matplotlib, seaborn, and sklearn.datasets.fetch_openml (for loading data) are used — every model is built by hand.
| File | Description |
|---|---|
| phase1_binary_classification.ipynb | Phase 1 — Binary Classification. Loops over each digit d = 0..9 and trains a one-vs-rest classifier (digit d vs all others) using all three models. |
| phase2_multiclass_classification.ipynb | Phase 2 — Multi-Class Classification. True 10-class classification on digits 0–9. Naïve Bayes uses one Gaussian per class, Decision Tree uses weighted multi-class Gini, Linear SVM uses 10 one-vs-rest binary classifiers. |
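
The one-vs-rest setup used in both phases can be sketched roughly as follows. This is a minimal illustration, not the notebooks' code: the function names `one_vs_rest_labels` and `predict_from_scores` are hypothetical, the ±1 label encoding is an assumption (the notebooks may use 0/1), and combining the ten binary scores by argmax is a common choice that the description above does not confirm.

```python
import numpy as np

def one_vs_rest_labels(y, digit):
    """Binary targets for 'digit vs all others': +1 for the digit, -1 otherwise.

    The +1/-1 encoding is an assumption made for this sketch.
    """
    return np.where(y == digit, 1, -1)

def predict_from_scores(scores):
    """Combine per-class scores of shape (n_samples, n_classes) by argmax.

    Assumed combination rule: the class whose binary model gives the
    highest score wins.
    """
    return np.argmax(scores, axis=1)

# Toy labels standing in for MNIST digits.
y = np.array([0, 3, 3, 7, 9])
labels_for_3 = one_vs_rest_labels(y, 3)

# Toy decision scores for 2 samples and 3 classes.
scores = np.array([[0.2, -1.0, 1.5],
                   [2.0, 0.1, -0.3]])
pred = predict_from_scores(scores)
```

In Phase 1 only the label construction is needed, once per digit d. In Phase 2 the ten binary score vectors would be stacked column-wise before taking the argmax.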
| Model | Phase 1 (Binary) | Phase 2 (Multiclass) |
|---|---|---|
| Naïve Bayes | Gaussian likelihood, log-odds threshold | One Gaussian per class, argmax log-likelihood |
| Decision Tree | Weighted Gini, max_depth=12 | Multi-class weighted Gini, leaf returns class 0–9 |
| Linear SVM | Hinge loss with class weights | 10 one-vs-rest binary SVMs |
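
The quantities named in the table can be illustrated with a short numpy sketch. It rests on several assumptions: "weighted Gini" is read here as the size-weighted average of the two child impurities of a split, the hinge loss weights each sample by a per-class weight, and the variance passed to the Gaussian already includes any smoothing term. All function names are hypothetical and the notebooks may define these differently.

```python
import numpy as np

def gaussian_log_likelihood(X, mean, var, log_prior):
    """Per-sample log p(x | class) + log p(class), features treated as independent.

    `var` is assumed to be strictly positive (e.g. already smoothed).
    """
    log_pdf = -0.5 * (np.log(2.0 * np.pi * var) + (X - mean) ** 2 / var)
    return log_pdf.sum(axis=1) + log_prior

def gini(y, n_classes):
    """Gini impurity 1 - sum_k p_k^2 of an integer label array."""
    if y.size == 0:
        return 0.0
    p = np.bincount(y, minlength=n_classes) / y.size
    return 1.0 - np.sum(p ** 2)

def weighted_gini(y_left, y_right, n_classes):
    """Impurity of a split: child impurities averaged by child size (assumed reading)."""
    n = y_left.size + y_right.size
    return (y_left.size * gini(y_left, n_classes)
            + y_right.size * gini(y_right, n_classes)) / n

def weighted_hinge_loss(w, b, X, y, class_weight):
    """Mean of weight * max(0, 1 - y * (Xw + b)) with labels y in {-1, +1}.

    `class_weight` maps each label to its weight; no regularisation term is included.
    """
    margins = y * (X @ w + b)
    weights = np.where(y == 1, class_weight[1], class_weight[-1])
    return np.mean(weights * np.maximum(0.0, 1.0 - margins))

# A perfectly separating split versus an uninformative one.
pure = weighted_gini(np.array([0, 0]), np.array([1, 1]), n_classes=2)
mixed = weighted_gini(np.array([0, 1]), np.array([0, 1]), n_classes=2)

# One misclassified negative sample contributes to the hinge loss.
loss = weighted_hinge_loss(
    w=np.array([1.0]), b=0.0,
    X=np.array([[2.0], [0.5], [-2.0]]),
    y=np.array([1, -1, -1]),
    class_weight={1: 2.0, -1: 1.0},
)

# Standard normal evaluated at its mean, with a zero log-prior.
ll = gaussian_log_likelihood(np.array([[0.0]]), np.array([0.0]), np.array([1.0]), 0.0)
```

For the binary Naïve Bayes case, the log-odds would be the difference between two such log-likelihoods; for the multiclass case, the prediction would be the argmax over the ten per-class values.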
MNIST (mnist_784, 70,000 × 784) is loaded via sklearn.datasets.fetch_openml. The data is split into stratified train / validation / test sets, pixel values are normalised to [0, 1], and the features are additionally standardised for the SVM.
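
The preprocessing steps above might look roughly like this in plain numpy. This is a sketch under stated assumptions: the split fractions are illustrative (the actual ratios are not given here), the helpers `stratified_split` and `standardise` are hypothetical names, standardisation statistics are taken from the training set only, and a small synthetic array stands in for MNIST so the example runs without a download.

```python
import numpy as np

def stratified_split(y, val_frac, test_frac, seed=0):
    """Split indices so each class keeps the same proportions in every subset."""
    rng = np.random.default_rng(seed)
    train_idx, val_idx, test_idx = [], [], []
    for c in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == c))
        n_test = int(round(test_frac * idx.size))
        n_val = int(round(val_frac * idx.size))
        test_idx.append(idx[:n_test])
        val_idx.append(idx[n_test:n_test + n_val])
        train_idx.append(idx[n_test + n_val:])
    return (np.concatenate(train_idx),
            np.concatenate(val_idx),
            np.concatenate(test_idx))

def standardise(X_train, X_other, eps=1e-8):
    """Zero-mean, unit-variance scaling using training statistics only.

    `eps` guards against constant features such as always-black border pixels.
    """
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0) + eps
    return (X_train - mean) / std, (X_other - mean) / std

# Synthetic stand-in for MNIST: 100 samples, 4 "pixels" in 0..255, 10 per class.
rng = np.random.default_rng(0)
X = rng.integers(0, 256, size=(100, 4)).astype(np.float64)
y = np.repeat(np.arange(10), 10)

X = X / 255.0  # normalise to [0, 1]
train_idx, val_idx, test_idx = stratified_split(y, val_frac=0.1, test_frac=0.2)
X_train_std, X_val_std = standardise(X[train_idx], X[val_idx])
```

Computing the mean and standard deviation on the training subset alone avoids leaking validation or test statistics into the SVM's inputs.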
pip install numpy matplotlib seaborn scikit-learn
jupyter notebook

Open either notebook and run all cells.