A three-phase study on detecting pneumonia from chest X-ray images, progressing from classical machine learning with handcrafted features through deep learning, transfer learning, and sequential/transformer models.
.
├── notebooks/
│ └── Medical_Diagnosis.ipynb
├── reports/
│ └── 20230072_20230311_20230444_20231231_20231085.docx
├── requirements.txt
└── README.md
- Clone this repository.
- Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Download the dataset:
python -c "import kagglehub; print(kagglehub.dataset_download('paultimothymooney/chest-xray-pneumonia'))"  # prints the local dataset path
- Launch the notebook:
jupyter notebook notebooks/Medical_Diagnosis.ipynb

Phase 1: Classical ML. Features are extracted with LBP (26-dim) and HOG (34,596-dim) and concatenated into a 34,622-dimensional vector per image. Four classifiers (KNN, Naive Bayes, Random Forest, and Gradient Boosting) are each evaluated on the original, PCA-reduced, and LDA-reduced features.
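The feature dimensions quoted for Phase 1 can be sanity-checked with a little arithmetic. The sketch below assumes settings the text does not give: 256x256 input images, HOG with 9 orientations, 8x8-pixel cells, and 2x2-cell blocks sliding one cell at a time, plus rotation-invariant uniform LBP with 24 sampling points. These values are a guess that happens to reproduce the 26 and 34,596 figures; the notebook may use different settings, and the helper names are hypothetical.

```python
def hog_length(image_size, cell=8, block=2, orientations=9):
    """Length of a HOG descriptor for a square image, block stride of one cell."""
    cells_per_side = image_size // cell
    blocks_per_side = cells_per_side - block + 1
    # Each block contributes block*block cell histograms of `orientations` bins.
    return blocks_per_side ** 2 * block ** 2 * orientations


def uniform_lbp_bins(points):
    """Histogram bins for rotation-invariant uniform LBP: P + 1 uniform codes plus one catch-all."""
    return points + 2


# Assumed settings (not stated in the source).
hog_dim = hog_length(256)
lbp_dim = uniform_lbp_bins(24)
total_dim = hog_dim + lbp_dim

print(hog_dim, lbp_dim, total_dim)  # 34596 26 34622
```

Under these assumptions the HOG part accounts for over 99.9% of the combined vector, which may be one reason the PCA and LDA reductions are worth comparing.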
Phase 2: Deep Learning. A custom CNN (MedVision) is trained from scratch, with a dropout ablation and an Adam vs. SGD optimizer comparison. An autoencoder is trained on normal images only for unsupervised anomaly detection. ResNet50, VGG16, and EfficientNetB0 are fine-tuned on the dataset via transfer learning.
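Because the autoencoder only ever sees normal images, pneumonia scans should reconstruct less faithfully, and the reconstruction error can serve as an anomaly score. The sketch below is one possible way to turn that idea into a decision rule, not the notebook's implementation: it assumes per-image mean squared error as the score and a threshold at the 95th percentile of errors on normal images. The function names, the percentile, and the synthetic arrays standing in for model output are all hypothetical.

```python
import numpy as np


def reconstruction_error(images, reconstructions):
    """Per-image mean squared error between inputs and autoencoder outputs."""
    diff = images.astype(np.float64) - reconstructions.astype(np.float64)
    return (diff ** 2).reshape(len(images), -1).mean(axis=1)


def fit_threshold(normal_errors, percentile=95.0):
    """Pick a cutoff from errors on normal images only (assumed rule)."""
    return float(np.percentile(normal_errors, percentile))


def flag_anomalies(errors, threshold):
    """True where the reconstruction error exceeds the cutoff."""
    return errors > threshold


# Synthetic stand-ins for real images and model reconstructions.
rng = np.random.default_rng(0)
normal = rng.random((20, 8, 8))
normal_recon = normal + rng.normal(0, 0.01, normal.shape)        # reconstructs well
abnormal = rng.random((5, 8, 8))
abnormal_recon = abnormal + rng.normal(0, 0.3, abnormal.shape)   # reconstructs poorly

threshold = fit_threshold(reconstruction_error(normal, normal_recon))
flags = flag_anomalies(reconstruction_error(abnormal, abnormal_recon), threshold)
print(threshold, flags)
```

The raw error is a continuous score, so it can be passed to a ROC-AUC computation without choosing a threshold at all; a cutoff is only needed for threshold-dependent metrics such as F1.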
Phase 3: Sequential and Transformer Models. Images are treated as row-by-row sequences (128 rows × 128 pixels per row) and evaluated with SimpleRNN, LSTM, and GRU architectures. A Vision Transformer (ViT-Base, google/vit-base-patch16-224-in21k) is fine-tuned with the HuggingFace Trainer, with attention-map visualization for interpretability.
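In the row-by-row framing, each image becomes a sequence of 128 time steps with 128 features per step, which matches the (batch, timesteps, features) layout that recurrent layers typically expect. The sketch below illustrates that reshaping, assuming 8-bit grayscale input scaled to [0, 1]; `to_row_sequences` and `vit_token_count` are hypothetical helpers, not functions from the notebook. The second helper shows the sequence length a patch16-224 ViT works with: 14 x 14 patches plus one class token.

```python
import numpy as np


def to_row_sequences(images):
    """Convert (batch, H, W) or (batch, H, W, 1) uint8 images to (batch, H, W) float sequences in [0, 1]."""
    arr = np.asarray(images, dtype=np.float32)
    if arr.ndim == 4 and arr.shape[-1] == 1:
        arr = arr[..., 0]  # drop the singleton channel axis
    if arr.ndim != 3:
        raise ValueError(f"expected grayscale images, got shape {arr.shape}")
    return arr / 255.0  # each row is one time step, each pixel one feature


def vit_token_count(image_size=224, patch_size=16):
    """Number of tokens for a square image: one per patch plus the class token."""
    patches_per_side = image_size // patch_size
    return patches_per_side ** 2 + 1


batch = np.random.default_rng(0).integers(0, 256, size=(4, 128, 128, 1), dtype=np.uint8)
sequences = to_row_sequences(batch)

print(sequences.shape)    # (4, 128, 128)
print(vit_token_count())  # 197
```

The recurrent models read 128 steps in a fixed top-to-bottom order, while the ViT attends over all 197 tokens at once, which is what makes its attention maps usable for visualization.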
| Model | Accuracy | F1 Score | ROC-AUC |
|---|---|---|---|
| ResNet50 (Transfer Learning) | 0.9519 | 0.9633 | 0.9892 |
| EfficientNetB0 (Transfer Learning) | 0.9423 | 0.9560 | 0.9861 |
| VGG16 (Transfer Learning) | 0.9359 | 0.9512 | 0.9824 |
| Gradient Boosting (LBP+HOG) | 0.9415 | 0.9564 | 0.9819 |
| Custom CNN (Adam) | 0.9263 | 0.9462 | 0.9814 |
| Autoencoder (Unsupervised) | — | 0.8802 | 0.9318 |
ResNet50 achieves the best result on all three metrics. Notably, Gradient Boosting on LBP+HOG features outperforms the custom CNN on every metric (accuracy 0.9415 vs. 0.9263) and edges out VGG16 on accuracy and F1, which suggests that handcrafted features remain competitive with learned ones for this task.
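For readers comparing the Accuracy and F1 columns, the sketch below shows how both follow from confusion-matrix counts, with pneumonia treated as the positive class (an assumption; the text does not say which class is positive). The counts are invented for illustration and are not results from this study, and `classification_metrics` is a hypothetical helper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Illustrative counts only: 95 positive and 55 negative cases.
metrics = classification_metrics(tp=90, fp=10, fn=5, tn=45)
print({name: round(value, 4) for name, value in metrics.items()})
# {'accuracy': 0.9, 'precision': 0.9, 'recall': 0.9474, 'f1': 0.9231}
```

When the positive class is the majority, F1 on that class can sit above accuracy, as it does in every row of the table.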
Chest X-Ray Images (Pneumonia) — 5,856 images across train, test, and validation splits. The training set is imbalanced (74.3% pneumonia).
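One common response to an imbalance like this is to weight classes inversely to their frequency during training. Whether the notebook does so is not stated; the sketch below only shows what such weights would look like for the 74.3% / 25.7% split, using the same heuristic as scikit-learn's `class_weight="balanced"` (total samples divided by number of classes times class count), written here in terms of class fractions. `balanced_class_weights` and the label strings are hypothetical.

```python
def balanced_class_weights(class_fractions):
    """Weight each class by 1 / (n_classes * fraction), so rarer classes count more."""
    n_classes = len(class_fractions)
    return {label: 1.0 / (n_classes * fraction) for label, fraction in class_fractions.items()}


# Fractions taken from the training-set imbalance quoted above.
weights = balanced_class_weights({"PNEUMONIA": 0.743, "NORMAL": 0.257})
print({label: round(weight, 3) for label, weight in weights.items()})
# {'PNEUMONIA': 0.673, 'NORMAL': 1.946}
```

With these weights, each class contributes equally to the total loss in expectation, so a model gains less from simply predicting the majority class.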