Spam Detection with Stacking Classifier

A machine learning web application that detects spam messages using a stacking ensemble classifier. It is built with a FastAPI backend and a responsive web frontend.

Features

Advanced ML Model: Stacking classifier combining MultinomialNB, LinearSVC, SGDClassifier, and RandomForestClassifier, with LogisticRegression as the meta-learner
Optimized Performance: Hyperparameters tuned with RandomizedSearchCV
RESTful API: FastAPI backend serving predictions over HTTP
Modern UI: Responsive web interface
Real-time Analysis: Instant spam detection with visual feedback
High Accuracy: ~98% test accuracy, with tuning optimized for recall to minimize false negatives (missed spam)

Model Performance

Accuracy: ~98%
Precision: ~88%
Recall: ~93%
F1 Score: ~90%

Note: Metrics are based on evaluation on the held-out test set.
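
The gap between accuracy (~98%) and precision (~88%) is typical when spam is the minority class. The sketch below shows how the four metrics follow from a confusion matrix. The counts are hypothetical, chosen only so that the results land near the figures above; they are not the project's actual test-set counts.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    tp: spam correctly flagged, fp: ham wrongly flagged,
    fn: spam missed, tn: ham correctly passed.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for a 1000-message test set with 100 spam messages.
metrics = classification_metrics(tp=93, fp=13, fn=7, tn=887)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because recall depends only on `tp` and `fn`, optimizing for recall reduces missed spam, possibly at the cost of more ham being flagged (lower precision).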

Screenshots

Predicting SPAM

Predicting HAM

Project Structure

spam-detection-with-stacking-classifier-machine-learning/
│
├── data/
│   └── spam_dataset.csv          # Original dataset
│
├── notebooks/
│   └── spam_detection.ipynb
│
├── spam_classifier.pkl           # Trained model
├── main.py                       # FastAPI application
├── index.html                    # Frontend HTML/CSS/JS
├── .gitignore                    # Git ignore rules
├── README.md                     # This file
└── LICENSE                       # MIT License

Installation

Clone the repository

git clone <repository-url>
cd spam-detection-with-stacking-classifier-machine-learning

Create a virtual environment

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

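Install the dependencies

pip install fastapi==0.104.1 uvicorn==0.24.0 scikit-learn==1.3.2 pandas==2.1.3 numpy==1.26.2 pydantic==2.5.2 nltk

Run the application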
```
uvicorn main:app --reload
```
Access the web interface

Open your browser and navigate to http://localhost:8000

Requirements

fastapi==0.104.1
uvicorn==0.24.0
scikit-learn==1.3.2
pandas==2.1.3
numpy==1.26.2
pydantic==2.5.2
nltk

Model Architecture

Base Models

MultinomialNB: Naive Bayes with Laplace smoothing
LinearSVC: Support Vector Classification with balanced class weights
SGDClassifier: Stochastic Gradient Descent with modified Huber loss
RandomForestClassifier: Ensemble of decision trees with 200 estimators

Meta-Learner

LogisticRegression: Combines predictions from base models
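
The architecture above can be sketched with scikit-learn's `StackingClassifier`. This is a minimal illustration, not the project's training code: the function name `build_stack`, the estimator labels (`nb`, `svc`, `sgd`, `rf`), the `random_state` values, and the toy count data are assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC


def build_stack(n_estimators=200, cv=5, random_state=42):
    """Assemble the four base models under a LogisticRegression meta-learner."""
    base_models = [
        ("nb", MultinomialNB(alpha=1.0)),  # alpha=1.0 corresponds to Laplace smoothing
        ("svc", LinearSVC(class_weight="balanced", random_state=random_state)),
        ("sgd", SGDClassifier(loss="modified_huber", random_state=random_state)),
        ("rf", RandomForestClassifier(n_estimators=n_estimators, random_state=random_state)),
    ]
    # The meta-learner is trained on out-of-fold predictions of the base models.
    return StackingClassifier(
        estimators=base_models,
        final_estimator=LogisticRegression(),
        cv=cv,
    )


# Toy non-negative count features standing in for vectorized text (hypothetical data).
rng = np.random.default_rng(0)
X_ham = rng.poisson(lam=[3, 3, 0.2, 0.2], size=(30, 4))
X_spam = rng.poisson(lam=[0.2, 0.2, 3, 3], size=(30, 4))
X = np.vstack([X_ham, X_spam]).astype(float)
y = np.array([0] * 30 + [1] * 30)  # assumed encoding: 0 = ham, 1 = spam

stack = build_stack(n_estimators=20)  # fewer trees keep the demo fast
stack.fit(X, y)
print("training accuracy:", stack.score(X, y))
```

Features must be non-negative because `MultinomialNB` rejects negative inputs; TF-IDF output satisfies this.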

Feature Engineering

TfidfVectorizer: N-gram based text vectorization
- Max features: 8000
- N-gram range: (1,1) to (1,3)
- Sublinear TF scaling
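
As a rough illustration of these settings, the snippet below vectorizes a few made-up messages. The sample messages are invented, and `ngram_range=(1, 2)` is just one value from the searched range; the value the tuned model ended up with is not stated here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

vectorizer = TfidfVectorizer(
    max_features=8000,    # keep at most the 8000 most frequent terms
    ngram_range=(1, 2),   # unigrams and bigrams (assumed choice within the searched range)
    sublinear_tf=True,    # replace term frequency tf with 1 + log(tf)
)

# Invented example messages, not taken from the dataset.
messages = [
    "Win a free prize now",
    "Are we still meeting for lunch",
    "Free entry, claim your prize",
]
X = vectorizer.fit_transform(messages)

print(X.shape)
print(sorted(vectorizer.vocabulary_)[:10])
```

The result is a sparse matrix with one row per message and one column per retained n-gram.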

Hyperparameter Tuning

RandomizedSearchCV was used with:

n_iter: 100 iterations
cv: 5-fold cross-validation
scoring: Recall (to minimize false negatives)
n_jobs: Parallel processing enabled

Tuned Parameters

| Component | Parameter | Range |
| --- | --- | --- |
| TF-IDF | max_features | 3000-11000 |
| TF-IDF | ngram_range | (1,1) to (1,3) |
| MultinomialNB | alpha | 0.01-1.0 |
| LinearSVC | C | 0.1-2 |
| RandomForest | n_estimators | 50-300 |
| LogisticRegression | C | 0.01-5 |
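
A search over the ranges above might be set up as follows. This is a sketch under assumptions: the pipeline step names (`tfidf`, `stack`), the estimator labels, the helper name `build_search`, and the use of uniform sampling within each range are not confirmed by the project; only the ranges, `n_iter`, `cv`, and `scoring` come from this section.

```python
from scipy.stats import randint, uniform
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC


def build_search(n_iter=100, cv=5, n_jobs=-1, random_state=42):
    """Return an unfitted RandomizedSearchCV over a TF-IDF + stacking pipeline."""
    stack = StackingClassifier(
        estimators=[
            ("nb", MultinomialNB()),
            ("svc", LinearSVC(class_weight="balanced", random_state=random_state)),
            ("sgd", SGDClassifier(loss="modified_huber", random_state=random_state)),
            ("rf", RandomForestClassifier(random_state=random_state)),
        ],
        final_estimator=LogisticRegression(),
        cv=cv,
    )
    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer(sublinear_tf=True)),
        ("stack", stack),
    ])
    # scipy's uniform(loc, scale) samples from [loc, loc + scale];
    # randint(low, high) samples integers from [low, high).
    param_distributions = {
        "tfidf__max_features": randint(3000, 11001),
        "tfidf__ngram_range": [(1, 1), (1, 2), (1, 3)],
        "stack__nb__alpha": uniform(0.01, 0.99),
        "stack__svc__C": uniform(0.1, 1.9),
        "stack__rf__n_estimators": randint(50, 301),
        "stack__final_estimator__C": uniform(0.01, 4.99),
    }
    return RandomizedSearchCV(
        pipeline,
        param_distributions,
        n_iter=n_iter,
        cv=cv,
        scoring="recall",  # spam must be encoded as the positive label (1)
        n_jobs=n_jobs,
        random_state=random_state,
    )
```

Calling `build_search().fit(texts, labels)` with a list of message strings and 0/1 labels runs the search; `best_params_` and `best_estimator_` then hold the selected configuration. With `n_iter=100` and 5-fold cross-validation this trains 500 stacked pipelines plus a final refit, so expect a long run on a full dataset.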

🌐 API Endpoints

POST `/predict`

Predicts whether a message is spam or not.

Request:

{
  "text": "Your message here"
}

Response:

{
  "prediction": "spam" | "not spam"
}
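
For reference, the endpoint can be called from Python using only the standard library. This is a minimal client sketch that assumes the server is running locally at http://localhost:8000 as described under Installation; the helper names `build_predict_request` and `classify` are made up for this example.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/predict"  # assumes the local uvicorn server


def build_predict_request(text, url=API_URL):
    """Build a POST request carrying the JSON body {"text": ...}."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(text, url=API_URL, timeout=10.0):
    """Send the message to the API and return "spam" or "not spam"."""
    request = build_predict_request(text, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["prediction"]
```

With the server running, `classify("Your message here")` returns the `prediction` field of the response.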

GET `/`

Returns the web interface (HTML page).

📝 License

This project is licensed under the MIT License.

👤 Author

Chinmoy Guha

GitHub: @LT-Ripjaws
Email: chinmoyguha676z@gmail.com
