K-Nearest Neighbors Classifier with Hyperparameter Tuning

This project demonstrates the use of the K-Nearest Neighbors (KNN) classifier for binary classification on a bank dataset. The project involves data preprocessing, model training, and hyperparameter tuning using Grid Search and Randomized Search.

Overview

The K-Nearest Neighbors (KNN) algorithm classifies bank customers based on the features listed in the Dataset section. The target variable is binary and indicates whether a customer has subscribed to a term deposit. The workflow splits the dataset into training and testing sets, trains a KNN model, and then tunes its hyperparameters to improve performance.

Dataset

The dataset used in this project is a CSV file containing the following columns:

age: Age of the customer
balance: Balance of the customer's bank account
day: Day of the month when the customer was last contacted
duration: Duration of the last contact
campaign: Number of contacts performed during this campaign
pdays: Number of days since the customer was last contacted in a previous campaign
previous: Number of contacts performed before this campaign
y: Target variable (0 = No, 1 = Yes)
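
The loading and splitting step can be sketched as follows. This is a minimal illustration that assumes the column names listed above; the file name bank.csv, the 80/20 split, the stratification, and the synthetic stand-in rows are assumptions and are not taken from the project.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

FEATURES = ["age", "balance", "day", "duration", "campaign", "pdays", "previous"]

# In the real project the data would be read from the CSV file, e.g.
#   df = pd.read_csv("bank.csv")   # hypothetical file name
# A small synthetic frame with the same columns stands in for it here.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.integers(18, 90, n),
    "balance": rng.integers(-1000, 20000, n),
    "day": rng.integers(1, 32, n),
    "duration": rng.integers(0, 2000, n),
    "campaign": rng.integers(1, 20, n),
    "pdays": rng.integers(-1, 400, n),
    "previous": rng.integers(0, 10, n),
    "y": rng.integers(0, 2, n),
})

X = df[FEATURES]
y = df["y"]

# Hold out 20% of the rows for testing (assumed ratio);
# stratify keeps the class proportions similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
print(X_train.shape, X_test.shape)
```

Stratifying on y is a design choice worth considering when the positive class is rare, since a plain random split can leave very few positive examples in the test set.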

Requirements

The following Python libraries are required to run the project:

pandas
numpy
matplotlib
scikit-learn

Install the required libraries using the following command:

pip install pandas numpy matplotlib scikit-learn

Usage

The repository includes the notebook hyperparameter-tuning.ipynb, which can be opened in Jupyter. To run the project as a script, execute it in a Python environment:

python knn_classifier.py

Model Training and Evaluation

The KNN model is trained on the training dataset using scikit-learn's default parameters (n_neighbors=5, weights='uniform', leaf_size=30). The model's performance is evaluated on the test dataset using classification metrics such as precision, recall, F1-score, and accuracy.
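
A baseline run of this kind might look like the sketch below. It uses a synthetic dataset from make_classification as a stand-in for the bank data, so the class balance, sample count, and resulting scores are assumptions and will not match the project's numbers.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in: 7 numeric features and an imbalanced binary target.
X, y = make_classification(
    n_samples=500,
    n_features=7,
    n_informative=4,
    n_redundant=0,
    weights=[0.85, 0.15],
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Default KNN: n_neighbors=5, weights="uniform", leaf_size=30.
knn = KNeighborsClassifier()
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)

# Per-class precision, recall and F1-score, plus overall accuracy.
print(classification_report(y_test, y_pred, zero_division=0))
baseline_accuracy = accuracy_score(y_test, y_pred)
```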

Hyperparameter Tuning

To improve the model's performance, hyperparameter tuning is performed using two methods: Grid Search and Randomized Search.

Grid Search

Grid Search exhaustively evaluates every combination in a specified grid of hyperparameter values (here 10 x 2 x 4 = 80 combinations). The following hyperparameters are tuned:

n_neighbors: Number of neighbors to use
weights: Weight function used in prediction
leaf_size: Leaf size passed to BallTree or KDTree

param_grid = {
    'n_neighbors': [1, 3, 5, 7, 9, 11, 13, 15, 17, 19],
    'weights': ['uniform', 'distance'],
    'leaf_size': [10, 50, 100, 500]
}
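
The grid above could be passed to scikit-learn's GridSearchCV roughly as follows. This is a sketch under assumptions: the synthetic data, the 5-fold cross-validation, and the accuracy scoring are illustrative choices, not settings confirmed by the project.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the training data (assumption).
X, y = make_classification(n_samples=300, n_features=7, random_state=0)

param_grid = {
    "n_neighbors": [1, 3, 5, 7, 9, 11, 13, 15, 17, 19],
    "weights": ["uniform", "distance"],
    "leaf_size": [10, 50, 100, 500],
}

# Every combination in the grid is scored with 5-fold cross-validation.
grid = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5, scoring="accuracy")
grid.fit(X, y)

print(grid.best_params_)
print(grid.best_score_)
```

After fitting, grid.best_estimator_ holds a model refit on all of the supplied data with the best combination, which can then be evaluated on the held-out test set.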

Randomized Search

Randomized Search samples a fixed number of combinations from a larger range of hyperparameter values instead of trying every one. The following hyperparameters are tuned:

n_neighbors: Number of neighbors to use
weights: Weight function used in prediction
leaf_size: Leaf size passed to BallTree or KDTree

param_grid = {
    'n_neighbors': np.arange(1, 20, 2),
    'weights': ['uniform', 'distance'],
    'leaf_size': np.arange(1, 500)
}
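
A randomized search over these ranges might be set up as sketched below with scikit-learn's RandomizedSearchCV. The number of iterations (n_iter=20), the 5-fold cross-validation, the random seed, and the synthetic data are assumptions for illustration; the project's actual settings are not stated.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the training data (assumption).
X, y = make_classification(n_samples=300, n_features=7, random_state=0)

param_distributions = {
    "n_neighbors": np.arange(1, 20, 2),   # odd values 1, 3, ..., 19
    "weights": ["uniform", "distance"],
    "leaf_size": np.arange(1, 500),
}

# Only n_iter randomly sampled combinations are evaluated,
# rather than the full 10 * 2 * 499 grid.
search = RandomizedSearchCV(
    KNeighborsClassifier(),
    param_distributions,
    n_iter=20,
    cv=5,
    scoring="accuracy",
    random_state=0,
)
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)
```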

Results

The best hyperparameter combinations found using Grid Search and Randomized Search are:

Grid Search: {'leaf_size': 10, 'n_neighbors': 17, 'weights': 'distance'}
Randomized Search: {'weights': 'distance', 'n_neighbors': 19, 'leaf_size': 243}

The overall accuracy of the model improved from 0.87 to 0.89 after hyperparameter tuning.

Interpretations

The best hyperparameter combination improved the overall accuracy of the model. However, if recall of the positive class is the metric of interest, the tuned model performed worse than the untuned one. This highlights the trade-off between different performance metrics during model tuning.
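
The trade-off can be illustrated with a toy example. The labels and predictions below are hypothetical and hand-made to show the effect on an imbalanced target; they are not the project's results.

```python
from sklearn.metrics import accuracy_score, recall_score

# Hypothetical ground truth: 8 negatives followed by 2 positives.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# Hypothetical "untuned" predictions: both positives found, 2 false alarms.
y_untuned = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]

# Hypothetical "tuned" predictions: no false alarms, but one positive missed.
y_tuned = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]

for name, y_pred in [("untuned", y_untuned), ("tuned", y_tuned)]:
    acc = accuracy_score(y_true, y_pred)
    rec = recall_score(y_true, y_pred, pos_label=1)
    print(f"{name}: accuracy={acc:.2f}, recall={rec:.2f}")
```

In this toy case accuracy rises from 0.80 to 0.90 while positive-class recall drops from 1.00 to 0.50. If recall is the metric that matters, one option is to pass scoring="recall" to GridSearchCV or RandomizedSearchCV so that the search optimizes for it directly.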

License

This project is licensed under the MIT License - see the LICENSE file for details.
