BERT-Based Comment Classification

This project shows how to fine-tune BERT (Bidirectional Encoder Representations from Transformers) to classify user comments by their semantic content. Typical uses include detecting spam, abuse, sentiment, or topic relevance in forums, customer feedback, and other online platforms.

Features

Fine-tuning of a pretrained BERT model (bert-base-uncased) for text classification tasks

Data preprocessing and tokenization using the Hugging Face transformers and datasets libraries

Training with GPU support and performance tracking via accuracy and loss metrics

Evaluation with classification report and confusion matrix for detailed insight

📁 Project Structure

📆comment-classification-bert
├ 📋comment_classification_task.ipynb
├ 📋README.md
┗ 📂data/ (optional)

🚀 How to Run

1. Clone the Repository

git clone https://github.com/yourusername/comment-classification-bert.git
cd comment-classification-bert

2. Install Dependencies

pip install -r requirements.txt

You can also install manually:

pip install torch transformers datasets scikit-learn matplotlib seaborn

3. Run the Notebook

Launch the notebook in Jupyter or any compatible IDE:

jupyter notebook comment_classification_task.ipynb

🧪 Dataset

The dataset consists of labeled user comments, each associated with a category label. The supported labels are:

toxic

severe_toxic

obscene

threat

insult

identity_hate

non-toxic

You may customize the notebook to load your own dataset in CSV or JSON format with the following structure:

comment,text,label
1,"This is a great product!",non-toxic
2,"You are terrible!",toxic
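As a minimal sketch of loading a file in this layout, the snippet below reads the rows with Python's standard `csv` module and maps label names to integer ids, the form a classification head expects. The function name `load_comments`, the `LABELS` ordering, and the in-memory sample are illustrative assumptions, not code taken from the notebook.

```python
import csv
import io

# Label names as listed above; the order (and therefore the ids) is an assumption.
LABELS = ["toxic", "severe_toxic", "obscene", "threat",
          "insult", "identity_hate", "non-toxic"]
LABEL2ID = {name: i for i, name in enumerate(LABELS)}


def load_comments(handle):
    """Read rows with `text` and `label` columns; return (texts, label_ids)."""
    texts, label_ids = [], []
    for row in csv.DictReader(handle):
        label = row["label"].strip()
        if label not in LABEL2ID:
            raise ValueError(f"unknown label: {label!r}")
        texts.append(row["text"])
        label_ids.append(LABEL2ID[label])
    return texts, label_ids


# In-memory stand-in for a CSV file, using the sample rows shown above.
sample = io.StringIO(
    'comment,text,label\n'
    '1,"This is a great product!",non-toxic\n'
    '2,"You are terrible!",toxic\n'
)
texts, label_ids = load_comments(sample)
print(texts)
print(label_ids)
```

For a real file, pass an open file handle instead of the `StringIO` object, for example `open(path, newline="", encoding="utf-8")` where `path` points at your own CSV.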

🧠 Model Details

Model: bert-base-uncased (Hugging Face Transformers)

Fine-tuned for text classification

Optimizer: AdamW

Loss Function: CrossEntropyLoss

Evaluation: Accuracy, Confusion Matrix, Precision/Recall/F1
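To make the loss function concrete, here is a small plain-Python sketch of cross-entropy over raw logits: the negative log of the softmax probability assigned to the correct class, averaged over the batch. It is meant to mirror what `torch.nn.CrossEntropyLoss` computes with its default mean reduction. The function names and the logits are made up for illustration. Note that this loss assigns exactly one label to each comment.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one example: -log(softmax(logits)[target])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def batch_cross_entropy(batch_logits, targets):
    """Mean cross-entropy over a batch of examples."""
    losses = [cross_entropy(l, t) for l, t in zip(batch_logits, targets)]
    return sum(losses) / len(losses)


# Two made-up examples over three classes.
batch_logits = [[2.0, 0.5, -1.0], [0.0, 0.0, 0.0]]
targets = [0, 2]
loss = batch_cross_entropy(batch_logits, targets)
print(round(loss, 4))
```

The second example has uniform logits, so its loss is `log(3)`: the model is no better than guessing among three classes.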

📊 Results

After training, the model is evaluated on a test set, and results include:

Accuracy score

Confusion matrix (visualized using seaborn)

Classification report with precision, recall, and F1-score
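The sketch below shows what these metrics measure, using plain Python and made-up predictions. Rows of the matrix are true classes and columns are predicted classes. In practice, scikit-learn's `confusion_matrix` and `classification_report` provide the same numbers; the helper names here are illustrative only.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """matrix[i][j] counts examples with true class i predicted as class j."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_report(matrix):
    """Return {class_id: (precision, recall, f1)} from a confusion matrix."""
    report = {}
    n = len(matrix)
    for c in range(n):
        tp = matrix[c][c]
        predicted = sum(matrix[r][c] for r in range(n))  # column sum
        actual = sum(matrix[c])                          # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[c] = (precision, recall, f1)
    return report


# Made-up labels for illustration: 0 = non-toxic, 1 = toxic.
y_true = [0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 0]
matrix = confusion_matrix(y_true, y_pred, num_classes=2)
accuracy = sum(matrix[i][i] for i in range(2)) / len(y_true)
report = per_class_report(matrix)
print(matrix)
print(accuracy)
print(report)
```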

📈 Visualization

The notebook plots the training and validation loss and the confusion matrix to make the results easier to interpret.

🛠️ Customization

Replace the dataset with your own comment data

Adjust the number of epochs, batch size, and learning rate

Add early stopping or additional metrics
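As one possible way to add early stopping, the sketch below tracks the best validation loss and signals a stop once it has failed to improve for a set number of epochs. The `EarlyStopping` class, its `patience` and `min_delta` parameters, and the loss values are hypothetical; the notebook's training loop would need to call `step` once per epoch with its own validation loss.

```python
class EarlyStopping:
    """Stop when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=2, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True to stop training."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation losses, one per epoch.
val_losses = [0.62, 0.48, 0.45, 0.47, 0.46, 0.44]
stopper = EarlyStopping(patience=2)
stopped_at = None
for epoch, val_loss in enumerate(val_losses, start=1):
    if stopper.step(val_loss):
        stopped_at = epoch
        break
print(stopped_at, stopper.best)
```

With `patience=2`, training stops after two consecutive epochs without improvement, so the final value in the list is never reached. A larger `patience` trades extra training time for tolerance of noisy validation loss.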
