An AI model designed to detect hate speech in a given message. It is based on WangchanBERTa and trained on HateThaiSent and the Thai Toxicity Tweet Corpus.

**Group project for an Artificial Intelligence class.**
- Ensure Python 3.9+ is available.
- (Optional) Create and activate a virtual environment: `python -m venv venv && source venv/bin/activate`.
- Install dependencies: `pip install -r requirements.txt`.

If you're on Windows and want GPU acceleration, keep the optional `torch-directml` dependency; on other platforms it is safe to ignore if installation fails.
- Place the labelled CSV in `data/HateThaiSent.csv` (default expected by the training script).
- The CSV must contain a `Message` column with text and a `Hatespeech` column with `hatespeech`/`nonhatespeech` labels.
- Optional: download and convert the Thai Toxicity Tweet corpus with `python convert_thai_toxicity_tweet.py --output data/ThaiToxicityTweet_converted.csv`; the training script consumes this file automatically (run training with `--extra-data` and no values to opt out).
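
The schema described above can be checked before training. The following is a minimal sketch, not part of the repository: the `check_schema` helper is hypothetical, and it assumes only what the list states (a `Message` column and a `Hatespeech` column holding `hatespeech` or `nonhatespeech`). It also counts labels, which gives a quick view of class balance.

```python
import csv
import io
from collections import Counter

REQUIRED_COLUMNS = {"Message", "Hatespeech"}
ALLOWED_LABELS = {"hatespeech", "nonhatespeech"}


def check_schema(csv_file):
    """Validate header and labels; return a Counter of label frequencies.

    Hypothetical helper for illustration; the training script may
    perform its own, different validation.
    """
    reader = csv.DictReader(csv_file)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    counts = Counter()
    for row_no, row in enumerate(reader, start=1):
        label = (row["Hatespeech"] or "").strip()
        if label not in ALLOWED_LABELS:
            raise ValueError(f"row {row_no}: unexpected label {label!r}")
        if not (row["Message"] or "").strip():
            raise ValueError(f"row {row_no}: empty Message")
        counts[label] += 1
    return counts


# Tiny in-memory sample so the sketch runs without the dataset.
sample = io.StringIO(
    "Message,Hatespeech\n"
    "สวัสดี,nonhatespeech\n"
    "ตัวอย่าง,hatespeech\n"
)
label_counts = check_schema(sample)
print(label_counts)
```

For a real file, pass `open("data/HateThaiSent.csv", newline="", encoding="utf-8")` instead of the in-memory sample.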
Fine-tune WangchanBERTa with the bundled training helper:
```bash
python train_model.py \
  --data data/HateThaiSent.csv \
  --output-dir models/wangchanberta-hatespeech \
  --epochs 2 \
  --batch-size 8
```

Key options:

- `--device auto|cpu|cuda|mps|dml|directml`: choose the execution device (DirectML unlocks Windows GPU acceleration).
- `--freeze-encoder` or `--trainable-layer-count N`: control how much of the backbone to fine-tune.
- `--extra-data <path ...>`: append additional CSVs that share the HateThaiSent schema; pass `--extra-data` with no values to disable the defaults.
- `--max-length`, `--learning-rate`, `--gradient-accumulation`: adjust training performance.
- `--max-gpu-memory-fraction`: leave VRAM headroom when working on limited GPUs such as Google Colab T4 instances.
- `--lr-scheduler`, `--warmup-steps`, `--warmup-ratio`: configure the learning-rate schedule and warmup strategy.
- `--fp16` / `--bf16`: enable mixed-precision training on supported hardware.
- `--gradient-checkpointing`, `--group-by-length`: trade compute for lower memory and reduce padding overhead.
- `--torch-compile`: turn on PyTorch 2.x graph capture for potential throughput gains.
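
The "pass `--extra-data` with no values to disable the defaults" behaviour can be expressed with `argparse` and `nargs="*"`. The sketch below is an assumption about how such a flag could be wired, not a copy of `train_model.py`; the `DEFAULT_EXTRA` list and `build_parser` function are hypothetical.

```python
import argparse

# Hypothetical default; the real script's default list may differ.
DEFAULT_EXTRA = ["data/ThaiToxicityTweet_converted.csv"]


def build_parser():
    """Build an illustrative subset of the training options."""
    parser = argparse.ArgumentParser(description="illustrative option subset")
    parser.add_argument("--data", default="data/HateThaiSent.csv")
    # nargs="*" accepts zero or more paths: omitting the flag keeps the
    # default, while a bare --extra-data yields an empty list.
    parser.add_argument("--extra-data", nargs="*", default=DEFAULT_EXTRA)
    parser.add_argument(
        "--device",
        default="auto",
        choices=["auto", "cpu", "cuda", "mps", "dml", "directml"],
    )
    return parser


parser = build_parser()
omitted = parser.parse_args([]).extra_data
disabled = parser.parse_args(["--extra-data"]).extra_data
custom = parser.parse_args(["--extra-data", "a.csv", "b.csv"]).extra_data
print(omitted, disabled, custom)
```

The three calls cover the cases the option list distinguishes: flag omitted, flag given with no values, and flag given with explicit paths.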
After training finishes, score new text with age-aware post-processing:
```bash
python predict.py "ใส่ข้อความภาษาไทยที่นี่" --age 15 --model models/wangchanberta-hatespeech
```

The script prints a JSON object containing the raw probability, whether the content should be blocked for the provided age, and the policy thresholds that were applied.

Training writes evaluation metrics to `models/wangchanberta-hatespeech/eval_metrics.json`. Share this file or use it to monitor regression between fine-tuning runs.
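
One way to monitor regression between runs is to diff the numeric entries of two metrics files. This is a sketch under assumptions: the metric names `eval_f1` and `eval_loss` are hypothetical (the actual keys in `eval_metrics.json` are not documented here), and `metric_deltas` is an illustrative helper, not part of the repository.

```python
import json


def metric_deltas(previous, current):
    """Return new-minus-old for every numeric metric present in both runs."""
    deltas = {}
    for name in sorted(previous.keys() & current.keys()):
        old, new = previous[name], current[name]
        numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in (old, new)
        )
        if numeric:
            deltas[name] = new - old
    return deltas


# Inline JSON stands in for two saved eval_metrics.json files;
# the keys below are assumed, not taken from the project.
previous_run = json.loads('{"eval_f1": 0.80, "eval_loss": 0.50, "note": "a"}')
current_run = json.loads('{"eval_f1": 0.75, "eval_loss": 0.55, "note": "b"}')

deltas = metric_deltas(previous_run, current_run)
for name, delta in deltas.items():
    print(f"{name}: {delta:+.4f}")
```

Whether a positive delta is good depends on the metric (higher is better for a score, lower for a loss), so the sketch reports raw differences and leaves the interpretation to the reader.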
Use `sample_predictions.py` for quick qualitative checks or to score your own sentences in bulk:

```bash
python sample_predictions.py --model models/wangchanberta-hatespeech --input sentences.jsonl --output predictions.csv
```

The script prints per-sentence decisions, reports accuracy when expected labels are provided, and can export a CSV for downstream analysis.

Default demo prompts now live in `sample_sentences.jsonl`; edit that file to tweak the built-in examples or pass a custom `.jsonl`/plain-text file through `--input`.
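
To illustrate how a `.jsonl` or plain-text input with optional expected labels could be handled, here is a minimal sketch. The field names `text` and `expected` are assumptions (the actual keys used by `sample_sentences.jsonl` are not shown in this README), and both helpers are hypothetical.

```python
import json


def parse_sentences(lines, is_jsonl):
    """Turn input lines into records with a text and an optional label."""
    records = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines in either format
        if is_jsonl:
            obj = json.loads(line)
            # "text" and "expected" are assumed field names.
            records.append({"text": obj["text"], "expected": obj.get("expected")})
        else:
            records.append({"text": line, "expected": None})
    return records


def accuracy(records, predictions):
    """Accuracy over records that carry an expected label, else None."""
    scored = [
        (rec["expected"], pred)
        for rec, pred in zip(records, predictions)
        if rec["expected"] is not None
    ]
    if not scored:
        return None
    return sum(exp == pred for exp, pred in scored) / len(scored)


jsonl_lines = [
    '{"text": "ตัวอย่างที่หนึ่ง", "expected": "nonhatespeech"}',
    '{"text": "ตัวอย่างที่สอง", "expected": "hatespeech"}',
    '{"text": "ตัวอย่างที่สาม"}',
]
records = parse_sentences(jsonl_lines, is_jsonl=True)
# Placeholder predictions standing in for model output.
predictions = ["nonhatespeech", "nonhatespeech", "hatespeech"]
score = accuracy(records, predictions)
print(score)
```

Unlabelled records are excluded from the accuracy figure, so a plain-text input (which carries no labels) yields `None` rather than a misleading score.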
- Integrate the exported model directory into the browser extension packaging workflow.
- Adjust `age_policy.py` if you need different moderation thresholds per age bracket.
- Run `csv_output.py` during data exploration to inspect label distribution and spot missing values.
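
The idea of per-age-bracket thresholds can be sketched as a lookup from age to a blocking threshold. The brackets and threshold values below are invented for illustration and are not the ones in `age_policy.py`; the function names and output keys are likewise hypothetical.

```python
import json
from bisect import bisect_right

# Hypothetical brackets: (minimum age, probability at which to block).
# Younger brackets use a lower threshold, so they block more readily.
AGE_BRACKETS = [(0, 0.30), (13, 0.50), (18, 0.70)]


def threshold_for_age(age):
    """Return the blocking threshold of the bracket containing `age`."""
    if age < 0:
        raise ValueError("age must be non-negative")
    minimums = [minimum for minimum, _ in AGE_BRACKETS]
    index = bisect_right(minimums, age) - 1
    return AGE_BRACKETS[index][1]


def decide(probability, age):
    """Combine a raw model probability with the age policy."""
    threshold = threshold_for_age(age)
    return {
        "probability": probability,
        "age": age,
        "threshold": threshold,
        "blocked": probability >= threshold,
    }


result = decide(0.62, 15)
print(json.dumps(result))
```

Keeping the brackets in a single sorted table means a policy change is a data edit rather than a logic change, which is one plausible reason to isolate the policy in its own module.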
L. Lowphansirikul, C. Polpanumas, N. Jantrakulchai, and S. Nutanong, "WangchanBERTa: Pretraining transformer-based Thai language models," arXiv preprint arXiv:2101.09635, 2021. [Online]. Available: https://arxiv.org/abs/2101.09635.
This project stands on the shoulders of public Thai-language moderation corpora. Please cite the sources below and respect the licences listed by the maintainers when redistributing or publishing results:
- HateThaiSent (`data/HateThaiSent.csv`). Data Science and Machine Learning Research Group (DSMLR), King Mongkut's University of Technology Thonburi. HateThaiSent: Hate speech and sentiment dataset for Thai social media (v1.0, 2021). Available at https://github.com/dsmlr/HateThaiSent. See the repository for detailed licensing terms.
- Thai Toxicity Tweet Corpus (`data/ThaiToxicityTweet_converted.csv`). National Electronics and Computer Technology Center (NECTEC) and Artificial Intelligence Association of Thailand (AIAT). Thai Toxicity Tweet Corpus (2020). Available via https://archive.org/download/ThaiToxicityTweetCorpus/data.zip with supporting documentation at https://github.com/tmu-nlp/ThaiToxicityTweetCorpus. Review the corpus licence before redistribution.