This was the final project for Dr. Meghan Chiovaro's ELE392 class at URI: a vision-transformer-based binary classifier that distinguishes whether or not an image was generated by an AI model.
Try the live demo:
ai-detection-model.streamlit.app
With the rapid advancement of generative AI, telling whether an image was generated by AI has become a serious concern for users.
We used the ArtiFact dataset (https://github.com/awsaf49/artifact), which contains about 2.5 million real and synthetic images, with the synthetic ones generated by 25 different models.
- Base Encoder: OpenAI's CLIP ViT image encoder pre‑trained on internet‑scale data.
- Fine‑Tuning with LoRA:
  - Freeze original weights
  - Inject low‑rank adapters into selected transformer layers
  - Train only ~0.16 % of total parameters for efficiency and to avoid catastrophic forgetting.
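To illustrate the LoRA idea described above, here is a minimal pure-Python sketch of a single adapted linear layer. It is not the project's training code: the class name `LoRALinear`, the helper `matvec`, and the 64 × 64 toy weight matrix are all hypothetical, chosen only to show how a frozen weight is combined with a trainable low-rank update scaled by α / r.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Hypothetical toy layer: y = W x + (alpha / rank) * B (A x).

    W stays frozen; only A (rank x d_in) and B (d_out x rank) would be trained.
    """

    def __init__(self, weight, rank, alpha, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # frozen pre-trained weight, never updated
        # A gets a small random init, B starts at zero, so the adapter
        # contributes nothing until training moves B away from zero.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_fraction(self):
        frozen = len(self.weight) * len(self.weight[0])
        trainable = sum(len(row) for row in self.A) + sum(len(row) for row in self.B)
        return trainable / (frozen + trainable)


# Toy usage: a 64 x 64 "pre-trained" weight (an assumed size, not CLIP's).
rng = random.Random(42)
W = [[rng.gauss(0.0, 1.0) for _ in range(64)] for _ in range(64)]
x = [rng.gauss(0.0, 1.0) for _ in range(64)]

layer = LoRALinear(W, rank=4, alpha=16)
out = layer.forward(x)
print(out == matvec(W, x))         # adapter is a no-op before training
print(layer.trainable_fraction())  # share of parameters that would be trained
```

Because `B` starts at zero, the adapted layer initially reproduces the frozen layer exactly, which is one reason LoRA helps avoid catastrophic forgetting. The trainable share in this toy is far larger than the ~0.16 % reported above because the toy matrix is tiny and every weight in it is adapted.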
| Hyperparameter | Value |
|---|---|
| Batch size | 16 |
| Learning rate | 1 × 10⁻⁴ |
| LoRA rank | 4 |
| LoRA α | 16 |
| Epochs | ~2 on full dataset (2.5 million images) |
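As a rough sanity check on the table, the sketch below derives two quantities implied by these values: the LoRA scaling factor α / r and the approximate number of optimizer steps. The `TrainConfig` class is hypothetical, and the step count assumes exactly 2 epochs over all 2.5 million images with no held-out split and no gradient accumulation, none of which is confirmed here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the hyperparameters in the table."""

    batch_size: int = 16
    learning_rate: float = 1e-4
    lora_rank: int = 4
    lora_alpha: int = 16
    epochs: int = 2              # the table says "~2"; assumed exact here
    num_images: int = 2_500_000  # rounded dataset size; assumed all used for training

    @property
    def lora_scale(self) -> float:
        # The low-rank update is multiplied by alpha / rank.
        return self.lora_alpha / self.lora_rank

    @property
    def steps_per_epoch(self) -> int:
        return math.ceil(self.num_images / self.batch_size)

    @property
    def total_steps(self) -> int:
        return self.epochs * self.steps_per_epoch


cfg = TrainConfig()
print(cfg.lora_scale)       # 4.0
print(cfg.steps_per_epoch)  # 156250
print(cfg.total_steps)      # 312500
```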
- Accuracy: 96 %
- Recall: 95 % (Real), 96 % (Fake)
- F1‑Score: 97 % (Real), 97 % (Fake)
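For readers unfamiliar with these metrics, the following sketch shows how accuracy, per-class recall, and per-class F1 are computed from predictions. The function `binary_report` is a hypothetical helper, and the ten labels below are made-up toy data used only to exercise the formulas; they are unrelated to the results reported above.

```python
def binary_report(y_true, y_pred, labels=("real", "fake")):
    """Return overall accuracy and per-class precision, recall, and F1."""
    report = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return accuracy, report


# Toy data (fabricated for illustration): 4 real and 6 fake images.
y_true = ["real"] * 4 + ["fake"] * 6
y_pred = ["real", "real", "real", "fake",
          "fake", "fake", "fake", "fake", "fake", "real"]

accuracy, report = binary_report(y_true, y_pred)
print(f"accuracy: {accuracy:.2f}")
for label, scores in report.items():
    print(f"{label}: recall={scores['recall']:.2f}, f1={scores['f1']:.2f}")
```

Recall is reported per class because the cost of an error differs by direction: a low recall on "fake" means AI-generated images slip through, while a low recall on "real" means genuine images are wrongly flagged.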
