This project is an end-to-end computer vision application for disability-related object detection using the Ultralytics YOLO framework. It includes:
- Model training, evaluation, and batch prediction scripts
- A full-featured PyQt5 desktop application for real-time webcam inference and single-image prediction
- Analytics dashboards with inference-time, confidence, and detection statistics
- CSV logging and visual output storage
The system is designed for research, experimentation, and demonstration purposes, with optional GPU acceleration.
project-root/
│
├── application.py # PyQt5 GUI application (image + webcam inference)
├── ss.py # Utility script to locate Python Scripts folder
├── TRAIN.pdf # Contains TRAIN.py, EVALUATE.py, PREDICT.py source
├── predictions/ # Saved prediction images (auto-created)
├── prediction_log.csv # Inference log (auto-generated)
├── app_errors.log # Application error logs
└── results.zip # Training or evaluation artifacts (optional)
- Load YOLO .pt models dynamically
- Real-time webcam detection with frame skipping for low latency
- Single-image prediction support
- Live analytics dashboard:
  - Inference time per prediction
  - Average confidence score
  - Object count per frame
- CSV logging of all predictions
- GPU acceleration (CUDA) when available
- Training with configurable hyperparameters
- Evaluation with mAP, precision, and recall metrics
- Batch prediction on image directories
- ONNX export for deployment
- Python 3.8+
- Windows (recommended for PyQt5 webcam backend)
ultralytics
opencv-python
torch
torchvision
numpy
pandas
matplotlib
PyQt5
Install dependencies:
pip install -r requirements.txt
(Create requirements.txt from the list above if not already present.)
Run the application:
python application.py
- Launch the application
- Click Load Model and select a YOLO .pt file
- Choose one of the following:
  - Predict on Image (static image inference)
  - Start Video (real-time webcam detection)
- View analytics and saved outputs in real time
All predictions are logged to prediction_log.csv.
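As an illustration of how the dashboard metrics and the log could fit together, here is a minimal sketch that derives the three per-prediction values (inference time, object count, average confidence) and appends them to a CSV file. The column names in `FIELDS` and the helper names `summarize` and `log_prediction` are assumptions made for this example; the actual layout of prediction_log.csv may differ.

```python
import csv
from datetime import datetime
from pathlib import Path

# Hypothetical column layout; the real prediction_log.csv may use other names.
FIELDS = ["timestamp", "source", "inference_ms", "num_objects", "avg_confidence"]


def summarize(confidences, inference_ms):
    """Derive the per-prediction dashboard metrics from detection confidences."""
    count = len(confidences)
    avg = sum(confidences) / count if count else 0.0
    return {
        "inference_ms": round(inference_ms, 2),
        "num_objects": count,
        "avg_confidence": round(avg, 4),
    }


def log_prediction(log_path, source, confidences, inference_ms):
    """Append one row to the CSV log, writing the header if the file is new or empty."""
    log_path = Path(log_path)
    write_header = not log_path.exists() or log_path.stat().st_size == 0
    row = {
        "timestamp": datetime.now().isoformat(timespec="seconds"),
        "source": source,
    }
    row.update(summarize(confidences, inference_ms))
    with log_path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerow(row)
    return row
```

Opening the file in append mode keeps earlier rows intact across sessions, so the log can later be loaded with pandas for offline analysis.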
Create train.py from the TRAIN.py source in TRAIN.pdf, then run:
python train.py
Training configuration:
- Dataset: disability.yaml
- Epochs: 50
- Optimizer: AdamW
- Image size: 640
- Batch size: 16
Model artifacts are saved under:
runs/detect/train*/
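For orientation, a training script with the configuration listed above might look roughly like the following sketch. It is not the TRAIN.py source itself. The starting checkpoint `yolov8n.pt` and the helper names `build_train_args` and `train` are assumptions; the dictionary keys are standard Ultralytics `train()` arguments.

```python
# Values taken from the configuration list above.
TRAIN_CONFIG = {
    "data": "disability.yaml",
    "epochs": 50,
    "optimizer": "AdamW",
    "imgsz": 640,
    "batch": 16,
}


def build_train_args(**overrides):
    """Return the training arguments, with optional per-run overrides."""
    return {**TRAIN_CONFIG, **overrides}


def train(weights="yolov8n.pt", **overrides):
    """Train a detector; `weights` is an assumed starting checkpoint."""
    # Imported lazily so the configuration can be inspected without Ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(**build_train_args(**overrides))
```

Calling `train()` uses the defaults; `train(epochs=10)` would run a shorter experiment with the remaining settings unchanged.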
python evaluate.py
Outputs include:
- mAP@50
- mAP@50–95
- Precision
- Recall
- Per-class AP metrics
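The metrics listed above can be read from the object that Ultralytics returns from `model.val()`. The sketch below is one possible way to collect them, not the EVALUATE.py source; the helper names and the dictionary keys are illustrative, and the weights path must be supplied by the caller.

```python
def summarize_metrics(metrics):
    """Collect headline numbers from an Ultralytics detection metrics object."""
    box = metrics.box
    return {
        "mAP50": float(box.map50),
        "mAP50-95": float(box.map),
        "precision": float(box.mp),  # mean precision over classes
        "recall": float(box.mr),     # mean recall over classes
        "per_class_mAP50-95": [float(v) for v in box.maps],
    }


def evaluate(weights, data="disability.yaml"):
    """Validate a trained model on the dataset and return a plain dict."""
    # Imported lazily so summarize_metrics can be used without Ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    return summarize_metrics(model.val(data=data))
```

Returning a plain dictionary makes the results easy to print, serialize, or compare across runs.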
python predict.py
- Input directory: test/images/
- Output directory: predictions/
Each image is processed and saved with bounding boxes overlaid.
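A batch prediction loop along these lines could be sketched as follows. This is an assumption-laden outline rather than the PREDICT.py source: the accepted file extensions and the helper names `list_images` and `predict_directory` are hypothetical, while `Results.plot()` and `cv2.imwrite` are the usual Ultralytics and OpenCV calls for drawing and saving annotated images.

```python
from pathlib import Path

# Assumed set of accepted image formats.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def list_images(input_dir):
    """Return the image files in a directory, sorted by path."""
    return sorted(
        p for p in Path(input_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def predict_directory(weights, input_dir="test/images", output_dir="predictions"):
    """Run the model on every image and save annotated copies."""
    # Imported lazily so list_images works without the heavy dependencies.
    import cv2
    from ultralytics import YOLO

    model = YOLO(weights)
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)  # auto-create the output folder
    saved = []
    for image_path in list_images(input_dir):
        result = model(str(image_path))[0]
        annotated = result.plot()  # BGR array with bounding boxes drawn
        target = out / image_path.name
        cv2.imwrite(str(target), annotated)
        saved.append(target)
    return saved
```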
- Automatically detects CUDA-enabled GPUs
- Falls back to CPU if GPU is unavailable
- Half-precision inference (FP16) enabled when supported
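The device selection described above can be sketched as below. The function names are illustrative and the application may implement the check differently; `torch.cuda.is_available()` is the standard PyTorch call, and `device` and `half` are standard Ultralytics prediction arguments.

```python
def cuda_available():
    """Report whether a CUDA-enabled GPU is usable; False if torch is missing."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def inference_settings(has_cuda):
    """Choose device and precision: FP16 on GPU, full precision on CPU."""
    if has_cuda:
        return {"device": "cuda:0", "half": True}
    return {"device": "cpu", "half": False}
```

The resulting dictionary can be passed straight to a prediction call, for example `model.predict(source, **inference_settings(cuda_available()))`. Half precision is tied to the GPU branch here because FP16 inference is generally not useful on CPU.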
- prediction_log.csv – structured inference metrics
- predictions/ – annotated images
- app_errors.log – runtime error tracking
- Webcam performance depends on camera quality and system resources
- Frame skipping is enabled to reduce latency
- ONNX models are supported for inference and evaluation
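To illustrate the frame skipping mentioned in the notes above, here is a minimal sketch of one common approach: run the detector only on every N-th frame and reuse the most recent detections for the frames in between. The function name, the `detect` callback, and the default interval of 3 are assumptions; the interval actually used by application.py is not stated here.

```python
def run_with_frame_skipping(frames, detect, process_every=3):
    """Yield (frame, detections), running `detect` on every N-th frame only.

    Skipped frames are paired with the latest available detections, so the
    display stays smooth while the detector runs less often.
    """
    if process_every < 1:
        raise ValueError("process_every must be at least 1")
    last_detections = None
    for index, frame in enumerate(frames):
        if index % process_every == 0:
            last_detections = detect(frame)
        yield frame, last_detections
```

In a webcam loop, `frames` would be a generator that reads from `cv2.VideoCapture` and `detect` would wrap the YOLO model call. A larger interval lowers latency at the cost of staler bounding boxes.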