Skip to content

Baharcakir/compvi

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

YOLO Beverage Container Detection

A real-time multi-class beverage container detection system powered by YOLOv8 with a modern React frontend and FastAPI backend. Detect bottles, wine glasses, and cups in live webcam feeds or uploaded videos.


Table of Contents


Overview

This project provides an end-to-end solution for detecting beverage containers in images and video streams. It leverages the YOLOv8 object detection model, optimized for identifying common beverage-holding objects such as plastic bottles, wine glasses, and cups.

Use Cases:

  • Recycling automation systems
  • Inventory management for bars/restaurants
  • Smart retail shelf monitoring
  • Environmental monitoring for litter detection
  • Educational demonstrations of computer vision

Features

  • Real-time WebSocket Detection - Stream webcam frames for instant object detection
  • Video Upload Processing - Upload videos and receive frame-by-frame detection results via Server-Sent Events (SSE)
  • Adjustable Confidence Threshold - Dynamically tune detection sensitivity via UI slider
  • Multi-class Detection - Identify bottles, wine glasses, and cups simultaneously
  • Rolling Benchmark Statistics - Track inference latency with avg, p95, and p99 metrics
  • Responsive React UI - Modern interface with live bounding box overlays
  • Docker Ready - One-command deployment with Docker Compose
  • CORS Enabled - Easy integration with external frontends

Architecture

+------------------+       WebSocket (frames)      +------------------+
|                  | ----------------------------> |                  |
|   React Frontend |                               |  FastAPI Backend |
|   (Vite + Nginx) | <---------------------------- |   (YOLOv8 + UV)  |
|                  |       JSON (detections)       |                  |
+------------------+                               +------------------+
        |                                                   |
        |  HTTP POST /upload                                |
        | ---------------------------------------------->   |
        |                                                   |
        |  SSE (progress + results)                         |
        | <----------------------------------------------   |
        |                                                   |
        v                                                   v
+------------------+                               +------------------+
|   User Webcam    |                               |  YOLOv8n Model   |
|   or Video File  |                               |   (6.5 MB .pt)   |
+------------------+                               +------------------+

Component Overview

Component Technology Purpose
Frontend React 19 + Vite 8 User interface, webcam capture, visualization
Backend FastAPI + Uvicorn API server, WebSocket handler, video processing
Model YOLOv8n (Nano) Object detection inference
Proxy Nginx Static file serving, WebSocket proxy
Container Docker Compose Orchestration and deployment

Detected Classes

The model detects the following beverage container classes from the COCO dataset:

Class ID Label Description
39 bottle Plastic bottles, water bottles, soda bottles
40 wine glass Wine glasses, champagne flutes
41 cup Coffee cups, mugs, plastic cups

The model uses YOLOv8n pre-trained on COCO and filters detections to only these beverage-related classes.


Quick Start

Docker Deployment (Recommended)

The fastest way to get started is using Docker Compose:

# Clone the repository
git clone <repository-url>
cd yolo-demo

# Start all services
docker compose up --build

# Access the application
# Frontend: http://localhost:3000
# Backend API: http://localhost:8000
# API Docs: http://localhost:8000/docs

Stop the services:

docker compose down

View logs:

docker compose logs -f backend
docker compose logs -f frontend

Local Development

Prerequisites

  • Python 3.11+
  • Node.js 20+
  • pip or uv package manager

Backend Setup

cd backend

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run the server
python main.py

# Server will start at http://localhost:8000

Frontend Setup

cd frontend

# Install dependencies
npm install

# Start development server
npm run dev

# Frontend will start at http://localhost:5173

API Documentation

WebSocket Endpoint

WS /ws/detect

Real-time object detection via WebSocket connection.

Message Types (Client -> Server):

  1. Binary Frame - JPEG-encoded image bytes

    Send raw JPEG bytes for detection
    
  2. Config Message - JSON text to update settings

    {
      "type": "config",
      "threshold": 0.5
    }

Response Types (Server -> Client):

  1. Detection Result

    {
      "type": "detection",
      "detections": [
        {
          "class_id": 39,
          "label": "bottle",
          "confidence": 0.923,
          "bbox": [120.5, 80.2, 250.8, 400.1]
        }
      ],
      "inference_ms": 23.45,
      "avg_ms": 25.12,
      "p95_ms": 32.50,
      "p99_ms": 45.20,
      "frames": 30
    }
  2. Config Acknowledgment

    {
      "type": "config_ack",
      "confidence": 0.5
    }
  3. Error

    {
      "type": "error",
      "detail": "could not decode frame"
    }

REST Endpoint

POST /upload

Upload a video file for batch processing with SSE progress streaming.

Request:

  • Content-Type: multipart/form-data
  • Body: file - Video file (MP4, WebM, MOV)

Response: Server-Sent Events stream

SSE Event Types:

  1. Start Event

    {
      "type": "start",
      "job_id": "uuid",
      "total_frames": 1500,
      "fps": 30
    }
  2. Progress Event (per frame)

    {
      "type": "progress",
      "frame": 100,
      "total_frames": 1500,
      "progress": 6.7,
      "detections": [...]
    }
  3. Complete Event

    {
      "type": "complete",
      "job_id": "uuid",
      "total_frames": 1500,
      "results": [...]
    }

Benchmarks

Performance benchmarks measured on different hardware configurations:

Inference Latency (YOLOv8n)

Hardware Avg (ms) P95 (ms) P99 (ms) FPS
Apple M1 (CPU) 45 52 58 ~22
Apple M2 Pro (CPU) 32 38 42 ~31
Intel i7-12700K (CPU) 38 45 50 ~26
NVIDIA RTX 3060 (GPU) 8 10 12 ~125
NVIDIA RTX 4090 (GPU) 3 4 5 ~333

Note: Benchmarks include model inference only, not pre/post-processing or network latency.

Model Comparison

Model Size mAP@50 Inference (CPU) Inference (GPU)
YOLOv8n 6.5 MB 37.3 45 ms 8 ms
YOLOv8s 22 MB 44.9 85 ms 12 ms
YOLOv8m 52 MB 50.2 180 ms 18 ms
YOLOv8l 84 MB 52.9 320 ms 25 ms
YOLOv8x 131 MB 53.9 480 ms 35 ms

Memory Usage

Component RAM Usage
Backend (idle) ~500 MB
Backend (active) ~1.2 GB
Frontend (Nginx) ~15 MB
Model (YOLOv8n) ~50 MB

Throughput (Video Processing)

Resolution FPS (CPU) FPS (GPU)
480p 18-22 80-100
720p 12-15 60-80
1080p 6-8 40-50

Model Information

YOLOv8 Nano (yolov8n.pt)

  • Architecture: YOLOv8 (Ultralytics)
  • Variant: Nano (smallest, fastest)
  • Size: 6.5 MB (3.2M parameters)
  • Training Data: COCO 2017 (80 classes)
  • Input Size: 640x640 (auto-scaled)
  • Output: Bounding boxes + class probabilities

Using a Custom Model

To use your own trained YOLOv8 model:

  1. Place your .pt file in the models/ directory
  2. Update the MODEL_PATH environment variable:
    export MODEL_PATH=/app/models/your-model.pt
  3. Or modify docker-compose.yml:
    environment:
      - MODEL_PATH=/app/models/your-model.pt

Training Your Own Model

# Install ultralytics
pip install ultralytics

# Train on custom dataset
yolo detect train data=your-dataset.yaml model=yolov8n.pt epochs=100

# Export to ONNX for faster inference (optional)
yolo export model=runs/detect/train/weights/best.pt format=onnx

Training with Google Colab

Train a custom beverage detection model using Google Colab's free GPU and Roboflow for dataset management.

Custom Dataset Classes

The custom Roboflow dataset includes 9 beverage container classes:

Class ID Label Description
0 bottle-glass Glass bottles
1 bottle-plastic Plastic bottles
2 cup-disposable Disposable cups
3 cup-handle Cups with handles
4 glass-mug Glass mugs
5 glass-normal Regular glasses
6 glass-wine Wine glasses
7 gym bottle Sports/gym bottles
8 tin can Tin cans

Colab Training Notebook

Create a new Colab notebook and run the following cells:

Cell 1: Install Dependencies

!pip install ultralytics roboflow

Cell 2: Download Dataset from Roboflow

from roboflow import Roboflow

rf = Roboflow(api_key="YOUR_ROBOFLOW_API_KEY")
project = rf.workspace("").project("")
dataset = project.version(1).download("yolov8")

Get your API key from: Roboflow Dashboard → Settings → API Key

Cell 3: Train the Model

from ultralytics import YOLO

# Load pretrained YOLOv8n
model = YOLO('yolov8n.pt')

# Train on custom dataset
results = model.train(
    data='/content/beverage-containers-1/data.yaml',
    epochs=100,
    imgsz=640,
    batch=16,
    name='beverage-detector'
)

Cell 4: Validate the Model

# Run validation
metrics = model.val()
print(f"mAP50: {metrics.box.map50:.3f}")
print(f"mAP50-95: {metrics.box.map:.3f}")

Cell 5: Download Trained Model

from google.colab import files

# Download best weights
files.download('/content/runs/detect/beverage-detector/weights/best.pt')

Using Your Trained Model

  1. Download best.pt from Colab
  2. Place it in the models/ directory:
    mv ~/Downloads/best.pt models/beverage-model.pt
  3. Update docker-compose.yml:
    environment:
      - MODEL_PATH=/app/models/beverage-model.pt
  4. Update class filtering in backend/detector.py to use class IDs 0-8 instead of COCO IDs

Alternative: Upload Dataset Manually

If you prefer not to use the Roboflow API:

  1. Zip your dataset locally:

    zip -r dataset.zip train/ valid/ test/ data.yaml
  2. Upload and unzip in Colab:

    from google.colab import files
    
    # Upload the zip file
    uploaded = files.upload()
    
    # Unzip
    !unzip dataset.zip
  3. Train with local data.yaml:

    model.train(data='/content/data.yaml', epochs=100)

Configuration

Environment Variables

Variable Default Description
MODEL_PATH ../models/yolov8n.pt Path to YOLOv8 model weights
VITE_API_URL (auto-detect) Backend API URL for frontend
VITE_WS_URL (auto-detect) WebSocket URL for frontend

Detection Parameters

Parameter Default Range Description
confidence 0.45 0.1-0.9 Minimum confidence threshold
window_size 30 10-100 Rolling window for benchmark stats

Project Structure

yolo-demo/
├── backend/
│   ├── main.py              # FastAPI application entry point
│   ├── detector.py          # YOLOv8 wrapper class
│   ├── benchmark.py         # Latency tracking utilities
│   ├── requirements.txt     # Python dependencies
│   ├── Dockerfile           # Backend container definition
│   └── .dockerignore        # Docker build exclusions
│
├── frontend/
│   ├── src/
│   │   ├── App.jsx          # Main React component
│   │   ├── App.css          # Application styles
│   │   ├── components/
│   │   │   ├── Webcam.jsx   # Webcam capture component
│   │   │   ├── Canvas.jsx   # Bounding box overlay
│   │   │   └── VideoUpload.jsx # Video upload UI
│   │   └── hooks/
│   │       ├── useDetection.js # WebSocket hook
│   │       └── useUpload.js    # Upload/SSE hook
│   ├── package.json         # Node.js dependencies
│   ├── vite.config.js       # Vite configuration
│   ├── nginx.conf           # Production nginx config
│   ├── Dockerfile           # Frontend container definition
│   └── .dockerignore        # Docker build exclusions
│
├── models/
│   └── yolov8n.pt           # Pre-trained YOLOv8 nano weights
│
├── docker-compose.yml       # Multi-service orchestration
└── README.md                # This file

Troubleshooting

Common Issues

WebSocket Connection Failed

  • Ensure backend is running on port 8000
  • Check firewall settings
  • Verify CORS configuration

Model Not Found

  • Confirm yolov8n.pt exists in models/ directory
  • Check MODEL_PATH environment variable

Slow Inference

  • Consider using GPU acceleration
  • Reduce input resolution
  • Use a smaller model variant (yolov8n)

Docker Build Fails

  • Ensure Docker Desktop is running
  • Check disk space (needs ~5GB)
  • Try docker compose build --no-cache

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit changes (git commit -m 'Add amazing feature')
  4. Push to branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.


Acknowledgments

About

Planning to add more computer vision projects to this repository.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors