bbyrd2021/efficient_light_detection

Traffic Light Detection - 7-Class Classification System

Production-ready traffic light classifier using EfficientNet-B0 on the LISA Traffic Light Dataset. This system classifies traffic lights into 7 categories: 3 circular lights (red, yellow, green) and 4 arrow lights (red, yellow, and green left arrows plus a green straight arrow).

Overview

  • Model: EfficientNet-B0 (128×128 input)
  • Dataset: LISA Traffic Light Dataset (~5GB from Kaggle)
  • Classes: 7 categories with underscore naming
    • Circular: red_light, yellow_light, green_light
    • Left arrows: red_left, yellow_left, green_left
    • Straight arrows: green_straight
  • Total Samples: 109,475 annotated traffic lights
  • Performance Target: >85% validation accuracy
  • Integration: Designed for ROS integration with YOLOv10 detector

Installation

Prerequisites

  • Python 3.10
  • ~10GB disk space for LISA dataset
  • NVIDIA GPU recommended for training
  • Kaggle account for dataset download

Environment Setup

# Create conda environment
conda create -n eff_tlight python=3.10
conda activate eff_tlight

# Change into the cloned repository
cd /data/repos/eff_light_detection

# Install dependencies
pip install -r requirements.txt

Kaggle API Setup (Required)

You must configure Kaggle API credentials to download the LISA dataset:

# Install Kaggle CLI (already included in requirements.txt)
pip install kaggle

# Set your Kaggle API token
export KAGGLE_API_TOKEN=<your-token-here>

# Or create ~/.kaggle/kaggle.json with your credentials

See KAGGLE_SETUP.md for detailed instructions and troubleshooting.


Quick Start

Step 1: Download LISA Dataset

# Set Kaggle API token
export KAGGLE_API_TOKEN=<your-token-here>

# Download LISA dataset (~5GB)
mkdir -p /data/datasets/lisa
cd /data/datasets/lisa
kaggle datasets download -d mbornoe/lisa-traffic-light-dataset

# Unzip dataset
unzip lisa-traffic-light-dataset.zip
rm lisa-traffic-light-dataset.zip  # Optional: remove zip after extraction

Step 2: Explore Dataset

Generate statistics and visualizations:

python tools/explore_lisa.py \
    --dataset-dir /data/datasets/lisa \
    --visualize \
    --num-samples 6

Outputs:

  • results/statistics/lisa_statistics.json - Dataset statistics
  • results/visualizations/lisa_dataset_statistics.png - Distribution plots
  • results/visualizations/lisa_sample_images.png - Sample annotated images

Step 3: Create Class Mapping

python tools/create_class_mapping.py \
    --stats results/statistics/lisa_statistics.json \
    --print-table

Outputs:

  • results/statistics/class_mapping.json - 7-class mapping configuration

Step 4: Generate Patch Dataset

python tools/generate_patch_dataset.py \
    --lisa-dir /data/datasets/lisa \
    --output-dir /data/datasets/lisa_patches_128 \
    --patch-size 128 \
    --expansion 1.2 \
    --train-sequences dayTrain,nightTrain \
    --val-sequences daySequence1,nightSequence1

Outputs:

  • /data/datasets/lisa_patches_128/train/ - Training patches (7 subdirectories)
  • /data/datasets/lisa_patches_128/val/ - Validation patches (7 subdirectories)
  • /data/datasets/lisa_patches_128/dataset_metadata.json - Metadata
  • /data/datasets/lisa_patches_128/class_weights.json - Class weights for training
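
The --expansion 1.2 option presumably enlarges each annotated box before cropping. A minimal sketch of that idea, assuming the box is scaled about its center and then clipped to the 1280×960 frame. The function name, the rounding, and the clipping behavior are assumptions for illustration, not taken from generate_patch_dataset.py:

```python
def expand_bbox(x1, y1, x2, y2, factor=1.2, img_w=1280, img_h=960):
    """Scale a box about its center by `factor`, then clip it to the image."""
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * factor / 2.0
    half_h = (y2 - y1) * factor / 2.0
    left = max(0, int(round(cx - half_w)))
    top = max(0, int(round(cy - half_h)))
    right = min(img_w, int(round(cx + half_w)))
    bottom = min(img_h, int(round(cy + half_h)))
    return left, top, right, bottom


# A 20x40 box grows to 24x48 around the same center.
print(expand_bbox(100, 100, 120, 140))
```

The extra margin keeps some housing and background around very small lights, which matters when the mean box is only about 25×42 pixels.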

Step 5: Train Model

python train.py \
    --data-dir /data/datasets/lisa_patches_128 \
    --epochs 50 \
    --batch-size 32 \
    --lr 0.001 \
    --use-class-weights \
    --image-size 128

Outputs:

  • experiments/efficientnet_b0_<timestamp>/best.pth - Best model checkpoint
  • experiments/efficientnet_b0_<timestamp>/logs/ - TensorBoard logs

Monitor training:

tensorboard --logdir experiments/efficientnet_b0_<timestamp>/logs

Step 6: Evaluate Model

python evaluate.py \
    --checkpoint experiments/efficientnet_b0_<timestamp>/best.pth \
    --data-dir /data/datasets/lisa_patches_128/val

Outputs:

  • Confusion matrix (7×7)
  • Per-class precision/recall/F1 scores
  • Misclassified sample images
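
For readers who want to check the reported numbers by hand, the sketch below shows how a confusion matrix and per-class precision/recall/F1 are typically derived from integer labels. It is a generic illustration in plain Python; the function names are hypothetical and evaluate.py may compute these differently:

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_metrics(cm):
    """Return a list of dicts with precision, recall, and F1 per class."""
    n = len(cm)
    metrics = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, truly another class
        fn = sum(cm[c]) - tp                       # truly c, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics.append({"precision": precision, "recall": recall, "f1": f1})
    return metrics


# Toy example with 3 classes; the real matrix is 7x7.
cm = confusion_matrix([0, 0, 1, 1, 2], [0, 1, 1, 1, 2], num_classes=3)
for class_id, m in enumerate(per_class_metrics(cm)):
    print(class_id, m)
```

With heavy class imbalance, per-class F1 is more informative than overall accuracy, because a model can score well overall while missing most yellow_left or green_straight samples.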

Step 7: Run Demo

python demo.py \
    --checkpoint experiments/efficientnet_b0_<timestamp>/best.pth \
    --image samples/red_light.jpg

Dataset Details

LISA Traffic Light Dataset

  • Source: Kaggle (mbornoe/lisa-traffic-light-dataset)
  • Format: Images (1280×960 JPG) + CSV annotations
  • Size: ~5GB (compressed), ~8GB (extracted)
  • Content: 43,007 frames from day/night driving sequences
  • Annotations: 109,475 traffic light bounding boxes with state labels
  • Capture: San Diego, CA (Pacific Beach & La Jolla) with stereo camera
  • Conditions: Day and night, varying weather and lighting

Classification Mapping

LISA provides annotation tags per traffic light:

  • Circular lights: stop, warning, go
  • Left arrows: stopLeft, warningLeft, goLeft
  • Straight arrows: goForward

Mapping to 7 categories:

LISA Tag      Our Category      Count     Percentage
stop          red_light         44,318    40.5%
warning       yellow_light       2,669     2.4%
go            green_light       46,723    42.7%
stopLeft      red_left          12,734    11.6%
warningLeft   yellow_left          350     0.3%
goLeft        green_left         2,476     2.3%
goForward     green_straight       205     0.2%

Missing: red_straight, yellow_straight (not in LISA dataset)
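
The mapping can be expressed as a simple lookup. The sketch below is an illustration of that table, not the contents of class_mapping.json; the names LISA_TO_CLASS and map_tags are hypothetical, and collecting unknown tags in a separate list is an assumption about how unmapped annotations might be handled:

```python
from collections import Counter

# LISA annotation tag -> category name used in this project (from the table).
LISA_TO_CLASS = {
    "stop": "red_light",
    "warning": "yellow_light",
    "go": "green_light",
    "stopLeft": "red_left",
    "warningLeft": "yellow_left",
    "goLeft": "green_left",
    "goForward": "green_straight",
}


def map_tags(tags):
    """Return (per-category counts, list of tags with no mapping)."""
    counts = Counter()
    unmapped = []
    for tag in tags:
        category = LISA_TO_CLASS.get(tag)
        if category is None:
            unmapped.append(tag)
        else:
            counts[category] += 1
    return counts, unmapped


counts, unmapped = map_tags(["stop", "go", "goLeft", "stop", "someOtherTag"])
print(dict(counts), unmapped)
```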

Class Imbalance Note: Circular lights (red/yellow/green) account for ~94K samples (85.6%), dominated by red_light and green_light. Arrow lights are rarer, especially yellow_left and green_straight (<1% each), so class weighting is required during training.
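
To make the imbalance concrete, the sketch below computes inverse-frequency class weights from the counts in the table, using the common "balanced" formula weight = N / (K × n_c). This formula is an assumption; the values written to class_weights.json by the patch generator may be computed differently:

```python
# Annotation counts per category, copied from the mapping table.
CLASS_COUNTS = {
    "red_light": 44318,
    "yellow_light": 2669,
    "green_light": 46723,
    "red_left": 12734,
    "yellow_left": 350,
    "green_left": 2476,
    "green_straight": 205,
}


def inverse_frequency_weights(counts):
    """weight_c = total / (num_classes * count_c); rare classes get large weights."""
    total = sum(counts.values())
    num_classes = len(counts)
    return {name: total / (num_classes * n) for name, n in counts.items()}


weights = inverse_frequency_weights(CLASS_COUNTS)
for name, w in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{name:15s} {w:8.3f}")
```

Under this formula the weights span more than two orders of magnitude between green_light and green_straight, which is why some practitioners clip or take the square root of such weights to keep training stable.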


Model Architecture

EfficientNet-B0

  • Input: 128×128×3 RGB images
  • Output: 7 classes (softmax)
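  • Parameters: ~5.3M in the standard 1000-class configuration (roughly 4.0M once the head is replaced by a 7-class layer)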
  • Model size: ~17MB
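
The softmax output can be turned into a label and a confidence as sketched below. The class order is an assumption (alphabetical, as a folder-per-class loader would typically produce); the actual index-to-name order should be read from the training metadata:

```python
import math

# Assumed index order: alphabetical by folder name. Verify against the
# dataset metadata before relying on it.
CLASS_NAMES = [
    "green_left", "green_light", "green_straight",
    "red_left", "red_light", "yellow_left", "yellow_light",
]


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Return (class name, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASS_NAMES[best], probs[best]


label, confidence = predict([0.1, 0.2, -1.0, 0.5, 3.0, -0.5, 0.0])
print(label, round(confidence, 3))
```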

Data Augmentation (Traffic Light-Specific)

# Training augmentation
- Rotation: ±10° (less than traffic signs - lights are more vertical)
- Brightness: ±40% (high variance due to time of day)
- Contrast: ±30%
- Saturation: ±30% (critical for color-based classification)
- Hue: ±2%
- Gaussian blur: σ=0-2
- NO horizontal flip (would change light position semantics)

Training Configuration

  • Optimizer: Adam (lr=0.001)
  • Scheduler: Cosine annealing
  • Loss: CrossEntropyLoss with class weights
  • Batch size: 32
  • Epochs: 50
  • Early stopping: Patience=10
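
Two of these settings are easy to illustrate without a deep learning framework. The sketch below shows the cosine annealing formula for the learning rate and a patience-based early stopping counter. The minimum learning rate of 0, the one-cycle-over-all-epochs schedule, and the use of validation accuracy as the monitored metric are assumptions; train.py may configure these differently:

```python
import math


def cosine_lr(epoch, total_epochs=50, base_lr=0.001, min_lr=0.0):
    """Cosine annealing: decays from base_lr to min_lr over total_epochs."""
    cosine = math.cos(math.pi * epoch / total_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + cosine)


class EarlyStopping:
    """Signal a stop after `patience` epochs without a new best score."""

    def __init__(self, patience=10):
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, val_accuracy):
        """Record one epoch; return True when training should stop."""
        if val_accuracy > self.best:
            self.best = val_accuracy
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


for epoch in (0, 10, 25, 40, 50):
    print(epoch, f"{cosine_lr(epoch):.6f}")
```

The learning rate starts at 0.001, passes 0.0005 at the midpoint, and approaches the minimum at epoch 50.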

Project Structure

eff_light_detection/
├── tools/
│   ├── explore_lisa.py               # Dataset exploration and statistics
│   ├── create_class_mapping.py       # 7-class mapping definition
│   ├── generate_patch_dataset.py     # Extract patches from LISA frames and CSV annotations
│   └── README.md                     # Tool documentation
├── train.py                          # Training script
├── evaluate.py                       # Evaluation script
├── demo.py                           # Inference demo
├── requirements.txt                  # Python dependencies
├── README.md                         # This file
├── KAGGLE_SETUP.md                   # Kaggle API setup instructions
├── .gitignore                        # Git ignore rules
├── LICENSE                           # MIT License
├── results/
│   ├── visualizations/               # Dataset visualizations
│   └── statistics/                   # Dataset statistics
├── experiments/                      # Training checkpoints and logs
└── samples/                          # Sample images for demo

Expected Performance

Dataset Statistics (LISA Traffic Light Dataset)

  • Total frames: 43,007
  • Total traffic lights: 109,475
  • Distribution:
    • Circular lights: 85.6% (green_light 42.7%, red_light 40.5%, yellow_light 2.4%)
    • Arrow lights: 14.4% (red_left 11.6%, green_left 2.3%, yellow_left 0.3%, green_straight 0.2%)
  • Bbox size: Mean 25×42 pixels, range 6-160 × 10-197 pixels

Training Performance

  • Target accuracy: >85% validation
  • Baseline: >75% validation
  • Training time: 4-8 hours (50 epochs, V100 GPU)
  • Model size: ~17MB
  • Expected per-class F1: >0.9 for circular lights, >0.7 for arrow lights

Challenges

  • Severe class imbalance: yellow_left (0.3%) and green_straight (0.2%) are very rare
    • Solution: Inverse frequency weighting with oversampling for rare classes
  • Small bbox sizes: Mean 25×42 pixels, many traffic lights <20 pixels
    • Solution: 1.2× bbox expansion during patch extraction
  • Brightness variation: Day/night sequences with varying lighting
    • Solution: Heavy brightness/saturation augmentation (±40% brightness)

ROS Integration (Future)

This classifier is designed to integrate with YOLOv10 for autonomous vehicle perception:

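# Proposed ROS node structure (sketch: load_model, extract_crop, and the
# message types elided as "..." are placeholders, not an existing API)
import rospy

class TrafficLightClassifierNode:
    def __init__(self):
        self.model = load_model('best.pth')
        self.latest_image = None  # most recent camera frame, set by an image subscriber (not shown)
        self.detector_sub = rospy.Subscriber('/yolo/detections', ...)
        self.state_pub = rospy.Publisher('/traffic_light/states', ...)

    def callback(self, detections):
        if self.latest_image is None:
            return  # no frame received yet, nothing to crop from
        for detection in detections:
            if detection.class_name == 'traffic_light':
                crop = extract_crop(self.latest_image, detection.bbox)
                state = self.model(crop)  # 7-class prediction
                smoothed_state = self.temporal_smooth(state)  # 5-frame buffer
                self.publish_state(detection.id, smoothed_state)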

Features:

  • Temporal smoothing over 5 frames
  • Confidence thresholding
  • Fallback to detection-only mode if classifier confidence is low
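
One way the smoothing and thresholding could work is a majority vote over the last 5 confident predictions. The sketch below is an assumption about the design, not an existing part of this repository: the class name TemporalSmoother, the 0.6 confidence threshold, and the majority-vote rule are all illustrative choices:

```python
from collections import Counter, deque


class TemporalSmoother:
    """Majority vote over the most recent confident predictions."""

    def __init__(self, window=5, min_confidence=0.6):
        self.history = deque(maxlen=window)  # oldest entries drop off automatically
        self.min_confidence = min_confidence

    def update(self, label, confidence):
        """Add one prediction; return the smoothed label, or None if no history."""
        if confidence >= self.min_confidence:
            self.history.append(label)
        if not self.history:
            return None  # caller can fall back to detection-only mode
        return Counter(self.history).most_common(1)[0][0]


smoother = TemporalSmoother()
frames = [("red_light", 0.9), ("red_light", 0.8), ("green_light", 0.7),
          ("red_light", 0.9), ("yellow_light", 0.3)]
print([smoother.update(label, conf) for label, conf in frames])
```

In a real node, one smoother would be kept per tracked light (for example, keyed by detection id) so that votes from different lights do not mix.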

Troubleshooting

"Dataset not found" or "No annotations found"

  • Verify dataset downloaded: ls /data/datasets/lisa/
  • Check for Annotations directory: ls /data/datasets/lisa/Annotations/
  • See KAGGLE_SETUP.md for download instructions

"401 Unauthorized" (Kaggle API)

# Set your Kaggle API token
export KAGGLE_API_TOKEN=<your-token-here>

# Or create ~/.kaggle/kaggle.json
mkdir -p ~/.kaggle
echo '{"username":"<user>","key":"<key>"}' > ~/.kaggle/kaggle.json
chmod 600 ~/.kaggle/kaggle.json
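
Before retrying a download, it can help to confirm which credential source is actually present. The sketch below checks the two options described here: the KAGGLE_API_TOKEN environment variable and ~/.kaggle/kaggle.json with 600 permissions. The function name and the returned status strings are hypothetical, and the permission check assumes a POSIX system:

```python
import json
import os
import stat
from pathlib import Path


def kaggle_credentials_status(home=None, env=None):
    """Report which Kaggle credential source is available, if any."""
    env = os.environ if env is None else env
    if env.get("KAGGLE_API_TOKEN"):
        return "token set in environment"
    base = Path(home) if home is not None else Path.home()
    config = base / ".kaggle" / "kaggle.json"
    if not config.is_file():
        return "missing"
    try:
        data = json.loads(config.read_text())
    except json.JSONDecodeError:
        return "invalid JSON"
    if not isinstance(data, dict) or not {"username", "key"} <= data.keys():
        return "incomplete"
    # chmod 600 means no group/other permission bits are set.
    if stat.S_IMODE(config.stat().st_mode) & 0o077:
        return "permissions too open"
    return "ok"


if __name__ == "__main__":
    print(kaggle_credentials_status())
```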

"CSV parsing errors"

  • LISA uses semicolon (;) as delimiter, not comma
  • Verify annotation files exist: find /data/datasets/lisa/Annotations -name "*.csv"
  • Check CSV format: head /data/datasets/lisa/Annotations/Annotations/dayTrain/dayClip1/frameAnnotationsBOX.csv
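
A minimal sketch of reading a semicolon-delimited annotation file with Python's csv module. The column names below are assumed from the commonly distributed frameAnnotationsBOX.csv files and should be checked against the head output; the sample rows are made up for illustration and are not copied from the dataset:

```python
import csv
import io
from collections import Counter

# Assumed header names; confirm them against your own annotation files.
BOX_COLUMNS = ("Upper left corner X", "Upper left corner Y",
               "Lower right corner X", "Lower right corner Y")

# Made-up rows for illustration only.
SAMPLE_CSV = (
    "Filename;Annotation tag;Upper left corner X;Upper left corner Y;"
    "Lower right corner X;Lower right corner Y\n"
    "example_0001.jpg;go;700;330;712;355\n"
    "example_0001.jpg;stopLeft;846;391;858;411\n"
    "example_0002.jpg;go;702;331;714;356\n"
)


def read_annotations(handle):
    """Parse semicolon-delimited rows into dicts with an integer bbox tuple."""
    reader = csv.DictReader(handle, delimiter=";")
    rows = []
    for row in reader:
        bbox = tuple(int(row[col]) for col in BOX_COLUMNS)
        rows.append({"filename": row["Filename"],
                     "tag": row["Annotation tag"],
                     "bbox": bbox})
    return rows


rows = read_annotations(io.StringIO(SAMPLE_CSV))
print(Counter(r["tag"] for r in rows))
```

For a real file, pass a handle opened with open(path, newline="") instead of the in-memory sample. Parsing with the default comma delimiter would put the whole line into a single column, which is a typical cause of the KeyError seen in this situation.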

Low accuracy / Poor performance

  • Critical: Use class weights (--use-class-weights) due to severe imbalance
  • Verify augmentation settings (brightness ±40%, saturation critical)
  • Check confusion matrix for rare classes (yellow_left, green_straight)
  • Consider oversampling rare classes or using focal loss
  • Rare classes may need data augmentation or synthetic samples
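
For reference, focal loss scales the cross-entropy of each sample by (1 − p_t)^γ, so confidently correct samples contribute little and hard samples dominate. The sketch below evaluates it for a single sample in plain Python; γ = 2 is a commonly used default and an assumption here, and this repository's train.py is not known to include a focal loss option:

```python
import math


def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss for one sample: -alpha * (1 - p_t)**gamma * log(p_t)."""
    p_t = max(probs[target], 1e-12)  # guard against log(0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss([0.05, 0.9, 0.05], target=1)   # confident and correct
hard = focal_loss([0.6, 0.1, 0.3], target=1)     # true class gets low probability
print(f"easy={easy:.5f} hard={hard:.5f}")
```

Setting gamma to 0 recovers ordinary cross-entropy, which makes it straightforward to compare the two losses on the rare classes.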

Citation

If you use this code or the LISA Traffic Light Dataset, please cite:

@article{jensen2016vision,
  title={Vision for Looking at Traffic Lights: Issues, Survey, and Perspectives},
  author={Jensen, Morten Bornø and Philipsen, Mark Philip and Møgelmose, Andreas and Moeslund, Thomas B and Trivedi, Mohan M},
  journal={IEEE Transactions on Intelligent Transportation Systems},
  volume={17},
  number={7},
  pages={1800--1815},
  year={2016},
  publisher={IEEE}
}

LISA Dataset: Kaggle | Original Paper


License

MIT License - see LICENSE file



Contact

For questions or issues, please open a GitHub issue.

Status: Active development for autonomous vehicle deployment.
