Production-ready traffic light classifier using EfficientNet-B0 on the LISA Traffic Light Dataset. This system classifies traffic lights into 7 categories: 3 circular lights (red, yellow, green) and 4 arrow lights (left + straight directions).
- Model: EfficientNet-B0 (128×128 input)
- Dataset: LISA Traffic Light Dataset (~5GB from Kaggle)
- Classes: 7 categories with underscore naming
  - Circular: red_light, yellow_light, green_light
  - Left arrows: red_left, yellow_left, green_left
  - Straight arrows: green_straight
- Total Samples: 109,475 annotated traffic lights
- Performance Target: >85% validation accuracy
- Integration: Designed for ROS integration with YOLOv10 detector
- Python 3.10
- ~10GB disk space for LISA dataset
- NVIDIA GPU recommended for training
- Kaggle account for dataset download
# Create conda environment
conda create -n eff_tlight python=3.10
conda activate eff_tlight
# Enter the repository directory (clone it here first if it is not present)
cd /data/repos/eff_light_detection
# Install dependencies
pip install -r requirements.txt

You must configure Kaggle API credentials to download the LISA dataset:
# Install Kaggle CLI (already included in requirements.txt)
pip install kaggle
# Set your Kaggle API token
export KAGGLE_API_TOKEN=<your-token-here>
# Or create ~/.kaggle/kaggle.json with your credentials
# See KAGGLE_SETUP.md for detailed instructions

See KAGGLE_SETUP.md for detailed instructions and troubleshooting.
# Set Kaggle API token
export KAGGLE_API_TOKEN=<your-token-here>
# Download LISA dataset (~5GB)
mkdir -p /data/datasets/lisa
cd /data/datasets/lisa
kaggle datasets download -d mbornoe/lisa-traffic-light-dataset
# Unzip dataset
unzip lisa-traffic-light-dataset.zip
rm lisa-traffic-light-dataset.zip # Optional: remove zip after extraction

Generate statistics and visualizations:
python tools/explore_lisa.py \
--dataset-dir /data/datasets/lisa \
--visualize \
--num-samples 6

Outputs:
- results/statistics/lisa_statistics.json - Dataset statistics
- results/visualizations/lisa_dataset_statistics.png - Distribution plots
- results/visualizations/lisa_sample_images.png - Sample annotated images
python tools/create_class_mapping.py \
--stats results/statistics/lisa_statistics.json \
--print-table

Outputs:
- results/statistics/class_mapping.json - 7-class mapping configuration
python tools/generate_patch_dataset.py \
--lisa-dir /data/datasets/lisa \
--output-dir /data/datasets/lisa_patches_128 \
--patch-size 128 \
--expansion 1.2 \
--train-sequences dayTrain,nightTrain \
--val-sequences daySequence1,nightSequence1

Outputs:
- /data/datasets/lisa_patches_128/train/ - Training patches (7 subdirectories)
- /data/datasets/lisa_patches_128/val/ - Validation patches (7 subdirectories)
- /data/datasets/lisa_patches_128/dataset_metadata.json - Metadata
- /data/datasets/lisa_patches_128/class_weights.json - Class weights for training
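The `--expansion 1.2` option suggests that each annotated box is enlarged about its center before cropping. The following is a minimal sketch of that idea, not the code in tools/generate_patch_dataset.py; the function name `expand_bbox` and the clipping behavior are assumptions made for illustration.

```python
def expand_bbox(x1, y1, x2, y2, img_w, img_h, expansion=1.2):
    """Scale a box about its center by `expansion`, clipped to the image.

    Hypothetical helper: the real tool may round or clip differently.
    """
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * expansion / 2.0
    half_h = (y2 - y1) * expansion / 2.0
    # Clip to the image so a crop never reads outside the frame.
    nx1 = max(0, int(round(cx - half_w)))
    ny1 = max(0, int(round(cy - half_h)))
    nx2 = min(img_w, int(round(cx + half_w)))
    ny2 = min(img_h, int(round(cy + half_h)))
    return nx1, ny1, nx2, ny2


# A 20x40 box in a 1280x960 frame grows to roughly 24x48 pixels.
print(expand_bbox(100, 100, 120, 140, img_w=1280, img_h=960))
```

The expanded crop would then be resized to 128×128 with an image library of your choice. The extra margin keeps some housing and background context around very small lights.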
python train.py \
--data-dir /data/datasets/lisa_patches_128 \
--epochs 50 \
--batch-size 32 \
--lr 0.001 \
--use-class-weights \
--image-size 128

Outputs:
- experiments/efficientnet_b0_<timestamp>/best.pth - Best model checkpoint
- experiments/efficientnet_b0_<timestamp>/logs/ - TensorBoard logs
Monitor training:
tensorboard --logdir experiments/efficientnet_b0_<timestamp>/logs

python evaluate.py \
--checkpoint experiments/efficientnet_b0_<timestamp>/best.pth \
--data-dir /data/datasets/lisa_patches_128/val

Outputs:
- Confusion matrix (7×7)
- Per-class precision/recall/F1 scores
- Misclassified sample images
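For readers who want to check the reported numbers by hand, per-class precision, recall, and F1 can be derived directly from a confusion matrix. This is a sketch only; the row/column convention (rows are true classes, columns are predictions) and the function name are assumptions, and evaluate.py may organize its output differently.

```python
def per_class_metrics(confusion, class_names):
    """Compute precision/recall/F1 per class.

    Assumes confusion[i][j] counts samples of true class i predicted as j.
    """
    n = len(class_names)
    metrics = {}
    for k, name in enumerate(class_names):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # column minus diagonal
        fn = sum(confusion[k]) - tp                       # row minus diagonal
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        metrics[name] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Toy 2-class example with made-up counts.
toy = [[8, 2],
       [1, 9]]
for name, m in per_class_metrics(toy, ["red_light", "green_light"]).items():
    print(name, {k: round(v, 3) for k, v in m.items()})
```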
python demo.py \
--checkpoint experiments/efficientnet_b0_<timestamp>/best.pth \
--image samples/red_light.jpg

- Source: Kaggle (mbornoe/lisa-traffic-light-dataset)
- Format: Images (1280×960 JPG) + CSV annotations
- Size: ~5GB (compressed), ~8GB (extracted)
- Content: 43,007 frames from day/night driving sequences
- Annotations: 109,475 traffic light bounding boxes with state labels
- Capture: San Diego, CA (Pacific Beach & La Jolla) with stereo camera
- Conditions: Day and night, varying weather and lighting
LISA provides annotation tags per traffic light:
- Circular lights: stop, warning, go
- Left arrows: stopLeft, warningLeft, goLeft
- Straight arrows: goForward
Mapping to 7 categories:
| LISA Tag | Our Category | Count | Percentage |
|---|---|---|---|
| stop | red_light | 44,318 | 40.5% |
| warning | yellow_light | 2,669 | 2.4% |
| go | green_light | 46,723 | 42.7% |
| stopLeft | red_left | 12,734 | 11.6% |
| warningLeft | yellow_left | 350 | 0.3% |
| goLeft | green_left | 2,476 | 2.3% |
| goForward | green_straight | 205 | 0.2% |
Missing: red_straight, yellow_straight (not in LISA dataset)
Class Imbalance Note: Circular lights (red/yellow/green) are well-represented with ~94K samples. Arrow lights are rarer, especially yellow_left and green_straight (<1% each), requiring class weighting during training.
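To make the imbalance concrete, the sketch below turns the counts from the mapping table into inverse-frequency class weights. The "balanced" formula used here (total / (num_classes × count)) is an assumption; the weights written to class_weights.json may be computed differently, and would be based on the training split rather than the full dataset.

```python
# LISA annotation tag -> category name, as listed in the mapping table.
LISA_TO_CATEGORY = {
    "stop": "red_light",
    "warning": "yellow_light",
    "go": "green_light",
    "stopLeft": "red_left",
    "warningLeft": "yellow_left",
    "goLeft": "green_left",
    "goForward": "green_straight",
}

# Full-dataset counts per category, from the same table.
CLASS_COUNTS = {
    "red_light": 44318,
    "yellow_light": 2669,
    "green_light": 46723,
    "red_left": 12734,
    "yellow_left": 350,
    "green_left": 2476,
    "green_straight": 205,
}


def inverse_frequency_weights(counts):
    """Weight each class by total / (num_classes * count).

    Assumed formula for illustration; a perfectly balanced dataset
    would give every class a weight of 1.0.
    """
    total = sum(counts.values())
    num_classes = len(counts)
    return {name: total / (num_classes * c) for name, c in counts.items()}


weights = inverse_frequency_weights(CLASS_COUNTS)
for name, w in sorted(weights.items(), key=lambda kv: kv[1]):
    print(f"{name:15s} {w:8.3f}")
```

With these counts the rarest class receives a weight more than two orders of magnitude larger than the most common one, which is why very rare classes can dominate the loss if weights are left uncapped.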
- Input: 128×128×3 RGB images
- Output: 7 classes (softmax)
- Parameters: ~5.3M
- Model size: ~17MB
# Training augmentation
- Rotation: ±10° (less than traffic signs - lights are more vertical)
- Brightness: ±40% (high variance due to time of day)
- Contrast: ±30%
- Saturation: ±30% (critical for color-based classification)
- Hue: ±2%
- Gaussian blur: σ=0-2
- NO horizontal flip (would change light position semantics)

- Optimizer: Adam (lr=0.001)
- Scheduler: Cosine annealing
- Loss: CrossEntropyLoss with class weights
- Batch size: 32
- Epochs: 50
- Early stopping: Patience=10
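The schedule and stopping rule in this configuration can be illustrated without any deep learning framework. The sketch below assumes the cosine schedule decays to a minimum learning rate of 0 over the full 50 epochs and that early stopping monitors validation accuracy; neither detail is stated above, and train.py may differ.

```python
import math


def cosine_lr(epoch, total_epochs=50, base_lr=1e-3, min_lr=0.0):
    """Closed-form cosine annealing: base_lr at epoch 0, min_lr at the end."""
    cos_term = 1 + math.cos(math.pi * epoch / total_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * cos_term


class EarlyStopping:
    """Stop when the monitored metric has not improved for `patience` epochs."""

    def __init__(self, patience=10):
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, val_accuracy):
        """Record one epoch; return True when training should stop."""
        if val_accuracy > self.best:
            self.best = val_accuracy
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


for epoch in (0, 10, 25, 40, 50):
    print(epoch, f"{cosine_lr(epoch):.6f}")
```

In a training loop, `EarlyStopping.step` would be called once per epoch with the validation accuracy, and the best checkpoint saved whenever the metric improves.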
eff_light_detection/
├── tools/
│ ├── explore_lisa.py # Dataset exploration and statistics
│ ├── create_class_mapping.py # 7-class mapping definition
│ ├── generate_patch_dataset.py # Extract patches from LISA frames and CSV annotations
│ └── README.md # Tool documentation
├── train.py # Training script
├── evaluate.py # Evaluation script
├── demo.py # Inference demo
├── requirements.txt # Python dependencies
├── README.md # This file
├── GOOGLE_CLOUD_SETUP.md # GCP setup instructions
├── .gitignore # Git ignore rules
├── LICENSE # MIT License
├── results/
│ ├── visualizations/ # Dataset visualizations
│ └── statistics/ # Dataset statistics
├── experiments/ # Training checkpoints and logs
└── samples/ # Sample images for demo
- Total frames: 43,007
- Total traffic lights: 109,475
- Distribution:
  - Circular lights: 85.6% (green_light 42.7%, red_light 40.5%, yellow_light 2.4%)
  - Arrow lights: 14.4% (red_left 11.6%, green_left 2.3%, yellow_left 0.3%, green_straight 0.2%)
- Bbox size: Mean 25×42 pixels, range 6-160 × 10-197 pixels
- Target accuracy: >85% validation
- Baseline: >75% validation
- Training time: 4-8 hours (50 epochs, V100 GPU)
- Model size: ~17MB
- Expected per-class F1: >0.9 for circular lights, >0.7 for arrow lights
- Severe class imbalance: yellow_left (0.3%) and green_straight (0.2%) are very rare
  - Solution: Inverse frequency weighting with oversampling for rare classes
- Small bbox sizes: Mean 25×42 pixels, many traffic lights <20 pixels
  - Solution: 1.2× bbox expansion during patch extraction
- Brightness variation: Day/night sequences with varying lighting
  - Solution: Heavy brightness/saturation augmentation (±40% brightness)
This classifier is designed to integrate with YOLOv10 for autonomous vehicle perception:
# Proposed ROS node structure (sketch; message types are left open as ...)
import rospy

class TrafficLightClassifierNode:
    def __init__(self):
        self.model = load_model('best.pth')
        # Assumption: the latest camera frame is stored here by an image subscriber (not shown)
        self.latest_image = None
        self.detector_sub = rospy.Subscriber('/yolo/detections', ...)
        self.state_pub = rospy.Publisher('/traffic_light/states', ...)

    def callback(self, detections):
        for detection in detections:
            if detection.class_name == 'traffic_light':
                crop = extract_crop(self.latest_image, detection.bbox)
                state = self.model(crop)  # 7-class prediction
                smoothed_state = self.temporal_smooth(state)  # 5-frame buffer
                self.publish_state(detection.id, smoothed_state)

Features:
- Temporal smoothing over 5 frames
- Confidence thresholding
- Fallback to detection-only mode if classifier confidence low
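The smoothing, thresholding, and fallback features could be combined in a small per-track buffer. This is one possible sketch using a majority vote over the last 5 confident predictions; the class name `TemporalSmoother`, the 0.6 confidence threshold, and the use of `None` to signal detection-only fallback are all assumptions, not part of the proposed node.

```python
from collections import Counter, deque


class TemporalSmoother:
    """Majority vote over the most recent confident predictions per track."""

    def __init__(self, window=5, min_confidence=0.6):
        self.window = window
        self.min_confidence = min_confidence  # hypothetical threshold
        self.history = {}  # track id -> deque of recent labels

    def update(self, track_id, label, confidence):
        """Add one prediction and return the smoothed label.

        Returns None when no confident prediction has been seen yet, so
        the caller can fall back to detection-only output.
        """
        buf = self.history.setdefault(track_id, deque(maxlen=self.window))
        if confidence >= self.min_confidence:
            buf.append(label)  # low-confidence frames are skipped
        if not buf:
            return None
        return Counter(buf).most_common(1)[0][0]


smoother = TemporalSmoother()
for label, conf in [("red_light", 0.9), ("red_light", 0.8), ("green_light", 0.95)]:
    print(smoother.update(track_id=7, label=label, confidence=conf))
```

A majority vote adds latency of a few frames when the light actually changes state; a shorter window or confidence-weighted voting trades stability for responsiveness.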
- Verify dataset downloaded: ls /data/datasets/lisa/
- Check for Annotations directory: ls /data/datasets/lisa/Annotations/
- See KAGGLE_SETUP.md for download instructions
# Set your Kaggle API token
export KAGGLE_API_TOKEN=<your-token-here>
# Or create ~/.kaggle/kaggle.json
mkdir -p ~/.kaggle
echo '{"username":"<user>","key":"<key>"}' > ~/.kaggle/kaggle.json
chmod 600 ~/.kaggle/kaggle.json

- LISA uses semicolon (;) as delimiter, not comma
- Verify annotation files exist: find /data/datasets/lisa/Annotations -name "*.csv"
- Check CSV format: head /data/datasets/lisa/Annotations/Annotations/dayTrain/dayClip1/frameAnnotationsBOX.csv
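If annotations fail to parse, reading one file with an explicit semicolon delimiter is a quick check. The sketch below uses Python's standard `csv` module; the column header names are assumptions about the LISA CSV layout and should be confirmed against the output of the `head` command above. The sample row is made up for illustration.

```python
import csv
import io

# Assumed header names; confirm them against the first line of your CSV.
BOX_COLUMNS = (
    "Upper left corner X",
    "Upper left corner Y",
    "Lower right corner X",
    "Lower right corner Y",
)


def read_lisa_annotations(fileobj):
    """Parse a semicolon-delimited annotation file into a list of dicts."""
    reader = csv.DictReader(fileobj, delimiter=";")
    return [
        {
            "filename": row["Filename"],
            "tag": row["Annotation tag"],
            "bbox": tuple(int(row[col]) for col in BOX_COLUMNS),
        }
        for row in reader
    ]


# Illustrative, made-up row (not copied from the dataset).
sample = io.StringIO(
    "Filename;Annotation tag;Upper left corner X;Upper left corner Y;"
    "Lower right corner X;Lower right corner Y\n"
    "example.jpg;go;100;50;112;74\n"
)
print(read_lisa_annotations(sample))
```

For a real file, pass `open(path, newline="")` instead of the in-memory sample. A `KeyError` here usually means the file was read with the wrong delimiter or the header names differ from the assumed ones.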
- Critical: Use class weights (--use-class-weights) due to severe imbalance
- Verify augmentation settings (brightness ±40%, saturation critical)
- Check confusion matrix for rare classes (yellow_left, green_straight)
- Consider oversampling rare classes or using focal loss
- Rare classes may need data augmentation or synthetic samples
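For context on the focal loss suggestion, the per-sample formula is FL(p) = -α (1 - p)^γ log(p), where p is the predicted probability of the true class. The sketch below evaluates it in plain Python to show how well-classified samples are down-weighted. It is not part of train.py; γ = 2 is a commonly used default, and α = 1 is chosen here only for simplicity.

```python
import math


def focal_loss(p_true, gamma=2.0, alpha=1.0):
    """Focal loss for one sample, given the probability of its true class.

    With gamma=0 and alpha=1 this reduces to ordinary cross-entropy.
    """
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)


# Compare cross-entropy (gamma=0) with focal loss (gamma=2).
for p in (0.1, 0.5, 0.9):
    ce = focal_loss(p, gamma=0.0)
    fl = focal_loss(p, gamma=2.0)
    print(f"p={p:.1f}  cross-entropy={ce:.4f}  focal={fl:.4f}")
```

Easy samples (p near 1) contribute almost nothing under focal loss, so gradients concentrate on hard samples, which in this dataset are often the rare arrow classes.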
If you use this code or the LISA Traffic Light Dataset, please cite:
@article{jensen2016evaluating,
title={Evaluating state-of-the-art object detector on challenging traffic light data},
author={Jensen, Morten Bornø and Philipsen, Mark Philip and Møgelmose, Andreas and Moeslund, Thomas B and Trivedi, Mohan M},
journal={IEEE Transactions on Intelligent Transportation Systems},
volume={18},
number={9},
pages={2300--2313},
year={2017},
publisher={IEEE}
}

LISA Dataset: Kaggle | Original Paper
MIT License - see LICENSE file
- LISA Traffic Light Dataset: https://www.kaggle.com/datasets/mbornoe/lisa-traffic-light-dataset
- LISA Paper (IEEE TITS 2017): https://ieeexplore.ieee.org/document/7795598
- EfficientNet Paper: https://arxiv.org/abs/1905.11946
- Kaggle API Documentation: https://www.kaggle.com/docs/api
For questions or issues, please open a GitHub issue.
Status: Active development for autonomous vehicle deployment.