This project addresses the challenge of identifying and tracking individual football players across multiple video camera angles (Assignment 1).
- Approach
- Pipeline
- Models Used
- Explanation Flow
- Setup Instructions (Local Machine)
- Setup Instructions (Kaggle)
- Output
## Approach

The approach is a hybrid method: object detection feeds a multi-object tracker, and each tracked player is described by several features used for re-identification.
- Player detection: YOLOv8 detects players in each frame.
- Player tracking: DeepSORT tracks players over time by combining motion prediction (Kalman filter) with appearance features (embeddings).
- Re-identification features:
  - Appearance embedding: deep feature vector from a pre-trained ResNet-50 model.
  - Jersey number: extracted using EasyOCR.
  - Dominant jersey color: HSV histogram that captures the jersey color.
  - Grid position: coarse spatial location on the field.
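The two hand-crafted features can be sketched with the standard library alone. This is a minimal illustration, not the project's code: it reduces the HSV histogram to a hue-only histogram computed with `colorsys`, and the names `hue_histogram` and `grid_cell`, the bin count, and the 4x3 grid size are all assumptions made for the example.

```python
import colorsys


def hue_histogram(pixels_rgb, bins=8):
    """Normalized hue histogram of (r, g, b) pixels with 0-255 channels.

    Simplification: only the hue channel is binned (bin count is hypothetical).
    """
    hist = [0] * bins
    for r, g, b in pixels_rgb:
        h, _s, _v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        # h lies in [0, 1); clamp so an edge value cannot overflow the last bin.
        hist[min(int(h * bins), bins - 1)] += 1
    total = sum(hist) or 1
    return [count / total for count in hist]


def grid_cell(bbox, frame_w, frame_h, cols=4, rows=3):
    """Map a box (x1, y1, x2, y2) to the (col, row) cell containing its center.

    The 4x3 grid is a hypothetical choice for illustration.
    """
    x1, y1, x2, y2 = bbox
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    col = min(int(cx / frame_w * cols), cols - 1)
    row = min(int(cy / frame_h * rows), rows - 1)
    return col, row
```

A coarse grid keeps the spatial cue tolerant of small localization errors, which fits its low weight in the matching score.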
After players are tracked independently in each video, a matching algorithm compares tracks across videos using a weighted score based on:
- Exact jersey number match (highest weight)
- Appearance embedding similarity (cosine similarity)
- Dominant color similarity (medium weight)
- Spatial proximity on the grid (low weight)

Temporal overlap acts as a filter rather than a score term: candidate players must appear in both videos at roughly the same time.
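A weighted score of this kind might look like the sketch below. The numeric weights are hypothetical; the README only ranks them (jersey highest, color medium, grid low) and gives no weight for the embedding term. Histogram intersection is used here as one possible color similarity, and the dictionary keys (`jersey`, `embedding`, `color`, `grid`) are assumed names.

```python
import math

# Hypothetical weights chosen only to respect the stated ordering.
WEIGHTS = {"jersey": 0.4, "embedding": 0.3, "color": 0.2, "grid": 0.1}


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def histogram_intersection(h1, h2):
    """Overlap of two normalized histograms, in [0, 1]."""
    return sum(min(x, y) for x, y in zip(h1, h2))


def match_score(p, q, weights=WEIGHTS):
    """Weighted similarity between two aggregated player tracks."""
    # A jersey only counts when it was actually read and agrees exactly.
    jersey = 1.0 if p["jersey"] is not None and p["jersey"] == q["jersey"] else 0.0
    # Negative cosine values are clipped so the term stays in [0, 1].
    embedding = max(0.0, cosine_similarity(p["embedding"], q["embedding"]))
    color = histogram_intersection(p["color"], q["color"])
    grid = 1.0 if p["grid"] == q["grid"] else 0.0
    return (
        weights["jersey"] * jersey
        + weights["embedding"] * embedding
        + weights["color"] * color
        + weights["grid"] * grid
    )
```

Because every term lies in [0, 1] and the weights sum to 1, the score is also in [0, 1], which makes a single acceptance threshold easy to reason about.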
## Pipeline

Initialization:
- Determine the computing device (GPU or CPU).
- Load the YOLOv8 model (`best.pt`).
- Load the modified ResNet-50 model.
- Initialize the EasyOCR reader.
- Initialize the DeepSORT trackers.
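The device check in the first step is commonly written with `torch.cuda.is_available()`. The sketch below assumes that pattern; the helper name `pick_device` is hypothetical, and the import guard exists only so the example still runs on a machine without PyTorch.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Fallback for this sketch only; the real pipeline requires PyTorch.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Using device: {device}")
```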
Per-video processing (`process_video`):
- Read frames from a video.
- For each frame:
  - Detect players using YOLO.
  - For each detection:
    - Extract the appearance embedding using ResNet-50.
    - Extract the jersey number using EasyOCR.
    - Calculate the dominant jersey color histogram.
    - Determine the grid position.
    - Store metadata.
  - Update the DeepSORT tracker.
  - Draw bounding boxes and IDs on the output frame.
- After the video ends:
  - Aggregate features per track (average embedding, most frequent jersey number, color, etc.).
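The aggregation step can be illustrated as follows. This is a sketch under assumptions: each per-frame observation is taken to be a dict with `frame`, `embedding`, and `jersey` keys (hypothetical names), and the first and last frame are kept so that temporal overlap can be checked later.

```python
from collections import Counter


def aggregate_track(observations):
    """Collapse one track's per-frame observations into a single summary."""
    embeddings = [o["embedding"] for o in observations]
    dim = len(embeddings[0])
    # Element-wise mean of the appearance embeddings.
    mean_embedding = [
        sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)
    ]
    # Majority vote over frames where OCR actually returned a number.
    jerseys = Counter(o["jersey"] for o in observations if o["jersey"] is not None)
    jersey = jerseys.most_common(1)[0][0] if jerseys else None
    frames = [o["frame"] for o in observations]
    return {
        "embedding": mean_embedding,
        "jersey": jersey,
        "first_frame": min(frames),
        "last_frame": max(frames),
    }
```

Voting across frames smooths over occasional OCR misreads, since a single wrong frame is outvoted by the rest of the track.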
Cross-video matching:
- Call `process_video` for both `broadcast.mp4` and `tacticam.mp4`.
- For each track in broadcast:
  - Compare with unmatched tracks in tacticam.
  - Filter candidates based on temporal overlap.
  - Compute a weighted score for each pair using:
    - Jersey match
    - Cosine similarity
    - Grid position match
    - Dominant color similarity
  - Match the best-scoring pair (if above threshold).
- Save matched pairs to `player_mapping.xlsx`.
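The matching loop above amounts to a greedy one-to-one assignment. The sketch below is one way to write it, assuming both videos share a frame timeline and that each track summary carries `first_frame` and `last_frame`; the function names, the `min_overlap` parameter, and the 0.5 threshold are hypothetical.

```python
def frames_overlap(a, b, min_overlap=1):
    """True when two tracks share at least `min_overlap` frames."""
    start = max(a["first_frame"], b["first_frame"])
    end = min(a["last_frame"], b["last_frame"])
    return end - start + 1 >= min_overlap


def match_tracks(broadcast, tacticam, score_fn, threshold=0.5):
    """Greedily pair broadcast tracks with still-unmatched tacticam tracks.

    Both inputs map track_id -> track summary; score_fn(a, b) returns a float.
    """
    unmatched = set(tacticam)
    pairs = []
    for b_id, b_track in broadcast.items():
        # Temporal filter first; sorting keeps tie-breaking deterministic.
        candidates = [
            t_id for t_id in sorted(unmatched)
            if frames_overlap(b_track, tacticam[t_id])
        ]
        if not candidates:
            continue
        best = max(candidates, key=lambda t_id: score_fn(b_track, tacticam[t_id]))
        if score_fn(b_track, tacticam[best]) >= threshold:
            pairs.append((b_id, best))
            unmatched.remove(best)
    return pairs
```

Greedy matching depends on iteration order, so an early broadcast track can claim a tacticam track that a later one would have matched better; a global assignment such as the Hungarian algorithm avoids that at some extra cost.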
## Models Used

- YOLOv8
  - Used for player detection.
  - Custom-trained weights: `best.pt`
- ResNet-50
  - Used for extracting appearance embeddings.
  - Final layer replaced with `torch.nn.Identity()`.
- EasyOCR
  - Used for recognizing jersey numbers.
  - Language: English (`en`)
- DeepSORT
  - Used for tracking and maintaining consistent IDs.
  - Combines Kalman filtering and appearance-based matching.
## Explanation Flow

- Setup: import libraries, detect device (CUDA/CPU), load models.
- Detection + Tracking:
  - For each video, process every frame to detect and track players.
  - Extract features and maintain track-specific metadata.
- Post Processing:
  - Aggregate metadata across time.
  - Match players across videos using similarity metrics.
- Output:
  - Annotated videos and a player mapping Excel sheet.
## Setup Instructions (Local Machine)

Model weights: https://drive.google.com/file/d/1-5fOSHOSB9UXyP_enOoZNAMScrePVcMD/view

Clone the repository and enter it:

    git clone https://github.com/pranavrw/Mapper
    cd Mapper

Create and activate a virtual environment:

    python -m venv venv
    # Activate:
    # Windows:
    venv\Scripts\activate
    # macOS/Linux:
    source venv/bin/activate

Install the dependencies:

    pip install opencv-python-headless torch torchvision ultralytics easyocr deep-sort-realtime numpy pandas scipy

Place `broadcast.mp4`, `tacticam.mp4`, and `best.pt` in the working directory, then run:

    python main.py

## Setup Instructions (Kaggle)

- Go to Kaggle → Notebooks → New Notebook.
- In Settings, enable GPU (P100 or T4).
- Ensure Internet is ON.
- Create and upload a dataset with `best.pt`, `broadcast.mp4`, and `tacticam.mp4`.
- Attach it to the notebook via + Add Data.

Install the dependencies in a notebook cell:

    !pip install opencv-python-headless torch torchvision ultralytics easyocr deep-sort-realtime pandas scipy

Point the paths at the attached dataset:

    video1_path = "/kaggle/input/your-dataset-name/broadcast.mp4"
    video2_path = "/kaggle/input/your-dataset-name/tacticam.mp4"
    model_path = "/kaggle/input/your-dataset-name/best.pt"

Output files will be saved in `/kaggle/working/`:
- `output_broadcast.mp4`
- `output_tacticam.mp4`
- `player_mapping.xlsx`
## Output

After execution, the following files are generated:
- `output_broadcast.mp4`: annotated video with tracking IDs from `broadcast.mp4`
- `output_tacticam.mp4`: annotated video with tracking IDs from `tacticam.mp4`
- `player_mapping.xlsx`: Excel file mapping players across both videos with:
  - `global_id`: unified player ID
  - `video1_id`: DeepSORT ID from broadcast view
  - `video2_id`: DeepSORT ID from tacticam view

Review the videos visually and verify the mapping using the Excel file.
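As a complement to the visual review, the mapping rows can be built and sanity-checked in a few lines. This is a sketch, not the project's code: the sequential `global_id` numbering and both helper names are assumptions, and only the three column names come from the description above.

```python
def build_mapping_rows(pairs):
    """Turn (video1_id, video2_id) pairs into rows with a sequential global_id."""
    return [
        {"global_id": i, "video1_id": v1, "video2_id": v2}
        for i, (v1, v2) in enumerate(pairs, start=1)
    ]


def is_one_to_one(rows):
    """True when no track ID from either video appears in more than one row."""
    v1_ids = [row["video1_id"] for row in rows]
    v2_ids = [row["video2_id"] for row in rows]
    return len(set(v1_ids)) == len(v1_ids) and len(set(v2_ids)) == len(v2_ids)


rows = build_mapping_rows([(3, 7), (5, 2)])
# With pandas and openpyxl installed, the rows could be written with:
#   pandas.DataFrame(rows).to_excel("player_mapping.xlsx", index=False)
```

A failed one-to-one check means two tracks in one video were mapped to the same player in the other, which is worth inspecting in the annotated videos.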