aroraa7/cod-player-analysis

COD Player Performance and Engagement Analytics

This project analyzes Call of Duty player behavior to understand which factors are most associated with player progression and in-game performance. The analysis separates raw engagement volume, such as time played and total kills, from efficiency metrics, such as kills per hour, kills per game, and score per minute.

Data source: https://www.kaggle.com/datasets/aishahakami/call-of-duty-players

Research Question

What behavioral factors are most strongly associated with player progression and performance?

The project compares three related outcomes:

  • Raw progression: how far a player has advanced, measured with level.
  • Progression efficiency: how quickly a player earns progress, measured with xp_per_hour.
  • Performance: how effectively a player performs, measured with scorePerMinute.

Key Findings

  • Raw level progression is highly predictable from engagement volume. Players with more time played, kills, headshots, and wins tend to have higher levels.
  • Cumulative stats can be misleading on their own because they often reflect playtime more than skill.
  • Efficiency metrics help separate "played more" from "played better."
  • For progression efficiency, the strongest nonlinear signals were kdRatio, kills_per_game, and kills_per_hour.
  • The project uses both Ridge regression and Random Forest models to balance interpretability with nonlinear feature importance.

Model Results

| Analysis | Target | Model | R² | MAE |
| --- | --- | --- | --- | --- |
| Raw Progression Level | level | RidgeCV | 0.950 | 8.24 |
| Raw Progression Level | level | Random Forest | 0.964 | 5.39 |
| Progression Efficiency | xp_per_hour | RidgeCV | 0.786 | 305.89 |
| Progression Efficiency | xp_per_hour | Random Forest | 0.829 | 269.65 |
| Performance | scorePerMinute | RidgeCV | 0.793 | 35.51 |
| Performance | scorePerMinute | Random Forest | 0.845 | 28.59 |
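For readers unfamiliar with the two metrics, here is a minimal sketch of the standard definitions of R² and MAE in plain Python. It is not taken from the project code, and the sample values are made up for illustration. Note that MAE is expressed in the units of the target, so an MAE for level cannot be compared directly with an MAE for xp_per_hour or scorePerMinute.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute gap between predictions and observed values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """1 - SS_res / SS_tot: share of target variance explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up observed and predicted levels for four players (illustration only).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [13.0, 18.0, 33.0, 39.0]

mae = mean_absolute_error(observed, predicted)  # 2.25 levels
r2 = r2_score(observed, predicted)              # 1 - 23/500 = 0.954
```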

How To Run

From the project root:

pip install -r requirements.txt
python src/run_analysis.py

The pipeline loads the raw dataset, engineers behavioral features, trains the models, and writes updated tables and figures to outputs/.

Outputs

Generated summary:

  • outputs/analysis_summary.md

Generated tables:

  • outputs/tables/model_metrics.csv
  • outputs/tables/behavioral_factor_rankings.csv
  • outputs/tables/data_quality_summary.csv
  • outputs/tables/numeric_feature_summary.csv
  • outputs/tables/target_correlations.csv
  • model-specific Ridge coefficient tables
  • model-specific Random Forest permutation-importance tables

Generated figures:

  • outputs/figures/behavioral_correlation_heatmap.png
  • outputs/figures/level_vs_time_played.png
  • model-specific coefficient plots
  • model-specific permutation-importance plots

Methodology

Feature Engineering

The analysis creates rate-based features to separate raw activity from efficiency:

  • win_rate = wins / (wins + losses)
  • accuracy = hits / shots
  • headshot_rate = headshots / kills
  • kills_per_game = kills / gamesPlayed
  • assists_per_game = assists / gamesPlayed
  • kills_per_hour = kills / timePlayed
  • xp_per_hour = xp / timePlayed
  • level_per_hour = level / timePlayed

Division-by-zero cases are handled safely, and extreme rate outliers are capped at the 99th percentile to keep the models interpretable.
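The two safeguards described above could look roughly like the sketch below. It assumes pandas and NumPy, and the helper names safe_ratio and cap_at_percentile are hypothetical, not the functions in features.py. Whether the project turns zero denominators into NaN, zero, or something else is not stated, so NaN is an assumption here.

```python
import numpy as np
import pandas as pd


def safe_ratio(numerator: pd.Series, denominator: pd.Series) -> pd.Series:
    """Divide two columns, yielding NaN wherever the denominator is zero."""
    # where() keeps nonzero denominators and replaces zeros with NaN,
    # so the division never produces inf.
    return numerator / denominator.where(denominator != 0)


def cap_at_percentile(values: pd.Series, q: float = 0.99) -> pd.Series:
    """Clip values above the q-th quantile; NaNs are ignored and preserved."""
    return values.clip(upper=values.quantile(q))


# Tiny made-up frame using column names from the formulas above.
players = pd.DataFrame(
    {
        "kills": [100, 0, 50],
        "gamesPlayed": [10, 0, 5],
    }
)

# The second player has no games, so their rate is NaN rather than inf.
players["kills_per_game"] = cap_at_percentile(
    safe_ratio(players["kills"], players["gamesPlayed"])
)

# Capping pulls a single extreme rate down toward the rest of the distribution.
rates = pd.Series([1.0] * 99 + [1000.0])
capped = cap_at_percentile(rates)
```

Capping after computing the ratio, rather than before, targets the players whose tiny denominators inflate a rate, which is where the extreme values tend to come from.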

Modeling

The analysis trains three model groups:

| Analysis | Target | Purpose |
| --- | --- | --- |
| Raw Progression Level | level | Identifies factors associated with total progression. |
| Progression Efficiency | xp_per_hour | Identifies behaviors associated with earning progress faster. |
| Performance | scorePerMinute | Identifies behaviors associated with stronger in-game performance. |

Each analysis uses:

  • RidgeCV: standardized linear model with cross-validated regularization.
  • Random Forest: nonlinear model with permutation importance.
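As an illustration of this pairing, the sketch below fits both model types with scikit-learn on synthetic data. The data, the hyperparameters (alpha grid, tree count, split size), and the variable names are assumptions for the example and are not taken from models.py.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.linear_model import RidgeCV
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: three features, where the target depends on the
# first two and the third is pure noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
y = 2.0 * X[:, 0] + X[:, 1] ** 2 + rng.normal(scale=0.1, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Standardize features, then let RidgeCV pick the penalty by cross-validation.
ridge = make_pipeline(StandardScaler(), RidgeCV(alphas=np.logspace(-3, 3, 13)))
ridge.fit(X_train, y_train)

# The forest can pick up the nonlinear (squared) effect the linear model misses.
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

# Permutation importance: drop in held-out score when one feature is shuffled.
importance = permutation_importance(
    forest, X_test, y_test, n_repeats=5, random_state=0
)

# Collect (R², MAE) on the held-out split for each model.
metrics = {}
for name, model in [("RidgeCV", ridge), ("Random Forest", forest)]:
    predictions = model.predict(X_test)
    metrics[name] = (
        r2_score(y_test, predictions),
        mean_absolute_error(y_test, predictions),
    )
```

Standardizing inside the pipeline keeps the Ridge coefficients comparable across features with different scales, while permutation importance is computed on held-out data so it reflects predictive value rather than how often a feature was used in splits.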

xp is intentionally excluded from the raw level model because XP is directly tied to level progression, so including it would leak the target into the features.
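One way to make such an exclusion explicit is a per-target block list, sketched below. The LEAKAGE_COLUMNS mapping and the select_features helper are hypothetical; only the level/xp pairing comes from the text above.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical mapping from each target to columns that would leak it.
# Only the level -> xp entry is stated in the README; the structure is assumed.
LEAKAGE_COLUMNS: Dict[str, Set[str]] = {"level": {"xp"}}


def select_features(columns: Iterable[str], target: str) -> List[str]:
    """Return candidate feature columns, dropping the target and its known leaks."""
    blocked = {target} | LEAKAGE_COLUMNS.get(target, set())
    return [name for name in columns if name not in blocked]


available = ["level", "xp", "kills", "timePlayed", "wins"]
level_features = select_features(available, "level")
# level_features == ["kills", "timePlayed", "wins"]
```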

Project Structure

COD-Analysis/
├── data/
│   └── raw/
│       └── cod.csv
├── src/
│   ├── cod_analysis/
│   │   ├── config.py
│   │   ├── data.py
│   │   ├── features.py
│   │   ├── models.py
│   │   ├── pipeline.py
│   │   ├── plots.py
│   │   └── reports.py
│   └── run_analysis.py
├── outputs/
│   ├── analysis_summary.md
│   ├── figures/
│   └── tables/
├── README.md
└── requirements.txt

data/raw/ stores the original dataset. outputs/ contains generated artifacts and can be recreated by rerunning the pipeline.

Product Takeaways

This framing supports two types of engagement strategy:

  • Retention strategy: encourage consistent play through session goals, streaks, and time-limited progression events.
  • Skill strategy: reward efficient play through headshot challenges, accuracy goals, assist bonuses, win streaks, and score-per-minute objectives.

The main distinction is that encouraging more play is different from encouraging better, more satisfying play.
