Dota 2 Match Outcome Predictor

An LSTM-based neural network that predicts Dota 2 match outcomes in real time by analysing the minute-by-minute evolution of game state. Trained on 50,000+ matches from the OpenDota API.

Based on the research paper "Machine learning models for DOTA 2 outcomes prediction" (Akhmedov & Phan, arXiv:2106.01782).

Results

Metric	Value
End-of-game accuracy	94.11%
Brier score	0.052
Expected calibration error	0.076

Accuracy by game time:

Minute	Accuracy
5	49.3% (near random)
10	52.1%
15	62.2%
20	71.1%
30	73.7%
End	94.11%
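
For readers unfamiliar with the two calibration metrics in the table, the following is a minimal sketch of how Brier score and expected calibration error are commonly computed. It is not taken from this repository: the function names, the 10 equal-width bins, and the sample values are assumptions for illustration, and the project's own evaluation code may bin differently.

```python
def brier_score(probs, labels):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def expected_calibration_error(probs, labels, n_bins=10):
    """Weighted average gap between mean prediction and observed win rate per bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        # Clamp the index so that p == 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        mean_p = sum(p for p, _ in bucket) / len(bucket)
        win_rate = sum(y for _, y in bucket) / len(bucket)
        ece += (len(bucket) / total) * abs(mean_p - win_rate)
    return ece


# Made-up predictions and outcomes (1 = the predicted side won).
probs = [0.9, 0.8, 0.3, 0.6]
labels = [1, 1, 0, 0]
print(brier_score(probs, labels), expected_calibration_error(probs, labels))
```

Lower is better for both metrics: a Brier score of 0 means every prediction was exactly 0 or 1 and correct, while always predicting 0.5 scores 0.25.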

Project Structure

dota-predictor/
├── src/dota_predictor/
│   ├── api/           # Async OpenDota API client
│   ├── data/          # Match / MatchPlayer dataclasses, DotaDataset, SQLite DB
│   ├── models/        # LSTMPredictor, baseline models
│   ├── features/      # Feature extraction (8 and 20 feature variants)
│   ├── evaluation/    # Metrics, isotonic calibration, experiment tracker
│   ├── inference/     # GSI server, live predictor
│   ├── polymarket/    # Polymarket API client + Dota 2 market finder
│   └── utils/         # Config, training loop utilities
├── scripts/
│   ├── fetch_data.py           # Collect matches from OpenDota → SQLite
│   ├── process_data.py         # Transform matches → .npy training arrays
│   ├── train.py                # Train the LSTM model
│   ├── evaluate.py             # Evaluate with metrics, calibration, plots
│   ├── calibrate_per_minute.py # Fit per-phase isotonic calibrators
│   ├── predict_match.py        # Predict a historical match by ID
│   ├── live_predict.py         # Real-time GSI prediction dashboard
│   ├── find_games.py           # Show live/upcoming games with Polymarket odds
│   ├── spectate.py             # Find a game on Polymarket + launch Dota 2
│   ├── inspect_data.py         # Inspect processed .npy files
│   └── gsi_diagnostic.py       # Log raw GSI values for validation
├── docs/
│   ├── features.md                  # Feature engineering reference
│   ├── data_pipeline.md             # Full data flow documentation
│   └── polymarket_integration.md    # Polymarket API reference
├── models/checkpoints/
│   ├── model.pt                # Pre-trained LSTM (94.11% accuracy)
│   └── calibrator*.json        # Isotonic calibration files
├── tests/
├── notebooks/
├── data/examples/              # Sample match data
├── gamestate_integration_predictor.cfg
├── .env.example
├── Makefile
├── pyproject.toml
└── requirements.txt

Installation

Prerequisites: Python 3.10+

git clone https://github.com/amarcu/dota-predictor.git
cd dota-predictor

python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

pip install -e ".[dev]"

A pre-trained model (models/checkpoints/model.pt) is included in the repo, so you can run predictions immediately without training from scratch.

Copy .env.example to .env and, optionally, add your OpenDota API key, which raises the rate limit from ~30/min to 3000/min:

cp .env.example .env
# Edit .env and set OPENDOTA_API_KEY
# Get a key at: https://www.opendota.com/api-keys
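
As a rough illustration of how the key is used, the sketch below builds an OpenDota match URL and appends the key as the api_key query parameter when one is available. The helper name match_url is hypothetical and not part of this repository, and the sketch reads the variable from the process environment only; loading the .env file itself is left to the project's own config code.

```python
import os
from urllib.parse import urlencode

OPENDOTA_BASE = "https://api.opendota.com/api"


def match_url(match_id, api_key=None):
    """Build the URL for one match; the key goes in the api_key query parameter."""
    # Fall back to the environment variable when no key is passed explicitly.
    key = api_key if api_key is not None else os.environ.get("OPENDOTA_API_KEY")
    url = f"{OPENDOTA_BASE}/matches/{match_id}"
    if key:
        url += "?" + urlencode({"api_key": key})
    return url


print(match_url(7892631234, api_key="example-key"))
```

Without a key the URL is left unchanged and requests fall under the lower anonymous rate limit.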

Quickstart

1. Fetch training data

# Fetch 1000 matches into the local SQLite database
python scripts/fetch_data.py --count 1000 --db

2. Process into training arrays

# Convert to .npy arrays (uses enhanced 20-feature set)
python scripts/process_data.py --db --enhanced-features
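
To give a sense of what "training arrays" means here, the sketch below pads variable-length per-minute feature matrices to a fixed length and records which timesteps are real. It is an assumption-laden illustration, not the contents of process_data.py: the function name pad_matches, the zero padding, and the truncation of games longer than 60 minutes are all hypothetical choices, and the random data stands in for real features.

```python
import numpy as np

MAX_MINUTES = 60   # matches the 60-timestep input described under Model Architecture
N_FEATURES = 20    # enhanced feature set


def pad_matches(matches, max_len=MAX_MINUTES, n_features=N_FEATURES):
    """Stack (minutes, features) arrays into one fixed-size array plus a validity mask."""
    x = np.zeros((len(matches), max_len, n_features), dtype=np.float32)
    mask = np.zeros((len(matches), max_len), dtype=bool)
    for i, m in enumerate(matches):
        m = np.asarray(m, dtype=np.float32)[:max_len]  # truncate long games
        x[i, : len(m)] = m
        mask[i, : len(m)] = True  # True marks real (non-padded) minutes
    return x, mask


rng = np.random.default_rng(0)
matches = [rng.normal(size=(35, N_FEATURES)), rng.normal(size=(72, N_FEATURES))]
x, mask = pad_matches(matches)
print(x.shape, mask.sum(axis=1))  # (2, 60, 20) [35 60]
```

Arrays like these can be written with np.save and later consumed by a training loop that uses the mask to ignore padded minutes.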

3. Train the model

python scripts/train.py
# Saves best checkpoint to models/checkpoints/model.pt

4. Evaluate

python scripts/evaluate.py --model models/checkpoints/model.pt --log

# Then fit calibrators for live prediction
python scripts/calibrate_per_minute.py

5. Predict a historical match

python scripts/predict_match.py --match-id 7892631234

Or use make to run the pipeline from processing onward (fetch data first, as in step 1):

make all       # process → train → evaluate
make live      # start the GSI prediction server

Live Prediction (GSI)

The live predictor connects to Dota 2 via Game State Integration (GSI) and shows a real-time win-probability dashboard in your terminal.

Setup

Copy the GSI config to your Dota 2 config directory:

macOS:

cp gamestate_integration_predictor.cfg \
  "$HOME/Library/Application Support/Steam/steamapps/common/dota 2 beta/game/dota/cfg/"

Windows:

Copy gamestate_integration_predictor.cfg to:
C:\Program Files (x86)\Steam\steamapps\common\dota 2 beta\game\dota\cfg\

Add -gamestateintegration to Dota 2 launch options in Steam.

Start the prediction server:

python scripts/live_predict.py --port 3000
# or: make live

Open Dota 2 and start or spectate a match. The dashboard updates every minute with the current win probability.
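
For orientation, GSI works by having the game client POST JSON snapshots to a local HTTP endpoint. The sketch below is a minimal stand-alone receiver, not the project's live_predict.py. The payload keys it reads (map, clock_time, radiant_score, dire_score) are assumptions about what the client sends; the actual content depends on the data sections enabled in the .cfg file.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_gsi(payload):
    """Pull a few fields from one GSI update; missing keys become None."""
    game_map = payload.get("map", {})
    clock = game_map.get("clock_time")  # seconds; can be negative before the horn
    return {
        "minute": clock // 60 if clock is not None else None,
        "radiant_score": game_map.get("radiant_score"),
        "dire_score": game_map.get("dire_score"),
    }


class GSIHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        print(summarize_gsi(payload))
        self.send_response(200)  # reply 200 to acknowledge the update
        self.end_headers()

    def log_message(self, *args):
        pass  # silence the default per-request access log


def serve(port=3000):
    """Block forever, printing a summary of each update received on localhost."""
    HTTPServer(("127.0.0.1", port), GSIHandler).serve_forever()
```

Calling serve() listens on the same port used in the command above, so it should not be run at the same time as the prediction server.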

Spectating with Polymarket odds

To find a live pro match on Polymarket and launch Dota 2 alongside the predictor:

# List live and upcoming games with current market odds
python scripts/find_games.py

# Auto-find a live game, open Dota 2, and start the predictor
python scripts/spectate.py

# Spectate a specific game
python scripts/spectate.py --slug dota2-l1ga-vpp-2025-12-22

# Makefile shortcuts
make find-games
make spectate
make spectate-slug SLUG=dota2-l1ga-vpp-2025-12-22

To check that live GSI features match the training features, run the diagnostic script:

python scripts/gsi_diagnostic.py --port 3000

Model Architecture

Input: time-series (60 timesteps × 20 features)
       hero IDs (10 heroes)

Hero branch:
  Embedding (vocab = max_hero_id + 1, 32 dim) × 10 heroes
  → team averages + concatenate → hero_features (320,)

Sequence branch:
  LSTM (20 → 128 hidden, 2 layers, dropout=0.3)
  → output at each timestep (60, 128)

Combined (per timestep):
  concat(lstm_out, hero_features) → (448,)
  Linear 448→64 + ReLU + dropout
  Linear 64→1 + Sigmoid
  → win probability at each minute (60,)
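
The shapes above can be reproduced with a short PyTorch module. This is a sketch under stated assumptions, not the repository's LSTMPredictor: the class name and the default n_heroes=150 are hypothetical, and the hero branch simply concatenates the ten 32-dimensional embeddings to reach 320 features, without reproducing the team-averaging step mentioned in the diagram.

```python
import torch
import torch.nn as nn


class LSTMPredictorSketch(nn.Module):
    def __init__(self, n_features=20, n_heroes=150, embed_dim=32, hidden=128,
                 layers=2, dropout=0.3):
        super().__init__()
        # n_heroes stands in for max_hero_id (hypothetical value).
        self.hero_embed = nn.Embedding(n_heroes + 1, embed_dim)
        self.lstm = nn.LSTM(n_features, hidden, num_layers=layers,
                            batch_first=True, dropout=dropout)
        self.head = nn.Sequential(
            nn.Linear(hidden + 10 * embed_dim, 64),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(64, 1),
            nn.Sigmoid(),
        )

    def forward(self, seq, hero_ids):
        # seq: (batch, T, n_features); hero_ids: (batch, 10) integer IDs
        hero = self.hero_embed(hero_ids).flatten(1)           # (batch, 320)
        lstm_out, _ = self.lstm(seq)                          # (batch, T, 128)
        hero = hero.unsqueeze(1).expand(-1, seq.size(1), -1)  # (batch, T, 320)
        combined = torch.cat([lstm_out, hero], dim=-1)        # (batch, T, 448)
        return self.head(combined).squeeze(-1)                # (batch, T)


model = LSTMPredictorSketch().eval()
with torch.no_grad():
    probs = model(torch.randn(4, 60, 20), torch.randint(0, 151, (4, 10)))
print(probs.shape)  # torch.Size([4, 60])
```

Because the head is applied at every timestep, a single forward pass yields one win probability per minute, which is what allows predictions at any point in a match.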

Training uses binary cross-entropy masked to valid (non-padded) timesteps, with early stopping on validation loss.
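
The masking idea can be shown in a few lines of NumPy. This is an illustrative sketch rather than the project's training code: the function name masked_bce is hypothetical, and it assumes every valid minute of a match is labelled with that match's final outcome.

```python
import numpy as np


def masked_bce(probs, labels, mask, eps=1e-7):
    """Mean binary cross-entropy over valid (non-padded) timesteps only."""
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1 - eps)  # (batch, T)
    mask = np.asarray(mask, dtype=float)                           # (batch, T)
    # Repeat each match's final outcome across all of its timesteps.
    labels = np.broadcast_to(np.asarray(labels, dtype=float)[:, None], probs.shape)
    loss = -(labels * np.log(probs) + (1 - labels) * np.log(1 - probs))
    return float((loss * mask).sum() / mask.sum())


probs = [[0.5, 0.8, 0.1]]   # predictions for one match, last minute is padding
labels = [1]                # the predicted side won
mask = [[1, 1, 0]]
print(masked_bce(probs, labels, mask))
```

Dividing by the number of valid timesteps rather than the full array size keeps short games from being diluted by their padding.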

Calibration: Raw model outputs are passed through isotonic regression calibrators fitted per game phase (early/mid/late) to produce well-calibrated probabilities. See docs/data_pipeline.md for details.
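
To make the calibration step concrete, the sketch below fits a monotone step function with the pool-adjacent-violators algorithm, once per game phase. It is a simplified stand-in, not the repository's calibrator: the function names and the phase boundaries (15 and 30 minutes) are hypothetical, and library implementations of isotonic regression typically interpolate between points rather than returning a step function.

```python
from bisect import bisect_right


def fit_isotonic(scores, outcomes):
    """Pool-adjacent-violators fit; returns (thresholds, values) of a step function."""
    agg = {}
    for s, y in zip(scores, outcomes):  # pool tied scores first
        total, count = agg.get(s, (0.0, 0))
        agg[s] = (total + y, count + 1)
    blocks = []  # each block: [sum_of_outcomes, count, lowest_score_in_block]
    for s in sorted(agg):
        blocks.append([agg[s][0], agg[s][1], s])
        # Merge backwards while the previous block's mean exceeds the newest one's.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count, _ = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    return [b[2] for b in blocks], [b[0] / b[1] for b in blocks]


def calibrate(score, thresholds, values):
    """Return the value of the last block whose lowest score is <= score."""
    return values[max(bisect_right(thresholds, score) - 1, 0)]


def phase_of(minute):
    # Hypothetical phase boundaries, chosen only for illustration.
    return "early" if minute < 15 else "mid" if minute < 30 else "late"


def fit_per_phase(minutes, scores, outcomes):
    """Fit one calibrator per game phase from (minute, raw score, outcome) triples."""
    grouped = {}
    for m, s, y in zip(minutes, scores, outcomes):
        xs, ys = grouped.setdefault(phase_of(m), ([], []))
        xs.append(s)
        ys.append(y)
    return {phase: fit_isotonic(xs, ys) for phase, (xs, ys) in grouped.items()}


thresholds, values = fit_isotonic([0.2, 0.4, 0.6, 0.8], [0, 1, 0, 1])
print(thresholds, values, calibrate(0.5, thresholds, values))
```

Fitting separate calibrators per phase lets the correction differ between early minutes, where raw outputs carry little signal, and late minutes, where they are far more decisive.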

Documentation

Feature Engineering (docs/features.md): full description of the 8- and 20-feature sets, hero embeddings, and GSI alignment
Data Pipeline (docs/data_pipeline.md): step-by-step walkthrough from API to trained model, including design decisions and calibration

Development

make test       # pytest
make lint       # ruff check
make format     # ruff format
make typecheck  # mypy

License

MIT — see LICENSE.

Acknowledgments

OpenDota for the free, comprehensive API
Akhmedov & Phan for the reference paper (arXiv:2106.01782)
Polymarket for the prediction market data
