A machine learning-powered tennis match prediction system that analyzes player statistics, historical performance, and match conditions to predict match outcomes with confidence scores.
- Advanced ML Models: Three trained models (CatBoost, XGBoost, Random Forest) reaching roughly 66-67% accuracy
- Player Database: Comprehensive player statistics and historical data
- Smart Search: Find players by name or country code
- Prediction Interface: Get match predictions with confidence scores and winning odds
- Modern Web Interface: Clean, responsive design built with SvelteKit
- REST API: FastAPI backend with comprehensive endpoints
The system uses three different machine learning models, with CatBoost as the default:
| Model | Accuracy | ROC AUC | Log Loss |
|---|---|---|---|
| CatBoost (Default) | 67.21% | 0.73 | 0.60 |
| XGBoost | 66.70% | 0.73 | 0.61 |
| Random Forest | 65.70% | 0.72 | 0.62 |
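For readers unfamiliar with the three metrics in the table, here is a minimal, dependency-free sketch of how each one is defined for a binary match outcome. It is illustrative only: the function names and the toy data are assumptions, not the project's evaluation code, which may well compute these with scikit-learn instead.

```python
import math

def accuracy(y_true, p_pred, threshold=0.5):
    """Fraction of matches where the thresholded probability picks the actual winner."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(y_true, p_pred))
    return hits / len(y_true)

def log_loss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood of the true outcomes; lower is better."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def roc_auc(y_true, p_pred):
    """Chance that a random positive is scored above a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Toy data (hypothetical): y is 1 when player 1 won, p is the predicted probability of that.
y = [1, 0, 1, 1, 0]
p = [0.8, 0.3, 0.6, 0.4, 0.55]
print(accuracy(y, p), roc_auc(y, p), log_loss(y, p))
```

As a reference point, a model that always predicts 0.5 has a log loss of ln 2 (about 0.693) and a ROC AUC of 0.5, so the values in the table sit above that baseline.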
The models are trained on 137 carefully engineered features including:
- Player Characteristics: Entry type, playing hand, physical attributes
- Match Context: Tournament level, surface type, date, draw size
- Performance Metrics: Aces, double faults, service statistics
- Historical Data: Head-to-head records, recent form (last 5, 10, 20, 50 matches)
- Ranking & Elo: Current rankings, Elo ratings, surface-specific performance
- Comparative Analysis: Differential features between players
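The feature groups above can be sketched roughly as follows. This is a hedged illustration rather than the project's feature pipeline: the standard Elo formula with a K-factor of 32, the feature names, and the sample players are all assumptions made for the example.

```python
def elo_expected(rating_a, rating_b):
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def elo_update(winner, loser, k=32.0):
    """Return updated (winner, loser) ratings after one match; k=32 is an assumed K-factor."""
    delta = k * (1.0 - elo_expected(winner, loser))
    return winner + delta, loser - delta

def recent_form(results, windows=(5, 10, 20, 50)):
    """Win rate over the last n matches per window; results are 1 (win) or 0 (loss), oldest first."""
    form = {}
    for n in windows:
        last = results[-n:]  # fewer than n matches: use whatever history exists
        form[f"form_last_{n}"] = sum(last) / len(last) if last else 0.0
    return form

def differential(p1, p2):
    """Player 1 minus player 2 for every numeric feature both players share."""
    return {f"{key}_diff": p1[key] - p2[key] for key in p1 if key in p2}

# Hypothetical players; keys and values are made up for illustration.
p1 = {"elo": 2100.0, "rank": 3, **recent_form([1, 1, 0, 1, 1, 1])}
p2 = {"elo": 1900.0, "rank": 25, **recent_form([0, 1, 0, 1, 0, 0])}
features = differential(p1, p2)
print(features)
```

Differential features like these let a model see the gap between the two players directly, instead of having to learn it from two separate columns.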
Dataset Coverage: The training data covers tennis matches through the end of the complete 2024 season.
Backend:
- FastAPI with Pydantic for data validation
- Python 3.11 with ML libraries (CatBoost, XGBoost, scikit-learn)
- UV for dependency management
Frontend:
- SvelteKit for modern, reactive UI
- Responsive design for all devices
Deployment:
- Docker & Docker Compose for containerization
- Hot-reload development environment
- POST /prediction - Get match prediction with confidence scores
- GET /players - Retrieve all players data
- GET /players/{player_id} - Get specific player information
- GET /players-lookup - Search players by name or country code
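As a rough sketch of how a client might call these endpoints, the snippet below builds requests with Python's standard library. The JSON field names `player1_id` and `player2_id`, the integer IDs, and the helper names are hypothetical; the real request schema is whatever the interactive docs at http://localhost:8000/docs report.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # backend address used in the setup steps

def build_prediction_request(player1_id, player2_id, base_url=BASE_URL):
    """Build (but do not send) a POST /prediction request.

    The JSON field names are hypothetical; check /docs for the real schema.
    """
    payload = json.dumps({"player1_id": player1_id, "player2_id": player2_id})
    return urllib.request.Request(
        url=f"{base_url}/prediction",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def build_player_request(player_id, base_url=BASE_URL):
    """Build a GET /players/{player_id} request."""
    return urllib.request.Request(url=f"{base_url}/players/{player_id}", method="GET")

def send(request, timeout=10):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)

# With the backend running, a call could look like:
# print(send(build_prediction_request(1, 2)))
```

Keeping request construction separate from sending makes it easy to inspect the URL and body before any network traffic happens.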
This project uses Git LFS (Large File Storage) for dataset files. You'll need to install and configure Git LFS:
# Install Git LFS (if not already installed)
# On macOS with Homebrew:
brew install git-lfs
# On Ubuntu/Debian:
sudo apt-get install git-lfs
# On Windows, download from: https://git-lfs.github.io/
# Initialize Git LFS
git lfs install
Clone the repository and pull LFS files
git clone https://github.com/hikmatazimzade/tennis-ai.git
cd tennis-ai
git lfs pull
Run with Docker Compose
docker-compose up --build
Access the application
- Frontend: http://localhost:5173
- Backend API: http://localhost:8000
- API Documentation: http://localhost:8000/docs
Clone the repository and pull LFS files
git clone https://github.com/hikmatazimzade/tennis-ai.git
cd tennis-ai
git lfs pull
Backend Setup
# Install UV (if not already installed)
pip install uv
# Install dependencies
uv sync
# Run the backend
uv run uvicorn backend.api:app --host 0.0.0.0 --port 8000 --reload
Frontend Setup
cd ui-app
npm install
npm run dev
Access the application
- Frontend: http://localhost:5173
- Backend API: http://localhost:8000
- Browse Players: Visit the players page to explore the database of tennis players
- Search Players: Use the search functionality to find specific players by name or country
- View Player Details: Click on any player to see their detailed statistics and performance history
- Make Predictions: Go to the prediction page, select two players, and get AI-powered match predictions with confidence scores
All contributions are welcome! Whether it's improving the ML models, adding new features, or enhancing the UI, your input is valuable.
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
