A lightweight implementation of an AlphaGo-style training pipeline, based on the book
- Deep Learning and the Game of Go, and the paper
- Mastering the Game of Go with Deep Neural Networks and Tree Search
- fast policy network
- strong policy network
- self-play loop (Q-learning, policy learning)
- policy gradient learning
- Q-learning
- Bot vs Bot gameplay
- web interface (Next.js frontend)
- train strong policy network
- AlphaGo-style self-play
- benchmark against known bots
The project roughly follows the pipeline described in Deep Learning and the Game of Go.
Train a policy network to imitate human moves from professional Go games.
Improve the policy using reinforcement learning through self-play.
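
As a rough illustration of the policy gradient idea behind this step, here is a minimal REINFORCE sketch in plain Python. It is not the project's training code: a single-move toy game stands in for a full game of Go, the policy is a bare softmax over logits rather than a neural network, and the names `reinforce_update`, `train`, and `winning_move` are hypothetical.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_update(logits, action, reward, lr=0.1):
    """One REINFORCE step for a softmax policy.

    The gradient of log pi(action) with respect to the logits is
    onehot(action) - probs, scaled here by the game outcome.
    """
    probs = softmax(logits)
    return [
        w + lr * reward * ((1.0 if i == action else 0.0) - p)
        for i, (w, p) in enumerate(zip(logits, probs))
    ]


def train(num_moves=3, winning_move=2, episodes=2000, seed=0):
    """Toy stand-in for self-play: one move per game, +1 for a win, -1 for a loss."""
    rng = random.Random(seed)
    logits = [0.0] * num_moves
    for _ in range(episodes):
        probs = softmax(logits)
        # Sample a move from the current policy instead of taking the argmax,
        # so that the policy keeps exploring.
        action = rng.choices(range(num_moves), weights=probs)[0]
        reward = 1.0 if action == winning_move else -1.0
        logits = reinforce_update(logits, action, reward)
    return softmax(logits)


if __name__ == "__main__":
    print(train())
```

In a real self-play loop the reward would only arrive at the end of the game and would be applied to every move the agent made in that game.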
Evaluate trained agents by running automated matches between models.
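
A match runner for this kind of evaluation could look roughly like the sketch below. The names `play_match`, `play_game`, and `toy_game` are hypothetical and do not come from this repository. `play_game` is assumed to return `0` when the first (black) player wins and `1` otherwise, and `toy_game` replaces a real Go game with a weighted coin flip so the example can run on its own.

```python
import random


def play_match(agent_a, agent_b, play_game, num_games=100, seed=0):
    """Play num_games games and return agent_a's win rate.

    Colors alternate every game so that neither agent benefits
    from always moving first.
    """
    rng = random.Random(seed)
    wins = 0
    for i in range(num_games):
        if i % 2 == 0:
            # agent_a plays black (moves first)
            wins += play_game(agent_a, agent_b, rng) == 0
        else:
            # agent_a plays white (moves second)
            wins += play_game(agent_b, agent_a, rng) == 1
    return wins / num_games


def toy_game(black, white, rng):
    """Stand-in for a real game: agents are plain strength numbers and
    the winner is sampled in proportion to strength."""
    p_black = black / (black + white)
    return 0 if rng.random() < p_black else 1


if __name__ == "__main__":
    rate = play_match(3.0, 1.0, toy_game, num_games=1000)
    print(f"agent_a win rate: {rate:.3f}")
```

Alternating colors matters in Go because moving first is an advantage that komi only approximately compensates.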
This project uses the Computer Go Dataset:
https://github.com/yenw/computer-go-dataset.git
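
Assuming the game records are stored in SGF (Smart Game Format), the common format for Go records, the moves can be pulled out with a small parser like the sketch below. The function name `parse_sgf_moves` is hypothetical, and the regex is deliberately minimal: it only sees moves that directly follow a `;` node marker and ignores variations, setup stones, and comments.

```python
import re

# Matches move properties such as ";B[pd]" or ";W[dp]".
# An empty value ("[]") is a pass.
MOVE_RE = re.compile(r";\s*([BW])\[([a-t]{2}|)\]")


def parse_sgf_moves(sgf_text):
    """Return a list of (color, point) pairs.

    point is a zero-based (row, col) tuple counted from the top-left
    corner, or None for a pass.
    """
    moves = []
    for color, coord in MOVE_RE.findall(sgf_text):
        # Older SGF files also encode a pass as "tt" on a 19x19 board.
        if coord in ("", "tt"):
            moves.append((color, None))
            continue
        col = ord(coord[0]) - ord("a")  # first letter is the column
        row = ord(coord[1]) - ord("a")  # second letter is the row
        moves.append((color, (row, col)))
    return moves


if __name__ == "__main__":
    sample = "(;GM[1]SZ[19];B[pd];W[dp];B[])"
    for move in parse_sgf_moves(sample):
        print(move)
```

Each `(color, point)` pair can then be replayed on a board to produce the position and move pairs used for supervised training.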
- Pumperson and Ferguson, Deep Learning and the Game of Go
- Silver et al., Mastering the Game of Go with Deep Neural Networks and Tree Search https://www.nature.com/articles/nature16961