Pilot-Khadka/alphago-lite

alphago-lite

A lightweight implementation of an AlphaGo-style training pipeline, based on:

  • the book Deep Learning and the Game of Go
  • the paper Mastering the Game of Go with Deep Neural Networks and Tree Search

implementation

  • fast policy network
  • strong policy network
  • self-play loop
    • policy gradient learning
    • Q-learning
  • bot vs. bot gameplay
  • web interface (Next.js frontend)
  • train strong policy network
  • AlphaGo-style self-play
  • benchmark against known bots

implementation overview

The project roughly follows the pipeline described in Deep Learning and the Game of Go.

1. supervised learning

Train a policy network to imitate human moves from professional Go games.
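
As a rough illustration of this step, supervised move prediction can be framed as classification over board points: each human move becomes a class index, and the loss is the cross-entropy between the policy's output and that index. The sketch below is not taken from this repository; the 19x19 board size and the helper names (`encode_move`, `decode_move`, `cross_entropy`) are assumptions, and plain Python lists stand in for the network's output.

```python
import math

BOARD_SIZE = 19  # assumption: full-size board


def encode_move(row, col, board_size=BOARD_SIZE):
    """Map a board point to a class index in [0, board_size**2)."""
    return row * board_size + col


def decode_move(index, board_size=BOARD_SIZE):
    """Inverse of encode_move: class index back to (row, col)."""
    return divmod(index, board_size)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Negative log-probability the policy assigns to the human move."""
    return -math.log(softmax(logits)[target_index])


# An untrained policy with equal logits spreads probability over all points,
# so its loss on any single human move is log(number of points).
logits = [0.0] * (BOARD_SIZE * BOARD_SIZE)
human_move = encode_move(3, 15)
loss = cross_entropy(logits, human_move)
```

Training lowers this loss by shifting probability toward the moves professionals actually played in each position.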

2. policy improvement

Improve the policy using reinforcement learning through self-play.
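
One common way to do this is a REINFORCE-style policy gradient update, where moves from won games are made more likely and moves from lost games less likely. The sketch below is a simplified illustration under stated assumptions, not this repository's training code: it uses a bare list of logits in place of a neural network, and `reinforce_update`, the learning rate, and the +1/-1 reward convention are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_update(logits, action, reward, lr=0.1):
    """Apply one REINFORCE step to a softmax policy.

    For a softmax policy, the gradient of log pi(action) with respect to
    the logits is one_hot(action) - probs. Scaling it by the game outcome
    (assumed here: +1 for a win, -1 for a loss) gives the update direction.
    """
    probs = softmax(logits)
    return [
        z + lr * reward * ((1.0 if i == action else 0.0) - p)
        for i, (z, p) in enumerate(zip(logits, probs))
    ]


# The same move is reinforced after a win and discouraged after a loss.
start = [0.0, 0.0, 0.0]
after_win = reinforce_update(start, action=1, reward=+1.0)
after_loss = reinforce_update(start, action=1, reward=-1.0)
```

In a full self-play loop the same update would be applied to every move the agent made in a game, using that game's final result as the reward.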

3. evaluation

Evaluate trained agents by running automated matches between models.
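
A minimal sketch of such a match runner is shown below. It alternates colors between games so that neither agent benefits from always playing Black. The `play_game` callback, the `run_match` and `win_rate` names, and the toy game are assumptions made for illustration and do not describe this repository's interface.

```python
from collections import Counter


def run_match(agent_a, agent_b, num_games, play_game):
    """Play num_games between two agents, alternating who takes Black.

    play_game(black, white) is a hypothetical callback that plays one full
    game and returns "black" or "white" for the winner.
    """
    wins = Counter({"a": 0, "b": 0})
    for game in range(num_games):
        a_is_black = game % 2 == 0
        black, white = (agent_a, agent_b) if a_is_black else (agent_b, agent_a)
        black_won = play_game(black, white) == "black"
        # Agent A wins when the winning color is the color A played.
        wins["a" if black_won == a_is_black else "b"] += 1
    return wins


def win_rate(wins, name="a"):
    """Fraction of games won by the named agent (0.0 if no games)."""
    total = sum(wins.values())
    return wins[name] / total if total else 0.0


# Toy stand-in for real games: agents are numbers, the larger one always wins.
def toy_game(black, white):
    return "black" if black > white else "white"


result = run_match(2, 1, num_games=10, play_game=toy_game)
```

With real agents, a few hundred games are usually needed before a difference in win rate can be distinguished from noise.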

dataset

This project uses the Computer Go Dataset:

https://github.com/yenw/computer-go-dataset.git
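
Assuming the game records are stored as SGF files (the usual format for Go game collections), the move sequence of a game can be pulled out with a small parser. The sketch below is a simplified illustration, not this repository's loader: `parse_sgf_moves` is a hypothetical helper, and it ignores variations, setup stones, and comment text that happens to look like a move.

```python
import re

# Matches move nodes such as ";B[pd]", or a pass written as ";W[]".
MOVE_PATTERN = re.compile(r";\s*([BW])\[([a-s]{2}|)\]")


def parse_sgf_moves(sgf_text):
    """Return a list of (color, point) pairs in playing order.

    point is (row, col) with zero-based indices, or None for a pass.
    SGF writes the column letter first and the row letter second.
    """
    moves = []
    for color, coords in MOVE_PATTERN.findall(sgf_text):
        if not coords:
            moves.append((color, None))  # empty value means a pass
        else:
            col = ord(coords[0]) - ord("a")
            row = ord(coords[1]) - ord("a")
            moves.append((color, (row, col)))
    return moves


sample = "(;GM[1]SZ[19];B[pd];W[dp];B[])"
moves = parse_sgf_moves(sample)
```

Each (position, move) pair recovered this way can then serve as one training example for the supervised learning step.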
