tianshou

Tianshou (天授) is a reinforcement learning platform.

agent

Examples

Self-play Framework

DQN, Policy-Value Network of AlphaGo Zero, PPO-specific, TRPO-specific

Actor-Critic (Variations), DQN (Variations), DDPG, TRPO, PPO

SGD, Adam, TRPO, natural gradient, etc.

MCTS

Training style - Monte Carlo or Temporal Difference
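
A minimal sketch of how the two training styles differ when building value targets from one trajectory. The function names `monte_carlo_returns` and `td0_targets` are hypothetical and not part of Tianshou's API; the example only assumes a list of per-step rewards and, for the temporal-difference case, value estimates of the next states.

```python
from typing import List


def monte_carlo_returns(rewards: List[float], gamma: float) -> List[float]:
    """Discounted return from each step to the end of the episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each return reuses the one after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def td0_targets(rewards: List[float], next_values: List[float],
                gamma: float) -> List[float]:
    """One-step bootstrapped targets: r_t + gamma * V(s_{t+1})."""
    return [r + gamma * v for r, v in zip(rewards, next_values)]


if __name__ == "__main__":
    print(monte_carlo_returns([1.0, 1.0, 1.0], gamma=0.5))
    print(td0_targets([1.0, 1.0], next_values=[2.0, 0.0], gamma=0.5))
```

Monte Carlo targets need the full episode before any update, while temporal-difference targets can be computed after every step at the cost of relying on the current value estimate.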

Reward Shaping / Advantage Estimation Function
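
As an illustration of an advantage estimation function, the sketch below uses generalized advantage estimation (GAE), which is one common choice; the source does not say which estimator Tianshou uses. The name `gae_advantages` and its parameters are hypothetical. It assumes `values` holds one more entry than `rewards`, the last being the bootstrap value of the final state.

```python
from typing import List


def gae_advantages(rewards: List[float], values: List[float],
                   gamma: float, lam: float) -> List[float]:
    """Generalized advantage estimation over a single trajectory.

    values[t] is the value estimate of the state at step t, and
    values[-1] is the bootstrap value of the state after the last step.
    """
    if len(values) != len(rewards) + 1:
        raise ValueError("values must have exactly one more entry than rewards")
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error at step t.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Exponentially weighted sum of the TD errors from t onwards.
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


if __name__ == "__main__":
    print(gae_advantages([1.0, 1.0], [0.0, 0.0, 0.0], gamma=1.0, lam=1.0))
```

With `lam=0` this reduces to the one-step TD error, and with `lam=1` it becomes the discounted return minus the value baseline.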

Importance weight

Multithread Read/Write

DQN frame repetition, etc.
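
A rough sketch of frame repetition as an environment wrapper: the chosen action is applied for several consecutive steps and the rewards are summed. Both classes are hypothetical illustrations, not Tianshou code, and the `reset()` / `step(action) -> (obs, reward, done)` interface is an assumption made for this example.

```python
class CountingEnv:
    """Toy environment (hypothetical): reward 1.0 per step, ends after `horizon` steps."""

    def __init__(self, horizon: int):
        self.horizon = horizon
        self.t = 0

    def reset(self) -> int:
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        return self.t, 1.0, done


class RepeatAction:
    """Repeat each action `repeat` times and accumulate the reward."""

    def __init__(self, env, repeat: int):
        if repeat < 1:
            raise ValueError("repeat must be at least 1")
        self.env = env
        self.repeat = repeat

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total_reward = 0.0
        for _ in range(self.repeat):
            obs, reward, done = self.env.step(action)
            total_reward += reward
            if done:
                # Stop early so we never step a finished episode.
                break
        return obs, total_reward, done


if __name__ == "__main__":
    env = RepeatAction(CountingEnv(horizon=6), repeat=4)
    env.reset()
    print(env.step(0))
    print(env.step(0))
```

Keeping this logic in a wrapper leaves the agent and the underlying simulator unchanged.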

Go, Othello/Reversi, Warzone

Parallelize search-based methods.