# tianshou

Tianshou (天授) is a reinforcement learning platform. The following image illustrates its architecture.

## agent

Examples

Self-play Framework

## core

### Model

DQN, Policy-Value Network of AlphaGo Zero, PPO-specific, TRPO-specific

### Algorithm

#### Loss design

Actor-Critic (Variations), DQN (Variations), DDPG, TRPO, PPO

#### Optimization method

SGD, Adam, TRPO, natural gradient, etc.

### Planning

MCTS

## data

Training style: Monte Carlo or Temporal Difference

Reward Reshaping / Advantage Estimation Function

Importance weight

Multithread Read/Write

## environment

DQN repeat frames, etc.

## simulator

Go, Othello/Reversi, Warzone
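
As a rough illustration of the two training styles named in the data section (Monte Carlo and Temporal Difference), the sketch below computes both kinds of value targets for a single episode. It is not Tianshou's API: the function names `monte_carlo_returns` and `td_targets`, the argument layout, and the use of plain Python lists are all assumptions made for this example.

```python
from typing import List


def monte_carlo_returns(rewards: List[float], gamma: float) -> List[float]:
    """Discounted return G_t = r_t + gamma * G_{t+1}, computed from the full episode.

    Hypothetical helper for illustration; not part of Tianshou.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each return reuses the one after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def td_targets(rewards: List[float], next_values: List[float], gamma: float) -> List[float]:
    """One-step TD target r_t + gamma * V(s_{t+1}).

    `next_values[t]` is an estimate of V(s_{t+1}); use 0.0 after a terminal step.
    Hypothetical helper for illustration; not part of Tianshou.
    """
    return [r + gamma * v for r, v in zip(rewards, next_values)]


if __name__ == "__main__":
    rewards = [1.0, 1.0, 1.0]
    print(monte_carlo_returns(rewards, gamma=0.5))
    print(td_targets(rewards, next_values=[2.0, 4.0, 0.0], gamma=0.5))
```

Monte Carlo targets need the whole episode before any update can happen, while TD targets can be formed after every step at the cost of bootstrapping from a value estimate. Subtracting the current value estimate from either target gives a simple advantage estimate.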

## TODO

Parallelize search-based methods.