# tianshou

Tianshou (天授) is a reinforcement learning platform. The following image illustrates its architecture.

## agent

    Examples

    Self-play Framework

## core

### Model

    DQN, Policy-Value Network of AlphaGo Zero, PPO-specific, TRPO-specific

### Algorithm

#### Loss design

    Actor-Critic (Variations), DQN (Variations), DDPG, TRPO, PPO

#### Optimization method

    SGD, Adam, TRPO, natural gradient, etc.

### Planning

    MCTS

## data

    Training style - Monte Carlo or Temporal Difference

    Reward Reshaping / Advantage Estimation Function

    Importance weight

    Multithread Read/Write

## environment

    DQN repeated frames, etc.

## simulator

    Go, Othello/Reversi, Warzone

## TODO

Parallelize search-based methods.
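
## Example: Monte Carlo vs. Temporal Difference targets

The data module lists "Training style - Monte Carlo or Temporal Difference". The sketch below illustrates the difference between the two kinds of training target. It is not taken from the Tianshou code base: the function names `monte_carlo_returns` and `td0_targets`, their parameters, and the plain-list data layout are all assumptions made for illustration.

```python
from typing import List, Sequence


def monte_carlo_returns(rewards: Sequence[float], gamma: float) -> List[float]:
    """Discounted returns G_t = r_t + gamma * G_{t+1} for one finished episode.

    Hypothetical helper: computed backwards from the last step, so the
    whole episode must be available before any target can be formed.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def td0_targets(
    rewards: Sequence[float],
    next_values: Sequence[float],
    dones: Sequence[bool],
    gamma: float,
) -> List[float]:
    """One-step TD targets r_t + gamma * V(s_{t+1}).

    Hypothetical helper: `next_values` holds the current value estimates
    of the successor states. Terminal steps (done=True) do not bootstrap.
    """
    return [
        r + (0.0 if done else gamma * v)
        for r, v, done in zip(rewards, next_values, dones)
    ]


if __name__ == "__main__":
    rewards = [1.0, 0.0, 2.0]
    print(monte_carlo_returns(rewards, gamma=0.5))
    print(td0_targets(rewards, [4.0, 2.0, 10.0], [False, False, True], gamma=0.5))
```

Monte Carlo targets use only observed rewards and need a complete episode, while TD targets can be formed after every step but depend on the quality of the value estimates they bootstrap from.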