# tianshou
Tianshou (天授) is a reinforcement learning platform. The following image illustrates its architecture.
## agent
Examples
Self-play Framework
## core
### Model
DQN, Policy-Value Network of AlphaGo Zero, PPO-specific, TRPO-specific
### Algorithm
#### Loss design
Actor-Critic (Variations), DQN (Variations), DDPG, TRPO, PPO
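
As an illustration of what a loss in this list looks like, here is a minimal sketch of the PPO clipped surrogate objective in plain Python. It is not taken from this repository: the function name `ppo_clip_objective` and its arguments are hypothetical, and a real implementation would operate on tensors rather than lists.

```python
from typing import Sequence


def ppo_clip_objective(ratios: Sequence[float],
                       advantages: Sequence[float],
                       eps: float = 0.2) -> float:
    """Mean PPO clipped surrogate (to be maximized).

    ratios[i] is pi_new(a|s) / pi_old(a|s) for sample i;
    advantages[i] is the advantage estimate for the same sample.
    Both names are illustrative, not part of this project's API.
    """
    if len(ratios) != len(advantages) or not ratios:
        raise ValueError("ratios and advantages must be non-empty and equal length")
    terms = []
    for r, a in zip(ratios, advantages):
        # Clip the probability ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - eps, min(r, 1.0 + eps))
        # Take the pessimistic (smaller) of the two surrogate terms.
        terms.append(min(r * a, clipped * a))
    return sum(terms) / len(terms)
```

The clipping removes the incentive to move the ratio far from 1 in the direction that would increase the objective.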
#### Optimization method
SGD, Adam, TRPO, natural gradient, etc.
### Planning
MCTS
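
The selection step of MCTS can be sketched as follows. This is a hedged illustration using the plain UCT (UCB1) rule, not necessarily the variant used here (AlphaGo Zero, for example, uses a prior-weighted PUCT rule instead). The names `uct_score` and `select_child` and the `(total_value, visits)` layout are assumptions made for the example.

```python
import math
from typing import List, Tuple


def uct_score(total_value: float, visits: int,
              parent_visits: int, c: float = 1.4) -> float:
    """UCT score: mean value plus an exploration bonus."""
    if visits == 0:
        # Unvisited children are always tried first.
        return math.inf
    exploit = total_value / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore


def select_child(stats: List[Tuple[float, int]], c: float = 1.4) -> int:
    """Return the index of the child with the highest UCT score.

    stats[i] = (total_value, visits) for child i (hypothetical layout).
    """
    if not stats:
        raise ValueError("node has no children")
    parent_visits = sum(n for _, n in stats)
    scores = [uct_score(w, n, parent_visits, c) for w, n in stats]
    return max(range(len(stats)), key=scores.__getitem__)
```

With `c = 0` the rule is purely greedy on the mean value; larger `c` favors rarely visited children.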
## data
Training style - Monte Carlo or Temporal Difference
Reward Reshaping / Advantage Estimation Function
Importance weight
Multithread Read/Write
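
To make the two training styles concrete, the sketch below computes Monte Carlo returns and one-step Temporal Difference targets for a single episode. It is an illustration under assumptions, not this project's data pipeline: the function names and the list-based episode format are hypothetical.

```python
from typing import List, Sequence


def monte_carlo_returns(rewards: Sequence[float], gamma: float) -> List[float]:
    """Discounted return G_t = r_t + gamma * G_{t+1}, computed backwards."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def td_targets(rewards: Sequence[float],
               next_values: Sequence[float],
               gamma: float) -> List[float]:
    """One-step TD target r_t + gamma * V(s_{t+1}).

    next_values[t] is the value estimate of the state after step t;
    it should be 0 for a terminal state.
    """
    if len(rewards) != len(next_values):
        raise ValueError("rewards and next_values must have equal length")
    return [r + gamma * v for r, v in zip(rewards, next_values)]
```

Monte Carlo returns need the full episode but do not depend on a value estimate, while TD targets can be formed after every step at the cost of bootstrapping from `next_values`.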
## environment
DQN-style frame repetition (repeating each action for several frames), etc.
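
A minimal sketch of such an environment wrapper is shown below. It assumes a hypothetical environment interface with `reset()` returning an observation and `step(action)` returning `(observation, reward, done)`; neither `RepeatAction` nor `CountingEnv` comes from this repository.

```python
class RepeatAction:
    """Repeat each action for `repeat` frames and sum the rewards."""

    def __init__(self, env, repeat: int = 4):
        if repeat < 1:
            raise ValueError("repeat must be at least 1")
        self.env = env
        self.repeat = repeat

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total_reward = 0.0
        for _ in range(self.repeat):
            obs, reward, done = self.env.step(action)
            total_reward += reward
            if done:
                # Stop early so we never step a finished episode.
                break
        return obs, total_reward, done


class CountingEnv:
    """Toy environment: reward 1 per frame, ends after `horizon` frames."""

    def __init__(self, horizon: int = 10):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= self.horizon
```

Wrapping `CountingEnv(horizon=10)` with `repeat=4` means the agent makes three decisions per episode instead of ten, and the last decision covers only the two remaining frames.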
## simulator
Go, Othello/Reversi, Warzone
## TODO
Parallelize search-based methods.