architecture design patch
parent 595e62e111
commit e6cad0bce9
25 README.md
@@ -1,2 +1,27 @@
# tianshou
Tianshou(天授) is a reinforcement learning platform.

## data
TODO:

Replay Memory

Multiple writer/reader

Importance sampling

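As a rough illustration of the first and third TODO items above, here is a minimal sketch of a replay memory that samples in proportion to a per-transition priority and returns importance-sampling weights to correct for it. Everything in it is an assumption made for illustration and not Tianshou's API: the `Transition` fields, the `ReplayMemory` class, and the `add`/`sample` method names are hypothetical. It is single-threaded and does not address the multiple writer/reader item.

```python
import random
from collections import namedtuple

# Hypothetical transition record; the field names are assumptions.
Transition = namedtuple(
    "Transition", ["obs", "action", "reward", "next_obs", "done"]
)


class ReplayMemory:
    """Fixed-capacity ring buffer with priority-proportional sampling."""

    def __init__(self, capacity, seed=None):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._storage = []      # stored transitions, at most `capacity`
        self._priorities = []   # one positive priority per transition
        self._next = 0          # slot overwritten by the next write when full
        self._rng = random.Random(seed)

    def __len__(self):
        return len(self._storage)

    def add(self, transition, priority=1.0):
        """Store a transition, overwriting the oldest one when full."""
        if priority <= 0:
            raise ValueError("priority must be positive")
        if len(self._storage) < self.capacity:
            self._storage.append(transition)
            self._priorities.append(priority)
        else:
            self._storage[self._next] = transition
            self._priorities[self._next] = priority
        self._next = (self._next + 1) % self.capacity

    def sample(self, batch_size):
        """Draw a batch with replacement, proportional to priority.

        Returns (transitions, weights). Each weight is the ratio of the
        uniform probability 1/n to the sampling probability of that
        transition, so weighted averages estimate uniform averages.
        """
        n = len(self._storage)
        if n == 0:
            raise ValueError("cannot sample from an empty memory")
        total = sum(self._priorities)
        indices = self._rng.choices(
            range(n), weights=self._priorities, k=batch_size
        )
        batch = [self._storage[i] for i in indices]
        weights = [total / (n * self._priorities[i]) for i in indices]
        return batch, weights
```

With all priorities left at the default, sampling is uniform and every weight equals 1.0. A production version would likely replace the linear-time `random.choices` call with a sum tree and add locking for concurrent writers and readers.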
## simulator
Go (for AlphaGo)

## environment
gym

## core
TODO:

Optimizer

MCTS

## agent (optional)

DQNAgent etc.
@@ -1,20 +0,0 @@
# Optimizer for policy gradient methods
TODO:

vanilla

baseline

REINFORCE

TRPO

PPO

GAE

NAF

DPG

ACKTR