2017-11-06 15:15:44 +08:00

# Optimizer for policy gradient methods
TODO:

- vanilla
- baseline
- REINFORCE
- TRPO
- PPO
- GAE
- NAF
- DPG
- ACKTR