# Optimizer for policy gradient methods
TODO:

- vanilla
- introduce a baseline
- REINFORCE
- TRPO
- PPO
- GAE
- NAF
- DPG
- ACKTR