2017-11-06 15:15:44 +08:00

Optimizer for policy gradient methods

TODO:

vanilla

baseline

REINFORCE

TRPO

PPO

GAE

NAF

DPG

ACKTR
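
As a starting point for the first three items above (vanilla, baseline, REINFORCE), here is a minimal sketch of the REINFORCE gradient estimate with an optional constant baseline. It is an illustration under stated assumptions, not this project's implementation: the policy is assumed to be a softmax over discrete actions in a single state, and the names `softmax`, `reinforce_gradient`, `theta`, and `baseline` are hypothetical.

```python
import numpy as np


def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()


def reinforce_gradient(theta, actions, returns, baseline=0.0):
    """Monte Carlo estimate of the policy gradient for a softmax policy.

    Assumption: a single-state problem, so the policy is pi = softmax(theta).
    For a softmax policy, grad_theta log pi(a) = onehot(a) - pi.
    The estimate is the sample mean of (return - baseline) * grad log pi(a).
    """
    pi = softmax(theta)
    grad = np.zeros_like(theta, dtype=float)
    for a, g in zip(actions, returns):
        grad_log_pi = -pi.copy()
        grad_log_pi[a] += 1.0
        grad += (g - baseline) * grad_log_pi
    return grad / len(actions)


if __name__ == "__main__":
    theta = np.zeros(2)
    actions = [0, 1]
    returns = [1.0, 0.0]
    print(reinforce_gradient(theta, actions, returns))
    print(reinforce_gradient(theta, actions, returns, baseline=0.5))
```

Gradient ascent on `theta` with this estimate raises the probability of actions whose return exceeds the baseline. Subtracting a constant baseline leaves the expected gradient unchanged and is typically used to reduce variance.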