# Optimizer for policy gradient methods

TODO:

- vanilla
- baseline
- REINFORCE
- TRPO
- PPO
- GAE
- NAF
- DPG
- ACKTR