Trinkle23897
de556fd22d
item3 of #51
2020-05-27 11:02:23 +08:00
magicly
6237cc0d52
fix dqn zero eps (#52)
Co-authored-by: liyan <liyan1@digisky.com>
2020-05-21 11:35:41 +08:00
Imone
57bca16f94
Fix log_prob and PPO dual_clip (#49)
* Added DiagGaussian to fix log_prob
* Disable PPO dual_clip
2020-05-18 16:23:35 +08:00
Trinkle23897
0eef0ca198
fix optional type syntax
2020-05-16 20:08:32 +08:00
Trinkle23897
9b26137cd2
add type annotation
2020-05-12 11:31:47 +08:00
Trinkle23897
04b091d975
fix max-grad-norm err in a2c (#46)
2020-05-04 12:33:04 +08:00
Trinkle23897
134f787e24
reserve 'policy' keyword in replay buffer
2020-04-29 17:48:48 +08:00
nicoguertler
8f718d9b13
Fix log_prob in SAC (#41)
2020-04-28 23:44:15 +08:00
Trinkle23897
80d661907e
Multimodal obs (#38, #27, #25)
2020-04-28 20:56:02 +08:00
Trinkle23897
959955fa2a
fix historical issues
2020-04-26 16:13:51 +08:00
Trinkle23897
6b96f124ae
fix pdqn
2020-04-26 15:11:20 +08:00
rocknamx
b23749463e
Prioritized DQN (#30)
* add sum_tree.py
* add prioritized replay buffer
* del sum_tree.py
* fix some format issues
* fix weight_update bug
* simply replace replaybuffer in test_dqn without weight update
* weight default set to 1
* fix sampling bug when buffer is not full
* rename parameter
* fix formula error, add accuracy check
* add PrioritizedDQN test
* add test_pdqn.py
* add update_weight() doc
* add ref of prio dqn in readme.md and index.rst
* restore test_dqn.py, fix args of test_pdqn.py
2020-04-26 12:05:58 +08:00
Trinkle23897
70290346ea
compatible with torch==1.5.0 (fix #37)
2020-04-26 11:04:45 +08:00
Trinkle23897
6bf1ea644d
fix ppo
2020-04-19 14:30:42 +08:00
Trinkle23897
680fc0ffbe
gae
2020-04-14 21:11:06 +08:00
Trinkle23897
3cc22b7c0c
__call__ -> forward
2020-04-10 10:47:16 +08:00
Trinkle23897
13086b7f64
add ignore_obs_next in buffer
2020-04-10 09:01:17 +08:00
Trinkle23897
19f2cce294
seealso and change policy dir structure
2020-04-09 21:36:53 +08:00