13 Commits

Author SHA1 Message Date
youkaichao
5b1373924e
doc fix; policy train/eval signature fix (#109)
* doc fix; policy train/eval signature fix

* change train/eval behavior according to pytorch
2020-07-06 10:44:34 +08:00
danagi
c59ad40aef
Add auto alpha tuning and exploration noise for sac. (#80)
Add classes BaseNoise and GaussianNoise for the concept of exploration noise.
Add a new SAC test on MountainCarContinuous-v0,
which should benefit from the two new features above.
2020-06-16 22:17:28 +08:00
Trinkle23897
5f2f05a570 fix #40 2020-06-13 17:06:08 +08:00
Trinkle23897
dc451dfe88 nstep all (fix #51) 2020-06-03 13:59:47 +08:00
Alexis DUBURCQ
8af7196a9a
Robust conversion from/to numpy/pytorch (#63)
* Enable converting Batch data back to torch.

* Add torch converter to collector.

* Fix

* Move to_numpy/to_torch conversion into a dedicated utils.py.

* Use to_numpy/to_torch to convert arrays.

* fix lint

* fix

* Add unit test to check Batch from/to numpy.

* Fix Batch over Batch.

Co-authored-by: Alexis Duburcq <alexis.duburcq@wandercraft.eu>
2020-05-29 20:45:21 +08:00
Trinkle23897
de556fd22d item3 of #51 2020-05-27 11:02:23 +08:00
Imone
57bca16f94
Fix log_prob and PPO dual_clip (#49)
* Added DiagGaussian to fix log_prob

* Disable PPO dual_clip
2020-05-18 16:23:35 +08:00
Trinkle23897
0eef0ca198 fix optional type syntax 2020-05-16 20:08:32 +08:00
Trinkle23897
9b26137cd2 add type annotation 2020-05-12 11:31:47 +08:00
nicoguertler
8f718d9b13
Fix log_prob in SAC (#41) 2020-04-28 23:44:15 +08:00
Trinkle23897
70290346ea compatible with torch==1.5.0 (fix #37) 2020-04-26 11:04:45 +08:00
Trinkle23897
3cc22b7c0c __call__ -> forward 2020-04-10 10:47:16 +08:00
Trinkle23897
19f2cce294 seealso and change policy dir structure 2020-04-09 21:36:53 +08:00