34 Commits

Author SHA1 Message Date
Trinkle23897
81e4a16ef2 fix a bug in re-index replay buffer (fix #82) 2020-06-17 16:37:51 +08:00
Trinkle23897
f1951780ab fix a bug of storing batch over batch data into buffer 2020-06-09 18:46:14 +08:00
Trinkle23897
560116d0b2 cheat sheet 2020-06-08 21:53:00 +08:00
Trinkle23897
ba1b3e54eb fix #69 2020-06-01 08:30:09 +08:00
Alexis DUBURCQ
dd3e2130bb Infer the right dtype for replay buffers. (#64) 2020-05-29 22:27:03 +08:00
Trinkle23897
de556fd22d item3 of #51 2020-05-27 11:02:23 +08:00
Trinkle23897
0eef0ca198 fix optional type syntax 2020-05-16 20:08:32 +08:00
Trinkle23897
9b26137cd2 add type annotation 2020-05-12 11:31:47 +08:00
Trinkle23897
c2a7caf806 add recurrent actor and critic 2020-04-30 16:31:40 +08:00
Trinkle23897
134f787e24 reserve 'policy' keyword in replay buffer 2020-04-29 17:48:48 +08:00
Trinkle23897
bb2f833d0e support Batch of Batch and fix bugs (#38) 2020-04-29 12:14:53 +08:00
Trinkle23897
80d661907e Multimodal obs (#38, #27, #25) 2020-04-28 20:56:02 +08:00
rocknamx
b23749463e Prioritized DQN (#30)
* add sum_tree.py

* add prioritized replay buffer

* del sum_tree.py

* fix some format issues

* fix weight_update bug

* simply replace replaybuffer in test_dqn without weight update

* weight default set to 1

* fix sampling bug when buffer is not full

* rename parameter

* fix formula error, add accuracy check

* add PrioritizedDQN test

* add test_pdqn.py

* add update_weight() doc

* add ref of prio dqn in readme.md and index.rst

* restore test_dqn.py, fix args of test_pdqn.py
2020-04-26 12:05:58 +08:00
Trinkle23897
6a244d1fbb save_fn 2020-04-11 16:54:27 +08:00
Trinkle23897
13086b7f64 add ignore_obs_next in buffer 2020-04-10 09:01:17 +08:00
Trinkle23897
19f2cce294 seealso and change policy dir structure 2020-04-09 21:36:53 +08:00
Trinkle23897
6da80e045a fix rnn (#19), add __repr__, and fix #26 2020-04-09 19:53:45 +08:00
Trinkle23897
86572c66d4 maybe finished rnn? 2020-04-08 21:13:15 +08:00
Trinkle23897
610390c132 add docs of collector and trainer (#20) 2020-04-05 18:34:45 +08:00
Trinkle23897
b6c9db6b0b docs for env 2020-04-04 21:02:06 +08:00
Trinkle23897
974ade8019 add some docs 2020-04-03 21:28:12 +08:00
Trinkle23897
04208e6cce update some tutorial 2020-03-30 22:52:25 +08:00
Trinkle23897
f23b0dfac9 add ListReplayBuffer 2020-03-28 15:14:41 +08:00
Minghao Zhang
3c0a09fefd minor reformat (#2)
* update atari.py

* fix setup.py
pass the pytest

* fix setup.py
pass the pytest
2020-03-26 09:01:20 +08:00
Trinkle23897
64bab0b6a0 ddpg 2020-03-18 21:45:41 +08:00
Trinkle23897
39de63592f finish pg 2020-03-17 11:37:31 +08:00
Trinkle23897
cef5de8b83 fix some bugs 2020-03-16 11:11:29 +08:00
Trinkle23897
5983c6b33d finish dqn 2020-03-15 17:41:00 +08:00
Trinkle23897
c804662457 add cache buf in collector 2020-03-14 21:48:31 +08:00
Trinkle23897
f16e05c0e7 maybe finished collector? 2020-03-13 17:49:22 +08:00
Trinkle23897
f58c1397c6 half of collector 2020-03-12 22:20:33 +08:00
Trinkle23897
6632e47b9d add test_buffer 2020-03-11 17:28:51 +08:00
Trinkle23897
5550aed0a1 flake8 fix 2020-03-11 09:38:14 +08:00
Trinkle23897
0dfb900e29 env and data 2020-03-11 09:09:56 +08:00