43 Commits

Author SHA1 Message Date
youkaichao
e767de044b
Remove dummy net code (#123)
* remove dummy net; delete two files

* split code to have backbone and head

* rename class

* change torch.float to torch.float32

* use flatten(1) instead of view(batch, -1)

* remove dummy net in docs

* bugfix for rnn

* fix cuda error

* minor fix of docs

* do not change the example code in dqn tutorial, since it is for demonstration

Co-authored-by: Trinkle23897 <463003665@qq.com>
2020-07-09 22:57:01 +08:00
Trinkle23897
263e490b76 fix #79 2020-06-16 16:54:16 +08:00
Trinkle23897
513573ea82 add link 2020-06-08 22:20:52 +08:00
Trinkle23897
7bf202f195 polish docs 2020-06-03 17:04:26 +08:00
Trinkle23897
dc451dfe88 nstep all (fix #51) 2020-06-03 13:59:47 +08:00
Trinkle23897
f818a2467b zh_CN docs 2020-06-02 08:51:14 +08:00
Trinkle23897
de556fd22d item3 of #51 2020-05-27 11:02:23 +08:00
Trinkle23897
3243484f8e show stat in pytest 2020-05-16 08:48:12 +08:00
Trinkle23897
80d661907e Multimodal obs (#38, #27, #25) 2020-04-28 20:56:02 +08:00
Trinkle23897
959955fa2a fix historical issues 2020-04-26 16:13:51 +08:00
Trinkle23897
6b96f124ae fix pdqn 2020-04-26 15:11:20 +08:00
rocknamx
b23749463e
Prioritized DQN (#30)
* add sum_tree.py

* add prioritized replay buffer

* del sum_tree.py

* fix some format issues

* fix weight_update bug

* simply replace replaybuffer in test_dqn without weight update

* weight default set to 1

* fix sampling bug when buffer is not full

* rename parameter

* fix formula error, add accuracy check

* add PrioritizedDQN test

* add test_pdqn.py

* add update_weight() doc

* add ref of prio dqn in readme.md and index.rst

* restore test_dqn.py, fix args of test_pdqn.py
2020-04-26 12:05:58 +08:00
Trinkle23897
6bf1ea644d fix ppo 2020-04-19 14:30:42 +08:00
Trinkle23897
680fc0ffbe gae 2020-04-14 21:11:06 +08:00
Trinkle23897
7b65d43394 vanilla imitation learning 2020-04-13 19:37:27 +08:00
Trinkle23897
befdfb07e8 polish docs 2020-04-11 19:29:46 +08:00
Trinkle23897
3cc22b7c0c __call__ -> forward 2020-04-10 10:47:16 +08:00
Trinkle23897
86572c66d4 maybe finished rnn? 2020-04-08 21:13:15 +08:00
Trinkle23897
6c8edf6a3a codecov badge 2020-04-07 11:17:10 +08:00
Trinkle23897
e0809ff135 add policy docs (#21) 2020-04-06 19:36:59 +08:00
Trinkle23897
974ade8019 add some docs 2020-04-03 21:28:12 +08:00
Trinkle23897
6cfa876591 hot fix 2020-04-03 15:17:58 +08:00
Trinkle23897
7cb5146611 add docs of trick 2020-04-02 21:57:26 +08:00
Trinkle23897
0e86d44860 finish concepts 2020-04-02 12:31:22 +08:00
Trinkle23897
0acd0d164c test api doc 2020-04-02 09:07:04 +08:00
Trinkle23897
4f843d3f51 update readme 2020-04-01 10:21:58 +08:00
Trinkle23897
04208e6cce update some tutorial 2020-03-30 22:52:25 +08:00
Trinkle23897
4e7df7616a update dqn tutorial 2020-03-29 15:18:33 +08:00
Trinkle23897
d9e4b9d16f upd doc 2020-03-29 10:22:03 +08:00
Trinkle23897
a326d30739 shorten quick start 2020-03-28 22:40:47 +08:00
Trinkle23897
57735ce1b5 add logo and sphinx setup 2020-03-28 22:01:23 +08:00
Trinkle23897
f68f23292e update readme and force flake8 2020-03-28 13:27:01 +08:00
Minghao Zhang
068c4068ec
fix atari/mujoco env (#7)
* update atari.py

* fix setup.py
pass the pytest

* fix setup.py
pass the pytest

* add args "render"

* change the tensorboard writer

* change the tensorboard writer

* change device, render, tensorboard log location

* change device, render, tensorboard log location

* remove some wrong local files

* fix some tab mistakes and the envs name in continuous/test_xx.py

* add examples and point robot maze environment

* fix some bugs during testing examples

* add dqn network and fix some args

* change back the tensorboard writer's frequency to ensure ppo and a2c can write things normally

* add a warning to collector

* rm some unrelated files

* reformat

* fix a bug in test_dqn caused by wrong model selection

* change atari frame skip and observation to improve performance

* re-add some files

* change import

* modified readme

* rm tensorboard log

* update atari and mujoco which are ignored

* rm the wrong lines
2020-03-28 12:03:49 +08:00
Trinkle23897
c42990c725 add rllib result and fix pep8 2020-03-28 09:43:35 +08:00
sproblvem
acb93502cf
Update README.md
change "Framework" to "Task"
2020-03-27 16:52:07 +08:00
Trinkle23897
044aae4355 add baseline and rlpyt result 2020-03-27 16:24:07 +08:00
Trinkle23897
44f911bc31 add pytorch drl result 2020-03-27 09:04:29 +08:00
Trinkle23897
519f9f20d0 update readme 2020-03-26 17:32:51 +08:00
Trinkle23897
c505cd8205 update readme 2020-03-26 11:42:34 +08:00
Trinkle23897
75364cd986 ppo and early stop 2020-03-20 19:52:29 +08:00
Trinkle23897
f58c1397c6 half of collector 2020-03-12 22:20:33 +08:00
Trinkle23897
04557fdb82 env test \ ray 2020-03-11 16:14:53 +08:00
Trinkle23897
0c944eab68 init 2020-03-09 11:38:04 +08:00