155 Commits

Author SHA1 Message Date
Trinkle23897
6cfa876591 hot fix 2020-04-03 15:17:58 +08:00
Trinkle23897
7cb5146611 add docs of trick 2020-04-02 21:57:26 +08:00
Trinkle23897
0e86d44860 finish concepts 2020-04-02 12:31:22 +08:00
Trinkle23897
0acd0d164c test api doc 2020-04-02 09:07:04 +08:00
Minghao Zhang
0b08a41610
move mujoco to examples (#12)
* move mujoco to examples

* fix the import mujoco bug

* flake8

* flake8

* rm __init__.py
2020-04-02 08:49:19 +08:00
Trinkle23897
4f843d3f51 update readme 2020-04-01 10:21:58 +08:00
ShenDezhou
4da857d86e
Fix windows env setup bugs and other typo. (#11) 2020-03-31 17:22:32 +08:00
Doxie
98feb79057
fix bug in discrete_net.py (#10) 2020-03-31 16:13:53 +08:00
Trinkle23897
04208e6cce update some tutorial 2020-03-30 22:52:25 +08:00
Trinkle23897
2169dd2201 update high-res logo 2020-03-29 15:52:47 +08:00
Trinkle23897
4e7df7616a update dqn tutorial 2020-03-29 15:18:33 +08:00
Trinkle23897
d9e4b9d16f upd doc 2020-03-29 10:22:03 +08:00
Trinkle23897
a326d30739 shorten quick start 2020-03-28 22:40:47 +08:00
Trinkle23897
57735ce1b5 add logo and sphinx setup 2020-03-28 22:01:23 +08:00
Trinkle23897
f23b0dfac9 add ListReplayBuffer 2020-03-28 15:14:41 +08:00
Minghao Zhang
eb7fb37806
fix PointMaze (#8)
* update atari.py

* fix setup.py
pass the pytest

* fix setup.py
pass the pytest

* add args "render"

* change the tensorboard writter

* change the tensorboard writter

* change device, render, tensorboard log location

* change device, render, tensorboard log location

* remove some wrong local files

* fix some tab mistakes and the envs name in continuous/test_xx.py

* add examples and point robot maze environment

* fix some bugs during testing examples

* add dqn network and fix some args

* change back the tensorboard writter's frequency to ensure ppo and a2c can write things normally

* add a warning to collector

* rm some unrelated files

* reformat

* fix a bug in test_dqn due to the model wrong selection

* change atari frame skip and observation to improve performance

* readd some files

* change import

* modified readme

* rm tensorboard log

* update atari and mujoco which are ignored

* rm the wrong lines

* readd the import of PointMaze

* fix a typo in test/discrete/net.py

* add a class declaration to pass the flake8

* fix flake8 errors
2020-03-28 14:36:12 +08:00
Trinkle23897
f68f23292e update readme and force flake8 2020-03-28 13:27:01 +08:00
Minghao Zhang
068c4068ec
fix atari/mujoco env (#7)
* update atari.py

* fix setup.py
pass the pytest

* fix setup.py
pass the pytest

* add args "render"

* change the tensorboard writter

* change the tensorboard writter

* change device, render, tensorboard log location

* change device, render, tensorboard log location

* remove some wrong local files

* fix some tab mistakes and the envs name in continuous/test_xx.py

* add examples and point robot maze environment

* fix some bugs during testing examples

* add dqn network and fix some args

* change back the tensorboard writter's frequency to ensure ppo and a2c can write things normally

* add a warning to collector

* rm some unrelated files

* reformat

* fix a bug in test_dqn due to the model wrong selection

* change atari frame skip and observation to improve performance

* readd some files

* change import

* modified readme

* rm tensorboard log

* update atari and mujoco which are ignored

* rm the wrong lines
2020-03-28 12:03:49 +08:00
Trinkle23897
c42990c725 add rllib result and fix pep8 2020-03-28 09:43:35 +08:00
Minghao Zhang
77068af526
add examples, fix some bugs (#5)
* update atari.py

* fix setup.py
pass the pytest

* fix setup.py
pass the pytest

* add args "render"

* change the tensorboard writter

* change the tensorboard writter

* change device, render, tensorboard log location

* change device, render, tensorboard log location

* remove some wrong local files

* fix some tab mistakes and the envs name in continuous/test_xx.py

* add examples and point robot maze environment

* fix some bugs during testing examples

* add dqn network and fix some args

* change back the tensorboard writter's frequency to ensure ppo and a2c can write things normally

* add a warning to collector

* rm some unrelated files

* reformat

* fix a bug in test_dqn due to the model wrong selection
2020-03-28 07:27:18 +08:00
sproblvem
acb93502cf
Update README.md
change "Framework" to "Task"
2020-03-27 16:52:07 +08:00
Trinkle23897
044aae4355 add baseline and rlpyt result 2020-03-27 16:24:07 +08:00
Trinkle23897
44f911bc31 add pytorch drl result 2020-03-27 09:04:29 +08:00
Trinkle23897
519f9f20d0 update readme 2020-03-26 17:32:51 +08:00
Trinkle23897
c505cd8205 update readme 2020-03-26 11:42:34 +08:00
Minghao Zhang
3c0a09fefd
minor reformat (#2)
* update atari.py

* fix setup.py
pass the pytest

* fix setup.py
pass the pytest
2020-03-26 09:01:20 +08:00
Trinkle23897
fdc969b830 fix collector 2020-03-25 14:08:28 +08:00
Trinkle23897
e95218e295 sac 2020-03-23 17:17:41 +08:00
Trinkle23897
30a0fc079c td3 2020-03-23 11:34:52 +08:00
Trinkle23897
a87563b8e6 add demo of ppo continuous action task 2020-03-21 17:04:42 +08:00
Trinkle23897
c173f7bfbc fix ddpg 2020-03-21 15:31:31 +08:00
Trinkle23897
8bd8246b16 refract test code 2020-03-21 10:58:01 +08:00
Trinkle23897
d64d78d769 seed??? 2020-03-20 21:51:09 +08:00
Trinkle23897
75364cd986 ppo and early stop 2020-03-20 19:52:29 +08:00
Trinkle23897
c87fe3c18c add trainer 2020-03-19 17:23:46 +08:00
Trinkle23897
9c5417dd51 change env to vecenv for higher code coverage rate 2020-03-18 21:56:03 +08:00
Trinkle23897
64bab0b6a0 ddpg 2020-03-18 21:45:41 +08:00
Trinkle23897
6e563fe61a a2c 2020-03-17 20:22:37 +08:00
Trinkle23897
fd621971e5 fix bug in test 2020-03-17 15:16:30 +08:00
Trinkle23897
39de63592f finish pg 2020-03-17 11:37:31 +08:00
Trinkle23897
8b0b970c9b add speed stat 2020-03-16 15:04:58 +08:00
Trinkle23897
cef5de8b83 fix some bugs 2020-03-16 11:11:29 +08:00
Trinkle23897
5983c6b33d finish dqn 2020-03-15 17:41:00 +08:00
Trinkle23897
c804662457 add cache buf in collector 2020-03-14 21:48:31 +08:00
Trinkle23897
543e57cdbd clear 2020-03-13 21:47:17 +08:00
Trinkle23897
f16e05c0e7 maybe finished collector? 2020-03-13 17:49:22 +08:00
Trinkle23897
f58c1397c6 half of collector 2020-03-12 22:20:33 +08:00
Trinkle23897
4a1a7dd670 fix a bug 2020-03-11 18:02:19 +08:00
Trinkle23897
6632e47b9d add test_buffer 2020-03-11 17:28:51 +08:00
Trinkle23897
04557fdb82 env test \ ray 2020-03-11 16:14:53 +08:00