34 Commits

Author SHA1 Message Date
Trinkle23897
6bf1ea644d fix ppo 2020-04-19 14:30:42 +08:00
Trinkle23897
680fc0ffbe gae 2020-04-14 21:11:06 +08:00
Trinkle23897
7b65d43394 vanilla imitation learning 2020-04-13 19:37:27 +08:00
Trinkle23897
ecfcb9f295 fix docs 2020-04-10 11:16:33 +08:00
Trinkle23897
3cc22b7c0c __call__ -> forward 2020-04-10 10:47:16 +08:00
Trinkle23897
13086b7f64 add ignore_obs_next in buffer 2020-04-10 09:01:17 +08:00
Trinkle23897
19f2cce294 seealso and change policy dir structure 2020-04-09 21:36:53 +08:00
Trinkle23897
6da80e045a fix rnn (#19), add __repr__, and fix #26 2020-04-09 19:53:45 +08:00
Trinkle23897
86572c66d4 maybe finished rnn? 2020-04-08 21:13:15 +08:00
Trinkle23897
e0809ff135 add policy docs (#21) 2020-04-06 19:36:59 +08:00
Trinkle23897
610390c132 add docs of collector and trainer (#20) 2020-04-05 18:34:45 +08:00
Oblivion
4d4d0daf9e
Performance improve (#18)
* improve performance

set one thread for NN
replace detach() op with torch.no_grad()

* fix pep 8 errors
2020-04-05 09:10:21 +08:00
Trinkle23897
974ade8019 add some docs 2020-04-03 21:28:12 +08:00
Trinkle23897
c42990c725 add rllib result and fix pep8 2020-03-28 09:43:35 +08:00
Minghao Zhang
77068af526
add examples, fix some bugs (#5)
* update atari.py

* fix setup.py
pass the pytest

* fix setup.py
pass the pytest

* add args "render"

* change the tensorboard writer

* change the tensorboard writer

* change device, render, tensorboard log location

* change device, render, tensorboard log location

* remove some wrong local files

* fix some tab mistakes and the envs name in continuous/test_xx.py

* add examples and point robot maze environment

* fix some bugs during testing examples

* add dqn network and fix some args

* change back the tensorboard writer's frequency to ensure ppo and a2c can write things normally

* add a warning to collector

* rm some unrelated files

* reformat

* fix a bug in test_dqn due to wrong model selection
2020-03-28 07:27:18 +08:00
Trinkle23897
c505cd8205 update readme 2020-03-26 11:42:34 +08:00
Minghao Zhang
3c0a09fefd
minor reformat (#2)
* update atari.py

* fix setup.py
pass the pytest

* fix setup.py
pass the pytest
2020-03-26 09:01:20 +08:00
Trinkle23897
fdc969b830 fix collector 2020-03-25 14:08:28 +08:00
Trinkle23897
e95218e295 sac 2020-03-23 17:17:41 +08:00
Trinkle23897
30a0fc079c td3 2020-03-23 11:34:52 +08:00
Trinkle23897
c173f7bfbc fix ddpg 2020-03-21 15:31:31 +08:00
Trinkle23897
75364cd986 ppo and early stop 2020-03-20 19:52:29 +08:00
Trinkle23897
c87fe3c18c add trainer 2020-03-19 17:23:46 +08:00
Trinkle23897
64bab0b6a0 ddpg 2020-03-18 21:45:41 +08:00
Trinkle23897
6e563fe61a a2c 2020-03-17 20:22:37 +08:00
Trinkle23897
fd621971e5 fix bug in test 2020-03-17 15:16:30 +08:00
Trinkle23897
39de63592f finish pg 2020-03-17 11:37:31 +08:00
Trinkle23897
8b0b970c9b add speed stat 2020-03-16 15:04:58 +08:00
Trinkle23897
cef5de8b83 fix some bugs 2020-03-16 11:11:29 +08:00
Trinkle23897
5983c6b33d finish dqn 2020-03-15 17:41:00 +08:00
Trinkle23897
c804662457 add cache buf in collector 2020-03-14 21:48:31 +08:00
Trinkle23897
543e57cdbd clear 2020-03-13 21:47:17 +08:00
Trinkle23897
f16e05c0e7 maybe finished collector? 2020-03-13 17:49:22 +08:00
Trinkle23897
f58c1397c6 half of collector 2020-03-12 22:20:33 +08:00