youkaichao
32df0567bb
use nn.Sequential in DQN (#176)
2020-08-02 15:14:44 +08:00
yingchengyang
99a1d40e85
Dueling DQN (#170)
Co-authored-by: n+e <463003665@qq.com>
2020-07-29 19:44:42 +08:00
n+e
38a95c19da
Yet another 3 fix (#160)
1. DQN learn should keep eps=0
2. Add a warning of env.seed in VecEnv
3. fix #162 of multi-dim action
2020-07-24 17:38:12 +08:00
n+e
bd9c3c7f8d
docs fix and v0.2.5 (#156)
* pre
* update docs
* update docs
* $ in bash
* size -> hidden_layer_size
* doctest
* doctest again
* filter a warning
* fix bug
* fix examples
* test fail
* test succ
2020-07-22 14:42:08 +08:00
youkaichao
e767de044b
Remove dummy net code (#123)
* remove dummy net; delete two files
* split code to have backbone and head
* rename class
* change torch.float to torch.float32
* use flatten(1) instead of view(batch, -1)
* remove dummy net in docs
* bugfix for rnn
* fix cuda error
* minor fix of docs
* do not change the example code in dqn tutorial, since it is for demonstration
Co-authored-by: Trinkle23897 <463003665@qq.com>
2020-07-09 22:57:01 +08:00
Trinkle23897
ff81a18f42
compute_nstep_returns (item 2 of #51)
2020-06-02 22:29:50 +08:00
Trinkle23897
0eef0ca198
fix optional type syntax
2020-05-16 20:08:32 +08:00
Trinkle23897
9b26137cd2
add type annotation
2020-05-12 11:31:47 +08:00
Trinkle23897
8812eaa502
fix #36
2020-04-23 22:06:18 +08:00
Trinkle23897
6bf1ea644d
fix ppo
2020-04-19 14:30:42 +08:00
Trinkle23897
e0809ff135
add policy docs (#21)
2020-04-06 19:36:59 +08:00
Trinkle23897
610390c132
add docs of collector and trainer (#20)
2020-04-05 18:34:45 +08:00
Trinkle23897
b6c9db6b0b
docs for env
2020-04-04 21:02:06 +08:00
Trinkle23897
974ade8019
add some docs
2020-04-03 21:28:12 +08:00
Trinkle23897
75364cd986
ppo and early stop
2020-03-20 19:52:29 +08:00
Trinkle23897
39de63592f
finish pg
2020-03-17 11:37:31 +08:00
Trinkle23897
5983c6b33d
finish dqn
2020-03-15 17:41:00 +08:00
Trinkle23897
c804662457
add cache buf in collector
2020-03-14 21:48:31 +08:00
Trinkle23897
f58c1397c6
half of collector
2020-03-12 22:20:33 +08:00
Trinkle23897
7533e5b0ac
add first test
2020-03-11 10:56:38 +08:00
Trinkle23897
5550aed0a1
flake8 fix
2020-03-11 09:38:14 +08:00
Trinkle23897
0dfb900e29
env and data
2020-03-11 09:09:56 +08:00