17 Commits

Author SHA1 Message Date
n+e
94bfb32cc1
optimize training procedure and improve code coverage (#189)
1. call policy.eval() in the "watch performance" part of all test scripts
2. remove dict return support for collector preprocess_fn
3. add `__contains__` and `pop` in batch: `key in batch`, `batch.pop(key, deft)`
4. collect exactly n_episode episodes when n_episode is a list of per-env limits, and save fake data in cache_buffer when self.buffer is None (#184)
5. fix tensorboard logging: the horizontal axis now stands for env step instead of gradient step; add test results to tensorboard
6. add test_returns (both GAE and nstep)
7. change the type-checking order in batch.py and converter.py so that the most common case is checked first
8. fix shape inconsistency for torch.Tensor in replay buffer
9. remove `**kwargs` in ReplayBuffer
10. remove default value in batch.split() and add merge_last argument (#185)
11. improve nstep efficiency
12. add max_batchsize in onpolicy algorithms
13. potential bugfix for subproc.wait
14. fix RecurrentActorProb
15. improve the code-coverage (from 90% to 95%) and remove the dead code
16. fix some incorrect type annotations

The above improvements also increase the training FPS: on my computer, the previous version reaches only ~1800 FPS, while this version reaches ~2050 (faster than v0.2.4.post1).
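
Items 3 and 10 above can be illustrated with a minimal sketch. `MiniBatch` is a hypothetical dict-backed stand-in, not the library's `Batch` class, and it assumes that `merge_last` means folding a short final chunk into the previous one; the real implementation may differ.

```python
from typing import Any, Dict, Iterator, List


class MiniBatch:
    """Hypothetical dict-backed container (illustration only)."""

    _MISSING = object()  # sentinel so that None can be a valid default

    def __init__(self, **kwargs: List[Any]) -> None:
        self._data: Dict[str, List[Any]] = dict(kwargs)

    def __contains__(self, key: str) -> bool:
        # Enables `key in batch`.
        return key in self._data

    def pop(self, key: str, default: Any = _MISSING) -> Any:
        # Mirrors dict.pop: return the default if given, else raise KeyError.
        if key in self._data:
            return self._data.pop(key)
        if default is MiniBatch._MISSING:
            raise KeyError(key)
        return default

    def __len__(self) -> int:
        return min((len(v) for v in self._data.values()), default=0)

    def split(self, size: int, merge_last: bool = False) -> Iterator["MiniBatch"]:
        # `size` has no default value, so callers must choose it explicitly.
        n = len(self)
        starts = list(range(0, n, size))
        # Assumed semantics: drop the last start index so the short tail
        # is merged into the previous chunk.
        if merge_last and len(starts) > 1 and n - starts[-1] < size:
            starts.pop()
        for i, start in enumerate(starts):
            end = n if i == len(starts) - 1 else start + size
            yield MiniBatch(**{k: v[start:end] for k, v in self._data.items()})
```

With ten samples and `size=4`, this sketch yields chunks of 4, 4 and 2 by default, and chunks of 4 and 6 when `merge_last=True`.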
2020-08-27 12:15:18 +08:00
youkaichao
a9f9940d17
code refactor for venv (#179)
- Refactor to remove duplicate code

- Enable async simulation for all vector envs

- Remove `collector.close` and rename `VectorEnv` to `DummyVectorEnv`

The abstraction of vector env changed.

Prior to this PR, each vector env was almost independent.

After this PR, each env is wrapped into a worker, and vector envs differ only in their worker type. In fact, users can just use `BaseVectorEnv` with different workers; I keep `SubprocVectorEnv` and `ShmemVectorEnv` for backward compatibility.
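
The worker-based abstraction described above can be sketched roughly as follows. All names here (`EnvWorker`, `InProcessWorker`, `SketchVectorEnv`, `CounterEnv`) are hypothetical and only illustrate the idea; they are not the classes or signatures introduced by this PR.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, List


class EnvWorker(ABC):
    """Hypothetical interface: one worker wraps exactly one environment."""

    @abstractmethod
    def reset(self) -> Any: ...

    @abstractmethod
    def step(self, action: Any) -> Any: ...


class InProcessWorker(EnvWorker):
    """Runs the wrapped env in the current process."""

    def __init__(self, env_fn: Callable[[], Any]) -> None:
        self.env = env_fn()

    def reset(self) -> Any:
        return self.env.reset()

    def step(self, action: Any) -> Any:
        return self.env.step(action)


class SketchVectorEnv:
    """The vector env only dispatches; behavior depends on the worker type."""

    def __init__(
        self,
        env_fns: List[Callable[[], Any]],
        worker_fn: Callable[[Callable[[], Any]], EnvWorker],
    ) -> None:
        self.workers = [worker_fn(fn) for fn in env_fns]

    def reset(self) -> List[Any]:
        return [w.reset() for w in self.workers]

    def step(self, actions: List[Any]) -> List[Any]:
        return [w.step(a) for w, a in zip(self.workers, actions)]


class CounterEnv:
    """Toy env for the example: the observation is the running sum of actions."""

    def reset(self) -> int:
        self.total = 0
        return self.total

    def step(self, action: int) -> int:
        self.total += action
        return self.total
```

Under this design, a subprocess or shared-memory variant would only need another `EnvWorker` subclass, while the dispatching code in the vector env stays the same.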

Co-authored-by: n+e <463003665@qq.com>
Co-authored-by: magicly <magicly007@gmail.com>
2020-08-19 15:00:24 +08:00
n+e
bd9c3c7f8d
docs fix and v0.2.5 (#156)
* pre

* update docs

* update docs

* $ in bash

* size -> hidden_layer_size

* doctest

* doctest again

* filter a warning

* fix bug

* fix examples

* test fail

* test succ
2020-07-22 14:42:08 +08:00
n+e
47e8e2686c
move atari wrapper to examples and publish v0.2.4 (#124)
* move atari wrapper to examples

* consistency

* change the DRQN seed since training is quite unstable with the current seed

* minor fix

* 0.2.4
2020-07-10 17:20:39 +08:00
Trinkle23897
5f2c5347df v0.2.3 2020-06-01 09:37:30 +08:00
Trinkle23897
6b96f124ae fix pdqn 2020-04-26 15:11:20 +08:00
Trinkle23897
d9d2763dad first version with full documentation 2020-04-07 11:50:34 +08:00
ShenDezhou
4da857d86e
Fix Windows env setup bugs and other typos. (#11) 2020-03-31 17:22:32 +08:00
Trinkle23897
d9e4b9d16f upd doc 2020-03-29 10:22:03 +08:00
Trinkle23897
044aae4355 add baseline and rlpyt result 2020-03-27 16:24:07 +08:00
Minghao Zhang
3c0a09fefd
minor reformat (#2)
* update atari.py

* fix setup.py
pass the pytest

* fix setup.py
pass the pytest
2020-03-26 09:01:20 +08:00
Trinkle23897
c87fe3c18c add trainer 2020-03-19 17:23:46 +08:00
Trinkle23897
64bab0b6a0 ddpg 2020-03-18 21:45:41 +08:00
Trinkle23897
543e57cdbd clear 2020-03-13 21:47:17 +08:00
Trinkle23897
f16e05c0e7 maybe finished collector? 2020-03-13 17:49:22 +08:00
Trinkle23897
0dfb900e29 env and data 2020-03-11 09:09:56 +08:00
Trinkle23897
0c944eab68 init 2020-03-09 11:38:04 +08:00