509 Commits

Author SHA1 Message Date
youkaichao
affeec13de Improve Batch (#128)
* minor polish

* improve and implement Batch.cat_

* bugfix for buffer.sample with field impt_weight

* restore the usage of a.cat_(b)

* fix 2 bugs in batch and add corresponding unittest

* code fix for update

* update is_empty to recognize empty over empty; bugfix for len

* bugfix for update and add testcase

* add testcase of update

* fix docs

* fix docs

* fix docs [ci skip]

* fix docs [ci skip]

Co-authored-by: Trinkle23897 <463003665@qq.com>
2020-07-13 17:33:01 +08:00
youkaichao
2564e989fb Improve Batch (#126)
* make sure the key type of Batch is string, and add unit tests

* add is_empty() function and unit tests

* enable cat of mixed dict and Batch inputs, just like stack
2020-07-13 17:33:01 +08:00
n+e
47e8e2686c
move atari wrapper to examples and publish v0.2.4 (#124)
* move atari wrapper to examples

* consistency

* change drqn seed since it is quite unstable in current seed

* minor fix

* 0.2.4
v0.2.4
2020-07-10 17:20:39 +08:00
youkaichao
ff99662fe6
bugfix for update with empty buffer; remove duplicate variable _weight_sum in PrioritizedReplayBuffer (#120)
* bugfix for update with empty buffer; remove duplicate variable _weight_sum in PrioritizedReplayBuffer

* point out that ListReplayBuffer cannot be sampled

* remove useless _amortization_counter variable
2020-07-10 08:24:11 +08:00
youkaichao
e767de044b
Remove dummy net code (#123)
* remove dummy net; delete two files

* split code to have backbone and head

* rename class

* change torch.float to torch.float32

* use flatten(1) instead of view(batch, -1)

* remove dummy net in docs

* bugfix for rnn

* fix cuda error

* minor fix of docs

* do not change the example code in dqn tutorial, since it is for demonstration

Co-authored-by: Trinkle23897 <463003665@qq.com>
2020-07-09 22:57:01 +08:00
Alexis DUBURCQ
aa3c453f42
Raise exception for Batch __getitem__. (#119)
* Raise exception for Batch __getitem__.

* Try fixing access to reserved key.

* Simpler patch.

* Add unit test to check indexing empty Batch raises an exception.

Co-authored-by: Alexis Duburcq <alexis.duburcq@wandercraft.eu>
2020-07-08 22:29:37 +08:00
youkaichao
7f9a1f1328
add type check for each element rather than the first element (#112)
This PR does the following:
- improvement: dramatically reduce the number of calls to _is_batch_set
- bugfix: list(Batch()) fails; Batch(a=[torch.ones(3), torch.ones(3)]) fails;
- misc: add type check for each element rather than the first element; add test case; _create_value with torch.Tensor does not have np.object type;
2020-07-08 21:00:00 +08:00
youkaichao
481015932c
bugfix for hang in list(Batch()) (#117) 2020-07-08 17:09:27 +08:00
youkaichao
f5e007932f
fix Batch init for types other than number and bool (#115)
* fix Batch init for types other than number and bool

* change doc to include bool type

* use type check

* Batch type check complete
2020-07-08 13:45:29 +08:00
youkaichao
dbbb859ec5
doc fix (#113)
* doc fix

* change line

Co-authored-by: Trinkle23897 <463003665@qq.com>
2020-07-08 08:30:01 +08:00
youkaichao
9c7d31e5d6
bugfix for empty_ (#114)
* bugfix for empty_

* use v.__class__(0) for scalar
2020-07-08 08:10:34 +08:00
Alexis DUBURCQ
69caf89908
Fix to_torch converters (#111)
* Fix to_torch converters.

* to_torch now converts any Torch Tensor-compatible object.

* Fix linter.

* Fix Batch to_torch to convert any Torch Tensor-compatible data.

Co-authored-by: Alexis Duburcq <alexis.duburcq@wandercraft.eu>
2020-07-07 18:40:55 +08:00
youkaichao
8913bf36b1
change Batch.empty to in-place fill; add copy option for Batch construction (#110)
* in-place empty_ for Batch

* change Batch.empty to in-place fill; add copy option for Batch construction

* type signature & remove shadow names for copy

* add doc for data type (only support numbers and object data type)

* add unit test for Batch copy

* fix pep8

* add test case for Batch.empty

* doc fix

* fix pep8

* use object to test Batch

* test commit

* refactor

* change Batch(copy) testcase

* minor fix

Co-authored-by: Trinkle23897 <463003665@qq.com>
2020-07-06 20:30:15 +08:00
youkaichao
5b1373924e
doc fix; policy train/eval signature fix (#109)
* doc fix; policy train/eval signature fix

* change train/eval behavior according to pytorch

* change train/eval behavior according to pytorch
2020-07-06 10:44:34 +08:00
n+e
db0e2e5cd2
Advanced Batch slicing & minor fix of RNN support (#106)
* add shape property and modify __getitem__

* change Batch.size to Batch.shape

* setattr

* Batch.empty

* remove scalar in advanced slicing

* modify empty_ and __getitem__

* missing testcase

* fix empty
2020-06-30 18:02:44 +08:00
Trinkle23897
c639446c66 add copybutton 2020-06-29 13:45:01 +08:00
Trinkle23897
e0f4862d01 store RNN hidden states in policy._state and add sample_avail in buffer (#19) 2020-06-29 12:18:52 +08:00
danagi
60cfc373f8
fix #98, support #99 (#102)
* Add auto alpha tuning and exploration noise for sac.
Add class BaseNoise and GaussianNoise for the concept of exploration noise.
Add new test for sac tested in MountainCarContinuous-v0,
which should benefit from the two new features above.

* add exploration noise to collector, fix example to adapt modification

* fix #98

* enable off-policy to update multiple times in one step. (#99)
2020-06-27 21:40:09 +08:00
Alexis DUBURCQ
a951a32487
Enable partial stacking at Batch level (#100)
* Enable stacking of partially matching Batch instances.

* Fix list support for getitem.

* Fix Batch 'size' method.

* Update Batch documentation.
2020-06-27 09:06:40 +08:00
Alexis DUBURCQ
70aa7bf93e
Use lower-level API to reduce overhead. (#97)
* Use lower-level API to reduce overhead.

* Further improvements.

* Buffer _add_to_buffer improvement.

* Do not use _data field to store Batch data to avoid overhead. Add back _meta field in Buffer.

* Restore metadata attribute to store batch in Buffer.

* Move out nested methods.

* Use try/catch instead of an actual check for efficiency.

* Remove unused branches for efficiency.

* Use np.array over list when possible for efficiency.

* Final performance improvement.

* Add unit tests for Batch size method.

* Add missing stack unit tests.

* Enforce Buffer initialization to zero.

Co-authored-by: Alexis Duburcq <alexis.duburcq@wandercraft.eu>
2020-06-26 18:37:50 +08:00
Alexis DUBURCQ
5ac9f9b144
Do not check bounds since it is always valid when everything is fine. (#95) 2020-06-25 21:06:35 +08:00
Alexis DUBURCQ
3086b5c31d
Buffer refactoring to support batch over batch reliably (#93)
* Fix support of batch over batch for Buffer.

* Do not use internal __dict__ attribute to store batch data since it breaks inheritance.

* Various fixes.

* Improve robustness of Batch/Buffer by avoiding direct attribute assignment. Buffer refactoring.

* Add axis optional argument to Batch stack method.

* Add item assignment to Batch class.

* Fix list support for Buffer.

* Convert list to np.array by default for efficiency.

* Add missing unit test for Batch. Fix unit tests.

* Batch item assignment is now robust to key order.

* Do not use getattr/setattr explicitly for simplicity.

* More flexible __setitem__.

* Fixes

* Remove broadcasting at Batch level since it is unreliable.

* Forbid item assignment for inconsistent batches.

* Implement broadcasting at Buffer level.

* Add more unit test for Batch item assignment.

Co-authored-by: Alexis Duburcq <alexis.duburcq@wandercraft.eu>
2020-06-25 20:39:30 +08:00
rocknamx
506cc97ba5
fix #91 (#94) 2020-06-25 07:02:59 +08:00
Alexis DUBURCQ
49f43e9f1f
Fix Batch to numpy compatibility (#92)
* Fix Batch to numpy compatibility.

* Fix Batch unit tests.

* Fix linter

* Add Batch shape method.

* Remove shape and add size. Allow reserving keys using an empty batch/list.

* Fix linter and unit tests.

* Batch init using list of Batch.

* Add unit tests.

* Fix Batch __len__.

* Fix unit tests.

* Fix slicing

* Add missing slicing unit tests.

Co-authored-by: Alexis Duburcq <alexis.duburcq@wandercraft.eu>
2020-06-24 21:43:48 +08:00
Alexis DUBURCQ
ebc551a25e
Fix support of 0-dim numpy array (#89)
* Fix support of 0-dim numpy array.

* Do not raise exception if Batch index does not make sense since it breaks existing code.

Co-authored-by: Alexis Duburcq <alexis.duburcq@wandercraft.eu>
2020-06-24 06:55:24 +08:00
Alexis DUBURCQ
d7dd3105bc
Fix tuple support. (#88) 2020-06-23 23:37:26 +08:00
Alexis DUBURCQ
ec270759ab
Batch refactoring (#87)
* Enable stacking of Batch instances. Add Batch cat static method. Rename cat to cat_ since it is in-place.

* Properly handle Batch init using np.array of dict.

* WIP

* Get rid of metadata.

* Update UT. Replace cat by cat_ everywhere.

* Do not sort Batch keys anymore for efficiency. Add items method.

* Fix cat copy issue.

* Add unit test to check cat and stack methods.

* Remove unused import.

* Fix linter issues.

* Fix unit tests.

Co-authored-by: Alexis Duburcq <alexis.duburcq@wandercraft.eu>
2020-06-23 22:50:59 +08:00
danagi
13828f6309
added noise param to collector for test phase, fixed examples to adapt modification (#86)
* Add auto alpha tuning and exploration noise for sac.
Add class BaseNoise and GaussianNoise for the concept of exploration noise.
Add new test for sac tested in MountainCarContinuous-v0,
which should benefit from the two new features above.

* add exploration noise to collector, fix example to adapt modification
2020-06-23 07:20:51 +08:00
Trinkle23897
e8b44bbaf4 move sac_mcc to examples (runtime too long) 2020-06-22 21:39:00 +08:00
Trinkle23897
6a2963bd64 fix #85 2020-06-22 17:11:26 +08:00
Trinkle23897
a655334d00 change batch.append to batch.cat 2020-06-20 22:23:12 +08:00
Trinkle23897
aff0f9aee0 fix append batch over batch 2020-06-20 22:03:22 +08:00
youkaichao
268f9d0533
type signature correction (#83) 2020-06-20 09:57:16 +08:00
Trinkle23897
81e4a16ef2 fix a bug in re-index replay buffer (fix #82) 2020-06-17 16:37:51 +08:00
danagi
c59ad40aef
Add auto alpha tuning and exploration noise for sac. (#80)
Add class BaseNoise and GaussianNoise for the concept of exploration noise.
Add new test for sac tested in MountainCarContinuous-v0,
which should benefit from the two new features above.
2020-06-16 22:17:28 +08:00
Trinkle23897
263e490b76 fix #79 2020-06-16 16:54:16 +08:00
Trinkle23897
5f2f05a570 fix #40 2020-06-13 17:06:08 +08:00
Trinkle23897
3774258cc7 fix unittest 2020-06-11 09:07:45 +08:00
Trinkle23897
1a914336f7 add random action in collector (fix #78) 2020-06-11 08:57:37 +08:00
Trinkle23897
397e92b0fc fix #77 2020-06-10 12:06:56 +08:00
Trinkle23897
f1951780ab fix a bug of storing batch over batch data into buffer 2020-06-09 18:46:14 +08:00
Trinkle23897
b32b96cd3e separate flake8 lint 2020-06-09 10:33:48 +08:00
Trinkle23897
513573ea82 add link 2020-06-08 22:20:52 +08:00
Trinkle23897
560116d0b2 cheat sheet 2020-06-08 21:53:00 +08:00
Alexis DUBURCQ
52be533d06
Enable getattr for SubprocVecEnv. (#74)
* Enable getattr for SubprocVecEnv.

* Consistent API between VectorEnv and SubprocVecEnv.

* Avoid code duplication. Add unit tests.

* Add docstring.

* Test more branches.

* Fix UT.

Co-authored-by: Alexis Duburcq <alexis.duburcq@wandercraft.eu>
2020-06-05 17:17:43 +08:00
Alexis DUBURCQ
66be5641b6
Fix to_numpy. (#73)
Co-authored-by: Alexis Duburcq <alexis.duburcq@wandercraft.eu>
2020-06-04 22:32:05 +08:00
Trinkle23897
7bf202f195 polish docs 2020-06-03 17:04:26 +08:00
Trinkle23897
dc451dfe88 nstep all (fix #51) 2020-06-03 13:59:47 +08:00
Trinkle23897
ff81a18f42 compute_nstep_returns (item 2 of #51) 2020-06-02 22:29:50 +08:00
Trinkle23897
f818a2467b zh_CN docs 2020-06-02 08:51:14 +08:00