Tianshou

Author	SHA1	Message	Date
ChenDRAG	243ab43b3c	support observation normalization in BaseVectorEnv (#308): add RunningMeanStd	2021-03-11 20:50:20 +08:00
ChenDRAG	9b61bc620c	add logger (#295): This PR refactors the logging method to fix the bugs with NaN reward and log interval. After these two PRs, the fundamental changes to tianshou/data are hopefully finished, so we can finally concentrate on building benchmarks for tianshou. Things changed: 1. trainer now accepts logger (BasicLogger or LazyLogger) instead of writer; 2. remove utils.SummaryWriter.	2021-02-24 14:48:42 +08:00
rocknamx	c97aa4065e	add singleton pattern version of summary_writter (#230). Co-authored-by: Trinkle23897 <trinkle23897@gmail.com>	2020-10-31 16:38:54 +08:00
n+e	c91def6cbc	code format and update function signatures (#213): Cherry-pick from #200: update the function signature; format code-style; move _compile into separate functions; fix a bug in to_torch and to_numpy (Batch); remove None in action_range. In short, the code-format only contains function-signature style and `'` -> `"` (picked up from [black](https://github.com/psf/black)).	2020-09-12 15:39:01 +08:00
Trinkle23897	34f714a677	Numba acceleration (#193): Training FPS improvement (base commit is 94bfb32): test_pdqn: 1660 (without numba) -> 1930; discrete/test_ppo: 5100 -> 5170. Since nstep has little impact on overall performance, the unit test result is: GAE: 4.1s -> 0.057s; nstep: 0.3s -> 0.15s (little improvement). Others: fix a bug in ttt set_eps; keep only sumtree in segment tree implementation; dirty fix for asyncVenv check_id test.	2020-09-02 13:03:32 +08:00
Trinkle23897	75364cd986	ppo and early stop	2020-03-20 19:52:29 +08:00
Trinkle23897	f58c1397c6	half of collector	2020-03-12 22:20:33 +08:00
Trinkle23897	7533e5b0ac	add first test	2020-03-11 10:56:38 +08:00
Trinkle23897	0dfb900e29	env and data	2020-03-11 09:09:56 +08:00

9 Commits
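
Commit 243ab43b3c mentions observation normalization backed by a RunningMeanStd. The listing does not show that code, so the following is only a minimal pure-Python sketch of the general technique (merging batch statistics into a running mean and variance), not Tianshou's implementation. The class layout, the `dim` and `eps` parameters, and the `normalize` method are assumptions made for illustration.

```python
import math
from typing import List, Sequence


class RunningMeanStd:
    """Illustrative running mean/variance tracker (hypothetical API, not Tianshou's)."""

    def __init__(self, dim: int, eps: float = 1e-8) -> None:
        self.mean: List[float] = [0.0] * dim
        self.var: List[float] = [1.0] * dim  # placeholder until the first update
        self.count = 0
        self.eps = eps  # assumed small constant to avoid division by zero

    def update(self, batch: Sequence[Sequence[float]]) -> None:
        """Merge a batch of observations (rows) into the running statistics."""
        n = len(batch)
        if n == 0:
            return
        total = self.count + n
        for d in range(len(self.mean)):
            column = [row[d] for row in batch]
            batch_mean = sum(column) / n
            batch_var = sum((x - batch_mean) ** 2 for x in column) / n
            delta = batch_mean - self.mean[d]
            # Combine the two sums of squared deviations (parallel variance merge).
            m2 = (
                self.var[d] * self.count
                + batch_var * n
                + delta ** 2 * self.count * n / total
            )
            self.mean[d] += delta * n / total
            self.var[d] = m2 / total
        self.count = total

    def normalize(self, obs: Sequence[float]) -> List[float]:
        """Shift and scale one observation using the current statistics."""
        return [
            (x - m) / math.sqrt(v + self.eps)
            for x, m, v in zip(obs, self.mean, self.var)
        ]


rms = RunningMeanStd(dim=2)
rms.update([[1.0, 10.0], [3.0, 30.0]])
rms.update([[5.0, 50.0]])
normalized = rms.normalize([3.0, 30.0])
```

Updating in batches rather than per sample fits a vectorized environment, where each step yields one observation per sub-environment; the variance here is the population variance over all observations seen so far.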