51 Commits

Author SHA1 Message Date
NM512
7433d1e877 avoid ".to(device)" 2024-09-28 07:58:15 +09:00
NM512
7f66ed5333 erased unused options 2024-01-05 23:23:09 +09:00
NM512
4fe9b29ebe env seed vary between envs of parallel 2024-01-05 10:44:20 +09:00
NM512
e0487f8206 merged action head into MLP and modified configs 2024-01-05 10:26:48 +09:00
NM512
e0f2017e28 unified the place to initialize the latents 2024-01-05 10:09:13 +09:00
NM512
d3576c5a98 added save and load for optimizers 2023-09-27 09:15:37 +09:00
NM512
16635df3e4 removed scheduling function 2023-09-26 20:58:55 +09:00
NM512
7607a92d71 modified the memorymaze environment 2023-08-16 21:54:09 +09:00
NM512
606ec8af8c added the option for a deterministic run 2023-08-16 21:46:06 +09:00
NM512
02cf57b617 added option to parallize 2023-08-05 22:42:03 +09:00
NM512
8571cf656a modifications for minecraft 2023-08-05 21:13:57 +09:00
NM512
6924abdd3e eval is executed after steps in config elapsed 2023-07-26 01:00:03 +09:00
NM512
12ed21e06d applied formatter 2023-07-23 22:02:06 +09:00
NM512
afa5ab988d introduced parallel processing for envs 2023-07-23 21:58:46 +09:00
NM512
106317015d erased unused lines of code 2023-07-22 21:20:55 +09:00
NM512
d1f4d5c709 erased unnecessary wrapper 2023-07-22 21:17:26 +09:00
NM512
f07d843953 erased unnecessary reward input 2023-07-22 20:53:43 +09:00
NM512
9ca5082da3 separated cache management of episode from env 2023-07-22 19:22:41 +09:00
NM512
88514ec022 removed unnecessary imports 2023-07-02 11:52:33 +09:00
NM512
0ae6d2d1e0 step-based counting 2023-07-02 11:51:11 +09:00
NM512
036e9a8028 added minecraft environment 2023-07-02 11:29:48 +09:00
NM512
34a44916f7 modified training step display 2023-06-24 23:05:45 +09:00
NM512
edc26e42ed modified memory maze and dependencies 2023-06-18 19:42:48 +09:00
NM512
e3329b35e5 applied formatter 2023-06-18 16:57:05 +09:00
NM512
2a8b2e84e0 Merge branch 'main' into memmaze 2023-06-18 16:27:05 +09:00
zdx
8e005afde5 mem maze env ok 1.2 2023-06-18 09:16:32 +08:00
zdx
152415f32e mem maze env ok 1.1 2023-06-17 23:59:05 +08:00
张德祥
ea446adaf4 mem maze env ok 1 2023-06-17 23:29:53 +08:00
NM512
5dce8cf13b added benchmark task Crafter 2023-06-18 00:02:22 +09:00
NM512
f7c505579c erased unnecessary lines 2023-06-17 15:27:09 +09:00
张德祥
b9120a7440 env v0.12 2023-06-13 21:39:04 +08:00
张德祥
5038a91aad env v0.11 2023-06-13 10:44:54 +08:00
张德祥
7879c6cfe7 env v01 2023-06-13 09:58:03 +08:00
ktolnos
b07badeee6 Fixes for Plan2Explore with actions and for windows. 2023-06-05 22:50:12 +03:00
NM512
02c3d45fcf modification of expl. 2023-05-21 08:17:47 +09:00
NM512
b984e69b6e added state input capability 2023-05-14 23:38:46 +09:00
NM512
d692b377ec memory saving at evaluation 2023-05-05 01:32:08 +09:00
NM512
0eb66997fb learnable initial state options for RSSM 2023-04-29 07:54:03 +09:00
NM512
1328ff1088 sampling from the replay buffer across episodes 2023-04-29 07:43:02 +09:00
NM512
432a359bcf put running episode into replay buffer 2023-04-24 06:25:17 +09:00
NM512
628b856c63 changed the discount head to predict terminal 2023-04-22 09:34:23 +09:00
Aditya
52782d31e3 terminal value is along the sequence dim 2023-04-20 19:05:18 -07:00
NM512
1e070a3daf cleaned up envs 2023-04-15 23:16:43 +09:00
NM512
55ed69bdf7 fix bug when using envs > 1 2023-04-15 15:25:25 +09:00
NM512
cd935b7dd9 set default replay buffer size as 1M 2023-04-05 21:38:51 +09:00
NM512
57ac1c11d3 replaced all tf function to torch 2023-04-03 08:06:34 +09:00
NM512
8bd69bfcd4 bug fix when using multiple environments 2023-04-03 08:00:16 +09:00
NM512
942eae10a9 updated result, requirements and torch version 2023-03-24 07:51:57 +09:00
NM512
5ad0f6e9ca clear eval episodes for saving memory 2023-03-20 20:55:06 +09:00
NM512
6273444394 modified based on author's implementation 2023-03-18 08:38:23 +09:00