15 Commits

Author  SHA1        Message                                          Date
NM512   d692b377ec  memory saving at evaluation                      2023-05-05 01:32:08 +09:00
NM512   0eb66997fb  learnable initial state options for RSSM         2023-04-29 07:54:03 +09:00
NM512   1328ff1088  sampling from the replay buffer across episodes  2023-04-29 07:43:02 +09:00
NM512   432a359bcf  put running episode into replay buffer           2023-04-24 06:25:17 +09:00
NM512   628b856c63  changed the discount head to predict terminal    2023-04-22 09:34:23 +09:00
Aditya  52782d31e3  terminal value is along the sequence dim         2023-04-20 19:05:18 -07:00
NM512   1e070a3daf  cleaned up envs                                  2023-04-15 23:16:43 +09:00
NM512   55ed69bdf7  fix bug when using envs > 1                      2023-04-15 15:25:25 +09:00
NM512   cd935b7dd9  set default replay buffer size as 1M             2023-04-05 21:38:51 +09:00
NM512   57ac1c11d3  replaced all tf function to torch                2023-04-03 08:06:34 +09:00
NM512   8bd69bfcd4  bug fix when using multiple environments         2023-04-03 08:00:16 +09:00
NM512   942eae10a9  updated result, requirements and torch version   2023-03-24 07:51:57 +09:00
NM512   5ad0f6e9ca  clear eval episodes for saving memory            2023-03-20 20:55:06 +09:00
NM512   6273444394  modified based on author's implementation        2023-03-18 08:38:23 +09:00
NM512   fb5c21557a  Initial Commit                                   2023-02-12 22:35:25 +09:00