| Author | Commit | Message | Date |
|---|---|---|---|
| NM512 | d692b377ec | memory saving at evaluation | 2023-05-05 01:32:08 +09:00 |
| NM512 | 0eb66997fb | learnable initial state options for RSSM | 2023-04-29 07:54:03 +09:00 |
| NM512 | 1328ff1088 | sampling from the replay buffer across episodes | 2023-04-29 07:43:02 +09:00 |
| NM512 | 432a359bcf | put running episode into replay buffer | 2023-04-24 06:25:17 +09:00 |
| NM512 | 628b856c63 | changed the discount head to predict terminal | 2023-04-22 09:34:23 +09:00 |
| Aditya | 52782d31e3 | terminal value is along the sequence dim | 2023-04-20 19:05:18 -07:00 |
| NM512 | 1e070a3daf | cleaned up envs | 2023-04-15 23:16:43 +09:00 |
| NM512 | 55ed69bdf7 | fix bug when using envs > 1 | 2023-04-15 15:25:25 +09:00 |
| NM512 | cd935b7dd9 | set default replay buffer size as 1M | 2023-04-05 21:38:51 +09:00 |
| NM512 | 57ac1c11d3 | replaced all tf function to torch | 2023-04-03 08:06:34 +09:00 |
| NM512 | 8bd69bfcd4 | bug fix when using multiple environments | 2023-04-03 08:00:16 +09:00 |
| NM512 | 942eae10a9 | updated result, requirements and torch version | 2023-03-24 07:51:57 +09:00 |
| NM512 | 5ad0f6e9ca | clear eval episodes for saving memory | 2023-03-20 20:55:06 +09:00 |
| NM512 | 6273444394 | modified based on author's implementation | 2023-03-18 08:38:23 +09:00 |
| NM512 | fb5c21557a | Initial Commit | 2023-02-12 22:35:25 +09:00 |