lucidrains | 0fdb67bafa | 2025-10-07 09:37:37 -07:00
add the noising of the latent context during generation, a technique I think was from EPFL, or perhaps some Google group that built on top of the EPFL work

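A minimal sketch of what noising the latent context during generation might look like, assuming a PyTorch world model that conditions on previously generated latents; the function name and `context_noise_std` hyperparameter are hypothetical, not the repo's actual API.

```python
import torch

def noise_context_latents(context_latents, context_noise_std = 0.1):
    # add a small amount of gaussian noise to the otherwise clean context latents
    # before conditioning on them during rollout, so the model is less sensitive
    # to its own prediction errors accumulating over time
    # hypothetical sketch - `context_noise_std` is an assumed hyperparameter
    if context_noise_std <= 0.:
        return context_latents

    return context_latents + context_noise_std * torch.randn_like(context_latents)
```
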
lucidrains | 36ccb08500 | 2025-10-07 08:36:46 -07:00
allow for step_sizes to be passed in, log2 is not that intuitive

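A tiny sketch of the conversion this implies, assuming shortcut-style step sizes of the form 1 / 2^k so that the fractional step size (rather than its log2) can be passed in and mapped to an embedding index internally; the function name is hypothetical.

```python
import math

def step_size_to_index(step_size):
    # convert a step size like 1/8 into the log2 index (3) used to look up its embedding
    # assumes step sizes are of the form 1 / 2**k
    return int(round(math.log2(1. / step_size)))

assert step_size_to_index(1 / 8) == 3
```
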
lucidrains | 1176269927 | 2025-10-07 07:41:02 -07:00
correct signal levels when doing teacher forcing generation

lucidrains | 0f4783f23c | 2025-10-04 07:56:56 -07:00
use a newly built module from x-mlps for multi token prediction

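The actual module lives in x-mlps; the class below is only an assumed approximation of a multi token prediction head, where each hidden state predicts the next few tokens through separate output projections.

```python
import torch
from torch import nn

class MultiTokenPrediction(nn.Module):
    # predict the next `num_future_tokens` tokens from each hidden state,
    # each future offset getting its own linear head over the vocabulary
    def __init__(self, dim, num_tokens, num_future_tokens = 2):
        super().__init__()
        self.heads = nn.ModuleList([nn.Linear(dim, num_tokens) for _ in range(num_future_tokens)])

    def forward(self, hidden):
        # hidden: (batch, seq, dim) -> logits: (batch, seq, num_future_tokens, num_tokens)
        return torch.stack([head(hidden) for head in self.heads], dim = -2)
```
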
lucidrains | 0a26e0f92f | 2025-10-04 07:47:27 -07:00
complete the lpips loss used for the video tokenizer

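LPIPS (learned perceptual image patch similarity) compares pretrained network features of the reconstruction and target rather than raw pixels. A rough sketch below assumes torchvision's pretrained VGG16 and a single feature depth; a full LPIPS uses several layers with learned per-channel weights, so treat this as illustrative only.

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg16

class SimpleLPIPS(torch.nn.Module):
    # compare unit-normalized VGG features of prediction and target
    # (real LPIPS averages over multiple layers with learned linear weights)
    def __init__(self, depth = 16):
        super().__init__()
        self.features = vgg16(weights = 'DEFAULT').features[:depth].eval()
        for p in self.features.parameters():
            p.requires_grad_(False)

    def forward(self, pred, target):
        # pred, target: (batch, 3, height, width), imagenet-normalized
        f_pred = F.normalize(self.features(pred), dim = 1)
        f_target = F.normalize(self.features(target), dim = 1)
        return (f_pred - f_target).pow(2).mean()
```
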
lucidrains | 986bf4c529 | 2025-10-03 10:08:05 -07:00
allow for the video tokenizer to accept any spatial dimensions by parameterizing the decoder positional embedding with an MLP

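One way to parameterize a positional embedding with an MLP so it generalizes to any spatial size is to feed normalized (y, x) coordinates through a small network, producing an embedding for whatever grid is requested; the class and argument names here are hypothetical, a sketch of the idea rather than the repo's implementation.

```python
import torch
from torch import nn

class MLPPosEmb(nn.Module):
    # produce positional embeddings for an arbitrary (height, width) grid
    # by running normalized 2d coordinates through a small MLP
    def __init__(self, dim, hidden_dim = 256):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2, hidden_dim), nn.SiLU(), nn.Linear(hidden_dim, dim))

    def forward(self, height, width, device = None):
        ys = torch.linspace(-1., 1., height, device = device)
        xs = torch.linspace(-1., 1., width, device = device)
        grid = torch.stack(torch.meshgrid(ys, xs, indexing = 'ij'), dim = -1)  # (h, w, 2)
        return self.mlp(grid)                                                  # (h, w, dim)
```
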
lucidrains | 046f8927d1 | 2025-10-03 08:08:44 -07:00
complete the symexp two hot proposed by Hafner from the previous versions of Dreamer, but will also bring in hl gauss

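The symlog/symexp transforms and two-hot targets come from Hafner et al.'s Dreamer line of work: regression targets are squashed with symlog, encoded as weights over their two nearest bins, and trained with cross entropy. A minimal sketch of those pieces follows; bin construction and exact shapes are assumptions.

```python
import torch

def symlog(x):
    return torch.sign(x) * torch.log1p(x.abs())

def symexp(x):
    return torch.sign(x) * (torch.exp(x.abs()) - 1.)

def two_hot(value, bins):
    # encode a scalar target as weights over its two nearest bins, so the
    # expectation of the bin values under the encoding recovers the target
    # value: (batch,), bins: sorted 1d tensor of bin centers
    value = value.clamp(bins[0], bins[-1])
    idx_above = torch.searchsorted(bins, value).clamp(1, len(bins) - 1)
    idx_below = idx_above - 1
    below, above = bins[idx_below], bins[idx_above]
    weight_above = (value - below) / (above - below).clamp(min = 1e-8)

    encoding = torch.zeros(*value.shape, len(bins), device = value.device)
    encoding.scatter_(-1, idx_below.unsqueeze(-1), (1. - weight_above).unsqueeze(-1))
    encoding.scatter_(-1, idx_above.unsqueeze(-1), weight_above.unsqueeze(-1))
    return encoding
```

Training would then minimize cross entropy between predicted bin logits and `two_hot(symlog(target), bins)`, with predictions decoded by applying `symexp` to the expected bin value.
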
lucidrains | 8b66b703e0 | 2025-10-02 07:39:34 -07:00
add the discretized signal level + step size embeddings necessary for diffusion forcing + shortcut

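Diffusion forcing lets each token carry its own signal (noise) level, and shortcut models additionally condition on the step size; both can be discretized and looked up in embedding tables. A sketch under assumed bin counts, with hypothetical names:

```python
import torch
from torch import nn

class LevelStepEmbed(nn.Module):
    # discretize continuous signal levels in [0, 1] and log2 step size indices,
    # then sum their learned embeddings into a single conditioning vector
    def __init__(self, dim, num_signal_levels = 64, num_step_sizes = 8):
        super().__init__()
        self.num_signal_levels = num_signal_levels
        self.signal_emb = nn.Embedding(num_signal_levels, dim)
        self.step_emb = nn.Embedding(num_step_sizes, dim)

    def forward(self, signal_level, step_size_index):
        # signal_level: float tensor in [0, 1], step_size_index: long tensor of log2 indices
        level_index = (signal_level * (self.num_signal_levels - 1)).round().long()
        return self.signal_emb(level_index) + self.step_emb(step_size_index)
```
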
lucidrains | e3cbcd94c6 | 2025-10-01 10:25:56 -07:00
sketch out top down

lucidrains | 2e92c0121a | 2025-10-01 09:40:24 -07:00
they employ two stability measures, qk rmsnorm and softclamping of attention logits

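QK normalization rescales queries and keys before the similarity computation, and softclamping bounds the attention logits with a scaled tanh. A minimal sketch of both, using `F.normalize` as a stand-in for a learned RMSNorm and an assumed clamp value:

```python
import torch
import torch.nn.functional as F

def softclamp(logits, value = 50.):
    # smoothly bound logits to (-value, value) with a scaled tanh
    return value * torch.tanh(logits / value)

def stable_attention(q, k, v, scale = None, clamp_value = 50.):
    # l2-normalize queries and keys (stand-in for a learned qk rmsnorm),
    # then softclamp the similarity logits before the softmax
    q = F.normalize(q, dim = -1)
    k = F.normalize(k, dim = -1)
    scale = scale if scale is not None else q.shape[-1] ** 0.5
    logits = softclamp(torch.einsum('... i d, ... j d -> ... i j', q, k) * scale, clamp_value)
    return torch.einsum('... i j, ... j d -> ... i d', logits.softmax(dim = -1), v)
```
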
lucidrains | bdc7dd30a6 | 2025-10-01 07:18:23 -07:00
scaffold