68 Commits

Author SHA1 Message Date
Phil Wang
9c56ba0c9d Merge pull request #3 from lucidrains/pytest-shard (add pytest shard) 2025-10-08 07:03:11 -07:00
lucidrains
b5744237bf fix 2025-10-08 06:58:46 -07:00
lucidrains
63b63dfedd add shard 2025-10-08 06:56:03 -07:00
lucidrains
612f5f5dd1 add a bit of dropout to the rewards used as state 2025-10-08 06:45:25 -07:00
lucidrains
c8f75caa40 although not in the paper, it would be interesting for each agent (will extend to multi-agent) to consider its own past rewards as state 2025-10-08 06:40:43 -07:00
lucidrains
187edc1414 all set for generating the perceived rewards once the RL components fall into place 2025-10-08 06:33:28 -07:00
lucidrains
f7bdaddbbb one more incision before knocking out reward decoding 2025-10-08 06:11:02 -07:00
lucidrains
c056835aea address https://github.com/lucidrains/dreamer4/issues/2 (tag: 0.0.5) 2025-10-08 05:55:22 -07:00
lucidrains
4de357b6c2 tiny change needed to have the world model produce both the video and predicted rewards (after phase 2 finetuning) 2025-10-08 05:52:13 -07:00
lucidrains
0fdb67bafa add noising of the latent context during generation, a technique I believe is from EPFL, or perhaps a Google group that built on the EPFL work (tag: 0.0.4) 2025-10-07 09:37:37 -07:00
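A minimal sketch of what that context noising might look like at generation time, assuming linear-interpolation noising with a small fixed `noise_level`; the function name and shapes here are illustrative, not the repo's API:

```python
import torch

def noise_context(context_latents, noise_level = 0.1):
    # context_latents: (batch, time, tokens, dim) past frames used as conditioning.
    # lightly re-noise the clean context so rollout conditioning matches the
    # noisy contexts the dynamics model saw during training, reducing drift
    noise = torch.randn_like(context_latents)
    return context_latents * (1. - noise_level) + noise * noise_level
```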
lucidrains
36ccb08500 allow step_sizes to be passed in directly, since log2 is not that intuitive (tag: 0.0.3) 2025-10-07 08:36:46 -07:00
lucidrains
a8e14f4b7c oops 2025-10-07 08:09:33 -07:00
lucidrains
1176269927 correct signal levels when doing teacher-forced generation (tag: 0.0.2) 2025-10-07 07:41:02 -07:00
lucidrains
c6bef85984 generating video with raw teacher forcing (tag: 0.0.1) 2025-10-07 07:22:57 -07:00
lucidrains
83ba9a285a reorganize tokenizer to generate video from the dynamics model 2025-10-06 11:37:45 -07:00
lucidrains
7180a8cf43 start carving into the reinforcement learning portion, starting with reward prediction head (single for now) 2025-10-06 11:17:25 -07:00
lucidrains
77724049e2 fix latent / modality attention pattern in video tokenizer, thanks to another researcher 2025-10-06 09:44:12 -07:00
lucidrains
25b8de91cc handle spatial tokens less than latent tokens in dynamics model 2025-10-06 09:19:27 -07:00
lucidrains
bfbecb4968 an anonymous researcher pointed out that the video tokenizer may be using multiple latents per frame 2025-10-06 08:16:55 -07:00
lucidrains
338def693d oops 2025-10-05 11:52:54 -07:00
lucidrains
f507afa0d3 last commit for the day - take care of the task embed 2025-10-05 11:40:48 -07:00
lucidrains
fe99efecba make a first pass through the shortcut training logic (Frans et al. from Berkeley), maintaining both v-space and x-space 2025-10-05 11:17:36 -07:00
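For context on maintaining both spaces: under the linear interpolation used in flow matching and shortcut models, predictions convert freely between velocity (v-space) and clean data (x-space). A sketch, assuming the convention that t = 1 is clean data (conventions vary):

```python
import torch

# with x_t = (1 - t) * noise + t * data, the target velocity is
# v = data - noise, so the two prediction spaces are interchangeable

def v_to_x(x_t, v, t):
    # t: tensor of signal levels in [0, 1], broadcastable to x_t
    return x_t + (1. - t) * v

def x_to_v(x_t, x_pred, t):
    return (x_pred - x_t) / (1. - t).clamp(min = 1e-5)
```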
lucidrains
971637673b complete all the types of attention masking patterns as proposed in the paper 2025-10-04 12:45:54 -07:00
lucidrains
5c6be4d979 take care of blocked-causal attention in the video tokenizer; still need the special attention pattern to and from the latents, though 2025-10-04 12:03:50 -07:00
lucidrains
6c994db341 first nail down the attention masking for the dynamics transformer model using a factory function 2025-10-04 11:20:57 -07:00
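A sketch of what a blocked-causal mask factory could look like: every token attends within its own frame (block) and to all earlier frames; names and signature are hypothetical:

```python
import torch

def blocked_causal_mask(num_frames, tokens_per_frame, device = None):
    # True = may attend. tokens see their whole frame plus all past frames,
    # but never future frames
    seq_len = num_frames * tokens_per_frame
    frame_ids = torch.arange(seq_len, device = device) // tokens_per_frame
    return frame_ids[:, None] >= frame_ids[None, :]

# usable as the boolean attn_mask of F.scaled_dot_product_attention
mask = blocked_causal_mask(num_frames = 4, tokens_per_frame = 3)
```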
lucidrains
ca700ba8e1 prepare for the learning in dreams 2025-10-04 09:44:46 -07:00
lucidrains
e04f9ffec6 for the temporal attention in the dynamics model, apply rotary embeddings the traditional way 2025-10-04 09:41:36 -07:00
lucidrains
1b7f6e787d rotate in the 3d rotary embeddings for the video tokenizer for both encoder / decoder 2025-10-04 09:22:06 -07:00
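One way the 3D rotary embeddings could work: split the head dimension into per-axis slices for (time, height, width) and rotate adjacent channel pairs within each slice. A sketch, not the repo's exact implementation:

```python
import torch

def rotate_pairs(x):
    # rotate adjacent (even, odd) channel pairs: (x1, x2) -> (-x2, x1)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    return torch.stack((-x2, x1), dim = -1).flatten(-2)

def axis_angles(pos, dim):
    # standard rotary angles for one axis; adjacent channels share a frequency
    inv_freq = 1. / (10000 ** (torch.arange(0, dim, 2).float() / dim))
    angles = pos.float()[:, None] * inv_freq[None, :]   # (n, dim / 2)
    return angles.repeat_interleave(2, dim = -1)        # (n, dim)

def rotary_3d(t, h, w, dim_per_axis):
    # give each of (time, height, width) its own slice of the head dim
    coords = torch.stack(torch.meshgrid(
        torch.arange(t), torch.arange(h), torch.arange(w), indexing = 'ij'
    ), dim = -1).reshape(-1, 3)
    angles = torch.cat([axis_angles(coords[:, i], dim_per_axis) for i in range(3)], dim = -1)
    return angles.cos(), angles.sin()

def apply_rotary(x, cos, sin):
    # x: (..., seq, 3 * dim_per_axis) queries or keys
    return x * cos + rotate_pairs(x) * sin
```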
lucidrains
93f6738c9c given the special attention patterns, the attend function needs to be constructed before traversing the transformer layers 2025-10-04 08:31:51 -07:00
lucidrains
7cac3d28c5 cleanup 2025-10-04 08:04:42 -07:00
lucidrains
0f4783f23c use a newly built module from x-mlps for multi-token prediction 2025-10-04 07:56:56 -07:00
lucidrains
0a26e0f92f complete the lpips loss used for the video tokenizer 2025-10-04 07:47:27 -07:00
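Roughly, an LPIPS-style loss compares unit-normalized deep VGG features of reconstruction and target; the real LPIPS also applies learned per-channel weights, which this sketch omits:

```python
import torch.nn.functional as F
from torchvision.models import vgg16, VGG16_Weights

vgg = vgg16(weights = VGG16_Weights.DEFAULT).features.eval()
vgg.requires_grad_(False)
slice_ends = (4, 9, 16, 23, 30)  # through relu1_2, relu2_2, relu3_3, relu4_3, relu5_3

def perceptual_loss(pred, target):
    # pred, target: (batch, 3, height, width) images
    loss, x, y, start = 0., pred, target, 0
    for end in slice_ends:
        for layer in vgg[start:end]:
            x, y = layer(x), layer(y)
        start = end
        # compare unit-normalized features at each of the five depths
        loss = loss + (F.normalize(x, dim = 1) - F.normalize(y, dim = 1)).pow(2).mean()
    return loss
```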
Phil Wang
92e55a90b4 temporary discord 2025-10-04 07:28:36 -07:00
lucidrains
85eea216fd cleanup 2025-10-04 06:59:09 -07:00
lucidrains
895a867a66 able to accept raw video for the dynamics model, if a tokenizer is passed in 2025-10-04 06:57:54 -07:00
lucidrains
8373cb13ec grouped query attention is necessary 2025-10-04 06:31:32 -07:00
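A minimal sketch of grouped query attention, where several query heads share each key/value head to shrink the KV cache during rollout (dimensions are illustrative):

```python
import torch
import torch.nn.functional as F
from torch import nn

class GQA(nn.Module):
    def __init__(self, dim, heads = 8, kv_heads = 2, dim_head = 64):
        super().__init__()
        assert heads % kv_heads == 0
        self.heads, self.kv_heads, self.dim_head = heads, kv_heads, dim_head
        self.to_q = nn.Linear(dim, heads * dim_head, bias = False)
        self.to_kv = nn.Linear(dim, 2 * kv_heads * dim_head, bias = False)
        self.to_out = nn.Linear(heads * dim_head, dim, bias = False)

    def forward(self, x):
        b, n, _ = x.shape
        q = self.to_q(x).view(b, n, self.heads, self.dim_head).transpose(1, 2)
        k, v = self.to_kv(x).view(b, n, 2, self.kv_heads, self.dim_head).permute(2, 0, 3, 1, 4)
        # repeat each kv head across its group of query heads
        groups = self.heads // self.kv_heads
        k, v = k.repeat_interleave(groups, dim = 1), v.repeat_interleave(groups, dim = 1)
        out = F.scaled_dot_product_attention(q, k, v)
        return self.to_out(out.transpose(1, 2).reshape(b, n, -1))
```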
lucidrains
58a6964dd9 the dynamics model has spatial attention with a non-causal attention pattern, but nothing else attends to the agent tokens 2025-10-03 11:59:22 -07:00
lucidrains
77ad96ded2 make attention masking correct for dynamics model 2025-10-03 11:18:44 -07:00
lucidrains
986bf4c529 allow for the video tokenizer to accept any spatial dimensions by parameterizing the decoder positional embedding with an MLP 2025-10-03 10:08:05 -07:00
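A sketch of the idea: instead of a fixed learned grid, an MLP maps normalized (y, x) coordinates to the decoder positional embedding, so any resolution can be queried (module name and sizes assumed):

```python
import torch
from torch import nn

class ContinuousPosEmb(nn.Module):
    def __init__(self, dim, hidden = 256):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2, hidden), nn.SiLU(), nn.Linear(hidden, dim))

    def forward(self, height, width):
        # normalized (y, x) coordinates for an arbitrary output resolution
        ys = torch.linspace(0., 1., height)
        xs = torch.linspace(0., 1., width)
        grid = torch.stack(torch.meshgrid(ys, xs, indexing = 'ij'), dim = -1)  # (h, w, 2)
        return self.mlp(grid)  # (h, w, dim), added to the decoder tokens
```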
lucidrains
90bf19f076 take care of the loss weight proposed in eq 8 2025-10-03 08:19:38 -07:00
lucidrains
046f8927d1 complete the symexp two-hot encoding proposed by Hafner in previous versions of Dreamer; will also bring in HL-Gauss 2025-10-03 08:08:44 -07:00
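For reference, the symexp two-hot scheme from earlier Dreamer versions regresses values as a categorical distribution over symlog-spaced bins; a sketch following the standard Dreamer-v3 formulation:

```python
import torch

def symlog(x):
    return torch.sign(x) * torch.log1p(x.abs())

def symexp(x):
    return torch.sign(x) * (torch.exp(x.abs()) - 1.)

def two_hot(value, bins):
    # bins: (num_bins,) increasing bin centers in symlog space. returns a
    # target distribution splitting weight between the two nearest bins
    value = symlog(value).clamp(bins[0], bins[-1])
    idx = torch.searchsorted(bins, value).clamp(1, len(bins) - 1)
    left, right = bins[idx - 1], bins[idx]
    w_right = (value - left) / (right - left)
    target = torch.zeros(*value.shape, len(bins), device = value.device)
    target.scatter_(-1, (idx - 1).unsqueeze(-1), (1. - w_right).unsqueeze(-1))
    target.scatter_(-1, idx.unsqueeze(-1), w_right.unsqueeze(-1))
    return target  # train with cross entropy against the reward head's logits

def decode(logits, bins):
    # expected bin value under the predicted distribution, mapped back by symexp
    return symexp((logits.softmax(dim = -1) * bins).sum(dim = -1))
```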
lucidrains
2a896ab01d last commit for the day 2025-10-02 12:39:20 -07:00
lucidrains
8d1cd311bb Revert "address https://github.com/lucidrains/dreamer4/issues/1" (reverts commit e23a5294ec) 2025-10-02 12:25:05 -07:00
lucidrains
e23a5294ec address https://github.com/lucidrains/dreamer4/issues/1 2025-10-02 11:49:22 -07:00
lucidrains
51e0852604 cleanup 2025-10-02 09:43:30 -07:00
lucidrains
0b503d880d ellipsis 2025-10-02 09:14:39 -07:00
lucidrains
e6c808960f take care of the MAE portion from Kaiming He 2025-10-02 08:57:44 -07:00
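The MAE part presumably masks a large fraction of tokens and encodes only the visible ones; a minimal sketch of the random masking (ratio and return values are illustrative):

```python
import torch

def random_masking(tokens, mask_ratio = 0.75):
    # keep a random subset of tokens, returning the kept tokens plus the
    # indices needed to scatter decoder outputs back into place
    b, n, d = tokens.shape
    num_keep = int(n * (1. - mask_ratio))
    scores = torch.rand(b, n, device = tokens.device)
    keep_indices = scores.topk(num_keep, dim = -1).indices
    kept = tokens.gather(1, keep_indices[..., None].expand(-1, -1, d))
    return kept, keep_indices
```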
lucidrains
49082d8629 x-space and v-space prediction in dynamics model 2025-10-02 08:36:00 -07:00
lucidrains
8b66b703e0 add the discretized signal level + step size embeddings necessary for diffusion forcing + shortcut 2025-10-02 07:39:34 -07:00
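A sketch of what those conditioning embeddings could look like: lookup tables over a discretized signal level and a small set of shortcut step sizes, summed into the token stream (bucket counts assumed):

```python
import torch
from torch import nn

class NoiseCondEmbed(nn.Module):
    def __init__(self, dim, num_levels = 256, num_step_sizes = 8):
        super().__init__()
        self.signal_embed = nn.Embedding(num_levels, dim)
        self.step_embed = nn.Embedding(num_step_sizes, dim)
        self.num_levels = num_levels

    def forward(self, signal_level, step_size_idx):
        # signal_level: (...,) floats in [0, 1]; step_size_idx: (...,) ints
        level_idx = (signal_level * (self.num_levels - 1)).round().long()
        return self.signal_embed(level_idx) + self.step_embed(step_size_idx)
```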
lucidrains
bb7a5d1680 sketch out the axial space time transformer in dynamics model 2025-10-02 07:17:58 -07:00
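A bare-bones sketch of an axial space-time block: attention over space within each frame, then causal attention over time at each spatial position, factorizing full spatiotemporal attention (stock nn.MultiheadAttention used for brevity):

```python
import torch
from torch import nn
from einops import rearrange

class AxialSpaceTimeBlock(nn.Module):
    def __init__(self, dim, heads = 8):
        super().__init__()
        self.spatial_attn = nn.MultiheadAttention(dim, heads, batch_first = True)
        self.temporal_attn = nn.MultiheadAttention(dim, heads, batch_first = True)

    def forward(self, x):
        # x: (batch, time, space, dim)
        b, t, s, d = x.shape

        # spatial attention within each frame
        xs = rearrange(x, 'b t s d -> (b t) s d')
        xs = xs + self.spatial_attn(xs, xs, xs, need_weights = False)[0]

        # causal temporal attention at each spatial position
        xt = rearrange(xs, '(b t) s d -> (b s) t d', b = b)
        causal = torch.ones(t, t, dtype = torch.bool, device = x.device).triu(1)
        xt = xt + self.temporal_attn(xt, xt, xt, attn_mask = causal, need_weights = False)[0]

        return rearrange(xt, '(b s) t d -> b t s d', b = b)
```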