Phil Wang | 9c56ba0c9d | Merge pull request #3 from lucidrains/pytest-shard (add pytest shard) | 2025-10-08 07:03:11 -07:00
lucidrains | b5744237bf | fix | 2025-10-08 06:58:46 -07:00
lucidrains | 63b63dfedd | add shard | 2025-10-08 06:56:03 -07:00
lucidrains | 612f5f5dd1 | a bit of dropout to rewards as state | 2025-10-08 06:45:25 -07:00
lucidrains | c8f75caa40 | although not in the paper, it would be interesting for each agent (will extend to multi-agent) to consider its own past rewards as state | 2025-10-08 06:40:43 -07:00
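The two commits above feed each agent's past rewards back in as part of its state, with a bit of dropout so the policy does not over-rely on that signal. A minimal sketch of the idea; the module and argument names here are my own, not the repo's API:

```python
from torch import nn

class RewardAsState(nn.Module):
    def __init__(self, dim, dropout = 0.1):
        super().__init__()
        self.to_reward_embed = nn.Linear(1, dim)
        self.dropout = nn.Dropout(dropout)

    def forward(self, agent_tokens, past_rewards):
        # agent_tokens: (batch, time, dim), past_rewards: (batch, time)
        reward_embed = self.to_reward_embed(past_rewards.unsqueeze(-1))
        # dropout so the agent cannot lean too heavily on the reward channel
        return agent_tokens + self.dropout(reward_embed)
```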
lucidrains | 187edc1414 | all set for generating the perceived rewards once the RL components fall into place | 2025-10-08 06:33:28 -07:00
lucidrains | f7bdaddbbb | one more incision before knocking out reward decoding | 2025-10-08 06:11:02 -07:00
lucidrains | c056835aea | address https://github.com/lucidrains/dreamer4/issues/2 (tag: 0.0.5) | 2025-10-08 05:55:22 -07:00
lucidrains | 4de357b6c2 | tiny change needed to have the world model produce both the video and predicted rewards (after phase 2 finetuning) | 2025-10-08 05:52:13 -07:00
lucidrains | 0fdb67bafa | add the noising of the latent context during generation, a technique I think was from EPFL, or perhaps some Google group that built on top of the EPFL work (tag: 0.0.4) | 2025-10-07 09:37:37 -07:00
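Lightly noising the clean context latents during rollout keeps generation conditions close to the noisy contexts seen under diffusion-forcing training, which helps with drift. A hedged sketch, assuming a simple linear blend toward Gaussian noise (the actual schedule in the repo may differ):

```python
import torch

def noise_context(latents, noise_level = 0.1):
    # latents: clean latents of already-generated context frames
    # blend in a little gaussian noise; the exact schedule here is an assumption
    return (1. - noise_level) * latents + noise_level * torch.randn_like(latents)
```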
lucidrains | 36ccb08500 | allow for step_sizes to be passed in, as log2 is not that intuitive (tag: 0.0.3) | 2025-10-07 08:36:46 -07:00
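If the shortcut levels are stored internally as log2 counts, a friendlier API can accept plain step counts and convert. A small sketch under that assumption (the helper name is hypothetical):

```python
import math

def step_sizes_to_log2(step_counts):
    # step_counts: e.g. [1, 2, 4, 8] denoising steps, each a power of two
    assert all(s > 0 and (s & (s - 1)) == 0 for s in step_counts), 'must be powers of two'
    return [int(math.log2(s)) for s in step_counts]

assert step_sizes_to_log2([1, 2, 4, 8]) == [0, 1, 2, 3]
```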
lucidrains | a8e14f4b7c | oops | 2025-10-07 08:09:33 -07:00
lucidrains | 1176269927 | correct signal levels when doing teacher forcing generation (tag: 0.0.2) | 2025-10-07 07:41:02 -07:00
lucidrains | c6bef85984 | generating video with raw teacher forcing (tag: 0.0.1) | 2025-10-07 07:22:57 -07:00
lucidrains | 83ba9a285a | reorganize tokenizer to generate video from the dynamics model | 2025-10-06 11:37:45 -07:00
lucidrains | 7180a8cf43 | start carving into the reinforcement learning portion, starting with reward prediction head (single for now) | 2025-10-06 11:17:25 -07:00
lucidrains | 77724049e2 | fix latent / modality attention pattern in video tokenizer, thanks to another researcher | 2025-10-06 09:44:12 -07:00
lucidrains | 25b8de91cc | handle spatial tokens less than latent tokens in dynamics model | 2025-10-06 09:19:27 -07:00
lucidrains | bfbecb4968 | an anonymous researcher pointed out that the video tokenizer may be using multiple latents per frame | 2025-10-06 08:16:55 -07:00
lucidrains | 338def693d | oops | 2025-10-05 11:52:54 -07:00
lucidrains | f507afa0d3 | last commit for the day - take care of the task embed | 2025-10-05 11:40:48 -07:00
lucidrains | fe99efecba | make a first pass through the shortcut training logic (Frans et al. from Berkeley), maintaining both v-space and x-space | 2025-10-05 11:17:36 -07:00
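Maintaining both parameterizations is cheap because, under the linear interpolation x_t = (1 - t) * noise + t * data, the data ("x-space") and velocity ("v-space") predictions are interconvertible. A sketch of the two conversions, assuming t is a tensor broadcastable to x_t's shape:

```python
# x_t = (1 - t) * noise + t * data  =>  velocity v = data - noise

def x_to_v(x_pred, x_t, t):
    # data prediction -> velocity: v = (x_pred - x_t) / (1 - t)
    return (x_pred - x_t) / (1. - t).clamp(min = 1e-5)

def v_to_x(v_pred, x_t, t):
    # velocity -> data prediction: x = x_t + (1 - t) * v
    return x_t + (1. - t) * v_pred
```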
lucidrains | 971637673b | complete all the types of attention masking patterns as proposed in the paper | 2025-10-04 12:45:54 -07:00
lucidrains | 5c6be4d979 | take care of blocked causal in video tokenizer, though still need the special attention pattern to and from the latents | 2025-10-04 12:03:50 -07:00
lucidrains | 6c994db341 | first nail down the attention masking for the dynamics transformer model using a factory function | 2025-10-04 11:20:57 -07:00
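One such pattern is block-causal: tokens attend bidirectionally within a frame and causally across frames. A minimal sketch of a mask factory in that spirit (not the repo's actual function):

```python
import torch

def block_causal_mask(seq_len, tokens_per_frame, device = None):
    # True where attention is allowed
    frame_ids = torch.arange(seq_len, device = device) // tokens_per_frame
    # a query attends to a key iff the key's frame is not in the future
    return frame_ids[:, None] >= frame_ids[None, :]  # (seq_len, seq_len) bool

mask = block_causal_mask(seq_len = 6, tokens_per_frame = 2)
```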
lucidrains | ca700ba8e1 | prepare for the learning in dreams | 2025-10-04 09:44:46 -07:00
lucidrains | e04f9ffec6 | for the temporal attention in dynamics model, do rotary the traditional way | 2025-10-04 09:41:36 -07:00
lucidrains | 1b7f6e787d | rotate in the 3d rotary embeddings for the video tokenizer for both encoder / decoder | 2025-10-04 09:22:06 -07:00
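Axial 3D rotary typically splits the head dimension into thirds and rotates each third by its own (time, height, width) coordinate. A hand-rolled illustration of that idea, not the repo's implementation:

```python
import torch

def rotary_angles(pos, dim, theta = 10000.):
    # standard 1d rotary frequencies for one axis' sub-dimension
    freqs = theta ** (-torch.arange(0, dim, 2).float() / dim)
    angles = pos.float().unsqueeze(-1) * freqs  # (..., dim // 2)
    return angles.cos(), angles.sin()

def rotate_pairs(x, cos, sin):
    x1, x2 = x[..., 0::2], x[..., 1::2]
    return torch.stack((x1 * cos - x2 * sin, x1 * sin + x2 * cos), dim = -1).flatten(-2)

def apply_3d_rotary(q, t_pos, h_pos, w_pos):
    # q: (..., dim) with dim divisible by 6; positions broadcastable to q[..., 0]
    dim = q.shape[-1] // 3
    outs = []
    for pos, chunk in zip((t_pos, h_pos, w_pos), q.split(dim, dim = -1)):
        cos, sin = rotary_angles(pos, dim)
        outs.append(rotate_pairs(chunk, cos, sin))
    return torch.cat(outs, dim = -1)
```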
lucidrains | 93f6738c9c | given the special attention patterns, the attend function needs to be constructed before traversing the transformer layers | 2025-10-04 08:31:51 -07:00
lucidrains | 7cac3d28c5 | cleanup | 2025-10-04 08:04:42 -07:00
lucidrains | 0f4783f23c | use a newly built module from x-mlps for multi token prediction | 2025-10-04 07:56:56 -07:00
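The general idea of multi token prediction is to predict the next few tokens from each position with separate lightweight heads. An illustrative stand-in, not the actual x-mlps module:

```python
import torch
from torch import nn

class MultiTokenPrediction(nn.Module):
    def __init__(self, dim, dim_out, num_preds = 2):
        super().__init__()
        self.heads = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, dim), nn.SiLU(), nn.Linear(dim, dim_out))
            for _ in range(num_preds)
        ])

    def forward(self, hiddens):
        # hiddens: (batch, seq, dim) -> (batch, seq, num_preds, dim_out)
        return torch.stack([head(hiddens) for head in self.heads], dim = -2)
```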
lucidrains | 0a26e0f92f | complete the LPIPS loss used for the video tokenizer | 2025-10-04 07:47:27 -07:00
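LPIPS compares images in a pretrained network's feature space rather than in pixel space. A rough stand-in using torchvision's VGG16 features; real LPIPS also applies learned per-channel weights, which this sketch omits:

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg16, VGG16_Weights

class PerceptualLoss(torch.nn.Module):
    def __init__(self, layer_ids = (3, 8, 15, 22)):
        super().__init__()
        self.vgg = vgg16(weights = VGG16_Weights.DEFAULT).features.eval()
        self.layer_ids = set(layer_ids)
        for p in self.vgg.parameters():
            p.requires_grad_(False)

    def forward(self, recon, target):
        # inputs assumed already normalized to VGG's expected statistics
        loss = 0.
        x, y = recon, target
        for i, layer in enumerate(self.vgg):
            x, y = layer(x), layer(y)
            if i in self.layer_ids:
                loss = loss + F.mse_loss(F.normalize(x, dim = 1), F.normalize(y, dim = 1))
        return loss
```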
Phil Wang | 92e55a90b4 | temporary discord | 2025-10-04 07:28:36 -07:00
lucidrains | 85eea216fd | cleanup | 2025-10-04 06:59:09 -07:00
lucidrains | 895a867a66 | able to accept raw video for dynamics model, if tokenizer passed in | 2025-10-04 06:57:54 -07:00
lucidrains | 8373cb13ec | grouped query attention is necessary | 2025-10-04 06:31:32 -07:00
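Grouped query attention shares each key/value head across a group of query heads, cutting KV-cache cost. A minimal sketch:

```python
import torch.nn.functional as F

def gqa(q, k, v):
    # q: (b, hq, n, d), k / v: (b, hkv, n, d) with hq divisible by hkv
    groups = q.shape[1] // k.shape[1]
    k = k.repeat_interleave(groups, dim = 1)
    v = v.repeat_interleave(groups, dim = 1)
    return F.scaled_dot_product_attention(q, k, v)
```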
lucidrains | 58a6964dd9 | the dynamics model has a spatial attention with a non-causal attention pattern, but nothing else attending to agent tokens | 2025-10-03 11:59:22 -07:00
lucidrains | 77ad96ded2 | make attention masking correct for dynamics model | 2025-10-03 11:18:44 -07:00
lucidrains | 986bf4c529 | allow for the video tokenizer to accept any spatial dimensions by parameterizing the decoder positional embedding with an MLP | 2025-10-03 10:08:05 -07:00
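Deriving the positional embedding from continuous coordinates lets the decoder run at resolutions unseen in training. A sketch of the idea (names are illustrative):

```python
import torch
from torch import nn

class MLPPosEmb(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2, dim), nn.SiLU(), nn.Linear(dim, dim))

    def forward(self, height, width):
        # normalized (h, w) coordinates in [-1, 1] -> per-position embedding
        hs = torch.linspace(-1., 1., height)
        ws = torch.linspace(-1., 1., width)
        coords = torch.stack(torch.meshgrid(hs, ws, indexing = 'ij'), dim = -1)
        return self.mlp(coords)  # (height, width, dim)
```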
lucidrains | 90bf19f076 | take care of the loss weight proposed in eq 8 | 2025-10-03 08:19:38 -07:00
lucidrains | 046f8927d1 | complete the symexp two-hot proposed by Hafner in previous versions of Dreamer, but will also bring in HL-Gauss | 2025-10-03 08:08:44 -07:00
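In the symexp two-hot scheme from Dreamer v3, regression targets are squashed with symlog, encoded as weight split across the two nearest buckets, and trained with cross-entropy; the expected bucket value is mapped back through symexp. A sketch:

```python
import torch

def symlog(x):
    return torch.sign(x) * torch.log1p(x.abs())

def symexp(x):
    return torch.sign(x) * (torch.exp(x.abs()) - 1.)

def two_hot(value, bins):
    # bins: sorted 1d tensor of bucket centers in symlog space
    value = symlog(value).clamp(bins[0], bins[-1])
    idx = torch.searchsorted(bins, value).clamp(1, len(bins) - 1)
    lo, hi = bins[idx - 1], bins[idx]
    w_hi = (value - lo) / (hi - lo)
    target = torch.zeros(*value.shape, len(bins))
    target.scatter_(-1, (idx - 1).unsqueeze(-1), (1. - w_hi).unsqueeze(-1))
    target.scatter_(-1, idx.unsqueeze(-1), w_hi.unsqueeze(-1))
    return target

# decoding a prediction: symexp((softmax(logits) * bins).sum(-1))
```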
lucidrains | 2a896ab01d | last commit for the day | 2025-10-02 12:39:20 -07:00
lucidrains | 8d1cd311bb | Revert "address https://github.com/lucidrains/dreamer4/issues/1" (reverts commit e23a5294ec2f49d58d3ccb936c498eb86939fa96) | 2025-10-02 12:25:05 -07:00
lucidrains | e23a5294ec | address https://github.com/lucidrains/dreamer4/issues/1 | 2025-10-02 11:49:22 -07:00
lucidrains | 51e0852604 | cleanup | 2025-10-02 09:43:30 -07:00
lucidrains | 0b503d880d | ellipsis | 2025-10-02 09:14:39 -07:00
lucidrains | e6c808960f | take care of the MAE portion from Kaiming He | 2025-10-02 08:57:44 -07:00
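The MAE recipe drops a large random subset of patch tokens before the encoder and reconstructs the missing ones. A minimal sketch of the masking step (ratio and names are assumptions):

```python
import torch

def random_masking(tokens, mask_ratio = 0.75):
    # tokens: (batch, num_tokens, dim) -> kept tokens and their indices
    b, n, d = tokens.shape
    num_keep = max(1, int(n * (1. - mask_ratio)))
    scores = torch.rand(b, n, device = tokens.device)
    keep_idx = scores.topk(num_keep, dim = -1).indices.sort(dim = -1).values
    kept = tokens.gather(1, keep_idx.unsqueeze(-1).expand(-1, -1, d))
    return kept, keep_idx
```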
lucidrains | 49082d8629 | x-space and v-space prediction in dynamics model | 2025-10-02 08:36:00 -07:00
lucidrains | 8b66b703e0 | add the discretized signal level + step size embeddings necessary for diffusion forcing + shortcut | 2025-10-02 07:39:34 -07:00
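Diffusion forcing assigns each frame its own noise level, and shortcut conditioning adds the step size, so both get discretized and embedded as conditioning. A sketch under assumed bucket counts:

```python
import torch
from torch import nn

class LevelStepEmbed(nn.Module):
    def __init__(self, dim, num_levels = 128, num_step_sizes = 8):
        super().__init__()
        self.level_embed = nn.Embedding(num_levels, dim)
        self.step_embed = nn.Embedding(num_step_sizes, dim)
        self.num_levels = num_levels

    def forward(self, signal_level, log2_step_size):
        # signal_level: float tensor in [0, 1], log2_step_size: long tensor
        level_idx = (signal_level * (self.num_levels - 1)).round().long()
        return self.level_embed(level_idx) + self.step_embed(log2_step_size)
```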
lucidrains | bb7a5d1680 | sketch out the axial space time transformer in dynamics model | 2025-10-02 07:17:58 -07:00
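Axial space-time attention factors full spatiotemporal attention into a spatial pass within each frame and a temporal pass at each spatial position, folding the other axis into the batch. A sketch (in the dynamics model the temporal pass would be causal; the mask is omitted here):

```python
from torch import nn

class AxialSpaceTime(nn.Module):
    def __init__(self, dim, heads = 8):
        super().__init__()
        self.spatial_attn = nn.MultiheadAttention(dim, heads, batch_first = True)
        self.temporal_attn = nn.MultiheadAttention(dim, heads, batch_first = True)

    def forward(self, x):
        b, t, s, d = x.shape  # (batch, time, space, dim)
        # spatial attention: fold time into batch
        x = x.reshape(b * t, s, d)
        x = x + self.spatial_attn(x, x, x, need_weights = False)[0]
        # temporal attention: fold space into batch
        x = x.reshape(b, t, s, d).transpose(1, 2).reshape(b * s, t, d)
        x = x + self.temporal_attn(x, x, x, need_weights = False)[0]
        return x.reshape(b, s, t, d).transpose(1, 2)
```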