Commit Graph

  • 0285bba821 flesh out tokenizer even more lucidrains 2025-10-02 06:11:04 -07:00
  • 31c4aa28c7 start setting up tokenizer lucidrains 2025-10-02 05:37:43 -07:00
  • 67519a451d softclamping in flex lucidrains 2025-10-01 12:19:41 -07:00
  • 8e7a35b89c cover the attention masking for tokenizer encoder, decoder, as well as dynamics model (latent and agent tokens are "special" and placed on the right) lucidrains 2025-10-01 12:11:06 -07:00
  • c18c624be6 their latent bottleneck is tanh it seems, constraining it to -1 to 1 for flow matching in dynamics model. please open an issue if mistaken lucidrains 2025-10-01 10:39:16 -07:00
  • e3cbcd94c6 sketch out top down lucidrains 2025-10-01 10:25:56 -07:00
  • 882e63511b will apply the golden gate rotary for this work as an option lucidrains 2025-10-01 10:07:54 -07:00
  • ceb1af263e oops lucidrains 2025-10-01 09:49:04 -07:00
  • c979883f21 ready the block causal mask lucidrains 2025-10-01 09:45:54 -07:00
  • 2e92c0121a they employ two stability measures, qk rmsnorm and softclamping of attention logits lucidrains 2025-10-01 09:40:24 -07:00
  • e8678364ba swish glu feedforward from shazeer et al lucidrains 2025-10-01 09:28:25 -07:00
  • 8ebb8a9661 finished a first pass at digesting the paper, start with transformer lucidrains 2025-10-01 09:21:55 -07:00
  • e0dd4cfeaa they replace the recurrent state-space model with a transformer, with the implication that the former does not scale lucidrains 2025-10-01 07:59:02 -07:00
  • bdc7dd30a6 scaffold lucidrains 2025-10-01 07:18:18 -07:00
  • 62e9c4eecf project page Phil Wang 2025-10-01 06:56:03 -07:00
  • febbc73284 dreamer fig2 lucidrains 2025-10-01 06:30:29 -07:00
  • deecd30f52 wip Phil Wang 2025-09-30 05:59:20 -07:00
  • 4eeb4ee7fc Initial commit Phil Wang 2025-09-30 05:58:16 -07:00