# dreamerv3-torch
PyTorch implementation of [Mastering Diverse Domains through World Models](https://arxiv.org/abs/2301.04104v1). DreamerV3 is a scalable algorithm that outperforms previous approaches across a wide range of domains using fixed hyperparameters.
## Instructions
Get dependencies:
```
pip install -r requirements.txt
```
Run training on DMC Vision:
```
python3 dreamer.py --configs dmc_vision --task dmc_walker_walk --logdir ./logdir/dmc_walker_walk
```
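To launch several runs from a script, the command above can be assembled programmatically. The following is a minimal sketch, not part of this repository: `build_train_command` is a hypothetical helper, and it uses only the three flags shown above (`--configs`, `--task`, `--logdir`), placing each run in its own subdirectory of `./logdir` as in the example.

```python
import shlex


def build_train_command(task, configs="dmc_vision", logdir_root="./logdir"):
    """Return the argument list for one training run.

    Hypothetical helper: it mirrors the README command and assumes
    nothing beyond the --configs, --task, and --logdir flags.
    """
    return [
        "python3", "dreamer.py",
        "--configs", configs,
        "--task", task,
        "--logdir", f"{logdir_root}/{task}",
    ]


if __name__ == "__main__":
    # Only the task from the README is listed; add other task names as needed.
    for task in ["dmc_walker_walk"]:
        print(shlex.join(build_train_command(task)))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues.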
Monitor results:
```
tensorboard --logdir ./logdir
```
## Benchmarks
The following benchmarks are currently available for testing.

| Environment | Observation | Action | Budget | Description |
|-------------------|---|---|---|-----------------------|
| [DMC Proprio](https://github.com/deepmind/dm_control) | State | Continuous | 500K | DeepMind Control Suite with low-dimensional inputs. |
| [DMC Vision](https://github.com/deepmind/dm_control) | Image | Continuous | 1M | DeepMind Control Suite with high-dimensional image inputs. |
| [Atari 100k](https://github.com/openai/atari-py) | Image | Discrete | 400K | 26 Atari games. |
| [Crafter](https://github.com/danijar/crafter) | Image | Discrete | 1M | Survival environment that evaluates diverse agent abilities. |
| [Minecraft](https://github.com/minerllabs/minerl) | Image and State | Discrete | 100M | Vast 3D open world. |
| [Memory Maze](https://github.com/jurgisp/memory-maze) | Image | Discrete | 100M | 3D mazes to evaluate RL agents' long-term memory. |
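
The Budget column uses `K` and `M` suffixes. When scripting experiments it can help to have these as plain integers; the sketch below shows one way to do that. `BUDGETS` and `parse_budget` are hypothetical names that do not exist in this repository, and how a budget maps onto a configuration option is not covered here.

```python
# Budget values copied from the table above (hypothetical lookup table).
BUDGETS = {
    "DMC Proprio": "500K",
    "DMC Vision": "1M",
    "Atari 100k": "400K",
    "Crafter": "1M",
    "Minecraft": "100M",
    "Memory Maze": "100M",
}


def parse_budget(text):
    """Convert a string such as '500K' or '1M' into an integer."""
    multipliers = {"K": 1_000, "M": 1_000_000}
    return int(text[:-1]) * multipliers[text[-1].upper()]


if __name__ == "__main__":
    for name, budget in BUDGETS.items():
        print(f"{name}: {parse_budget(budget)}")
```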
## Results
#### DMC Proprio
![dmcproprio](https://github.com/NM512/dreamerv3-torch/assets/70328564/7f6e47a5-3235-4bc4-bef9-15ff96782d5e)
#### DMC Vision
![dmcvision](https://github.com/NM512/dreamerv3-torch/assets/70328564/b710d217-2428-4fa0-8471-55e15ec5aa43)
#### Atari 100k
![atari100k](https://github.com/NM512/dreamerv3-torch/assets/70328564/0da6d899-d91d-44b4-a8c4-d5b37413aa11)
#### Crafter
<img src="https://github.com/NM512/dreamerv3-torch/assets/70328564/2a4d65d3-7e7b-4a95-b0cf-146d978054f0" width="300" height="150" />

## Acknowledgments
This code is heavily inspired by the following works:
- danijar's Dreamer-v3 JAX implementation: https://github.com/danijar/dreamerv3
- danijar's Dreamer-v2 TensorFlow implementation: https://github.com/danijar/dreamerv2
- jsikyoon's Dreamer-v2 PyTorch implementation: https://github.com/jsikyoon/dreamer-torch
- RajGhugare19's Dreamer-v2 PyTorch implementation: https://github.com/RajGhugare19/dreamerv2
- denisyarats's DrQ-v2 original implementation: https://github.com/facebookresearch/drqv2