ChenDRAG a633a6a028
update utils.network (#275)
This is the first commit of 6 commits mentioned in #274, which features

1. Refactor of `Class Net` to support any form of MLP.
2. Enable type check in utils.network.
3. Relative change in docs/test/examples.
4. Move atari-related network to examples/atari/atari_network.py

Co-authored-by: Trinkle23897 <trinkle23897@gmail.com>
2021-01-20 16:54:13 +08:00

Bipedal-Hardcore-SAC

  • Our default choice: remove the done flag penalty. Training then converges to ~280 reward within 100 epochs (10M env steps, 3~4 hours; see the image below).
  • If the done penalty is not removed, training converges much more slowly, taking about 200 epochs (20M env steps) to reach ~200 reward.
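
To illustrate what "removing the done flag penalty" can look like, here is a minimal sketch, not necessarily the code used in this example. It assumes the older Gym step API that returns `(obs, reward, done, info)`. The names `RemoveDonePenalty`, `FallingEnv`, and `episode_return` are hypothetical, and `FallingEnv` is a toy stand-in that mimics a large negative reward on the terminal step rather than the real BipedalWalkerHardcore environment.

```python
class RemoveDonePenalty:
    """Hypothetical wrapper: zero out the reward of the terminal step."""

    def __init__(self, env, rm_done=True):
        self.env = env
        self.rm_done = rm_done  # set to False to keep the original penalty

    def reset(self):
        return self.env.reset()

    def step(self, action):
        # Assumes the old 4-tuple Gym API: (obs, reward, done, info).
        obs, reward, done, info = self.env.step(action)
        if done and self.rm_done:
            reward = 0.0  # drop the penalty attached to the done flag
        return obs, reward, done, info


class FallingEnv:
    """Toy stand-in env: 'falls' on the third step with a -100 reward."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return 0

    def step(self, action):
        self.t += 1
        done = self.t >= 3
        reward = -100.0 if done else 1.0
        return self.t, reward, done, {}


def episode_return(env):
    """Run one episode with a dummy action and sum the rewards."""
    env.reset()
    total, done = 0.0, False
    while not done:
        _, reward, done, _ = env.step(0)
        total += reward
    return total


if __name__ == "__main__":
    print(episode_return(FallingEnv()))                     # penalty kept
    print(episode_return(RemoveDonePenalty(FallingEnv())))  # penalty removed
```

Note that this sketch zeroes the reward of every terminal step, including episodes that end for reasons other than falling; a real wrapper may want to distinguish those cases.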