Tianshou

I needed a policy gradient baseline myself, and one has been requested several times (#497, #374, #440). I used https://github.com/vwxyzjn/cleanrl/blob/master/cleanrl/ppo_atari.py as a reference for hyper-parameters.

Note that using lr=2.5e-4 results in an "Invalid Value" error for two games. Reducing the learning rate fixes this, which is why I set the default lr to 1e-4. See the discussion in https://github.com/DLR-RM/rl-baselines3-zoo/issues/156.

2022-02-11 06:45:06 +08:00

c51

Add C51 algorithm (#266)

2021-01-06 10:17:45 +08:00

dqn

DQN Atari examples (#187)

2020-08-30 05:48:09 +08:00

fqf

Add Fully-parameterized Quantile Function (#376)

2021-06-15 11:59:02 +08:00

iqn

Update IQN results and reward plots (#377)