Jiayi Weng
2a9c9289e5
rename save_fn to save_best_fn to avoid ambiguity ( #575 )
This PR also introduces `tianshou.utils.deprecation` as a unified deprecation wrapper (a minimal sketch of the idea follows this entry).
2022-03-22 04:29:27 +08:00
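A minimal sketch of what such a unified deprecation wrapper could look like; the function names, signatures, and the `resolve_save_best_fn` helper below are illustrative assumptions and are not taken from the actual `tianshou.utils.deprecation` module:

```python
# Hypothetical sketch (assumed API, not the real tianshou.utils.deprecation):
# emit one DeprecationWarning and forward the legacy argument to the new name.
import warnings
from typing import Any, Optional


def deprecation(msg: str) -> None:
    """Emit a DeprecationWarning pointing at the caller's caller."""
    warnings.warn(msg, category=DeprecationWarning, stacklevel=3)


def resolve_save_best_fn(
    save_best_fn: Optional[Any] = None, save_fn: Optional[Any] = None
) -> Optional[Any]:
    """Accept the legacy ``save_fn`` argument while steering users to ``save_best_fn``."""
    if save_fn is not None:
        deprecation("save_fn is deprecated, please use save_best_fn instead.")
        return save_best_fn if save_best_fn is not None else save_fn
    return save_best_fn
```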
Costa Huang
df3d7f582b
Update WandbLogger implementation ( #558 )
* Use `global_step` as the x-axis for wandb
* Use a TensorBoard SummaryWriter as the core logger with `wandb.init(..., sync_tensorboard=True)` (a minimal sketch follows this entry)
* Update all atari examples with wandb
Co-authored-by: Jiayi Weng <trinkle23897@gmail.com>
2022-03-07 06:40:47 +08:00
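A minimal sketch of the logging pattern described in this PR, assuming `wandb` and TensorBoard are installed and the user is logged in to wandb; the project name, log directory, and dummy metric are placeholders, not values from the PR:

```python
# Sketch: wandb mirrors everything written to the TensorBoard SummaryWriter,
# so both backends share global_step as the x-axis.
import wandb
from torch.utils.tensorboard import SummaryWriter

run = wandb.init(project="my-project", sync_tensorboard=True)  # mirror TB events to wandb
writer = SummaryWriter(log_dir="log/atari_example")            # TensorBoard writer as the core

for global_step in range(0, 1000, 100):
    # placeholder metric; real training code would log returns, losses, etc.
    writer.add_scalar("train/reward", global_step * 0.01, global_step)

writer.close()
run.finish()
```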
Chengqi Duan
23fbc3b712
upgrade gym version to >=0.21, fix related CI and update examples/atari ( #534 )
Co-authored-by: Jiayi Weng <trinkle23897@gmail.com>
2022-02-25 07:40:33 +08:00
Yi Su
d29188ee77
update atari ppo slots ( #529 )
2022-02-13 04:04:21 +08:00
Yi Su
40289b8b0e
Add atari ppo example ( #523 )
I needed a policy gradient baseline myself, and it has been requested several times (#497, #374, #440). I used https://github.com/vwxyzjn/cleanrl/blob/master/cleanrl/ppo_atari.py as a reference for the hyper-parameters.
Note that using lr=2.5e-4 results in an "Invalid Value" error for 2 games; the fix is to reduce the learning rate, which is why I set the default lr to 1e-4 (see the snippet after this entry). See the discussion in https://github.com/DLR-RM/rl-baselines3-zoo/issues/156.
2022-02-11 06:45:06 +08:00
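Illustrative only: a sketch of how the safer 1e-4 default learning rate described above might be exposed in an argparse-based atari script; the argument name and the stand-in network are assumptions, not copied from the merged example:

```python
# Sketch: lower default learning rate (1e-4 instead of cleanrl's 2.5e-4)
# to avoid the "Invalid Value" failures reported for two games.
import argparse

import torch


def get_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    parser.add_argument("--lr", type=float, default=1e-4)  # assumed flag name
    return parser.parse_args()


if __name__ == "__main__":
    args = get_args()
    net = torch.nn.Linear(4, 2)  # stand-in for the actual actor-critic network
    optim = torch.optim.Adam(net.parameters(), lr=args.lr)
    print(optim)
```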