Tianshou

Author	SHA1	Message	Date
Jiayi Weng	2a9c9289e5	rename save_fn to save_best_fn to avoid ambiguity (#575 ) This PR also introduces `tianshou.utils.deprecation` for a unified deprecation wrapper.	2022-03-22 04:29:27 +08:00
Costa Huang	df3d7f582b	Update WandbLogger implementation (#558 ) * Use `global_step` as the x-axis for wandb * Use Tensorboard SummaryWritter as core with `wandb.init(..., sync_tensorboard=True)` * Update all atari examples with wandb Co-authored-by: Jiayi Weng <trinkle23897@gmail.com>	2022-03-07 06:40:47 +08:00
Chengqi Duan	23fbc3b712	upgrade gym version to >=0.21, fix related CI and update examples/atari (#534 ) Co-authored-by: Jiayi Weng <trinkle23897@gmail.com>	2022-02-25 07:40:33 +08:00
Yi Su	d29188ee77	update atari ppo slots (#529 )	2022-02-13 04:04:21 +08:00
Yi Su	40289b8b0e	Add atari ppo example (#523 ) I needed a policy gradient baseline myself and it has been requested several times (#497, #374, #440). I used https://github.com/vwxyzjn/cleanrl/blob/master/cleanrl/ppo_atari.py as a reference for hyper-parameters. Note that using lr=2.5e-4 will result in "Invalid Value" error for 2 games. The fix is to reduce the learning rate. That's why I set the default lr to 1e-4. See discussion in https://github.com/DLR-RM/rl-baselines3-zoo/issues/156.	2022-02-11 06:45:06 +08:00