diff --git a/README.md b/README.md
index 5037a78..30084ff 100644
--- a/README.md
+++ b/README.md
@@ -280,7 +280,7 @@ If you find Tianshou useful, please cite it in your publications.
 
 ```latex
 @misc{tianshou,
-  author = {Jiayi Weng, Minghao Zhang, Alexis Duburcq, Kaichao You, Dong Yan, Hang Su, Jun Zhu},
+  author = {Jiayi Weng, Huayu Chen, Alexis Duburcq, Kaichao You, Minghao Zhang, Dong Yan, Hang Su, Jun Zhu},
   title = {Tianshou},
   year = {2020},
   publisher = {GitHub},
diff --git a/docs/tutorials/dqn.rst b/docs/tutorials/dqn.rst
index 5c5d547..412051e 100644
--- a/docs/tutorials/dqn.rst
+++ b/docs/tutorials/dqn.rst
@@ -129,8 +129,7 @@ Tianshou provides :func:`~tianshou.trainer.onpolicy_trainer`, :func:`~tianshou.t
         update_per_step=0.1, episode_per_test=100, batch_size=64,
         train_fn=lambda epoch, env_step: policy.set_eps(0.1),
         test_fn=lambda epoch, env_step: policy.set_eps(0.05),
-        stop_fn=lambda mean_rewards: mean_rewards >= env.spec.reward_threshold,
-        logger=None)
+        stop_fn=lambda mean_rewards: mean_rewards >= env.spec.reward_threshold)
     print(f'Finished training! Use {result["duration"]}')
 
 The meaning of each parameter is as follows (full description can be found at :func:`~tianshou.trainer.offpolicy_trainer`):
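
For context, here is a minimal sketch (not part of the diff) of how the tutorial's training call reads once `logger=None` is dropped. The surrounding CartPole-v0 setup, hyper-parameters, and helper objects (`Net`, `DQNPolicy`, the collectors) are assumed to follow docs/tutorials/dqn.rst rather than being defined by this change, so treat the exact values as illustrative.

```python
import gym
import torch
from tianshou.data import Collector, VectorReplayBuffer
from tianshou.env import DummyVectorEnv
from tianshou.policy import DQNPolicy
from tianshou.trainer import offpolicy_trainer
from tianshou.utils.net.common import Net

# Environments: one reference env plus vectorized train/test envs (as in the tutorial).
env = gym.make('CartPole-v0')
train_envs = DummyVectorEnv([lambda: gym.make('CartPole-v0') for _ in range(10)])
test_envs = DummyVectorEnv([lambda: gym.make('CartPole-v0') for _ in range(100)])

# Q-network and DQN policy (hyper-parameters assumed from the tutorial, not this diff).
state_shape = env.observation_space.shape or env.observation_space.n
action_shape = env.action_space.shape or env.action_space.n
net = Net(state_shape=state_shape, action_shape=action_shape, hidden_sizes=[128, 128, 128])
optim = torch.optim.Adam(net.parameters(), lr=1e-3)
policy = DQNPolicy(net, optim, discount_factor=0.9, estimation_step=3, target_update_freq=320)

# Collectors gather transitions from the vectorized envs into a replay buffer.
train_collector = Collector(policy, train_envs, VectorReplayBuffer(20000, 10), exploration_noise=True)
test_collector = Collector(policy, test_envs, exploration_noise=True)

# The trainer call as it reads after this change: no explicit logger argument is passed.
result = offpolicy_trainer(
    policy, train_collector, test_collector, max_epoch=10,
    step_per_epoch=10000, step_per_collect=10,
    update_per_step=0.1, episode_per_test=100, batch_size=64,
    train_fn=lambda epoch, env_step: policy.set_eps(0.1),
    test_fn=lambda epoch, env_step: policy.set_eps(0.05),
    stop_fn=lambda mean_rewards: mean_rewards >= env.spec.reward_threshold)
print(f'Finished training! Use {result["duration"]}')
```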