From 7bf202f1952b41be7b878a38576956328fa5aafd Mon Sep 17 00:00:00 2001
From: Trinkle23897 <463003665@qq.com>
Date: Wed, 3 Jun 2020 17:04:26 +0800
Subject: [PATCH] polish docs

---
 README.md      | 8 +++++++-
 docs/index.rst | 7 ++++++-
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 586cbb6..8dc783d 100644
--- a/README.md
+++ b/README.md
@@ -30,7 +30,13 @@
 - [Prioritized Experience Replay (PER)](https://arxiv.org/pdf/1511.05952.pdf)
 - [Generalized Advantage Estimator (GAE)](https://arxiv.org/pdf/1506.02438.pdf)
 
-**Tianshou supports parallel workers for all algorithms as well since all of them are reformatted as replay-buffer based algorithms. All of the algorithms support recurrent state representation in actor network (RNN-style training in POMDP). The environment state can be any type (dict, self-defined class, ...). All Q-learning algorithms support n-step returns estimation.**
+Here are Tianshou's other features:
+
+- Elegant framework, using only ~2000 lines of code
+- Support parallel environment sampling for all algorithms
+- Support recurrent state representation in actor network and critic network (RNN-style training for POMDP)
+- Support any type of environment state (e.g. a dict, a self-defined class, ...)
+- Support n-step returns estimation for all Q-learning based algorithms
 
 In Chinese, Tianshou means divinely ordained and is derived to the gift of being born with. Tianshou is a reinforcement learning platform, and the RL algorithm does not learn from humans. So taking "Tianshou" means that there is no teacher to study with, but rather to learn by themselves through constant interaction with the environment.
 
diff --git a/docs/index.rst b/docs/index.rst
index 20b3a8e..c71f9eb 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -20,8 +20,13 @@ Welcome to Tianshou!
 * :class:`~tianshou.data.PrioritizedReplayBuffer` `Prioritized Experience Replay <https://arxiv.org/pdf/1511.05952.pdf>`_
 * :meth:`~tianshou.policy.BasePolicy.compute_episodic_return` `Generalized Advantage Estimator <https://arxiv.org/pdf/1506.02438.pdf>`_
 
+Here are Tianshou's other features:
 
-Tianshou supports parallel workers for all algorithms as well since all of them are reformatted as replay-buffer based algorithms. All of the algorithms support recurrent state representation in actor network (RNN-style training in POMDP). The environment state can be any type (Dict, self-defined class, ...).
+* Elegant framework, using only ~2000 lines of code
+* Support parallel environment sampling for all algorithms
+* Support recurrent state representation in actor network and critic network (RNN-style training for POMDP)
+* Support any type of environment state (e.g. a dict, a self-defined class, ...)
+* Support n-step returns estimation :meth:`~tianshou.policy.BasePolicy.compute_nstep_return` for all Q-learning based algorithms
 
 中文文档位于 https://tianshou.readthedocs.io/zh/latest/
 
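
To make the "parallel environment sampling" and "n-step returns estimation" bullets in this patch concrete, here is a minimal sketch against the Tianshou API of that era (roughly v0.2.x). The names `SubprocVectorEnv`, `Collector`, `ReplayBuffer`, `DQNPolicy`, and its `estimation_step` argument are my recollection of that API and may differ across versions; the CartPole setup and the `QNet` class are illustrative only and not part of the patch.

```python
# Hedged sketch: parallel env sampling + n-step return targets with Tianshou (~v0.2.x).
import gym
import torch
from torch import nn

from tianshou.env import SubprocVectorEnv        # parallel environment sampling
from tianshou.data import Collector, ReplayBuffer
from tianshou.policy import DQNPolicy


class QNet(nn.Module):
    """Small MLP Q-network for CartPole (illustrative)."""
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, action_dim),
        )

    def forward(self, obs, state=None, info={}):
        # Tianshou passes observations as numpy arrays; convert to a tensor.
        if not isinstance(obs, torch.Tensor):
            obs = torch.tensor(obs, dtype=torch.float32)
        return self.model(obs), state


# 8 CartPole instances sampled in parallel subprocesses.
train_envs = SubprocVectorEnv([lambda: gym.make('CartPole-v0') for _ in range(8)])

env = gym.make('CartPole-v0')
net = QNet(env.observation_space.shape[0], env.action_space.n)
optim = torch.optim.Adam(net.parameters(), lr=1e-3)

# estimation_step=3 asks the Q-learning policy to build 3-step return targets.
policy = DQNPolicy(net, optim, discount_factor=0.99, estimation_step=3)
policy.set_eps(0.1)  # epsilon-greedy exploration while collecting

# The collector gathers transitions from all 8 envs into one replay buffer.
collector = Collector(policy, train_envs, ReplayBuffer(size=20000))
collector.collect(n_step=1000)
```

The same collector/buffer pattern is what lets every algorithm in the list above use parallel sampling: the policy only ever sees batches drawn from the buffer, regardless of how many environments fed it.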