.. Tianshou documentation master file, created by
   sphinx-quickstart on Sat Mar 28 15:58:19 2020.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to Tianshou!
====================

**Tianshou** (`天授 <https://baike.baidu.com/item/%E5%A4%A9%E6%8E%88>`_) is a reinforcement learning platform based on pure PyTorch. Unlike existing reinforcement learning libraries, which are mainly based on TensorFlow and tend to have many nested classes, unfriendly APIs, or slow speed, Tianshou provides a fast framework and a Pythonic API for building deep reinforcement learning agents. The supported algorithms and interfaces include:

* :class:`~tianshou.policy.PGPolicy` `Policy Gradient <https://papers.nips.cc/paper/1713-policy-gradient-methods-for-reinforcement-learning-with-function-approximation.pdf>`_
* :class:`~tianshou.policy.DQNPolicy` `Deep Q-Network <https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf>`_
* :class:`~tianshou.policy.DQNPolicy` `Double DQN <https://arxiv.org/pdf/1509.06461.pdf>`_ with n-step returns
* :class:`~tianshou.policy.A2CPolicy` `Advantage Actor-Critic <https://openai.com/blog/baselines-acktr-a2c/>`_
* :class:`~tianshou.policy.DDPGPolicy` `Deep Deterministic Policy Gradient <https://arxiv.org/pdf/1509.02971.pdf>`_
* :class:`~tianshou.policy.PPOPolicy` `Proximal Policy Optimization <https://arxiv.org/pdf/1707.06347.pdf>`_
* :class:`~tianshou.policy.TD3Policy` `Twin Delayed DDPG <https://arxiv.org/pdf/1802.09477.pdf>`_
* :class:`~tianshou.policy.SACPolicy` `Soft Actor-Critic <https://arxiv.org/pdf/1812.05905.pdf>`_
* :class:`~tianshou.policy.ImitationPolicy` Imitation Learning
* :class:`~tianshou.data.PrioritizedReplayBuffer` `Prioritized Experience Replay <https://arxiv.org/pdf/1511.05952.pdf>`_
* :meth:`~tianshou.policy.BasePolicy.compute_episodic_return` `Generalized Advantage Estimator <https://arxiv.org/pdf/1506.02438.pdf>`_

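For readers unfamiliar with the Generalized Advantage Estimator listed above, the following is a minimal pure-Python sketch of the textbook recursion. It is not Tianshou's implementation of ``compute_episodic_return``; the function name ``gae_advantages`` and all of its parameters are hypothetical and chosen only for illustration.

```python
def gae_advantages(rewards, values, dones, last_value=0.0,
                   gamma=0.99, gae_lambda=0.95):
    """Illustrative GAE over one trajectory (hypothetical helper, not Tianshou API).

    rewards[t], values[t], dones[t] describe step t; last_value is the value
    estimate of the state that follows the final step.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the advantage of the next step.
    for t in reversed(range(len(rewards))):
        next_value = values[t + 1] if t + 1 < len(rewards) else last_value
        # A terminal step cuts off both the bootstrap and the recursion.
        not_done = 0.0 if dones[t] else 1.0
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        running = delta + gamma * gae_lambda * not_done * running
        advantages[t] = running
    return advantages
```

With ``gae_lambda=1`` and zero value estimates this reduces to the discounted return-to-go, and with ``gae_lambda=0`` it reduces to the one-step TD error.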

Here are Tianshou's other features:

* Elegant framework, using only ~2000 lines of code
* Support parallel environment sampling for all algorithms: :ref:`parallel_sampling`
* Support recurrent state representation in actor network and critic network (RNN-style training for POMDP): :ref:`rnn_training`
* Support any type of environment state (e.g. a dict, a self-defined class, ...): :ref:`self_defined_env`
* Support customized training process: :ref:`customize_training`
* Support n-step return estimation via :meth:`~tianshou.policy.BasePolicy.compute_nstep_return` for all Q-learning based algorithms

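To make the n-step return feature above concrete, here is a small pure-Python sketch of how an n-step target can be formed from a trajectory. It is an illustration under stated assumptions, not the code behind ``compute_nstep_return``: the name ``nstep_targets`` and its arguments are hypothetical, and ``bootstrap_values[t]`` is assumed to hold a value estimate of the state reached after step ``t`` (for example, a target-network maximum over actions).

```python
def nstep_targets(rewards, dones, bootstrap_values, gamma=0.99, n_step=3):
    """Illustrative n-step targets (hypothetical helper, not Tianshou API)."""
    size = len(rewards)
    targets = []
    for t in range(size):
        ret, discount = 0.0, 1.0
        last = t
        terminated = False
        # Accumulate up to n_step discounted rewards, stopping at episode end.
        for k in range(t, min(t + n_step, size)):
            ret += discount * rewards[k]
            discount *= gamma
            last = k
            if dones[k]:
                terminated = True
                break
        # Bootstrap from the value estimate only if the episode did not end.
        if not terminated:
            ret += discount * bootstrap_values[last]
        targets.append(ret)
    return targets
```

Near the end of the stored trajectory the sketch simply uses fewer than ``n_step`` rewards before bootstrapping; a real replay buffer would need its own rule for that boundary.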

The Chinese documentation is available at https://tianshou.readthedocs.io/zh/latest/


Installation
------------

Tianshou is currently hosted on `PyPI <https://pypi.org/project/tianshou/>`_. You can install Tianshou with the following command (requires Python >= 3.6):
::

    pip3 install tianshou

You can also install the newest version from GitHub:
::

    pip3 install git+https://github.com/thu-ml/tianshou.git@master


If you use Anaconda or Miniconda, you can install Tianshou with the following commands:
::

    # create a new virtualenv and install pip, change the env name if you like
    conda create -n myenv pip
    # activate the environment
    conda activate myenv
    # install tianshou
    pip install tianshou


After installation, open your Python console and type
::

    import tianshou as ts
    print(ts.__version__)

If no error occurs, you have successfully installed Tianshou.

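If the installation or import does fail, one thing worth ruling out is the interpreter version, since the command above expects Python >= 3.6. The check below is a small sketch; ``python_is_supported`` is a hypothetical helper written for this page, not part of Tianshou.

```python
import sys


def python_is_supported(version_info=sys.version_info, minimum=(3, 6)):
    """Return True if the (major, minor) version meets the minimum.

    Hypothetical helper for illustration; the (3, 6) default mirrors the
    Python >= 3.6 requirement stated in the installation instructions.
    """
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    print("Python version OK:", python_is_supported())
```

Passing an explicit tuple such as ``python_is_supported((3, 5))`` lets you check a version other than the running interpreter.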

Tianshou is still under development. You can also check out the documentation for the stable version at `tianshou.readthedocs.io/en/stable/ <https://tianshou.readthedocs.io/en/stable/>`_.


.. toctree::
   :maxdepth: 1
   :caption: Tutorials

   tutorials/dqn
   tutorials/concepts
   tutorials/trick
   tutorials/cheatsheet


.. toctree::
   :maxdepth: 1
   :caption: API Docs

   api/tianshou.data
   api/tianshou.env
   api/tianshou.policy
   api/tianshou.trainer
   api/tianshou.exploration
   api/tianshou.utils


.. toctree::
   :maxdepth: 1
   :caption: Community

   contributing
   contributor


Indices and tables
------------------

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`