.. Tianshou documentation master file, created by
   sphinx-quickstart on Sat Mar 28 15:58:19 2020.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to Tianshou!
====================

**Tianshou** (`天授 <https://baike.baidu.com/item/%E5%A4%A9%E6%8E%88>`_) is a reinforcement learning platform based on pure PyTorch. Unlike existing reinforcement learning libraries, which are mainly based on TensorFlow and tend to have many nested classes, unfriendly APIs, or slow execution, Tianshou provides a fast framework and a Pythonic API for building deep reinforcement learning agents. The supported algorithms include:

* :class:`~tianshou.policy.PGPolicy` `Policy Gradient <https://papers.nips.cc/paper/1713-policy-gradient-methods-for-reinforcement-learning-with-function-approximation.pdf>`_
* :class:`~tianshou.policy.DQNPolicy` `Deep Q-Network <https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf>`_
* :class:`~tianshou.policy.DQNPolicy` `Double DQN <https://arxiv.org/pdf/1509.06461.pdf>`_ with n-step returns
* :class:`~tianshou.policy.A2CPolicy` `Advantage Actor-Critic <https://openai.com/blog/baselines-acktr-a2c/>`_
* :class:`~tianshou.policy.DDPGPolicy` `Deep Deterministic Policy Gradient <https://arxiv.org/pdf/1509.02971.pdf>`_
* :class:`~tianshou.policy.PPOPolicy` `Proximal Policy Optimization <https://arxiv.org/pdf/1707.06347.pdf>`_
* :class:`~tianshou.policy.TD3Policy` `Twin Delayed DDPG <https://arxiv.org/pdf/1802.09477.pdf>`_
* :class:`~tianshou.policy.SACPolicy` `Soft Actor-Critic <https://arxiv.org/pdf/1812.05905.pdf>`_
* :class:`~tianshou.policy.ImitationPolicy` Imitation Learning
* :class:`~tianshou.data.PrioritizedReplayBuffer` `Prioritized Experience Replay <https://arxiv.org/pdf/1511.05952.pdf>`_
* :meth:`~tianshou.policy.BasePolicy.compute_episodic_return` `Generalized Advantage Estimator <https://arxiv.org/pdf/1506.02438.pdf>`_

Tianshou also offers the following features:

* Elegant framework, using only ~2000 lines of code
* Supports parallel environment sampling for all algorithms: :ref:`parallel_sampling`
* Supports recurrent state representation in the actor and critic networks (RNN-style training for POMDPs): :ref:`rnn_training`
* Supports any type of environment state (e.g. a dict, a self-defined class, ...): :ref:`self_defined_env`
* Supports a customized training process: :ref:`customize_training`
* Supports n-step return estimation (:meth:`~tianshou.policy.BasePolicy.compute_nstep_return`) for all Q-learning-based algorithms
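
To illustrate the n-step return estimation mentioned in the list above, here is a minimal pure-Python sketch of the general technique. It is not Tianshou's implementation: the function ``nstep_return`` and its parameters are hypothetical names chosen for this example, and ``bootstrap_values[t]`` is assumed to hold a value estimate for the state reached after step ``t``.

```python
from typing import List, Sequence


def nstep_return(
    rewards: Sequence[float],
    dones: Sequence[bool],
    bootstrap_values: Sequence[float],
    gamma: float = 0.99,
    n_step: int = 3,
) -> List[float]:
    """Compute an n-step return target for every step of a trajectory.

    Hypothetical helper for illustration only; it is not part of Tianshou.
    """
    horizon = len(rewards)
    returns = []
    for t in range(horizon):
        g, discount = 0.0, 1.0
        last, terminated = t, False
        # Accumulate up to n_step discounted rewards, stopping at episode end.
        for k in range(t, min(t + n_step, horizon)):
            g += discount * rewards[k]
            discount *= gamma
            last = k
            if dones[k]:
                terminated = True
                break
        # Bootstrap from the value estimate only if the episode did not end.
        if not terminated:
            g += discount * bootstrap_values[last]
        returns.append(g)
    return returns
```

With ``n_step=1`` this reduces to the usual one-step temporal-difference target. Near the end of the trajectory the window is simply truncated to the steps that are available.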

The Chinese documentation is available at https://tianshou.readthedocs.io/zh/latest/

Installation
------------

Tianshou is currently hosted on `PyPI <https://pypi.org/project/tianshou/>`_. You can install it with the following command (requires Python >= 3.6):
::

    pip3 install tianshou

You can also install the newest version from GitHub:
::

    pip3 install git+https://github.com/thu-ml/tianshou.git@master

If you use Anaconda or Miniconda, you can install Tianshou with the following commands:
::

    # create a new conda environment with pip; change the env name if you like
    conda create -n myenv pip
    # activate the environment
    conda activate myenv
    # install tianshou
    pip install tianshou

After installation, open a Python console and type
::

    import tianshou as ts
    print(ts.__version__)

If no error occurs, you have successfully installed Tianshou.
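
For a scripted check (for example in a CI job), the same verification can be sketched with the standard library alone. The helper name ``check_install`` is hypothetical and not part of Tianshou; the minimum Python version is taken from the requirement stated above.

```python
import importlib.util
import sys


def check_install(package: str = "tianshou", min_python=(3, 6)) -> dict:
    """Report whether `package` can be found and whether Python is new enough.

    Hypothetical helper for illustration only; it is not part of Tianshou.
    """
    return {
        # Compare only (major, minor) against the required minimum version.
        "python_ok": sys.version_info[:2] >= tuple(min_python),
        # find_spec returns None when a top-level package is not installed.
        "installed": importlib.util.find_spec(package) is not None,
    }


if __name__ == "__main__":
    print(check_install())
```

Unlike a plain ``import``, this does not execute the package, so it only confirms that the package can be located, not that it imports without errors.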

Tianshou is still under development. You can also check out the documentation for the stable version at `tianshou.readthedocs.io/en/stable/ <https://tianshou.readthedocs.io/en/stable/>`_.

.. toctree::
   :maxdepth: 1
   :caption: Tutorials

   tutorials/dqn
   tutorials/concepts
   tutorials/trick
   tutorials/cheatsheet

.. toctree::
   :maxdepth: 1
   :caption: API Docs

   api/tianshou.data
   api/tianshou.env
   api/tianshou.policy
   api/tianshou.trainer
   api/tianshou.exploration
   api/tianshou.utils

.. toctree::
   :maxdepth: 1
   :caption: Community

   contributing
   contributor


Indices and tables
------------------

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`