Docs: added sorting order for autogenerated toc

Michael Panchenko 2023-12-04 13:49:30 +01:00
parent 5af29475e8
commit b12983622b
21 changed files with 11 additions and 11 deletions

docs/.gitignore

@@ -1,3 +1,3 @@
-/api/*
+/03_api/*
 jupyter_execute
 _toc.yml


@@ -308,7 +308,7 @@ Tianshou supports user-defined training code. Here is the code snippet:
 # train policy with a sampled batch data from buffer
 losses = policy.update(64, train_collector.buffer)
-For further usage, you can refer to the :doc:`/tutorials/07_cheatsheet`.
+For further usage, you can refer to the :doc:`/01_tutorials/07_cheatsheet`.
 .. rubric:: References
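To make the quoted snippet concrete, here is a minimal, hedged sketch of a hand-written training loop built around ``policy.update`` (not the library's built-in trainer). It assumes ``policy`` and ``train_collector`` have already been constructed as in the surrounding tutorial; the epoch and step counts are purely hypothetical::

    # Sketch of a user-defined training loop (illustrative only).
    # Assumes `policy` and `train_collector` already exist.
    for epoch in range(10):                    # hypothetical epoch count
        # gather fresh transitions into the collector's replay buffer
        train_collector.collect(n_step=1000)   # hypothetical step budget
        # update the policy with batches of 64 transitions sampled from the
        # buffer, exactly as in the snippet quoted above
        for _ in range(100):                   # hypothetical gradient steps
            losses = policy.update(64, train_collector.buffer)
        print(f"epoch {epoch}: {losses}")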

View File

@@ -339,7 +339,7 @@ Thus, we need a time-related interface for calculating the 2-step return. :meth:
 This code does not consider the done flag, so it may not work very well. It shows two ways to get :math:`s_{t + 2}` from the replay buffer easily in :meth:`~tianshou.policy.BasePolicy.process_fn`.
-For other method, you can check out :doc:`/api/policy/index`. We give the usage of policy class a high-level explanation in :ref:`pseudocode`.
+For other method, you can check out :doc:`/03_api/policy/index`. We give the usage of policy class a high-level explanation in :ref:`pseudocode`.
 Collector
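To spell out the arithmetic behind the 2-step return discussed above, here is a plain-numpy sketch. It is not the Tianshou ``process_fn`` implementation; inside ``process_fn`` one would obtain :math:`s_{t+2}` by advancing the sampled indices through the replay buffer, as described above::

    import numpy as np

    def two_step_return(rew, done, bootstrap, gamma=0.99):
        """G_t = r_t + gamma * r_{t+1} + gamma^2 * V(s_{t+2}), truncated at
        episode ends. `bootstrap` is an estimate of V(s_{t+2}), e.g. a target
        network's max-Q; all inputs are 1-D arrays aligned with the buffer."""
        T = len(rew)
        idx1 = np.minimum(np.arange(T) + 1, T - 1)   # index of step t+1
        g = rew + gamma * rew[idx1] * (1.0 - done)   # stop if done at t
        g += gamma ** 2 * bootstrap * (1.0 - done) * (1.0 - done[idx1])
        return g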
@@ -382,7 +382,7 @@ Trainer
 Once you have a collector and a policy, you can start writing the training method for your RL agent. Trainer, to be honest, is a simple wrapper. It helps you save energy for writing the training loop. You can also construct your own trainer: :ref:`customized_trainer`.
-Tianshou has three types of trainer: :func:`~tianshou.trainer.onpolicy_trainer` for on-policy algorithms such as Policy Gradient, :func:`~tianshou.trainer.offpolicy_trainer` for off-policy algorithms such as DQN, and :func:`~tianshou.trainer.offline_trainer` for offline algorithms such as BCQ. Please check out :doc:`/api/trainer/index` for the usage.
+Tianshou has three types of trainer: :func:`~tianshou.trainer.onpolicy_trainer` for on-policy algorithms such as Policy Gradient, :func:`~tianshou.trainer.offpolicy_trainer` for off-policy algorithms such as DQN, and :func:`~tianshou.trainer.offline_trainer` for offline algorithms such as BCQ. Please check out :doc:`/03_api/trainer/index` for the usage.
 We also provide the corresponding iterator-based trainer classes :class:`~tianshou.trainer.OnpolicyTrainer`, :class:`~tianshou.trainer.OffpolicyTrainer`, :class:`~tianshou.trainer.OfflineTrainer` to facilitate users writing more flexible training logic:
 ::
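Purely as a hedged illustration (not the snippet from the original documentation), a class-based off-policy trainer could be used roughly as follows; the keyword names mirror the documented ``offpolicy_trainer`` arguments but may differ between Tianshou versions, and ``policy`` and the collectors are assumed to exist elsewhere::

    from tianshou.trainer import OffpolicyTrainer

    trainer = OffpolicyTrainer(
        policy=policy,                  # e.g. a DQN-style policy, built elsewhere
        train_collector=train_collector,
        test_collector=test_collector,
        max_epoch=10,
        step_per_epoch=10000,
        step_per_collect=10,
        episode_per_test=10,
        batch_size=64,
        update_per_step=0.1,
    )
    result = trainer.run()              # the class can also be iterated epoch by epoch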


@@ -126,7 +126,7 @@ The figure in the right gives an intuitive comparison among synchronous/asynchro
 .. note::
 The async simulation collector would cause some exceptions when used as
-``test_collector`` in :doc:`/api/trainer/index` (related to
+``test_collector`` in :doc:`/03_api/trainer/index` (related to
 `Issue 700 <https://github.com/thu-ml/tianshou/issues/700>`_). Please use
 sync version for ``test_collector`` instead.
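A hedged sketch of the workaround described in this note: keep the asynchronous collector for training and use a plain synchronous ``Collector`` for evaluation. ``policy``, ``train_envs`` and ``test_envs`` are assumed to exist already, and the buffer sizes are hypothetical::

    from tianshou.data import AsyncCollector, Collector, VectorReplayBuffer

    train_collector = AsyncCollector(
        policy, train_envs,
        VectorReplayBuffer(total_size=20000, buffer_num=len(train_envs)),
    )
    test_collector = Collector(policy, test_envs)  # sync version for testing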
@@ -478,4 +478,4 @@ By constructing a new state ``state_ = (state, agent_id, mask)``, essentially we
 act = policy(state_)
 next_state_, reward = env.step(act)
-Following this idea, we write a tiny example of playing `Tic Tac Toe <https://en.wikipedia.org/wiki/Tic-tac-toe>`_ against a random player by using a Q-learning algorithm. The tutorial is at :doc:`/tutorials/04_tictactoe`.
+Following this idea, we write a tiny example of playing `Tic Tac Toe <https://en.wikipedia.org/wiki/Tic-tac-toe>`_ against a random player by using a Q-learning algorithm. The tutorial is at :doc:`/01_tutorials/04_tictactoe`.
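A framework-agnostic sketch of the state-wrapping idea in the pseudocode above; the ``agent_id`` and ``mask`` keys on ``info`` are assumptions for illustration, and ``policy`` and ``env`` are hypothetical stand-ins::

    # Fold the acting agent's id and its action mask into the observation, so
    # an ordinary single-agent policy can be queried in a multi-agent game.
    obs, info = env.reset()
    done = False
    while not done:
        state_ = {
            "obs": obs,                    # raw board observation
            "agent_id": info["agent_id"],  # whose turn it is (assumed key)
            "mask": info["mask"],          # legal moves for that agent (assumed key)
        }
        act = policy(state_)               # same call as in the pseudocode above
        obs, reward, terminated, truncated, info = env.step(act)
        done = terminated or truncated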


@@ -368,7 +368,7 @@
 "id": "8Oc1p8ud9kcu"
 },
 "source": [
-"Would like to learn more advanced usages of Batch? Feel curious about how data is organized inside the Batch? Check the [documentation](https://tianshou.readthedocs.io/en/master/api/tianshou.data.html) and other [tutorials](https://tianshou.readthedocs.io/en/master/tutorials/batch.html#) for more details."
+"Would like to learn more advanced usages of Batch? Feel curious about how data is organized inside the Batch? Check the [documentation](https://tianshou.readthedocs.io/en/master/03_api/tianshou.data.html) and other [tutorials](https://tianshou.readthedocs.io/en/master/tutorials/batch.html#) for more details."
 ]
 }
 ],
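A small, hedged illustration of how data is organized inside a ``Batch``: a (possibly nested) dict of arrays with attribute access, indexing and concatenation. Exact printing and edge-case behaviour may vary between versions::

    import numpy as np
    from tianshou.data import Batch

    b = Batch(obs=np.zeros((4, 3)), info=Batch(turn=np.arange(4)))
    print(b.obs.shape)      # (4, 3): attribute access to leaf arrays
    print(b[0])             # indexing slices every leaf simultaneously
    b2 = Batch.cat([b, b])  # concatenate along the first dimension
    print(len(b2))          # 8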


@@ -61,19 +61,19 @@ Test by GitHub Actions
 1. Click the ``Actions`` button in your own repo:
-.. image:: _static/images/action1.jpg
+.. image:: ../_static/images/action1.jpg
 :align: center
 2. Click the green button:
-.. image:: _static/images/action2.jpg
+.. image:: ../_static/images/action2.jpg
 :align: center
 3. You will see ``Actions Enabled.`` on the top of html page.
 4. When you push a new commit to your own repo (e.g. ``git push``), it will automatically run the test in this page:
-.. image:: _static/images/action3.png
+.. image:: ../_static/images/action3.png
 :align: center


@@ -52,7 +52,7 @@ Here is Tianshou's other features:
 * Support any type of environment state/action (e.g. a dict, a self-defined class, ...): :ref:`self_defined_env`
 * Support :ref:`customize_training`
 * Support n-step returns estimation :meth:`~tianshou.policy.BasePolicy.compute_nstep_return` and prioritized experience replay :class:`~tianshou.data.PrioritizedReplayBuffer` for all Q-learning based algorithms; GAE, nstep and PER are very fast thanks to numba jit function and vectorized numpy operation
-* Support :doc:`/tutorials/04_tictactoe`
+* Support :doc:`/01_tutorials/04_tictactoe`
 * Support both `TensorBoard <https://www.tensorflow.org/tensorboard>`_ and `W&B <https://wandb.ai/>`_ log tools
 * Support multi-GPU training :ref:`multi_gpu`
 * Comprehensive `unit tests <https://github.com/thu-ml/tianshou/actions>`_, including functional checking, RL pipeline checking, documentation checking, PEP8 code-style checking, and type checking
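A minimal, hedged sketch of the ``PrioritizedReplayBuffer`` mentioned in the feature list above; the ``add``/``sample``/``update_weight`` calls follow the documented buffer interface, but the required batch keys and exact signatures may differ between versions::

    import numpy as np
    from tianshou.data import Batch, PrioritizedReplayBuffer

    buf = PrioritizedReplayBuffer(size=1000, alpha=0.6, beta=0.4)
    # store one transition (keys assumed to follow the gymnasium-style buffer)
    buf.add(Batch(obs=np.zeros(4), act=0, rew=1.0, terminated=False,
                  truncated=False, obs_next=np.ones(4), info={}))
    batch, indices = buf.sample(8)          # importance weights come back in batch.weight
    td_error = np.abs(np.random.randn(len(indices)))  # stand-in for real TD errors
    buf.update_weight(indices, td_error)    # refresh priorities after an update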