update readme

parent 4da857d86e
commit 4f843d3f51

README.md (34 changes)
@@ -13,7 +13,7 @@
[](https://github.com/thu-ml/tianshou/blob/master/LICENSE)
[](https://gitter.im/thu-ml/tianshou?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)

**Tianshou** (天授) is a reinforcement learning platform based on pure PyTorch. Unlike existing reinforcement learning libraries, which are mainly based on TensorFlow, have many nested classes, unfriendly APIs, or run slowly, Tianshou provides a fast framework and a pythonic API for building deep reinforcement learning agents. The supported interface algorithms include:

**Tianshou** ([天授](https://baike.baidu.com/item/%E5%A4%A9%E6%8E%88/9342)) is a reinforcement learning platform based on pure PyTorch. Unlike existing reinforcement learning libraries, which are mainly based on TensorFlow, have many nested classes, unfriendly APIs, or run slowly, Tianshou provides a fast framework and a pythonic API for building deep reinforcement learning agents. The supported interface algorithms include:

- [Policy Gradient (PG)](https://papers.nips.cc/paper/1713-policy-gradient-methods-for-reinforcement-learning-with-function-approximation.pdf)
@@ -242,21 +242,6 @@ You can check out the [documentation](https://tianshou.readthedocs.io) for advan

Tianshou is still under development. More algorithms and features are going to be added, and we always welcome contributions to help make Tianshou better. If you would like to contribute, please check out [CONTRIBUTING.md](https://github.com/thu-ml/tianshou/blob/master/CONTRIBUTING.md).
## Citing Tianshou

If you find Tianshou useful, please cite it in your publications.

```latex
@misc{tianshou,
  author = {Jiayi Weng, Minghao Zhang},
  title = {Tianshou},
  year = {2020},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/thu-ml/tianshou}},
}
```

## TODO

- [x] More examples on [mujoco, atari] benchmark
@@ -267,6 +252,23 @@ If you find Tianshou useful, please cite it in your publications.

- [ ] Multi-agent
- [ ] Distributed training
## Citing Tianshou

If you find Tianshou useful, please cite it in your publications.

```latex
@misc{tianshou,
  author = {Jiayi Weng and Minghao Zhang and Dong Yan and Hang Su and Jun Zhu},
  title = {Tianshou},
  year = {2020},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/thu-ml/tianshou}},
}
```

We would like to thank [TSAIL](http://ml.cs.tsinghua.edu.cn/) and [Institute for Artificial Intelligence, Tsinghua University](http://ai.tsinghua.edu.cn/) for providing such an excellent AI research platform.

## Miscellaneous

Tianshou was previously a reinforcement learning platform based on TensorFlow. You can check out the branch [`priv`](https://github.com/thu-ml/tianshou/tree/priv) for more detail.
docs/_static/images/concepts_arch.png (vendored binary file, not shown; size 16 KiB before, 18 KiB after)
@@ -89,7 +89,7 @@ Data Buffer

>>> batch_data.obs == buf[indice].obs
array([ True, True, True, True])

The :class:`~tianshou.data.ReplayBuffer` is based on ``numpy.ndarray``. Tianshou provides other types of data buffers, such as :class:`~tianshou.data.ListReplayBuffer` (based on list) and :class:`tianshou.data.PrioritizedReplayBuffer` (based on Segment Tree and ``numpy.ndarray``). Check out the API documentation for more detail.

The :class:`~tianshou.data.ReplayBuffer` is based on ``numpy.ndarray``. Tianshou provides other types of data buffers, such as :class:`~tianshou.data.ListReplayBuffer` (based on list) and :class:`~tianshou.data.PrioritizedReplayBuffer` (based on Segment Tree and ``numpy.ndarray``). Check out the API documentation for more detail.
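The changed paragraph only describes Tianshou's buffer classes at a high level. As a rough illustration of what a fixed-size replay buffer does, here is a toy sketch in plain Python. It is not Tianshou's implementation: the name `RingReplayBuffer` and its methods are invented for this example, and Tianshou's real `ReplayBuffer` is backed by preallocated `numpy.ndarray`s rather than a Python list.

```python
import random


class RingReplayBuffer:
    """Toy fixed-size replay buffer (illustrative only, not Tianshou's API)."""

    def __init__(self, size):
        self.size = size
        self._storage = []  # Tianshou uses preallocated numpy arrays instead
        self._index = 0     # next write position; wraps when the buffer is full

    def add(self, obs, act, rew, done, obs_next):
        transition = dict(obs=obs, act=act, rew=rew, done=done, obs_next=obs_next)
        if len(self._storage) < self.size:
            self._storage.append(transition)
        else:
            self._storage[self._index] = transition  # overwrite the oldest entry
        self._index = (self._index + 1) % self.size

    def __len__(self):
        return len(self._storage)

    def sample(self, batch_size):
        # Uniform sampling; a prioritized buffer would instead weight
        # transitions (e.g. by TD error) using a segment tree.
        return random.sample(self._storage, batch_size)


buf = RingReplayBuffer(size=4)
for i in range(6):  # 6 adds into a size-4 buffer: the first 2 get overwritten
    buf.add(obs=i, act=0, rew=1.0, done=False, obs_next=i + 1)
print(len(buf))                                 # 4
print(sorted(t["obs"] for t in buf._storage))   # [2, 3, 4, 5]
```

The wrap-around index is the whole trick: once the buffer is full, new transitions silently replace the oldest ones, which keeps memory bounded during long training runs.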
Policy