update readme
This commit is contained in:
parent 4da857d86e
commit 4f843d3f51

README.md (34 changed lines)
@@ -13,7 +13,7 @@
[](https://github.com/thu-ml/tianshou/blob/master/LICENSE)
[](https://gitter.im/thu-ml/tianshou?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
**Tianshou** (天授) is a reinforcement learning platform based on pure PyTorch. Unlike existing reinforcement learning libraries, which are mainly based on TensorFlow, have many nested classes, unfriendly APIs, or slow speed, Tianshou provides a fast framework and a pythonic API for building deep reinforcement learning agents. The supported interface algorithms include:
**Tianshou** ([天授](https://baike.baidu.com/item/%E5%A4%A9%E6%8E%88/9342)) is a reinforcement learning platform based on pure PyTorch. Unlike existing reinforcement learning libraries, which are mainly based on TensorFlow, have many nested classes, unfriendly APIs, or slow speed, Tianshou provides a fast framework and a pythonic API for building deep reinforcement learning agents. The supported interface algorithms include:
- [Policy Gradient (PG)](https://papers.nips.cc/paper/1713-policy-gradient-methods-for-reinforcement-learning-with-function-approximation.pdf)
@@ -242,21 +242,6 @@ You can check out the [documentation](https://tianshou.readthedocs.io) for advan
Tianshou is still under development. More algorithms and features are going to be added and we always welcome contributions to help make Tianshou better. If you would like to contribute, please check out [CONTRIBUTING.md](https://github.com/thu-ml/tianshou/blob/master/CONTRIBUTING.md).

## Citing Tianshou

If you find Tianshou useful, please cite it in your publications.

```latex
@misc{tianshou,
  author = {Jiayi Weng, Minghao Zhang},
  title = {Tianshou},
  year = {2020},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/thu-ml/tianshou}},
}
```

## TODO
- [x] More examples on [mujoco, atari] benchmark
@@ -267,6 +252,23 @@ If you find Tianshou useful, please cite it in your publications.
- [ ] Multi-agent
- [ ] Distributed training

## Citing Tianshou

If you find Tianshou useful, please cite it in your publications.

```latex
@misc{tianshou,
  author = {Jiayi Weng, Minghao Zhang, Dong Yan, Hang Su, Jun Zhu},
  title = {Tianshou},
  year = {2020},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/thu-ml/tianshou}},
}
```

We would like to thank [TSAIL](http://ml.cs.tsinghua.edu.cn/) and [Institute for Artificial Intelligence, Tsinghua University](http://ai.tsinghua.edu.cn/) for providing such an excellent AI research platform.

## Miscellaneous

Tianshou was previously a reinforcement learning platform based on TensorFlow. You can check out the branch [`priv`](https://github.com/thu-ml/tianshou/tree/priv) for more details.

docs/_static/images/concepts_arch.png (BIN, vendored)
Binary file not shown. Before: 16 KiB; after: 18 KiB.

@@ -89,7 +89,7 @@ Data Buffer
    >>> batch_data.obs == buf[indice].obs
    array([ True,  True,  True,  True])
The :class:`~tianshou.data.ReplayBuffer` is based on ``numpy.ndarray``. Tianshou provides other types of data buffers such as :class:`~tianshou.data.ListReplayBuffer` (based on list) and :class:`tianshou.data.PrioritizedReplayBuffer` (based on Segment Tree and ``numpy.ndarray``). Check out the API documentation for more detail.
The :class:`~tianshou.data.ReplayBuffer` is based on ``numpy.ndarray``. Tianshou provides other types of data buffers such as :class:`~tianshou.data.ListReplayBuffer` (based on list) and :class:`~tianshou.data.PrioritizedReplayBuffer` (based on Segment Tree and ``numpy.ndarray``). Check out the API documentation for more detail.
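As a rough illustration of the buffer workflow described above, here is a minimal sketch. It assumes the ``ReplayBuffer(size=...)`` constructor and the ``add``/``sample`` methods with the keyword arguments used in the 0.2-era tutorials; those exact signatures are an assumption, not something this diff confirms.

```python
from tianshou.data import ReplayBuffer

# A small ring buffer whose storage is backed by numpy.ndarray.
buf = ReplayBuffer(size=20)

# Push a few dummy transitions (obs, act, rew, done, obs_next).
# The keyword names here are assumed from the 0.2-era API.
for i in range(3):
    buf.add(obs=i, act=i, rew=i, done=i, obs_next=i + 1, info={})

# Sampling returns a batch plus the indices of the drawn slots,
# so batch_data.obs should match buf[indice].obs element-wise,
# as in the doctest above.
batch_data, indice = buf.sample(batch_size=4)
print(batch_data.obs == buf[indice].obs)
```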
Policy