Tianshou/tianshou/core/README.md

# TODO
Separate actor and critic. (Important: this needs our attention in the near term.)
# policy
2017-11-18 09:37:15 +08:00
YongRen
### base, stochastic
Follow `OnehotCategorical` to write a Gaussian policy; it can live in the same file, `stochastic.py`.
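
A rough sketch of what a Gaussian policy might look like, in plain Python so it runs without the framework. The class name `Gaussian`, the `mean_fn` callable, and the `log_prob()` method are assumptions made for illustration; the real class should mirror whatever interface `OnehotCategorical` exposes.

```python
import math
import random


class Gaussian:
    """Hypothetical diagonal-Gaussian policy (illustrative, not the actual class)."""

    def __init__(self, mean_fn, log_std, seed=None):
        # mean_fn: assumed callable mapping an observation to a list of action means.
        # log_std: one log standard deviation per action dimension.
        self.mean_fn = mean_fn
        self.log_std = list(log_std)
        self._rng = random.Random(seed)

    def act(self, observation):
        """Sample one action from N(mean(observation), diag(std^2))."""
        mean = self.mean_fn(observation)
        return [self._rng.gauss(m, math.exp(s)) for m, s in zip(mean, self.log_std)]

    def log_prob(self, observation, action):
        """Log-density of `action` under the policy at `observation`."""
        mean = self.mean_fn(observation)
        total = 0.0
        for a, m, s in zip(action, mean, self.log_std):
            z = (a - m) / math.exp(s)
            total += -0.5 * z * z - s - 0.5 * math.log(2.0 * math.pi)
        return total
```

Keeping `act()` and `log_prob()` on the same object means a loss function can ask the policy for log-probabilities without knowing which distribution is behind it.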
### deterministic
Not sure how to write this yet, but it should at least have an `act()` method to interact with the environment.
Use `QValuePolicy` in `base.py` as a reference; it should have at least the methods listed there.
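
One possible starting point, assuming only the `act()` requirement above. `DeterministicPolicy` and `action_fn` are hypothetical names, and the methods that `QValuePolicy` lists are not reproduced here since they would need to be checked against `base.py`.

```python
class DeterministicPolicy:
    """Hypothetical deterministic policy: the same observation always gives the same action."""

    def __init__(self, action_fn):
        # action_fn: assumed callable mapping an observation to an action,
        # e.g. a wrapper around a network's forward pass.
        self.action_fn = action_fn

    def act(self, observation):
        """Return the action for `observation`, with no sampling involved."""
        return self.action_fn(observation)


# Usage with a toy linear controller standing in for a network.
policy = DeterministicPolicy(lambda obs: [-0.5 * x for x in obs])
action = policy.act([2.0, -4.0])
```

Exploration noise, if needed, could be layered on top of `act()` rather than built into the policy itself.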
# losses
TongzhengRen
These seem to be plain Python functions, though the management of placeholders may require some discussion. They could also be written in a functional form.
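
To illustrate the "plain functions" idea, here is a sketch in which each loss takes its inputs as arguments and returns a scalar. The function names and the choice of losses are assumptions for illustration. The sketch uses Python lists instead of tensors, so it sidesteps the placeholder question rather than answering it.

```python
def policy_gradient_loss(log_probs, advantages):
    """Negative mean of advantage-weighted log-probabilities (REINFORCE-style surrogate)."""
    if len(log_probs) != len(advantages):
        raise ValueError("log_probs and advantages must have the same length")
    weighted = sum(lp * adv for lp, adv in zip(log_probs, advantages))
    return -weighted / len(log_probs)


def value_mse_loss(predicted_values, target_values):
    """Mean squared error between predicted and target state values."""
    if len(predicted_values) != len(target_values):
        raise ValueError("predicted_values and target_values must have the same length")
    squared = sum((p - t) ** 2 for p, t in zip(predicted_values, target_values))
    return squared / len(predicted_values)
```

Because the functions hold no state, whoever calls them decides where the inputs come from, which leaves the placeholder management open for the discussion mentioned above.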