Tianshou/tianshou/core/README.md

591 B

#TODO:

Separate actor and critic. (Important, we need to focus on that recently)

policy

YongRen

base, stochastic

follow OnehotCategorical to write Gaussian, can be in the same file as stochastic.py

deterministic

not sure how to write, but should at least have act() method to interact with environment

referencing QValuePolicy in base.py, should have at least the listed methods.

losses

TongzhengRen

seems to be direct python functions. Though the management of placeholders may require some discussion. also may write it in a functional form.