# policy

YongRen

### base, stochastic

Follow OnehotCategorical when writing the Gaussian policy; it can live in the same file (stochastic.py).
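A rough sketch of what a Gaussian policy might look like. The interface of OnehotCategorical is not described in these notes, so everything here is an assumption: the class name `GaussianPolicy`, the `mean_fn` callable, the `log_std` parameter, and the `act()` / `log_prob()` method names are hypothetical, and plain Python lists stand in for whatever tensor type the project uses.

```python
import math
import random


class GaussianPolicy:
    """Hypothetical diagonal-Gaussian policy (illustrative, not the project's class)."""

    def __init__(self, mean_fn, log_std, seed=None):
        # mean_fn: assumed callable mapping an observation to a list of action means.
        self.mean_fn = mean_fn
        # log_std: one log standard deviation per action dimension.
        self.log_std = list(log_std)
        self._rng = random.Random(seed)

    def act(self, observation):
        """Sample one action from N(mean, std^2), independently per dimension."""
        mean = self.mean_fn(observation)
        return [self._rng.gauss(m, math.exp(s)) for m, s in zip(mean, self.log_std)]

    def log_prob(self, observation, action):
        """Log-density of `action` under the diagonal Gaussian."""
        mean = self.mean_fn(observation)
        total = 0.0
        for a, m, s in zip(action, mean, self.log_std):
            z = (a - m) / math.exp(s)
            total += -0.5 * z * z - s - 0.5 * math.log(2.0 * math.pi)
        return total
```

Keeping `act()` and `log_prob()` as the only public methods would let the discrete and continuous policies be swapped without touching the loss code, if OnehotCategorical exposes something similar.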

### deterministic

The design is still open, but it should at least provide an act() method for interacting with the environment.

DQN should provide an efficient argmax_{actions}() method so that it can be used as a value network.
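One possible shape for the deterministic case, as a sketch only. It assumes a discrete action space and a hypothetical `q_fn` callable that returns one Q-value per action; the names `GreedyQPolicy`, `values()` and `argmax_action()` are made up for illustration and are not part of any existing code.

```python
class GreedyQPolicy:
    """Hypothetical deterministic policy that acts greedily on a Q-function."""

    def __init__(self, q_fn):
        # q_fn: assumed callable mapping an observation to a list of Q-values,
        # one entry per discrete action.
        self.q_fn = q_fn

    def values(self, observation):
        """Expose the Q-values so the same object can serve as a value network."""
        return list(self.q_fn(observation))

    def argmax_action(self, observation):
        """Index of the highest-valued action (the first one on ties)."""
        q = self.values(observation)
        return max(range(len(q)), key=q.__getitem__)

    def act(self, observation):
        """Deterministic action used when interacting with the environment."""
        return self.argmax_action(observation)
```

Usage would be along the lines of `GreedyQPolicy(q_fn).act(observation)`; exploration (for example epsilon-greedy) is deliberately left out of this sketch.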



# losses

TongzhengRen

These appear to be plain Python functions, though how placeholders are managed needs some discussion. They could also be written in a functional form.
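A minimal sketch of losses as plain functions, assuming the caller passes in the values (in the real code, presumably tensors fed from placeholders) rather than the loss creating them itself. The function names, argument names and the choice of a Huber loss are illustrative assumptions, and lists of floats stand in for tensors.

```python
from functools import partial


def policy_gradient_loss(log_probs, advantages):
    """Vanilla policy-gradient surrogate: -mean(log_prob * advantage)."""
    if len(log_probs) != len(advantages):
        raise ValueError("log_probs and advantages must have the same length")
    total = sum(lp * adv for lp, adv in zip(log_probs, advantages))
    return -total / len(log_probs)


def huber_td_loss(q_values, targets, delta=1.0):
    """Mean Huber loss on the TD error (q_value - target)."""
    if len(q_values) != len(targets):
        raise ValueError("q_values and targets must have the same length")
    total = 0.0
    for q, t in zip(q_values, targets):
        err = abs(q - t)
        if err <= delta:
            total += 0.5 * err * err            # quadratic near zero
        else:
            total += delta * (err - 0.5 * delta)  # linear in the tails
    return total / len(q_values)


# "Functional form": bind hyperparameters once, then pass the loss around.
dqn_loss = partial(huber_td_loss, delta=0.5)
```

Because the functions only take their inputs as arguments, the question of who owns the placeholders can be settled separately from the loss definitions.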