# policy
YongRen
### base, stochastic
Follow `OnehotCategorical` to write a Gaussian policy; it can live in the same file, `stochastic.py`.
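
A rough, framework-free sketch of what such a Gaussian policy might look like. The interface of `OnehotCategorical` is not shown in these notes, so the class name `Gaussian`, the `mean_fn` and `log_std` arguments, and the `act()` / `log_prob()` methods are all assumptions made for illustration, not the existing API.

```python
import math
import random


class Gaussian:
    """Hypothetical diagonal Gaussian policy; all names are illustrative only."""

    def __init__(self, mean_fn, log_std, seed=None):
        # mean_fn (assumed) maps an observation to a list of per-dimension means.
        self.mean_fn = mean_fn
        # log_std holds one log standard deviation per action dimension.
        self.log_std = list(log_std)
        self._rng = random.Random(seed)

    def act(self, observation):
        # Sample each action dimension independently from N(mean, std^2).
        means = self.mean_fn(observation)
        return [self._rng.gauss(m, math.exp(s))
                for m, s in zip(means, self.log_std)]

    def log_prob(self, observation, action):
        # Sum of the per-dimension Gaussian log densities.
        means = self.mean_fn(observation)
        total = 0.0
        for a, m, s in zip(action, means, self.log_std):
            var = math.exp(2.0 * s)
            total += -0.5 * ((a - m) ** 2 / var + math.log(2.0 * math.pi)) - s
        return total
```

In a real implementation the mean (and possibly the log standard deviation) would come from a network rather than a plain callable; the sketch only shows the sampling and log-density pieces a stochastic policy would likely need.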
### deterministic
Not sure yet how to write this, but it should at least have an `act()` method to interact with the environment.

DQN should have an effective `argmax_{actions}()` method, for use as a value network.
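
One possible shape for this, as a minimal sketch only: a deterministic policy that wraps a Q-function over discrete actions and exposes both `act()` and an argmax over actions. The class name `GreedyQPolicy`, the method name `argmax_actions`, and the assumption that `q_fn(observation)` returns one Q-value per action are all hypothetical.

```python
class GreedyQPolicy:
    """Hypothetical deterministic policy wrapping a Q-function (illustrative names)."""

    def __init__(self, q_fn, num_actions):
        # q_fn(observation) is assumed to return one Q-value per discrete action.
        self.q_fn = q_fn
        self.num_actions = num_actions

    def argmax_actions(self, observation):
        # Return (best_action, best_value); ties resolve to the lowest index.
        q_values = self.q_fn(observation)
        best = max(range(self.num_actions), key=lambda a: q_values[a])
        return best, q_values[best]

    def act(self, observation):
        # Deterministic action: the greedy action under the Q-function.
        return self.argmax_actions(observation)[0]
```

Returning the maximizing value together with the action lets the same call serve both action selection and bootstrapped targets, which is one reading of "use as a value network".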
# losses
TongzhengRen
These seem to be plain Python functions, though the management of placeholders may require some discussion. They could also be written in a functional form.
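
To illustrate the "plain function" idea, here is a minimal sketch in which the loss receives all of its inputs as arguments and creates no placeholders itself, leaving that to the caller. The function name `vanilla_policy_gradient` and its arguments are hypothetical, and plain Python lists stand in for tensors.

```python
def vanilla_policy_gradient(log_probs, advantages):
    """Hypothetical loss: negative mean of log-probability times advantage."""
    if len(log_probs) != len(advantages):
        raise ValueError("log_probs and advantages must have the same length")
    # Weight each sampled action's log-probability by its advantage estimate.
    weighted = [lp * adv for lp, adv in zip(log_probs, advantages)]
    # Negate so that minimizing the loss increases the expected advantage.
    return -sum(weighted) / len(weighted)
```

With this shape, who owns the placeholders stays an open question for the caller, which matches the discussion point above.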