coding style
parent d43e0fe311
commit a8a12f1083
13
README.md
@@ -41,15 +41,16 @@ Tianshou(天授) is a reinforcement learning platform. The following image illus

<img src="https://github.com/sproblvem/tianshou/blob/master/docs/figures/go.png" height="150"/> <img src="https://github.com/sproblvem/tianshou/blob/master/docs/figures/reversi.jpg" height="150"/> <img src="https://github.com/sproblvem/tianshou/blob/master/docs/figures/warzone.jpg" height="150"/>


## About coding style

You can follow the [Google Python Style Guide](https://google.github.io/styleguide/pyguide.html).

All files should be named with lowercase letters and underscores.

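As an illustration of the file naming rule above, here is a minimal sketch of a check that a module file name uses only lowercase letters, digits and underscores. The helper `is_valid_module_name` and its regular expression are hypothetical and not part of Tianshou; the sketch also assumes that special files such as `__init__.py` are handled separately.

```python
import re

# Hypothetical helper, not part of Tianshou: encodes the naming rule
# "lowercase letters and underscores" for Python module files.
_SNAKE_CASE_MODULE = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*\.py")


def is_valid_module_name(file_name: str) -> bool:
    """Return True if file_name is a snake_case name ending in .py."""
    return _SNAKE_CASE_MODULE.fullmatch(file_name) is not None


if __name__ == "__main__":
    for name in ["policy_wrapper.py", "Batch.py", "replay-buffer.py"]:
        print(name, is_valid_module_name(name))
```

A check like this cannot tell whether an abbreviation is widely understood, so that part still needs code review.
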
## TODO
Parallelize search-based methods.

`Please write comments.`

`Please do not use abbreviations unless they are widely understood. (e.g. "adv" can stand for advantage or adversarial, so please use the full name instead)`

`Please name modules formally. (e.g. use lower case and "_"; I think a module called "Batch" is terrible)`

YongRen: Policy Wrapper, in order of Gaussian, DQN and DDPG

TongzhengRen: losses, in order of PPO, PG, DQN and DDPG, with management of placeholders

@@ -23,3 +23,7 @@ def KL_diff(pi, pi_old):
    kloldnew = pi_old.pd.kl(pi.pd)
    meankl = U.mean(kloldnew)
    return meankl


def vanilla_policy_gradient():
    pass

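The hunk above averages a per-state KL divergence between the old and new policy. The implementations of `pd.kl` and `U.mean` are not shown in this diff, so the following is only a sketch under stated assumptions: both policies are diagonal Gaussians, and `pi_old.pd.kl(pi.pd)` means KL(old || new). The function names `diagonal_gaussian_kl` and `mean_kl` are hypothetical.

```python
import math


def diagonal_gaussian_kl(mean_old, std_old, mean_new, std_new):
    """KL(old || new) for two diagonal Gaussians, summed over dimensions.

    Assumption: each argument is a sequence of floats of equal length.
    """
    total = 0.0
    for m_old, s_old, m_new, s_new in zip(mean_old, std_old, mean_new, std_new):
        total += (
            math.log(s_new / s_old)
            + (s_old ** 2 + (m_old - m_new) ** 2) / (2.0 * s_new ** 2)
            - 0.5
        )
    return total


def mean_kl(old_batch, new_batch):
    """Average KL over a batch of states.

    Each batch is a list of (mean, std) pairs, one pair per state.
    """
    kls = [
        diagonal_gaussian_kl(mean_old, std_old, mean_new, std_new)
        for (mean_old, std_old), (mean_new, std_new) in zip(old_batch, new_batch)
    ]
    return sum(kls) / len(kls)
```

For identical policies the result is zero, and shifting a unit-variance mean by 1 in one dimension gives 0.5.
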