Wenbo Hu
|
336cede197
|
repair komi. add todo for forbid pass:
|
2017-12-20 22:57:58 +08:00 |
|
Wenbo Hu
|
40909fa994
|
forbid pass, if we have other choices
|
2017-12-20 22:10:47 +08:00 |
|
Wenbo Hu
|
0ab38743aa
|
minor revision.
|
2017-12-20 21:52:30 +08:00 |
|
Wenbo Hu
|
8875ad1bf7
|
minor revision
|
2017-12-20 21:40:03 +08:00 |
|
Wenbo Hu
|
818da800e2
|
simulator process a valid set, instead of a single action
|
2017-12-20 21:35:35 +08:00 |
|
Wenbo Hu
|
12f45d9dc6
|
checkpoint
|
2017-12-20 20:12:08 +08:00 |
|
rtz19970824
|
112fd07b13
|
modify the mcts, refactor the network
|
2017-12-20 16:43:42 +08:00 |
|
Dong Yan
|
db40994e11
|
merge Go and GoEnv finally!
|
2017-12-20 01:14:05 +08:00 |
|
Dong Yan
|
0456e0c15e
|
final version before merge Go and GoEnv
|
2017-12-20 00:43:31 +08:00 |
|
Dong Yan
|
afc5dbac5a
|
rearrange the sequence of functions of Go and GoEnv before merging
|
2017-12-20 00:16:24 +08:00 |
|
Dong Yan
|
f8a70183b6
|
fix the copy bug in check_global_isomorphous; refactor code to eliminate side effects
|
2017-12-19 22:57:38 +08:00 |
|
Dong Yan
|
83f9e19fa5
|
merge flatten and deflatten, rename variable for clarity
|
2017-12-19 16:51:50 +08:00 |
|
rtz19970824
|
fae273f219
|
start a random player if checkpoint path is not specified
|
2017-12-19 15:39:31 +08:00 |
|
rtz19970824
|
d7b3b6aba9
|
deflatten debug
|
2017-12-19 15:09:46 +08:00 |
|
Dong Yan
|
e168df5609
|
fix bug in check_global_isomorphous and refactor _is_suicide again
|
2017-12-19 12:00:17 +08:00 |
|
Dong Yan
|
72a9f4823c
|
rename variable for clarity
|
2017-12-19 11:16:17 +08:00 |
|
Dong Yan
|
1a164d4d7d
|
rewrite _is_qi in a more understandable way
|
2017-12-19 00:47:21 +08:00 |
|
Dong Yan
|
243bbaff64
|
Merge branch 'master' of github.com:sproblvem/tianshou
|
2017-12-19 00:16:33 +08:00 |
|
Dong Yan
|
fb511aa76d
|
delete unused parameter of _find_block, and use _find_group to replace _find_block
|
2017-12-19 00:16:21 +08:00 |
|
Tongzheng Ren
|
0e1287b5cb
|
update gitignore
|
2017-12-18 23:34:32 +08:00 |
|
Tongzheng Ren
|
27c1017259
|
add a detailed Chinese Google coding style for convenience
|
2017-12-18 23:32:41 +08:00 |
|
宋世虹
|
d220f7f2a8
|
add comments and todos
|
2017-12-17 13:28:21 +08:00 |
|
宋世虹
|
3624cc9036
|
finished very naive dqn: changed the interface of replay buffer by adding collect and next_batch, but still need refactoring; added implementation of dqn.py, but still need to consider the interface to make it more extensible; slightly refactored the code style of the codebase; more comments and todos will be in the next commit
|
2017-12-17 12:52:00 +08:00 |
|
Dong Yan
|
31199c7d0d
|
0. code refactor, try to merge Go and GoEnv
|
2017-12-16 23:29:11 +08:00 |
|
Dong Yan
|
01c0c2483a
|
check if the network weights exist for every player
|
2017-12-16 14:55:19 +08:00 |
|
Dong Yan
|
d115c586d4
|
start the player server in a more robust way.
|
2017-12-16 14:33:31 +08:00 |
|
Dong Yan
|
4fc50c5f1b
|
merge class strategy with class game. Next, merge Go with GoEnv
|
2017-12-15 22:19:44 +08:00 |
|
rtz19970824
|
d0bdccc25a
|
assign TODO to Haosheng and Tongzheng
|
2017-12-15 14:27:04 +08:00 |
|
rtz19970824
|
cb9540b91c
|
Merge branch 'master' of https://github.com/sproblvem/tianshou
|
2017-12-15 14:24:15 +08:00 |
|
rtz19970824
|
e5bf7a9270
|
implement dqn loss and dpg loss, add TODO for separate actor and critic
|
2017-12-15 14:24:08 +08:00 |
|
Haosheng Zou
|
92deae9f8d
|
minor fix
|
2017-12-14 19:46:38 +08:00 |
|
Haosheng Zou
|
039c8140e2
|
add dqn.py to write
|
2017-12-13 22:43:45 +08:00 |
|
Haosheng Zou
|
7ab211b63c
|
preliminary design of dqn_example, dqn interface. identify the assign of networks
|
2017-12-13 20:47:45 +08:00 |
|
Wenbo Hu
|
657422a4ed
|
Merge pull request #1 from sproblvem/add_rules
Add rules
|
2017-12-13 14:35:39 +08:00 |
|
Wenbo Hu
|
3f3d7b56f5
|
minor indent fix
|
2017-12-12 23:16:50 +08:00 |
|
Wenbo Hu
|
d52ee30259
|
add nearby stones
|
2017-12-12 23:13:31 +08:00 |
|
Wenbo Hu
|
f820aab008
|
change mcts steps
|
2017-12-12 20:37:57 +08:00 |
|
Wenbo Hu
|
848b8f0399
|
minor fix
|
2017-12-12 17:09:26 +08:00 |
|
rtz19970824
|
9791ad386e
|
Merge branch 'master' of https://github.com/sproblvem/tianshou
|
2017-12-12 16:54:52 +08:00 |
|
Wenbo Hu
|
44fbccd380
|
add stone estimation using nearby stones for those UNKNOWNs
|
2017-12-13 00:35:18 +08:00 |
|
rtz19970824
|
e88d651400
|
minor fix on self play
|
2017-12-11 15:56:16 +08:00 |
|
rtz19970824
|
715f7be6a8
|
update the policy
|
2017-12-11 13:38:24 +08:00 |
|
rtz19970824
|
0c4a83f3eb
|
vanilla policy gradient
|
2017-12-11 13:37:27 +08:00 |
|
haosheng
|
88ecaa332d
|
minor fix in core/policy
|
2017-12-11 13:25:22 +08:00 |
|
Dong Yan
|
e3c0478fa0
|
Merge branch 'master' of github.com:sproblvem/tianshou
|
2017-12-10 20:23:30 +08:00 |
|
Dong Yan
|
cacf31657b
|
support self-play between different versions of neural networks
|
2017-12-10 20:23:10 +08:00 |
|
haosheng
|
972044c39d
|
minor fix
|
2017-12-10 17:33:10 +08:00 |
|
haosheng
|
a00b930c2c
|
fix naming and comments of coding style, delete .json
|
2017-12-10 17:23:13 +08:00 |
|
songshshshsh
|
0da31faa94
|
Merge branch 'master' of https://github.com/sproblvem/tianshou
|
2017-12-10 14:58:53 +08:00 |
|
songshshshsh
|
f1a7fd9ee1
|
replay buffer initial commit
|
2017-12-10 14:56:04 +08:00 |
|