137 Commits

Author SHA1 Message Date
Wenbo Hu
0ab38743aa minor revision. 2017-12-20 21:52:30 +08:00
Wenbo Hu
8875ad1bf7 minor revision 2017-12-20 21:40:03 +08:00
Wenbo Hu
818da800e2 simulator process a valid set, instead of a single action 2017-12-20 21:35:35 +08:00
Wenbo Hu
12f45d9dc6 checkpoint 2017-12-20 20:12:08 +08:00
rtz19970824
112fd07b13 modify the mcts, refactor the network 2017-12-20 16:43:42 +08:00
Dong Yan
db40994e11 merge Go and GoEnv finally! 2017-12-20 01:14:05 +08:00
Dong Yan
0456e0c15e final version before merge Go and GoEnv 2017-12-20 00:43:31 +08:00
Dong Yan
afc5dbac5a rearrange the sequence of functions of Go and GoEnv before merging 2017-12-20 00:16:24 +08:00
Dong Yan
f8a70183b6 fix the copy bug in check_global_isomorphous; refactor code to eliminate side effect 2017-12-19 22:57:38 +08:00
Dong Yan
83f9e19fa5 merge flatten and deflatten, rename variable for clarity 2017-12-19 16:51:50 +08:00
rtz19970824
fae273f219 start a random player if checkpoint path is not specified 2017-12-19 15:39:31 +08:00
rtz19970824
d7b3b6aba9 deflatten debug 2017-12-19 15:09:46 +08:00
Dong Yan
e168df5609 fix bug in check_global_isomorphous and refactor _is_suicide again 2017-12-19 12:00:17 +08:00
Dong Yan
72a9f4823c rename variable for clarity 2017-12-19 11:16:17 +08:00
Dong Yan
1a164d4d7d rewrite _is_qi in a more understandable way 2017-12-19 00:47:21 +08:00
Dong Yan
243bbaff64 Merge branch 'master' of github.com:sproblvem/tianshou 2017-12-19 00:16:33 +08:00
Dong Yan
fb511aa76d delete unused parameter of _find_block, and using _find_group to replace _find_block 2017-12-19 00:16:21 +08:00
Tongzheng Ren
0e1287b5cb update gitignore 2017-12-18 23:34:32 +08:00
Tongzheng Ren
27c1017259 add a detailed Chinese google coding style for convenience 2017-12-18 23:32:41 +08:00
宋世虹
d220f7f2a8 add comments and todos 2017-12-17 13:28:21 +08:00
宋世虹
3624cc9036 finished very naive dqn: changed the interface of replay buffer by adding collect and next_batch, but still need refactoring; added implementation of dqn.py, but still need to consider the interface to make it more extensive; slightly refactored the code style of the codebase; more comments and todos will be in the next commit 2017-12-17 12:52:00 +08:00
Dong Yan
31199c7d0d 0. code refactor, try to merge Go and GoEnv 2017-12-16 23:29:11 +08:00
Dong Yan
01c0c2483a check if the network weights exists for every player 2017-12-16 14:55:19 +08:00
Dong Yan
d115c586d4 start the player server in a more robust way. 2017-12-16 14:33:31 +08:00
Dong Yan
4fc50c5f1b merge class strategy with class game. Next, merge Go with GoEnv 2017-12-15 22:19:44 +08:00
rtz19970824
d0bdccc25a assign TODO to Haosheng and Tongzheng 2017-12-15 14:27:04 +08:00
rtz19970824
cb9540b91c Merge branch 'master' of https://github.com/sproblvem/tianshou 2017-12-15 14:24:15 +08:00
rtz19970824
e5bf7a9270 implement dqn loss and dpg loss, add TODO for separate actor and critic 2017-12-15 14:24:08 +08:00
Haosheng Zou
92deae9f8d minor fix 2017-12-14 19:46:38 +08:00
Haosheng Zou
039c8140e2 add dqn.py to write 2017-12-13 22:43:45 +08:00
Haosheng Zou
7ab211b63c preliminary design of dqn_example, dqn interface. identify the assign of networks 2017-12-13 20:47:45 +08:00
Wenbo Hu
657422a4ed Merge pull request #1 from sproblvem/add_rules: Add rules 2017-12-13 14:35:39 +08:00
Wenbo Hu
3f3d7b56f5 minor indent fix 2017-12-12 23:16:50 +08:00
Wenbo Hu
d52ee30259 add nearby stones 2017-12-12 23:13:31 +08:00
Wenbo Hu
f820aab008 change mcts steps 2017-12-12 20:37:57 +08:00
Wenbo Hu
848b8f0399 minor fix 2017-12-12 17:09:26 +08:00
rtz19970824
9791ad386e Merge branch 'master' of https://github.com/sproblvem/tianshou 2017-12-12 16:54:52 +08:00
Wenbo Hu
44fbccd380 add stone estimation using nearby stone for those UNKNOWs 2017-12-13 00:35:18 +08:00
rtz19970824
e88d651400 minor fix on self play 2017-12-11 15:56:16 +08:00
rtz19970824
715f7be6a8 update the policy 2017-12-11 13:38:24 +08:00
rtz19970824
0c4a83f3eb vanilla policy gradient 2017-12-11 13:37:27 +08:00
haosheng
88ecaa332d minor fix in core/policy 2017-12-11 13:25:22 +08:00
Dong Yan
e3c0478fa0 Merge branch 'master' of github.com:sproblvem/tianshou 2017-12-10 20:23:30 +08:00
Dong Yan
cacf31657b supporting self-play between different versions of neural networks 2017-12-10 20:23:10 +08:00
haosheng
972044c39d minor fix 2017-12-10 17:33:10 +08:00
haosheng
a00b930c2c fix naming and comments of coding style, delete .json 2017-12-10 17:23:13 +08:00
songshshshsh
0da31faa94 Merge branch 'master' of https://github.com/sproblvem/tianshou 2017-12-10 14:58:53 +08:00
songshshshsh
f1a7fd9ee1 replay buffer initial commit 2017-12-10 14:56:04 +08:00
rtz19970824
9cda0bec08 coding style 2017-12-10 14:37:29 +08:00
rtz19970824
a8a12f1083 coding style 2017-12-10 14:23:40 +08:00