159 Commits

Author SHA1 Message Date
haoshengzou
04048b7873 fix imports to support both python2 and python3. move contents from __init__.py to leave for work after major development. 2017-12-23 15:36:10 +08:00
haoshengzou
951eed60ed fix imports to support both python2 and python3. move contents from __init__.py to leave for work after major development. 2017-12-23 15:34:44 +08:00
rtz19970824
d42a76f8f3 Merge branch 'master' of https://github.com/sproblvem/tianshou 2017-12-22 13:47:39 +08:00
rtz19970824
ed96268454 Merge branch 'master' of https://github.com/sproblvem/tianshou 2017-12-22 13:47:38 +08:00
rtz19970824
8328153b86 print in the loading process 2017-12-22 13:47:27 +08:00
rtz19970824
a8509ba292 faster the loading 2017-12-22 13:42:53 +08:00
rtz19970824
5f296ce009 merge 2017-12-22 13:31:41 +08:00
rtz19970824
6b3efd7fca modify the training config 2017-12-22 13:30:48 +08:00
rtz19970824
d281ecc6e0 no restrict on saving checkpoints 2017-12-22 13:05:01 +08:00
rtz19970824
2b1285143c debug the training process, initialize a nameserver if no nameserver exists 2017-12-22 13:04:02 +08:00
JialianLee
5c29dad263 An initial version for Reversi 2017-12-22 01:57:48 +08:00
Haosheng Zou
8ba16a8808 Merge remote-tracking branch 'origin/master' 2017-12-22 00:24:06 +08:00
Haosheng Zou
1cc5063007 add value_function (critic). value_function and policy not finished yet. 2017-12-22 00:22:23 +08:00
rtz19970824
6835ec62e1 multi-instance support 2017-12-22 00:04:51 +08:00
rtz19970824
43f6527d8e modify for multi instance 2017-12-21 23:55:31 +08:00
rtz19970824
6bb34afba5 merge conflict 2017-12-21 23:36:57 +08:00
rtz19970824
9ad53de54f implement the training process 2017-12-21 23:30:24 +08:00
Dong Yan
2acb1aab07 eliminate all references of Game class in Go class 2017-12-21 22:48:53 +08:00
rtz19970824
eda7ed07a1 implement data collection and part of training 2017-12-21 21:01:25 +08:00
Wenbo Hu
ced63af18f fixing bug pass parameter 2017-12-21 19:31:51 +08:00
Wenbo Hu
00d2aa86bf repair komi. add todo for forbid pass: 2017-12-20 22:57:58 +08:00
Wenbo Hu
f0d59dab6c forbid pass, if we have other choices 2017-12-20 22:10:47 +08:00
Wenbo Hu
e2c6b96e57 minor revision. 2017-12-20 21:52:30 +08:00
Wenbo Hu
cabbb21968 minor revision 2017-12-20 21:40:03 +08:00
Wenbo Hu
48e95a21ea simulator process a valid set, instead of a single action 2017-12-20 21:35:35 +08:00
Wenbo Hu
50e306368f checkpoint 2017-12-20 20:12:08 +08:00
rtz19970824
7fca90c61b modify the mcts, refactor the network 2017-12-20 16:43:42 +08:00
Dong Yan
c2b46c44e7 merge Go and GoEnv finally! 2017-12-20 01:14:05 +08:00
Dong Yan
d1af137686 final version before merge Go and GoEnv 2017-12-20 00:43:31 +08:00
Dong Yan
2a9d949510 rearrange the sequence of functions of Go and GoEnv before merging 2017-12-20 00:16:24 +08:00
Dong Yan
232204d797 fix the copy bug in check_global_isomorphous; refactor code to eliminate side effect 2017-12-19 22:57:38 +08:00
Dong Yan
fc8114fe35 merge flatten and deflatten, rename variable for clarity 2017-12-19 16:51:50 +08:00
rtz19970824
4a2d8f0003 start a random player if checkpoint path is not specified 2017-12-19 15:39:31 +08:00
rtz19970824
0991fef527 deflatten debug 2017-12-19 15:09:46 +08:00
Dong Yan
4440294c12 fix bug in check_global_isomorphous and refactor _is_suicide again 2017-12-19 12:00:17 +08:00
Dong Yan
99a617a1f0 rename variable for clarity 2017-12-19 11:16:17 +08:00
Dong Yan
6a410384bb rewrite _is_qi in a more understandable way 2017-12-19 00:47:21 +08:00
Dong Yan
14da3200ff Merge branch 'master' of github.com:sproblvem/tianshou 2017-12-19 00:16:33 +08:00
Dong Yan
ea52096713 delete unused parameter of _find_block, and using _find_group to replace _find_block 2017-12-19 00:16:21 +08:00
Tongzheng Ren
6b6c48f122 update gitignore 2017-12-18 23:34:32 +08:00
Tongzheng Ren
75bc2968d2 add a detailed Chinese google coding style for convenience 2017-12-18 23:32:41 +08:00
宋世虹
7693c38f44 add comments and todos 2017-12-17 13:28:21 +08:00
宋世虹
62e2c6582d finished very naive dqn: changed the interface of replay buffer by adding collect and next_batch, but still need refactoring; added implementation of dqn.py, but still need to consider the interface to make it more extensive; slightly refactored the code style of the codebase; more comments and todos will be in the next commit 2017-12-17 12:52:00 +08:00
Dong Yan
e10acf5130 0. code refactor, try to merge Go and GoEnv 2017-12-16 23:29:11 +08:00
Dong Yan
431f551ce9 check if the network weights exists for every player 2017-12-16 14:55:19 +08:00
Dong Yan
b8bdfea8bd start the player server in a more robust way. 2017-12-16 14:33:31 +08:00
Dong Yan
6cb4b02fca merge class strategy with class game. Next, merge Go with GoEnv 2017-12-15 22:19:44 +08:00
rtz19970824
00f599bba3 assign TODO to Haosheng and Tongzheng 2017-12-15 14:27:04 +08:00
rtz19970824
ea541ed559 Merge branch 'master' of https://github.com/sproblvem/tianshou 2017-12-15 14:24:15 +08:00
rtz19970824
0874d5342f implement dqn loss and dpg loss, add TODO for separate actor and critic 2017-12-15 14:24:08 +08:00