368 Commits

Author SHA1 Message Date
rtz19970824
3b222f5edb add an args to intrigue training 2018-01-13 15:59:57 +08:00
rtz19970824
2e8662889f add multi-thread for end-to-end training 2018-01-13 15:57:41 +08:00
rtz19970824
fcaa571b42 add the interface in engine.py 2018-01-12 21:48:01 +08:00
Dong Yan
68cc63144f fix the hash conflict bug 2018-01-12 21:08:07 +08:00
rtz19970824
90ffdcbb1f check the latest checkpoint while self play 2018-01-12 19:16:44 +08:00
rtz19970824
c217aa165d add some error message for better debugging 2018-01-12 17:17:03 +08:00
Dong Yan
e58df65301 fix the async bug between think and do move checking, which was introduced by bobo 2018-01-11 21:00:32 +08:00
Dong Yan
afc55ed9c2 refactor code to avoid memory leak 2018-01-11 17:02:36 +08:00
sproblvem
284cc64c18 Merge pull request #3 from sproblvem/double-network: Double network 2018-01-11 10:55:12 +08:00
Dong Yan
5482815de6 replace two isolated player process by two different set of variables in the tf graph 2018-01-10 23:27:17 +08:00
Dong Yan
f425085e0a fix the tf assign error of copy the trained variable from black to white 2018-01-09 21:16:35 +08:00
rtz19970824
c2775df8e6 modify game.py for multi-player 2018-01-09 20:09:48 +08:00
rtz19970824
eb0ce95919 modify model.py for multi-player 2018-01-09 19:50:37 +08:00
Tongzheng Ren
891c5b1e47 Merge branch 'master' of https://github.com/sproblvem/tianshou 2018-01-08 21:21:08 +08:00
Tongzheng Ren
f2edc4896e modify play.py for avoiding potential bug 2018-01-08 21:19:17 +08:00
rtz19970824
32b7b33ed5 debug: we should estimate our own win rate 2018-01-08 16:19:59 +08:00
JialianLee
8b7b4b6c6b Add dirichlet noise to root prior and add uniform noise to initial Q value 2018-01-05 17:02:19 +08:00
haoshengzou
dfcea74fcf fix memory growth and slowness caused by sess.run(tf.multinomial()), now ppo examples are working OK with slight memory growth (1M/min), which still needs research 2018-01-03 20:32:05 +08:00
haoshengzou
4333ee5d39 ppo_cartpole.py seems to be working with param: bs128, num_ep20, max_time500; manually merged Normal from branch policy_wrapper 2018-01-02 19:40:37 +08:00
haoshengzou
88648f0c4b Merge branch 'master' of https://github.com/sproblvem/tianshou 2017-12-31 15:56:19 +08:00
JialianLee
5849776c9a Modification and doc for unit test 2017-12-29 13:45:53 +08:00
rtz19970824
01f39f40d3 debug for unit test 2017-12-28 19:38:25 +08:00
Wenbo Hu
50e8ea36e8 merge 2017-12-29 03:31:57 +08:00
Wenbo Hu
63a0d32b34 use hash table for check_global_isomorphous 2017-12-29 03:30:09 +08:00
Wenbo Hu
da156ed88e Merge branch 'master' of github.com:sproblvem/tianshou 2017-12-29 03:19:46 +08:00
Wenbo Hu
76ac579056 Merge branch 'master' of github.com:sproblvem/tianshou 2017-12-29 01:05:14 +08:00
rtz19970824
2dfab68efe debug for unit test 2017-12-28 19:28:21 +08:00
JialianLee
4140d8c9d2 Modification on unit test 2017-12-28 17:10:25 +08:00
JialianLee
0352866b1a Modification for game engine 2017-12-28 16:27:28 +08:00
JialianLee
5457e5134e add a unit test 2017-12-28 16:20:44 +08:00
rtz19970824
b699258e76 debug for reversi 2017-12-28 15:55:07 +08:00
Dong Yan
08b6649fea test next_action.next_state in MCTS 2017-12-28 15:52:31 +08:00
Dong Yan
47676993fd solve the performance bottleneck by only hashing the last board 2017-12-28 01:16:24 +08:00
Dong Yan
affd0319e2 rewrite the selection function of UCTNode to return the action node instead of returning the state node and next action 2017-12-27 21:11:40 +08:00
Dong Yan
d48982d59e move evaluator from action node to mcts 2017-12-27 20:49:54 +08:00
rtz19970824
0a160065aa Merge branch 'master' of https://github.com/sproblvem/tianshou 2017-12-27 19:54:52 +08:00
rtz19970824
f2291efc72 check exists when save data 2017-12-27 19:54:36 +08:00
JialianLee
8d102d249f Modification for backpropagation process 2017-12-27 18:55:00 +08:00
Dong Yan
9f60984973 remove type_conversion function 2017-12-27 14:08:34 +08:00
Dong Yan
a1f6044cba rewrite selection function of ActionNode for clarity, add and delete some notes 2017-12-27 11:43:04 +08:00
Dong Yan
c788b253fb show the stdout of player.py for debugging 2017-12-27 01:04:09 +08:00
Dong Yan
7f0565a5f6 variable rename and delete redundant code 2017-12-26 22:19:10 +08:00
Dong Yan
0c3ff3bf37 delete unused code 2017-12-26 19:29:35 +08:00
Dong Yan
029ab199f4 add softmax for mcts root node 2017-12-26 16:47:24 +08:00
Dong Yan
8f508c790b add role for mcts debug 2017-12-26 15:07:15 +08:00
Dong Yan
aa6b5434c6 add debug info for mcts and add softmax for the prior 2017-12-26 14:46:14 +08:00
rtz19970824
725fc2c04e pass the checkpoint path to the model 2017-12-26 13:17:46 +08:00
rtz19970824
76f641a0f1 minor fixed 2017-12-25 16:51:44 +08:00
rtz19970824
76f6a0c470 merge conflict 2017-12-25 16:42:08 +08:00
rtz19970824
4379f4c0fd modify play.py for better experience 2017-12-25 16:40:38 +08:00