Tianshou

Author	SHA1	Message	Date
rtz19970824	3b222f5edb	add an args to intrigue training	2018-01-13 15:59:57 +08:00
rtz19970824	2e8662889f	add multi-thread for end-to-end training	2018-01-13 15:57:41 +08:00
rtz19970824	fcaa571b42	add the interface in engine.py	2018-01-12 21:48:01 +08:00
Dong Yan	68cc63144f	fix the hash conflict bug	2018-01-12 21:08:07 +08:00
rtz19970824	90ffdcbb1f	check the latest checkpoint while self play	2018-01-12 19:16:44 +08:00
rtz19970824	c217aa165d	add some error message for better debugging	2018-01-12 17:17:03 +08:00
Dong Yan	e58df65301	fix the async bug between think and do move checking, which was introduced by bobo	2018-01-11 21:00:32 +08:00
Dong Yan	afc55ed9c2	refactor code to avoid memory leak	2018-01-11 17:02:36 +08:00
sproblvem	284cc64c18	Merge pull request #3 from sproblvem/double-network Double network	2018-01-11 10:55:12 +08:00
Dong Yan	5482815de6	replace two isolated player process by two different set of variables in the tf graph	2018-01-10 23:27:17 +08:00
Dong Yan	f425085e0a	fix the tf assign error of copy the trained variable from black to white	2018-01-09 21:16:35 +08:00
rtz19970824	c2775df8e6	modify game.py for multi-player	2018-01-09 20:09:48 +08:00
rtz19970824	eb0ce95919	modify model.py for multi-player	2018-01-09 19:50:37 +08:00
Tongzheng Ren	891c5b1e47	Merge branch 'master' of https://github.com/sproblvem/tianshou	2018-01-08 21:21:08 +08:00
Tongzheng Ren	f2edc4896e	modify play.py for avoiding potential bug	2018-01-08 21:19:17 +08:00
rtz19970824	32b7b33ed5	debug: we should estimate our own win rate	2018-01-08 16:19:59 +08:00
JialianLee	8b7b4b6c6b	Add dirichlet noise to root prior and add uniform noise to initial Q value	2018-01-05 17:02:19 +08:00
haoshengzou	dfcea74fcf	fix memory growth and slowness caused by sess.run(tf.multinomial()), now ppo examples are working OK with slight memory growth (1M/min), which still needs research	2018-01-03 20:32:05 +08:00
haoshengzou	4333ee5d39	ppo_cartpole.py seems to be working with param: bs128, num_ep20, max_time500; manually merged Normal from branch policy_wrapper	2018-01-02 19:40:37 +08:00
haoshengzou	88648f0c4b	Merge branch 'master' of https://github.com/sproblvem/tianshou	2017-12-31 15:56:19 +08:00
JialianLee	5849776c9a	Modification and doc for unit test	2017-12-29 13:45:53 +08:00
rtz19970824	01f39f40d3	debug for unit test	2017-12-28 19:38:25 +08:00
Wenbo Hu	50e8ea36e8	merge	2017-12-29 03:31:57 +08:00
Wenbo Hu	63a0d32b34	use hash table for check_global_isomorphous	2017-12-29 03:30:09 +08:00
Wenbo Hu	da156ed88e	Merge branch 'master' of github.com:sproblvem/tianshou	2017-12-29 03:19:46 +08:00
Wenbo Hu	76ac579056	Merge branch 'master' of github.com:sproblvem/tianshou	2017-12-29 01:05:14 +08:00
rtz19970824	2dfab68efe	debug for unit test	2017-12-28 19:28:21 +08:00
JialianLee	4140d8c9d2	Modification on unit test	2017-12-28 17:10:25 +08:00
JialianLee	0352866b1a	Modification for game engine	2017-12-28 16:27:28 +08:00
JialianLee	5457e5134e	add a unit test	2017-12-28 16:20:44 +08:00
rtz19970824	b699258e76	debug for reversi	2017-12-28 15:55:07 +08:00
Dong Yan	08b6649fea	test next_action.next_state in MCTS	2017-12-28 15:52:31 +08:00
Dong Yan	47676993fd	solve the performance bottleneck by only hashing the last board	2017-12-28 01:16:24 +08:00
Dong Yan	affd0319e2	rewrite the selection function of UCTNode to return the action node instead of returning the state node and next action	2017-12-27 21:11:40 +08:00
Dong Yan	d48982d59e	move evaluator from action node to mcts	2017-12-27 20:49:54 +08:00
rtz19970824	0a160065aa	Merge branch 'master' of https://github.com/sproblvem/tianshou	2017-12-27 19:54:52 +08:00
rtz19970824	f2291efc72	check exists when save data	2017-12-27 19:54:36 +08:00
JialianLee	8d102d249f	Modification for backpropagation process	2017-12-27 18:55:00 +08:00
Dong Yan	9f60984973	remove type_conversion function	2017-12-27 14:08:34 +08:00
Dong Yan	a1f6044cba	rewrite selection function of ActionNode for clarity, add and delete some notes	2017-12-27 11:43:04 +08:00
Dong Yan	c788b253fb	show the stdout of player.py for debugging	2017-12-27 01:04:09 +08:00
Dong Yan	7f0565a5f6	variable rename and delete redundant code	2017-12-26 22:19:10 +08:00
Dong Yan	0c3ff3bf37	delete unused code	2017-12-26 19:29:35 +08:00
Dong Yan	029ab199f4	add softmax for mcts root node	2017-12-26 16:47:24 +08:00
Dong Yan	8f508c790b	add role for mcts debug	2017-12-26 15:07:15 +08:00
Dong Yan	aa6b5434c6	add debug info for mcts and add softmax for the prior	2017-12-26 14:46:14 +08:00
rtz19970824	725fc2c04e	pass the checkpoint path to the model	2017-12-26 13:17:46 +08:00
rtz19970824	76f641a0f1	minor fixed	2017-12-25 16:51:44 +08:00
rtz19970824	76f6a0c470	merge conflict	2017-12-25 16:42:08 +08:00
rtz19970824	4379f4c0fd	modify play.py for better experience	2017-12-25 16:40:38 +08:00

368 Commits