Tianshou

Author	SHA1	Message	Date
JialianLee	282b496e49	Modification for reversi.py	2017-12-23 15:43:45 +08:00
Dong Yan	0c3ebacc75	Merge branch 'master' of github.com:sproblvem/tianshou	2017-12-23 13:05:38 +08:00
Dong Yan	e63338ab01	connect reversi with game	2017-12-23 13:05:25 +08:00
JialianLee	1eb46774c2	small modification	2017-12-23 09:47:08 +08:00
rtz19970824	5b044c9a0c	merge	2017-12-22 17:17:50 +08:00
rtz19970824	d8c0eae6a3	implement a stochastic sample training method	2017-12-22 17:16:44 +08:00
Dong Yan	51b8e7fc18	move the unit test of is_eye into go.py	2017-12-22 15:44:44 +08:00
JialianLee	7964064242	Modification for reversi	2017-12-22 15:26:47 +08:00
rtz19970824	a062c7610c	Merge branch 'master' of https://github.com/sproblvem/tianshou	2017-12-22 13:47:39 +08:00
rtz19970824	e72fd52913	Merge branch 'master' of https://github.com/sproblvem/tianshou	2017-12-22 13:47:38 +08:00
rtz19970824	e8a10f189e	print in the loading process	2017-12-22 13:47:27 +08:00
rtz19970824	97161f37ef	faster the loading	2017-12-22 13:42:53 +08:00
rtz19970824	9824cf8bef	merge	2017-12-22 13:31:41 +08:00
rtz19970824	d151f71ee3	modify the training config	2017-12-22 13:30:48 +08:00
rtz19970824	62a241e763	no restrict on saving checkpoints	2017-12-22 13:05:01 +08:00
rtz19970824	e75883a5fb	debug the training process, initialize a nameserver if no nameserver exists	2017-12-22 13:04:02 +08:00
JialianLee	7f1191ef02	An initial version for Reversi	2017-12-22 01:57:48 +08:00
Haosheng Zou	b32418d11a	Merge remote-tracking branch 'origin/master'	2017-12-22 00:24:06 +08:00
Haosheng Zou	6611d948dd	add value_function (critic). value_function and policy not finished yet.	2017-12-22 00:22:23 +08:00
rtz19970824	a61c1f136a	multi-instance support	2017-12-22 00:04:51 +08:00
rtz19970824	a20255249c	modify for multi instance	2017-12-21 23:55:31 +08:00
rtz19970824	ff2ebd49c1	merge conflict	2017-12-21 23:36:57 +08:00
rtz19970824	c11eccbc90	implement the training process	2017-12-21 23:30:24 +08:00
Dong Yan	c3e9e55b24	eliminate all references of Game class in Go class	2017-12-21 22:48:53 +08:00
rtz19970824	2dad8e4020	implement data collection and part of training	2017-12-21 21:01:25 +08:00
Wenbo Hu	1e2567c174	fixing bug pass parameterg	2017-12-21 19:31:51 +08:00
Wenbo Hu	336cede197	repair komi. add todo for forbid pass:	2017-12-20 22:57:58 +08:00
Wenbo Hu	40909fa994	forbid pass, if we have other choices	2017-12-20 22:10:47 +08:00
Wenbo Hu	0ab38743aa	minor revision.	2017-12-20 21:52:30 +08:00
Wenbo Hu	8875ad1bf7	minor revision	2017-12-20 21:40:03 +08:00
Wenbo Hu	818da800e2	simulator process a valid set, instead of a single action	2017-12-20 21:35:35 +08:00
Wenbo Hu	12f45d9dc6	checkpoint	2017-12-20 20:12:08 +08:00
rtz19970824	112fd07b13	modify the mcts, refactor the network	2017-12-20 16:43:42 +08:00
Dong Yan	db40994e11	merge Go and GoEnv finallygit status!	2017-12-20 01:14:05 +08:00
Dong Yan	0456e0c15e	final version before merge Go and GoEnv	2017-12-20 00:43:31 +08:00
Dong Yan	afc5dbac5a	rearrange the sequence of functions of Go and GoEnv before merging	2017-12-20 00:16:24 +08:00
Dong Yan	f8a70183b6	fix the copy bug in check_global_isomorphous; refactor code to eliminate side effect	2017-12-19 22:57:38 +08:00
Dong Yan	83f9e19fa5	merge flatten and deflatten, rename variable for clarity	2017-12-19 16:51:50 +08:00
rtz19970824	fae273f219	start a random player if checkpoint path is not specified	2017-12-19 15:39:31 +08:00
rtz19970824	d7b3b6aba9	deflatten debug	2017-12-19 15:09:46 +08:00
Dong Yan	e168df5609	fix bug in check_global_isomorphous and refactor _is_suicide again	2017-12-19 12:00:17 +08:00
Dong Yan	72a9f4823c	rename variable for clarity	2017-12-19 11:16:17 +08:00
Dong Yan	1a164d4d7d	rewrite _is_qi in a more understandable way	2017-12-19 00:47:21 +08:00
Dong Yan	243bbaff64	Merge branch 'master' of github.com:sproblvem/tianshou	2017-12-19 00:16:33 +08:00
Dong Yan	fb511aa76d	delete unused parameter of _find_block, and using _find_group to replace _find_block	2017-12-19 00:16:21 +08:00
Tongzheng Ren	0e1287b5cb	update gitignore	2017-12-18 23:34:32 +08:00
Tongzheng Ren	27c1017259	add a detailed Chinese google coding style for convenience	2017-12-18 23:32:41 +08:00
宋世虹	d220f7f2a8	add comments and todos	2017-12-17 13:28:21 +08:00
宋世虹	3624cc9036	finished very naive dqn: changed the interface of replay buffer by adding collect and next_batch, but still need refactoring; added implementation of dqn.py, but still need to consider the interface to make it more extensive; slightly refactored the code style of the codebase; more comments and todos will be in the next commit	2017-12-17 12:52:00 +08:00
Dong Yan	31199c7d0d	0. code refactor, try to merge Go and GoEnv	2017-12-16 23:29:11 +08:00

165 Commits