Tianshou

Author	SHA1	Message	Date
rtz19970824	ed96268454	Merge branch 'master' of https://github.com/sproblvem/tianshou	2017-12-22 13:47:38 +08:00
rtz19970824	8328153b86	print in the loading process	2017-12-22 13:47:27 +08:00
rtz19970824	a8509ba292	faster the loading	2017-12-22 13:42:53 +08:00
rtz19970824	5f296ce009	merge	2017-12-22 13:31:41 +08:00
rtz19970824	6b3efd7fca	modify the training config	2017-12-22 13:30:48 +08:00
rtz19970824	d281ecc6e0	no restrict on saving checkpoints	2017-12-22 13:05:01 +08:00
rtz19970824	2b1285143c	debug the training process, initialize a nameserver if no nameserver exists	2017-12-22 13:04:02 +08:00
JialianLee	5c29dad263	An initial version for Reversi	2017-12-22 01:57:48 +08:00
Haosheng Zou	8ba16a8808	Merge remote-tracking branch 'origin/master'	2017-12-22 00:24:06 +08:00
Haosheng Zou	1cc5063007	add value_function (critic). value_function and policy not finished yet.	2017-12-22 00:22:23 +08:00
rtz19970824	6835ec62e1	multi-instance support	2017-12-22 00:04:51 +08:00
rtz19970824	43f6527d8e	modify for multi instance	2017-12-21 23:55:31 +08:00
rtz19970824	6bb34afba5	merge conflict	2017-12-21 23:36:57 +08:00
rtz19970824	9ad53de54f	implement the training process	2017-12-21 23:30:24 +08:00
Dong Yan	2acb1aab07	eliminate all references of Game class in Go class	2017-12-21 22:48:53 +08:00
rtz19970824	eda7ed07a1	implement data collection and part of training	2017-12-21 21:01:25 +08:00
Wenbo Hu	ced63af18f	fixing bug pass parameterg	2017-12-21 19:31:51 +08:00
Wenbo Hu	00d2aa86bf	repair komi. add todo for forbid pass:	2017-12-20 22:57:58 +08:00
Wenbo Hu	f0d59dab6c	forbid pass, if we have other choices	2017-12-20 22:10:47 +08:00
Wenbo Hu	e2c6b96e57	minor revision.	2017-12-20 21:52:30 +08:00
Wenbo Hu	cabbb21968	minor revision	2017-12-20 21:40:03 +08:00
Wenbo Hu	48e95a21ea	simulator process a valid set, instead of a single action	2017-12-20 21:35:35 +08:00
Wenbo Hu	50e306368f	checkpoint	2017-12-20 20:12:08 +08:00
rtz19970824	7fca90c61b	modify the mcts, refactor the network	2017-12-20 16:43:42 +08:00
Dong Yan	c2b46c44e7	merge Go and GoEnv finallygit status!	2017-12-20 01:14:05 +08:00
Dong Yan	d1af137686	final version before merge Go and GoEnv	2017-12-20 00:43:31 +08:00
Dong Yan	2a9d949510	rearrange the sequence of functions of Go and GoEnv before merging	2017-12-20 00:16:24 +08:00
Dong Yan	232204d797	fix the copy bug in check_global_isomorphous; refactor code to eliminate side effect	2017-12-19 22:57:38 +08:00
mcgrady00h	1f011a44ef	add mcts virtual loss version (may have bugs)	2017-12-19 17:04:55 +08:00
Dong Yan	fc8114fe35	merge flatten and deflatten, rename variable for clarity	2017-12-19 16:51:50 +08:00
rtz19970824	4a2d8f0003	start a random player if checkpoint path is not specified	2017-12-19 15:39:31 +08:00
rtz19970824	0991fef527	deflatten debug	2017-12-19 15:09:46 +08:00
Dong Yan	4440294c12	fix bug in check_global_isomorphous and refactor _is_suicide again	2017-12-19 12:00:17 +08:00
Dong Yan	99a617a1f0	rename variable for clarity	2017-12-19 11:16:17 +08:00
Dong Yan	6a410384bb	rewrite _is_qi in a more understandable way	2017-12-19 00:47:21 +08:00
Dong Yan	14da3200ff	Merge branch 'master' of github.com:sproblvem/tianshou	2017-12-19 00:16:33 +08:00
Dong Yan	ea52096713	delete unused parameter of _find_block, and using _find_group to replace _find_block	2017-12-19 00:16:21 +08:00
Tongzheng Ren	6b6c48f122	update gitignore	2017-12-18 23:34:32 +08:00
Tongzheng Ren	75bc2968d2	add a detailed Chinese google coding style for convenience	2017-12-18 23:32:41 +08:00
宋世虹	7693c38f44	add comments and todos	2017-12-17 13:28:21 +08:00
宋世虹	62e2c6582d	finished very naive dqn: changed the interface of replay buffer by adding collect and next_batch, but still need refactoring; added implementation of dqn.py, but still need to consider the interface to make it more extensive; slightly refactored the code style of the codebase; more comments and todos will be in the next commit	2017-12-17 12:52:00 +08:00
Dong Yan	e10acf5130	0. code refactor, try to merge Go and GoEnv	2017-12-16 23:29:11 +08:00
Dong Yan	431f551ce9	check if the network weights exists for every player	2017-12-16 14:55:19 +08:00
Dong Yan	b8bdfea8bd	start the player server in a more robost way.	2017-12-16 14:33:31 +08:00
Dong Yan	6cb4b02fca	merge class strategy with class game. Next, merge Go with GoEnv	2017-12-15 22:19:44 +08:00
rtz19970824	00f599bba3	assign TODO to Haosheng and Tongzheng	2017-12-15 14:27:04 +08:00
rtz19970824	ea541ed559	Merge branch 'master' of https://github.com/sproblvem/tianshou	2017-12-15 14:24:15 +08:00
rtz19970824	0874d5342f	implement dqn loss and dpg loss, add TODO for separate actor and critic	2017-12-15 14:24:08 +08:00
Haosheng Zou	9ed3e7b092	minor fix	2017-12-14 19:46:38 +08:00
Haosheng Zou	f496725437	add dqn.py to write	2017-12-13 22:43:45 +08:00

207 Commits