rtz19970824
|
ed96268454
|
Merge branch 'master' of https://github.com/sproblvem/tianshou
|
2017-12-22 13:47:38 +08:00 |
|
rtz19970824
|
8328153b86
|
print in the loading process
|
2017-12-22 13:47:27 +08:00 |
|
rtz19970824
|
a8509ba292
|
speed up the loading
|
2017-12-22 13:42:53 +08:00 |
|
rtz19970824
|
5f296ce009
|
merge
|
2017-12-22 13:31:41 +08:00 |
|
rtz19970824
|
6b3efd7fca
|
modify the training config
|
2017-12-22 13:30:48 +08:00 |
|
rtz19970824
|
d281ecc6e0
|
no restriction on saving checkpoints
|
2017-12-22 13:05:01 +08:00 |
|
rtz19970824
|
2b1285143c
|
debug the training process, initialize a nameserver if no nameserver exists
|
2017-12-22 13:04:02 +08:00 |
|
JialianLee
|
5c29dad263
|
An initial version for Reversi
|
2017-12-22 01:57:48 +08:00 |
|
Haosheng Zou
|
8ba16a8808
|
Merge remote-tracking branch 'origin/master'
|
2017-12-22 00:24:06 +08:00 |
|
Haosheng Zou
|
1cc5063007
|
add value_function (critic). value_function and policy not finished yet.
|
2017-12-22 00:22:23 +08:00 |
|
rtz19970824
|
6835ec62e1
|
multi-instance support
|
2017-12-22 00:04:51 +08:00 |
|
rtz19970824
|
43f6527d8e
|
modify for multi instance
|
2017-12-21 23:55:31 +08:00 |
|
rtz19970824
|
6bb34afba5
|
merge conflict
|
2017-12-21 23:36:57 +08:00 |
|
rtz19970824
|
9ad53de54f
|
implement the training process
|
2017-12-21 23:30:24 +08:00 |
|
Dong Yan
|
2acb1aab07
|
eliminate all references of Game class in Go class
|
2017-12-21 22:48:53 +08:00 |
|
rtz19970824
|
eda7ed07a1
|
implement data collection and part of training
|
2017-12-21 21:01:25 +08:00 |
|
Wenbo Hu
|
ced63af18f
|
fixing bug pass parameter
|
2017-12-21 19:31:51 +08:00 |
|
Wenbo Hu
|
00d2aa86bf
|
repair komi. add todo for forbidding pass
|
2017-12-20 22:57:58 +08:00 |
|
Wenbo Hu
|
f0d59dab6c
|
forbid pass, if we have other choices
|
2017-12-20 22:10:47 +08:00 |
|
Wenbo Hu
|
e2c6b96e57
|
minor revision.
|
2017-12-20 21:52:30 +08:00 |
|
Wenbo Hu
|
cabbb21968
|
minor revision
|
2017-12-20 21:40:03 +08:00 |
|
Wenbo Hu
|
48e95a21ea
|
simulator processes a valid set instead of a single action
|
2017-12-20 21:35:35 +08:00 |
|
Wenbo Hu
|
50e306368f
|
checkpoint
|
2017-12-20 20:12:08 +08:00 |
|
rtz19970824
|
7fca90c61b
|
modify the mcts, refactor the network
|
2017-12-20 16:43:42 +08:00 |
|
Dong Yan
|
c2b46c44e7
|
merge Go and GoEnv finally
|
2017-12-20 01:14:05 +08:00 |
|
Dong Yan
|
d1af137686
|
final version before merge Go and GoEnv
|
2017-12-20 00:43:31 +08:00 |
|
Dong Yan
|
2a9d949510
|
rearrange the sequence of functions of Go and GoEnv before merging
|
2017-12-20 00:16:24 +08:00 |
|
Dong Yan
|
232204d797
|
fix the copy bug in check_global_isomorphous; refactor code to eliminate side effect
|
2017-12-19 22:57:38 +08:00 |
|
mcgrady00h
|
1f011a44ef
|
add mcts virtual loss version (may have bugs)
|
2017-12-19 17:04:55 +08:00 |
|
Dong Yan
|
fc8114fe35
|
merge flatten and deflatten, rename variable for clarity
|
2017-12-19 16:51:50 +08:00 |
|
rtz19970824
|
4a2d8f0003
|
start a random player if checkpoint path is not specified
|
2017-12-19 15:39:31 +08:00 |
|
rtz19970824
|
0991fef527
|
deflatten debug
|
2017-12-19 15:09:46 +08:00 |
|
Dong Yan
|
4440294c12
|
fix bug in check_global_isomorphous and refactor _is_suicide again
|
2017-12-19 12:00:17 +08:00 |
|
Dong Yan
|
99a617a1f0
|
rename variable for clarity
|
2017-12-19 11:16:17 +08:00 |
|
Dong Yan
|
6a410384bb
|
rewrite _is_qi in a more understandable way
|
2017-12-19 00:47:21 +08:00 |
|
Dong Yan
|
14da3200ff
|
Merge branch 'master' of github.com:sproblvem/tianshou
|
2017-12-19 00:16:33 +08:00 |
|
Dong Yan
|
ea52096713
|
delete unused parameter of _find_block, and use _find_group to replace _find_block
|
2017-12-19 00:16:21 +08:00 |
|
Tongzheng Ren
|
6b6c48f122
|
update gitignore
|
2017-12-18 23:34:32 +08:00 |
|
Tongzheng Ren
|
75bc2968d2
|
add a detailed Chinese google coding style for convenience
|
2017-12-18 23:32:41 +08:00 |
|
宋世虹
|
7693c38f44
|
add comments and todos
|
2017-12-17 13:28:21 +08:00 |
|
宋世虹
|
62e2c6582d
|
finished a very naive dqn: changed the interface of the replay buffer by adding collect and next_batch, but it still needs refactoring; added an implementation of dqn.py, but still need to consider the interface to make it more extensible; slightly refactored the code style of the codebase; more comments and todos will be in the next commit
|
2017-12-17 12:52:00 +08:00 |
|
Dong Yan
|
e10acf5130
|
0. code refactor, try to merge Go and GoEnv
|
2017-12-16 23:29:11 +08:00 |
|
Dong Yan
|
431f551ce9
|
check if the network weights exist for every player
|
2017-12-16 14:55:19 +08:00 |
|
Dong Yan
|
b8bdfea8bd
|
start the player server in a more robust way.
|
2017-12-16 14:33:31 +08:00 |
|
Dong Yan
|
6cb4b02fca
|
merge class strategy with class game. Next, merge Go with GoEnv
|
2017-12-15 22:19:44 +08:00 |
|
rtz19970824
|
00f599bba3
|
assign TODO to Haosheng and Tongzheng
|
2017-12-15 14:27:04 +08:00 |
|
rtz19970824
|
ea541ed559
|
Merge branch 'master' of https://github.com/sproblvem/tianshou
|
2017-12-15 14:24:15 +08:00 |
|
rtz19970824
|
0874d5342f
|
implement dqn loss and dpg loss, add TODO for separate actor and critic
|
2017-12-15 14:24:08 +08:00 |
|
Haosheng Zou
|
9ed3e7b092
|
minor fix
|
2017-12-14 19:46:38 +08:00 |
|
Haosheng Zou
|
f496725437
|
add dqn.py to write
|
2017-12-13 22:43:45 +08:00 |
|