63 Commits

Author SHA1 Message Date
Dong Yan
9f60984973 remove type_conversion function 2017-12-27 14:08:34 +08:00
Dong Yan
a1f6044cba rewrite selection function of ActionNode for clarity, add and delete some notes 2017-12-27 11:43:04 +08:00
Dong Yan
7f0565a5f6 variable rename and delete redundant code 2017-12-26 22:19:10 +08:00
sproblvem
2b24f0760e Merge branch 'master' into mcts_virtual_loss 2017-12-24 21:27:54 +08:00
Dong Yan
89226b449a replace try catch by isinstance collections.Hashable 2017-12-24 20:57:53 +08:00
Dong Yan
f0074aa7ca fix bug of game config and add profiling functions to mcts 2017-12-24 17:43:45 +08:00
mcgrady00h
5aa5dcd191 add comments for mcts with virtual loss 2017-12-24 16:47:43 +08:00
mcgrady00h
8c6f44a015 Merge remote-tracking branch 'origin' into mcts_virtual_loss 2017-12-24 15:49:45 +08:00
mcgrady00h
941284e7b1 Merge remote-tracking branch 'origin' into mcts_virtual_loss 2017-12-24 15:44:30 +08:00
rtz19970824
74504ceb1d debug for go and reversi 2017-12-24 14:40:50 +08:00
Dong Yan
426251e158 add some code for debug and profiling 2017-12-24 01:07:46 +08:00
haoshengzou
b2b2d01d9c Merge remote-tracking branch 'origin/master' 2017-12-23 17:25:37 +08:00
haoshengzou
b21a55dc88 towards policy/value refactor 2017-12-23 17:25:16 +08:00
rtz19970824
3f238864fb minor fixed for mcts, check finish for go 2017-12-23 15:58:06 +08:00
haoshengzou
8c13d8ebe6 Merge remote-tracking branch 'origin/master' 2017-12-23 15:36:44 +08:00
haoshengzou
04048b7873 fix imports to support both python2 and python3. move contents from __init__.py to leave for work after major development. 2017-12-23 15:36:10 +08:00
Dong Yan
b2ef770415 connect reversi with game 2017-12-23 13:05:25 +08:00
mcgrady00h
3b534064bd fix virtual loss bug 2017-12-23 02:48:53 +08:00
Haosheng Zou
8ba16a8808 Merge remote-tracking branch 'origin/master' 2017-12-22 00:24:06 +08:00
Haosheng Zou
1cc5063007 add value_function (critic). value_function and policy not finished yet. 2017-12-22 00:22:23 +08:00
Wenbo Hu
ced63af18f fixing bug pass parameter 2017-12-21 19:31:51 +08:00
Wenbo Hu
f0d59dab6c forbid pass, if we have other choices 2017-12-20 22:10:47 +08:00
Wenbo Hu
e2c6b96e57 minor revision. 2017-12-20 21:52:30 +08:00
Wenbo Hu
48e95a21ea simulator process a valid set, instead of a single action 2017-12-20 21:35:35 +08:00
rtz19970824
7fca90c61b modify the mcts, refactor the network 2017-12-20 16:43:42 +08:00
Dong Yan
232204d797 fix the copy bug in check_global_isomorphous; refactor code to eliminate side effect 2017-12-19 22:57:38 +08:00
mcgrady00h
1f011a44ef add mcts virtual loss version (may have bugs) 2017-12-19 17:04:55 +08:00
Dong Yan
fc8114fe35 merge flatten and deflatten, rename variable for clarity 2017-12-19 16:51:50 +08:00
宋世虹
7693c38f44 add comments and todos 2017-12-17 13:28:21 +08:00
宋世虹
62e2c6582d finished very naive dqn: changed the interface of replay buffer by adding collect and next_batch, but still need refactoring; added implementation of dqn.py, but still need to consider the interface to make it more extensive; slightly refactored the code style of the codebase; more comments and todos will be in the next commit 2017-12-17 12:52:00 +08:00
Dong Yan
e10acf5130 0. code refactor, try to merge Go and GoEnv 2017-12-16 23:29:11 +08:00
Dong Yan
6cb4b02fca merge class strategy with class game. Next, merge Go with GoEnv 2017-12-15 22:19:44 +08:00
rtz19970824
0874d5342f implement dqn loss and dpg loss, add TODO for separate actor and critic 2017-12-15 14:24:08 +08:00
Haosheng Zou
f496725437 add dqn.py to write 2017-12-13 22:43:45 +08:00
Haosheng Zou
72ae304ab3 preliminary design of dqn_example, dqn interface. identify the assign of networks 2017-12-13 20:47:45 +08:00
rtz19970824
0c4a83f3eb vanilla policy gradient 2017-12-11 13:37:27 +08:00
haosheng
a00b930c2c fix naming and comments of coding style, delete .json 2017-12-10 17:23:13 +08:00
songshshshsh
f1a7fd9ee1 replay buffer initial commit 2017-12-10 14:56:04 +08:00
rtz19970824
a8a12f1083 coding style 2017-12-10 14:23:40 +08:00
rtz19970824
18b3b0b850 add some TODO 2017-12-10 13:31:43 +08:00
rtz19970824
03a6880050 Merge branch 'master' of https://github.com/sproblvem/tianshou 2017-12-08 23:41:51 +08:00
rtz19970824
bc49d466d1 minor fixed 2017-12-08 23:41:31 +08:00
haosheng
ff4306ddb9 model-free rl first commit, with ppo_example.py in examples/ and task delegations in ppo_example.py and READMEs 2017-12-08 21:09:23 +08:00
rtz19970824
f9f63e6609 combine gtp and network 2017-12-05 23:17:20 +08:00
rtz19970824
543d876f12 merge gtp 2017-12-04 11:01:49 +08:00
rtz19970824
7a4c5c3c88 minor fixed 2017-12-03 19:16:21 +08:00
rtz19970824
ca0021083f AlphaGo update 2017-11-26 13:36:52 +08:00
rtz19970824
e4e56d17d1 minor fixed 2017-11-21 22:52:17 +08:00
rtz19970824
31beb46563 mcts update 2017-11-21 22:19:52 +08:00
JialianLee
1e07cb1fac modification of docs for mcts 2017-11-18 15:55:14 +08:00