116 Commits

Author SHA1 Message Date
Dong Yan
e10acf5130 0. code refactor, try to merge Go and GoEnv 2017-12-16 23:29:11 +08:00
Dong Yan
431f551ce9 check if the network weights exists for every player 2017-12-16 14:55:19 +08:00
Dong Yan
b8bdfea8bd start the player server in a more robust way. 2017-12-16 14:33:31 +08:00
Dong Yan
6cb4b02fca merge class strategy with class game. Next, merge Go with GoEnv 2017-12-15 22:19:44 +08:00
rtz19970824
00f599bba3 assign TODO to Haosheng and Tongzheng 2017-12-15 14:27:04 +08:00
rtz19970824
ea541ed559 Merge branch 'master' of https://github.com/sproblvem/tianshou 2017-12-15 14:24:15 +08:00
rtz19970824
0874d5342f implement dqn loss and dpg loss, add TODO for separate actor and critic 2017-12-15 14:24:08 +08:00
Haosheng Zou
9ed3e7b092 minor fix 2017-12-14 19:46:38 +08:00
Haosheng Zou
f496725437 add dqn.py to write 2017-12-13 22:43:45 +08:00
Haosheng Zou
72ae304ab3 preliminary design of dqn_example, dqn interface. identify the assign of networks 2017-12-13 20:47:45 +08:00
Wenbo Hu
d280260a46 Merge pull request #1 from sproblvem/add_rules (Add rules) 2017-12-13 14:35:39 +08:00
Wenbo Hu
3f3d7b56f5 minor indent fix 2017-12-12 23:16:50 +08:00
Wenbo Hu
d52ee30259 add nearby stones 2017-12-12 23:13:31 +08:00
Wenbo Hu
f820aab008 change mcts steps 2017-12-12 20:37:57 +08:00
Wenbo Hu
848b8f0399 minor fix 2017-12-12 17:09:26 +08:00
rtz19970824
9791ad386e Merge branch 'master' of https://github.com/sproblvem/tianshou 2017-12-12 16:54:52 +08:00
Wenbo Hu
44fbccd380 add stone estimation using nearby stones for those UNKNOWNs 2017-12-13 00:35:18 +08:00
rtz19970824
e88d651400 minor fixed on self play 2017-12-11 15:56:16 +08:00
rtz19970824
715f7be6a8 update the policy 2017-12-11 13:38:24 +08:00
rtz19970824
0c4a83f3eb vanilla policy gradient 2017-12-11 13:37:27 +08:00
haosheng
88ecaa332d minor fix in core/policy 2017-12-11 13:25:22 +08:00
Dong Yan
e3c0478fa0 Merge branch 'master' of github.com:sproblvem/tianshou 2017-12-10 20:23:30 +08:00
Dong Yan
cacf31657b supporting self-play between different versions of neural networks 2017-12-10 20:23:10 +08:00
haosheng
972044c39d minor fix 2017-12-10 17:33:10 +08:00
haosheng
a00b930c2c fix naming and comments of coding style, delete .json 2017-12-10 17:23:13 +08:00
songshshshsh
0da31faa94 Merge branch 'master' of https://github.com/sproblvem/tianshou 2017-12-10 14:58:53 +08:00
songshshshsh
f1a7fd9ee1 replay buffer initial commit 2017-12-10 14:56:04 +08:00
rtz19970824
9cda0bec08 coding style 2017-12-10 14:37:29 +08:00
rtz19970824
a8a12f1083 coding style 2017-12-10 14:23:40 +08:00
rtz19970824
d43e0fe311 minor fixed 2017-12-10 13:37:38 +08:00
rtz19970824
cb99f6bbbb minor fixed 2017-12-10 13:36:43 +08:00
rtz19970824
8de92378c2 minor fixed 2017-12-10 13:34:07 +08:00
rtz19970824
18b3b0b850 add some TODO 2017-12-10 13:31:43 +08:00
rtz19970824
ec6114edf1 rm ckpts 2017-12-09 21:53:12 +08:00
rtz19970824
0341e0d21e modify 2017-12-09 21:42:52 +08:00
rtz19970824
1ff8252e6d play 2017-12-09 21:41:11 +08:00
rtz19970824
03a6880050 Merge branch 'master' of https://github.com/sproblvem/tianshou 2017-12-08 23:41:51 +08:00
rtz19970824
bc49d466d1 minor fixed 2017-12-08 23:41:31 +08:00
haosheng
60630c9b04 minor fixed 2017-12-08 21:18:29 +08:00
haosheng
ff4306ddb9 model-free rl first commit, with ppo_example.py in examples/ and task delegations in ppo_example.py and READMEs 2017-12-08 21:09:23 +08:00
rtz19970824
453e457452 minor fixed 2017-12-08 18:59:20 +08:00
rtz19970824
8bedac5978 minor fixed 2017-12-08 18:08:15 +08:00
rtz19970824
a381577fc7 minor fixed 2017-12-08 17:06:12 +08:00
rtz19970824
906ced84a3 self play 2017-12-08 17:05:33 +08:00
Dong Yan
b687241a7d minor fix 2017-12-07 21:09:58 +08:00
Dong Yan
142810e95b avoid placing a stone in an eye 2017-12-07 21:05:29 +08:00
rtz19970824
67a9a39e92 connect gtp and gui 2017-12-07 17:51:58 +08:00
rtz19970824
2d9d1ff945 minor fixed 2017-12-05 23:42:18 +08:00
rtz19970824
e9beef46e4 Merge branch 'master' of https://github.com/sproblvem/tianshou 2017-12-05 23:20:24 +08:00
rtz19970824
f9f63e6609 combine gtp and network 2017-12-05 23:17:20 +08:00