JialianLee
|
5849776c9a
|
Modification and doc for unit test
|
2017-12-29 13:45:53 +08:00 |
|
rtz19970824
|
01f39f40d3
|
debug for unit test
|
2017-12-28 19:38:25 +08:00 |
|
JialianLee
|
4140d8c9d2
|
Modification on unit test
|
2017-12-28 17:10:25 +08:00 |
|
JialianLee
|
0352866b1a
|
Modification for game engine
|
2017-12-28 16:27:28 +08:00 |
|
JialianLee
|
5457e5134e
|
add a unit test
|
2017-12-28 16:20:44 +08:00 |
|
Dong Yan
|
08b6649fea
|
test next_action.next_state in MCTS
|
2017-12-28 15:52:31 +08:00 |
|
Dong Yan
|
47676993fd
|
solve the performance bottleneck by only hashing the last board
|
2017-12-28 01:16:24 +08:00 |
|
Dong Yan
|
affd0319e2
|
rewrite the selection function of UCTNode to return the action node instead of returning the state node and next action
|
2017-12-27 21:11:40 +08:00 |
|
Dong Yan
|
d48982d59e
|
move evaluator from action node to mcts
|
2017-12-27 20:49:54 +08:00 |
|
JialianLee
|
8d102d249f
|
Modification for backpropagation process
|
2017-12-27 18:55:00 +08:00 |
|
Dong Yan
|
9f60984973
|
remove type_conversion function
|
2017-12-27 14:08:34 +08:00 |
|
Dong Yan
|
a1f6044cba
|
rewrite selection function of ActionNode for clarity, add and delete some notes
|
2017-12-27 11:43:04 +08:00 |
|
Dong Yan
|
7f0565a5f6
|
variable rename and delete redundant code
|
2017-12-26 22:19:10 +08:00 |
|
sproblvem
|
2b24f0760e
|
Merge branch 'master' into mcts_virtual_loss
|
2017-12-24 21:27:54 +08:00 |
|
Dong Yan
|
89226b449a
|
replace try catch by isinstance collections.Hashable
|
2017-12-24 20:57:53 +08:00 |
|
Dong Yan
|
f0074aa7ca
|
fix bug in game config and add profiling functions to mcts
|
2017-12-24 17:43:45 +08:00 |
|
mcgrady00h
|
5aa5dcd191
|
add comments for mcts with virtual loss
|
2017-12-24 16:47:43 +08:00 |
|
mcgrady00h
|
8c6f44a015
|
Merge remote-tracking branch 'origin' into mcts_virtual_loss
|
2017-12-24 15:49:45 +08:00 |
|
mcgrady00h
|
941284e7b1
|
Merge remote-tracking branch 'origin' into mcts_virtual_loss
|
2017-12-24 15:44:30 +08:00 |
|
rtz19970824
|
74504ceb1d
|
debug for go and reversi
|
2017-12-24 14:40:50 +08:00 |
|
Dong Yan
|
426251e158
|
add some code for debug and profiling
|
2017-12-24 01:07:46 +08:00 |
|
haoshengzou
|
b2b2d01d9c
|
Merge remote-tracking branch 'origin/master'
|
2017-12-23 17:25:37 +08:00 |
|
haoshengzou
|
b21a55dc88
|
towards policy/value refactor
|
2017-12-23 17:25:16 +08:00 |
|
rtz19970824
|
3f238864fb
|
minor fixes for mcts, check finish for go
|
2017-12-23 15:58:06 +08:00 |
|
haoshengzou
|
8c13d8ebe6
|
Merge remote-tracking branch 'origin/master'
|
2017-12-23 15:36:44 +08:00 |
|
haoshengzou
|
04048b7873
|
fix imports to support both python2 and python3. move contents from __init__.py to leave for work after major development.
|
2017-12-23 15:36:10 +08:00 |
|
Dong Yan
|
b2ef770415
|
connect reversi with game
|
2017-12-23 13:05:25 +08:00 |
|
mcgrady00h
|
3b534064bd
|
fix virtual loss bug
|
2017-12-23 02:48:53 +08:00 |
|
Haosheng Zou
|
8ba16a8808
|
Merge remote-tracking branch 'origin/master'
|
2017-12-22 00:24:06 +08:00 |
|
Haosheng Zou
|
1cc5063007
|
add value_function (critic). value_function and policy not finished yet.
|
2017-12-22 00:22:23 +08:00 |
|
Wenbo Hu
|
ced63af18f
|
fixing bug pass parameter
|
2017-12-21 19:31:51 +08:00 |
|
Wenbo Hu
|
f0d59dab6c
|
forbid pass, if we have other choices
|
2017-12-20 22:10:47 +08:00 |
|
Wenbo Hu
|
e2c6b96e57
|
minor revision.
|
2017-12-20 21:52:30 +08:00 |
|
Wenbo Hu
|
48e95a21ea
|
simulator processes a valid set instead of a single action
|
2017-12-20 21:35:35 +08:00 |
|
rtz19970824
|
7fca90c61b
|
modify the mcts, refactor the network
|
2017-12-20 16:43:42 +08:00 |
|
Dong Yan
|
232204d797
|
fix the copy bug in check_global_isomorphous; refactor code to eliminate side effect
|
2017-12-19 22:57:38 +08:00 |
|
mcgrady00h
|
1f011a44ef
|
add mcts virtual loss version (may have bugs)
|
2017-12-19 17:04:55 +08:00 |
|
Dong Yan
|
fc8114fe35
|
merge flatten and deflatten, rename variable for clarity
|
2017-12-19 16:51:50 +08:00 |
|
宋世虹
|
7693c38f44
|
add comments and todos
|
2017-12-17 13:28:21 +08:00 |
|
宋世虹
|
62e2c6582d
|
finished very naive dqn: changed the interface of replay buffer by adding collect and next_batch, but still need refactoring; added implementation of dqn.py, but still need to consider the interface to make it more extensive; slightly refactored the code style of the codebase; more comments and todos will be in the next commit
|
2017-12-17 12:52:00 +08:00 |
|
Dong Yan
|
e10acf5130
|
0. code refactor, try to merge Go and GoEnv
|
2017-12-16 23:29:11 +08:00 |
|
Dong Yan
|
6cb4b02fca
|
merge class strategy with class game. Next, merge Go with GoEnv
|
2017-12-15 22:19:44 +08:00 |
|
rtz19970824
|
0874d5342f
|
implement dqn loss and dpg loss, add TODO for separate actor and critic
|
2017-12-15 14:24:08 +08:00 |
|
Haosheng Zou
|
f496725437
|
add dqn.py to write
|
2017-12-13 22:43:45 +08:00 |
|
Haosheng Zou
|
72ae304ab3
|
preliminary design of dqn_example, dqn interface. identify the assign of networks
|
2017-12-13 20:47:45 +08:00 |
|
rtz19970824
|
0c4a83f3eb
|
vanilla policy gradient
|
2017-12-11 13:37:27 +08:00 |
|
haosheng
|
a00b930c2c
|
fix naming and comments of coding style, delete .json
|
2017-12-10 17:23:13 +08:00 |
|
rtz19970824
|
a8a12f1083
|
coding style
|
2017-12-10 14:23:40 +08:00 |
|
rtz19970824
|
03a6880050
|
Merge branch 'master' of https://github.com/sproblvem/tianshou
|
2017-12-08 23:41:51 +08:00 |
|
rtz19970824
|
bc49d466d1
|
minor fixes
|
2017-12-08 23:41:31 +08:00 |
|