| Author | Commit | Message | Date |
|---|---|---|---|
| haoshengzou | 04048b7873 | fix imports to support both python2 and python3. move contents from `__init__.py` to leave for work after major development. | 2017-12-23 15:36:10 +08:00 |
| Dong Yan | b2ef770415 | connect reversi with game | 2017-12-23 13:05:25 +08:00 |
| mcgrady00h | 3b534064bd | fix virtual loss bug | 2017-12-23 02:48:53 +08:00 |
| Haosheng Zou | 8ba16a8808 | Merge remote-tracking branch 'origin/master' | 2017-12-22 00:24:06 +08:00 |
| Haosheng Zou | 1cc5063007 | add value_function (critic). value_function and policy not finished yet. | 2017-12-22 00:22:23 +08:00 |
| Wenbo Hu | ced63af18f | fixing bug pass parameterg | 2017-12-21 19:31:51 +08:00 |
| Wenbo Hu | f0d59dab6c | forbid pass, if we have other choices | 2017-12-20 22:10:47 +08:00 |
| Wenbo Hu | e2c6b96e57 | minor revision. | 2017-12-20 21:52:30 +08:00 |
| Wenbo Hu | 48e95a21ea | simulator process a valid set, instead of a single action | 2017-12-20 21:35:35 +08:00 |
| rtz19970824 | 7fca90c61b | modify the mcts, refactor the network | 2017-12-20 16:43:42 +08:00 |
| Dong Yan | 232204d797 | fix the copy bug in check_global_isomorphous; refactor code to eliminate side effect | 2017-12-19 22:57:38 +08:00 |
| mcgrady00h | 1f011a44ef | add mcts virtual loss version (may have bugs) | 2017-12-19 17:04:55 +08:00 |
| Dong Yan | fc8114fe35 | merge flatten and deflatten, rename variable for clarity | 2017-12-19 16:51:50 +08:00 |
| 宋世虹 | 7693c38f44 | add comments and todos | 2017-12-17 13:28:21 +08:00 |
| 宋世虹 | 62e2c6582d | finished very naive dqn: changed the interface of replay buffer by adding collect and next_batch, but still need refactoring; added implementation of dqn.py, but still need to consider the interface to make it more extensive; slightly refactored the code style of the codebase; more comments and todos will be in the next commit | 2017-12-17 12:52:00 +08:00 |
| Dong Yan | e10acf5130 | 0. code refactor, try to merge Go and GoEnv | 2017-12-16 23:29:11 +08:00 |
| Dong Yan | 6cb4b02fca | merge class strategy with class game. Next, merge Go with GoEnv | 2017-12-15 22:19:44 +08:00 |
| rtz19970824 | 0874d5342f | implement dqn loss and dpg loss, add TODO for separate actor and critic | 2017-12-15 14:24:08 +08:00 |
| Haosheng Zou | f496725437 | add dqn.py to write | 2017-12-13 22:43:45 +08:00 |
| Haosheng Zou | 72ae304ab3 | preliminary design of dqn_example, dqn interface. identify the assign of networks | 2017-12-13 20:47:45 +08:00 |
| rtz19970824 | 0c4a83f3eb | vanilla policy gradient | 2017-12-11 13:37:27 +08:00 |
| haosheng | a00b930c2c | fix naming and comments of coding style, delete .json | 2017-12-10 17:23:13 +08:00 |
| songshshshsh | f1a7fd9ee1 | replay buffer initial commit | 2017-12-10 14:56:04 +08:00 |
| rtz19970824 | a8a12f1083 | coding style | 2017-12-10 14:23:40 +08:00 |
| rtz19970824 | 18b3b0b850 | add some TODO | 2017-12-10 13:31:43 +08:00 |
| rtz19970824 | 03a6880050 | Merge branch 'master' of https://github.com/sproblvem/tianshou | 2017-12-08 23:41:51 +08:00 |
| rtz19970824 | bc49d466d1 | minor fixed | 2017-12-08 23:41:31 +08:00 |
| haosheng | ff4306ddb9 | model-free rl first commit, with ppo_example.py in examples/ and task delegations in ppo_example.py and READMEs | 2017-12-08 21:09:23 +08:00 |
| rtz19970824 | f9f63e6609 | combine gtp and network | 2017-12-05 23:17:20 +08:00 |
| rtz19970824 | 543d876f12 | merge gtp | 2017-12-04 11:01:49 +08:00 |
| rtz19970824 | 7a4c5c3c88 | minor fixed | 2017-12-03 19:16:21 +08:00 |
| rtz19970824 | ca0021083f | AlphaGo update | 2017-11-26 13:36:52 +08:00 |
| rtz19970824 | e4e56d17d1 | minor fixed | 2017-11-21 22:52:17 +08:00 |
| rtz19970824 | 31beb46563 | mcts update | 2017-11-21 22:19:52 +08:00 |
| JialianLee | 1e07cb1fac | modification of docs for mcts | 2017-11-18 15:55:14 +08:00 |
| JialianLee | 3795c24be9 | Merge branch 'master' of github.com:sproblvem/tianshou | 2017-11-18 15:50:54 +08:00 |
| JialianLee | d9a50569f5 | modification to docs of mcts | 2017-11-18 09:37:15 +08:00 |
| Dong Yan | 31bfc07dc2 | mcts update | 2017-11-17 19:35:20 +08:00 |
| Tongzheng Ren | c5c2cdf0f3 | mcts update | 2017-11-17 15:09:07 +08:00 |
| Dong Yan | 767fd4ea20 | mcts | 2017-11-16 17:05:54 +08:00 |
| Dong Yan | df57fdb411 | mcts framework | 2017-11-16 13:23:26 +08:00 |
| Dong Yan | 30427055d1 | mcts framework | 2017-11-16 13:21:27 +08:00 |
| JialianLee | 2f1035d899 | update mcts docs | 2017-11-16 12:38:51 +08:00 |
| Tongzheng Ren | 6d9c369a65 | architecture design patch two | 2017-11-06 15:24:34 +08:00 |
| Tongzheng Ren | e6cad0bce9 | architecture design patch | 2017-11-06 15:17:55 +08:00 |
| Tongzheng Ren | 595e62e111 | architecture design | 2017-11-06 15:15:44 +08:00 |
| Tongzheng Ren | 4e4a7b74c1 | update the optimizer README | 2017-11-06 14:01:29 +08:00 |
| Tongzheng Ren | 48b830eda6 | TODO: policy optimizer | 2017-11-06 13:50:35 +08:00 |