haoshengzou | b21a55dc88 | towards policy/value refactor | 2017-12-23 17:25:16 +08:00 |
宋世虹 | 7693c38f44 | add comments and todos | 2017-12-17 13:28:21 +08:00 |
宋世虹 | 62e2c6582d | finished very naive dqn: changed the interface of replay buffer by adding collect and next_batch, but still need refactoring; added implementation of dqn.py, but still need to consider the interface to make it more extensive; slightly refactored the code style of the codebase; more comments and todos will be in the next commit | 2017-12-17 12:52:00 +08:00 |
Haosheng Zou | 9ed3e7b092 | minor fix | 2017-12-14 19:46:38 +08:00 |
Haosheng Zou | 72ae304ab3 | preliminary design of dqn_example, dqn interface. identify the assign of networks | 2017-12-13 20:47:45 +08:00 |