14 Commits

Author SHA1 Message Date
haoshengzou 6f206759ab add __all__ 2018-05-20 22:36:04 +08:00
haoshengzou 2527030838 fix the bug of unnamed_dict.update(). import cleaning in examples/*.py 2018-04-16 20:17:41 +08:00
haoshengzou d84c9d121c first master version 2018-04-16 18:02:00 +08:00
haoshengzou 5f979caf58 finish all API docs, first version. 2018-04-15 17:41:43 +08:00
haoshengzou 9186dae6a3 more API docs 2018-04-15 09:35:31 +08:00
haoshengzou 2a3bc3ef35 part of API doc 2018-04-12 21:10:50 +08:00
haoshengzou 03246f7ded functional code freeze. all examples working. prepare to release. 2018-04-11 14:23:40 +08:00
haoshengzou e68dcd3c64 working on off-policy test. other parts of dqn_replay is runnable, but performance not tested. 2018-03-08 16:51:12 +08:00
haoshengzou f32e1d9c12 finish ddpg example. all examples under examples/ (except those containing 'contrib' and 'fail') can run! advantage estimation module is not complete yet. 2018-01-18 17:38:52 +08:00
haoshengzou 8fbde8283f finish dqn example. advantage estimation module is not complete yet. 2018-01-18 12:19:48 +08:00
haoshengzou ed25bf7586 fixed the bugs on Jan 14, which gives inferior or even no improvement. mistook group_ndims. policy will soon need refactoring. 2018-01-17 11:55:51 +08:00
haoshengzou b33a141373 towards policy/value refactor 2017-12-23 17:25:16 +08:00
haoshengzou 2addef41d2 fix imports to support both python2 and python3. move contents from __init__.py to leave for work after major development. 2017-12-23 15:36:10 +08:00
Haosheng Zou 6611d948dd add value_function (critic). value_function and policy not finished yet. 2017-12-22 00:22:23 +08:00