Tianshou

Author	SHA1	Message	Date
haoshengzou	75e7f14051	towards ddpg	2018-03-28 18:47:41 +08:00
haoshengzou	52e6b09768	finish ddpg. now ppo, actor-critic, dqn works. ddpg is not working, check!	2018-03-11 17:47:42 +08:00
haoshengzou	498b55c051	ppo with batch also works! now ppo improves steadily, dqn not so stable.	2018-03-10 17:30:11 +08:00
haoshengzou	92894d3853	working on off-policy test. other parts of dqn_replay is runnable, but performance not tested.	2018-03-09 15:07:14 +08:00
haoshengzou	e68dcd3c64	working on off-policy test. other parts of dqn_replay is runnable, but performance not tested.	2018-03-08 16:51:12 +08:00
Dong Yan	24d75fd1aa	call nstep_q_return from dqn_replay.py, still need test	2018-03-06 20:48:07 +08:00
haoshengzou	2a2274aeea	initial data_collector. working on examples/dqn_replay.py to run	2018-03-04 21:29:58 +08:00
haoshengzou	54a7b1343d	design exploration and evaluators for off-policy algos	2018-03-04 13:53:29 +08:00
haoshengzou	e302fd87fb	vanilla replay buffer finished and tested. working on data_collector.	2018-03-03 20:42:34 +08:00