Tianshou

Author	SHA1	Message	Date
songshshshsh	25b25ce7d8	Merge branch 'master' of https://github.com/sproblvem/tianshou	2018-02-27 13:15:36 +08:00
songshshshsh	67d0e78ab9	first modification of replay buffer: make all three replay buffers work; refactoring and testing still pending	2018-02-27 13:13:38 +08:00
haoshengzou	87889d766c	minor fixes. proceed to refactor replay to use lists as in batch.	2018-02-26 11:47:02 +08:00
haoshengzou	dfcea74fcf	fix memory growth and slowness caused by sess.run(tf.multinomial()); ppo examples now work, with slight memory growth (1M/min) that still needs investigation	2018-01-03 20:32:05 +08:00
宋世虹	d220f7f2a8	add comments and todos	2017-12-17 13:28:21 +08:00
宋世虹	3624cc9036	finished very naive dqn: changed the interface of replay buffer by adding collect and next_batch, but it still needs refactoring; added implementation of dqn.py, but still need to consider the interface to make it more extensible; slightly refactored the code style of the codebase; more comments and todos will follow in the next commit	2017-12-17 12:52:00 +08:00
songshshshsh	f1a7fd9ee1	replay buffer initial commit	2017-12-10 14:56:04 +08:00