Alexis DUBURCQ 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							529a4cf44c 
							
						 
					 
					
						
						
							
							Add pickle support for Batch. Fix VectorEnv. (#67)
						
						
						
						
						* Fix vecenv.
* Add pickle support for Batch class.
* Add Batch pickle Unit Test.
* Fix lint.
* Swap Batch UT.
* Fix lint.
Co-authored-by: Alexis Duburcq <alexis.duburcq@wandercraft.eu> 
						
						
					 
					
						2020-05-30 21:29:33 +08:00 
						 
				 
			
				
					
						
							
							
								Alexis DUBURCQ 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							dd3e2130bb 
							
						 
					 
					
						
						
							
							Infer the right dtype for replay buffers. (#64)
						
						
						
						
					 
					
						2020-05-29 22:27:03 +08:00 
						 
				 
			
				
					
						
							
							
								Alexis DUBURCQ 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							8af7196a9a 
							
						 
					 
					
						
						
							
							Robust conversion from/to numpy/pytorch (#63)
						
						
						
						
						* Enable to convert Batch data back to torch.
* Add torch converter to collector.
* Fix
* Move to_numpy/to_torch convert in dedicated utils.py.
* Use to_numpy/to_torch to convert arrays.
* fix lint
* fix
* Add unit test to check Batch from/to numpy.
* Fix Batch over Batch.
Co-authored-by: Alexis Duburcq <alexis.duburcq@wandercraft.eu> 
						
						
					 
					
						2020-05-29 20:45:21 +08:00 
						 
				 
			
				
					
						
							
							
								Alexis DUBURCQ 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							b5093ecb56 
							
						 
					 
					
						
						
							
							Minor refactor for Batch class. (#61)
						
						
						
						
						* Minor refactor for Batch class.
* Fix.
* Add back key sorting.
Co-authored-by: Alexis Duburcq <alexis.duburcq@wandercraft.eu> 
						
						
					 
					
						2020-05-29 17:56:46 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							be9ce44290 
							
						 
					 
					
						
						
							
							fix #59
						
						
						
						
					 
					
						2020-05-29 11:49:47 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							d2b2fa87c0 
							
						 
					 
					
						
						
							
							fix #56
						
						
						
						
					 
					
						2020-05-29 08:03:37 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							de556fd22d 
							
						 
					 
					
						
						
							
							item3 of #51
						
						
						
						
					 
					
						2020-05-27 11:02:23 +08:00 
						 
				 
			
				
					
						
							
							
								magicly 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							6237cc0d52 
							
						 
					 
					
						
						
							
							fix dqn zero eps (#52)
						
						
						
						
						Co-authored-by: liyan <liyan1@digisky.com> 
						
						
					 
					
						2020-05-21 11:35:41 +08:00 
						 
				 
			
				
					
						
							
							
								Imone 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							57bca16f94 
							
						 
					 
					
						
						
							
							Fix log_prob and PPO dual_clip (#49)
						
						
						
						
						* Added DiagGaussian to fix log_prob
* Disable PPO dual_clip 
						
						
					 
					
						2020-05-18 16:23:35 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							0eef0ca198 
							
						 
					 
					
						
						
							
							fix optional type syntax  
						
						
						
						
					 
					
						2020-05-16 20:08:32 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							9b26137cd2 
							
						 
					 
					
						
						
							
							add type annotation  
						
						
						
						
					 
					
						2020-05-12 11:31:47 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							075825325e 
							
						 
					 
					
						
						
							
							add preprocess_fn (#42)
						
						
						
						
					 
					
						2020-05-05 13:39:51 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							04b091d975 
							
						 
					 
					
						
						
							
							fix max-grad-norm err in a2c (#46)
						
						
						
						
					 
					
						2020-05-04 12:33:04 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							c2a7caf806 
							
						 
					 
					
						
						
							
							add recurrent actor and critic  
						
						
						
						
					 
					
						2020-04-30 16:31:40 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							134f787e24 
							
						 
					 
					
						
						
							
							reserve 'policy' keyword in replay buffer  
						
						
						
						
					 
					
						2020-04-29 17:48:48 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							bb2f833d0e 
							
						 
					 
					
						
						
							
							support Batch of Batch and fix bugs (#38)
						
						
						
						
					 
					
						2020-04-29 12:14:53 +08:00 
						 
				 
			
				
					
						
							
							
								nicoguertler 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							8f718d9b13 
							
						 
					 
					
						
						
							
							Fix log_prob in SAC (#41)
						
						
						
						
					 
					
						2020-04-28 23:44:15 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							80d661907e 
							
						 
					 
					
						
						
							
							Multimodal obs (#38, #27, #25)
						
						
						
						
					 
					
						2020-04-28 20:56:02 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							959955fa2a 
							
						 
					 
					
						
						
							
							fix historical issues  
						
						
						
						
					 
					
						2020-04-26 16:13:51 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							6b96f124ae 
							
						 
					 
					
						
						
							
							fix pdqn  
						
						
						
						
					 
					
						2020-04-26 15:11:20 +08:00 
						 
				 
			
				
					
						
							
							
								rocknamx 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							b23749463e 
							
						 
					 
					
						
						
							
							Prioritized DQN (#30)
						
						
						
						
						* add sum_tree.py
* add prioritized replay buffer
* del sum_tree.py
* fix some format issues
* fix weight_update bug
* simply replace replaybuffer in test_dqn without weight update
* weight default set to 1
* fix sampling bug when buffer is not full
* rename parameter
* fix formula error, add accuracy check
* add PrioritizedDQN test
* add test_pdqn.py
* add update_weight() doc
* add ref of prio dqn in readme.md and index.rst
* restore test_dqn.py, fix args of test_pdqn.py 
						
						
					 
					
						2020-04-26 12:05:58 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							70290346ea 
							
						 
					 
					
						
						
							
							compatible with torch==1.5.0 (fix #37)
						
						
						
						
					 
					
						2020-04-26 11:04:45 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							8812eaa502 
							
						 
					 
					
						
						
							
							fix #36
						
						
						
						
					 
					
						2020-04-23 22:06:18 +08:00 
						 
				 
			
				
					
						
							
							
								Minghao Zhang 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							205698dd66 
							
						 
					 
					
						
						
							
							fix #33 (#34)
						
						
						
						
					 
					
						2020-04-21 15:36:08 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							4fd826761c 
							
						 
					 
					
						
						
							
							enable null buffer in test collector  
						
						
						
						
					 
					
						2020-04-20 11:50:18 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							815f3522bb 
							
						 
					 
					
						
						
							
							imitation with discrete action space  
						
						
						
						
					 
					
						2020-04-20 11:25:20 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							6bf1ea644d 
							
						 
					 
					
						
						
							
							fix ppo  
						
						
						
						
					 
					
						2020-04-19 14:30:42 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							680fc0ffbe 
							
						 
					 
					
						
						
							
							gae  
						
						
						
						
					 
					
						2020-04-14 21:11:06 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							7b65d43394 
							
						 
					 
					
						
						
							
							vanilla imitation learning  
						
						
						
						
					 
					
						2020-04-13 19:37:27 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							6a244d1fbb 
							
						 
					 
					
						
						
							
							save_fn  
						
						
						
						
					 
					
						2020-04-11 16:54:27 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							74407e13da 
							
						 
					 
					
						
						
							
							env info log_fn (#28)
						
						
						
						
					 
					
						2020-04-10 18:02:05 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							ecfcb9f295 
							
						 
					 
					
						
						
							
							fix docs  
						
						
						
						
					 
					
						2020-04-10 11:16:33 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							3cc22b7c0c 
							
						 
					 
					
						
						
							
							__call__ -> forward  
						
						
						
						
					 
					
						2020-04-10 10:47:16 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							13086b7f64 
							
						 
					 
					
						
						
							
							add ignore_obs_next in buffer  
						
						
						
						
					 
					
						2020-04-10 09:01:17 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							19f2cce294 
							
						 
					 
					
						
						
							
							seealso and change policy dir structure  
						
						
						
						
					 
					
						2020-04-09 21:36:53 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							6da80e045a 
							
						 
					 
					
						
						
							
							fix rnn (#19), add __repr__, and fix #26
						
						
						
						
					 
					
						2020-04-09 19:53:45 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							86572c66d4 
							
						 
					 
					
						
						
							
							maybe finished rnn?  
						
						
						
						
					 
					
						2020-04-08 21:13:15 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							d9d2763dad 
							
						 
					 
					
						
						
							
							first version with full documentation  
						
						
						
						
					 
					
						2020-04-07 11:50:34 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							e0809ff135 
							
						 
					 
					
						
						
							
							add policy docs (#21)
						
						
						
						
					 
					
						2020-04-06 19:36:59 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							610390c132 
							
						 
					 
					
						
						
							
							add docs of collector and trainer (#20)
						
						
						
						
					 
					
						2020-04-05 18:34:45 +08:00 
						 
				 
			
				
					
						
							
							
								Oblivion 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							4d4d0daf9e 
							
						 
					 
					
						
						
							
							Performance improve (#18)
						
						
						
						
						* improve performance
set one thread for NN
replace detach() op with torch.no_grad()
* fix pep 8 errors 
						
						
					 
					
						2020-04-05 09:10:21 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							b6c9db6b0b 
							
						 
					 
					
						
						
							
							docs for env  
						
						
						
						
					 
					
						2020-04-04 21:02:06 +08:00 
						 
				 
			
				
					
						
							
							
								Oblivion 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							9380368ca3 
							
						 
					 
					
						
						
							
							add an example of bullet env (experiment from jiqizhixin) (#15)
						
						
						
						
						* add_pybullet_ens_test
test on pybullet envs
modify some log config
* delete DS_Store file
* add pybullet_envs test
add HalfCheetahBulletEnv-v0 test
modify log config
* fix pep 8 errors
* add pybullet to dev
* delete a line
* by pass F401
* add log_interval to onpolicy_trainer
* add comments
* Update halfcheetahBullet_v0_sac.py 
						
						
					 
					
						2020-04-04 11:46:18 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							974ade8019 
							
						 
					 
					
						
						
							
							add some docs  
						
						
						
						
					 
					
						2020-04-03 21:28:12 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							7cb5146611 
							
						 
					 
					
						
						
							
							add docs of trick  
						
						
						
						
					 
					
						2020-04-02 21:57:26 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							0acd0d164c 
							
						 
					 
					
						
						
							
							test api doc  
						
						
						
						
					 
					
						2020-04-02 09:07:04 +08:00 
						 
				 
			
				
					
						
							
							
								Minghao Zhang 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							0b08a41610 
							
						 
					 
					
						
						
							
							move mujoco to examples (#12)
						
						
						
						
						* move mujoco to examples
* fix the import mujoco bug
* flake8
* flake8
* rm __init__.py 
						
						
					 
					
						2020-04-02 08:49:19 +08:00 
						 
				 
			
				
					
						
							
							
								ShenDezhou 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							4da857d86e 
							
						 
					 
					
						
						
							
							Fix windows env setup bugs and other typo. (#11)
						
						
						
						
					 
					
						2020-03-31 17:22:32 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							04208e6cce 
							
						 
					 
					
						
						
							
							update some tutorial  
						
						
						
						
					 
					
						2020-03-30 22:52:25 +08:00 
						 
				 
			
				
					
						
							
							
								Trinkle23897 
							
						 
					 
					
						
						
						
						
							
						
						
							d9e4b9d16f 
							
						 
					 
					
						
						
							
							upd doc  
						
						
						
						
					 
					
						2020-03-29 10:22:03 +08:00