Tianshou/discrete at 3ac67d9974b6bd3e3d7feac7738ca6de33b317c7

hongshaorou/Tianshou

ChenDRAG 3ac67d9974

refactor A2C/PPO, change behavior of value normalization (#321)

2021-03-25 10:12:39 +08:00

__init__.py

refract test code

2020-03-21 10:58:01 +08:00

test_a2c_with_il.py

refactor A2C/PPO, change behavior of value normalization (#321)

2021-03-25 10:12:39 +08:00

test_c51.py

Add Timelimit trick to optimize policies (#296)

2021-02-26 13:23:18 +08:00

test_dqn.py

add logger (#295)

2021-02-24 14:48:42 +08:00

test_drqn.py

Remove reward_normaliztion option in offpolicy algorithm (#298)

2021-02-27 11:20:43 +08:00

test_il_bcq.py

add logger (#295)

2021-02-24 14:48:42 +08:00

test_pg.py

Refactor PG algorithm and change behavior of compute_episodic_return (#319)

2021-03-23 22:05:48 +08:00

test_ppo.py

Refactor PG algorithm and change behavior of compute_episodic_return (#319)

2021-03-23 22:05:48 +08:00

test_qrdqn.py

fix qvalue mask_action error for obs_next (#310)

2021-03-15 08:06:24 +08:00

test_sac.py

Remove reward_normaliztion option in offpolicy algorithm (#298)

2021-02-27 11:20:43 +08:00