Tianshou/policy at e27b5a26f330de446fe15388bf81c3777f024fb9

hongshaorou/Tianshou

ChenDRAG e27b5a26f3

Refactor PG algorithm and change behavior of compute_episodic_return (#319)

- simplify code
- apply value normalization (global) and adv norm (per-batch) in on-policy algorithms

2021-03-23 22:05:48 +08:00

Remove reward_normalization option in offpolicy algorithm (#298)

2021-02-27 11:20:43 +08:00

Remove reward_normalization option in offpolicy algorithm (#298)

2021-02-27 11:20:43 +08:00

Refactor PG algorithm and change behavior of compute_episodic_return (#319)

2021-03-23 22:05:48 +08:00

Step collector implementation (#280)

2021-02-19 10:33:49 +08:00

__init__.py

Step collector implementation (#280)

2021-02-19 10:33:49 +08:00

base.py

Refactor PG algorithm and change behavior of compute_episodic_return (#319)

2021-03-23 22:05:48 +08:00

random.py

Trainer refactor: some definition change (#293)

2021-02-21 13:06:02 +08:00