Tianshou

Using dist.mode instead of logits.argmax (#1066)

Changed all occurrences where an action is selected deterministically:

- **from**: using the outputs of the actor network.
- **to**: using the mode of the PyTorch distribution.
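
The difference between the two selection rules can be sketched as follows. This is an illustrative example, not code from the repository: the tensors `logits`, `mu`, and `sigma` are hypothetical actor outputs, and it assumes a PyTorch release in which `torch.distributions` objects expose a `mode` property.

```python
import torch
from torch.distributions import Categorical, Normal

# Hypothetical actor output: a batch of 2 states, 3 discrete actions each.
logits = torch.tensor([[0.1, 2.0, -1.0], [1.5, 0.2, 0.3]])

# "from": pick the deterministic action straight from the network output.
act_from_logits = logits.argmax(dim=-1)

# "to": build the action distribution and ask it for its mode.
act_from_mode = Categorical(logits=logits).mode

# For a categorical distribution the two rules agree.
print(act_from_logits.tolist(), act_from_mode.tolist())

# Hypothetical continuous policy: the actor outputs a mean and a std.
mu = torch.tensor([[0.3, -0.7]])
sigma = torch.tensor([[0.5, 1.2]])

# The mode of a Gaussian is its mean, so the deterministic action is mu.
cont_act = Normal(mu, sigma).mode
print(cont_act.tolist())
```

For discrete actions both rules give the same result, so the change matters mainly for distributions whose most likely action is not an argmax over the raw actor output, such as the Gaussian case above.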

---------

Co-authored-by: Arnau Jimenez <arnau.jimenez@zeiss.com>

2024-03-03 00:09:39 +01:00

__init__.py

refactor test code

2020-03-21 10:58:01 +08:00

test_ddpg.py

Refactoring/mypy issues test (#1017)

2024-02-06 14:24:30 +01:00

test_npg.py

Refactoring/mypy issues test (#1017)

2024-02-06 14:24:30 +01:00

test_ppo.py

Refactoring/mypy issues test (#1017)

2024-02-06 14:24:30 +01:00

test_redq.py

Refactoring/mypy issues test (#1017)

2024-02-06 14:24:30 +01:00

test_sac_with_il.py

Using dist.mode instead of logits.argmax (#1066)

2024-03-03 00:09:39 +01:00

test_td3.py

Refactoring/mypy issues test (#1017)

2024-02-06 14:24:30 +01:00

test_trpo.py

Refactoring/mypy issues test (#1017)

2024-02-06 14:24:30 +01:00