Tianshou

History

Using dist.mode instead of logits.argmax (#1066)

Changed all occurrences where an action is selected deterministically:

- **from**: using the outputs of the actor network.
- **to**: using the mode of the PyTorch distribution.
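
The change above can be illustrated with a minimal, dependency-free sketch. It is not Tianshou's code: `CategoricalSketch`, `NormalSketch`, and `select_deterministic_action` are hypothetical stand-ins for a PyTorch distribution object and for the policy's action-selection step. The point it shows is that reading the mode from the distribution gives one code path for both discrete and continuous actions, whereas taking an argmax over raw network outputs only makes sense for the discrete case.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class CategoricalSketch:
    """Hypothetical stand-in for a categorical distribution over discrete actions."""

    logits: Sequence[float]

    @property
    def mode(self) -> int:
        # The most probable action is the index of the largest logit.
        return max(range(len(self.logits)), key=lambda i: self.logits[i])


@dataclass
class NormalSketch:
    """Hypothetical stand-in for a Gaussian over a single continuous action."""

    loc: float
    scale: float

    @property
    def mode(self) -> float:
        # A Gaussian's density peaks at its mean.
        return self.loc


def select_deterministic_action(dist):
    """Pick the action from the distribution rather than from raw network outputs."""
    return dist.mode


# Discrete case: equivalent to an argmax over the logits.
discrete_action = select_deterministic_action(CategoricalSketch([0.1, 2.3, -0.7]))

# Continuous case: an argmax over outputs would be meaningless here; the mode is the mean.
continuous_action = select_deterministic_action(NormalSketch(loc=0.25, scale=1.0))
```

Because the selection step only asks the distribution for its mode, it does not need to know how the actor network's outputs are laid out.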

---------

Co-authored-by: Arnau Jimenez <arnau.jimenez@zeiss.com>

2024-03-03 00:09:39 +01:00

data

Docs/use nbqa on notebooks (#1041)

2024-02-07 17:28:16 +01:00

env

Resolve platform-specific/installation-specific mypy issues

2024-02-15 11:26:54 +01:00

exploration

Remove kwargs in policy init (#950)

2023-10-08 08:57:03 -07:00

highlevel

Fix/add watch env with obs rms (#1061)

2024-02-29 15:59:11 +01:00

policy

Using dist.mode instead of logits.argmax (#1066)

2024-03-03 00:09:39 +01:00

trainer

Minor simplification in train_step (#1019)

2024-01-09 08:51:49 -08:00

utils

Fix high-level examples (#1060)

2024-02-23 23:17:14 +01:00

__init__.py

Revert "Depend on sensAI instead of copying its utils (logging, string)"

2023-11-08 19:11:39 +01:00

py.typed

add py.typed, drop 3.6/3.7, support 3.11 (#910)

2023-08-10 14:13:46 -07:00