Tianshou

Naming and typing improvements in Actor/Critic/Policy forwards (#1032)

Closes #917 

### Internal Improvements
- Better variable names related to model outputs (logits, dist input, etc.). #1032
- Improved typing for actors and critics, using Tianshou classes like `Actor`, `ActorProb`, etc., instead of just `nn.Module`. #1032
- Added interfaces for most `Actor` and `Critic` classes to enforce the presence of `forward` methods. #1032
- Simplified `PGPolicy` forward by unifying the `dist_fn` interface (see associated breaking change). #1032
- Use `.mode` of distribution instead of relying on knowledge of the distribution type. #1032
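
The `.mode` item can be illustrated with plain `torch.distributions`. This is only a sketch, not code from #1032: the helper name `deterministic_action` is hypothetical, and it assumes a PyTorch version in which `Distribution` exposes the `mode` property.

```python
import torch
from torch.distributions import Categorical, Distribution, Independent, Normal


def deterministic_action(dist: Distribution) -> torch.Tensor:
    # Hypothetical helper: take the most likely action without
    # branching on the concrete distribution class.
    return dist.mode


# Discrete case: batch of one observation, three actions.
discrete = Categorical(logits=torch.tensor([[0.1, 2.0, -1.0]]))

# Continuous case: diagonal Gaussian over a two-dimensional action.
continuous = Independent(
    Normal(torch.tensor([[0.5, -0.5]]), torch.tensor([[1.0, 1.0]])), 1
)

discrete_action = deterministic_action(discrete)      # index of the largest logit
continuous_action = deterministic_action(continuous)  # the mean of the Gaussian
```

The same call works for both action spaces, which is what removes the need to know the distribution type at the call site.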

### Breaking Changes

- Changed interface of `dist_fn` in `PGPolicy` and all subclasses to take a single argument in both continuous and discrete cases. #1032
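
A minimal sketch of what a single-argument `dist_fn` might look like in each case. The function names and the `build_dist` helper are hypothetical and only illustrate the calling convention. Representing the continuous actor output as a `(loc, scale)` tuple is an assumption, not something stated in this changelog.

```python
import torch
from torch.distributions import Categorical, Distribution, Independent, Normal


def discrete_dist_fn(logits: torch.Tensor) -> Distribution:
    # Discrete case: the single argument is a tensor of logits.
    return Categorical(logits=logits)


def continuous_dist_fn(loc_scale: tuple[torch.Tensor, torch.Tensor]) -> Distribution:
    # Continuous case (assumed shape of the input): the single argument
    # is a (loc, scale) tuple that is unpacked inside the function.
    loc, scale = loc_scale
    return Independent(Normal(loc, scale), 1)


def build_dist(dist_fn, actor_output) -> Distribution:
    # Hypothetical caller: the actor output is passed as one argument,
    # so no unpacking or type check is needed here.
    return dist_fn(actor_output)


discrete = build_dist(discrete_dist_fn, torch.zeros(2, 3))
continuous = build_dist(continuous_dist_fn, (torch.zeros(2, 4), torch.ones(2, 4)))
```

Under this convention, a custom `dist_fn` that previously accepted two positional arguments in the continuous case would need to accept one argument and unpack it itself.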

---------

Co-authored-by: Arnau Jimenez <arnau.jimenez@zeiss.com>
Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>

2024-04-01 17:14:17 +02:00

0_intro.md

Docs/use nbqa on notebooks (#1041)

2024-02-07 17:28:16 +01:00

L0_overview.ipynb

Feat/refactor collector (#1063)

2024-03-28 18:02:31 +01:00

L1_Batch.ipynb

Docs/html doc issues (#1048)

2024-02-09 19:43:10 +01:00

L2_Buffer.ipynb

Docs/use nbqa on notebooks (#1041)