Closes #917

### Internal Improvements

- Better variable names related to model outputs (logits, dist input etc.). #1032
- Improved typing for actors and critics, using Tianshou classes like `Actor`, `ActorProb`, etc., instead of just `nn.Module`. #1032
- Added interfaces for most `Actor` and `Critic` classes to enforce the presence of `forward` methods. #1032
- Simplified `PGPolicy` forward by unifying the `dist_fn` interface (see associated breaking change). #1032
- Use `.mode` of distribution instead of relying on knowledge of the distribution type. #1032

### Breaking Changes

- Changed interface of `dist_fn` in `PGPolicy` and all subclasses to take a single argument in both continuous and discrete cases. #1032

---------

Co-authored-by: Arnau Jimenez <arnau.jimenez@zeiss.com>
Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>
Changelog
Release 1.1.0
API Extensions
- Batch received two new methods: `to_dict` and `to_list_of_dicts`. #1063
- `Collector`s can now be closed, and their reset is more granular. #1063
- Trainers can control whether collectors should be reset prior to training. #1063
- Convenience constructor for `CollectStats` called `with_autogenerated_stats`. #1063
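To illustrate the difference between the two new conversion methods, here is a minimal sketch using a stand-in class rather than Tianshou's `Batch`. `MiniBatch` is a hypothetical name, and the sketch assumes that `to_dict` gives a column-oriented view while `to_list_of_dicts` yields one dict per entry along the first axis; check the `Batch` docstrings for the exact behavior.

```python
from typing import Any


class MiniBatch:
    """Hypothetical stand-in for a batch: named columns of equal length."""

    def __init__(self, **columns: list[Any]) -> None:
        lengths = {len(col) for col in columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same length")
        self._columns = dict(columns)

    def __len__(self) -> int:
        # Length of the shared first axis (0 for an empty batch).
        return len(next(iter(self._columns.values()), []))

    def to_dict(self) -> dict[str, list[Any]]:
        # Column-oriented view: one entry per key, holding all rows.
        return {key: list(col) for key, col in self._columns.items()}

    def to_list_of_dicts(self) -> list[dict[str, Any]]:
        # Row-oriented view: one dict per index along the first axis.
        return [
            {key: col[i] for key, col in self._columns.items()}
            for i in range(len(self))
        ]


batch = MiniBatch(obs=[0.1, 0.2], rew=[1.0, 0.0])
as_columns = batch.to_dict()
as_rows = batch.to_list_of_dicts()
```

The column view is convenient for serialization, while the row view is handy when inspecting individual transitions.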
Internal Improvements
- `Collector`s rely less on state, the few stateful things are stored explicitly instead of through a `.data` attribute. #1063
- Introduced a first iteration of a naming convention for vars in `Collector`s. #1063
- Generally improved readability of Collector code and associated tests (still quite some way to go). #1063
- Improved typing for `exploration_noise` and within Collector. #1063
- Better variable names related to model outputs (logits, dist input etc.). #1032
- Improved typing for actors and critics, using Tianshou classes like `Actor`, `ActorProb`, etc., instead of just `nn.Module`. #1032
- Added interfaces for most `Actor` and `Critic` classes to enforce the presence of `forward` methods. #1032
- Simplified `PGPolicy` forward by unifying the `dist_fn` interface (see associated breaking change). #1032
- Use `.mode` of distribution instead of relying on knowledge of the distribution type. #1032
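The `.mode` item can be sketched as follows. This is an illustration with toy classes, not Tianshou's code and not real `torch.distributions` objects: `ToyCategorical`, `ToyNormal`, `HasMode`, and both helper functions are hypothetical names. It contrasts branching on the distribution type with reading a shared `mode` property.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence


class HasMode(Protocol):
    """Anything that exposes its most likely value as `mode`."""

    @property
    def mode(self) -> float: ...


@dataclass(frozen=True)
class ToyCategorical:
    probs: Sequence[float]

    @property
    def mode(self) -> int:
        # Index of the most probable category.
        return max(range(len(self.probs)), key=self.probs.__getitem__)


@dataclass(frozen=True)
class ToyNormal:
    loc: float
    scale: float

    @property
    def mode(self) -> float:
        # A normal distribution peaks at its mean.
        return self.loc


def deterministic_action_by_type(dist):
    # Type-based style: every supported distribution needs its own branch.
    if isinstance(dist, ToyCategorical):
        return max(range(len(dist.probs)), key=dist.probs.__getitem__)
    if isinstance(dist, ToyNormal):
        return dist.loc
    raise TypeError(f"unsupported distribution: {type(dist).__name__}")


def deterministic_action(dist: HasMode):
    # Mode-based style: no knowledge of the concrete type is required.
    return dist.mode


cat = ToyCategorical([0.2, 0.7, 0.1])
normal = ToyNormal(loc=0.5, scale=2.0)
```

With the second style, supporting a new distribution only requires that it provide `mode`; the calling code stays unchanged.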
Breaking Changes
- Removed `.data` attribute from `Collector` and its child classes. #1063
- Collectors no longer reset the environment on initialization. Instead, the user might have to call `reset` explicitly or pass `reset_before_collect=True`. #1063
- VectorEnvs now return an array of info-dicts on reset instead of a list. #1063
- Fixed `iter(Batch(...))`, which now behaves the same way as `Batch(...).__iter__()`. Can be considered a bugfix. #1063
- Changed interface of `dist_fn` in `PGPolicy` and all subclasses to take a single argument in both continuous and discrete cases. #1032
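For the `dist_fn` change, a migration might look roughly like the sketch below. It uses toy classes instead of real `torch.distributions` objects, and every name in it (`ToyCategorical`, `ToyNormal`, `discrete_dist_fn`, `continuous_dist_fn`, `build_dist`) is hypothetical. It also assumes that in the continuous case the single argument is a `(loc, scale)` tuple, which this changelog does not state explicitly.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class ToyCategorical:
    probs: Sequence[float]


@dataclass(frozen=True)
class ToyNormal:
    loc: float
    scale: float


def discrete_dist_fn(probs: Sequence[float]) -> ToyCategorical:
    # Discrete case: the single argument is the probability vector.
    return ToyCategorical(probs)


def continuous_dist_fn(loc_scale: tuple[float, float]) -> ToyNormal:
    # Continuous case (assumed): the single argument is a (loc, scale)
    # tuple that the function unpacks itself.
    loc, scale = loc_scale
    return ToyNormal(loc, scale)


def build_dist(dist_fn: Callable, model_output):
    # With one calling convention the caller needs no branch that decides
    # between dist_fn(output) and dist_fn(*output).
    return dist_fn(model_output)


discrete = build_dist(discrete_dist_fn, [0.3, 0.7])
continuous = build_dist(continuous_dist_fn, (0.0, 1.0))
```

Under this assumption, a custom `dist_fn` that previously accepted `loc` and `scale` as two positional arguments would need to accept one tuple and unpack it.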
Tests
- Fixed env seeding in test_sac_with_il.py so that the test doesn't fail randomly. #1081
Started after v1.0.0