Closes #952 - `SamplingConfig` supports `batch_size=None`. #1077 - tests and examples are covered by `mypy`. #1077 - `NetBase` is more used, stricter typing by making it generic. #1077 - `utils.net.common.Recurrent` now receives and returns a `RecurrentStateBatch` instead of a dict. #1077 --------- Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>
Changelog
Release 1.1.0
Api Extensions
- Batch received two new methods: `to_dict` and `to_list_of_dicts`. #1063
- `Collector`s can now be closed, and their reset is more granular. #1063
- Trainers can control whether collectors should be reset prior to training. #1063
- Convenience constructor for `CollectStats` called `with_autogenerated_stats`. #1063
- `SamplingConfig` supports `batch_size=None`. #1077
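As a rough illustration of what the name `to_list_of_dicts` suggests, the sketch below converts a dict of equal-length columns into one dict per entry. It uses plain Python containers as a stand-in for a `Batch`; the helper `columns_to_rows` is hypothetical, does not call Tianshou, and the real methods may treat nested batches and arrays differently.

```python
from typing import Any


def columns_to_rows(columns: dict[str, list[Any]]) -> list[dict[str, Any]]:
    """Turn a dict of equal-length columns into one dict per entry.

    Hypothetical stand-in for the idea behind ``to_list_of_dicts``;
    this is not Tianshou code.
    """
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same length")
    # An empty mapping has no columns and therefore no entries.
    n_entries = lengths.pop() if lengths else 0
    return [
        {key: values[i] for key, values in columns.items()}
        for i in range(n_entries)
    ]


# A batch-like structure with two entries.
batch_like = {"obs": [0.1, 0.2], "act": [1, 0]}
rows = columns_to_rows(batch_like)
```

The per-entry form is convenient for logging or serializing individual transitions, while the column form is closer to how batched data is usually stored.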
Internal Improvements
- `Collector`s rely less on state; the few stateful things are stored explicitly instead of through a `.data` attribute. #1063
- Introduced a first iteration of a naming convention for vars in `Collector`s. #1063
- Generally improved readability of Collector code and associated tests (still quite some way to go). #1063
- Improved typing for `exploration_noise` and within Collector. #1063
- Better variable names related to model outputs (logits, dist input etc.). #1032
- Improved typing for actors and critics, using Tianshou classes like `Actor`, `ActorProb`, etc., instead of just `nn.Module`. #1032
- Added interfaces for most `Actor` and `Critic` classes to enforce the presence of `forward` methods. #1032
- Simplified `PGPolicy` forward by unifying the `dist_fn` interface (see associated breaking change). #1032
- Use `.mode` of distribution instead of relying on knowledge of the distribution type. #1032
- Exception no longer raised on `len` of empty `Batch`. #1084
- Tests and examples are covered by `mypy`. #1077
- `NetBase` is used more widely, with stricter typing achieved by making it generic. #1077
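The items above about enforcing `forward` through interfaces and making `NetBase` generic follow a common Python pattern, sketched below with the standard library only. All names here (`NetSketch`, `TState`, `RunningSumNet`, `Incomplete`) are hypothetical and are not Tianshou classes; the sketch only shows how an abstract generic base can reject subclasses that lack `forward` while letting a type checker track the state type.

```python
from abc import ABC, abstractmethod
from typing import Generic, Optional, TypeVar

# Type of the recurrent/hidden state carried between calls (hypothetical name).
TState = TypeVar("TState")


class NetSketch(ABC, Generic[TState]):
    """Hypothetical generic base class that requires a ``forward`` method."""

    @abstractmethod
    def forward(
        self, obs: list[float], state: Optional[TState] = None
    ) -> tuple[list[float], Optional[TState]]:
        """Map an observation (and optional state) to an output and a new state."""


class RunningSumNet(NetSketch[float]):
    """Toy subclass whose state is the running sum of all observations seen."""

    def forward(
        self, obs: list[float], state: Optional[float] = None
    ) -> tuple[list[float], Optional[float]]:
        total = (state or 0.0) + sum(obs)
        return [total], total


class Incomplete(NetSketch[float]):
    """Subclass that forgets to implement ``forward``."""


net = RunningSumNet()
output, state = net.forward([1.0, 2.0])
output2, state2 = net.forward([1.0], state)

# Instantiating a subclass without ``forward`` fails at construction time.
try:
    Incomplete()
    incomplete_rejected = False
except TypeError:
    incomplete_rejected = True
```

With this pattern a missing `forward` surfaces when the object is created rather than when it is first called, and a checker such as `mypy` can verify that the state type is used consistently.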
Breaking Changes
- Removed `.data` attribute from `Collector` and its child classes. #1063
- Collectors no longer reset the environment on initialization. Instead, the user might have to call `reset` explicitly or pass `reset_before_collect=True`. #1063
- VectorEnvs now return an array of info-dicts on reset instead of a list. #1063
- Fixed `iter(Batch(...))` which now behaves the same way as `Batch(...).__iter__()`. Can be considered a bugfix. #1063
- Changed interface of `dist_fn` in `PGPolicy` and all subclasses to take a single argument in both continuous and discrete cases. #1032
- `utils.net.common.Recurrent` now receives and returns a `RecurrentStateBatch` instead of a dict. #1077
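For the `dist_fn` change, one possible migration path is to wrap an existing factory so that it accepts a single argument. The sketch below assumes the previous continuous-case factory took location and scale as two separate arguments, which the changelog implies but does not spell out. It uses `statistics.NormalDist` from the standard library as a stand-in for a real distribution class, and the names `old_style_dist_fn` and `as_single_arg` are hypothetical.

```python
from statistics import NormalDist
from typing import Callable


def old_style_dist_fn(loc: float, scale: float) -> NormalDist:
    """Assumed old-style factory: two positional arguments."""
    return NormalDist(mu=loc, sigma=scale)


def as_single_arg(
    dist_fn: Callable[[float, float], NormalDist],
) -> Callable[[tuple[float, float]], NormalDist]:
    """Adapt a two-argument factory to accept one ``(loc, scale)`` tuple."""

    def wrapped(loc_scale: tuple[float, float]) -> NormalDist:
        loc, scale = loc_scale
        return dist_fn(loc, scale)

    return wrapped


# The adapted factory takes a single argument, as the new interface requires.
new_style_dist_fn = as_single_arg(old_style_dist_fn)
dist = new_style_dist_fn((0.5, 2.0))
```

A factory for the discrete case already takes one argument under this assumption, so only continuous-case factories would need such an adapter.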
Tests
- Fixed env seeding in test_sac_with_il.py so that the test doesn't fail randomly. #1081
Started after v1.0.0