Changelog
Release 1.1.0
Api Extensions
- Batch received two new methods: `to_dict` and `to_list_of_dicts`. #1063
- `Collector`s can now be closed, and their reset is more granular. #1063
- Trainers can control whether collectors should be reset prior to training. #1063
- Convenience constructor for `CollectStats` called `with_autogenerated_stats`. #1063
- `SamplingConfig` supports `batch_size=None`. #1077
- Batch received new methods: `to_numpy_` and `to_torch_`. #1098, #1117
- `to_dict` in Batch also supports non-recursive conversion. #1098
- Batch `__eq__` implemented, semantic equality check of batches is now possible. #1098
- `Batch.keys()` deprecated in favor of `Batch.get_keys()` (needed to make iteration consistent with naming). #1105
- `Experiment` and `ExperimentConfig` now have a `name`, which can however be overridden when `Experiment.run()` is called. #1074
- When building an `Experiment` from an `ExperimentConfig`, the user has the option to add info about seeds to the name. #1074
- New method in `ExperimentConfig` called `build_default_seeded_experiments`. #1074
- `SamplingConfig` has an explicit training seed, `test_seed` is inferred. #1074
- New `evaluation` package for repeating the same experiment with multiple seeds and aggregating the results (important extension!). Launchers for parallelization currently in alpha state. #1074
- Loggers can now restore the logged data into python by using the new `restore_logged_data` method. #1074
- `continuous.Critic`:
  - Add flag `apply_preprocess_net_to_obs_only` to allow the preprocessing network to be applied to the observations only (without the actions concatenated), which is essential for the case where we want to reuse the actor's preprocessing network. #1128
- Base class for collectors: `BaseCollector` #1122
- Collectors can now explicitly specify whether to use the policy in training or evaluation mode. #1122
- New util context managers `in_eval_mode` and `in_train_mode` for torch modules. #1122
- `reset` of `Collectors` now returns `obs` and `info`. #1122
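
The idea behind the `in_eval_mode` and `in_train_mode` context managers can be pictured with a minimal sketch. This is not Tianshou's implementation: `ToyModule` is a hypothetical stand-in for a torch module that only mimics the `training` flag and the `train(mode)` method, and the function bodies are an assumption about how such helpers are commonly written.

```python
from collections.abc import Iterator
from contextlib import contextmanager


class ToyModule:
    """Hypothetical stand-in for a torch module; only the training flag is modeled."""

    def __init__(self) -> None:
        self.training = True

    def train(self, mode: bool = True) -> "ToyModule":
        self.training = mode
        return self


@contextmanager
def in_eval_mode(module: ToyModule) -> Iterator[None]:
    """Temporarily switch the module to evaluation mode (assumed behavior)."""
    previous = module.training
    module.train(False)
    try:
        yield
    finally:
        # Restore whatever mode was active before entering the block.
        module.train(previous)


@contextmanager
def in_train_mode(module: ToyModule) -> Iterator[None]:
    """Temporarily switch the module to training mode (assumed behavior)."""
    previous = module.training
    module.train(True)
    try:
        yield
    finally:
        module.train(previous)
```

Restoring the previous mode in a `finally` clause means the module returns to its original state even if the code inside the `with` block raises.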
Fixes
- `CriticFactoryReuseActor`: Enable the Critic flag `apply_preprocess_net_to_obs_only` for continuous critics, fixing the case where we want to reuse an actor's preprocessing network for the critic (affects usages of the experiment builder method `with_critic_factory_use_actor` with continuous environments) #1128
- `atari_network.DQN`:
  - Fix constructor input validation #1128
  - Fix `output_dim` not being set if `features_only=True` and `output_dim_added_layer` is not None #1128
Internal Improvements
- `Collector`s rely less on state, the few stateful things are stored explicitly instead of through a `.data` attribute. #1063
- Introduced a first iteration of a naming convention for vars in `Collector`s. #1063
- Generally improved readability of Collector code and associated tests (still quite some way to go). #1063
- Improved typing for `exploration_noise` and within Collector. #1063
- Better variable names related to model outputs (logits, dist input etc.). #1032
- Improved typing for actors and critics, using Tianshou classes like `Actor`, `ActorProb`, etc., instead of just `nn.Module`. #1032
- Added interfaces for most `Actor` and `Critic` classes to enforce the presence of `forward` methods. #1032
- Simplified `PGPolicy` forward by unifying the `dist_fn` interface (see associated breaking change). #1032
- Use `.mode` of distribution instead of relying on knowledge of the distribution type. #1032
- Exception no longer raised on `len` of empty `Batch`. #1084
- Tests and examples are covered by `mypy`. #1077
- `NetBase` is used more widely, with stricter typing by making it generic. #1077
- Use explicit multiprocessing context for creating `Pipe` in `subproc.py`. #1102
- Removed all `if __name__ == "__main__":` blocks from tests. #1122
- Improved typing issues in tests with buffer and collector. #1122
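
The explicit multiprocessing context mentioned in the list above refers to a standard-library pattern: rather than calling `multiprocessing.Pipe()` with the platform's default start method, one obtains a context via `multiprocessing.get_context(...)` and creates the pipe from it. A minimal sketch follows; the `"spawn"` start method and the message content are assumptions chosen for illustration and are not taken from `subproc.py`.

```python
import multiprocessing

# Pick the start method explicitly rather than relying on the platform default.
# "spawn" is an assumption here; it is available on all major platforms.
ctx = multiprocessing.get_context("spawn")

# Pipe created from the context; by default both ends are duplex connections.
parent_conn, child_conn = ctx.Pipe()

# Round trip within a single process, only to show the two ends are connected.
child_conn.send({"cmd": "reset", "seed": 42})
message = parent_conn.recv()

parent_conn.close()
child_conn.close()
```

In real use the child end would be handed to a worker process started from the same context, so that the pipe and the process agree on the start method.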
Breaking Changes
- Removed `.data` attribute from `Collector` and its child classes. #1063
- Collectors no longer reset the environment on initialization. Instead, the user might have to call `reset` explicitly or pass `reset_before_collect=True`. #1063
- VectorEnvs now return an array of info-dicts on reset instead of a list. #1063
- Fixed `iter(Batch(...))`, which now behaves the same way as `Batch(...).__iter__()`. Can be considered a bugfix. #1063
- Changed interface of `dist_fn` in `PGPolicy` and all subclasses to take a single argument in both continuous and discrete cases. #1032
- `utils.net.common.Recurrent` now receives and returns a `RecurrentStateBatch` instead of a dict. #1077
- The methods `to_numpy` and `to_torch` in `Batch` are not in-place anymore (use `to_numpy_` or `to_torch_` instead). #1098, #1117
- `AtariEnvFactory` constructor (in examples, so not really breaking) now requires explicit train and test seeds. #1074
- `EnvFactoryRegistered` now requires an explicit `test_seed` in the constructor. #1074
- `BaseLogger.prepare_dict_for_logging` is now abstract. #1074
- Removed deprecated and unused `BasicLogger` (only affects users who subclassed it). #1074
- Removed deprecations of `0.5.1` (will likely not affect anyone) and the unused `warnings` module. #1122
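
The `to_numpy` / `to_numpy_` change above follows the trailing-underscore naming convention: the plain method returns a converted copy, while the underscore variant modifies the object itself. The sketch below illustrates only that convention. `ToyBatch` and its `to_float` / `to_float_` methods are hypothetical and are not part of Tianshou.

```python
class ToyBatch:
    """Hypothetical container used only to illustrate the naming convention."""

    def __init__(self, **fields: int) -> None:
        self.fields: dict[str, float] = dict(fields)

    def to_float(self) -> "ToyBatch":
        """Return a converted copy; the original object is left untouched."""
        converted = ToyBatch()
        converted.fields = {key: float(value) for key, value in self.fields.items()}
        return converted

    def to_float_(self) -> None:
        """Convert in place; nothing is returned."""
        for key, value in self.fields.items():
            self.fields[key] = float(value)


batch = ToyBatch(reward=1, done=0)
converted = batch.to_float()  # new object; batch still holds ints
batch.to_float_()             # batch itself now holds floats
```

Code that relied on the old in-place behavior of the plain methods would, under this convention, either switch to the underscore variant or use the returned copy.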
Tests
- Fixed env seeding in `test_sac_with_il.py` so that the test doesn't fail randomly. #1081
Dependencies
- DeepDiff added to help with diffs of batches in tests. #1098
- Bumped black, idna, pillow
- New extra "eval"
Started after v1.0.0