Changelog
Release 1.1.0
Api Extensions
- data:
  - Batch:
    - Add methods to_dict and to_list_of_dicts. #1063 #1098
    - Add methods to_numpy_ and to_torch_. #1098, #1117
    - Add __eq__ (semantic equality check). #1098
    - keys() deprecated in favor of get_keys() (needed to make iteration consistent with naming). #1105
- data.collector:
  - Collector:
    - Introduced BaseCollector as a base class for all collectors. #1123
    - Add method close. #1063
    - Method reset is now more granular (new flags controlling behavior). #1063
  - CollectStats: Add convenience constructor with_autogenerated_stats. #1063
- trainer:
  - Trainers can now control whether collectors should be reset prior to training. #1063
- policy:
  - Introduced attribute in_training_step that is controlled by the trainer. #1123
  - Policy automatically set to eval mode when collecting and to train mode when updating. #1123
- highlevel:
  - SamplingConfig:
    - Add support for batch_size=None. #1077
    - Add training_seed for explicit seeding of training and test environments; the test_seed is inferred from training_seed. #1074
- highlevel.experiment:
  - Experiment now has a name attribute, which can be set using ExperimentBuilder.with_name and which determines the default run name and therefore the persistence subdirectory. It can still be overridden in Experiment.run(), the new parameter name being run_name rather than experiment_name (although the latter will still be interpreted correctly). #1074 #1131
  - Add class ExperimentCollection for the convenient execution of multiple experiment runs. #1131
  - ExperimentBuilder:
    - Add method build_seeded_collection for the sound creation of multiple experiments with varying random seeds. #1131
    - Add method copy to facilitate the creation of multiple experiments from a single builder. #1131
- evaluation: New package for repeating the same experiment with multiple seeds and aggregating the results. #1074
  - The module evaluation.launchers for parallelization is currently in alpha state.
- Loggers can now restore the logged data into Python by using the new restore_logged_data method. #1074
- utils:
  - net.continuous.Critic:
    - Add flag apply_preprocess_net_to_obs_only to allow the preprocessing network to be applied to the observations only (without the actions concatenated), which is essential for the case where we want to reuse the actor's preprocessing network. #1128
  - torch_utils (new module):
    - Added context managers torch_train_mode and policy_within_training_step. #1123
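
The mode switching mentioned for the policy and for torch_utils can be pictured with a small sketch. This is not Tianshou's implementation: ToyModule and temporary_train_mode are hypothetical names, and ToyModule merely stands in for an object that has a training flag and a train() method. The sketch only shows the general idea of setting a mode on entry and restoring the previous one on exit.

```python
from collections.abc import Iterator
from contextlib import contextmanager


class ToyModule:
    """Hypothetical stand-in for an object with a `training` flag and `train()`."""

    def __init__(self) -> None:
        self.training = True

    def train(self, mode: bool = True) -> "ToyModule":
        self.training = mode
        return self


@contextmanager
def temporary_train_mode(module: ToyModule, enabled: bool = True) -> Iterator[None]:
    """Set the training flag for the duration of the block, then restore it."""
    previous = module.training
    module.train(enabled)
    try:
        yield
    finally:
        # Restore the earlier mode even if the body raised an exception.
        module.train(previous)


policy = ToyModule()
with temporary_train_mode(policy, enabled=False):
    mode_inside = policy.training  # flag is off inside the block
mode_after = policy.training  # previous value is restored afterwards
```

Restoring the previous value in a finally clause, rather than hard-coding a mode on exit, keeps nested uses of such a context manager predictable.
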
Fixes
- CriticFactoryReuseActor: Enable the Critic flag apply_preprocess_net_to_obs_only for continuous critics, fixing the case where we want to reuse an actor's preprocessing network for the critic (affects usages of the experiment builder method with_critic_factory_use_actor with continuous environments). #1128
- atari_network.DQN:
  - Fix constructor input validation. #1128
  - Fix output_dim not being set if features_only=True and output_dim_added_layer is not None. #1128
Internal Improvements
- Collectors rely less on state; the few stateful things are stored explicitly instead of through a .data attribute. #1063
- Introduced a first iteration of a naming convention for vars in Collectors. #1063
- Generally improved readability of Collector code and associated tests (still quite some way to go). #1063
- Improved typing for exploration_noise and within Collector. #1063
- Better variable names related to model outputs (logits, dist input etc.). #1032
- Improved typing for actors and critics, using Tianshou classes like Actor, ActorProb, etc., instead of just nn.Module. #1032
- Added interfaces for most Actor and Critic classes to enforce the presence of forward methods. #1032
- Simplified PGPolicy forward by unifying the dist_fn interface (see associated breaking change). #1032
- Use .mode of distribution instead of relying on knowledge of the distribution type. #1032
- Exception no longer raised on len of empty Batch. #1084
- Tests and examples are covered by mypy. #1077
- NetBase is more used, stricter typing by making it generic. #1077
- Use explicit multiprocessing context for creating Pipe in subproc.py. #1102
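
As an illustration of the last item, the following minimal sketch shows what creating a Pipe from an explicit multiprocessing context looks like with the standard library. The start method "spawn" is an assumption made for this example; the changelog does not state which one subproc.py uses.

```python
import multiprocessing as mp

# Request a context explicitly instead of relying on the platform default.
# "spawn" is an assumption made for this sketch.
ctx = mp.get_context("spawn")

# Create the pipe through the same context that would create worker processes.
parent_conn, child_conn = ctx.Pipe()

# Round trip within a single process, just to show that the pair is connected.
parent_conn.send({"cmd": "reset"})
received = child_conn.recv()

parent_conn.close()
child_conn.close()
```
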
Breaking Changes
- data:
  - Collector:
    - Removed .data attribute. #1063
    - Collectors no longer reset the environment on initialization. Instead, the user might have to call reset explicitly or pass reset_before_collect=True. #1063
    - Removed no_grad argument from collect method (was unused in Tianshou). #1123
  - Batch:
    - Fixed iter(Batch(...)), which now behaves the same way as Batch(...).__iter__(). Can be considered a bugfix. #1063
    - The methods to_numpy and to_torch are not in-place anymore (use to_numpy_ or to_torch_ instead). #1098, #1117
- Logging:
  - BaseLogger.prepare_dict_for_logging is now abstract. #1074
  - Removed deprecated and unused BasicLogger (only affects users who subclassed it). #1074
- VectorEnvs now return an array of info-dicts on reset instead of a list. #1063
- Changed interface of dist_fn in PGPolicy and all subclasses to take a single argument in both continuous and discrete cases. #1032
- utils.net.common.Recurrent now receives and returns a RecurrentStateBatch instead of a dict. #1077
- AtariEnvFactory constructor (in examples, so not really breaking) now requires explicit train and test seeds. #1074
- EnvFactoryRegistered now requires an explicit test_seed in the constructor. #1074
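
The to_numpy / to_torch change above follows the convention that a trailing underscore marks the in-place variant of a method. The sketch below shows that convention on a hypothetical TinyBatch class built on plain Python containers. It is not Tianshou's Batch, and to_tuple / to_tuple_ are made-up names chosen only for illustration.

```python
import copy


class TinyBatch:
    """Hypothetical container used only to illustrate the naming convention."""

    def __init__(self, **fields: list) -> None:
        self.fields = dict(fields)

    def to_tuple_(self) -> None:
        """In-place variant: converts the values stored in this object."""
        self.fields = {key: tuple(value) for key, value in self.fields.items()}

    def to_tuple(self) -> "TinyBatch":
        """Copying variant: leaves this object untouched, returns a converted copy."""
        result = copy.deepcopy(self)
        result.to_tuple_()
        return result


batch = TinyBatch(obs=[1, 2, 3])
converted = batch.to_tuple()  # `batch` still holds a list at this point
batch.to_tuple_()             # now `batch` itself has been converted
```

Code that relied on the old in-place behavior of a non-underscore method would silently keep the unconverted object after such a change, which is why it is listed as breaking.
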
Tests
- Fixed env seeding in test_sac_with_il.py so that the test doesn't fail randomly. #1081
Dependencies
- DeepDiff added to help with diffs of batches in tests. #1098
- Bumped black, idna, pillow
- New extra "eval"
Started after v1.0.0