Tianshou/CHANGELOG.md
2024-04-29 22:30:54 +02:00

# Changelog
## Release 1.1.0
### API Extensions
- Batch received two new methods: `to_dict` and `to_list_of_dicts`. #1063
- `Collector`s can now be closed, and their reset is more granular. #1063
- Trainers can control whether collectors should be reset prior to training. #1063
- Convenience constructor for `CollectStats` called `with_autogenerated_stats`. #1063
- `SamplingConfig` supports `batch_size=None`. #1077
- Batch received new methods: `to_numpy_` and `to_torch_`. #1098, #1117
- `to_dict` in Batch also supports non-recursive conversion. #1098
- Batch `__eq__` implemented, semantic equality check of batches is now possible. #1098
- `Batch.keys()` deprecated in favor of `Batch.get_keys()` (needed to make iteration consistent with naming). #1105
- `Experiment` and `ExperimentConfig` now have a `name`, which can be overridden when `Experiment.run()` is called. #1074
- When building an `Experiment` from an `ExperimentConfig`, the user has the option to add info about seeds to the name. #1074
- New method in `ExperimentConfig` called `build_default_seeded_experiments`. #1074
- `SamplingConfig` has an explicit training seed, `test_seed` is inferred. #1074
- New `evaluation` package for repeating the same experiment with multiple seeds and aggregating the results (important extension!).
  Launchers for parallelization are currently in alpha state. #1074
- Loggers can now restore the logged data into python by using the new `restore_logged_data` method. #1074
- `continuous.Critic`:
  - Add flag `apply_preprocess_net_to_obs_only` to allow the
    preprocessing network to be applied to the observations only (without
    the actions concatenated), which is essential for the case where we want
    to reuse the actor's preprocessing network. #1128
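
The idea behind the new `evaluation` package (same experiment, several seeds, aggregated results) can be illustrated with a standard-library-only sketch. This is not the package's API: `run_toy_experiment` and `aggregate_over_seeds` are hypothetical names, and the "experiment" is just a seeded random draw standing in for a training run.

```python
import random
import statistics


def run_toy_experiment(seed: int) -> float:
    """Hypothetical stand-in for one training run; returns a fake final score."""
    rng = random.Random(seed)  # a seeded generator keeps each run reproducible
    return 100.0 + rng.gauss(0.0, 5.0)


def aggregate_over_seeds(seeds: list[int]) -> dict[str, float]:
    """Run the same experiment once per seed and summarize the scores."""
    scores = [run_toy_experiment(seed) for seed in seeds]
    return {
        "n_runs": len(scores),
        "mean": statistics.mean(scores),
        "std": statistics.stdev(scores),  # sample standard deviation, needs >= 2 runs
    }


summary = aggregate_over_seeds([0, 1, 2, 3, 4])
print(summary)
```

Because every run derives its randomness from its own seed, calling `aggregate_over_seeds` twice with the same seeds gives the same summary, which is the property that makes multi-seed comparisons meaningful.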
### Fixes
- `CriticFactoryReuseActor`: Enable the Critic flag `apply_preprocess_net_to_obs_only` for continuous critics,
  fixing the case where we want to reuse an actor's preprocessing network for the critic (affects usages
  of the experiment builder method `with_critic_factory_use_actor` with continuous environments). #1128
- `atari_network.DQN`:
  - Fix constructor input validation. #1128
  - Fix `output_dim` not being set if `features_only=True` and `output_dim_added_layer` is not None. #1128
### Internal Improvements
- `Collector`s rely less on state, the few stateful things are stored explicitly instead of through a `.data` attribute. #1063
- Introduced a first iteration of a naming convention for vars in `Collector`s. #1063
- Generally improved readability of Collector code and associated tests (still quite some way to go). #1063
- Improved typing for `exploration_noise` and within Collector. #1063
- Better variable names related to model outputs (logits, dist input etc.). #1032
- Improved typing for actors and critics, using Tianshou classes like `Actor`, `ActorProb`, etc.,
  instead of just `nn.Module`. #1032
- Added interfaces for most `Actor` and `Critic` classes to enforce the presence of `forward` methods. #1032
- Simplified `PGPolicy` forward by unifying the `dist_fn` interface (see associated breaking change). #1032
- Use `.mode` of distribution instead of relying on knowledge of the distribution type. #1032
- Exception no longer raised on `len` of empty `Batch`. #1084
- Tests and examples are covered by `mypy`. #1077
- `NetBase` is used more widely, with stricter typing achieved by making it generic. #1077
- Use explicit multiprocessing context for creating `Pipe` in `subproc.py`. #1102
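
For readers unfamiliar with the last item, here is a minimal standard-library sketch of creating a `Pipe` through an explicit multiprocessing context rather than through the module-level default. Which start method `subproc.py` actually uses is not stated here; `"spawn"` is chosen only for illustration, and the message contents are made up.

```python
import multiprocessing

# Request a specific start method instead of relying on the platform default.
# "spawn" is used here only as an example.
ctx = multiprocessing.get_context("spawn")

# The context object offers the same factory functions as the top-level
# module, so the pipe is created through the context.
parent_end, child_end = ctx.Pipe()

# Same-process round trip, just to show that the two ends are connected.
child_end.send({"cmd": "reset", "seed": 42})
message = parent_end.recv()
print(message)

parent_end.close()
child_end.close()
```

Creating pipes and processes from one context object keeps their behavior independent of the global start method, which differs between operating systems.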
### Breaking Changes
- Removed `.data` attribute from `Collector` and its child classes. #1063
- Collectors no longer reset the environment on initialization. Instead, the user might have to call `reset`
  explicitly or pass `reset_before_collect=True`. #1063
- VectorEnvs now return an array of info-dicts on reset instead of a list. #1063
- Fixed `iter(Batch(...))`, which now behaves the same way as `Batch(...).__iter__()`. Can be considered a bugfix. #1063
- Changed interface of `dist_fn` in `PGPolicy` and all subclasses to take a single argument in both
  continuous and discrete cases. #1032
- `utils.net.common.Recurrent` now receives and returns a `RecurrentStateBatch` instead of a dict. #1077
- The methods `to_numpy` and `to_torch` in `Batch` are no longer in-place (use `to_numpy_` or `to_torch_` instead). #1098, #1117
- `AtariEnvFactory` constructor (in examples, so not really breaking) now requires explicit train and test seeds. #1074
- `EnvFactoryRegistered` now requires an explicit `test_seed` in the constructor. #1074
- `BaseLogger.prepare_dict_for_logging` is now abstract. #1074
- Removed deprecated and unused `BasicLogger` (only affects users who subclassed it). #1074
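
The `to_numpy` / `to_numpy_` change follows the trailing-underscore convention for in-place methods. The toy class below sketches that convention using only the standard library. `ToyBatch`, `to_tuples` and `to_tuples_` are hypothetical names, not Tianshou's `Batch`, and the return values of the real methods are not specified here.

```python
import copy


class ToyBatch:
    """Hypothetical stand-in for a batch; not Tianshou's `Batch`."""

    def __init__(self, **fields: list) -> None:
        self.fields = dict(fields)

    def to_tuples_(self) -> None:
        """In-place variant: converts the values stored in this object."""
        for key, value in self.fields.items():
            self.fields[key] = tuple(value)

    def to_tuples(self) -> "ToyBatch":
        """Out-of-place variant: leaves this object alone, returns a converted copy."""
        result = copy.deepcopy(self)
        result.to_tuples_()
        return result


batch = ToyBatch(obs=[1, 2, 3])

converted = batch.to_tuples()  # batch itself is unchanged
assert isinstance(batch.fields["obs"], list)
assert isinstance(converted.fields["obs"], tuple)

batch.to_tuples_()  # now batch itself is converted
assert isinstance(batch.fields["obs"], tuple)
```

Code that relied on the old in-place behavior of `to_numpy` or `to_torch` would, by analogy, either switch to the underscore variant or use the returned object.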
### Tests
- Fixed env seeding in `test_sac_with_il.py` so that the test doesn't fail randomly. #1081
### Dependencies
- [DeepDiff](https://github.com/seperman/deepdiff) added to help with diffs of batches in tests. #1098
- Bumped black, idna, pillow
- New extra "eval"
Started after v1.0.0