Closes: https://github.com/aai-institute/tianshou/issues/1116
### API Extensions
- `Batch` received a new method: `to_torch_`. #1117
### Breaking Changes
- The method `to_torch` in `data.utils.batch.Batch` is no longer
in-place. Instead, the new method `to_torch_` performs the conversion
in-place. #1117
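
The split between a copying method and an in-place method with a trailing underscore can be illustrated with a minimal sketch. `ToyBatch`, `to_tuples`, and `to_tuples_` are hypothetical names invented for this illustration, and the list-to-tuple conversion merely stands in for the tensor conversion. This is not Tianshou's `Batch` implementation.

```python
from __future__ import annotations

import copy
from typing import Any


class ToyBatch:
    """Hypothetical stand-in used only for illustration; NOT Tianshou's Batch."""

    def __init__(self, **fields: Any) -> None:
        self.fields = dict(fields)

    def to_tuples_(self) -> None:
        """In-place variant: converts every list field to a tuple, mutating self."""
        for key, value in self.fields.items():
            if isinstance(value, list):
                self.fields[key] = tuple(value)

    def to_tuples(self) -> ToyBatch:
        """Copying variant: returns a converted copy and leaves self untouched."""
        result = copy.deepcopy(self)
        result.to_tuples_()
        return result


original = ToyBatch(obs=[1, 2, 3])
converted = original.to_tuples()  # original still holds a list
original.to_tuples_()             # now original is converted as well
```

Code that relied on the old in-place behavior of the method without the underscore would need to either use the return value or switch to the underscore variant.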
Closes: https://github.com/thu-ml/tianshou/issues/1086
### API Extensions
- `Batch` received a new method: `to_numpy_`. #1098
- `to_dict` in `Batch` now also supports non-recursive conversion. #1098
- `Batch.__eq__` is now implemented, so batches can be checked for
semantic equality. #1098
### Breaking Changes
- The method `to_numpy` in `data.utils.batch.Batch` is no longer
in-place. Instead, the new method `to_numpy_` performs the conversion
in-place. #1098
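
What non-recursive conversion and semantic equality mean for a nested container can be sketched as follows. `NestedToyBatch` and its `recursive` flag are assumptions made for this illustration. They are not Tianshou's actual `Batch` code, and the real signatures may differ.

```python
from __future__ import annotations

from typing import Any


class NestedToyBatch:
    """Hypothetical nested container for illustration; NOT Tianshou's Batch."""

    def __init__(self, **fields: Any) -> None:
        self.fields = dict(fields)

    def to_dict(self, recursive: bool = True) -> dict[str, Any]:
        if not recursive:
            # Shallow conversion: nested containers are kept as objects.
            return dict(self.fields)
        # Recursive conversion: nested containers become plain dicts too.
        return {
            key: value.to_dict(recursive=True)
            if isinstance(value, NestedToyBatch)
            else value
            for key, value in self.fields.items()
        }

    def __eq__(self, other: object) -> bool:
        # Semantic equality: two distinct objects with equal content are equal.
        if not isinstance(other, NestedToyBatch):
            return NotImplemented
        return self.to_dict() == other.to_dict()


a = NestedToyBatch(obs=1, info=NestedToyBatch(step=3))
b = NestedToyBatch(obs=1, info=NestedToyBatch(step=3))
same_content = a == b        # compares content, not identity
shallow = a.to_dict(recursive=False)
deep = a.to_dict()
```

In this sketch, `shallow["info"]` is still a `NestedToyBatch`, whereas `deep["info"]` is a plain dictionary.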
Closes #952
- `SamplingConfig` supports `batch_size=None`. #1077
- tests and examples are covered by `mypy`. #1077
- `NetBase` is used more widely and is now generic, which allows stricter typing. #1077
- `utils.net.common.Recurrent` now receives and returns a
`RecurrentStateBatch` instead of a dict. #1077
---------
Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>
Closes #917
### Internal Improvements
- Better variable names related to model outputs (logits, dist input
etc.). #1032
- Improved typing for actors and critics, using Tianshou classes like
`Actor`, `ActorProb`, etc.,
instead of just `nn.Module`. #1032
- Added interfaces for most `Actor` and `Critic` classes to enforce the
presence of `forward` methods. #1032
- Simplified `PGPolicy` forward by unifying the `dist_fn` interface (see
associated breaking change). #1032
- Use `.mode` of distribution instead of relying on knowledge of the
distribution type. #1032
### Breaking Changes
- Changed interface of `dist_fn` in `PGPolicy` and all subclasses to
take a single argument in both
continuous and discrete cases. #1032
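
A single-argument `dist_fn` can be sketched as below. This is a standard-library illustration under stated assumptions, not Tianshou's code. `statistics.NormalDist` stands in for a PyTorch normal distribution, and `ToyCategorical`, `continuous_dist_fn`, `discrete_dist_fn`, and `build_dist` are hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import NormalDist
from typing import Any, Callable


@dataclass
class ToyCategorical:
    """Hypothetical discrete distribution, used only for illustration."""

    probs: list[float]


def continuous_dist_fn(loc_scale: tuple[float, float]) -> NormalDist:
    # Continuous case: the single argument is a (loc, scale) tuple
    # that is unpacked inside the function.
    loc, scale = loc_scale
    return NormalDist(mu=loc, sigma=scale)


def discrete_dist_fn(probs: list[float]) -> ToyCategorical:
    # Discrete case: the single argument is the probability vector.
    return ToyCategorical(probs)


def build_dist(dist_fn: Callable[[Any], Any], model_output: Any) -> Any:
    # The caller passes the model output through unchanged and does not
    # need to know whether the action space is continuous or discrete.
    return dist_fn(model_output)


normal = build_dist(continuous_dist_fn, (0.5, 2.0))
categorical = build_dist(discrete_dist_fn, [0.1, 0.7, 0.2])
```

Under this convention, a `dist_fn` that previously expected separate positional arguments in the continuous case would instead unpack a tuple itself.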
---------
Co-authored-by: Arnau Jimenez <arnau.jimenez@zeiss.com>
Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>
Closes: #1058
### API Extensions
- `Batch` received two new methods: `to_dict` and `to_list_of_dicts`.
#1063
- `Collector`s can now be closed, and their reset is more granular.
#1063
- Trainers can control whether collectors should be reset prior to
training. #1063
- Convenience constructor for `CollectStats` called
`with_autogenerated_stats`. #1063
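
The idea behind a list-of-dicts view of batched data can be sketched with plain Python containers. The helper `columns_to_rows` is a hypothetical name, and the sketch assumes every field is a list of equal length. It does not reproduce how `to_list_of_dicts` is implemented in Tianshou.

```python
from __future__ import annotations

from typing import Any


def columns_to_rows(columns: dict[str, list[Any]]) -> list[dict[str, Any]]:
    """Turn a dict of equal-length lists into a list of per-entry dicts."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("all fields must have the same length")
    n_rows = lengths.pop() if lengths else 0
    return [
        {key: values[i] for key, values in columns.items()}
        for i in range(n_rows)
    ]


rows = columns_to_rows({"obs": [1, 2], "rew": [0.0, 1.0]})
```

Each element of `rows` then describes one entry of the batch, which is convenient for logging or row-wise inspection.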
### Internal Improvements
- `Collector`s rely less on state, the few stateful things are stored
explicitly instead of through a `.data` attribute. #1063
- Introduced a first iteration of a naming convention for vars in
`Collector`s. #1063
- Generally improved readability of Collector code and associated tests
(still quite some way to go). #1063
- Improved typing for `exploration_noise` and within Collector. #1063
### Breaking Changes
- Removed `.data` attribute from `Collector` and its child classes.
#1063
- Collectors no longer reset the environment on initialization. Instead,
the user may have to call `reset`
explicitly or pass `reset_before_collect=True`. #1063
- VectorEnvs now return an array of info-dicts on reset instead of a
list. #1063
- Fixed `iter(Batch(...))`, which now behaves the same way as
`Batch(...).__iter__()`. Can be considered a bugfix. #1063
---------
Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>
Changed all occurrences where an action is selected deterministically:
- **from**: using the outputs of the actor network.
- **to**: using the mode of the PyTorch distribution.
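
Selecting the deterministic action from the distribution's mode, rather than from type-specific knowledge of the network output, can be sketched as follows. This is a standard-library illustration, not the code from this change. `statistics.NormalDist` stands in for a PyTorch distribution, and `ToyCategorical` and `deterministic_action` are hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import NormalDist
from typing import Any


@dataclass
class ToyCategorical:
    """Hypothetical discrete distribution, used only for illustration."""

    probs: list[float]

    @property
    def mode(self) -> int:
        # Index of the most probable outcome.
        return max(range(len(self.probs)), key=lambda i: self.probs[i])


def deterministic_action(dist: Any) -> Any:
    # No branching on the distribution type: anything exposing `.mode` works.
    return dist.mode


continuous_action = deterministic_action(NormalDist(mu=0.3, sigma=1.0))
discrete_action = deterministic_action(ToyCategorical([0.2, 0.5, 0.3]))
```

The calling code stays identical for continuous and discrete action spaces, since each distribution defines its own mode.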
---------
Co-authored-by: Arnau Jimenez <arnau.jimenez@zeiss.com>
The high-level examples were all broken by changes made to make mypy
pass.
This PR fixes them by changing a type in `logging.run_cli` instead, which
keeps mypy passing.
* Add an option to SamplingConfig for configuring the number of
test episodes
* Make OptimizerFactory more flexible, adding method
`create_optimizer_for_params`
* Fix AutoAlphaFactoryDefault using hard-coded Adam optimizer
* Fix mypy issues that were platform/installation-dependent
* Limit scope of nbqa, resolving issues with files generated by old
versions of the build
Fixes #1054
Closes #1005
## Main changes
2. Load vega-embed things using jupyter-book config
3. Add vega-embed dependencies as part of local code for offline
development
4. Reduced duplication in benchmark.js
5. Update sphinx, docutils, and jupyter-book
Co-authored-by: carlocagnetta <c.cagnetta@appliedai.de>
- Added nbqa to pyproject.toml
- Resolved mypy issues on notebooks and related files
- Enabled ruff checks on notebooks
- Add DataclassPPrintMixin for better stats representation
- Improved notebook wording and explanations
Resolve: #1004
Related to #974
Addresses part of #1015
### Dependencies
- moved jsonargparse and docstring-parser to the main dependencies so that
the high-level examples run without the dev dependencies
- created mujoco-py extra for legacy mujoco envs
- updated atari extra
    - removed atari-py and gym dependencies
    - added ALE-py, autorom, and shimmy
- created robotics extra for HER-DDPG
### Mac specific
- only install envpool when not on mac
- mujoco-py does not work on macOS versions newer than Monterey
(https://github.com/openai/mujoco-py/issues/777)
- D4RL also fails due to dependency on mujoco-py
(https://github.com/Farama-Foundation/D4RL/issues/232)
### Other
- reduced training-num/test-num in example files to a number ≤ 20
(examples with 100 led to too many open files)
- rendering for Mujoco envs needs to be fixed on gymnasium side
(https://github.com/Farama-Foundation/Gymnasium/issues/749)
---------
Co-authored-by: Maximilian Huettenrauch <m.huettenrauch@appliedai.de>
Co-authored-by: Michael Panchenko <35432522+MischaPanch@users.noreply.github.com>