662 Commits

Author SHA1 Message Date
maxhuettenrauch
a043711c10
Fix/deterministic action space sampling in SubprocVectorEnv (#1103) 2024-04-18 16:16:57 +02:00
Daniel Plop
6935a111d9
Add non in-place version of Batch.to_torch (#1117)
Closes: https://github.com/aai-institute/tianshou/issues/1116

### API Extensions

- Batch received new method: `to_torch_`. #1117

### Breaking Changes

- The method `to_torch` in `data.utils.batch.Batch` is not in-place
anymore. Instead, a new method `to_torch_` does the conversion in-place.
#1117
2024-04-17 22:07:24 +02:00
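The in-place versus out-of-place split described in this commit can be illustrated with a minimal sketch. `TinyBatch`, `to_float` and `to_float_` are hypothetical stand-ins chosen for illustration; they are not Tianshou's `Batch` or its actual implementation, and only the trailing-underscore naming convention is taken from the commit message.

```python
import copy


class TinyBatch:
    """Hypothetical stand-in for a Batch-like container (not Tianshou's class)."""

    def __init__(self, **fields):
        self.fields = dict(fields)

    def to_float_(self) -> None:
        """In-place variant: converts the values stored on this object."""
        for key, values in self.fields.items():
            self.fields[key] = [float(v) for v in values]

    def to_float(self) -> "TinyBatch":
        """Out-of-place variant: returns a converted copy, leaves self untouched."""
        result = copy.deepcopy(self)
        result.to_float_()
        return result


batch = TinyBatch(obs=[1, 2, 3])
converted = batch.to_float()
print(batch.fields)      # the original still holds ints
print(converted.fields)  # the copy holds floats
batch.to_float_()
print(batch.fields)      # now converted in place
```

Under this convention, code that relied on the old in-place behaviour would switch to the underscore variant, while code that wants to keep the original object intact uses the plain name.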
Daniel Plop
ca4f74f40e
Allow two (same/different) Batch objs to be tested for equality (#1098)
Closes: https://github.com/thu-ml/tianshou/issues/1086

### API Extensions

- Batch received new method: `to_numpy_`. #1098
- `to_dict` in Batch also supports non-recursive conversion. #1098
- Batch `__eq__` now implemented, semantic equality check of batches is
now possible. #1098

### Breaking Changes

- The method `to_numpy` in `data.utils.batch.Batch` is not in-place
anymore. Instead, a new method `to_numpy_` does the conversion in-place.
#1098
2024-04-16 18:12:48 +02:00
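What a semantic equality check means for a nested container can be sketched as follows. `EqBatch` is a hypothetical toy class, not Tianshou's `Batch`; the sketch only assumes that two containers count as equal when they hold the same keys and equal contents, including nested containers.

```python
class EqBatch:
    """Hypothetical nested key-value container (not Tianshou's Batch)."""

    def __init__(self, **fields):
        self.fields = dict(fields)

    def __eq__(self, other: object) -> bool:
        # Defer to Python's default handling for foreign types,
        # which ends up comparing by identity and yields False.
        if not isinstance(other, EqBatch):
            return NotImplemented
        # The same set of keys is required.
        if self.fields.keys() != other.fields.keys():
            return False
        # Values are compared entry by entry; nested EqBatch values
        # recurse through this same method.
        return all(self.fields[k] == other.fields[k] for k in self.fields)


a = EqBatch(obs=[1, 2], info=EqBatch(step=3))
b = EqBatch(obs=[1, 2], info=EqBatch(step=3))
c = EqBatch(obs=[1, 2], info=EqBatch(step=4))

print(a is b)  # False: two distinct objects
print(a == b)  # True: same keys and same contents
print(a == c)  # False: a nested value differs
```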
Michael Panchenko
049907d9ab Fix type check in atari wrapper, solves #1111 2024-04-16 10:52:48 +02:00
maxhuettenrauch
60d1ba1c8f
Fix/reset before collect in procedural examples, tests and hl experiment (#1100)
Needed due to a breaking change in the Collector which was overlooked in some of the examples
2024-04-16 10:30:21 +02:00
Molasses
766f6fedf2
Fix imports in Readme 2024-04-15 11:32:35 +02:00
Erni
e2a2a6856d
Changed .keys() to get_keys() in batch class (#1105)
Solves the inconsistency that iter(Batch) is not the same as Batch.keys() by "deprecating" the implicit .keys() method

Closes: #922
2024-04-12 12:15:37 +02:00
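The inconsistency mentioned in this commit can be illustrated with a small sketch. It assumes, as the commit message implies, that iterating over the container does not yield its keys (unlike a `dict`, where `iter(d)` and `d.keys()` agree). `RowBatch` is a hypothetical toy class; only the method name `get_keys` is taken from the commit.

```python
class RowBatch:
    """Hypothetical container whose iteration walks entries, not keys."""

    def __init__(self, **fields):
        self.fields = dict(fields)

    def __len__(self) -> int:
        # Length along the data dimension (assumes equal-length fields).
        return len(next(iter(self.fields.values()), []))

    def __iter__(self):
        # Yields one "row" per index, unlike a dict, whose iterator yields keys.
        for i in range(len(self)):
            yield {key: values[i] for key, values in self.fields.items()}

    def get_keys(self):
        # Explicitly named accessor, so the object no longer looks like a
        # mapping whose iter() and keys() would be expected to agree.
        return self.fields.keys()


batch = RowBatch(obs=[1, 2], act=[0, 1])
print(list(batch))             # [{'obs': 1, 'act': 0}, {'obs': 2, 'act': 1}]
print(list(batch.get_keys()))  # ['obs', 'act']
```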
Michael Panchenko
03e9af04b7
Update README.md (removed instability warning) [skip ci] 2024-04-05 12:05:20 +02:00
Michael Panchenko
bab5c634e7
Missing link in README.md [skip ci] 2024-04-05 12:04:27 +02:00
Daniel Plop
8a0629ded6
Fix mypy issues in tests and examples (#1077)
Closes #952 

- `SamplingConfig` supports `batch_size=None`. #1077
- tests and examples are covered by `mypy`. #1077
- `NetBase` is used more widely, with stricter typing from making it generic. #1077
- `utils.net.common.Recurrent` now receives and returns a
`RecurrentStateBatch` instead of a dict. #1077

---------

Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>
2024-04-03 18:07:51 +02:00
Michael Panchenko
55fa6f7f35
Don't raise error on len of empty Batch (#1084) 2024-04-03 13:37:18 +02:00
Erni
bf0d632108
Naming and typing improvements in Actor/Critic/Policy forwards (#1032)
Closes #917 

### Internal Improvements
- Better variable names related to model outputs (logits, dist input
etc.). #1032
- Improved typing for actors and critics, using Tianshou classes like
`Actor`, `ActorProb`, etc.,
instead of just `nn.Module`. #1032
- Added interfaces for most `Actor` and `Critic` classes to enforce the
presence of `forward` methods. #1032
- Simplified `PGPolicy` forward by unifying the `dist_fn` interface (see
associated breaking change). #1032
- Use `.mode` of distribution instead of relying on knowledge of the
distribution type. #1032

### Breaking Changes

- Changed interface of `dist_fn` in `PGPolicy` and all subclasses to
take a single argument in both
continuous and discrete cases. #1032

---------

Co-authored-by: Arnau Jimenez <arnau.jimenez@zeiss.com>
Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>
2024-04-01 17:14:17 +02:00
Michael Panchenko
5bf923c9bd Removed more references to Chinese docs [skip ci] 2024-03-28 18:17:25 +01:00
Michael Panchenko
23a33a9aa3 Removed link to Chinese docs [skip ci] 2024-03-28 18:13:15 +01:00
Michael Panchenko
ecb272c61b
Update CHANGELOG.md [skip ci] 2024-03-28 18:06:00 +01:00
bordeauxred
4f65b131aa
Feat/refactor collector (#1063)
Closes: #1058 

### API Extensions
- Batch received two new methods: `to_dict` and `to_list_of_dicts`.
#1063
- `Collector`s can now be closed, and their reset is more granular.
#1063
- Trainers can control whether collectors should be reset prior to
training. #1063
- Convenience constructor for `CollectStats` called
`with_autogenerated_stats`. #1063

### Internal Improvements
- `Collector`s rely less on state, the few stateful things are stored
explicitly instead of through a `.data` attribute. #1063
- Introduced a first iteration of a naming convention for vars in
`Collector`s. #1063
- Generally improved readability of Collector code and associated tests
(still quite some way to go). #1063
- Improved typing for `exploration_noise` and within Collector. #1063

### Breaking Changes

- Removed `.data` attribute from `Collector` and its child classes.
#1063
- Collectors no longer reset the environment on initialization. Instead,
the user might have to call `reset`
explicitly or pass `reset_before_collect=True`. #1063
- VectorEnvs now return an array of info-dicts on reset instead of a
list. #1063
- Fixed `iter(Batch(...))` which now behaves the same way as
`Batch(...).__iter__()`. Can be considered a bugfix. #1063

---------

Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>
2024-03-28 18:02:31 +01:00
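The breaking change around collector resets can be pictured with a toy sketch. `TinyCollector` is hypothetical and is not Tianshou's `Collector`; the parameter name `reset_before_collect` comes from the commit message, while raising an error on a missing reset is an assumption made only for this illustration.

```python
class TinyCollector:
    """Hypothetical collector (not Tianshou's Collector)."""

    def __init__(self) -> None:
        # No reset happens on construction, mirroring the change described above.
        self._obs = None

    def reset(self) -> None:
        # Stand-in for resetting the environment and internal counters.
        self._obs = 0

    def collect(self, n_step: int, reset_before_collect: bool = False) -> list:
        if reset_before_collect:
            self.reset()
        if self._obs is None:
            # Assumption of this sketch: collecting without a reset is an error.
            raise RuntimeError("call reset() or pass reset_before_collect=True")
        collected = []
        for _ in range(n_step):
            collected.append(self._obs)
            self._obs += 1  # toy transition: the observation is a step counter
        return collected


collector = TinyCollector()
print(collector.collect(3, reset_before_collect=True))  # [0, 1, 2]
collector.reset()
print(collector.collect(2))  # [0, 1]
```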
maxhuettenrauch
edae9e4403
fixed env seeding in test_sac_with_il.py (#1081) 2024-03-28 12:52:35 +01:00
Michael Panchenko
61bf9adaff
Update CHANGELOG.md [skip ci] 2024-03-20 23:09:26 +01:00
Michael Panchenko
5f96a57bbb
Add CHANGELOG.md 2024-03-20 23:08:34 +01:00
Michael Panchenko
1a4d7deca6
Update publish.yaml, typo [skip ci] v1.0.0 2024-03-20 00:41:46 +01:00
Michael Panchenko
72df9a580d
Update publish.yaml [skip ci] 2024-03-20 00:41:17 +01:00
Michael Panchenko
55e9bee373
Update publish.yaml [skip ci] 2024-03-20 00:39:54 +01:00
Michael Panchenko
e3661c11e3
Update publish.yaml, missing / [skip ci] 2024-03-20 00:26:11 +01:00
maxhuettenrauch
e82379c47f
Allow explicit setting of multiprocessing context for SubprocEnvWorker (#1072)
Running multiple training runs in parallel (with, for example, joblib)
fails on macOS due to a change in the standard context for
multiprocessing (see
[here](https://stackoverflow.com/questions/65098398/why-using-fork-works-but-using-spawn-fails-in-python3-8-multiprocessing)
or
[here](https://www.reddit.com/r/learnpython/comments/g5372v/multiprocessing_with_fork_on_macos/)).
This PR adds the ability to explicitly set a multiprocessing context for
the SubProcEnvWorker (similar to gymnasium's
[AsyncVecEnv](https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/vector/async_vector_env.py)).
---------

Co-authored-by: Maximilian Huettenrauch <m.huettenrauch@appliedai.de>
Co-authored-by: Michael Panchenko <35432522+MischaPanch@users.noreply.github.com>
2024-03-14 11:07:56 +01:00
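How an explicit multiprocessing context can be threaded through a worker is sketched below using only the standard library. `TinyWorker` and its `context` parameter are hypothetical and do not reproduce the signature of Tianshou's worker class; `multiprocessing.get_context` is the standard-library call that selects a start method.

```python
import multiprocessing as mp
from multiprocessing.context import BaseContext
from typing import Optional


class TinyWorker:
    """Hypothetical worker holder (not Tianshou's SubprocEnvWorker)."""

    def __init__(self, context: Optional[str] = None) -> None:
        # get_context(None) returns the platform default context; an explicit
        # name such as "spawn" (or "fork"/"forkserver" where available)
        # overrides it.
        self.ctx: BaseContext = mp.get_context(context)
        # Pipes and processes created from self.ctx use its start method.
        self.parent_conn, self.child_conn = self.ctx.Pipe()

    def close(self) -> None:
        self.parent_conn.close()
        self.child_conn.close()


worker = TinyWorker(context="spawn")
print(worker.ctx.get_start_method())  # spawn
worker.close()
```

Creating processes via `self.ctx.Process(...)` instead of `multiprocessing.Process(...)` is what keeps the chosen start method local to the worker rather than changing the global default.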
Dominik Jain
1714c7f2c7
High-level API: Fix number of test episodes being incorrectly scaled by number of envs (#1071) 2024-03-07 08:57:11 -08:00
Michael Panchenko
6746a80f6d
Add publish workflow, first preparation for next release (#1067) 2024-03-04 12:21:49 +01:00
Michael Panchenko
fdb69f1273
Improve README, minor changes in procedural example (#1068) 2024-03-03 15:07:07 +01:00
Dominik Jain
b6b2c95ac7 Improve README, minor changes in procedural example 2024-03-03 15:06:40 +01:00
Erni
1aee41fa9c
Using dist.mode instead of logits.argmax (#1066)
Changed all occurrences where an action is selected deterministically

- **from**: using the outputs of the actor network.
- **to**: using the mode of the PyTorch distribution.

---------

Co-authored-by: Arnau Jimenez <arnau.jimenez@zeiss.com>
2024-03-03 00:09:39 +01:00
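The motivation for reading the mode from the distribution can be sketched with two toy distributions. `TinyCategorical`, `TinyNormal` and `deterministic_action` are hypothetical stand-ins written in plain Python; they only mimic the idea of a `mode` property and are not PyTorch or Tianshou classes.

```python
class TinyCategorical:
    """Toy discrete distribution defined by a list of probabilities."""

    def __init__(self, probs):
        self.probs = list(probs)

    @property
    def mode(self) -> int:
        # For a categorical distribution the mode is the most likely index,
        # which coincides with an argmax over the probabilities.
        return max(range(len(self.probs)), key=self.probs.__getitem__)


class TinyNormal:
    """Toy Gaussian distribution defined by location and scale."""

    def __init__(self, loc: float, scale: float):
        self.loc = loc
        self.scale = scale

    @property
    def mode(self) -> float:
        # For a Gaussian the mode is the mean; an argmax over the raw
        # network outputs would not be meaningful here.
        return self.loc


def deterministic_action(dist):
    # The caller does not need to know which distribution type it received.
    return dist.mode


print(deterministic_action(TinyCategorical([0.1, 0.7, 0.2])))  # 1
print(deterministic_action(TinyNormal(loc=0.5, scale=2.0)))    # 0.5
```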
maxhuettenrauch
7c970df53f
Fix/add watch env with obs rms (#1061)
Supports deciding whether to watch the agent performing on the env using high-level interfaces
2024-02-29 15:59:11 +01:00
Dominik Jain
49781e715e
Fix high-level examples (#1060)
The high-level examples were all broken by changes made to make mypy
pass.
This PR fixes them, making a type change in logging.run_cli instead to
make mypy happy.
2024-02-23 23:17:14 +01:00
Ashok Arora
0b61bf8caf
Fix the link to the contributing guide. (#1062) 2024-02-23 23:15:41 +01:00
Carlo Cagnetta
ce371ae736
remove old python versions from poetry classifier (#1059) 2024-02-21 15:27:53 +01:00
Michael Panchenko
9b6cb6903e
Improvements in High-Level API and Poe Tasks (#1055)
* Add an option to SamplingConfig which allows configuring the number of
test episodes
* Make OptimizerFactory more flexible, adding method
`create_optimizer_for_params`
* Fix AutoAlphaFactoryDefault using hard-coded Adam optimizer
* Fix mypy issues that were platform/installation-dependent
* Limit scope of nbqa, resolving issues with files generated by old
versions of the build

Fixes #1054
2024-02-15 12:02:16 +01:00
Dominik Jain
26e210a6ae Apply nbqa only to the docs/ folder and exclude the (old) jupyter_execute folder 2024-02-15 11:39:45 +01:00
Dominik Jain
08728ad35e Resolve platform-specific/installation-specific mypy issues
by adding ignores and ignoring unused ignores locally
2024-02-15 11:26:54 +01:00
Dominik Jain
f2e0fd165d Fix gitignore applying to tianshou/env on platforms with case-insensitive file system 2024-02-15 11:26:39 +01:00
Dominik Jain
eeb2081ca6 Fix AutoAlphaFactoryDefault using hard-coded Adam optimizer instead of passed factory 2024-02-14 20:43:38 +01:00
Dominik Jain
76cbd7efc2 Make OptimizerFactory more flexible by adding a second method which
allows the creation of an optimizer given arbitrary parameters
(rather than a module)
2024-02-14 20:42:06 +01:00
Dominik Jain
bf391853dc Allow configuring the number of test episodes in high-level API 2024-02-14 19:14:28 +01:00
Michael Panchenko
8742e3645c
Docs, js - typo in path 2024-02-14 10:50:06 +01:00
Michael Panchenko
5cc51145da
Docs/hotfix (#1052) 2024-02-12 18:54:38 +01:00
Michael Panchenko
7a30b842b6
Add vega scripts explicitly to config (#1051) 2024-02-12 18:49:32 +01:00
Michael Panchenko
d3fe87b70d
Docs: added symlinks for paths resolution, removed jquery loading (#1050) 2024-02-12 17:38:25 +01:00
Michael Panchenko
e3c610d37c
Docs: Added jquery, better handling of js files through sphinx config… (#1049)
Closes #1005 #1045
2024-02-12 15:43:32 +01:00
Michael Panchenko
33d241a29b
Docs/html doc issues (#1048)
Closes #1005 

## Main changes

2. Load vega-embed things using jupyter-book config 
3. Add vega-embed dependencies as part of local code for offline
development
4. Reduced duplication in benchmark.js
5. Update sphinx, docutils, and jupyter-book

Co-authored-by: carlocagnetta <c.cagnetta@appliedai.de>
2024-02-09 19:43:10 +01:00
Carlo Cagnetta
5fc314bd4b
Docs/use nbqa on notebooks (#1041)
- Added nbqa to pyproject.toml
- Resolved mypy issues on notebooks and related files
- Conducting ruff checks on notebooks
- Add DataclassPPrintMixin for better stats representation
- Improved Notebooks wording and explanations

Resolve: #1004
Related to #974
2024-02-07 17:28:16 +01:00
maxhuettenrauch
5fe9aea798
Update and fix dependencies related to mac install (#1044)
Addresses part of #1015 

### Dependencies

- move jsonargparse and docstring-parser to dependencies to run hl
examples without dev
- create mujoco-py extra for legacy mujoco envs
- updated atari extra
    - removed atari-py and gym dependencies
    - added ALE-py, autorom, and shimmy
- created robotics extra for HER-DDPG

### Mac specific

- only install envpool when not on mac
- mujoco-py not working on macOS newer than Monterey
(https://github.com/openai/mujoco-py/issues/777)
- D4RL also fails due to dependency on mujoco-py
(https://github.com/Farama-Foundation/D4RL/issues/232)

### Other

- reduced training-num/test-num in example files to a number ≤ 20
(examples with 100 led to too many open files)
- rendering for Mujoco envs needs to be fixed on gymnasium side
(https://github.com/Farama-Foundation/Gymnasium/issues/749)

---------

Co-authored-by: Maximilian Huettenrauch <m.huettenrauch@appliedai.de>
Co-authored-by: Michael Panchenko <35432522+MischaPanch@users.noreply.github.com>
2024-02-06 17:06:38 +01:00
Daniel Plop
eb0215cf76
Refactoring/mypy issues test (#1017)
Improves typing in examples and tests, towards mypy passing there.

Introduces the SpaceInfo utility
2024-02-06 14:24:30 +01:00
Michael Panchenko
4756ee80ff
Fixed links and added poetry install info in README [skip ci] 2024-01-24 18:07:02 +01:00