200 Commits

Author SHA1 Message Date
Dominik Jain
74737416ff Fix typo 2024-04-29 18:27:02 +02:00
maxhuettenrauch
ade85ab32b
Feature/algo eval (#1074)
# Changes

## Dependencies

- New extra "eval"

## Api Extension
- `Experiment` and `ExperimentConfig` now have a `name`, which can
be overridden when `Experiment.run()` is called
- When building an `Experiment` from an `ExperimentConfig`, the user has
the option to add info about seeds to the name.
- New method in `ExperimentConfig` called
`build_default_seeded_experiments`
- `SamplingConfig` has an explicit training seed; `test_seed` is
inferred.
- New `evaluation` package for repeating the same experiment with
multiple seeds and aggregating the results (important extension!).
Currently in alpha state.
- Loggers can now restore the logged data into python by using the new
`restore_logged_data`
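
The multi-seed evaluation idea described above can be pictured with a minimal, standard-library-only sketch. It does not use Tianshou's `evaluation` package or `build_default_seeded_experiments`; the names `seeded_name`, `run_toy_experiment` and `evaluate_over_seeds` are hypothetical, and the "experiment" is a random number standing in for a training run.

```python
import random
import statistics
from typing import Dict, List


def seeded_name(name: str, seed: int) -> str:
    """Hypothetical helper: append seed information to an experiment name."""
    return f"{name}_seed={seed}"


def run_toy_experiment(seed: int) -> float:
    """Stand-in for one training run; returns a pseudo 'final return'."""
    rng = random.Random(seed)  # a private RNG keeps each run reproducible
    return 100.0 + rng.gauss(0.0, 5.0)


def evaluate_over_seeds(seeds: List[int]) -> Dict[str, float]:
    """Repeat the same toy experiment per seed and aggregate the results."""
    returns = [run_toy_experiment(seed) for seed in seeds]
    return {
        "mean": statistics.mean(returns),
        "stdev": statistics.stdev(returns),  # sample stdev, needs >= 2 runs
        "num_runs": float(len(returns)),
    }


if __name__ == "__main__":
    for seed in (1, 2, 3):
        print(seeded_name("toy_experiment", seed))
    print(evaluate_over_seeds([1, 2, 3]))
```

Because every run owns its random generator, repeating the evaluation with the same seeds yields the same aggregate, which is the property that makes results across seeds comparable.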

## Breaking Changes
- `AtariEnvFactory` (in examples) now receives explicit train and test
seeds
- `EnvFactoryRegistered` now requires an explicit `test_seed`
- `BaseLogger.prepare_dict_for_logging` is now abstract

---------

Co-authored-by: Maximilian Huettenrauch <m.huettenrauch@appliedai.de>
Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>
Co-authored-by: Michael Panchenko <35432522+MischaPanch@users.noreply.github.com>
2024-04-20 23:25:33 +00:00
Daniel Plop
6935a111d9
Add non in-place version of Batch.to_torch (#1117)
Closes: https://github.com/aai-institute/tianshou/issues/1116

### API Extensions

- Batch received new method: `to_torch_`. #1117

### Breaking Changes

- The method `to_torch` in `data.utils.batch.Batch` is not in-place
anymore. Instead, a new method `to_torch_` does the conversion in-place.
#1117
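
The trailing-underscore convention described above (the plain method returns a converted copy, the underscore variant mutates in place) can be sketched without Tianshou or torch. `ToyBatch`, `to_float` and `to_float_` are hypothetical stand-ins, not the real `Batch` API; in particular, what the real `to_torch_` returns is not stated here.

```python
import copy
from typing import Any, Dict


class ToyBatch:
    """Hypothetical container, used only to illustrate the naming convention."""

    def __init__(self, **fields: Any) -> None:
        self.fields: Dict[str, Any] = dict(fields)

    def to_float_(self) -> None:
        """In-place variant: converts the stored values of this object."""
        for key, values in self.fields.items():
            self.fields[key] = [float(v) for v in values]

    def to_float(self) -> "ToyBatch":
        """Non-in-place variant: converts a deep copy and returns it."""
        result = copy.deepcopy(self)
        result.to_float_()
        return result


if __name__ == "__main__":
    original = ToyBatch(obs=[1, 2, 3])
    converted = original.to_float()
    print(original.fields, converted.fields)  # original is left untouched
    original.to_float_()
    print(original.fields)  # now converted as well
```

Code that relied on the old in-place behaviour of the plain method would, under this convention, either switch to the underscore variant or use the returned object.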
2024-04-17 22:07:24 +02:00
Daniel Plop
ca4f74f40e
Allow two (same/different) Batch objs to be tested for equality (#1098)
Closes: https://github.com/thu-ml/tianshou/issues/1086

### Api Extensions

- Batch received new method: `to_numpy_`. #1098
- `to_dict` in Batch also supports non-recursive conversion. #1098
- Batch `__eq__` is now implemented, so semantic equality checks of
batches are now possible. #1098

### Breaking Changes

- The method `to_numpy` in `data.utils.batch.Batch` is not in-place
anymore. Instead, a new method `to_numpy_` does the conversion in-place.
#1098
2024-04-16 18:12:48 +02:00
Daniel Plop
8a0629ded6
Fix mypy issues in tests and examples (#1077)
Closes #952 

- `SamplingConfig` supports `batch_size=None`. #1077
- tests and examples are covered by `mypy`. #1077
- `NetBase` is used more widely, with stricter typing achieved by making it generic. #1077
- `utils.net.common.Recurrent` now receives and returns a
`RecurrentStateBatch` instead of a dict. #1077

---------

Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>
2024-04-03 18:07:51 +02:00
Erni
bf0d632108
Naming and typing improvements in Actor/Critic/Policy forwards (#1032)
Closes #917 

### Internal Improvements
- Better variable names related to model outputs (logits, dist input
etc.). #1032
- Improved typing for actors and critics, using Tianshou classes like
`Actor`, `ActorProb`, etc.,
instead of just `nn.Module`. #1032
- Added interfaces for most `Actor` and `Critic` classes to enforce the
presence of `forward` methods. #1032
- Simplified `PGPolicy` forward by unifying the `dist_fn` interface (see
associated breaking change). #1032
- Use `.mode` of distribution instead of relying on knowledge of the
distribution type. #1032

### Breaking Changes

- Changed interface of `dist_fn` in `PGPolicy` and all subclasses to
take a single argument in both
continuous and discrete cases. #1032
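
A rough illustration of a single-argument `dist_fn` combined with `.mode`, using only the standard library. This is not Tianshou code: `ToyCategorical`, `continuous_dist_fn`, `discrete_dist_fn` and `greedy_action` are hypothetical names, and `statistics.NormalDist` stands in for a torch distribution.

```python
from dataclasses import dataclass
from statistics import NormalDist
from typing import Any, Callable, Sequence, Tuple


@dataclass
class ToyCategorical:
    """Toy discrete distribution over the indices 0..n-1 (illustrative only)."""

    probs: Sequence[float]

    @property
    def mode(self) -> int:
        # Index of the most probable outcome
        return max(range(len(self.probs)), key=lambda i: self.probs[i])


def continuous_dist_fn(loc_scale: Tuple[float, float]) -> NormalDist:
    """Continuous case: one argument that bundles location and scale."""
    loc, scale = loc_scale
    return NormalDist(mu=loc, sigma=scale)


def discrete_dist_fn(probs: Sequence[float]) -> ToyCategorical:
    """Discrete case: one argument holding the class probabilities."""
    return ToyCategorical(probs)


def greedy_action(dist_fn: Callable[[Any], Any], model_output: Any) -> Any:
    # The call shape is identical in both cases, and `.mode` avoids
    # branching on the concrete distribution type.
    return dist_fn(model_output).mode


if __name__ == "__main__":
    print(greedy_action(continuous_dist_fn, (0.5, 1.0)))
    print(greedy_action(discrete_dist_fn, [0.1, 0.7, 0.2]))
```

With one calling convention, the caller no longer needs to know whether the model output is a single value or a tuple of parameters.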

---------

Co-authored-by: Arnau Jimenez <arnau.jimenez@zeiss.com>
Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>
2024-04-01 17:14:17 +02:00
Michael Panchenko
5bf923c9bd Removed more references to Chinese docs [skip ci] 2024-03-28 18:17:25 +01:00
Michael Panchenko
23a33a9aa3 Removed link to Chinese docs [skip ci] 2024-03-28 18:13:15 +01:00
bordeauxred
4f65b131aa
Feat/refactor collector (#1063)
Closes: #1058 

### Api Extensions
- Batch received two new methods: `to_dict` and `to_list_of_dicts`.
#1063
- `Collector`s can now be closed, and their reset is more granular.
#1063
- Trainers can control whether collectors should be reset prior to
training. #1063
- Convenience constructor for `CollectStats` called
`with_autogenerated_stats`. #1063

### Internal Improvements
- `Collector`s rely less on state, the few stateful things are stored
explicitly instead of through a `.data` attribute. #1063
- Introduced a first iteration of a naming convention for vars in
`Collector`s. #1063
- Generally improved readability of Collector code and associated tests
(still quite some way to go). #1063
- Improved typing for `exploration_noise` and within Collector. #1063

### Breaking Changes

- Removed `.data` attribute from `Collector` and its child classes.
#1063
- Collectors no longer reset the environment on initialization. Instead,
the user might have to call `reset`
explicitly or pass `reset_before_collect=True`. #1063
- VectorEnvs now return an array of info-dicts on reset instead of a
list. #1063
- Fixed `iter(Batch(...))`, which now behaves the same way as
`Batch(...).__iter__()`. Can be considered a bugfix. #1063

---------

Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>
2024-03-28 18:02:31 +01:00
maxhuettenrauch
e82379c47f
Allow explicit setting of multiprocessing context for SubprocEnvWorker (#1072)
Running multiple training runs in parallel (with, for example, joblib)
fails on macOS due to a change in the standard context for
multiprocessing (see
[here](https://stackoverflow.com/questions/65098398/why-using-fork-works-but-using-spawn-fails-in-python3-8-multiprocessing)
or
[here](https://www.reddit.com/r/learnpython/comments/g5372v/multiprocessing_with_fork_on_macos/)).
This PR adds the ability to explicitly set a multiprocessing context for
the SubProcEnvWorker (similar to gymnasium's
[AsyncVecEnv](https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/vector/async_vector_env.py)).
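
For reference, a minimal sketch of selecting a multiprocessing context explicitly with the standard library. It does not show how the context is passed to the `SubprocEnvWorker`; the helper `resolve_context` is a hypothetical name.

```python
import multiprocessing
from multiprocessing.context import BaseContext
from typing import Optional


def resolve_context(start_method: Optional[str] = None) -> BaseContext:
    """Return a multiprocessing context; None selects the platform default."""
    return multiprocessing.get_context(start_method)


if __name__ == "__main__":
    # Start methods supported on the current platform
    print(multiprocessing.get_all_start_methods())
    # An explicitly chosen context is independent of the global default
    ctx = resolve_context("spawn")
    print(ctx.get_start_method())
```

Objects such as `ctx.Process` or `ctx.Queue` created from an explicit context use that start method without changing the process-wide default, which is why passing a context around tends to be safer than calling `multiprocessing.set_start_method` in library code.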
---------

Co-authored-by: Maximilian Huettenrauch <m.huettenrauch@appliedai.de>
Co-authored-by: Michael Panchenko <35432522+MischaPanch@users.noreply.github.com>
2024-03-14 11:07:56 +01:00
Dominik Jain
1714c7f2c7
High-level API: Fix number of test episodes being incorrectly scaled by number of envs (#1071) 2024-03-07 08:57:11 -08:00
Michael Panchenko
6746a80f6d
Add publish workflow, first preparation for next release (#1067) 2024-03-04 12:21:49 +01:00
Michael Panchenko
8742e3645c
Docs, js - typo in path 2024-02-14 10:50:06 +01:00
Michael Panchenko
5cc51145da
Docs/hotfix (#1052) 2024-02-12 18:54:38 +01:00
Michael Panchenko
7a30b842b6
Add vega scripts explicitly to config (#1051) 2024-02-12 18:49:32 +01:00
Michael Panchenko
d3fe87b70d
Docs: added symlinks for paths resolution, removed jquery loading (#1050) 2024-02-12 17:38:25 +01:00
Michael Panchenko
e3c610d37c
Docs: Added jquery, better handling of js files through sphinx config… (#1049)
Closes #1005 #1045
2024-02-12 15:43:32 +01:00
Michael Panchenko
33d241a29b
Docs/html doc issues (#1048)
Closes #1005 

## Main changes

2. Load vega-embed things using jupyter-book config 
3. Add vega-embed dependencies as part of local code for offline
development
4. Reduced duplication in benchmark.js
5. Update sphinx, docutils, and jupyter-book

Co-authored-by: carlocagnetta <c.cagnetta@appliedai.de>
2024-02-09 19:43:10 +01:00
Carlo Cagnetta
5fc314bd4b
Docs/use nbqa on notebooks (#1041)
- Added nbqa to pyproject.toml
- Resolved mypy issues on notebooks and related files
- Conducting ruff checks on notebooks
- Add DataclassPPrintMixin for better stats representation
- Improved Notebooks wording and explanations

Resolve: #1004
Related to #974
2024-02-07 17:28:16 +01:00
Dominik Jain
39f3ba2266 Add screen recording of high-level example 2024-01-16 13:43:14 +01:00
maxhuettenrauch
522f7fbf98
Feature/dataclasses (#996)
This PR adds strict typing to the output of `update` and `learn` in all
policies. This will likely be the last large refactoring PR before the
next release (0.6.0, not 1.0.0), so it requires some attention. Several
difficulties were encountered on the path to that goal:

1. The policy hierarchy is actually "broken" in the sense that the keys
of dicts that were output by `learn` did not follow the same enhancement
(inheritance) pattern as the policies. This is a real problem and should
be addressed in the near future. Generally, several aspects of the
policy design and hierarchy might deserve a dedicated discussion.
2. Each policy needs to be generic in the stats return type, because one
might want to extend it at some point and then also extend the stats.
Even within the source code base this pattern is necessary in many
places.
3. The interaction between learn and update is a bit quirky. We
currently handle it by having update modify a special field inside
TrainingStats, whereas all other fields are handled by learn.
4. The ICM module is a policy wrapper and required a
TrainingStatsWrapper. The latter relies on a bunch of black magic.

They were addressed by:
1. Live with the broken hierarchy, which is now made visible by bounds
in generics. We use type: ignore where appropriate.
2. Make all policies generic with bounds following the policy
inheritance hierarchy (which is incorrect, see above). We experimented a
bit with nested TrainingStats classes, but that seemed to add more
complexity and be harder to understand. Unfortunately, mypy considers
the code below to be wrong, so we have to add `type: ignore` to the
return of each `learn`:

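```python
from typing import TypeVar

T = TypeVar("T", bound=int)


def f() -> T:
    # mypy flags this return: 3 is an int, but T may stand for any subtype of int
    return 3
```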

3. See above
4. Write representative tests for the `TrainingStatsWrapper`. Still, the
black magic might cause nasty surprises down the line (I am not proud of
it)...
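
The generic-with-bounds pattern from points 1 and 2 can be sketched with toy classes. These are not Tianshou's policies or stats classes; `ToyStats`, `ToyExtendedStats`, `ToyPolicy` and `ToyExtendedPolicy` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar


@dataclass
class ToyStats:
    loss: float = 0.0


@dataclass
class ToyExtendedStats(ToyStats):
    entropy: float = 0.0


# The bound mirrors the inheritance hierarchy of the stats classes
TStats = TypeVar("TStats", bound=ToyStats)


class ToyPolicy(Generic[TStats]):
    def learn(self) -> TStats:
        # A type checker cannot prove that ToyStats matches whichever
        # TStats a subclass selects, hence the ignore on the return.
        return ToyStats(loss=1.0)  # type: ignore[return-value]


class ToyExtendedPolicy(ToyPolicy[ToyExtendedStats]):
    def learn(self) -> ToyExtendedStats:
        # Here the type parameter is fixed, so no ignore is needed
        return ToyExtendedStats(loss=1.0, entropy=0.5)


if __name__ == "__main__":
    print(ToyPolicy().learn())
    print(ToyExtendedPolicy().learn())
```

The ignore is only needed where the return type is still the open type variable; once a subclass fixes the parameter, the checker can verify the return value directly.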

Closes #933

---------

Co-authored-by: Maximilian Huettenrauch <m.huettenrauch@appliedai.de>
Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>
2023-12-30 11:09:03 +01:00
Michael Panchenko
5d09645a2c
High-level API improvements (#1014)
- [X] I have added the correct label(s) to this Pull Request or linked
the relevant issue(s)
- [X] I have provided a description of the changes in this Pull Request
- [X] I have added documentation for my changes
- [ ] If applicable, I have added tests to cover my changes.
- [X] I have reformatted the code using `poe format` 
- [X] I have checked style and types with `poe lint` and `poe
type-check`
- [ ] (Optional) I ran tests locally with `poe test`
(or a subset of them with `poe test-reduced`), and they pass
- [X] (Optional) I have tested that documentation builds correctly with
`poe doc-build`

Changes in this PR (see individual commits):
* Fix: SamplingConfig.start_timesteps_random was not used
* Environments: Support use of different test environment factory in
convenience constructors `from_factory*`
* SamplingConfig: Improve/extend docstrings, clearly explaining the
parameters
* SamplingConfig: Change default of repeat_per_collect to 1
* Improve logging
* Fix doc-build on Windows
2023-12-21 10:04:14 -06:00
Dominik Jain
da333d8a85 Fix incorrect use of platform-specific path separator 2023-12-21 13:13:51 +01:00
Carlo Cagnetta
b7df31f2a7
Docs/fix trainer fct notebooks (#1009)
This PR resolves #1008
2023-12-14 19:31:53 +01:00
Michael Panchenko
4c24dc6441 Formatting 2023-12-05 23:46:54 +01:00
Michael Panchenko
5f4a02cc69 Docs: improve API landing page 2023-12-05 23:28:29 +01:00
Michael Panchenko
9d1440752e Deal with .jupyter_cache 2023-12-05 22:52:45 +01:00
Michael Panchenko
c50e74f263 Fix rtd build, improvements in task running 2023-12-05 22:42:55 +01:00
Michael Panchenko
0b67447541 Docs: fixing spelling, re-adding spellcheck to pipeline 2023-12-05 13:22:04 +01:00
Michael Panchenko
2e39a252e3 Docstring: minor changes to let ruff pass 2023-12-04 13:52:46 +01:00
Michael Panchenko
28fda00b27 Docs: added links to source code, readded some ruff ignore rules 2023-12-04 13:52:46 +01:00
Michael Panchenko
b12983622b Docs: added sorting order for autogenerated toc 2023-12-04 13:52:46 +01:00
Michael Panchenko
5af29475e8 Docs: removed capitalization 2023-12-04 11:48:10 +01:00
Michael Panchenko
a5685619ce Docs: generate all api docs automatically
Reinstate the -W option
Several overall improvements in docs
Fixed multiple links
2023-12-04 11:48:09 +01:00
Michael Panchenko
006577da08 WIP - restructure doc files 2023-12-04 11:48:09 +01:00
Michael Panchenko
d4b6d9b250 WIP - restructure doc files 2023-12-04 11:47:40 +01:00
carlocagnetta
1515ff9cef Compressed .png and .jpg images 2023-12-04 11:47:40 +01:00
carlocagnetta
fa55217118 Remove get_started.rst page with links to outdated notebooks 2023-12-04 11:47:09 +01:00
carlocagnetta
a12b157ee8 Add launch button for notebooks in colab 2023-12-04 11:47:09 +01:00
carlocagnetta
f5041f4f76 Replaced .png images with .svg where possible 2023-12-04 11:47:09 +01:00
carlocagnetta
a8bceff01e Moved all docs images in docs/_static 2023-12-04 11:47:08 +01:00
carlocagnetta
6fa536fd46 Update Documentation building 2023-12-04 11:47:08 +01:00
carlocagnetta
6f739ccfe6 update docs/.gitignore 2023-12-04 11:46:34 +01:00
carlocagnetta
42d9599f2b Fix docs/requirements.txt 2023-12-04 11:46:18 +01:00
carlocagnetta
9ab5d350c2 Fix docs/requirements.txt 2023-12-04 11:46:18 +01:00
carlocagnetta
06d2703dfc Fix docs/requirements.txt 2023-12-04 11:46:17 +01:00
carlocagnetta
4693b0bfc6 Remove autogenerated docs/api/highllevel 2023-12-04 11:46:16 +01:00
carlocagnetta
396f20b9bb Fix docs/requirements.txt 2023-12-04 11:46:16 +01:00
carlocagnetta
6509a20b4b Add autogenerated api to gitignore 2023-12-04 11:46:16 +01:00
carlocagnetta
573d53dc44 Fix docs/requirements.txt 2023-12-04 11:45:54 +01:00