Tianshou

Author	SHA1	Message	Date
Dominik Jain	74737416ff	Fix typo	2024-04-29 18:27:02 +02:00
maxhuettenrauch	ade85ab32b	Feature/algo eval (#1074) # Changes ## Dependencies - New extra "eval" ## Api Extension - `Experiment` and `ExperimentConfig` now have a `name`, which can, however, be overridden when `Experiment.run()` is called - When building an `Experiment` from an `ExperimentConfig`, the user has the option to add info about seeds to the name. - New method in `ExperimentConfig` called `build_default_seeded_experiments` - `SamplingConfig` has an explicit training seed, `test_seed` is inferred. - New `evaluation` package for repeating the same experiment with multiple seeds and aggregating the results (important extension!). Currently in alpha state. - Loggers can now restore the logged data into python by using the new `restore_logged_data` ## Breaking Changes - `AtariEnvFactory` (in examples) now receives explicit train and test seeds - `EnvFactoryRegistered` now requires an explicit `test_seed` - `BaseLogger.prepare_dict_for_logging` is now abstract --------- Co-authored-by: Maximilian Huettenrauch <m.huettenrauch@appliedai.de> Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de> Co-authored-by: Michael Panchenko <35432522+MischaPanch@users.noreply.github.com>	2024-04-20 23:25:33 +00:00
Daniel Plop	6935a111d9	Add non in-place version of `Batch.to_torch` (#1117) Closes: https://github.com/aai-institute/tianshou/issues/1116 ### API Extensions - Batch received a new method: `to_torch_`. #1117 ### Breaking Changes - The method `to_torch` in `data.utils.batch.Batch` is not in-place anymore. Instead, a new method `to_torch_` does the conversion in-place. #1117	2024-04-17 22:07:24 +02:00
Daniel Plop	ca4f74f40e	Allow two (same/different) Batch objs to be tested for equality (#1098) Closes: https://github.com/thu-ml/tianshou/issues/1086 ### Api Extensions - Batch received a new method: `to_numpy_`. #1098 - `to_dict` in Batch also supports non-recursive conversion. #1098 - Batch `__eq__` is now implemented; semantic equality checks of batches are now possible. #1098 ### Breaking Changes - The method `to_numpy` in `data.utils.batch.Batch` is not in-place anymore. Instead, a new method `to_numpy_` does the conversion in-place. #1098	2024-04-16 18:12:48 +02:00
Daniel Plop	8a0629ded6	Fix mypy issues in tests and examples (#1077) Closes #952 - `SamplingConfig` supports `batch_size=None`. #1077 - tests and examples are covered by `mypy`. #1077 - `NetBase` is used more widely, with stricter typing by making it generic. #1077 - `utils.net.common.Recurrent` now receives and returns a `RecurrentStateBatch` instead of a dict. #1077 --------- Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>	2024-04-03 18:07:51 +02:00
Erni	bf0d632108	Naming and typing improvements in Actor/Critic/Policy forwards (#1032) Closes #917 ### Internal Improvements - Better variable names related to model outputs (logits, dist input etc.). #1032 - Improved typing for actors and critics, using Tianshou classes like `Actor`, `ActorProb`, etc., instead of just `nn.Module`. #1032 - Added interfaces for most `Actor` and `Critic` classes to enforce the presence of `forward` methods. #1032 - Simplified `PGPolicy` forward by unifying the `dist_fn` interface (see associated breaking change). #1032 - Use `.mode` of distribution instead of relying on knowledge of the distribution type. #1032 ### Breaking Changes - Changed interface of `dist_fn` in `PGPolicy` and all subclasses to take a single argument in both continuous and discrete cases. #1032 --------- Co-authored-by: Arnau Jimenez <arnau.jimenez@zeiss.com> Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>	2024-04-01 17:14:17 +02:00
Michael Panchenko	5bf923c9bd	Removed more references to Chinese docs [skip ci]	2024-03-28 18:17:25 +01:00
Michael Panchenko	23a33a9aa3	Removed link to Chinese docs [skip ci]	2024-03-28 18:13:15 +01:00
bordeauxred	4f65b131aa	Feat/refactor collector (#1063) Closes: #1058 ### Api Extensions - Batch received two new methods: `to_dict` and `to_list_of_dicts`. #1063 - `Collector`s can now be closed, and their reset is more granular. #1063 - Trainers can control whether collectors should be reset prior to training. #1063 - Convenience constructor for `CollectStats` called `with_autogenerated_stats`. #1063 ### Internal Improvements - `Collector`s rely less on state, the few stateful things are stored explicitly instead of through a `.data` attribute. #1063 - Introduced a first iteration of a naming convention for vars in `Collector`s. #1063 - Generally improved readability of Collector code and associated tests (still quite some way to go). #1063 - Improved typing for `exploration_noise` and within Collector. #1063 ### Breaking Changes - Removed `.data` attribute from `Collector` and its child classes. #1063 - Collectors no longer reset the environment on initialization. Instead, the user might have to call `reset` explicitly or pass `reset_before_collect=True`. #1063 - VectorEnvs now return an array of info-dicts on reset instead of a list. #1063 - Fixed `iter(Batch(...))` which now behaves the same way as `Batch(...).__iter__()`. Can be considered a bugfix. #1063 --------- Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>	2024-03-28 18:02:31 +01:00
maxhuettenrauch	e82379c47f	Allow explicit setting of multiprocessing context for SubprocEnvWorker (#1072) Running multiple training runs in parallel (with, for example, joblib) fails on macOS due to a change in the standard context for multiprocessing (see [here](https://stackoverflow.com/questions/65098398/why-using-fork-works-but-using-spawn-fails-in-python3-8-multiprocessing) or [here](https://www.reddit.com/r/learnpython/comments/g5372v/multiprocessing_with_fork_on_macos/)). This PR adds the ability to explicitly set a multiprocessing context for the SubprocEnvWorker (similar to gymnasium's [AsyncVectorEnv](https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/vector/async_vector_env.py)). --------- Co-authored-by: Maximilian Huettenrauch <m.huettenrauch@appliedai.de> Co-authored-by: Michael Panchenko <35432522+MischaPanch@users.noreply.github.com>	2024-03-14 11:07:56 +01:00
Dominik Jain	1714c7f2c7	High-level API: Fix number of test episodes being incorrectly scaled by number of envs (#1071)	2024-03-07 08:57:11 -08:00
Michael Panchenko	6746a80f6d	Add publish workflow, first preparation for next release (#1067)	2024-03-04 12:21:49 +01:00
Michael Panchenko	8742e3645c	Docs, js - typo in path	2024-02-14 10:50:06 +01:00
Michael Panchenko	5cc51145da	Docs/hotfix (#1052)	2024-02-12 18:54:38 +01:00
Michael Panchenko	7a30b842b6	Add vega scripts explicitly to config (#1051)	2024-02-12 18:49:32 +01:00
Michael Panchenko	d3fe87b70d	Docs: added symlinks for paths resolution, removed jquery loading (#1050)	2024-02-12 17:38:25 +01:00
Michael Panchenko	e3c610d37c	Docs: Added jquery, better handling of js files through sphinx config… (#1049) Closes #1005 #1045	2024-02-12 15:43:32 +01:00
Michael Panchenko	33d241a29b	Docs/html doc issues (#1048) Closes #1005 ## Main changes 2. Load vega-embed things using jupyter-book config 3. Add vega-embed dependencies as part of local code for offline development 4. Reduced duplication in benchmark.js 5. Update sphinx, docutils, and jupyter-book Co-authored-by: carlocagnetta <c.cagnetta@appliedai.de>	2024-02-09 19:43:10 +01:00
Carlo Cagnetta	5fc314bd4b	Docs/use nbqa on notebooks (#1041) - Added nbqa to pyproject.toml - Resolved mypy issues on notebooks and related files - Conducting ruff checks on notebooks - Add DataclassPPrintMixin for better stats representation - Improved Notebooks wording and explanations Resolve: #1004 Related to #974	2024-02-07 17:28:16 +01:00
Dominik Jain	39f3ba2266	Add screen recording of high-level example	2024-01-16 13:43:14 +01:00
maxhuettenrauch	522f7fbf98	Feature/dataclasses (#996) This PR adds strict typing to the output of `update` and `learn` in all policies. This will likely be the last large refactoring PR before the next release (0.6.0, not 1.0.0), so it requires some attention. Several difficulties were encountered on the path to that goal: 1. The policy hierarchy is actually "broken" in the sense that the keys of dicts that were output by `learn` did not follow the same enhancement (inheritance) pattern as the policies. This is a real problem and should be addressed in the near future. Generally, several aspects of the policy design and hierarchy might deserve a dedicated discussion. 2. Each policy needs to be generic in the stats return type, because one might want to extend it at some point and then also extend the stats. Even within the source code base this pattern is necessary in many places. 3. The interaction between learn and update is a bit quirky, we currently handle it by having update modify a special field inside TrainingStats, whereas all other fields are handled by learn. 4. The IQM module is a policy wrapper and required a TrainingStatsWrapper. The latter relies on a bunch of black magic. They were addressed by: 1. Live with the broken hierarchy, which is now made visible by bounds in generics. We use type: ignore where appropriate. 2. Make all policies generic with bounds following the policy inheritance hierarchy (which is incorrect, see above). We experimented a bit with nested TrainingStats classes, but that seemed to add more complexity and be harder to understand. Unfortunately, mypy thinks that the code below is wrong, wherefore we have to add `type: ignore` to the return of each `learn`
```python
from typing import TypeVar

T = TypeVar("T", bound=int)


def f() -> T:
    return 3  # mypy rejects this: 3 is an int, not the caller-chosen T
```
3. See above 4. Write representative tests for the `TrainingStatsWrapper`. Still, the black magic might cause nasty surprises down the line (I am not proud of it)...
Closes #933 --------- Co-authored-by: Maximilian Huettenrauch <m.huettenrauch@appliedai.de> Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>	2023-12-30 11:09:03 +01:00
Michael Panchenko	5d09645a2c	High-level API improvements (#1014) - [X] I have added the correct label(s) to this Pull Request or linked the relevant issue(s) - [X] I have provided a description of the changes in this Pull Request - [X] I have added documentation for my changes - [ ] If applicable, I have added tests to cover my changes. - [X] I have reformatted the code using `poe format` - [X] I have checked style and types with `poe lint` and `poe type-check` - [ ] (Optional) I ran tests locally with `poe test` (or a subset of them with `poe test-reduced`), and they pass - [X] (Optional) I have tested that documentation builds correctly with `poe doc-build` Changes in this PR (see individual commits): * Fix: SamplingConfig.start_timesteps_random was not used * Environments: Support use of different test environment factory in convenience constructors `from_factory` * SamplingConfig: Improve/extend docstrings, clearly explaining the parameters * SamplingConfig: Change default of repeat_per_collect to 1 * Improve logging * Fix doc-build on Windows	2023-12-21 10:04:14 -06:00
Dominik Jain	da333d8a85	Fix incorrect use of platform-specific path separator	2023-12-21 13:13:51 +01:00
Carlo Cagnetta	b7df31f2a7	Docs/fix trainer fct notebooks (#1009) This PR resolves #1008	2023-12-14 19:31:53 +01:00
Michael Panchenko	4c24dc6441	Formatting	2023-12-05 23:46:54 +01:00
Michael Panchenko	5f4a02cc69	Docs: improve API landing page	2023-12-05 23:28:29 +01:00
Michael Panchenko	9d1440752e	Deal with .jupyter_cache	2023-12-05 22:52:45 +01:00
Michael Panchenko	c50e74f263	Fix rtd build, improvements in task running	2023-12-05 22:42:55 +01:00
Michael Panchenko	0b67447541	Docs: fixing spelling, re-adding spellcheck to pipeline	2023-12-05 13:22:04 +01:00
Michael Panchenko	2e39a252e3	Docstring: minor changes to let ruff pass	2023-12-04 13:52:46 +01:00
Michael Panchenko	28fda00b27	Docs: added links to source code, readded some ruff ignore rules	2023-12-04 13:52:46 +01:00
Michael Panchenko	b12983622b	Docs: added sorting order for autogenerated toc	2023-12-04 13:52:46 +01:00
Michael Panchenko	5af29475e8	Docs: removed capitalization	2023-12-04 11:48:10 +01:00
Michael Panchenko	a5685619ce	Docs: generate all api docs automatically Reinstate the -W option Several overall improvements in docs Fixed multiple links	2023-12-04 11:48:09 +01:00
Michael Panchenko	006577da08	WIP - restructure doc files	2023-12-04 11:48:09 +01:00
Michael Panchenko	d4b6d9b250	WIP - restructure doc files	2023-12-04 11:47:40 +01:00
carlocagnetta	1515ff9cef	Compressed .png and .jpg images	2023-12-04 11:47:40 +01:00
carlocagnetta	fa55217118	Remove get_started.rst page with links to outdated notebooks	2023-12-04 11:47:09 +01:00
carlocagnetta	a12b157ee8	Add launch button for notebooks in colab	2023-12-04 11:47:09 +01:00
carlocagnetta	f5041f4f76	Replaced .png images with .svg where possible	2023-12-04 11:47:09 +01:00
carlocagnetta	a8bceff01e	Moved all docs images in docs/_static	2023-12-04 11:47:08 +01:00
carlocagnetta	6fa536fd46	Update Documentation building	2023-12-04 11:47:08 +01:00
carlocagnetta	6f739ccfe6	update docs/.gitignore	2023-12-04 11:46:34 +01:00
carlocagnetta	42d9599f2b	Fix docs/requirements.txt	2023-12-04 11:46:18 +01:00
carlocagnetta	9ab5d350c2	Fix docs/requirements.txt	2023-12-04 11:46:18 +01:00
carlocagnetta	06d2703dfc	Fix docs/requirements.txt	2023-12-04 11:46:17 +01:00
carlocagnetta	4693b0bfc6	Remove autogenerated docs/api/highllevel	2023-12-04 11:46:16 +01:00
carlocagnetta	396f20b9bb	Fix docs/requirements.txt	2023-12-04 11:46:16 +01:00
carlocagnetta	6509a20b4b	Add autogenerated api to gitignore	2023-12-04 11:46:16 +01:00
carlocagnetta	573d53dc44	Fix docs/requirements.txt	2023-12-04 11:45:54 +01:00
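
Note on commit 522f7fbf98 (Feature/dataclasses): the message says each policy is made generic in its stats return type and that `type: ignore` is added to the return of each `learn`. The following is a minimal sketch of that pattern, not the actual Tianshou code. The class names `TrainingStats`, `LossStats` and `Policy` and all fields are hypothetical stand-ins chosen for illustration, and only the standard library is used.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar


@dataclass
class TrainingStats:
    """Hypothetical stand-in for a stats base class."""

    train_time: float = 0.0


@dataclass
class LossStats(TrainingStats):
    """Hypothetical subclass that extends the base stats with one field."""

    loss: float = 0.0


# The type variable is bounded by the base stats class (assumed naming).
TStats = TypeVar("TStats", bound=TrainingStats)


class Policy(Generic[TStats]):
    """Hypothetical policy that is generic in its stats return type."""

    def learn(self) -> TStats:
        # mypy flags this return: LossStats satisfies the bound, but the class
        # could be parametrized with a different TrainingStats subclass.
        return LossStats(loss=0.5)  # type: ignore[return-value]


stats = Policy[LossStats]().learn()
print(stats)
```

At runtime the type parameter has no effect, so the code executes normally. The `type: ignore` only silences the static check, which is the trade-off the commit message describes.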


200 Commits