Tianshou

Author	SHA1	Message	Date
Dominik Jain	d18ded333e	CriticFactoryReuseActor: Fix the case where we want to reuse an actor's preprocessing network for the critic (must be applied before concatenating the actions)	2024-04-29 18:27:02 +02:00
Dominik Jain	0b494845c9	continuous.Critic: Add flag apply_preprocess_net_to_obs_only to allow the preprocessing network to be applied to the observations only (without the actions concatenated), which is essential for the case where we want to reuse the actor's preprocessing network	2024-04-29 18:27:02 +02:00
Dominik Jain	18ed981875	Add pickle/serialisation utils: setstate and getstate	2024-04-29 18:27:02 +02:00
Dominik Jain	be1c8cd235	DQN: * Fix input validation * Fix output_dim not being set if features_only=True and output_dim_added_layer not None	2024-04-29 13:37:26 +02:00
Michael Panchenko	081adedc32	Changelog + dependabot bumps (#1124 ) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-04-25 08:49:54 -07:00
maxhuettenrauch	ade85ab32b	Feature/algo eval (#1074 ) # Changes ## Dependencies - New extra "eval" ## Api Extension - `Experiment` and `ExperimentConfig` now have a `name`, that can however be overridden when `Experiment.run()` is called - When building an `Experiment` from an `ExperimentConfig`, the user has the option to add info about seeds to the name. - New method in `ExperimentConfig` called `build_default_seeded_experiments` - `SamplingConfig` has an explicit training seed, `test_seed` is inferred. - New `evaluation` package for repeating the same experiment with multiple seeds and aggregating the results (important extension!). Currently in alpha state. - Loggers can now restore the logged data into python by using the new `restore_logged_data` ## Breaking Changes - `AtariEnvFactory` (in examples) now receives explicit train and test seeds - `EnvFactoryRegistered` now requires an explicit `test_seed` - `BaseLogger.prepare_dict_for_logging` is now abstract --------- Co-authored-by: Maximilian Huettenrauch <m.huettenrauch@appliedai.de> Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de> Co-authored-by: Michael Panchenko <35432522+MischaPanch@users.noreply.github.com>	2024-04-20 23:25:33 +00:00
maxhuettenrauch	9c0b3e7292	use explicit multiprocessing context for creating Pipe in subproc.py (#1102 )	2024-04-19 11:08:53 +02:00
maxhuettenrauch	a043711c10	Fix/deterministic action space sampling in SubprocVectorEnv (#1103 )	2024-04-18 16:16:57 +02:00
Daniel Plop	6935a111d9	Add non in-place version of `Batch.to_torch` (#1117 ) Closes: https://github.com/aai-institute/tianshou/issues/1116 ### API Extensions - Batch received new method: `to_torch_`. #1117 ### Breaking Changes - The method `to_torch` in `data.utils.batch.Batch` is not in-place anymore. Instead, a new method `to_torch_` does the conversion in-place. #1117	2024-04-17 22:07:24 +02:00
Daniel Plop	ca4f74f40e	Allow two (same/different) Batch objs to be tested for equality (#1098 ) Closes: https://github.com/thu-ml/tianshou/issues/1086 ### Api Extensions - Batch received new method: `to_numpy_`. #1098 - `to_dict` in Batch supports also non-recursive conversion. #1098 - Batch `__eq__` now implemented, semantic equality check of batches is now possible. #1098 ### Breaking Changes - The method `to_numpy` in `data.utils.batch.Batch` is not in-place anymore. Instead, a new method `to_numpy_` does the conversion in-place. #1098	2024-04-16 18:12:48 +02:00
Michael Panchenko	049907d9ab	Fix type check in atari wrapper, solves #1111	2024-04-16 10:52:48 +02:00
maxhuettenrauch	60d1ba1c8f	Fix/reset before collect in procedural examples, tests and hl experiment (#1100 ) Needed due to a breaking change in the Collector which was overlooked in some of the examples	2024-04-16 10:30:21 +02:00
Molasses	766f6fedf2	Fix imports in Readme	2024-04-15 11:32:35 +02:00
Erni	e2a2a6856d	Changed .keys() to get_keys() in batch class (#1105 ) Solves the inconsistency that iter(Batch) is not the same as Batch.keys() by "deprecating" the implicit .keys() method Closes: #922	2024-04-12 12:15:37 +02:00
Michael Panchenko	03e9af04b7	Update README.md (removed instability warning) [skip ci]	2024-04-05 12:05:20 +02:00
Michael Panchenko	bab5c634e7	Missing link in README.md [skip ci]	2024-04-05 12:04:27 +02:00
Daniel Plop	8a0629ded6	Fix mypy issues in tests and examples (#1077 ) Closes #952 - `SamplingConfig` supports `batch_size=None`. #1077 - tests and examples are covered by `mypy`. #1077 - `NetBase` is more used, stricter typing by making it generic. #1077 - `utils.net.common.Recurrent` now receives and returns a `RecurrentStateBatch` instead of a dict. #1077 --------- Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>	2024-04-03 18:07:51 +02:00
Michael Panchenko	55fa6f7f35	Don't raise error on len of empty Batch (#1084 )	2024-04-03 13:37:18 +02:00
Erni	bf0d632108	Naming and typing improvements in Actor/Critic/Policy forwards (#1032 ) Closes #917 ### Internal Improvements - Better variable names related to model outputs (logits, dist input etc.). #1032 - Improved typing for actors and critics, using Tianshou classes like `Actor`, `ActorProb`, etc., instead of just `nn.Module`. #1032 - Added interfaces for most `Actor` and `Critic` classes to enforce the presence of `forward` methods. #1032 - Simplified `PGPolicy` forward by unifying the `dist_fn` interface (see associated breaking change). #1032 - Use `.mode` of distribution instead of relying on knowledge of the distribution type. #1032 ### Breaking Changes - Changed interface of `dist_fn` in `PGPolicy` and all subclasses to take a single argument in both continuous and discrete cases. #1032 --------- Co-authored-by: Arnau Jimenez <arnau.jimenez@zeiss.com> Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>	2024-04-01 17:14:17 +02:00
Michael Panchenko	5bf923c9bd	Removed more references to Chinese docs [skip ci]	2024-03-28 18:17:25 +01:00
Michael Panchenko	23a33a9aa3	Removed link to Chinese docs [skip ci]	2024-03-28 18:13:15 +01:00
Michael Panchenko	ecb272c61b	Update CHANGELOG.md [skip ci]	2024-03-28 18:06:00 +01:00
bordeauxred	4f65b131aa	Feat/refactor collector (#1063 ) Closes: #1058 ### Api Extensions - Batch received two new methods: `to_dict` and `to_list_of_dicts`. #1063 - `Collector`s can now be closed, and their reset is more granular. #1063 - Trainers can control whether collectors should be reset prior to training. #1063 - Convenience constructor for `CollectStats` called `with_autogenerated_stats`. #1063 ### Internal Improvements - `Collector`s rely less on state, the few stateful things are stored explicitly instead of through a `.data` attribute. #1063 - Introduced a first iteration of a naming convention for vars in `Collector`s. #1063 - Generally improved readability of Collector code and associated tests (still quite some way to go). #1063 - Improved typing for `exploration_noise` and within Collector. #1063 ### Breaking Changes - Removed `.data` attribute from `Collector` and its child classes. #1063 - Collectors no longer reset the environment on initialization. Instead, the user might have to call `reset` explicitly or pass `reset_before_collect=True`. #1063 - VectorEnvs now return an array of info-dicts on reset instead of a list. #1063 - Fixed `iter(Batch(...))` which now behaves the same way as `Batch(...).__iter__()`. Can be considered a bugfix. #1063 --------- Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>	2024-03-28 18:02:31 +01:00
maxhuettenrauch	edae9e4403	fixed env seeding in test_sac_with_il.py (#1081 )	2024-03-28 12:52:35 +01:00
Michael Panchenko	61bf9adaff	Update CHANGELOG.md [skip ci]	2024-03-20 23:09:26 +01:00
Michael Panchenko	5f96a57bbb	Add CHANGELOG.md	2024-03-20 23:08:34 +01:00
Michael Panchenko	1a4d7deca6	Update publish.yaml, typo [skip ci] v1.0.0	2024-03-20 00:41:46 +01:00
Michael Panchenko	72df9a580d	Update publish.yaml [skip ci]	2024-03-20 00:41:17 +01:00
Michael Panchenko	55e9bee373	Update publish.yaml [skip ci]	2024-03-20 00:39:54 +01:00
Michael Panchenko	e3661c11e3	Update publish.yaml, missing / [skip ci]	2024-03-20 00:26:11 +01:00
maxhuettenrauch	e82379c47f	Allow explicit setting of multiprocessing context for SubprocEnvWorker (#1072 ) Running multiple training runs in parallel (with, for example, joblib) fails on macOS due to a change in the standard context for multiprocessing (see [here](https://stackoverflow.com/questions/65098398/why-using-fork-works-but-using-spawn-fails-in-python3-8-multiprocessing) or [here](https://www.reddit.com/r/learnpython/comments/g5372v/multiprocessing_with_fork_on_macos/)). This PR adds the ability to explicitly set a multiprocessing context for the SubProcEnvWorker (similar to gymnasium's [AsyncVecEnv](https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/vector/async_vector_env.py)). --------- Co-authored-by: Maximilian Huettenrauch <m.huettenrauch@appliedai.de> Co-authored-by: Michael Panchenko <35432522+MischaPanch@users.noreply.github.com>	2024-03-14 11:07:56 +01:00
Dominik Jain	1714c7f2c7	High-level API: Fix number of test episodes being incorrectly scaled by number of envs (#1071 )	2024-03-07 08:57:11 -08:00
Michael Panchenko	6746a80f6d	Add publish workflow, first preparation for next release (#1067 )	2024-03-04 12:21:49 +01:00
Michael Panchenko	fdb69f1273	Improve README, minor changes in procedural example (#1068 )	2024-03-03 15:07:07 +01:00
Dominik Jain	b6b2c95ac7	Improve README, minor changes in procedural example	2024-03-03 15:06:40 +01:00
Erni	1aee41fa9c	Using dist.mode instead of logits.argmax (#1066 ) changed all the occurrences where an action is selected deterministically - from: using the outputs of the actor network. - to: using the mode of the PyTorch distribution. --------- Co-authored-by: Arnau Jimenez <arnau.jimenez@zeiss.com>	2024-03-03 00:09:39 +01:00
maxhuettenrauch	7c970df53f	Fix/add watch env with obs rms (#1061 ) Supports deciding whether to watch the agent performing on the env using high-level interfaces	2024-02-29 15:59:11 +01:00
Dominik Jain	49781e715e	Fix high-level examples (#1060 ) The high-level examples were all broken by changes made to make mypy pass. This PR fixes them, making a type change in logging.run_cli instead to make mypy happy.	2024-02-23 23:17:14 +01:00
Ashok Arora	0b61bf8caf	Fix the link to the contributing guide. (#1062 )	2024-02-23 23:15:41 +01:00
Carlo Cagnetta	ce371ae736	remove old python versions from poetry classifier (#1059 )	2024-02-21 15:27:53 +01:00
Michael Panchenko	9b6cb6903e	Improvements in High-Level API and Poe Tasks (#1055 ) * Add an option to SamplingConfig which allows to configure number of test episodes * Make OptimizerFactory more flexible, adding method `create_optimizer_for_params` * Fix AutoAlphaFactoryDefault using hard-coded Adam optimizer * Fix mypy issues that were platform/installation-dependent * Limit scope of nbqa, resolving issues with files generated by old versions of the build Fixes #1054	2024-02-15 12:02:16 +01:00
Dominik Jain	26e210a6ae	Apply nbqa only to the docs/ folder and exclude the (old) jupyter_execute folder	2024-02-15 11:39:45 +01:00
Dominik Jain	08728ad35e	Resolve platform-specific/installation-specific mypy issues by adding ignores and ignoring unused ignores locally	2024-02-15 11:26:54 +01:00
Dominik Jain	f2e0fd165d	Fix gitignore applying to tianshou/env on platforms with case-insensitive file system	2024-02-15 11:26:39 +01:00
Dominik Jain	eeb2081ca6	Fix AutoAlphaFactoryDefault using hard-coded Adam optimizer instead of passed factory	2024-02-14 20:43:38 +01:00
Dominik Jain	76cbd7efc2	Make OptimizerFactory more flexible by adding a second method which allows the creation of an optimizer given arbitrary parameters (rather than a module)	2024-02-14 20:42:06 +01:00
Dominik Jain	bf391853dc	Allow to configure number of test episodes in high-level API	2024-02-14 19:14:28 +01:00
Michael Panchenko	8742e3645c	Docs, js - typo in path	2024-02-14 10:50:06 +01:00
Michael Panchenko	5cc51145da	Docs/hotfix (#1052 )	2024-02-12 18:54:38 +01:00
Michael Panchenko	7a30b842b6	Add vega scripts explicitly to config (#1051 )	2024-02-12 18:49:32 +01:00

669 Commits