Tianshou

Author	SHA1	Message	Date
Michael Panchenko	bf3859a457	Extension of ExpLauncher and DataclassPPrintMixin 1. Launch in main process if only 1 exp is passed 2. Launcher returns a list of stats for successful exps 3. More detailed logging for unsuccessful exps 4. Raise error if all runs were unsuccessful 5. DataclassPPrintMixin allows retrieving a pretty repr string 6. Minor improvements in docstrings	2024-05-07 16:21:50 +02:00
Dominik Jain	250a129cc4	SamplingConfig: Improve docstrings of replay_buffer_save_only_last_obs, replay_buffer_stack_num	2024-04-29 18:27:02 +02:00
maxhuettenrauch	ade85ab32b	Feature/algo eval (#1074) # Changes ## Dependencies - New extra "eval" ## API Extension - `Experiment` and `ExperimentConfig` now have a `name`, which can, however, be overridden when `Experiment.run()` is called - When building an `Experiment` from an `ExperimentConfig`, the user has the option to add info about seeds to the name. - New method in `ExperimentConfig` called `build_default_seeded_experiments` - `SamplingConfig` has an explicit training seed, `test_seed` is inferred. - New `evaluation` package for repeating the same experiment with multiple seeds and aggregating the results (important extension!). Currently in alpha state. - Loggers can now restore the logged data into Python by using the new `restore_logged_data` ## Breaking Changes - `AtariEnvFactory` (in examples) now receives explicit train and test seeds - `EnvFactoryRegistered` now requires an explicit `test_seed` - `BaseLogger.prepare_dict_for_logging` is now abstract --------- Co-authored-by: Maximilian Huettenrauch <m.huettenrauch@appliedai.de> Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de> Co-authored-by: Michael Panchenko <35432522+MischaPanch@users.noreply.github.com>	2024-04-20 23:25:33 +00:00
Daniel Plop	8a0629ded6	Fix mypy issues in tests and examples (#1077) Closes #952 - `SamplingConfig` supports `batch_size=None`. #1077 - tests and examples are covered by `mypy`. #1077 - `NetBase` is used more widely, with stricter typing by making it generic. #1077 - `utils.net.common.Recurrent` now receives and returns a `RecurrentStateBatch` instead of a dict. #1077 --------- Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>	2024-04-03 18:07:51 +02:00
Dominik Jain	1714c7f2c7	High-level API: Fix number of test episodes being incorrectly scaled by number of envs (#1071)	2024-03-07 08:57:11 -08:00
Dominik Jain	bf391853dc	Allow to configure number of test episodes in high-level API	2024-02-14 19:14:28 +01:00
Dominik Jain	45a1a3f259	SamplingConfig: Change default of repeat_per_collect to 1 (safest option)	2023-12-21 13:13:51 +01:00
Dominik Jain	408d51f9de	SamplingConfig: Improve/extend docstrings, clearly explaining the parameters	2023-12-21 13:13:51 +01:00
Dominik Jain	dae4000cd2	Revert "Depend on sensAI instead of copying its utils (logging, string)" This reverts commit fdb0eba93d81fa5e698770b4f7088c87fc1238da.	2023-11-08 19:11:39 +01:00
Dominik Jain	fdb0eba93d	Depend on sensAI instead of copying its utils (logging, string)	2023-10-27 20:15:58 +02:00
Dominik Jain	d684dae6cd	Change default number of environments (train=#CPUs, test=1)	2023-10-26 12:50:08 +02:00
Dominik Jain	e63d8d4147	Use ToStringMixin in dataclasses to detect recurring objects in larger object trees	2023-10-18 20:44:18 +02:00
Dominik Jain	d269063e6a	Remove 'RL' prefix from class names	2023-10-18 20:44:17 +02:00
Dominik Jain	1cba589bd4	Add DQN support in high-level API * Allow to specify trainer callbacks (train_fn, test_fn, stop_fn) in high-level API, adding the necessary abstractions and pass-on mechanisms * Add example atari_dqn_hl	2023-10-18 20:44:16 +02:00
Dominik Jain	2671580c6c	Add DDPG high-level API and MuJoCo example	2023-10-18 20:44:16 +02:00
Dominik Jain	6b6d9ea609	Add support for discrete PPO * Refactored module `module` (split into submodules) * Basic support for discrete environments * Implement Atari env. factory * Implement DQN-based actor factory * Implement notion of reusing agent preprocessing network for critic * Add example atari_ppo_hl	2023-10-18 20:44:16 +02:00
Dominik Jain	e993425aa1	Add high-level API support for TD3 * Created mixins for agent factories to reduce code duplication * Further factorised params & mixins for experiment factories * Additional parameter abstractions * Implement high-level MuJoCo TD3 example	2023-10-18 20:44:16 +02:00
Dominik Jain	8ec42009cb	Move RLSamplingConfig to separate module config, fixing cyclic import	2023-10-09 13:02:23 +02:00

18 Commits