Tianshou

Author	SHA1	Message	Date
Dominik Jain	d18ded333e	CriticFactoryReuseActor: Fix the case where we want to reuse an actor's preprocessing network for the critic (must be applied before concatenating the actions)	2024-04-29 18:27:02 +02:00
Daniel Plop	8a0629ded6	Fix mypy issues in tests and examples (#1077 ) Closes #952 - `SamplingConfig` supports `batch_size=None`. #1077 - tests and examples are covered by `mypy`. #1077 - `NetBase` is more used, stricter typing by making it generic. #1077 - `utils.net.common.Recurrent` now receives and returns a `RecurrentStateBatch` instead of a dict. #1077 --------- Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>	2024-04-03 18:07:51 +02:00
Michael Panchenko	a846b52063	Typing: fixed multiple typing issues	2023-12-05 12:04:18 +01:00
Michael Panchenko	2e39a252e3	Docstring: minor changes to let ruff pass	2023-12-04 13:52:46 +01:00
Dominik Jain	dae4000cd2	Revert "Depend on sensAI instead of copying its utils (logging, string)" This reverts commit fdb0eba93d81fa5e698770b4f7088c87fc1238da.	2023-11-08 19:11:39 +01:00
Dominik Jain	ac672f65d1	Add docstring for ActorFactoryTransientStorageDecorator	2023-11-06 17:18:10 +01:00
Dominik Jain	7e6d3d627e	Rename class ActorCriticModuleOpt -> ActorCriticOpt	2023-11-06 16:51:41 +01:00
Dominik Jain	fdb0eba93d	Depend on sensAI instead of copying its utils (logging, string)	2023-10-27 20:15:58 +02:00
Dominik Jain	41bd463a7b	Allow to configure activation function in default networks * Set ReLU as default in all actor and critic factories * Configure non-default in applicable MuJoCo examples	2023-10-18 20:44:18 +02:00
Dominik Jain	4b270eaa2d	Add documentation, improve structure of 'module' package	2023-10-18 20:44:18 +02:00
Dominik Jain	023b33c917	Make mypy happy	2023-10-18 20:44:18 +02:00
Dominik Jain	f6d49774a2	Reify policy persistence, introducing Wold representation	2023-10-18 20:44:17 +02:00
Dominik Jain	686fd555b0	Extend tests, fixing some default behaviour	2023-10-18 20:44:17 +02:00
Dominik Jain	a8a367c42d	Support IQN in high-level API * Add example atari_iqn_hl * Factor out trainer callbacks to new module atari_callbacks * Extract base class for DQN-based agent factories * Improved module factory interface design, achieving higher generality	2023-10-18 20:44:17 +02:00
Dominik Jain	799beb79b4	Support discrete SAC in high-level API * Changed machanism for reusing actor's preprocessing module in critics to avoid special handling in AgentFactory implementations, improving separation of concerns: - Added CriticFactoryReuseActor as the new critic factory - Added ActorFactoryTransientStorageDecorator to pass on the actor data - Added helper classes ActorFuture, ActorFutureProviderProtocol * Add example atari_sac_hl	2023-10-18 20:44:17 +02:00
Dominik Jain	17ef4dd5eb	Support REDQ in high-level API * Implement example mujoco_redq_hl * Add abstraction CriticEnsembleFactory with default implementations to suit REDQ * Fix type annotation of linear_layer in Net, MLP, Critic (was incompatible with REDQ usage)	2023-10-18 20:44:17 +02:00
Dominik Jain	1bb52a6a5c	Simplify critic/agent with optimizer generation After adding a function to create ModuleOpt instances directly from AgentFactory and CriticFactory, * several mixins for AgentFactories are no longer needed (deleted) * additional abstractions for ModuleOptFactories are no longer needed (deleted)	2023-10-18 20:44:17 +02:00
Dominik Jain	6bb3abb2f0	Support PG/Reinforce in high-level API * Add example mujoco_reinforce_hl * Extended functionality of ActorFactory to support creation of ModuleOpt	2023-10-18 20:44:17 +02:00
Dominik Jain	a161a9cf58	Improve type annotations, fix type issues and add checks	2023-10-18 20:44:17 +02:00
Dominik Jain	b54fcd12cb	Change high-level DQN interface to expect an actor instead of a critic, because that is what is functionally required	2023-10-18 20:44:16 +02:00
Dominik Jain	1cba589bd4	Add DQN support in high-level API * Allow to specify trainer callbacks (train_fn, test_fn, stop_fn) in high-level API, adding the necessary abstractions and pass-on mechanisms * Add example atari_dqn_hl	2023-10-18 20:44:16 +02:00
Dominik Jain	358978c65d	Add ToStringMixin to further high-level parameter classes	2023-10-18 20:44:16 +02:00
Dominik Jain	9f0a410bb1	Log full experiment configuration, adding string representations to relevant classes	2023-10-18 20:44:16 +02:00
Dominik Jain	6b6d9ea609	Add support for discrete PPO * Refactored module `module` (split into submodules) * Basic support for discrete environments * Implement Atari env. factory * Implement DQN-based actor factory * Implement notion of reusing agent preprocessing network for critic * Add example atari_ppo_hl	2023-10-18 20:44:16 +02:00

24 Commits