Tianshou

Author	SHA1	Message	Date
Dominik Jain	ac672f65d1	Add docstring for ActorFactoryTransientStorageDecorator	2023-11-06 17:18:10 +01:00
Dominik Jain	7e6d3d627e	Rename class ActorCriticModuleOpt -> ActorCriticOpt	2023-11-06 16:51:41 +01:00
Dominik Jain	fdb0eba93d	Depend on sensAI instead of copying its utils (logging, string)	2023-10-27 20:15:58 +02:00
Dominik Jain	41bd463a7b	Allow to configure activation function in default networks * Set ReLU as default in all actor and critic factories * Configure non-default in applicable MuJoCo examples	2023-10-18 20:44:18 +02:00
Dominik Jain	4b270eaa2d	Add documentation, improve structure of 'module' package	2023-10-18 20:44:18 +02:00
Dominik Jain	023b33c917	Make mypy happy	2023-10-18 20:44:18 +02:00
Dominik Jain	f6d49774a2	Reify policy persistence, introducing World representation	2023-10-18 20:44:17 +02:00
Dominik Jain	686fd555b0	Extend tests, fixing some default behaviour	2023-10-18 20:44:17 +02:00
Dominik Jain	a8a367c42d	Support IQN in high-level API * Add example atari_iqn_hl * Factor out trainer callbacks to new module atari_callbacks * Extract base class for DQN-based agent factories * Improved module factory interface design, achieving higher generality	2023-10-18 20:44:17 +02:00
Dominik Jain	799beb79b4	Support discrete SAC in high-level API * Changed mechanism for reusing actor's preprocessing module in critics to avoid special handling in AgentFactory implementations, improving separation of concerns: - Added CriticFactoryReuseActor as the new critic factory - Added ActorFactoryTransientStorageDecorator to pass on the actor data - Added helper classes ActorFuture, ActorFutureProviderProtocol * Add example atari_sac_hl	2023-10-18 20:44:17 +02:00
Dominik Jain	17ef4dd5eb	Support REDQ in high-level API * Implement example mujoco_redq_hl * Add abstraction CriticEnsembleFactory with default implementations to suit REDQ * Fix type annotation of linear_layer in Net, MLP, Critic (was incompatible with REDQ usage)	2023-10-18 20:44:17 +02:00
Dominik Jain	1bb52a6a5c	Simplify critic/agent with optimizer generation. After adding a function to create ModuleOpt instances directly from AgentFactory and CriticFactory, * several mixins for AgentFactories are no longer needed (deleted) * additional abstractions for ModuleOptFactories are no longer needed (deleted)	2023-10-18 20:44:17 +02:00
Dominik Jain	6bb3abb2f0	Support PG/Reinforce in high-level API * Add example mujoco_reinforce_hl * Extended functionality of ActorFactory to support creation of ModuleOpt	2023-10-18 20:44:17 +02:00
Dominik Jain	a161a9cf58	Improve type annotations, fix type issues and add checks	2023-10-18 20:44:17 +02:00
Dominik Jain	b54fcd12cb	Change high-level DQN interface to expect an actor instead of a critic, because that is what is functionally required	2023-10-18 20:44:16 +02:00
Dominik Jain	1cba589bd4	Add DQN support in high-level API * Allow to specify trainer callbacks (train_fn, test_fn, stop_fn) in high-level API, adding the necessary abstractions and pass-on mechanisms * Add example atari_dqn_hl	2023-10-18 20:44:16 +02:00
Dominik Jain	358978c65d	Add ToStringMixin to further high-level parameter classes	2023-10-18 20:44:16 +02:00
Dominik Jain	9f0a410bb1	Log full experiment configuration, adding string representations to relevant classes	2023-10-18 20:44:16 +02:00
Dominik Jain	6b6d9ea609	Add support for discrete PPO * Refactored module `module` (split into submodules) * Basic support for discrete environments * Implement Atari env. factory * Implement DQN-based actor factory * Implement notion of reusing agent preprocessing network for critic * Add example atari_ppo_hl	2023-10-18 20:44:16 +02:00

19 Commits