Tianshou

Author	SHA1	Message	Date
Dominik Jain	e63d8d4147	Use ToStringMixin in dataclasses to detect recurring objects in larger object trees	2023-10-18 20:44:18 +02:00
Dominik Jain	d84e936430	Apply centrally defined callbacks	2023-10-18 20:44:18 +02:00
Dominik Jain	ae4850692f	DQNExperimentBuilder: Use IntermediateModuleFactory instead of ActorFactory (similar to IQN implementation)	2023-10-18 20:44:18 +02:00
Dominik Jain	83048788a1	Add generalised DQN network representation, adding specialised class for feature_only=True	2023-10-18 20:44:18 +02:00
Dominik Jain	4b270eaa2d	Add documentation, improve structure of 'module' package	2023-10-18 20:44:18 +02:00
Dominik Jain	97e21b5ddf	Remove obsolete mixin, improve class names	2023-10-18 20:44:18 +02:00
Dominik Jain	90eaacb606	PolicyWrapperFactory: Remove unnecessary input type variable	2023-10-18 20:44:18 +02:00
Dominik Jain	fc695a5394	Use logging to report trainer epoch status	2023-10-18 20:44:18 +02:00
Dominik Jain	3bba192633	Add experiment result	2023-10-18 20:44:18 +02:00
Dominik Jain	023b33c917	Make mypy happy	2023-10-18 20:44:18 +02:00
Dominik Jain	76e870207d	Improve persistence handling * Add persistence/restoration of Experiment instance * Add file logging in experiment * Allow all persistence/logging to be disabled * Disable persistence in tests	2023-10-18 20:44:18 +02:00
Dominik Jain	ba803296cc	Add FileLoggerContext	2023-10-18 20:44:17 +02:00
Dominik Jain	3691ed2abc	Support obs_rms persistence for MuJoCo by adding a general mechanism for attaching persistence to Environments instances	2023-10-18 20:44:17 +02:00
Dominik Jain	f6d49774a2	Reify policy persistence, introducing World representation	2023-10-18 20:44:17 +02:00
Dominik Jain	ee3813b09c	Ignore temp scripts and temp folder	2023-10-18 20:44:17 +02:00
Dominik Jain	686fd555b0	Extend tests, fixing some default behaviour	2023-10-18 20:44:17 +02:00
Dominik Jain	a8a367c42d	Support IQN in high-level API * Add example atari_iqn_hl * Factor out trainer callbacks to new module atari_callbacks * Extract base class for DQN-based agent factories * Improved module factory interface design, achieving higher generality	2023-10-18 20:44:17 +02:00
Dominik Jain	213e08a846	Add method get_output_dim to BaseActor	2023-10-18 20:44:17 +02:00
Dominik Jain	c7d0b6b4b2	Simplify agent factories by making better use of base classes	2023-10-18 20:44:17 +02:00
Dominik Jain	799beb79b4	Support discrete SAC in high-level API * Changed mechanism for reusing actor's preprocessing module in critics to avoid special handling in AgentFactory implementations, improving separation of concerns: - Added CriticFactoryReuseActor as the new critic factory - Added ActorFactoryTransientStorageDecorator to pass on the actor data - Added helper classes ActorFuture, ActorFutureProviderProtocol * Add example atari_sac_hl	2023-10-18 20:44:17 +02:00
Dominik Jain	305b30a6c1	Simplify parameter transformers by applying ParamTransformerChangeValue	2023-10-18 20:44:17 +02:00
Dominik Jain	17ef4dd5eb	Support REDQ in high-level API * Implement example mujoco_redq_hl * Add abstraction CriticEnsembleFactory with default implementations to suit REDQ * Fix type annotation of linear_layer in Net, MLP, Critic (was incompatible with REDQ usage)	2023-10-18 20:44:17 +02:00
Dominik Jain	7af836bd6a	Support TRPO in high-level API and add example mujoco_trpo_hl	2023-10-18 20:44:17 +02:00
Dominik Jain	383a4a6083	Support NPG in high-level API and add example mujoco_npg_hl	2023-10-18 20:44:17 +02:00
Dominik Jain	73a6d15eee	Log Environments	2023-10-18 20:44:17 +02:00
Dominik Jain	a8ea6808c3	Fix ruff type comparison complaint	2023-10-18 20:44:17 +02:00
Dominik Jain	1bb52a6a5c	Simplify critic/agent with optimizer generation After adding a function to create ModuleOpt instances directly from AgentFactory and CriticFactory, * several mixins for AgentFactories are no longer needed (deleted) * additional abstractions for ModuleOptFactories are no longer needed (deleted)	2023-10-18 20:44:17 +02:00
Dominik Jain	6bb3abb2f0	Support PG/Reinforce in high-level API * Add example mujoco_reinforce_hl * Extended functionality of ActorFactory to support creation of ModuleOpt	2023-10-18 20:44:17 +02:00
Dominik Jain	4e93c12afa	Remove obsolete configuration files	2023-10-18 20:44:17 +02:00
Dominik Jain	22dfc4ed2e	Fix type annotations of dist_fn	2023-10-18 20:44:17 +02:00
Dominik Jain	a161a9cf58	Improve type annotations, fix type issues and add checks	2023-10-18 20:44:17 +02:00
Dominik Jain	e6716326bd	Make mypy ignore copied util modules string & logging	2023-10-18 20:44:17 +02:00
Dominik Jain	7ed6c1d71c	Remove obsolete module highlevel.utils	2023-10-18 20:44:17 +02:00
Dominik Jain	1243894eb8	Add DistributionFunctionFactory subclasses for discrete/continuous default	2023-10-18 20:44:17 +02:00
Dominik Jain	a8dc75fbab	ExperimentBuilder: Allow experiment_config and sampling_config to be None	2023-10-18 20:44:17 +02:00
Dominik Jain	837ff13c04	Reorder ExperimentBuilder args (EnvFactory first)	2023-10-18 20:44:17 +02:00
Dominik Jain	d269063e6a	Remove 'RL' prefix from class names	2023-10-18 20:44:17 +02:00
Dominik Jain	50ac385321	Add some basic tests for high-level experiment builder API	2023-10-18 20:44:16 +02:00
Dominik Jain	b54fcd12cb	Change high-level DQN interface to expect an actor instead of a critic, because that is what is functionally required	2023-10-18 20:44:16 +02:00
Dominik Jain	1cba589bd4	Add DQN support in high-level API * Allow to specify trainer callbacks (train_fn, test_fn, stop_fn) in high-level API, adding the necessary abstractions and pass-on mechanisms * Add example atari_dqn_hl	2023-10-18 20:44:16 +02:00
Dominik Jain	358978c65d	Add ToStringMixin to further high-level parameter classes	2023-10-18 20:44:16 +02:00
Dominik Jain	8f67c2e9d9	Disable numba DEBUG logs	2023-10-18 20:44:16 +02:00
Dominik Jain	9f0a410bb1	Log full experiment configuration, adding string representations to relevant classes	2023-10-18 20:44:16 +02:00
Dominik Jain	58bd20f882	Add logging module	2023-10-18 20:44:16 +02:00
Dominik Jain	ce26e25923	Handle ruff complaints in string module	2023-10-18 20:44:16 +02:00
Dominik Jain	de70147752	Add string module from sensAI	2023-10-18 20:44:16 +02:00
Dominik Jain	2671580c6c	Add DDPG high-level API and MuJoCo example	2023-10-18 20:44:16 +02:00
Dominik Jain	6b6d9ea609	Add support for discrete PPO * Refactored module `module` (split into submodules) * Basic support for discrete environments * Implement Atari env. factory * Implement DQN-based actor factory * Implement notion of reusing agent preprocessing network for critic * Add example atari_ppo_hl	2023-10-18 20:44:16 +02:00
Dominik Jain	e0e7349b0a	Add base class BaseActor with method get_preprocess_net for high-level API	2023-10-18 20:44:16 +02:00
Dominik Jain	cd79cf8661	Add A2C high-level API * Add common base class for A2C and PPO agent factories * Add default for dist_fn parameter, adding corresponding factories * Add example mujoco_a2c_hl	2023-10-18 20:44:16 +02:00

619 Commits