619 Commits

Author SHA1 Message Date
Dominik Jain
e63d8d4147 Use ToStringMixin in dataclasses to detect recurring objects in larger object trees 2023-10-18 20:44:18 +02:00
Dominik Jain
d84e936430 Apply centrally defined callbacks 2023-10-18 20:44:18 +02:00
Dominik Jain
ae4850692f DQNExperimentBuilder: Use IntermediateModuleFactory instead of ActorFactory
(similar to IQN implementation)
2023-10-18 20:44:18 +02:00
Dominik Jain
83048788a1 Add generalised DQN network representation, adding specialised class for feature_only=True 2023-10-18 20:44:18 +02:00
Dominik Jain
4b270eaa2d Add documentation, improve structure of 'module' package 2023-10-18 20:44:18 +02:00
Dominik Jain
97e21b5ddf Remove obsolete mixin, improve class names 2023-10-18 20:44:18 +02:00
Dominik Jain
90eaacb606 PolicyWrapperFactory: Remove unnecessary input type variable 2023-10-18 20:44:18 +02:00
Dominik Jain
fc695a5394 Use logging to report trainer epoch status 2023-10-18 20:44:18 +02:00
Dominik Jain
3bba192633 Add experiment result 2023-10-18 20:44:18 +02:00
Dominik Jain
023b33c917 Make mypy happy 2023-10-18 20:44:18 +02:00
Dominik Jain
76e870207d Improve persistence handling
* Add persistence/restoration of Experiment instance
* Add file logging in experiment
* Allow all persistence/logging to be disabled
* Disable persistence in tests
2023-10-18 20:44:18 +02:00
Dominik Jain
ba803296cc Add FileLoggerContext 2023-10-18 20:44:17 +02:00
Dominik Jain
3691ed2abc Support obs_rms persistence for MuJoCo by adding a general mechanism
for attaching persistence to Environments instances
2023-10-18 20:44:17 +02:00
Dominik Jain
f6d49774a2 Reify policy persistence, introducing World representation 2023-10-18 20:44:17 +02:00
Dominik Jain
ee3813b09c Ignore temp scripts and temp folder 2023-10-18 20:44:17 +02:00
Dominik Jain
686fd555b0 Extend tests, fixing some default behaviour 2023-10-18 20:44:17 +02:00
Dominik Jain
a8a367c42d Support IQN in high-level API
* Add example atari_iqn_hl
* Factor out trainer callbacks to new module atari_callbacks
* Extract base class for DQN-based agent factories
* Improved module factory interface design, achieving higher generality
2023-10-18 20:44:17 +02:00
Dominik Jain
213e08a846 Add method get_output_dim to BaseActor 2023-10-18 20:44:17 +02:00
Dominik Jain
c7d0b6b4b2 Simplify agent factories by making better use of base classes 2023-10-18 20:44:17 +02:00
Dominik Jain
799beb79b4 Support discrete SAC in high-level API
* Changed mechanism for reusing actor's preprocessing module in critics
  to avoid special handling in AgentFactory implementations, improving
  separation of concerns:
    - Added CriticFactoryReuseActor as the new critic factory
    - Added ActorFactoryTransientStorageDecorator to pass on the actor
      data
    - Added helper classes ActorFuture, ActorFutureProviderProtocol
* Add example atari_sac_hl
2023-10-18 20:44:17 +02:00
Dominik Jain
305b30a6c1 Simplify parameter transformers by applying ParamTransformerChangeValue 2023-10-18 20:44:17 +02:00
Dominik Jain
17ef4dd5eb Support REDQ in high-level API
* Implement example mujoco_redq_hl
* Add abstraction CriticEnsembleFactory with default implementations
  to suit REDQ
* Fix type annotation of linear_layer in Net, MLP, Critic
  (was incompatible with REDQ usage)
2023-10-18 20:44:17 +02:00
Dominik Jain
7af836bd6a Support TRPO in high-level API and add example mujoco_trpo_hl 2023-10-18 20:44:17 +02:00
Dominik Jain
383a4a6083 Support NPG in high-level API and add example mujoco_npg_hl 2023-10-18 20:44:17 +02:00
Dominik Jain
73a6d15eee Log Environments 2023-10-18 20:44:17 +02:00
Dominik Jain
a8ea6808c3 Fix ruff type comparison complaint 2023-10-18 20:44:17 +02:00
Dominik Jain
1bb52a6a5c Simplify critic/agent with optimizer generation
After adding a function to create ModuleOpt instances directly from
AgentFactory and CriticFactory,
  * several mixins for AgentFactories are no longer needed (deleted)
  * additional abstractions for ModuleOptFactories are no longer needed (deleted)
2023-10-18 20:44:17 +02:00
Dominik Jain
6bb3abb2f0 Support PG/Reinforce in high-level API
* Add example mujoco_reinforce_hl
* Extended functionality of ActorFactory to support creation of ModuleOpt
2023-10-18 20:44:17 +02:00
Dominik Jain
4e93c12afa Remove obsolete configuration files 2023-10-18 20:44:17 +02:00
Dominik Jain
22dfc4ed2e Fix type annotations of dist_fn 2023-10-18 20:44:17 +02:00
Dominik Jain
a161a9cf58 Improve type annotations, fix type issues and add checks 2023-10-18 20:44:17 +02:00
Dominik Jain
e6716326bd Make mypy ignore copied util modules string & logging 2023-10-18 20:44:17 +02:00
Dominik Jain
7ed6c1d71c Remove obsolete module highlevel.utils 2023-10-18 20:44:17 +02:00
Dominik Jain
1243894eb8 Add DistributionFunctionFactory subclasses for discrete/continuous default 2023-10-18 20:44:17 +02:00
Dominik Jain
a8dc75fbab ExperimentBuilder: Allow experiment_config and sampling_config to be None 2023-10-18 20:44:17 +02:00
Dominik Jain
837ff13c04 Reorder ExperimentBuilder args (EnvFactory first) 2023-10-18 20:44:17 +02:00
Dominik Jain
d269063e6a Remove 'RL' prefix from class names 2023-10-18 20:44:17 +02:00
Dominik Jain
50ac385321 Add some basic tests for high-level experiment builder API 2023-10-18 20:44:16 +02:00
Dominik Jain
b54fcd12cb Change high-level DQN interface to expect an actor instead of a critic,
because that is what is functionally required
2023-10-18 20:44:16 +02:00
Dominik Jain
1cba589bd4 Add DQN support in high-level API
* Allow specifying trainer callbacks (train_fn, test_fn, stop_fn)
  in high-level API, adding the necessary abstractions and pass-on
  mechanisms
* Add example atari_dqn_hl
2023-10-18 20:44:16 +02:00
Dominik Jain
358978c65d Add ToStringMixin to further high-level parameter classes 2023-10-18 20:44:16 +02:00
Dominik Jain
8f67c2e9d9 Disable numba DEBUG logs 2023-10-18 20:44:16 +02:00
Dominik Jain
9f0a410bb1 Log full experiment configuration, adding string representations to relevant classes 2023-10-18 20:44:16 +02:00
Dominik Jain
58bd20f882 Add logging module 2023-10-18 20:44:16 +02:00
Dominik Jain
ce26e25923 Handle ruff complaints in string module 2023-10-18 20:44:16 +02:00
Dominik Jain
de70147752 Add string module from sensAI 2023-10-18 20:44:16 +02:00
Dominik Jain
2671580c6c Add DDPG high-level API and MuJoCo example 2023-10-18 20:44:16 +02:00
Dominik Jain
6b6d9ea609 Add support for discrete PPO
* Refactored module `module` (split into submodules)
* Basic support for discrete environments
* Implement Atari env. factory
* Implement DQN-based actor factory
* Implement notion of reusing agent preprocessing network for critic
* Add example atari_ppo_hl
2023-10-18 20:44:16 +02:00
Dominik Jain
e0e7349b0a Add base class BaseActor with method get_preprocess_net for high-level API 2023-10-18 20:44:16 +02:00
Dominik Jain
cd79cf8661 Add A2C high-level API
* Add common base class for A2C and PPO agent factories
* Add default for dist_fn parameter, adding corresponding factories
* Add example mujoco_a2c_hl
2023-10-18 20:44:16 +02:00