17 Commits

Author SHA1 Message Date
Dominik Jain
d269063e6a Remove 'RL' prefix from class names 2023-10-18 20:44:17 +02:00
Dominik Jain
b54fcd12cb Change high-level DQN interface to expect an actor instead of a critic,
because that is what is functionally required
2023-10-18 20:44:16 +02:00
Dominik Jain
1cba589bd4 Add DQN support in high-level API
* Allow to specify trainer callbacks (train_fn, test_fn, stop_fn)
  in high-level API, adding the necessary abstractions and pass-on
  mechanisms
* Add example atari_dqn_hl
2023-10-18 20:44:16 +02:00
Dominik Jain
9f0a410bb1 Log full experiment configuration, adding string representations to relevant classes 2023-10-18 20:44:16 +02:00
Dominik Jain
2671580c6c Add DDPG high-level API and MuJoCo example 2023-10-18 20:44:16 +02:00
Dominik Jain
6b6d9ea609 Add support for discrete PPO
* Refactored module `module` (split into submodules)
* Basic support for discrete environments
* Implement Atari env. factory
* Implement DQN-based actor factory
* Implement notion of reusing agent preprocessing network for critic
* Add example atari_ppo_hl
2023-10-18 20:44:16 +02:00
Dominik Jain
cd79cf8661 Add A2C high-level API
* Add common based class for A2C and PPO agent factories
* Add default for dist_fn parameter, adding corresponding factories
* Add example mujoco_a2c_hl
2023-10-18 20:44:16 +02:00
Dominik Jain
d4e604b46e Move parameter transformation directly into parameter objects,
achieving greater separation of concerns and improved maintainability
2023-10-18 20:44:16 +02:00
Dominik Jain
e993425aa1 Add high-level API support for TD3
* Created mixins for agent factories to reduce code duplication
 * Further factorised params & mixins for experiment factories
 * Additional parameter abstractions
 * Implement high-level MuJoCo TD3 example
2023-10-18 20:44:16 +02:00
Dominik Jain
367778d37f Improve high-level policy parametrisation
Policy objects are now parametrised by converting the parameter
dataclass instances to kwargs, using some injectable conversions
along the way
2023-10-18 20:44:16 +02:00
Dominik Jain
37dc07e487 Add high-level experiment builder interface 2023-10-18 20:44:05 +02:00
Dominik Jain
3fd60f9e70 Unify PPO configuration objects, use experiment-specific configuration
in mujoco_ppo_hl
2023-10-09 13:02:29 +02:00
Dominik Jain
8ec42009cb Move RLSamplingConfig to separate module config, fixing cyclic import 2023-10-09 13:02:23 +02:00
Dominik Jain
d26b8cb40c Use experiment-specific config in mujoco_sac_hl, adding auto-alpha 2023-10-09 13:02:18 +02:00
Dominik Jain
997b520580 Refactoring, dropping package config 2023-10-09 13:02:07 +02:00
Dominik Jain
316eb3c579 Add SAC high-level interface 2023-10-09 13:02:01 +02:00
Dominik Jain
16ed5fd2a5 Initial high-level interfaces, demonstrated in mujoco_ppo_hl 2023-10-09 13:01:35 +02:00