Dominik Jain
d269063e6a
Remove 'RL' prefix from class names
2023-10-18 20:44:17 +02:00
Dominik Jain
b54fcd12cb
Change high-level DQN interface to expect an actor instead of a critic, because that is what is functionally required
2023-10-18 20:44:16 +02:00
Dominik Jain
1cba589bd4
Add DQN support in high-level API
* Allow specifying trainer callbacks (train_fn, test_fn, stop_fn)
in the high-level API, adding the necessary abstractions and pass-on
mechanisms
* Add example atari_dqn_hl
2023-10-18 20:44:16 +02:00
Dominik Jain
9f0a410bb1
Log full experiment configuration, adding string representations to relevant classes
2023-10-18 20:44:16 +02:00
Dominik Jain
2671580c6c
Add DDPG high-level API and MuJoCo example
2023-10-18 20:44:16 +02:00
Dominik Jain
6b6d9ea609
Add support for discrete PPO
* Refactor module `module` (split into submodules)
* Basic support for discrete environments
* Implement Atari env. factory
* Implement DQN-based actor factory
* Implement notion of reusing agent preprocessing network for critic
* Add example atari_ppo_hl
2023-10-18 20:44:16 +02:00
Dominik Jain
cd79cf8661
Add A2C high-level API
* Add common base class for A2C and PPO agent factories
* Add default for dist_fn parameter, adding corresponding factories
* Add example mujoco_a2c_hl
2023-10-18 20:44:16 +02:00
Dominik Jain
d4e604b46e
Move parameter transformation directly into parameter objects, achieving greater separation of concerns and improved maintainability
2023-10-18 20:44:16 +02:00
Dominik Jain
e993425aa1
Add high-level API support for TD3
* Create mixins for agent factories to reduce code duplication
* Further factorise params & mixins for experiment factories
* Additional parameter abstractions
* Implement high-level MuJoCo TD3 example
2023-10-18 20:44:16 +02:00
Dominik Jain
367778d37f
Improve high-level policy parametrisation
Policy objects are now parametrised by converting the parameter
dataclass instances to kwargs, using some injectable conversions
along the way
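
The conversion described above can be illustrated with a minimal sketch. It is not the repository's implementation: every name in it (ExamplePolicyParams, lr_to_optim_spec, to_policy_kwargs, ExamplePolicy) is hypothetical, and the only idea taken from the commit message is turning a parameter dataclass into kwargs while applying injectable conversions.

```python
from dataclasses import asdict, dataclass
from typing import Any, Callable, Dict, Tuple

# All names below are hypothetical and chosen for illustration only.
Conversion = Callable[[Dict[str, Any]], None]


@dataclass
class ExamplePolicyParams:
    """High-level parameters as a user might specify them."""
    lr: float = 1e-3
    gamma: float = 0.99
    hidden_sizes: Tuple[int, ...] = (64, 64)


def lr_to_optim_spec(kwargs: Dict[str, Any]) -> None:
    """Injectable conversion: replace the raw learning rate with an optimiser spec."""
    kwargs["optim"] = {"name": "adam", "lr": kwargs.pop("lr")}


def to_policy_kwargs(params: Any, *conversions: Conversion) -> Dict[str, Any]:
    """Convert a parameter dataclass instance to kwargs, applying each conversion in order."""
    kwargs = asdict(params)
    for convert in conversions:
        convert(kwargs)
    return kwargs


class ExamplePolicy:
    """Stand-in for a policy class that expects low-level constructor arguments."""

    def __init__(self, optim: Dict[str, Any], gamma: float, hidden_sizes: Tuple[int, ...]):
        self.optim = optim
        self.gamma = gamma
        self.hidden_sizes = hidden_sizes


kwargs = to_policy_kwargs(ExamplePolicyParams(lr=0.01), lr_to_optim_spec)
policy = ExamplePolicy(**kwargs)
```

Keeping the conversions as separate callables means the dataclass stays a plain description of user-facing settings, while anything the policy constructor needs in a different shape is produced at conversion time.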
2023-10-18 20:44:16 +02:00
Dominik Jain
37dc07e487
Add high-level experiment builder interface
2023-10-18 20:44:05 +02:00
Dominik Jain
3fd60f9e70
Unify PPO configuration objects, use experiment-specific configuration in mujoco_ppo_hl
2023-10-09 13:02:29 +02:00
Dominik Jain
8ec42009cb
Move RLSamplingConfig to separate module config, fixing cyclic import
2023-10-09 13:02:23 +02:00
Dominik Jain
d26b8cb40c
Use experiment-specific config in mujoco_sac_hl, adding auto-alpha
2023-10-09 13:02:18 +02:00
Dominik Jain
997b520580
Refactoring, dropping package config
2023-10-09 13:02:07 +02:00
Dominik Jain
316eb3c579
Add SAC high-level interface
2023-10-09 13:02:01 +02:00
Dominik Jain
16ed5fd2a5
Initial high-level interfaces, demonstrated in mujoco_ppo_hl
2023-10-09 13:01:35 +02:00