Dominik Jain
3691ed2abc
Support obs_rms persistence for MuJoCo by adding a general mechanism
for attaching persistence to Environments instances
2023-10-18 20:44:17 +02:00
Dominik Jain
f6d49774a2
Reify policy persistence, introducing World representation
2023-10-18 20:44:17 +02:00
Dominik Jain
ee3813b09c
Ignore temp scripts and temp folder
2023-10-18 20:44:17 +02:00
Dominik Jain
686fd555b0
Extend tests, fixing some default behaviour
2023-10-18 20:44:17 +02:00
Dominik Jain
a8a367c42d
Support IQN in high-level API
...
* Add example atari_iqn_hl
* Factor out trainer callbacks to new module atari_callbacks
* Extract base class for DQN-based agent factories
* Improved module factory interface design, achieving higher generality
2023-10-18 20:44:17 +02:00
Dominik Jain
213e08a846
Add method get_output_dim to BaseActor
2023-10-18 20:44:17 +02:00
Dominik Jain
c7d0b6b4b2
Simplify agent factories by making better use of base classes
2023-10-18 20:44:17 +02:00
Dominik Jain
799beb79b4
Support discrete SAC in high-level API
...
* Changed mechanism for reusing actor's preprocessing module in critics
  to avoid special handling in AgentFactory implementations, improving
  separation of concerns:
  - Added CriticFactoryReuseActor as the new critic factory
  - Added ActorFactoryTransientStorageDecorator to pass on the actor
    data
  - Added helper classes ActorFuture, ActorFutureProviderProtocol
* Add example atari_sac_hl
2023-10-18 20:44:17 +02:00
Dominik Jain
305b30a6c1
Simplify parameter transformers by applying ParamTransformerChangeValue
2023-10-18 20:44:17 +02:00
Dominik Jain
17ef4dd5eb
Support REDQ in high-level API
...
* Implement example mujoco_redq_hl
* Add abstraction CriticEnsembleFactory with default implementations
  to suit REDQ
* Fix type annotation of linear_layer in Net, MLP, Critic
  (was incompatible with REDQ usage)
2023-10-18 20:44:17 +02:00
Dominik Jain
7af836bd6a
Support TRPO in high-level API and add example mujoco_trpo_hl
2023-10-18 20:44:17 +02:00
Dominik Jain
383a4a6083
Support NPG in high-level API and add example mujoco_npg_hl
2023-10-18 20:44:17 +02:00
Dominik Jain
73a6d15eee
Log Environments
2023-10-18 20:44:17 +02:00
Dominik Jain
a8ea6808c3
Fix ruff type comparison complaint
2023-10-18 20:44:17 +02:00
Dominik Jain
1bb52a6a5c
Simplify critic/agent with optimizer generation
...
After adding a function to create ModuleOpt instances directly from
AgentFactory and CriticFactory,
* several mixins for AgentFactories are no longer needed (deleted)
* additional abstractions for ModuleOptFactories are no longer needed (deleted)
2023-10-18 20:44:17 +02:00
Dominik Jain
6bb3abb2f0
Support PG/Reinforce in high-level API
...
* Add example mujoco_reinforce_hl
* Extended functionality of ActorFactory to support creation of ModuleOpt
2023-10-18 20:44:17 +02:00
Dominik Jain
4e93c12afa
Remove obsolete configuration files
2023-10-18 20:44:17 +02:00
Dominik Jain
22dfc4ed2e
Fix type annotations of dist_fn
2023-10-18 20:44:17 +02:00
Dominik Jain
a161a9cf58
Improve type annotations, fix type issues and add checks
2023-10-18 20:44:17 +02:00
Dominik Jain
e6716326bd
Make mypy ignore copied util modules string & logging
2023-10-18 20:44:17 +02:00
Dominik Jain
7ed6c1d71c
Remove obsolete module highlevel.utils
2023-10-18 20:44:17 +02:00
Dominik Jain
1243894eb8
Add DistributionFunctionFactory subclasses for discrete/continuous default
2023-10-18 20:44:17 +02:00
Dominik Jain
a8dc75fbab
ExperimentBuilder: Allow experiment_config and sampling_config to be None
2023-10-18 20:44:17 +02:00
Dominik Jain
837ff13c04
Reorder ExperimentBuilder args (EnvFactory first)
2023-10-18 20:44:17 +02:00
Dominik Jain
d269063e6a
Remove 'RL' prefix from class names
2023-10-18 20:44:17 +02:00
Dominik Jain
50ac385321
Add some basic tests for high-level experiment builder API
2023-10-18 20:44:16 +02:00
Dominik Jain
b54fcd12cb
Change high-level DQN interface to expect an actor instead of a critic,
because that is what is functionally required
2023-10-18 20:44:16 +02:00
Dominik Jain
1cba589bd4
Add DQN support in high-level API
...
* Allow specifying trainer callbacks (train_fn, test_fn, stop_fn)
  in high-level API, adding the necessary abstractions and pass-on
  mechanisms
* Add example atari_dqn_hl
2023-10-18 20:44:16 +02:00
Dominik Jain
358978c65d
Add ToStringMixin to further high-level parameter classes
2023-10-18 20:44:16 +02:00
Dominik Jain
8f67c2e9d9
Disable numba DEBUG logs
2023-10-18 20:44:16 +02:00
Dominik Jain
9f0a410bb1
Log full experiment configuration, adding string representations to relevant classes
2023-10-18 20:44:16 +02:00
Dominik Jain
58bd20f882
Add logging module
2023-10-18 20:44:16 +02:00
Dominik Jain
ce26e25923
Handle ruff complaints in string module
2023-10-18 20:44:16 +02:00
Dominik Jain
de70147752
Add string module from sensAI
2023-10-18 20:44:16 +02:00
Dominik Jain
2671580c6c
Add DDPG high-level API and MuJoCo example
2023-10-18 20:44:16 +02:00
Dominik Jain
6b6d9ea609
Add support for discrete PPO
...
* Refactored module `module` (split into submodules)
* Basic support for discrete environments
* Implement Atari env. factory
* Implement DQN-based actor factory
* Implement notion of reusing agent preprocessing network for critic
* Add example atari_ppo_hl
2023-10-18 20:44:16 +02:00
Dominik Jain
e0e7349b0a
Add base class BaseActor with method get_preprocess_net for high-level API
2023-10-18 20:44:16 +02:00
Dominik Jain
cd79cf8661
Add A2C high-level API
...
* Add common base class for A2C and PPO agent factories
* Add default for dist_fn parameter, adding corresponding factories
* Add example mujoco_a2c_hl
2023-10-18 20:44:16 +02:00
Dominik Jain
acd89fa3b0
Remove parameter transformers from config object state,
composing the list dynamically instead
2023-10-18 20:44:16 +02:00
Dominik Jain
78b6dd1f49
Adapt class naming scheme
...
* Use prefix convention (subclasses have superclass names as prefix) to
  facilitate discoverability of relevant classes via IDE autocompletion
* Use dual naming, adding an alternative concise name that omits the
  precise OO semantics and retains only the essential part of the name
  (which can be more pleasing to users not accustomed to
  convoluted OO naming)
2023-10-18 20:44:16 +02:00
Michael Panchenko
5bcf514c55
Add alternative functional interface for environment creation
where a persistable configuration object is passed as an
argument, as this can help to ensure persistability (making the
requirement explicit)
2023-10-18 20:44:16 +02:00
Dominik Jain
d4e604b46e
Move parameter transformation directly into parameter objects,
achieving greater separation of concerns and improved maintainability
2023-10-18 20:44:16 +02:00
Dominik Jain
38cf982034
Disable Ruff rule D205 (blank-line-after-summary)
because it disallows, in particular, class docstrings that consist
only of a summary line
2023-10-18 20:44:16 +02:00
Dominik Jain
e993425aa1
Add high-level API support for TD3
...
* Created mixins for agent factories to reduce code duplication
* Further factorised params & mixins for experiment factories
* Additional parameter abstractions
* Implement high-level MuJoCo TD3 example
2023-10-18 20:44:16 +02:00
Dominik Jain
6a739384ee
WandbLogger: Use less restrictive type annotation for config
2023-10-18 20:44:16 +02:00
Dominik Jain
367778d37f
Improve high-level policy parametrisation
...
Policy objects are now parametrised by converting the parameter
dataclass instances to kwargs, using some injectable conversions
along the way
2023-10-18 20:44:16 +02:00
Dominik Jain
37dc07e487
Add high-level experiment builder interface
2023-10-18 20:44:05 +02:00
Dominik Jain
4d53d345d6
Ignore Ruff rule RET505, because it sacrifices visual discernibility
of control flow paths for brevity (regarding return statements)
2023-10-09 13:03:19 +02:00
Dominik Jain
3fd60f9e70
Unify PPO configuration objects, use experiment-specific configuration
in mujoco_ppo_hl
2023-10-09 13:02:29 +02:00
Dominik Jain
8ec42009cb
Move RLSamplingConfig to separate module config, fixing cyclic import
2023-10-09 13:02:23 +02:00