587 Commits

Author SHA1 Message Date
Dominik Jain
da2194eff6 Force kwargs in PolicyWrapperFactoryIntrinsicCuriosity init 2023-10-26 10:43:59 +02:00
Dominik Jain
96298eafd8 Add convenient construction mechanisms for Environments
(based on factory function for a single environment)
2023-10-25 21:20:07 +02:00
Dominik Jain
dd4a0eb430 Fix: Add MujocoEnvObsRmsPersistence only if obs_norm is enabled 2023-10-24 13:52:30 +02:00
Dominik Jain
58466ebf5d Keep all ExperimentBuilder tests in one place 2023-10-24 13:14:23 +02:00
Dominik Jain
b5a891557f Revert to simplified environment factory, removing unnecessary config object
(configuration shall be part of the factory instance)
2023-10-24 13:14:23 +02:00
Dominik Jain
f7f20649e3 ExperimentConfig: Improve docstrings, remove obsolete item 'render' 2023-10-20 17:34:27 +02:00
Dominik Jain
7437131d79 Fix tianshou.highlevel depending on jsonargparse
(should be dev dependency only) by introducing a new
place where jsonargparse can be configured:
logging.run_cli, which is also slightly more convenient
2023-10-19 11:40:49 +02:00
Dominik Jain
6cbee188b8 Change interface of EnvFactory to ensure that configuration
of number of environments in SamplingConfig is used
(values are now passed to factory method)

This is clearer and removes the need to pass otherwise
unnecessary configuration to environment factories at
construction
2023-10-19 11:37:20 +02:00
Dominik Jain
89ce40edc0 Docs: Add tianshou.highlevel to docs build via auto-generated .rst files 2023-10-18 22:45:23 +02:00
Dominik Jain
bbfad01a9f Improve docstrings 2023-10-18 22:07:40 +02:00
Dominik Jain
193be9a265 Add 'stdout' to spelling dictionary 2023-10-18 21:13:42 +02:00
Dominik Jain
cc6f0162ff miniblock: Fix type annotation of linear_layer 2023-10-18 20:57:43 +02:00
Dominik Jain
9c5ee55644 Merge remote-tracking branch 'origin/master' into feat/high-level-api
Conflicts:
  poetry.lock
2023-10-18 20:44:45 +02:00
Dominik Jain
41bd463a7b Allow configuration of activation function in default networks
* Set ReLU as default in all actor and critic factories
* Configure non-default in applicable MuJoCo examples
2023-10-18 20:44:18 +02:00
Dominik Jain
ed06ab7ff0 Handle obs_norm setting in MuJoCo envs 2023-10-18 20:44:18 +02:00
Dominik Jain
80b1b1ff9d World.restore_path: Add value check 2023-10-18 20:44:18 +02:00
Dominik Jain
c7d0cbb5d3 Experiment: Fix return type annotation, remove unused type arguments 2023-10-18 20:44:18 +02:00
Dominik Jain
ff451f8373 Add documentation to parameters, improve factorisation 2023-10-18 20:44:18 +02:00
Dominik Jain
e63d8d4147 Use ToStringMixin in dataclasses to detect recurring objects in larger object trees 2023-10-18 20:44:18 +02:00
Dominik Jain
d84e936430 Apply centrally defined callbacks 2023-10-18 20:44:18 +02:00
Dominik Jain
ae4850692f DQNExperimentBuilder: Use IntermediateModuleFactory instead of ActorFactory
(similar to IQN implementation)
2023-10-18 20:44:18 +02:00
Dominik Jain
83048788a1 Add generalised DQN network representation, adding specialised class for feature_only=True 2023-10-18 20:44:18 +02:00
Dominik Jain
4b270eaa2d Add documentation, improve structure of 'module' package 2023-10-18 20:44:18 +02:00
Dominik Jain
97e21b5ddf Remove obsolete mixin, improve class names 2023-10-18 20:44:18 +02:00
Dominik Jain
90eaacb606 PolicyWrapperFactory: Remove unnecessary input type variable 2023-10-18 20:44:18 +02:00
Dominik Jain
fc695a5394 Use logging to report trainer epoch status 2023-10-18 20:44:18 +02:00
Dominik Jain
3bba192633 Add experiment result 2023-10-18 20:44:18 +02:00
Dominik Jain
023b33c917 Make mypy happy 2023-10-18 20:44:18 +02:00
Dominik Jain
76e870207d Improve persistence handling
* Add persistence/restoration of Experiment instance
* Add file logging in experiment
* Allow all persistence/logging to be disabled
* Disable persistence in tests
2023-10-18 20:44:18 +02:00
Dominik Jain
ba803296cc Add FileLoggerContext 2023-10-18 20:44:17 +02:00
Dominik Jain
3691ed2abc Support obs_rms persistence for MuJoCo by adding a general mechanism
for attaching persistence to Environments instances
2023-10-18 20:44:17 +02:00
Dominik Jain
f6d49774a2 Reify policy persistence, introducing World representation 2023-10-18 20:44:17 +02:00
Dominik Jain
ee3813b09c Ignore temp scripts and temp folder 2023-10-18 20:44:17 +02:00
Dominik Jain
686fd555b0 Extend tests, fixing some default behaviour 2023-10-18 20:44:17 +02:00
Dominik Jain
a8a367c42d Support IQN in high-level API
* Add example atari_iqn_hl
* Factor out trainer callbacks to new module atari_callbacks
* Extract base class for DQN-based agent factories
* Improved module factory interface design, achieving higher generality
2023-10-18 20:44:17 +02:00
Dominik Jain
213e08a846 Add method get_output_dim to BaseActor 2023-10-18 20:44:17 +02:00
Dominik Jain
c7d0b6b4b2 Simplify agent factories by making better use of base classes 2023-10-18 20:44:17 +02:00
Dominik Jain
799beb79b4 Support discrete SAC in high-level API
* Changed mechanism for reusing actor's preprocessing module in critics
  to avoid special handling in AgentFactory implementations, improving
  separation of concerns:
    - Added CriticFactoryReuseActor as the new critic factory
    - Added ActorFactoryTransientStorageDecorator to pass on the actor
      data
    - Added helper classes ActorFuture, ActorFutureProviderProtocol
* Add example atari_sac_hl
2023-10-18 20:44:17 +02:00
Dominik Jain
305b30a6c1 Simplify parameter transformers by applying ParamTransformerChangeValue 2023-10-18 20:44:17 +02:00
Dominik Jain
17ef4dd5eb Support REDQ in high-level API
* Implement example mujoco_redq_hl
* Add abstraction CriticEnsembleFactory with default implementations
  to suit REDQ
* Fix type annotation of linear_layer in Net, MLP, Critic
  (was incompatible with REDQ usage)
2023-10-18 20:44:17 +02:00
Dominik Jain
7af836bd6a Support TRPO in high-level API and add example mujoco_trpo_hl 2023-10-18 20:44:17 +02:00
Dominik Jain
383a4a6083 Support NPG in high-level API and add example mujoco_npg_hl 2023-10-18 20:44:17 +02:00
Dominik Jain
73a6d15eee Log Environments 2023-10-18 20:44:17 +02:00
Dominik Jain
a8ea6808c3 Fix ruff type comparison complaint 2023-10-18 20:44:17 +02:00
Dominik Jain
1bb52a6a5c Simplify critic/agent with optimizer generation
After adding a function to create ModuleOpt instances directly from
AgentFactory and CriticFactory,
  * several mixins for AgentFactories are no longer needed (deleted)
  * additional abstractions for ModuleOptFactories are no longer needed (deleted)
2023-10-18 20:44:17 +02:00
Dominik Jain
6bb3abb2f0 Support PG/Reinforce in high-level API
* Add example mujoco_reinforce_hl
* Extended functionality of ActorFactory to support creation of ModuleOpt
2023-10-18 20:44:17 +02:00
Dominik Jain
4e93c12afa Remove obsolete configuration files 2023-10-18 20:44:17 +02:00
Dominik Jain
22dfc4ed2e Fix type annotations of dist_fn 2023-10-18 20:44:17 +02:00
Dominik Jain
a161a9cf58 Improve type annotations, fix type issues and add checks 2023-10-18 20:44:17 +02:00
Dominik Jain
e6716326bd Make mypy ignore copied util modules string & logging 2023-10-18 20:44:17 +02:00