Adjust locations of setting the policy in train/eval mode (#1123)
Michael Panchenko
2024-05-06 20:38:19 +02:00
e94a5c04cf New context manager: policy_within_training_step
Michael Panchenko
2024-05-06 16:50:48 +02:00
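The idea behind a context manager such as policy_within_training_step can be sketched as follows. This is only an illustration, not the project's implementation: the class SketchPolicy and the function body are hypothetical, and the only names taken from this log are policy_within_training_step and the flag is_within_training_step.

```python
from contextlib import contextmanager


class SketchPolicy:
    """Hypothetical stand-in for a policy; only carries the flag named in the log."""

    def __init__(self) -> None:
        self.is_within_training_step = False


@contextmanager
def policy_within_training_step(policy):
    """Mark the policy as being inside a training step for the duration of the block.

    Assumption: the flag is a plain boolean attribute on the policy.
    """
    previous = policy.is_within_training_step
    policy.is_within_training_step = True
    try:
        yield policy
    finally:
        # Restore the previous value even if the body raised.
        policy.is_within_training_step = previous


policy = SketchPolicy()
with policy_within_training_step(policy):
    inside = policy.is_within_training_step
outside = policy.is_within_training_step
```

Restoring the previous value in a finally clause means an exception raised during the training step cannot leave the flag set.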
78ea013956 Tests: fixed test_psrl.py: use args.reward_threshold instead of spec
Michael Panchenko
2024-05-06 16:16:20 +02:00
6a5b3c837a Docstrings, skip hidden files in autogen_rst
Michael Panchenko
2024-05-05 23:31:20 +02:00
f059b65103 Merge branch 'refs/heads/thuml-master' into policy-train-eval
Michael Panchenko
2024-05-05 22:33:51 +02:00
d8e5631567 Extended changelog, slightly improved structure
Michael Panchenko
2024-05-05 22:26:49 +02:00
2abb4dac24 Reinstated warning module
Michael Panchenko
2024-05-05 22:23:13 +02:00
024b80e79c Improve creation of multiple seeded experiments:
* Add class ExperimentCollection to improve usability
* Remove parameters from ExperimentBuilder.build
* Renamed ExperimentBuilder.build_default_seeded_experiments to build_seeded_collection, changing the return type to ExperimentCollection
* Replace temp_config_mutation (which was not appropriate for the public API) with method copy (which performs a safe deep copy)
Dominik Jain
2024-04-30 17:22:11 +02:00
35779696ee Clean up handling of an Experiment's name (and, by extension, a run's name)
Dominik Jain
2024-04-30 16:12:43 +02:00
a8e9df31f7 Bugfix: allow for training_stat to be None instead of asserting not-None
Michael Panchenko
2024-05-05 22:08:22 +02:00
ca69e79b4a Change the way in which deterministic evaluation is controlled:
* Remove flag eval_mode from Collector.collect
* Replace flag is_eval in BasePolicy with is_within_training_step (negating usages) and set it appropriately in BaseTrainer
Dominik Jain
2024-05-02 18:31:03 +02:00
ca4dad1139 BaseTrainer: Refactoring
New method training_step, which
* collects training data (method _collect_training_data)
* performs "test in train" (method _test_in_train)
* performs policy update
The old method named train_step performed only the first two points and has now been split into two separate methods
Dominik Jain
2024-05-02 18:06:01 +02:00
4f16494609 Set torch train mode in BasePolicy.update instead of in each .learn implementation, as this is less prone to errors
Dominik Jain
2024-05-02 11:51:08 +02:00
ea0c4f1a30 Update change log with changes from #1131
Dominik Jain
2024-04-30 17:31:48 +02:00
f8cca8b07c Improve creation of multiple seeded experiments:
* Add class ExperimentCollection to improve usability
* Remove parameters from ExperimentBuilder.build
* Renamed ExperimentBuilder.build_default_seeded_experiments to build_seeded_collection, changing the return type to ExperimentCollection
* Replace temp_config_mutation (which was not appropriate for the public API) with method copy (which performs a safe deep copy)
Dominik Jain
2024-04-30 17:22:11 +02:00
2b1594a1c8 Clean up handling of an Experiment's name (and, by extension, a run's name)
Dominik Jain
2024-04-30 16:12:43 +02:00
d18ded333e CriticFactoryReuseActor: Fix the case where we want to reuse an actor's preprocessing network for the critic (must be applied before concatenating the actions)
Dominik Jain
2024-04-29 14:09:48 +02:00
0b494845c9 continuous.Critic: Add flag apply_preprocess_net_to_obs_only to allow the preprocessing network to be applied to the observations only (without the actions concatenated), which is essential for the case where we want to reuse the actor's preprocessing network
Dominik Jain
2024-04-29 14:06:32 +02:00
be1c8cd235 DQN:
* Fix input validation
* Fix output_dim not being set if features_only=True and output_dim_added_layer not None
Dominik Jain
2024-04-29 13:37:26 +02:00
a2b9d7c7d8 Changelog [skip-ci]
Michael Panchenko
2024-04-26 18:31:02 +02:00
45922712d9 Docstring: add return [skip-ci]
Michael Panchenko
2024-04-26 18:14:20 +02:00
e2e8a699ea Changelog [skip-ci]
Michael Panchenko
2024-04-26 18:11:23 +02:00
6aa33b1bfe Formatting
Michael Panchenko
2024-04-26 17:54:14 +02:00
c28508b3be Changelog
Michael Panchenko
2024-04-26 17:53:34 +02:00
2eaf1f37c2 Use the new BaseCollector interface for annotations
Michael Panchenko
2024-04-26 17:53:27 +02:00
07a97c7d93 Merge branch 'refs/heads/thuml-master' into policy-train-eval
Michael Panchenko
2024-04-26 17:44:57 +02:00
69f07a8f12 Tests: fixed typing issues by declaring union types and no longer reusing var names
Michael Panchenko
2024-04-26 17:37:12 +02:00
4b619c51ba Collector: extracted interface BaseCollector, minor simplifications
Michael Panchenko
2024-04-26 16:46:03 +02:00
12d4262f80 Tests: removed all instances of if __name__ == ... in tests
Michael Panchenko
2024-04-26 14:58:58 +02:00
7d59302095 Added in_eval/in_train mode contextmanager
Michael Panchenko
2024-04-26 14:45:02 +02:00
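A minimal sketch of what an in_eval/in_train mode context manager might look like, for illustration only. The names in_eval_mode, in_train_mode and TinyModule are hypothetical and not taken from the repository; TinyModule merely mimics the `training` attribute and `train(mode)` method of torch.nn.Module so that the sketch runs without torch installed.

```python
from contextlib import contextmanager


class TinyModule:
    """Hypothetical stand-in mimicking torch.nn.Module's `training` flag and `train(mode)`."""

    def __init__(self) -> None:
        self.training = True

    def train(self, mode: bool = True) -> "TinyModule":
        self.training = mode
        return self


@contextmanager
def in_eval_mode(module):
    """Temporarily switch the module to eval mode, then restore its previous mode."""
    was_training = module.training
    module.train(False)
    try:
        yield module
    finally:
        module.train(was_training)


@contextmanager
def in_train_mode(module):
    """Temporarily switch the module to train mode, then restore its previous mode."""
    was_training = module.training
    module.train(True)
    try:
        yield module
    finally:
        module.train(was_training)
```

Because the previous mode is recorded on entry and restored on exit, the two managers can be nested without one overriding the mode the other expects to find afterwards.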
829fd9c7a5 Deleted long deprecated functionality, removed unused warning module
Michael Panchenko
2024-04-26 14:29:16 +02:00
Changelog + dependabot bumps (#1124)
Michael Panchenko
2024-04-25 17:49:54 +02:00
49c750fb09 update tests
Maximilian Huettenrauch
2024-04-24 17:06:59 +02:00
8cb17de190 update examples
Maximilian Huettenrauch
2024-04-24 17:06:54 +02:00
e499bed8b0 add is_eval attribute to policy and set this attribute as well as train mode in appropriate places
Maximilian Huettenrauch
2024-04-24 17:06:42 +02:00