Tianshou

hongshaorou/Tianshou

Commit Graph

88e9c9fe6c

Bump urllib3 from 2.1.0 to 2.2.2 (#1162) master dependabot[bot] 2024-06-20 09:45:49 -10:00
cab5e010ac

Bump tornado from 6.3.3 to 6.4.1 (#1158) dependabot[bot] 2024-06-09 15:00:50 -07:00
e5f20438d3

Bump requests from 2.31.0 to 2.32.0 (#1151) dependabot[bot] 2024-05-23 09:55:22 -07:00
40a4ad10c8

[Fix&Enhance&Format] On README.md about poetry install (#1146) coolermzb3 2024-05-24 00:54:53 +08:00
f86fcd3a43

[ENH] Improving Documentation webpage (#1150) Erni 2024-05-23 18:54:16 +02:00
f0b7abe015

Bugfix/parallel launcher for linux (#1141) Michael Panchenko 2024-05-08 11:02:01 +02:00
bf3859a457 Extension of ExpLauncher and DataclassPPrintMixin Michael Panchenko 2024-05-04 22:09:32 +02:00
1cd22f1d32 Added and used new VenvType: SUBPROC_SHARED_MEM_AUTO Michael Panchenko 2024-05-06 21:22:39 +02:00
d58ae163f2

Bump jinja2 from 3.1.3 to 3.1.4 (#1139) dependabot[bot] 2024-05-07 00:16:21 +02:00
aa77f5549a

Bump werkzeug from 3.0.1 to 3.0.3 (#1138) dependabot[bot] 2024-05-07 00:16:02 +02:00
26b867e442

Adjust locations of setting the policy in train/eval mode (#1123) Michael Panchenko 2024-05-06 20:38:19 +02:00
e94a5c04cf New context manager: policy_within_training_step Michael Panchenko 2024-05-06 16:50:48 +02:00
78ea013956 Tests: fixed test_psrl.py: use args.reward_threshold instead of spec Michael Panchenko 2024-05-06 16:16:20 +02:00
6a5b3c837a Docstrings, skip hidden files in autogen_rst Michael Panchenko 2024-05-05 23:31:20 +02:00
f059b65103 Merge branch 'refs/heads/thuml-master' into policy-train-eval Michael Panchenko 2024-05-05 22:33:51 +02:00
d8e5631567 Extended changelog, slightly improved structure Michael Panchenko 2024-05-05 22:26:49 +02:00
2abb4dac24 Reinstated warning module Michael Panchenko 2024-05-05 22:23:13 +02:00
024b80e79c Improve creation of multiple seeded experiments:
    * Add class ExperimentCollection to improve usability
    * Remove parameters from ExperimentBuilder.build
    * Renamed ExperimentBuilder.build_default_seeded_experiments to build_seeded_collection, changing the return type to ExperimentCollection
    * Replace temp_config_mutation (which was not appropriate for the public API) with method copy (which performs a safe deep copy)
    Dominik Jain 2024-04-30 17:22:11 +02:00
35779696ee Clean up handling of an Experiment's name (and, by extension, a run's name) Dominik Jain 2024-04-30 16:12:43 +02:00
a8e9df31f7 Bugfix: allow for training_stat to be None instead of asserting not-None Michael Panchenko 2024-05-05 22:08:22 +02:00
9fbf28ef6e

Improvements pertaining to the handling of multi-experiment creation (#1131) Michael Panchenko 2024-05-05 21:41:53 +02:00
0a7fd1ee8e

Merge branch 'master' into feature/multi-experiment Michael Panchenko 2024-05-05 16:21:26 +02:00
4e38aeb829 Merge branch 'refs/heads/thuml-master' into policy-train-eval Michael Panchenko 2024-05-05 16:03:34 +02:00
82f425e9fe Collector: move @override, removed docstrings from overridden methods Michael Panchenko 2024-05-05 16:01:52 +02:00
26a6cca76e Improved docstrings, added asserts to make mypy happy Michael Panchenko 2024-05-05 15:56:06 +02:00
c5d0e169b5 Collector: removed unnecessary no-grad flag from interfaces. Breaking Michael Panchenko 2024-05-05 15:41:20 +02:00
f876198870 Formatting Michael Panchenko 2024-05-05 15:16:16 +02:00
6927eadaa7 BatchPolicy: check that self.is_within_training_step is True on update Michael Panchenko 2024-05-05 15:14:59 +02:00
2f2d5cb210

Bump tqdm from 4.66.1 to 4.66.3 (#1134) dependabot[bot] 2024-05-05 15:01:46 +02:00
c35be8d07e Establish backward compatibility by implementing __setstate__ Dominik Jain 2024-05-02 18:47:42 +02:00
ca69e79b4a Change the way in which deterministic evaluation is controlled:
    * Remove flag eval_mode from Collector.collect
    * Replace flag is_eval in BasePolicy with is_within_training_step (negating usages) and set it appropriately in BaseTrainer
    Dominik Jain 2024-05-02 18:31:03 +02:00
18f236167f Fix invalid kwarg Dominik Jain 2024-05-02 18:14:26 +02:00
ca4dad1139 BaseTrainer: Refactoring. New method training_step, which
    * collects training data (method _collect_training_data)
    * performs "test in train" (method _test_in_train)
    * performs policy update
    The old method named train_step performed only the first two points and has now been split into two separate methods.
    Dominik Jain 2024-05-02 18:06:01 +02:00
4f16494609 Set torch train mode in BasePolicy.update instead of in each .learn implementation, as this is less prone to errors Dominik Jain 2024-05-02 11:51:08 +02:00
f31a91df5d

Typo docstring (#1132) bordeauxred 2024-05-01 08:59:00 +02:00
606128f29a

Merge branch 'master' into feature/multi-experiment Michael Panchenko 2024-04-30 22:52:45 +02:00
393e55aa58 Improve change log #1129 Dominik Jain 2024-04-30 17:47:06 +02:00
ea0c4f1a30 Update change log with changes from #1131 Dominik Jain 2024-04-30 17:31:48 +02:00
f8cca8b07c Improve creation of multiple seeded experiments:
    * Add class ExperimentCollection to improve usability
    * Remove parameters from ExperimentBuilder.build
    * Renamed ExperimentBuilder.build_default_seeded_experiments to build_seeded_collection, changing the return type to ExperimentCollection
    * Replace temp_config_mutation (which was not appropriate for the public API) with method copy (which performs a safe deep copy)
    Dominik Jain 2024-04-30 17:22:11 +02:00
2b1594a1c8 Clean up handling of an Experiment's name (and, by extension, a run's name) Dominik Jain 2024-04-30 16:12:43 +02:00
61426acf07

Improve the documentation of compute_episodic_return in base policy. (#1130) bordeauxred 2024-04-30 14:40:16 +02:00
a65920fc68

Support Actor preprocessing network reuse for continuous case, fixes in DQN network (#1128) Michael Panchenko 2024-04-29 23:49:52 +02:00
40f772493e Update change log with changes from #1128 Dominik Jain 2024-04-29 22:30:48 +02:00
83083924df Mention CHANGELOG.md in PR template Dominik Jain 2024-04-29 22:14:36 +02:00
8ac6bf5fbb Improve docstrings Dominik Jain 2024-04-29 17:35:14 +02:00
250a129cc4 SamplingConfig: Improve docstrings of replay_buffer_save_only_last_obs, replay_buffer_stack_num Dominik Jain 2024-04-29 17:12:28 +02:00
74737416ff Fix typo Dominik Jain 2024-04-29 14:10:47 +02:00
d18ded333e CriticFactoryReuseActor: Fix the case where we want to reuse an actor's preprocessing network for the critic (must be applied before concatenating the actions) Dominik Jain 2024-04-29 14:09:48 +02:00
0b494845c9 continuous.Critic: Add flag apply_preprocess_net_to_obs_only to allow the preprocessing network to be applied to the observations only (without the actions concatenated), which is essential for the case where we want to reuse the actor's preprocessing network Dominik Jain 2024-04-29 14:06:32 +02:00
18ed981875 Add pickle/serialisation utils: setstate and getstate Dominik Jain 2024-04-29 14:01:56 +02:00
be1c8cd235 DQN:
    * Fix input validation
    * Fix output_dim not being set if features_only=True and output_dim_added_layer not None
    Dominik Jain 2024-04-29 13:37:26 +02:00
a2b9d7c7d8 Changelog [skip-ci] Michael Panchenko 2024-04-26 18:31:02 +02:00
45922712d9 Docstring add return [skip-ci] Michael Panchenko 2024-04-26 18:14:20 +02:00
e2e8a699ea Changelog [skip-ci] Michael Panchenko 2024-04-26 18:11:23 +02:00
6aa33b1bfe Formatting Michael Panchenko 2024-04-26 17:54:14 +02:00
c28508b3be Changelog Michael Panchenko 2024-04-26 17:53:34 +02:00
2eaf1f37c2 Use the new BaseCollector interface for annotations Michael Panchenko 2024-04-26 17:53:27 +02:00
07a97c7d93 Merge branch 'refs/heads/thuml-master' into policy-train-eval Michael Panchenko 2024-04-26 17:44:57 +02:00
69f07a8f12 Tests: fixed typing issues by declaring union types and no longer reusing var names Michael Panchenko 2024-04-26 17:37:12 +02:00
4b619c51ba Collector: extracted interface BaseCollector, minor simplifications Michael Panchenko 2024-04-26 16:46:03 +02:00
12d4262f80 Tests: removed all instances of if __name__ == ... in tests Michael Panchenko 2024-04-26 14:58:58 +02:00
7d59302095 Added in_eval/in_train mode contextmanager Michael Panchenko 2024-04-26 14:45:02 +02:00
829fd9c7a5 Deleted long deprecated functionality, removed unused warning module Michael Panchenko 2024-04-26 14:29:16 +02:00
081adedc32

Changelog + dependabot bumps (#1124) Michael Panchenko 2024-04-25 17:49:54 +02:00
49c750fb09 update tests Maximilian Huettenrauch 2024-04-24 17:06:59 +02:00
8cb17de190 update examples Maximilian Huettenrauch 2024-04-24 17:06:54 +02:00
e499bed8b0 add is_eval attribute to policy and set this attribute as well as train mode in appropriate places Maximilian Huettenrauch 2024-04-24 17:06:42 +02:00
ade85ab32b

Feature/algo eval (#1074) maxhuettenrauch 2024-04-21 01:25:33 +02:00
846ca0ff67 Renamed and commented restore_logged_data in TensorboardLogger [skip-ci] feature/algo-eval Michael Panchenko 2024-04-20 15:09:56 +02:00
9c0b3e7292

use explicit multiprocessing context for creating Pipe in subproc.py (#1102) maxhuettenrauch 2024-04-19 11:08:53 +02:00
a043711c10

Fix/deterministic action space sampling in SubprocVectorEnv (#1103) maxhuettenrauch 2024-04-18 16:16:57 +02:00
6935a111d9

Add non in-place version of Batch.to_torch (#1117) Daniel Plop 2024-04-17 22:07:24 +02:00
ca4f74f40e

Allow two (same/different) Batch objs to be tested for equality (#1098) Daniel Plop 2024-04-16 18:12:48 +02:00
049907d9ab Fix type check in atari wrapper, solves #1111 Michael Panchenko 2024-04-16 10:52:48 +02:00
60d1ba1c8f

Fix/reset before collect in procedural examples, tests and hl experiment (#1100) maxhuettenrauch 2024-04-16 10:30:21 +02:00
766f6fedf2

Fix imports in Readme Molasses 2024-04-15 17:32:35 +08:00
e2a2a6856d

Changed .keys() to get_keys() in batch class (#1105) Erni 2024-04-12 12:15:37 +02:00
03e9af04b7

Update README.md (removed instability warning) [skip ci] Michael Panchenko 2024-04-05 12:05:20 +02:00
bab5c634e7

Missing link in README.md [skip ci] Michael Panchenko 2024-04-05 12:04:27 +02:00
8a0629ded6

Fix mypy issues in tests and examples (#1077) Daniel Plop 2024-04-03 18:07:51 +02:00
60e75e38dc Adjusted launchers to new interface Michael Panchenko 2024-04-03 17:55:22 +02:00
7d479af0bb Experiment: use name attribute during run except if overridden explicitly Michael Panchenko 2024-04-03 17:44:41 +02:00
ed12b16d70 Added contextmanager for ExperimentBuilder modifications Michael Panchenko 2024-04-03 17:28:38 +02:00
85e910ec5d Added launcher interface and registry Michael Panchenko 2024-04-03 17:27:46 +02:00
55fa6f7f35

Don't raise error on len of empty Batch (#1084) Michael Panchenko 2024-04-03 13:37:18 +02:00
f2e10b04bb Merge branch 'thuml_master' into feature/algo-eval Maximilian Huettenrauch 2024-04-02 11:03:38 +02:00
bf0d632108

Naming and typing improvements in Actor/Critic/Policy forwards (#1032) Erni 2024-04-01 17:14:17 +02:00
5bf923c9bd Removed more references to Chinese docs [skip ci] Michael Panchenko 2024-03-28 18:17:25 +01:00
23a33a9aa3 Removed link to Chinese docs [skip ci] Michael Panchenko 2024-03-28 18:13:15 +01:00
ecb272c61b

Update CHANGELOG.md [skip ci] Michael Panchenko 2024-03-28 18:06:00 +01:00
4f65b131aa

Feat/refactor collector (#1063) bordeauxred 2024-03-28 18:02:31 +01:00
929dd10267 Merge branch 'thuml_master' into feature/algo-eval Maximilian Huettenrauch 2024-03-28 14:10:55 +01:00
edae9e4403

fixed env seeding in test_sac_with_il.py (#1081) maxhuettenrauch 2024-03-28 12:52:35 +01:00
ec2c5c19d1 added primitive joblib launcher Maximilian Huettenrauch 2024-03-27 17:38:01 +01:00
9c645ff4a0 pleased the mypy gods Maximilian Huettenrauch 2024-03-27 15:37:19 +01:00
ce5fa0dfac fixed logger test Maximilian Huettenrauch 2024-03-27 13:55:22 +01:00
9055eb5924 removed attributes from pandas logger Maximilian Huettenrauch 2024-03-27 13:55:13 +01:00
6d9b697efe restructured and moved RLiableExperimentResult Maximilian Huettenrauch 2024-03-27 12:03:31 +01:00
18d8ffa576 removed name shortener Maximilian Huettenrauch 2024-03-27 12:02:43 +01:00
e95fa26a14 replace assert with exception in wandb logger Maximilian Huettenrauch 2024-03-27 11:38:55 +01:00