39 Commits

Author SHA1 Message Date
maxhuettenrauch
ade85ab32b
Feature/algo eval (#1074)
# Changes

## Dependencies

- New extra "eval"

## Api Extension
- `Experiment` and `ExperimentConfig` now have a `name`, that can
however be overridden when `Experiment.run()` is called
- When building an `Experiment` from an `ExperimentConfig`, the user has
the option to add info about seeds to the name.
- New method in `ExperimentConfig` called
`build_default_seeded_experiments`
- `SamplingConfig` has an explicit training seed, `test_seed` is
inferred.
- New `evaluation` package for repeating the same experiment with
multiple seeds and aggregating the results (important extension!).
Currently in alpha state.
- Loggers can now restore the logged data into python by using the new
`restore_logged_data`

## Breaking Changes
- `AtariEnvFactory` (in examples) now receives explicit train and test
seeds
- `EnvFactoryRegistered` now requires an explicit `test_seed`
- `BaseLogger.prepare_dict_for_logging` is now abstract

---------

Co-authored-by: Maximilian Huettenrauch <m.huettenrauch@appliedai.de>
Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>
Co-authored-by: Michael Panchenko <35432522+MischaPanch@users.noreply.github.com>
2024-04-20 23:25:33 +00:00
Daniel Plop
ca4f74f40e
Allow two (same/different) Batch objs to be tested for equality (#1098)
Closes: https://github.com/thu-ml/tianshou/issues/1086

### Api Extensions

- Batch received new method: `to_numpy_`. #1098
- `to_dict` in Batch supports also non-recursive conversion. #1098
- Batch `__eq__` now implemented, semantic equality check of batches is
now possible. #1098

### Breaking Changes

- The method `to_numpy` in `data.utils.batch.Batch` is not in-place
anymore. Instead, a new method `to_numpy_` does the conversion in-place.
#1098
2024-04-16 18:12:48 +02:00
Daniel Plop
8a0629ded6
Fix mypy issues in tests and examples (#1077)
Closes #952 

- `SamplingConfig` supports `batch_size=None`. #1077
- tests and examples are covered by `mypy`. #1077
- `NetBase` is more used, stricter typing by making it generic. #1077
- `utils.net.common.Recurrent` now receives and returns a
`RecurrentStateBatch` instead of a dict. #1077

---------

Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>
2024-04-03 18:07:51 +02:00
bordeauxred
4f65b131aa
Feat/refactor collector (#1063)
Closes: #1058 

### Api Extensions
- Batch received two new methods: `to_dict` and `to_list_of_dicts`.
#1063
- `Collector`s can now be closed, and their reset is more granular.
#1063
- Trainers can control whether collectors should be reset prior to
training. #1063
- Convenience constructor for `CollectStats` called
`with_autogenerated_stats`. #1063

### Internal Improvements
- `Collector`s rely less on state, the few stateful things are stored
explicitly instead of through a `.data` attribute. #1063
- Introduced a first iteration of a naming convention for vars in
`Collector`s. #1063
- Generally improved readability of Collector code and associated tests
(still quite some way to go). #1063
- Improved typing for `exploration_noise` and within Collector. #1063

### Breaking Changes

- Removed `.data` attribute from `Collector` and its child classes.
#1063
- Collectors no longer reset the environment on initialization. Instead,
the user might have to call `reset`
expicitly or pass `reset_before_collect=True` . #1063
- VectorEnvs now return an array of info-dicts on reset instead of a
list. #1063
- Fixed `iter(Batch(...)` which now behaves the same way as
`Batch(...).__iter__()`. Can be considered a bugfix. #1063

---------

Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>
2024-03-28 18:02:31 +01:00
Michael Panchenko
6746a80f6d
Add publish workflow, first preparation for next release (#1067) 2024-03-04 12:21:49 +01:00
Carlo Cagnetta
ce371ae736
remove old python versions from poetry classifier (#1059) 2024-02-21 15:27:53 +01:00
Dominik Jain
26e210a6ae Apply nbqa only to the docs/ folder and exclude the (old) jupyter_execute folder 2024-02-15 11:39:45 +01:00
Michael Panchenko
33d241a29b
Docs/html doc issues (#1048)
Closes #1005 

## Main changes

2. Load vega-embed things using jupyter-book config 
3. Add vega-embed dependencies as part of local code for offline
development
4. Reduced duplication in benchmark.js
5. Update sphinx, docutils, and jupyter-book

Co-authored-by: carlocagnetta <c.cagnetta@appliedai.de>
2024-02-09 19:43:10 +01:00
Carlo Cagnetta
5fc314bd4b
Docs/use nbqa on notebooks (#1041)
- Added nbqa to pyproject.toml
- Resolved mypy issues on notebooks and related files
- Conducting ruff checks on notebooks
- Add DataclassPPrintMixin for better stats representation
- Improved Notebooks wording and explanations

Resolve: #1004
Related to #974
2024-02-07 17:28:16 +01:00
maxhuettenrauch
5fe9aea798
Update and fix dependencies related to mac install (#1044)
Addresses part of #1015 

### Dependencies

- move jsonargparse and docstring-parser to dependencies to run hl
examples without dev
- create mujoco-py extra for legacy mujoco envs
- updated atari extra
    - removed atari-py and gym dependencies
    - added ALE-py, autorom, and shimmy
- created robotics extra for HER-DDPG

### Mac specific

- only install envpool when not on mac
- mujoco-py not working on macOS newer than Monterey
(https://github.com/openai/mujoco-py/issues/777)
- D4RL also fails due to dependency on mujoco-py
(https://github.com/Farama-Foundation/D4RL/issues/232)

### Other

- reduced training-num/test-num in example files to a number ≤ 20
(examples with 100 led to too many open files)
- rendering for Mujoco envs needs to be fixed on gymnasium side
(https://github.com/Farama-Foundation/Gymnasium/issues/749)

---------

Co-authored-by: Maximilian Huettenrauch <m.huettenrauch@appliedai.de>
Co-authored-by: Michael Panchenko <35432522+MischaPanch@users.noreply.github.com>
2024-02-06 17:06:38 +01:00
Dominik Jain
8188a904af Reintroduce ignored Ruff rules D106 and D205 2024-01-10 15:42:18 +01:00
Michael Panchenko
c50e74f263 Fix rtd build, improvements in task running 2023-12-05 22:42:55 +01:00
Michael Panchenko
19e129d0cf Fix rtd build 2023-12-05 13:23:18 +01:00
Michael Panchenko
0b67447541 Docs: fixing spelling, re-adding spellcheck to pipeline 2023-12-05 13:22:04 +01:00
Michael Panchenko
28fda00b27 Docs: added links to source code, readded some ruff ignore rules 2023-12-04 13:52:46 +01:00
Michael Panchenko
a5685619ce Docs: generate all api docs automatically
Reinstate the -W option
Several overall improvements in docs
Fixed multiple links
2023-12-04 11:48:09 +01:00
Michael Panchenko
006577da08 WIP - restructure doc files 2023-12-04 11:48:09 +01:00
carlocagnetta
6fa536fd46 Update Documentation building 2023-12-04 11:47:08 +01:00
carlocagnetta
8f0c62ace3 Documentation update: jupyter-book running on ReadTheDocs including tutorial notebooks 2023-12-04 11:45:51 +01:00
carlocagnetta
de3a021a0a Setup jupyter-book 2023-12-04 11:41:19 +01:00
carlocagnetta
6a60396839 Add jupyter pkg dependency to poetry env 2023-12-04 11:37:38 +01:00
Dominik Jain
dae4000cd2 Revert "Depend on sensAI instead of copying its utils (logging, string)"
This reverts commit fdb0eba93d81fa5e698770b4f7088c87fc1238da.
2023-11-08 19:11:39 +01:00
Dominik Jain
fdb0eba93d Depend on sensAI instead of copying its utils (logging, string) 2023-10-27 20:15:58 +02:00
Dominik Jain
89ce40edc0 Docs: Add tianshou.highlevel to docs build via auto-generated .rst files 2023-10-18 22:45:23 +02:00
Dominik Jain
9c5ee55644 Merge remote-tracking branch 'origin/master' into feat/high-level-api
Conflicts:
  poetry.lock
2023-10-18 20:44:45 +02:00
Dominik Jain
ee3813b09c Ignore temp scripts and temp folder 2023-10-18 20:44:17 +02:00
Dominik Jain
e6716326bd Make mypy ignore copied util modules string & logging 2023-10-18 20:44:17 +02:00
Dominik Jain
ce26e25923 Handle ruff complaints in string module 2023-10-18 20:44:16 +02:00
Dominik Jain
38cf982034 Disable Ruff rule D205 (blank-line-after-summary)
because it disallows, in particular, class docstrings that consist
only of a summary line
2023-10-18 20:44:16 +02:00
Michael Panchenko
66b7fc542b
Minor dep update (#961)
Support gymnasium >=0.28, small extension of readme
2023-10-09 22:10:09 +02:00
Dominik Jain
4d53d345d6 Ignore Ruff rule RET505, because it sacrifices visual discernability
of control flow paths for brevity (regarding return statements)
2023-10-09 13:03:19 +02:00
Dominik Jain
316eb3c579 Add SAC high-level interface 2023-10-09 13:02:01 +02:00
Dominik Jain
2a1cc6bb55 Enable ruff setting ignore-init-module-imports 2023-10-09 13:01:53 +02:00
Dominik Jain
25c6bbd38c Ignore D106: Missing docstring in public nested class 2023-10-09 13:01:44 +02:00
Dominik Jain
42fc181d74 Add dev dependencies jsonargparse and docstring_parser 2023-10-09 13:01:11 +02:00
Michael Panchenko
b900fdf6f2
Remove kwargs in policy init (#950)
Closes #947 

This removes all kwargs from all policy constructors. While doing that,
I also improved several names and added a whole lot of TODOs.

## Functional changes:

1. Added possibility to pass None as `critic2` and `critic2_optim`. In
fact, the default behavior then should cover the absolute majority of
cases
2. Added a function called `clone_optimizer` as a temporary measure to
support passing `critic2_optim=None`

## Breaking changes:

1. `action_space` is no longer optional. In fact, it already was
non-optional, as there was a ValueError in BasePolicy.init. So now
several examples were fixed to reflect that
2. `reward_normalization` removed from DDPG and children. It was never
allowed to pass it as `True` there, an error would have been raised in
`compute_n_step_reward`. Now I removed it from the interface
3. renamed `critic1` and similar to `critic`, in order to have uniform
interfaces. Note that the `critic` in DDPG was optional for the sole
reason that child classes used `critic1`. I removed this optionality
(DDPG can't do anything with `critic=None`)
4. Several renamings of fields (mostly private to public, so backwards
compatible)

## Additional changes: 
1. Removed type and default declaration from docstring. This kind of
duplication is really not necessary
2. Policy constructors are now only called using named arguments, not a
fragile mixture of positional and named as before
5. Minor beautifications in typing and code 
6. Generally shortened docstrings and made them uniform across all
policies (hopefully)

## Comment:

With these changes, several problems in tianshou's inheritance hierarchy
become more apparent. I tried highlighting them for future work.

---------

Co-authored-by: Dominik Jain <d.jain@appliedai.de>
2023-10-08 08:57:03 -07:00
Anas BELFADIL
c30b4abb8f
Add calibration to CQL as in CalQL paper arXiv:2303.05479 (#915)
- [X] I have marked all applicable categories:
    + [ ] exception-raising fix
    + [ ] algorithm implementation fix
    + [ ] documentation modification
    + [X] new feature
- [X] I have reformatted the code using `make format` (**required**)
- [X] I have checked the code using `make commit-checks` (**required**)
- [X] If applicable, I have mentioned the relevant/related issue(s)
- [X] If applicable, I have listed every items in this Pull Request
below
2023-10-02 22:54:34 -07:00
Michael Panchenko
2cc34fb72b
Poetry install, remove gym, bump python (#925)
Closes #914 

Additional changes:

- Deprecate python below 11
- Remove 3rd party and throughput tests. This simplifies install and
test pipeline
- Remove gym compatibility and shimmy
- Format with 3.11 conventions. In particular, add `zip(...,
strict=True/False)` where possible

Since the additional tests and gym were complicating the CI pipeline
(flaky and dist-dependent), it didn't make sense to work on fixing the
current tests in this PR to then just delete them in the next one. So
this PR changes the build and removes these tests at the same time.
2023-09-05 14:34:23 -07:00
Michael Panchenko
600f4bbd55
Python 3.9, black + ruff formatting (#921)
Preparation for #914 and #920

Changes formatting to ruff and black. Remove python 3.8

## Additional Changes

- Removed flake8 dependencies
- Adjusted pre-commit. Now CI and Make use pre-commit, reducing the
duplication of linting calls
- Removed check-docstyle option (ruff is doing that)
- Merged format and lint. In CI the format-lint step fails if any
changes are done, so it fulfills the lint functionality.

---------

Co-authored-by: Jiayi Weng <jiayi@openai.com>
2023-08-25 14:40:56 -07:00