603 Commits

Author SHA1 Message Date
Dominik Jain
022cfb7f78 Cleaned up handling of output_dim retrieval, adding exceptions for erroneous cases 2024-01-16 14:52:31 +01:00
Dominik Jain
20074931d5 Improve docstrings 2024-01-16 14:52:31 +01:00
Dominik Jain
05a8cf4e74 Refactoring, improving class name EnvFactoryGymnasium -> EnvFactoryRegistered 2024-01-16 14:52:31 +01:00
Dominik Jain
c9cb41bf55 Make envpool usage configuration more explicit 2024-01-16 14:52:31 +01:00
Dominik Jain
a4d7ccba26 Remove PyTorch warning from README 2024-01-16 13:43:14 +01:00
Dominik Jain
be9eb7e241 Improve language in README 2024-01-16 13:43:14 +01:00
Dominik Jain
3c564e82b7 Remove video from procedural example as it pertains to a different algorithm 2024-01-16 13:43:14 +01:00
Dominik Jain
2c72171fca Update procedural example in README 2024-01-16 13:43:14 +01:00
Dominik Jain
62d58faa02 Add example from README (with minor updates) 2024-01-16 13:43:14 +01:00
Dominik Jain
39f3ba2266 Add screen recording of high-level example 2024-01-16 13:43:14 +01:00
Dominik Jain
961e9a7801 Add high-level example to README 2024-01-16 13:43:14 +01:00
Dominik Jain
8d6df2b276 Add high-level discrete example (CartPole) for README 2024-01-12 17:13:50 +01:00
Dominik Jain
1e5ebc2a2d Improve naming of callback classes and related methods/attributes
Add EpochStopCallbackRewardThreshold
2024-01-12 17:13:42 +01:00
Dominik Jain
24b7b82e56 Remove inappropriate warning (warns about supported case according to docstring) 2024-01-12 17:13:42 +01:00
Dominik Jain
ff398beed9 Move callbacks for setting DQN epsilon values to the library 2024-01-12 17:13:42 +01:00
Dominik Jain
63269fe198 Implement make_atari_env via AtariEnvFactory, eliminating duplication 2024-01-12 17:13:42 +01:00
Dominik Jain
19a98c3b2a Fix models using scale_obs not being persistable (due to locally defined class) 2024-01-12 17:13:42 +01:00
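As a general illustration of why a locally defined class breaks persistence (this is not the Tianshou code; all names below are hypothetical): `pickle` stores classes by their qualified name, so an instance of a class defined inside a function cannot be pickled, whereas an instance of a module-level class can.

```python
import pickle


class ScaledObsNet:
    """Module-level class: pickle can locate it by its qualified name."""

    def __init__(self, scale: float) -> None:
        self.scale = scale


def make_local_net(scale: float) -> object:
    """Return an instance of a class defined inside this function."""

    class LocalScaledObsNet:
        def __init__(self, scale: float) -> None:
            self.scale = scale

    return LocalScaledObsNet(scale)


def is_picklable(obj: object) -> bool:
    """Report whether pickle.dumps succeeds for obj."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


if __name__ == "__main__":
    print(is_picklable(ScaledObsNet(255.0)))    # module-level class
    print(is_picklable(make_local_net(255.0)))  # locally defined class
```

Moving the class to module level is the usual remedy for this kind of failure.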
Dominik Jain
7fa588309b Update MuJoCo examples to use Ant-v4 instead of Ant-v3 2024-01-12 17:13:42 +01:00
Dominik Jain
eaab7b0a4b Improve environment factory abstractions in high-level API:
 * EnvFactory now uses the creation of a single environment as
   the basic functionality which the more high-level functions build
   upon
 * Introduce enum EnvMode to indicate the purpose for which an env
   is created, allowing the factory creation process to change its
   behaviour accordingly
 * Add EnvFactoryGymnasium to provide direct support for envs that
   can be created via gymnasium.make
     - EnvPool is supported via an injectable EnvPoolFactory
     - Existing EnvFactory implementations are now derived from
       EnvFactoryGymnasium
 * Use a separate environment (which uses new EnvMode.WATCH) for
   watching agent performance after training (instead of using test
   environments, which the user may want to configure differently)
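
The structure described above could look roughly like the following sketch. Only the names `EnvFactory`, `EnvMode` and `EnvMode.WATCH` come from the commit message. The `TRAIN` and `TEST` members, the method names `create_env` and `create_envs`, and the `DummyEnv` and `DummyEnvFactory` classes are assumptions made for illustration, not the library's actual API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from enum import Enum


class EnvMode(Enum):
    """Purpose for which an environment is created (TRAIN and TEST are assumed)."""

    TRAIN = "train"
    TEST = "test"
    WATCH = "watch"


@dataclass
class DummyEnv:
    """Hypothetical stand-in for a real environment object."""

    task: str
    render: bool


class EnvFactory(ABC):
    """Creating a single env is the primitive; multi-env creation builds on it."""

    @abstractmethod
    def create_env(self, mode: EnvMode) -> DummyEnv:
        ...

    def create_envs(self, num_envs: int, mode: EnvMode) -> list[DummyEnv]:
        return [self.create_env(mode) for _ in range(num_envs)]


class DummyEnvFactory(EnvFactory):
    def __init__(self, task: str) -> None:
        self.task = task

    def create_env(self, mode: EnvMode) -> DummyEnv:
        # The factory may change its behaviour per mode, e.g. render only when watching.
        return DummyEnv(task=self.task, render=mode is EnvMode.WATCH)


factory = DummyEnvFactory("CartPole-v1")
train_envs = factory.create_envs(4, EnvMode.TRAIN)
watch_env = factory.create_env(EnvMode.WATCH)
```

With a dedicated watch mode, the test environments can stay configured for evaluation while the watch environment alone enables rendering.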
2024-01-12 17:13:42 +01:00
Dominik Jain
8188a904af Reintroduce ignored Ruff rules D106 and D205 2024-01-10 15:42:18 +01:00
Dominik Jain
d4e4f4ff63 Experiment builders for DQN and IQN:
 * Fix: Disable softmax in default models
 * Add method with_model_factory_default (for DQN)
2024-01-10 15:42:18 +01:00
Dominik Jain
f77d95da04 Fix: Missing type annotation of Experiment.watch_num_episodes 2024-01-08 18:00:37 +01:00
Dominik Jain
97a241a6fc Fix: DiscreteEnvironments.from_factory used incorrect EnvType 2024-01-08 15:58:41 +01:00
maxhuettenrauch
522f7fbf98
Feature/dataclasses (#996)
This PR adds strict typing to the output of `update` and `learn` in all
policies. This will likely be the last large refactoring PR before the
next release (0.6.0, not 1.0.0), so it requires some attention. Several
difficulties were encountered on the path to that goal:

1. The policy hierarchy is actually "broken" in the sense that the keys
of dicts that were output by `learn` did not follow the same enhancement
(inheritance) pattern as the policies. This is a real problem and should
be addressed in the near future. Generally, several aspects of the
policy design and hierarchy might deserve a dedicated discussion.
2. Each policy needs to be generic in the stats return type, because one
might want to extend it at some point and then also extend the stats.
Even within the source code base this pattern is necessary in many
places.
3. The interaction between learn and update is a bit quirky; we
currently handle it by having update modify a special field inside
TrainingStats, whereas all other fields are handled by learn.
4. The ICM module is a policy wrapper and required a
TrainingStatsWrapper. The latter relies on a bunch of black magic.

They were addressed by:
1. Live with the broken hierarchy, which is now made visible by bounds
in generics. We use type: ignore where appropriate.
2. Make all policies generic with bounds following the policy
inheritance hierarchy (which is incorrect, see above). We experimented a
bit with nested TrainingStats classes, but that seemed to add more
complexity and be harder to understand. Unfortunately, mypy thinks that
the code below is wrong, which is why we have to add `type: ignore` to the
return of each `learn`:

```python
from typing import TypeVar

T = TypeVar("T", bound=int)


def f() -> T:
    return 3  # mypy rejects this line
```

3. See above
4. Write representative tests for the `TrainingStatsWrapper`. Still, the
black magic might cause nasty surprises down the line (I am not proud of
it)...
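
A minimal sketch of the pattern from point 2, and of why mypy rejects the snippet above: the type variable is bound by the caller or subclass, not by the function body, so `T` could be a subclass of `int` such as `bool`, and returning a plain `int` is not guaranteed to match. The class and field names below are hypothetical and are not the definitions introduced by this PR; `cast` is used here in place of `type: ignore`.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, cast


@dataclass
class TrainingStats:
    """Base stats; train_time is set by update, the rest by learn (assumed fields)."""

    loss: float = 0.0
    train_time: float = 0.0


@dataclass
class ActorCriticStats(TrainingStats):
    """Extended stats for a more specialised policy."""

    critic_loss: float = 0.0


TStats = TypeVar("TStats", bound=TrainingStats)


class BasePolicy(Generic[TStats]):
    def learn(self) -> TStats:
        # A concrete TrainingStats is returned where TStats is declared, so the
        # checker needs a cast (or `type: ignore`): a subclass may bind TStats
        # to a narrower type.
        return cast(TStats, TrainingStats(loss=1.0))

    def update(self) -> TStats:
        stats = self.learn()
        stats.train_time = 0.5  # update fills in the special field
        return stats


TACStats = TypeVar("TACStats", bound=ActorCriticStats)


class ActorCriticPolicy(BasePolicy[TACStats]):
    def learn(self) -> TACStats:
        return cast(TACStats, ActorCriticStats(loss=1.0, critic_loss=2.0))


policy: ActorCriticPolicy[ActorCriticStats] = ActorCriticPolicy()
stats = policy.update()
```

The bounds of the type variables mirror the policy inheritance hierarchy, which is the design the description above settles on.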

Closes #933

---------

Co-authored-by: Maximilian Huettenrauch <m.huettenrauch@appliedai.de>
Co-authored-by: Michael Panchenko <m.panchenko@appliedai.de>
2023-12-30 11:09:03 +01:00
Michael Panchenko
5d09645a2c
High-level API improvements (#1014)
- [X] I have added the correct label(s) to this Pull Request or linked
the relevant issue(s)
- [X] I have provided a description of the changes in this Pull Request
- [X] I have added documentation for my changes
- [ ] If applicable, I have added tests to cover my changes.
- [X] I have reformatted the code using `poe format` 
- [X] I have checked style and types with `poe lint` and `poe
type-check`
- [ ] (Optional) I ran tests locally with `poe test`
(or a subset of them with `poe test-reduced`), and they pass
- [X] (Optional) I have tested that documentation builds correctly with
`poe doc-build`

Changes in this PR (see individual commits):
* Fix: SamplingConfig.start_timesteps_random was not used
* Environments: Support use of different test environment factory in
convenience constructors `from_factory*`
* SamplingConfig: Improve/extend docstrings, clearly explaining the
parameters
* SamplingConfig: Change default of repeat_per_collect to 1
* Improve logging
* Fix doc-build on Windows
2023-12-21 10:04:14 -06:00
Dominik Jain
da333d8a85 Fix incorrect use of platform-specific path separator 2023-12-21 13:13:51 +01:00
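As a general illustration (the actual change in this commit is not shown here), joining path components with the platform separator yields backslashes on Windows, which breaks anything that expects forward slashes, such as URLs or documentation links. The pure path classes in `pathlib` make the difference explicit; the path components below are made up.

```python
from pathlib import PurePosixPath, PureWindowsPath

parts = ("docs", "api", "index.rst")

# What str() of a native path looks like on Windows: backslash-separated.
windows_style = str(PureWindowsPath(*parts))

# Platform-independent form with forward slashes, safe for URLs and doc links.
portable = PureWindowsPath(*parts).as_posix()

# Building the path with POSIX semantics gives the same result on every platform.
posix_style = str(PurePosixPath(*parts))
```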
Dominik Jain
e8cc80f990 Environments: Add option to use a different factory for test envs
to `from_factory` convenience construction mechanisms
2023-12-21 13:13:51 +01:00
Dominik Jain
45a1a3f259 SamplingConfig: Change default of repeat_per_collect to 1 (safest option) 2023-12-21 13:13:51 +01:00
Dominik Jain
408d51f9de SamplingConfig: Improve/extend docstrings, clearly explaining the parameters 2023-12-21 13:13:51 +01:00
Michael Yang
294145aa3d
Fix an example code in readme (#1011)
Simple fix of an error
2023-12-14 22:46:56 -08:00
Carlo Cagnetta
b7df31f2a7
Docs/fix trainer fct notebooks (#1009)
This PR resolves #1008
2023-12-14 19:31:53 +01:00
Dominik Jain
1903a72ecb Improve logging 2023-12-14 19:31:30 +01:00
Dominik Jain
3caa3805f0 Fix: SamplingConfig.start_timesteps_random was not used 2023-12-14 11:47:32 +01:00
dependabot[bot]
ea48cc2989
Bump jupyter-server from 2.10.1 to 2.11.2 (#1003)
Bumps [jupyter-server](https://github.com/jupyter-server/jupyter_server)
from 2.10.1 to 2.11.2.
Changelog (sourced from [jupyter-server's changelog](https://github.com/jupyter-server/jupyter_server/blob/main/CHANGELOG.md)):

## 2.11.2

([Full Changelog](https://github.com/jupyter-server/jupyter_server/compare/v2.11.1))

### Contributors to this release

([GitHub contributors page for this release](https://github.com/jupyter-server/jupyter_server/graphs/contributors?from=2023-11-27&to=2023-12-04&type=c))

## 2.11.1

([Full Changelog](https://github.com/jupyter-server/jupyter_server/compare/v2.11.0...40a95e5f39d3f167bebf9232da9fab64818ba97d))

### Bugs fixed

- avoid unhandled error on some invalid paths [#1369](https://redirect.github.com/jupyter-server/jupyter_server/pull/1369) ([`@minrk`](https://github.com/minrk))
- Change md5 to hash and hash_algorithm, fix incompatibility [#1367](https://redirect.github.com/jupyter-server/jupyter_server/pull/1367) ([`@Wh1isper`](https://github.com/Wh1isper))

### Contributors to this release

([GitHub contributors page for this release](https://github.com/jupyter-server/jupyter_server/graphs/contributors?from=2023-11-21&to=2023-11-27&type=c))

[`@blink1073`](https://github.com/search?q=repo%3Ajupyter-server%2Fjupyter_server+involves%3Ablink1073+updated%3A2023-11-21..2023-11-27&type=Issues)
| [`@fcollonval`](https://github.com/search?q=repo%3Ajupyter-server%2Fjupyter_server+involves%3Afcollonval+updated%3A2023-11-21..2023-11-27&type=Issues)
| [`@minrk`](https://github.com/search?q=repo%3Ajupyter-server%2Fjupyter_server+involves%3Aminrk+updated%3A2023-11-21..2023-11-27&type=Issues)
| [`@Wh1isper`](https://github.com/search?q=repo%3Ajupyter-server%2Fjupyter_server+involves%3AWh1isper+updated%3A2023-11-21..2023-11-27&type=Issues)

## 2.11.0

([Full Changelog](https://github.com/jupyter-server/jupyter_server/compare/v2.10.1...e7c0f331d4cbf82eb1a9e9bc6c260faabda0255a))

### Enhancements made

- Support get file(notebook) md5 [#1363](https://redirect.github.com/jupyter-server/jupyter_server/pull/1363) ([`@Wh1isper`](https://github.com/Wh1isper))

### Maintenance and upkeep improvements

- Update ruff and typings [#1365](https://redirect.github.com/jupyter-server/jupyter_server/pull/1365) ([`@blink1073`](https://github.com/blink1073))

### Documentation improvements

- Update api docs with md5 param [#1364](https://redirect.github.com/jupyter-server/jupyter_server/pull/1364) ([`@Wh1isper`](https://github.com/Wh1isper))
- typo: ServerApp [#1361](https://redirect.github.com/jupyter-server/jupyter_server/pull/1361) ([`@IITII`](https://github.com/IITII))

### Contributors to this release

([GitHub contributors page for this release](https://github.com/jupyter-server/jupyter_server/graphs/contributors?from=2023-11-15&to=2023-11-21&type=c))

[`@blink1073`](https://github.com/search?q=repo%3Ajupyter-server%2Fjupyter_server+involves%3Ablink1073+updated%3A2023-11-15..2023-11-21&type=Issues)
| [`@IITII`](https://github.com/search?q=repo%3Ajupyter-server%2Fjupyter_server+involves%3AIITII+updated%3A2023-11-15..2023-11-21&type=Issues)
| [`@welcome`](https://github.com/search?q=repo%3Ajupyter-server%2Fjupyter_server+involves%3Awelcome+updated%3A2023-11-15..2023-11-21&type=Issues)
| [`@Wh1isper`](https://github.com/search?q=repo%3Ajupyter-server%2Fjupyter_server+involves%3AWh1isper+updated%3A2023-11-15..2023-11-21&type=Issues)

Commits:

- `9bd9657` Publish 2.11.2
- `0056c3a` Merge pull request from GHSA-h56g-gq9v-vc8r
- `88eca99` Bump to 2.12.0.dev0
- `3755794` Publish 2.11.1
- `40a95e5` avoid unhandled error on some invalid paths ([#1369](https://redirect.github.com/jupyter-server/jupyter_server/issues/1369))
- `ecd5b1f` Change md5 to hash and hash_algorithm, fix incompatibility ([#1367](https://redirect.github.com/jupyter-server/jupyter_server/issues/1367))
- `8e5d766` Bump to 2.12.0.dev0
- `cc74bb6` Publish 2.11.0
- `e7c0f33` Update api docs with md5 param ([#1364](https://redirect.github.com/jupyter-server/jupyter_server/issues/1364))
- `0983b71` Update ruff and typings ([#1365](https://redirect.github.com/jupyter-server/jupyter_server/issues/1365))
- Additional commits viewable in [compare view](https://github.com/jupyter-server/jupyter_server/compare/v2.10.1...v2.11.2)


Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-12-06 12:24:43 +01:00
Michael Panchenko
34f89995f1
Docs/overhaul (#999)
Closes #916 

This PR presents an overhaul of how the docs are built and presented

1. Notebooks are no longer just links in some drive. They are checked in
without their outputs, executed in CI, and thereby serve as integration
tests as well as tutorials. They have been adjusted to work with the
current master branch
2. Execution of notebooks is cached, so it's very fast
3. The api docs are generated automatically with a custom script.
Previously this was only done for the highlevel module
4. The build is happening with jupyter-book (which still uses sphinx in
the backend). It is using the default jupyter book theme, which I think
looks very nice and adds useful navigation to the right side of the
screen
5. Customized api docs rendering for better appearance
6. The toc of the docs is built automatically with jupyter-book. The api
docs generation script has been adjusted accordingly
7. The viewcode and linkcode extensions add source code and links to it
to the docs
8. A bunch of docstrings have been adjusted to better reflect the
configured rules
9. Several typing issues improved to make mypy happy

It was quite a piece of work, I hope you like the result :)
2023-12-06 09:55:46 +01:00
Michael Panchenko
4c24dc6441 Formatting 2023-12-05 23:46:54 +01:00
Michael Panchenko
5f4a02cc69 Docs: improve API landing page 2023-12-05 23:28:29 +01:00
Michael Panchenko
9d1440752e Deal with .jupyter_cache 2023-12-05 22:52:45 +01:00
Michael Panchenko
c50e74f263 Fix rtd build, improvements in task running 2023-12-05 22:42:55 +01:00
Michael Panchenko
19e129d0cf Fix rtd build 2023-12-05 13:23:18 +01:00
Michael Panchenko
0b67447541 Docs: fixing spelling, re-adding spellcheck to pipeline 2023-12-05 13:22:04 +01:00
Michael Panchenko
a846b52063 Typing: fixed multiple typing issues 2023-12-05 12:04:18 +01:00
Michael Panchenko
2e39a252e3 Docstring: minor changes to let ruff pass 2023-12-04 13:52:46 +01:00
Michael Panchenko
28fda00b27 Docs: added links to source code, readded some ruff ignore rules 2023-12-04 13:52:46 +01:00
Michael Panchenko
b12983622b Docs: added sorting order for autogenerated toc 2023-12-04 13:52:46 +01:00
Michael Panchenko
5af29475e8 Docs: removed capitalization 2023-12-04 11:48:10 +01:00
Michael Panchenko
4cfefcf75d Docs: removed conflicting sphinx stuff from a docstring 2023-12-04 11:48:09 +01:00
Michael Panchenko
a5685619ce Docs: generate all api docs automatically
Reinstate the -W option
Several overall improvements in docs
Fixed multiple links
2023-12-04 11:48:09 +01:00
Michael Panchenko
006577da08 WIP - restructure doc files 2023-12-04 11:48:09 +01:00
Michael Panchenko
d4b6d9b250 WIP - restructure doc files 2023-12-04 11:47:40 +01:00