Tianshou/tianshou/utils/lr_scheduler.py

import torch


class MultipleLRSchedulers:
    """A wrapper for multiple learning rate schedulers.

    Every time :meth:`~tianshou.utils.MultipleLRSchedulers.step` is called,
    it calls the step() method of each of the schedulers that it contains.
    Example usage:
    ::

        from torch.optim.lr_scheduler import ConstantLR, ExponentialLR

        scheduler1 = ConstantLR(opt1, factor=0.1, total_iters=2)
        scheduler2 = ExponentialLR(opt2, gamma=0.9)
        scheduler = MultipleLRSchedulers(scheduler1, scheduler2)
        policy = PPOPolicy(..., lr_scheduler=scheduler)
    """

    def __init__(self, *args: torch.optim.lr_scheduler.LRScheduler):
        self.schedulers = args

    def step(self) -> None:
        """Take a step in each of the learning rate schedulers."""
        for scheduler in self.schedulers:
            scheduler.step()

    def state_dict(self) -> list[dict]:
        """Return the state_dict of each wrapped learning rate scheduler.

        :return: A list with one state_dict per scheduler, in the same
            order as the schedulers.
        """
        return [s.state_dict() for s in self.schedulers]

    def load_state_dict(self, state_dict: list[dict]) -> None:
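        """Load the state of each wrapped scheduler from state_dict.

        :param state_dict: A list of learning rate scheduler
            state_dicts, in the same order as the schedulers.
        """
        # strict=True raises ValueError if the number of state_dicts
        # does not match the number of schedulers.
        for s, sd in zip(self.schedulers, state_dict, strict=True):
            # Delegate to the scheduler so that subclasses overriding
            # load_state_dict restore their state correctly.
            s.load_state_dict(sd)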