Tianshou/tianshou/highlevel/optim.py

from abc import ABC, abstractmethod
from typing import Any

import torch
from torch.optim import Adam, RMSprop

from tianshou.utils.string import ToStringMixin


class OptimizerFactory(ABC, ToStringMixin):
    # TODO: Is it OK to assume that all optimizers have a learning rate argument?
    # Right now, the learning rate is typically a configuration parameter.
    # If we drop the assumption, we can't have that and will need to move the parameter
    # to the optimizer factory, which is inconvenient for the user.
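    @abstractmethod
    def create_optimizer(self, module: torch.nn.Module, lr: float) -> torch.optim.Optimizer:
        """Create an optimizer for the parameters of the given module.

        :param module: the module whose parameters are to be optimized
        :param lr: the learning rate
        :return: the optimizer
        """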


class OptimizerFactoryTorch(OptimizerFactory):
    def __init__(self, optim_class: Any, **kwargs: Any):
        """:param optim_class: the optimizer class (e.g. subclass of `torch.optim.Optimizer`),
            which will be passed the module parameters, the learning rate as `lr` and the
            kwargs provided.
        :param kwargs: keyword arguments to provide at optimizer construction
        """
        self.optim_class = optim_class
        self.kwargs = kwargs

    def create_optimizer(self, module: torch.nn.Module, lr: float) -> torch.optim.Optimizer:
        return self.optim_class(module.parameters(), lr=lr, **self.kwargs)


class OptimizerFactoryAdam(OptimizerFactory):
    def __init__(
        self,
        betas: tuple[float, float] = (0.9, 0.999),
        eps: float = 1e-08,
        weight_decay: float = 0,
    ):
        self.weight_decay = weight_decay
        self.eps = eps
        self.betas = betas

    def create_optimizer(self, module: torch.nn.Module, lr: float) -> Adam:
        return Adam(
            module.parameters(),
            lr=lr,
            betas=self.betas,
            eps=self.eps,
            weight_decay=self.weight_decay,
        )


class OptimizerFactoryRMSprop(OptimizerFactory):
    def __init__(
        self,
        alpha: float = 0.99,
        eps: float = 1e-08,
        weight_decay: float = 0,
        momentum: float = 0,
        centered: bool = False,
    ):
        self.alpha = alpha
        self.momentum = momentum
        self.centered = centered
        self.weight_decay = weight_decay
        self.eps = eps

    def create_optimizer(self, module: torch.nn.Module, lr: float) -> RMSprop:
        return RMSprop(
            module.parameters(),
            lr=lr,
            alpha=self.alpha,
            eps=self.eps,
            weight_decay=self.weight_decay,
            momentum=self.momentum,
            centered=self.centered,
        )
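
The generic factory above stores an optimizer class plus keyword arguments and only builds the optimizer once a module and a learning rate are available. The following is a minimal, dependency-free sketch of that flow, not part of Tianshou: `FakeModule`, `FakeOptimizer` and `SketchFactory` are hypothetical stand-ins for `torch.nn.Module`, an optimizer class and `OptimizerFactoryTorch`. It also illustrates one consequence of passing `lr` explicitly: supplying `lr` among the stored kwargs as well should raise a `TypeError` at construction time.

```python
from typing import Any, Iterable, Iterator


class FakeModule:
    """Hypothetical stand-in for torch.nn.Module: it only exposes parameters()."""

    def __init__(self, params: Iterable[float]) -> None:
        self._params = list(params)

    def parameters(self) -> Iterator[float]:
        return iter(self._params)


class FakeOptimizer:
    """Hypothetical stand-in for an optimizer class; it just records its arguments."""

    def __init__(self, params: Iterable[float], lr: float, **kwargs: Any) -> None:
        self.params = list(params)
        self.lr = lr
        self.kwargs = kwargs


class SketchFactory:
    """Mirrors the generic factory: store class and kwargs now, construct later."""

    def __init__(self, optim_class: Any, **kwargs: Any) -> None:
        self.optim_class = optim_class
        self.kwargs = kwargs

    def create_optimizer(self, module: Any, lr: float) -> Any:
        return self.optim_class(module.parameters(), lr=lr, **self.kwargs)


module = FakeModule([0.1, 0.2])
factory = SketchFactory(FakeOptimizer, momentum=0.9)

# lr comes from the call site; momentum comes from the factory's stored kwargs.
optimizer = factory.create_optimizer(module, lr=1e-3)

# Storing lr in the factory kwargs clashes with the explicit lr argument.
try:
    SketchFactory(FakeOptimizer, lr=0.5).create_optimizer(module, lr=1e-3)
    lr_clash = None
except TypeError as exc:
    lr_clash = exc
```

With a real optimizer class the same pattern would apply, since the factory only requires a callable that accepts the parameters, `lr`, and the stored keyword arguments.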