Tianshou/tianshou/exploration/random.py

from abc import ABC, abstractmethod
from typing import Optional, Sequence, Union

import numpy as np


class BaseNoise(ABC):
    """The action noise base class."""

    def __init__(self) -> None:
        super().__init__()

    def reset(self) -> None:
        """Reset to the initial state."""
        pass

    @abstractmethod
    def __call__(self, size: Sequence[int]) -> np.ndarray:
        """Generate new noise."""
        raise NotImplementedError


class GaussianNoise(BaseNoise):
    """Vanilla i.i.d. Gaussian noise, the default exploration noise for DDPG."""

    def __init__(self, mu: float = 0.0, sigma: float = 1.0) -> None:
        super().__init__()
        self._mu = mu
        assert 0 <= sigma, "Noise std should not be negative."
        self._sigma = sigma

    def __call__(self, size: Sequence[int]) -> np.ndarray:
        return np.random.normal(self._mu, self._sigma, size)


class OUNoise(BaseNoise):
    """Class for Ornstein-Uhlenbeck process, as used for exploration in DDPG.

    Usage:
    ::

        # init
        self.noise = OUNoise()
        # generate noise
        noise = self.noise(logits.shape)

    For the required parameters, refer to the stackoverflow page. However,
    our experiments show that (as with OpenAI SpinningUp) using vanilla
    Gaussian noise makes little difference compared with using the
    Ornstein-Uhlenbeck process.
    """

    def __init__(
        self,
        mu: float = 0.0,
        sigma: float = 0.3,
        theta: float = 0.15,
        dt: float = 1e-2,
        x0: Optional[Union[float, np.ndarray]] = None,
    ) -> None:
        super().__init__()
        self._mu = mu
        self._alpha = theta * dt
        self._beta = sigma * np.sqrt(dt)
        self._x0 = x0
        self.reset()

    def reset(self) -> None:
        """Reset to the initial state."""
        self._x = self._x0

    def __call__(self, size: Sequence[int], mu: Optional[float] = None) -> np.ndarray:
        """Generate new noise.

        Return a numpy array whose shape is equal to ``size``.
        """
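        # Restart from zero if there is no state yet or the requested shape
        # changed; tuple(size) keeps the comparison valid when size is a list.
        if self._x is None or (
            isinstance(self._x, np.ndarray) and self._x.shape != tuple(size)
        ):
            self._x = 0.0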
        if mu is None:
            mu = self._mu
        r = self._beta * np.random.normal(size=size)
        self._x = self._x + self._alpha * (mu - self._x) + r
        return self._x  # type: ignore
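
The update in `OUNoise.__call__` is one Euler-Maruyama step of the Ornstein-Uhlenbeck process: the state moves toward `mu` by `theta * dt` of the remaining gap, then receives Gaussian noise scaled by `sigma * sqrt(dt)`. Below is a minimal standalone sketch of that step, not the class itself. The helper names `ou_step` and `ou_path` and the explicit `rng` argument are assumptions made for illustration and do not exist in Tianshou.

```python
import numpy as np


def ou_step(x, mu=0.0, sigma=0.3, theta=0.15, dt=1e-2, rng=None):
    """One discretized OU step (hypothetical helper, mirrors OUNoise.__call__)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    alpha = theta * dt          # fraction of the gap to mu closed per step
    beta = sigma * np.sqrt(dt)  # standard deviation of the per-step noise
    return x + alpha * (mu - x) + beta * rng.standard_normal(x.shape)


def ou_path(n_steps, size, **kwargs):
    """Roll out n_steps of OU noise, starting from a zero state (hypothetical)."""
    x = np.zeros(size)
    path = []
    for _ in range(n_steps):
        x = ou_step(x, **kwargs)
        path.append(x)
    return np.stack(path)  # shape: (n_steps, *size)


# Consecutive samples are correlated because each one starts from the last.
path = ou_path(1000, (2,), rng=np.random.default_rng(0))
print(path.shape)
```

With `sigma=0` the noise term vanishes and the state relaxes geometrically toward `mu`, which makes the mean-reverting part of the update easy to check in isolation.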