Tianshou/tianshou/utils/logger/tensorboard.py
Michael Panchenko b900fdf6f2
Remove kwargs in policy init (#950)
Closes #947 

This removes all kwargs from all policy constructors. While doing that,
I also improved several names and added a whole lot of TODOs.

## Functional changes:

1. Added possibility to pass None as `critic2` and `critic2_optim`. In
fact, the default behavior then should cover the absolute majority of
cases
2. Added a function called `clone_optimizer` as a temporary measure to
support passing `critic2_optim=None`

## Breaking changes:

1. `action_space` is no longer optional. In fact, it already was
non-optional, as there was a ValueError in BasePolicy.init. So now
several examples were fixed to reflect that
2. `reward_normalization` removed from DDPG and children. It was never
allowed to pass it as `True` there, an error would have been raised in
`compute_n_step_reward`. Now I removed it from the interface
3. renamed `critic1` and similar to `critic`, in order to have uniform
interfaces. Note that the `critic` in DDPG was optional for the sole
reason that child classes used `critic1`. I removed this optionality
(DDPG can't do anything with `critic=None`)
4. Several renamings of fields (mostly private to public, so backwards
compatible)

## Additional changes: 
1. Removed type and default declaration from docstring. This kind of
duplication is really not necessary
2. Policy constructors are now only called using named arguments, not a
fragile mixture of positional and named as before
5. Minor beautifications in typing and code 
6. Generally shortened docstrings and made them uniform across all
policies (hopefully)

## Comment:

With these changes, several problems in tianshou's inheritance hierarchy
become more apparent. I tried highlighting them for future work.

---------

Co-authored-by: Dominik Jain <d.jain@appliedai.de>
2023-10-08 08:57:03 -07:00

95 lines
3.5 KiB
Python

from collections.abc import Callable
from typing import Any
from tensorboard.backend.event_processing import event_accumulator
from torch.utils.tensorboard import SummaryWriter
from tianshou.utils.logger.base import LOG_DATA_TYPE, BaseLogger
from tianshou.utils.warning import deprecation
class TensorboardLogger(BaseLogger):
"""A logger that relies on tensorboard SummaryWriter by default to visualize and log statistics.
:param SummaryWriter writer: the writer to log data.
:param train_interval: the log interval in log_train_data(). Default to 1000.
:param test_interval: the log interval in log_test_data(). Default to 1.
:param update_interval: the log interval in log_update_data(). Default to 1000.
:param save_interval: the save interval in save_data(). Default to 1 (save at
the end of each epoch).
:param write_flush: whether to flush tensorboard result after each
add_scalar operation. Default to True.
"""
def __init__(
self,
writer: SummaryWriter,
train_interval: int = 1000,
test_interval: int = 1,
update_interval: int = 1000,
save_interval: int = 1,
write_flush: bool = True,
) -> None:
super().__init__(train_interval, test_interval, update_interval)
self.save_interval = save_interval
self.write_flush = write_flush
self.last_save_step = -1
self.writer = writer
def write(self, step_type: str, step: int, data: LOG_DATA_TYPE) -> None:
for k, v in data.items():
self.writer.add_scalar(k, v, global_step=step)
if self.write_flush: # issue 580
self.writer.flush() # issue #482
def save_data(
self,
epoch: int,
env_step: int,
gradient_step: int,
save_checkpoint_fn: Callable[[int, int, int], str] | None = None,
) -> None:
if save_checkpoint_fn and epoch - self.last_save_step >= self.save_interval:
self.last_save_step = epoch
save_checkpoint_fn(epoch, env_step, gradient_step)
self.write("save/epoch", epoch, {"save/epoch": epoch})
self.write("save/env_step", env_step, {"save/env_step": env_step})
self.write(
"save/gradient_step",
gradient_step,
{"save/gradient_step": gradient_step},
)
def restore_data(self) -> tuple[int, int, int]:
ea = event_accumulator.EventAccumulator(self.writer.log_dir)
ea.Reload()
try: # epoch / gradient_step
epoch = ea.scalars.Items("save/epoch")[-1].step
self.last_save_step = self.last_log_test_step = epoch
gradient_step = ea.scalars.Items("save/gradient_step")[-1].step
self.last_log_update_step = gradient_step
except KeyError:
epoch, gradient_step = 0, 0
try: # offline trainer doesn't have env_step
env_step = ea.scalars.Items("save/env_step")[-1].step
self.last_log_train_step = env_step
except KeyError:
env_step = 0
return epoch, env_step, gradient_step
class BasicLogger(TensorboardLogger):
"""BasicLogger has changed its name to TensorboardLogger in #427.
This class is for compatibility.
"""
def __init__(self, *args: Any, **kwargs: Any) -> None:
deprecation(
"Class BasicLogger is marked as deprecated and will be removed soon. "
"Please use TensorboardLogger instead.",
)
super().__init__(*args, **kwargs)