Tianshou

History

Jose Antonio Martin H 10d919052b

This commit proposes exposing trainers as generators, so that training can be driven one epoch at a time.
The usage pattern is:

```python
trainer = OnPolicyTrainer(...)
for epoch, epoch_stat, info in trainer:
    print(f"Epoch: {epoch}")
    print(epoch_stat)
    print(info)
    do_something_with_policy()
    query_something_about_policy()
    make_a_plot_with(epoch_stat)
    display(info)
```

- `epoch` (int): the epoch number
- `epoch_stat` (dict): a large collection of metrics for the current epoch, including stat
- `info` (dict): the same dict that the non-generator version of the trainer returns
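
To illustrate the idea, here is a minimal sketch of how a trainer could expose the iterator protocol. `ToyTrainer`, its `run` method, and the metric names are hypothetical stand-ins chosen for this example; they are not Tianshou's actual `OnPolicyTrainer` implementation, and the "training" step is a placeholder.

```python
from typing import Any, Dict, Iterator, Tuple

EpochResult = Tuple[int, Dict[str, Any], Dict[str, Any]]


class ToyTrainer:
    """Hypothetical stand-in for a trainer; not Tianshou's OnPolicyTrainer."""

    def __init__(self, max_epoch: int) -> None:
        self.max_epoch = max_epoch
        self.epoch = 0
        self.best_reward = float("-inf")

    def __iter__(self) -> Iterator[EpochResult]:
        return self

    def __next__(self) -> EpochResult:
        # Stop once the configured number of epochs has been produced.
        if self.epoch >= self.max_epoch:
            raise StopIteration
        self.epoch += 1
        # Placeholder for the real collect/update/test work of one epoch.
        reward = float(self.epoch)
        self.best_reward = max(self.best_reward, reward)
        epoch_stat = {"epoch": self.epoch, "test_reward": reward}
        info = {"best_reward": self.best_reward}
        return self.epoch, epoch_stat, info

    def run(self) -> Dict[str, Any]:
        """Non-generator usage: consume every epoch and return the last info."""
        info: Dict[str, Any] = {}
        for _, _, info in self:
            pass
        return info


# Generator-style usage: one (epoch, epoch_stat, info) tuple per epoch.
results = list(ToyTrainer(max_epoch=3))
```

With this shape, the blocking call and the epoch-by-epoch loop share the same code path: `run` simply exhausts the iterator.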

You can even iterate over several different trainers at the same time:

```python
trainer1 = OnPolicyTrainer(...)
trainer2 = OnPolicyTrainer(...)
# Add more trainers to zip() and more loop targets as needed.
for result1, result2 in zip(trainer1, trainer2):
    compare_results(result1, result2)
```
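
One detail worth keeping in mind when stepping trainers in lockstep: Python's built-in `zip` stops as soon as the shortest iterable is exhausted. The sketch below shows this with a hypothetical `toy_trainer` generator function (an assumption for illustration, not part of Tianshou) that yields one fake metric per epoch.

```python
from typing import Dict, Iterator, List, Tuple


def toy_trainer(max_epoch: int, gain: float) -> Iterator[Tuple[int, Dict[str, float]]]:
    """Hypothetical generator that yields (epoch, stats) with a fake reward."""
    for epoch in range(1, max_epoch + 1):
        yield epoch, {"reward": gain * epoch}


# Step a 3-epoch run and a 5-epoch run together and record the reward gap.
paired: List[Tuple[int, float]] = []
for (epoch_a, stat_a), (_, stat_b) in zip(toy_trainer(3, 1.0), toy_trainer(5, 2.0)):
    paired.append((epoch_a, stat_b["reward"] - stat_a["reward"]))
```

Here the loop runs three times, so the last two epochs of the longer run are never executed. If every trainer should run to completion, `itertools.zip_longest` is one alternative.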

Co-authored-by: Jiayi Weng <trinkle23897@gmail.com>

2022-03-18 00:26:14 +08:00

tianshou.data.rst

fix venv seed, add TOC in docs, and split buffer.py into several files (#303)

2021-03-02 12:28:28 +08:00

tianshou.env.rst

fix venv seed, add TOC in docs, and split buffer.py into several files (#303)

2021-03-02 12:28:28 +08:00

tianshou.exploration.rst

test api doc

2020-04-02 09:07:04 +08:00

tianshou.policy.rst

Implement Generative Adversarial Imitation Learning (GAIL) (#550)