4 Commits

Author SHA1 Message Date
Yi Su
a7c789f851
Improve data loading from D4RL and convert RL Unplugged to D4RL format (#624) 2022-05-04 04:37:52 +08:00
Jiayi Weng
2a9c9289e5
rename save_fn to save_best_fn to avoid ambiguity (#575)
This PR also introduces `tianshou.utils.deprecation` for a unified deprecation wrapper.
2022-03-22 04:29:27 +08:00
Yi Su
9cb74e60c9
Add imitation baselines for offline RL (#566)
add imitation baselines for offline RL; make the choice of env/task and D4RL dataset explicit; on expert datasets, IL easily outperforms; after reading the D4RL paper, I'll rerun the exps on medium data
2022-03-12 21:33:54 +08:00
Yi Su
3592f45446
Fix critic network for Discrete CRR (#485)
- Fixes an inconsistency in the implementation of Discrete CRR. Now it uses `Critic` class for its critic, following conventions in other actor-critic policies;
- Updates several offline policies to use `ActorCritic` class for its optimizer to eliminate randomness caused by parameter sharing between actor and critic;
- Add `writer.flush()` in TensorboardLogger to ensure real-time result;
- Enable `test_collector=None` in 3 trainers to turn off testing during training;
- Updates the Atari offline results in README.md;
- Moves Atari offline RL examples to `examples/offline`; tests to `test/offline` per review comments.
2021-11-28 23:10:28 +08:00