Tianshou

hongshaorou/Tianshou

Commit Graph

Author	SHA1	Message	Date
Anas BELFADIL	d976a5aa91	Fixed hardcoded reward_treshold (#548)	2022-03-04 10:35:39 +08:00
Jiayi Weng	3d697aa4c6	make unit test faster (#522) * test cache expert data in offline training * faster cql test * faster tests * use dummy * test ray dependency	2022-02-09 00:24:52 +08:00
Yi Su	3592f45446	Fix critic network for Discrete CRR (#485) - Fixes an inconsistency in the implementation of Discrete CRR. Now it uses `Critic` class for its critic, following conventions in other actor-critic policies; - Updates several offline policies to use `ActorCritic` class for its optimizer to eliminate randomness caused by parameter sharing between actor and critic; - Add `writer.flush()` in TensorboardLogger to ensure real-time result; - Enable `test_collector=None` in 3 trainers to turn off testing during training; - Updates the Atari offline results in README.md; - Moves Atari offline RL examples to `examples/offline`; tests to `test/offline` per review comments.	2021-11-28 23:10:28 +08:00

3 Commits