Yi Su
df35718992
Implement TD3+BC for offline RL ( #660 )
...
- implement TD3+BC for offline RL;
- fix a trainer bug where the test reward was not logged because self.env_step was not set in the offline setting;
2022-06-07 00:39:37 +08:00
Anas BELFADIL
53e6b0408d
Add BranchingDQN for large discrete action spaces ( #618 )
2022-05-15 21:40:32 +08:00
Yi Su
dd16818ce4
implement REDQ based on original contribution by @Jimenius ( #623 )
...
Co-authored-by: Minhui Li <limh@lamda.nju.edu.cn>
2022-05-01 00:06:00 +08:00
Yi Su
2377f2f186
Implement Generative Adversarial Imitation Learning (GAIL) ( #550 )
...
Implement GAIL based on PPO, and provide an example script and sample (i.e., most likely not the best) results on MuJoCo tasks (#531, #173).
2022-03-06 23:57:15 +08:00
Bernard Tan
bc53ead273
Implement CQLPolicy and offline_cql example ( #506 )
2022-01-16 05:30:21 +08:00
Yi Su
a59d96d041
Add Intrinsic Curiosity Module ( #503 )
2022-01-15 02:43:48 +08:00
Bernard Tan
5c5a3db94e
Implement BCQPolicy and offline_bcq example ( #480 )
...
This PR implements BCQPolicy, which can be used to train an offline agent in environments with continuous action spaces. An experimental result on 'halfcheetah-expert-v1', a d4rl environment (for Offline Reinforcement Learning), is provided.
Example usage is in examples/offline/offline_bcq.py.
2021-11-22 22:21:02 +08:00
Yi Su
291be08d43
Add Rainbow DQN ( #386 )
...
- add RainbowPolicy
- add `set_beta` method in prio_buffer
- add NoisyLinear in utils/network
2021-08-29 23:34:59 +08:00
Yi Su
c0bc8e00ca
Add Fully-parameterized Quantile Function ( #376 )
2021-06-15 11:59:02 +08:00
Yi Su
f3169b4c1f
Add Implicit Quantile Network ( #371 )
2021-05-29 09:44:23 +08:00
Yi Su
8f7bc65ac7
Add discrete Critic Regularized Regression ( #367 )
2021-05-19 13:29:56 +08:00
Yi Su
b5c3ddabfa
Add discrete Conservative Q-Learning for offline RL ( #359 )
...
Co-authored-by: Yi Su <yi.su@antgroup.com>
Co-authored-by: Yi Su <yi.su@antfin.com>
2021-05-12 09:24:48 +08:00
ChenDRAG
1dcf65fe21
Add NPG policy ( #344 )
2021-04-21 09:52:15 +08:00
ChenDRAG
5057b5c89e
Add TRPO policy ( #337 )
2021-04-16 20:37:12 +08:00
n+e
454c86c469
fix venv seed, add TOC in docs, and split buffer.py into several files ( #303 )
...
Things changed in this PR:
- various docs updates, add TOC
- split buffer into several files
- fix venv action_space randomness
2021-03-02 12:28:28 +08:00
Trinkle23897
0acd0d164c
test api doc
2020-04-02 09:07:04 +08:00