Tianshou

hongshaorou/Tianshou

Commit Graph

b9a6d8b5f0

bugfixes: gym->gymnasium; render() update (#769) Will Dudley 2022-11-11 20:25:35 +00:00
06aaad460e

Fix a bug in loading offline data (#768) Yi Su 2022-11-03 16:12:33 -07:00
7ff12b909d

Tiny change since the tests are more than unit tests (#765) fzyzcjy 2022-11-01 22:20:20 +08:00
d42a5fb354

Hindsight Experience Replay as a replay buffer (#753) Juno T 2022-10-31 08:54:54 +09:00
41ae3461f6

bump version to 0.4.10 (#757) v0.4.10 Jiayi Weng 2022-10-16 22:15:20 -07:00
0181fe79a5

fix docs tictactoe dummy vector env #669 (#749) Zodan Jodan 2022-10-04 00:41:31 +00:00
128feb677f

Added support for new PettingZoo API (#751) Markus Krimmel 2022-10-02 18:33:12 +02:00
b0c8d28a7d

Added pre-commit (#752) Markus Krimmel 2022-10-02 17:57:45 +02:00
65c4e3d4cd

Fix NNI tests upon v2.9 upgrade (#750) Yuge Zhang 2022-09-27 04:55:26 +08:00
ea36dc5195

Changes to support Gym 0.26.0 (#748) Markus Krimmel 2022-09-26 18:31:23 +02:00
278c91a222

Update citation and contributor (#721) Jiayi Weng 2022-08-10 20:06:51 -07:00
0f59e38b12

Fix venv wrapper reset retval error with gym env (#712) Jiayi Weng 2022-07-31 11:00:38 -07:00
f270e88461

Do not allow async simulation for test collector (#705) Wenhao Chen 2022-07-23 07:23:55 +08:00
99c99bb09a

Fix 2 bugs and refactor RunningMeanStd to support dict obs norm (#695) Jiayi Weng 2022-07-14 22:52:56 -07:00
65054847ef

bump version to 0.4.9 (#684) v0.4.9 Jiayi Weng 2022-07-04 10:07:16 -07:00
43792bf5ab

Upgrade gym (#613) Yifei Cheng 2022-06-27 18:52:21 -04:00
aba2d01d25

MultiDiscrete to discrete gym action space wrapper (#664) Anas BELFADIL 2022-06-13 00:18:22 +02:00
21b15803ac

Fix exception with watching pistonball environments (#663) Yifei Cheng 2022-06-11 15:12:48 -04:00
df35718992

Implement TD3+BC for offline RL (#660) Yi Su 2022-06-06 09:39:37 -07:00
9ce0a554dc

Add Atari SAC examples (#657) Yi Su 2022-06-03 22:26:08 -07:00
5ecea2402e

Fix save_checkpoint_fn return value (#659) Jiayi Weng 2022-06-02 12:07:07 -05:00
6ad5b520fa

Fix sphinx build error (#655) Jiayi Weng 2022-06-01 00:56:04 -05:00
109875d43d

Fix num_envs=test_num (#653) Jiayi Weng 2022-05-29 23:38:47 -05:00
277138ca5b

Added support for clipping to DQNPolicy (#642) Michal Gregor 2022-05-18 13:33:37 +02:00
c87b9f49bc

Add show_progress option for trainer (#641) Michal Gregor 2022-05-17 17:41:59 +02:00
53e6b0408d

Add BranchingDQN for large discrete action spaces (#618) Anas BELFADIL 2022-05-15 15:40:32 +02:00
a03f19af72

fix pytest error on non-linux system (#638) Jiayi Weng 2022-05-12 08:52:55 -04:00
bf8f63ffc3

use envpool in vizdoom example, update doc (#634) Jiayi Weng 2022-05-08 12:42:16 -04:00
2a7c151738

Add vecenv wrappers for obs_norm to support running mujoco experiment with envpool (#628) v0.4.8 Jiayi Weng 2022-05-05 07:55:15 -04:00
a7c789f851

Improve data loading from D4RL and convert RL Unplugged to D4RL format (#624) Yi Su 2022-05-03 13:37:52 -07:00
dd16818ce4

implement REDQ based on original contribution by @Jimenius (#623) Yi Su 2022-04-30 09:06:00 -07:00
41afc2584a

Convert RL Unplugged Atari datasets to tianshou ReplayBuffer (#621) Yi Su 2022-04-29 04:33:28 -07:00
7f23748347

Compare Atari results with dopamine and OpenAI Baselines (#616) ChenDRAG 2022-04-27 21:10:45 +08:00
876e6b186e

hot fix mujoco benchmark Jiayi Weng 2022-04-24 16:49:40 -04:00
5eab7dc218

Add Atari Results (#600) Chengqi Duan 2022-04-24 20:44:54 +08:00
5c9afe72f3

Update Mujoco Benchmark's webpage (#606) ChenDRAG 2022-04-24 01:11:33 +08:00
e01385ea30

Change action_dim to action_shape (#602) Squeemos 2022-04-21 17:09:57 -07:00
57ecebde38

Add jupyter notebook tutorials using Google Colaboratory (#599) ChenDRAG 2022-04-19 20:58:52 +08:00
92456cdb68

Add learning rate scheduler to BasePolicy (#598) Alex Nikulkov 2022-04-17 08:52:30 -07:00
6fc6857812

Update Multi-agent RL docs, upgrade pettingzoo (#595) Yifei Cheng 2022-04-16 11:17:53 -04:00
18277497ed

fix py39 ci venv test failure (#593) Jiayi Weng 2022-04-12 10:29:39 -04:00
75d7c9f1d9

Fix action scaling bug in SAC (#591) ChenDRAG 2022-04-12 00:26:06 +08:00
f13e415eb0

Add write_flush in two loggers, fix argument passing in WandbLogger (#581) Jiayi Weng 2022-03-29 20:04:23 -04:00
6ab9860183

fix negative collector time (#578) Jiayi Weng 2022-03-25 22:44:08 -04:00
2a9c9289e5

rename save_fn to save_best_fn to avoid ambiguity (#575) v0.4.7 Jiayi Weng 2022-03-21 16:29:27 -04:00
10d919052b

Add Trainers as generators (#559) Jose Antonio Martin H 2022-03-17 17:26:14 +01:00
2336a7db1b

fixed typo in rainbow DQN paper reference (#569) Andrea Boscolo Camiletto 2022-03-16 14:38:51 +01:00
39f8391cfb

Add map_action_inverse for fixing error of storing random action (#568) Minhui Li 2022-03-12 22:26:00 +08:00
9cb74e60c9

Add imitation baselines for offline RL (#566) Yi Su 2022-03-12 05:33:54 -08:00
74f430ea36

Add a comment before SAC alpha loss (#565) Alex Nikulkov 2022-03-08 14:38:42 -08:00
ad2e1eaea0

Fix WandbLogger import error in Atari examples (#562) Chengqi Duan 2022-03-08 08:38:56 -05:00
df3d7f582b

Update WandbLogger implementation (#558) Costa Huang 2022-03-06 17:40:47 -05:00
2377f2f186

Implement Generative Adversarial Imitation Learning (GAIL) (#550) Yi Su 2022-03-06 07:57:15 -08:00
d976a5aa91

Fixed hardcoded reward_treshold (#548) Anas BELFADIL 2022-03-04 03:35:39 +01:00
c248b4f87e

fix conda support and keep API compatibility (#536) v0.4.6.post1 Jiayi Weng 2022-02-25 11:05:02 -05:00
97df511a13

Add VizDoom PPO example and results (#533) v0.4.6 Yi Su 2022-02-24 17:33:34 -08:00
23fbc3b712

upgrade gym version to >=0.21, fix related CI and update examples/atari (#534) Chengqi Duan 2022-02-25 07:40:33 +08:00
c7e2e56fac

Pettingzoo support (#494) Mohammad Mahdi Rahimi 2022-02-15 17:56:45 +03:00
d85bc19269

update dqn tutorial and add envpool to docs (#526) Chengqi Duan 2022-02-15 06:39:47 +08:00
d29188ee77

update atari ppo slots (#529) Yi Su 2022-02-12 12:04:21 -08:00
40289b8b0e

Add atari ppo example (#523) Yi Su 2022-02-10 14:45:06 -08:00
3d697aa4c6

make unit test faster (#522) Jiayi Weng 2022-02-08 11:24:52 -05:00
9c100e0705

Enable venvs.reset() concurrent execution (#517) Chengqi Duan 2022-02-08 00:40:01 +08:00
cd7654bfd5

Fixing casts to int by to_torch_as(...) calls in policies when using discrete actions (#521) Kenneth Schröder 2022-02-06 20:42:46 +01:00
c25926dd8f

Formalize variable names (#509) ChenDRAG 2022-01-30 00:53:56 +08:00
bc53ead273

Implement CQLPolicy and offline_cql example (#506) Bernard Tan 2022-01-16 05:30:21 +08:00
a59d96d041

Add Intrinsic Curiosity Module (#503) Yi Su 2022-01-14 10:43:48 -08:00
a2d76d1276

Remove reset_buffer() from reset method (#501) Markus28 2022-01-13 01:46:28 +01:00
3592f45446

Fix critic network for Discrete CRR (#485) v0.4.5 Yi Su 2021-11-28 07:10:28 -08:00
5c5a3db94e

Implement BCQPolicy and offline_bcq example (#480) Bernard Tan 2021-11-22 22:21:02 +08:00
94d3b27db9

fix tqdm issue (#481) Jiayi Weng 2021-11-18 11:17:44 -05:00
8f19a86966

Implements set_env_attr and get_env_attr for vector environments (#478) Markus28 2021-11-02 17:08:00 +01:00
098d466467

fix atari wrapper to be deterministic (#467) Jiayi Weng 2021-10-19 10:26:11 -04:00
b9eedc516e

bump to 0.4.4 v0.4.4 Jiayi Weng 2021-10-13 12:22:24 -04:00
63d752ee0b

W&B: Add usage in the docs (#463) Ayush Chaurasia 2021-10-13 20:58:25 +05:30
926ec0b9b1

update save_fn in trainer (#459) Jiayi Weng 2021-10-13 09:25:24 -04:00
e45e2096d8

add multi-GPU support (#461) Jiayi Weng 2021-10-05 13:39:14 -04:00
5df64800f4

final fix for actor_critic shared head parameters (#458) Jiayi Weng 2021-10-04 11:19:07 -04:00
22d7bf38c8

Improve W&B logger (#441) Ayush Chaurasia 2021-09-24 19:22:23 +05:30
e8f8cdfa41

fix logger.write error in atari script (#444) Jiayi Weng 2021-09-09 00:51:39 +08:00
fc251ab0b8

bump to v0.4.3 (#432) v0.4.3 n+e 2021-09-03 05:05:04 +08:00
a740496a51

fix dual clip implementation (#435) Ending Hsiao 2021-09-02 21:43:14 +08:00
8a5e2190f7

Add Weights and Biases Logger (#427) Andriy Drozdyuk 2021-08-30 10:35:02 -04:00
e4f4f0e144

fix docs build failure and a bug in a2c/ppo optimizer (#428) n+e 2021-08-30 02:07:03 +08:00
291be08d43

Add Rainbow DQN (#386) Yi Su 2021-08-29 08:34:59 -07:00
d161059c3d

Replaced indice by plural indices (#422) Andriy Drozdyuk 2021-08-20 09:58:44 -04:00
728b88b92d

Fix conda install command (#419) deeplook 2021-08-16 12:56:01 +02:00
5b7732a29b

make ppo discrete test script more general (#418) n+e 2021-08-15 21:37:37 +08:00
bba30f83d1

fix sb2's coverage (#412) n+e 2021-08-10 17:43:27 +08:00
42538f8e58

Update README.md (#410) Miguel Morales 2021-08-09 19:14:20 -06:00
0674ff628a

Cite Tianshou's latest paper (#406) ChenDRAG 2021-08-10 08:35:01 +08:00
18d2f25eff

Remove warnings about the use of save_fn across trainers (#408) Andriy Drozdyuk 2021-08-03 21:56:00 -04:00
c19876179a

add env_id in preprocess fn (#391) n+e 2021-07-05 09:50:39 +08:00
ebaca6f8da

add vizdoom example, bump version to 0.4.2 (#384) v0.4.2 n+e 2021-06-26 18:08:41 +08:00
c0bc8e00ca

Add Fully-parameterized Quantile Function (#376) Yi Su 2021-06-14 20:59:02 -07:00
21b2b22cd7

update iqn results and reward plots (#377) Yi Su 2021-06-09 18:05:25 -07:00
f3169b4c1f

Add Implicit Quantile Network (#371) Yi Su 2021-05-28 18:44:23 -07:00
458028a326

fix docs (#373) n+e 2021-05-23 12:43:03 +08:00
655d5fb14f

Allow researchers to choose whether to use Double DQN (#368) Ark 2021-05-21 10:53:34 +08:00
8f7bc65ac7

Add discrete Critic Regularized Regression (#367) Yi Su 2021-05-18 22:29:56 -07:00