Commit Graph

  • b9a6d8b5f0
    bugfixes: gym->gymnasium; render() update (#769) Will Dudley 2022-11-11 20:25:35 +00:00
  • 06aaad460e
    Fix a bug in loading offline data (#768) Yi Su 2022-11-03 16:12:33 -07:00
  • 7ff12b909d
    Tiny change since the tests are more than unit tests (#765) fzyzcjy 2022-11-01 22:20:20 +08:00
  • d42a5fb354
    Hindsight Experience Replay as a replay buffer (#753) Juno T 2022-10-31 08:54:54 +09:00
  • 41ae3461f6
    bump version to 0.4.10 (#757) v0.4.10 Jiayi Weng 2022-10-16 22:15:20 -07:00
  • 0181fe79a5
    fix docs tictactoc dummy vector env #669 (#749) Zodan Jodan 2022-10-04 00:41:31 +00:00
  • 128feb677f
    Added support for new PettingZoo API (#751) Markus Krimmel 2022-10-02 18:33:12 +02:00
  • b0c8d28a7d
    Added pre-commit (#752) Markus Krimmel 2022-10-02 17:57:45 +02:00
  • 65c4e3d4cd
    Fix NNI tests upon v2.9 upgrade (#750) Yuge Zhang 2022-09-27 04:55:26 +08:00
  • ea36dc5195
    Changes to support Gym 0.26.0 (#748) Markus Krimmel 2022-09-26 18:31:23 +02:00
  • 278c91a222
    Update citation and contributor (#721) Jiayi Weng 2022-08-10 20:06:51 -07:00
  • 0f59e38b12
    Fix venv wrapper reset retval error with gym env (#712) Jiayi Weng 2022-07-31 11:00:38 -07:00
  • f270e88461
    Do not allow async simulation for test collector (#705) Wenhao Chen 2022-07-23 07:23:55 +08:00
  • 99c99bb09a
    Fix 2 bugs and refactor RunningMeanStd to support dict obs norm (#695) Jiayi Weng 2022-07-14 22:52:56 -07:00
  • 65054847ef
    bump version to 0.4.9 (#684) v0.4.9 Jiayi Weng 2022-07-04 10:07:16 -07:00
  • 43792bf5ab
    Upgrade gym (#613) Yifei Cheng 2022-06-27 18:52:21 -04:00
  • aba2d01d25
    MultiDiscrete to discrete gym action space wrapper (#664) Anas BELFADIL 2022-06-13 00:18:22 +02:00
  • 21b15803ac
    Fix exception with watching pistonball environments (#663) Yifei Cheng 2022-06-11 15:12:48 -04:00
  • df35718992
    Implement TD3+BC for offline RL (#660) Yi Su 2022-06-06 09:39:37 -07:00
  • 9ce0a554dc
    Add Atari SAC examples (#657) Yi Su 2022-06-03 22:26:08 -07:00
  • 5ecea2402e
    Fix save_checkpoint_fn return value (#659) Jiayi Weng 2022-06-02 12:07:07 -05:00
  • 6ad5b520fa
    Fix sphinx build error (#655) Jiayi Weng 2022-06-01 00:56:04 -05:00
  • 109875d43d
    Fix num_envs=test_num (#653) Jiayi Weng 2022-05-29 23:38:47 -05:00
  • 277138ca5b
    Added support for clipping to DQNPolicy (#642) Michal Gregor 2022-05-18 13:33:37 +02:00
  • c87b9f49bc
    Add show_progress option for trainer (#641) Michal Gregor 2022-05-17 17:41:59 +02:00
  • 53e6b0408d
    Add BranchingDQN for large discrete action spaces (#618) Anas BELFADIL 2022-05-15 15:40:32 +02:00
  • a03f19af72
    fix pytest error on non-linux system (#638) Jiayi Weng 2022-05-12 08:52:55 -04:00
  • bf8f63ffc3
    use envpool in vizdoom example, update doc (#634) Jiayi Weng 2022-05-08 12:42:16 -04:00
  • 2a7c151738
    Add vecenv wrappers for obs_norm to support running mujoco experiment with envpool (#628) v0.4.8 Jiayi Weng 2022-05-05 07:55:15 -04:00
  • a7c789f851
    Improve data loading from D4RL and convert RL Unplugged to D4RL format (#624) Yi Su 2022-05-03 13:37:52 -07:00
  • dd16818ce4
    implement REDQ based on original contribution by @Jimenius (#623) Yi Su 2022-04-30 09:06:00 -07:00
  • 41afc2584a
    Convert RL Unplugged Atari datasets to tianshou ReplayBuffer (#621) Yi Su 2022-04-29 04:33:28 -07:00
  • 7f23748347
    Compare Atari results with dopamine and OpenAI Baselines (#616) ChenDRAG 2022-04-27 21:10:45 +08:00
  • 876e6b186e
    hot fix mujoco benchmark Jiayi Weng 2022-04-24 16:49:40 -04:00
  • 5eab7dc218
    Add Atari Results (#600) Chengqi Duan 2022-04-24 20:44:54 +08:00
  • 5c9afe72f3
    Update Mujoco Bemchmark's webpage (#606) ChenDRAG 2022-04-24 01:11:33 +08:00
  • e01385ea30
    Change action_dim to action_shape (#602) Squeemos 2022-04-21 17:09:57 -07:00
  • 57ecebde38
    Add jupyter notebook tutorials using Google Colaboratory (#599) ChenDRAG 2022-04-19 20:58:52 +08:00
  • 92456cdb68
    Add learning rate scheduler to BasePolicy (#598) Alex Nikulkov 2022-04-17 08:52:30 -07:00
  • 6fc6857812
    Update Multi-agent RL docs, upgrade pettingzoo (#595) Yifei Cheng 2022-04-16 11:17:53 -04:00
  • 18277497ed
    fix py39 ci venv test failure (#593) Jiayi Weng 2022-04-12 10:29:39 -04:00
  • 75d7c9f1d9
    Fix action scaling bug in SAC (#591) ChenDRAG 2022-04-12 00:26:06 +08:00
  • f13e415eb0
    Add write_flush in two loggers, fix argument passing in WandbLogger (#581) Jiayi Weng 2022-03-29 20:04:23 -04:00
  • 6ab9860183
    fix negative collector time (#578) Jiayi Weng 2022-03-25 22:44:08 -04:00
  • 2a9c9289e5
    rename save_fn to save_best_fn to avoid ambiguity (#575) v0.4.7 Jiayi Weng 2022-03-21 16:29:27 -04:00
  • 10d919052b
    Add Trainers as generators (#559) Jose Antonio Martin H 2022-03-17 17:26:14 +01:00
  • 2336a7db1b
    fixed typo in rainbow DQN paper reference (#569) Andrea Boscolo Camiletto 2022-03-16 14:38:51 +01:00
  • 39f8391cfb
    Add map_action_inverse for fixing error of storing random action (#568) Minhui Li 2022-03-12 22:26:00 +08:00
  • 9cb74e60c9
    Add imitation baselines for offline RL (#566) Yi Su 2022-03-12 05:33:54 -08:00
  • 74f430ea36
    Add a comment before SAC alpha loss (#565) Alex Nikulkov 2022-03-08 14:38:42 -08:00
  • ad2e1eaea0
    Fix WandbLogger import error in Atari examples (#562) Chengqi Duan 2022-03-08 08:38:56 -05:00
  • df3d7f582b
    Update WandbLogger implementation (#558) Costa Huang 2022-03-06 17:40:47 -05:00
  • 2377f2f186
    Implement Generative Adversarial Imitation Learning (GAIL) (#550) Yi Su 2022-03-06 07:57:15 -08:00
  • d976a5aa91
    Fixed hardcoded reward_treshold (#548) Anas BELFADIL 2022-03-04 03:35:39 +01:00
  • c248b4f87e
    fix conda support and keep API compatibility (#536) v0.4.6.post1 Jiayi Weng 2022-02-25 11:05:02 -05:00
  • 97df511a13
    Add VizDoom PPO example and results (#533) v0.4.6 Yi Su 2022-02-24 17:33:34 -08:00
  • 23fbc3b712
    upgrade gym version to >=0.21, fix related CI and update examples/atari (#534) Chengqi Duan 2022-02-25 07:40:33 +08:00
  • c7e2e56fac
    Pettingzoo support (#494) Mohammad Mahdi Rahimi 2022-02-15 17:56:45 +03:00
  • d85bc19269
    update dqn tutorial and add envpool to docs (#526) Chengqi Duan 2022-02-15 06:39:47 +08:00
  • d29188ee77
    update atari ppo slots (#529) Yi Su 2022-02-12 12:04:21 -08:00
  • 40289b8b0e
    Add atari ppo example (#523) Yi Su 2022-02-10 14:45:06 -08:00
  • 3d697aa4c6
    make unit test faster (#522) Jiayi Weng 2022-02-08 11:24:52 -05:00
  • 9c100e0705
    Enable venvs.reset() concurrent execution (#517) Chengqi Duan 2022-02-08 00:40:01 +08:00
  • cd7654bfd5
    Fixing casts to int by to_torch_as(...) calls in policies when using discrete actions (#521) Kenneth Schröder 2022-02-06 20:42:46 +01:00
  • c25926dd8f
    Formalize variable names (#509) ChenDRAG 2022-01-30 00:53:56 +08:00
  • bc53ead273
    Implement CQLPolicy and offline_cql example (#506) Bernard Tan 2022-01-16 05:30:21 +08:00
  • a59d96d041
    Add Intrinsic Curiosity Module (#503) Yi Su 2022-01-14 10:43:48 -08:00
  • a2d76d1276
    Remove reset_buffer() from reset method (#501) Markus28 2022-01-13 01:46:28 +01:00
  • 3592f45446
    Fix critic network for Discrete CRR (#485) v0.4.5 Yi Su 2021-11-28 07:10:28 -08:00
  • 5c5a3db94e
    Implement BCQPolicy and offline_bcq example (#480) Bernard Tan 2021-11-22 22:21:02 +08:00
  • 94d3b27db9
    fix tqdm issue (#481) Jiayi Weng 2021-11-18 11:17:44 -05:00
  • 8f19a86966
    Implements set_env_attr and get_env_attr for vector environments (#478) Markus28 2021-11-02 17:08:00 +01:00
  • 098d466467
    fix atari wrapper to be deterministic (#467) Jiayi Weng 2021-10-19 10:26:11 -04:00
  • b9eedc516e
    bump to 0.4.4 v0.4.4 Jiayi Weng 2021-10-13 12:22:24 -04:00
  • 63d752ee0b
    W&B: Add usage in the docs (#463) Ayush Chaurasia 2021-10-13 20:58:25 +05:30
  • 926ec0b9b1
    update save_fn in trainer (#459) Jiayi Weng 2021-10-13 09:25:24 -04:00
  • e45e2096d8
    add multi-GPU support (#461) Jiayi Weng 2021-10-05 13:39:14 -04:00
  • 5df64800f4
    final fix for actor_critic shared head parameters (#458) Jiayi Weng 2021-10-04 11:19:07 -04:00
  • 22d7bf38c8
    Improve W&B logger (#441) Ayush Chaurasia 2021-09-24 19:22:23 +05:30
  • e8f8cdfa41
    fix logger.write error in atari script (#444) Jiayi Weng 2021-09-09 00:51:39 +08:00
  • fc251ab0b8
    bump to v0.4.3 (#432) v0.4.3 n+e 2021-09-03 05:05:04 +08:00
  • a740496a51
    fix dual clip implementation (#435) Ending Hsiao 2021-09-02 21:43:14 +08:00
  • 8a5e2190f7
    Add Weights and Biases Logger (#427) Andriy Drozdyuk 2021-08-30 10:35:02 -04:00
  • e4f4f0e144
    fix docs build failure and a bug in a2c/ppo optimizer (#428) n+e 2021-08-30 02:07:03 +08:00
  • 291be08d43
    Add Rainbow DQN (#386) Yi Su 2021-08-29 08:34:59 -07:00
  • d161059c3d
    Replaced indice by plural indices (#422) Andriy Drozdyuk 2021-08-20 09:58:44 -04:00
  • 728b88b92d
    Fix conda install command (#419) deeplook 2021-08-16 12:56:01 +02:00
  • 5b7732a29b
    make ppo discrete test script more general (#418) n+e 2021-08-15 21:37:37 +08:00
  • bba30f83d1
    fix sb2's coverage (#412) n+e 2021-08-10 17:43:27 +08:00
  • 42538f8e58
    Update README.md (#410) Miguel Morales 2021-08-09 19:14:20 -06:00
  • 0674ff628a
    Cite Tianshou's latest paper (#406) ChenDRAG 2021-08-10 08:35:01 +08:00
  • 18d2f25eff
    Remove warnings about the use of save_fn across trainers (#408) Andriy Drozdyuk 2021-08-03 21:56:00 -04:00
  • c19876179a
    add env_id in preprocess fn (#391) n+e 2021-07-05 09:50:39 +08:00
  • ebaca6f8da
    add vizdoom example, bump version to 0.4.2 (#384) v0.4.2 n+e 2021-06-26 18:08:41 +08:00
  • c0bc8e00ca
    Add Fully-parameterized Quantile Function (#376) Yi Su 2021-06-14 20:59:02 -07:00
  • 21b2b22cd7
    update iqn results and reward plots (#377) Yi Su 2021-06-09 18:05:25 -07:00
  • f3169b4c1f
    Add Implicit Quantile Network (#371) Yi Su 2021-05-28 18:44:23 -07:00
  • 458028a326
    fix docs (#373) n+e 2021-05-23 12:43:03 +08:00
  • 655d5fb14f
    Allow researchers to choose whether to use Double DQN (#368) Ark 2021-05-21 10:53:34 +08:00
  • 8f7bc65ac7
    Add discrete Critic Regularized Regression (#367) Yi Su 2021-05-18 22:29:56 -07:00