sproblvem 674ba4656b Update README.md

tianshou

Tianshou (天授) is a reinforcement learning platform.


agent

    Examples

    Self-play Framework

core

Model

    DQN, Policy-Value Network of AlphaGo Zero, PPO-specific, TRPO-specific

Algorithm

Loss design

    Actor-Critic (Variations), DQN (Variations), DDPG, TRPO, PPO

Optimization method

    SGD, Adam, TRPO, natural gradient, etc.

Planning

    MCTS
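
As a rough illustration of the selection step in MCTS, the sketch below scores children with the standard UCB1 (UCT) rule. This is an assumption made for illustration: the names `uct_score` and `select_child` are hypothetical and are not part of tianshou, and an AlphaGo Zero style search would typically use a prior-weighted (PUCT) rule instead.

```python
import math
from typing import List, Tuple


def uct_score(total_value: float, visits: int, parent_visits: int, c: float = 1.4) -> float:
    """UCB1 score: mean value plus an exploration bonus that shrinks with visits."""
    if visits == 0:
        # Unvisited children are always tried first.
        return math.inf
    mean_value = total_value / visits
    exploration = c * math.sqrt(math.log(parent_visits) / visits)
    return mean_value + exploration


def select_child(children: List[Tuple[float, int]], c: float = 1.4) -> int:
    """Return the index of the child with the highest UCT score.

    Each child is a (total_value, visits) pair; the parent visit count is
    taken to be the sum of the child visit counts.
    """
    parent_visits = sum(visits for _, visits in children)
    scores = [uct_score(value, visits, parent_visits, c) for value, visits in children]
    return max(range(len(children)), key=scores.__getitem__)
```

For example, `select_child([(5.0, 10), (0.0, 0)])` picks the unvisited second child, while with equal visit counts the child with the higher mean value wins.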

data

    Training style - Monte Carlo or Temporal Difference

    Reward Shaping / Advantage Estimation Function

    Importance weight

    Multithread Read/Write
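
To make the difference between the two training styles concrete, here is a minimal sketch of how the learning targets could be computed. The function names and argument layout are hypothetical and are not taken from the tianshou code: a Monte Carlo target uses the full discounted return of a finished episode, while a one-step Temporal Difference target bootstraps from a value estimate of the next state.

```python
from typing import List, Sequence


def monte_carlo_returns(rewards: Sequence[float], gamma: float) -> List[float]:
    """Discounted returns G_t = r_t + gamma * G_{t+1} for one finished episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk the episode backwards so each return reuses the next one.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def td_targets(
    rewards: Sequence[float],
    next_values: Sequence[float],
    dones: Sequence[bool],
    gamma: float,
) -> List[float]:
    """One-step TD targets r_t + gamma * V(s_{t+1}); no bootstrapping after a terminal step."""
    return [
        r + gamma * v * (0.0 if done else 1.0)
        for r, v, done in zip(rewards, next_values, dones)
    ]
```

A simple advantage estimate can then be formed by subtracting a value baseline `V(s_t)` from either kind of target.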

environment

    DQN frame repetition (action repeat), etc.
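
Repeating frames as in DQN can be sketched as a thin wrapper that applies the same action several times and sums the rewards. The `RepeatAction` class and the toy `CountingEnv` below are hypothetical, and the `reset()` / `step(action) -> (observation, reward, done)` interface is an assumption rather than tianshou's actual environment API.

```python
class RepeatAction:
    """Apply each action `repeat` times and return the summed reward."""

    def __init__(self, env, repeat: int = 4):
        if repeat < 1:
            raise ValueError("repeat must be at least 1")
        self.env = env
        self.repeat = repeat

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total_reward = 0.0
        for _ in range(self.repeat):
            obs, reward, done = self.env.step(action)
            total_reward += reward
            if done:
                # Stop early so we never step a finished episode.
                break
        return obs, total_reward, done


class CountingEnv:
    """Toy environment: observation is the step count, reward is 1 per step."""

    def __init__(self, horizon: int = 10):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= self.horizon
```

Wrapping `CountingEnv(horizon=6)` with `repeat=4` yields a first step covering four underlying steps and a second step that stops early when the episode ends.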

simulator

    Go, Othello/Reversi, Warzone

TODO

Parallelize search-based methods.
