From b72bebbc487c155322d609a0f6554a8446849264 Mon Sep 17 00:00:00 2001
From: "Stefano Mariani, PhD"
Date: Thu, 26 Oct 2023 18:48:44 +0200
Subject: [PATCH] Fixed misleading multi-agent training sentences (#980)

- [X] I have marked all applicable categories:
  + [ ] exception-raising fix
  + [ ] algorithm implementation fix
  + [X] documentation modification
  + [ ] new feature
- [X] I have reformatted the code using `make format` (**required**)
- [X] I have checked the code using `make commit-checks` (**required**)
- [X] If applicable, I have mentioned the relevant/related issue(s)
  + resolves issue #973
- [ ] If applicable, I have listed every item in this Pull Request below
---
 docs/tutorials/tictactoe.rst | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/docs/tutorials/tictactoe.rst b/docs/tutorials/tictactoe.rst
index 8e0c193..b15b11a 100644
--- a/docs/tutorials/tictactoe.rst
+++ b/docs/tutorials/tictactoe.rst
@@ -1,13 +1,15 @@
-Multi-Agent RL
-==============
+RL against random policy opponent with PettingZoo
+=================================================
 
-Tianshou use `PettingZoo` environment for multi-agent RL training. Here are some helpful tutorial links:
+Tianshou is compatible with `PettingZoo` environments for multi-agent RL, although it does not directly provide facilities for multi-agent RL. Here are some helpful tutorial links:
 
 * https://pettingzoo.farama.org/tutorials/tianshou/beginner/
 * https://pettingzoo.farama.org/tutorials/tianshou/intermediate/
 * https://pettingzoo.farama.org/tutorials/tianshou/advanced/
 
-In this section, we describe how to use Tianshou to implement multi-agent reinforcement learning. Specifically, we will design an algorithm to learn how to play `Tic Tac Toe `_ (see the image below) against a random opponent.
+In this section, we describe how to use Tianshou to implement RL in a multi-agent setting where only one agent is trained, while the other adopts a fixed random policy.
+The user can then use this as a blueprint to replace the random policy with another trainable agent.
+Specifically, we will design an algorithm to learn how to play `Tic Tac Toe `_ (see the image below) against a random opponent.
 
 .. image:: ../_static/images/tic-tac-toe.png
     :align: center
@@ -176,8 +178,8 @@ Tianshou already provides some builtin classes for multi-agent learning. You can
 
 Random agents perform badly. In the above game, although agent 2 wins finally, it is clear that a smart agent 1 would place an ``x`` at row 4 col 4 to win directly.
 
-Train an MARL Agent
--------------------
+Train one Agent against a random opponent
+-----------------------------------------
 
 So let's start to train our Tic-Tac-Toe agent! First, import some required modules. ::
 
@@ -645,4 +647,4 @@ Well, although the learned agent plays well against the random agent, it is far
 
 Next, maybe you can try to build more intelligent agents by letting the agent learn from self-play, just like AlphaZero!
 
-In this tutorial, we show an example of how to use Tianshou for multi-agent RL. Tianshou is a flexible and easy to use RL library. Make the best of Tianshou by yourself!
+In this tutorial, we show an example of how to use Tianshou for training a single agent in a MARL setting. Tianshou is a flexible and easy-to-use RL library. Make the best of Tianshou by yourself!
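
The setup this patch documents (train one agent while its opponent follows a fixed random policy) can be sketched without any library at all. The snippet below is an illustrative, self-contained toy, not Tianshou or PettingZoo API: a tabular Q-learning "X" player is trained against a uniformly random "O" opponent on Tic-Tac-Toe, mirroring the tutorial's single-trainable-agent structure. All function names here are made up for the sketch.

```python
# Toy version of "train one agent vs. a fixed random opponent".
# Illustrative only -- NOT the Tianshou/PettingZoo API from the tutorial.
import random
from collections import defaultdict

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return "X" or "O" if that mark completed a line, else None."""
    for a, b, c in LINES:
        if board[a] and board[a] == board[b] == board[c]:
            return board[a]
    return None

def legal(board):
    return [i for i, v in enumerate(board) if v == ""]

def random_policy(board):
    # The fixed, non-trainable opponent: a uniformly random legal move.
    return random.choice(legal(board))

def train(episodes=3000, alpha=0.5, gamma=0.9, eps=0.1):
    """Tabular Q-learning for "X" only; "O" never learns."""
    q = defaultdict(float)  # Q[(state, action)]
    for _ in range(episodes):
        board = [""] * 9
        while True:
            state, moves = tuple(board), legal(board)
            # epsilon-greedy action for the learner
            if random.random() < eps:
                a = random.choice(moves)
            else:
                a = max(moves, key=lambda m: q[(state, m)])
            board[a] = "X"
            if winner(board) == "X":                       # win: reward +1
                q[(state, a)] += alpha * (1.0 - q[(state, a)])
                break
            if not legal(board):                           # draw: reward 0
                q[(state, a)] += alpha * (0.0 - q[(state, a)])
                break
            board[random_policy(board)] = "O"              # opponent replies
            if winner(board) == "O":                       # loss: reward -1
                q[(state, a)] += alpha * (-1.0 - q[(state, a)])
                break
            if not legal(board):                           # draw: reward 0
                q[(state, a)] += alpha * (0.0 - q[(state, a)])
                break
            # bootstrap from the best follow-up move in the new state
            ns = tuple(board)
            best = max(q[(ns, m)] for m in legal(board))
            q[(state, a)] += alpha * (gamma * best - q[(state, a)])
    return q

def greedy_win_rate(q, games=500):
    """Evaluate the greedy learned policy against the random opponent."""
    wins = 0
    for _ in range(games):
        board = [""] * 9
        while True:
            s = tuple(board)
            board[max(legal(board), key=lambda m: q[(s, m)])] = "X"
            if winner(board) == "X":
                wins += 1
                break
            if not legal(board):
                break
            board[random_policy(board)] = "O"
            if winner(board) == "O" or not legal(board):
                break
    return wins / games
```

Replacing `random_policy` with a second learner is exactly the "blueprint" extension the revised tutorial text suggests; in Tianshou proper, that role is played by swapping the opponent's policy inside the multi-agent policy container.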