From b364f1a26f1b8528b01a445a488160ce2d910a1c Mon Sep 17 00:00:00 2001
From: Trinkle23897 <trinkle23897@gmail.com>
Date: Thu, 8 Oct 2020 23:16:15 +0800
Subject: [PATCH] specify the meaning of logits in documentation (#238)

---
 docs/tutorials/dqn.rst | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/docs/tutorials/dqn.rst b/docs/tutorials/dqn.rst
index 49f6260..b1971ea 100644
--- a/docs/tutorials/dqn.rst
+++ b/docs/tutorials/dqn.rst
@@ -92,6 +92,10 @@ It is also possible to use pre-defined MLP networks in :mod:`~tianshou.utils.net
 1. Input: observation ``obs`` (may be a ``numpy.ndarray``, ``torch.Tensor``, dict, or self-defined class), hidden state ``state`` (for RNN usage), and other information ``info`` provided by the environment.
 2. Output: some ``logits``, the next hidden state ``state``. The logits could be a tuple instead of a ``torch.Tensor``, or some other useful variables or results during the policy forwarding procedure. It depends on how the policy class process the network output. For example, in PPO :cite:`PPO`, the return of the network might be ``(mu, sigma), state`` for Gaussian policy.
 
+.. note::
+
+    Here, ``logits`` refers to the raw output of the network. In supervised learning, the raw output of a prediction/classification model is called logits, and here we extend this definition to any raw output of the neural network.
+
 
 Setup Policy
 ------------