Replaced .png images with .svg where possible
BIN  docs/_static/images/policy_table.png  (vendored)  Before: 34 KiB
BIN  docs/_static/images/policy_table.svg  (vendored, normal file)  After: 34 KiB
BIN  docs/_static/images/pseudocode_off_policy.png  (vendored)  Before: 95 KiB
BIN  docs/_static/images/pseudocode_off_policy.svg  (vendored, normal file)  After: 95 KiB
BIN  docs/_static/images/structure.png  (vendored)  Before: 38 KiB
3    docs/_static/images/structure.svg  (vendored, normal file)  After: 56 KiB
BIN  docs/_static/images/timelimit.png  (vendored)  Before: 28 KiB
3    docs/_static/images/timelimit.svg  (vendored, normal file)  After: 47 KiB
@@ -825,10 +825,10 @@
 },
 "source": [
 "<center>\n",
-"<img src=../_static/images/timelimit.png></img>\n",
+"<img src=../_static/images/timelimit.svg></img>\n",
 "</center>\n",
 "<center>\n",
-"<img src=../_static/images/policy_table.png></img>\n",
+"<img src=../_static/images/policy_table.svg></img>\n",
 "</center>"
 ]
 }
@@ -10,7 +10,7 @@
 "From its literal meaning, we can easily know that the Collector in Tianshou is used to collect training data. More specifically, the Collector controls the interaction between Policy (agent) and the environment. It also helps save the interaction data into the ReplayBuffer and returns episode statistics.\n",
 "\n",
 "<center>\n",
-"<img src=../_static/images/structure.png></img>\n",
+"<img src=../_static/images/structure.svg></img>\n",
 "</center>\n",
 "\n"
 ]
@@ -10,7 +10,7 @@
 "Trainer is the highest-level encapsulation in Tianshou. It controls the training loop and the evaluation method. It also controls the interaction between the Collector and the Policy, with the ReplayBuffer serving as the media.\n",
 "\n",
 "<center>\n",
-"<img src=../_static/images/structure.png></img>\n",
+"<img src=../_static/images/structure.svg></img>\n",
 "</center>\n",
 "\n",
 "\n"
@@ -34,7 +34,7 @@
 "source": [
 "### Pseudocode\n",
 "<center>\n",
-"<img src=../_static/images/pseudocode_off_policy.png></img>\n",
+"<img src=../_static/images/pseudocode_off_policy.svg></img>\n",
 "</center>\n",
 "\n",
 "For the on-policy trainer, the main difference is that we clear the buffer after Line 10."