{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "provenance": [], "toc_visible": true }, "kernelspec": { "name": "python3", "display_name": "Python 3" }, "language_info": { "name": "python" } }, "cells": [ { "cell_type": "markdown", "source": [ "# Overview\n", "Finally, we can assemble building blocks that we have came across in previous tutorials to conduct our first DRL experiment. In this experiment, we will use [PPO](https://arxiv.org/abs/1707.06347) algorithm to solve the classic CartPole task in Gym." ], "metadata": { "id": "_UaXOSRjDUF9" } }, { "cell_type": "markdown", "source": [ "# Experiment\n", "To conduct this experiment, we need the following building blocks.\n", "\n", "\n", "* Two vectorized environments, one for training and one for evaluation\n", "* A PPO agent\n", "* A replay buffer to store transition data\n", "* Two collectors to manage the data collecting process, one for training and one for evaluation\n", "* A trainer to manage the training loop\n", "\n", "