{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "_UaXOSRjDUF9" }, "source": [ "# Experiment\n", "Finally, we can assemble the building blocks that we have come across in previous tutorials to conduct our first DRL experiment. In this experiment, we will use the [PPO](https://arxiv.org/abs/1707.06347) algorithm to solve the classic CartPole task in Gym." ] }, { "cell_type": "markdown", "metadata": { "id": "2QRbCJvDHNAd" }, "source": [ "## Experiment\n", "To conduct this experiment, we need the following building blocks.\n", "\n", "\n", "* Two vectorized environments, one for training and one for evaluation\n", "* A PPO agent\n", "* A replay buffer to store transition data\n", "* Two collectors to manage the data collection process, one for training and one for evaluation\n", "* A trainer to manage the training loop\n", "\n", "