add plotter (#335)
Co-authored-by: Trinkle23897 <trinkle23897@gmail.com>
@@ -36,17 +36,26 @@ $ tensorboard --logdir log
You can also reproduce the benchmark (e.g. SAC in Ant-v3) with the example script we provide under `examples/mujoco/`:

```bash
$ ./run_experiments.sh Ant-v3 sac
```

This will start 10 experiments with different seeds.
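The script simply sweeps seeds 0-9 in the background. As a sketch (values taken from the example invocation above), the commands it issues look like this:

```python
# Sketch of the command list run_experiments.sh produces for "Ant-v3 sac";
# the script itself launches these in the background with per-seed log files.
task, algo = "Ant-v3", "sac"
cmds = [
    f"python mujoco_{algo}.py --task {task} --epoch 200 --seed {seed} --logdir results"
    for seed in range(10)
]
```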
Once all experiments have finished, we can convert the tfevent files into csv files and plot the results:

```bash
$ ./tools.py --root-dir ./results/Ant-v3/sac
$ ./plotter.py --root-dir ./results/Ant-v3 --shaded-std --legend-pattern "\\w+"
```
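For context on what these two commands exchange: `tools.py` writes a merged `test_rew_{num}seeds.csv` whose leading columns are `env_step, rew, rew:shaded` (mean and std across seeds), and `plotter.py` loads each column into a numpy array. A minimal sketch of that parsing, with invented sample rows:

```python
import csv
import io
from collections import defaultdict

import numpy as np

# Hypothetical contents of a merged test_rew_{num}seeds.csv (values invented).
merged_csv = """env_step,rew,rew:shaded
5000,100.0,10.0
10000,200.0,20.0
"""

def csv2dict(fp):
    # Mirrors plotter.py's csv2numpy: one numpy array per column.
    columns = defaultdict(list)
    for row in csv.DictReader(fp):
        for k, v in row.items():
            columns[k].append(float(v))
    return {k: np.array(v) for k, v in columns.items()}

data = csv2dict(io.StringIO(merged_csv))
x, y, std = data['env_step'], data['rew'], data['rew:shaded']
# plotter.py plots y against x and shades y ± std when --shaded-std is given.
```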

#### Example benchmark

<img src="./benchmark/Ant-v3/offpolicy.png" width="500" height="450">

Other graphs can be found under `examples/mujoco/benchmark/`.

For pretrained agents, detailed graphs (single agent, single game) and log details, please refer to [https://cloud.tsinghua.edu.cn/d/f45fcfc5016043bc8fbc/](https://cloud.tsinghua.edu.cn/d/f45fcfc5016043bc8fbc/).

## Offpolicy algorithms

#### Notes

@@ -236,7 +245,7 @@ Other graphs can be found under `/examples/mujuco/benchmark/`
<a name="footnote1">[1]</a> Supported environments include HalfCheetah-v3, Hopper-v3, Swimmer-v3, Walker2d-v3, Ant-v3, Humanoid-v3, Reacher-v2, InvertedPendulum-v2 and InvertedDoublePendulum-v2. Pusher, Thrower, Striker and HumanoidStandup are not supported because they are rarely seen in the literature.

<a name="footnote2">[2]</a> Pretrained agents, detailed graphs (single agent, single game) and log details can all be found at [https://cloud.tsinghua.edu.cn/d/f45fcfc5016043bc8fbc/](https://cloud.tsinghua.edu.cn/d/f45fcfc5016043bc8fbc/).

<a name="footnote3">[3]</a> We use the latest version of all mujoco environments in gym (0.17.3 with mujoco==2.0.2.13), but this is often not the case for other benchmarks; please check the original papers for details. (Outcomes are usually similar across versions, though.)

BIN examples/mujoco/benchmark/Ant-v3/all.png (new file)
BIN examples/mujoco/benchmark/Ant-v3/onpolicy.png (new file)
BIN examples/mujoco/benchmark/HalfCheetah-v3/all.png (new file)
BIN examples/mujoco/benchmark/HalfCheetah-v3/onpolicy.png (new file)
BIN examples/mujoco/benchmark/Hopper-v3/all.png (new file)
BIN examples/mujoco/benchmark/Hopper-v3/onpolicy.png (new file)
BIN examples/mujoco/benchmark/Humanoid-v3/all.png (new file)
BIN examples/mujoco/benchmark/Humanoid-v3/onpolicy.png (new file)
BIN examples/mujoco/benchmark/InvertedDoublePendulum-v2/all.png (new file)
BIN examples/mujoco/benchmark/InvertedDoublePendulum-v2/onpolicy.png (new file)
BIN examples/mujoco/benchmark/InvertedPendulum-v2/all.png (new file)
BIN examples/mujoco/benchmark/InvertedPendulum-v2/onpolicy.png (new file)

@@ -2,36 +2,36 @@
## Ant-v3



## HalfCheetah-v3



## Hopper-v3



## Walker2d-v3



## Swimmer-v3



## Humanoid-v3



## Reacher-v2



## InvertedPendulum-v2



## InvertedDoublePendulum-v2



BIN examples/mujoco/benchmark/Reacher-v2/all.png (new file)
BIN examples/mujoco/benchmark/Reacher-v2/onpolicy.png (new file)
BIN examples/mujoco/benchmark/Swimmer-v3/all.png (new file)
BIN examples/mujoco/benchmark/Swimmer-v3/onpolicy.png (new file)
BIN examples/mujoco/benchmark/Walker2d-v3/all.png (new file)
BIN examples/mujoco/benchmark/Walker2d-v3/onpolicy.png (new file)

examples/mujoco/plotter.py (new executable file, 235 lines)
@@ -0,0 +1,235 @@
```python
#!/usr/bin/env python3

import re
import os
import csv
import argparse
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
from collections import defaultdict

from tools import find_all_files


def smooth(y, radius, mode='two_sided', valid_only=False):
    '''Smooth signal y, where radius determines the size of the window.

    mode='two_sided':
        average over the window [max(index - radius, 0), min(index + radius, len(y)-1)]
    mode='causal':
        average over the window [max(index - radius, 0), index]
    valid_only: put nan in entries where the full-sized window is not available
    '''
    assert mode in ('two_sided', 'causal')
    if len(y) < 2 * radius + 1:
        return np.ones_like(y) * y.mean()
    elif mode == 'two_sided':
        convkernel = np.ones(2 * radius + 1)
        out = np.convolve(y, convkernel, mode='same') / \
            np.convolve(np.ones_like(y), convkernel, mode='same')
        if valid_only:
            out[:radius] = out[-radius:] = np.nan
    elif mode == 'causal':
        convkernel = np.ones(radius)
        out = np.convolve(y, convkernel, mode='full') / \
            np.convolve(np.ones_like(y), convkernel, mode='full')
        out = out[:-radius + 1]
        if valid_only:
            out[:radius] = np.nan
    return out


COLORS = ([
    # deepmind style
    '#0072B2',
    '#009E73',
    '#D55E00',
    '#CC79A7',
    # '#F0E442',
    '#d73027',  # RED
    # built-in color
    'blue', 'red', 'pink', 'cyan', 'magenta', 'yellow', 'black', 'purple',
    'brown', 'orange', 'teal', 'lightblue', 'lime', 'lavender', 'turquoise',
    'darkgreen', 'tan', 'salmon', 'gold', 'darkred', 'darkblue', 'green',
    # personal color
    '#313695',  # DARK BLUE
    '#74add1',  # LIGHT BLUE
    '#f46d43',  # ORANGE
    '#4daf4a',  # GREEN
    '#984ea3',  # PURPLE
    '#f781bf',  # PINK
    '#ffc832',  # YELLOW
    '#000000',  # BLACK
])


def csv2numpy(csv_file):
    csv_dict = defaultdict(list)
    reader = csv.DictReader(open(csv_file))
    for row in reader:
        for k, v in row.items():
            csv_dict[k].append(eval(v))
    return {k: np.array(v) for k, v in csv_dict.items()}


def group_files(file_list, pattern):
    res = defaultdict(list)
    for f in file_list:
        match = re.search(pattern, f)
        key = match.group() if match else ''
        res[key].append(f)
    return res


def plot_ax(
    ax,
    file_lists,
    legend_pattern=".*",
    xlabel=None,
    ylabel=None,
    title=None,
    xlim=None,
    xkey='env_step',
    ykey='rew',
    smooth_radius=0,
    shaded_std=True,
    legend_outside=False,
):
    def legend_fn(x):
        # return os.path.split(os.path.join(
        #     args.root_dir, x))[0].replace('/', '_') + " (10)"
        return re.search(legend_pattern, x).group(0)

    legends = map(legend_fn, file_lists)
    # sort file list according to legends
    file_lists = [f for _, f in sorted(zip(legends, file_lists))]
    legends = list(map(legend_fn, file_lists))

    for index, csv_file in enumerate(file_lists):
        csv_dict = csv2numpy(csv_file)
        x, y = csv_dict[xkey], csv_dict[ykey]
        y = smooth(y, radius=smooth_radius)
        color = COLORS[index % len(COLORS)]
        ax.plot(x, y, color=color)
        if shaded_std and ykey + ':shaded' in csv_dict:
            y_shaded = smooth(csv_dict[ykey + ':shaded'], radius=smooth_radius)
            ax.fill_between(x, y - y_shaded, y + y_shaded, color=color, alpha=.2)

    ax.legend(legends, loc=2 if legend_outside else None,
              bbox_to_anchor=(1, 1) if legend_outside else None)
    ax.xaxis.set_major_formatter(mticker.EngFormatter())
    if xlim is not None:
        ax.set_xlim(xmin=0, xmax=xlim)
    # add title
    ax.set_title(title)
    # add labels
    if xlabel is not None:
        ax.set_xlabel(xlabel)
    if ylabel is not None:
        ax.set_ylabel(ylabel)


def plot_figure(
    file_lists,
    group_pattern=None,
    fig_length=6,
    fig_width=6,
    sharex=False,
    sharey=False,
    title=None,
    **kwargs,
):
    if not group_pattern:
        fig, ax = plt.subplots(figsize=(fig_length, fig_width))
        plot_ax(ax, file_lists, title=title, **kwargs)
    else:
        res = group_files(file_lists, group_pattern)
        row_n = int(np.ceil(len(res) / 3))
        col_n = min(len(res), 3)
        fig, axes = plt.subplots(row_n, col_n, sharex=sharex, sharey=sharey, figsize=(
            fig_length * col_n, fig_width * row_n), squeeze=False)
        axes = axes.flatten()
        for i, (k, v) in enumerate(res.items()):
            plot_ax(axes[i], v, title=k, **kwargs)
    if title:  # add title
        fig.suptitle(title, fontsize=20)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='plotter')
    parser.add_argument('--fig-length', type=int, default=6,
                        help='matplotlib figure length (default: 6)')
    parser.add_argument('--fig-width', type=int, default=6,
                        help='matplotlib figure width (default: 6)')
    parser.add_argument('--style', default='seaborn',
                        help='matplotlib figure style (default: seaborn)')
    parser.add_argument('--title', default=None,
                        help='matplotlib figure title (default: None)')
    parser.add_argument('--xkey', default='env_step',
                        help='x-axis key in csv file (default: env_step)')
    parser.add_argument('--ykey', default='rew',
                        help='y-axis key in csv file (default: rew)')
    parser.add_argument('--smooth', type=int, default=0,
                        help='smooth radius of y axis (default: 0)')
    parser.add_argument('--xlabel', default='Timesteps',
                        help='matplotlib figure xlabel')
    parser.add_argument('--ylabel', default='Episode Reward',
                        help='matplotlib figure ylabel')
    parser.add_argument(
        '--shaded-std', action='store_true',
        help='shaded region corresponding to standard deviation of the group')
    parser.add_argument('--sharex', action='store_true',
                        help='whether to share x axis within multiple sub-figures')
    parser.add_argument('--sharey', action='store_true',
                        help='whether to share y axis within multiple sub-figures')
    parser.add_argument('--legend-outside', action='store_true',
                        help='place the legend outside of the figure')
    parser.add_argument('--xlim', type=int, default=None,
                        help='x-axis limitation (default: None)')
    parser.add_argument('--root-dir', default='./', help='root dir (default: ./)')
    parser.add_argument(
        '--file-pattern', type=str, default=r".*/test_rew_\d+seeds.csv$",
        help='regular expression to determine whether or not to include target csv '
             'file, default to including all test_rew_{num}seeds.csv file under rootdir')
    parser.add_argument(
        '--group-pattern', type=str, default=r"(/|^)\w*?\-v(\d|$)",
        help='regular expression to group files in sub-figure, default to grouping '
             'according to env_name dir, "" means no grouping')
    parser.add_argument(
        '--legend-pattern', type=str, default=r".*",
        help='regular expression to extract legend from csv file path, default to '
             'using file path as legend name.')
    parser.add_argument('--show', action='store_true', help='show figure')
    parser.add_argument('--output-path', type=str,
                        help='figure save path', default="./figure.png")
    parser.add_argument('--dpi', type=int, default=200,
                        help='figure dpi (default: 200)')
    args = parser.parse_args()
    file_lists = find_all_files(args.root_dir, re.compile(args.file_pattern))
    file_lists = [os.path.relpath(f, args.root_dir) for f in file_lists]
    if args.style:
        plt.style.use(args.style)
    os.chdir(args.root_dir)
    plot_figure(
        file_lists,
        group_pattern=args.group_pattern,
        legend_pattern=args.legend_pattern,
        fig_length=args.fig_length,
        fig_width=args.fig_width,
        title=args.title,
        xlabel=args.xlabel,
        ylabel=args.ylabel,
        xkey=args.xkey,
        ykey=args.ykey,
        xlim=args.xlim,
        sharex=args.sharex,
        sharey=args.sharey,
        smooth_radius=args.smooth,
        shaded_std=args.shaded_std,
        legend_outside=args.legend_outside)
    if args.output_path:
        plt.savefig(args.output_path,
                    dpi=args.dpi, bbox_inches='tight')
    if args.show:
        plt.show()
```
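The two-sided branch of `smooth` is a ratio of two convolutions: dividing by a convolved ones-array normalizes the window correctly where it shrinks at the edges. A self-contained sketch of that trick (reimplemented here so it runs on its own):

```python
import numpy as np

def smooth_two_sided(y, radius):
    # Moving average over [i - radius, i + radius]; the denominator counts how
    # many samples actually fall in the window, so edge points are not biased.
    kernel = np.ones(2 * radius + 1)
    return np.convolve(y, kernel, mode='same') / \
        np.convolve(np.ones_like(y), kernel, mode='same')

y = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
s = smooth_two_sided(y, radius=1)
# Interior points average three neighbors; endpoints average only two.
```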

@@ -2,10 +2,11 @@

```bash
LOGDIR="results"
TASK=$1
ALGO=$2

echo "Experiments started."
for seed in $(seq 0 9)
do
python mujoco_${ALGO}.py --task $TASK --epoch 200 --seed $seed --logdir $LOGDIR > ${TASK}_`date '+%m-%d-%H-%M-%S'`_seed_$seed.txt 2>&1 &
done
echo "Experiments ended."
```

examples/mujoco/tools.py (new executable file, 100 lines)
@@ -0,0 +1,100 @@
```python
#!/usr/bin/env python3

import os
import re
import csv
import tqdm
import argparse
import numpy as np
from typing import Dict, List, Union
from tensorboard.backend.event_processing import event_accumulator


def find_all_files(root_dir: str, pattern: re.Pattern) -> List[str]:
    """Find all files under root_dir according to relative pattern."""
    file_list = []
    for dirname, _, files in os.walk(root_dir):
        for f in files:
            absolute_path = os.path.join(dirname, f)
            if re.match(pattern, absolute_path):
                file_list.append(absolute_path)
    return file_list


def convert_tfevents_to_csv(
    root_dir: str, refresh: bool = False
) -> Dict[str, List[List[Union[str, int, float]]]]:
    """Recursively convert test/rew from all tfevent files under root_dir to csv.

    This function assumes that there is at most one tfevents file in each directory
    and will add a suffix to that directory.

    :param bool refresh: re-create csv files unconditionally.
    """
    tfevent_files = find_all_files(root_dir, re.compile(r"^.*tfevents.*$"))
    print(f"Converting {len(tfevent_files)} tfevents files under {root_dir} ...")
    result = {}
    with tqdm.tqdm(tfevent_files) as t:
        for tfevent_file in t:
            t.set_postfix(file=tfevent_file)
            output_file = os.path.join(os.path.split(tfevent_file)[0], "test_rew.csv")
            if os.path.exists(output_file) and not refresh:
                content = list(csv.reader(open(output_file, "r")))
                if content[0] == ["env_step", "rew", "time"]:
                    for i in range(1, len(content)):
                        content[i] = list(map(eval, content[i]))
                    result[output_file] = content
                    continue
            ea = event_accumulator.EventAccumulator(tfevent_file)
            ea.Reload()
            initial_time = ea._first_event_timestamp
            content = [["env_step", "rew", "time"]]
            for test_rew in ea.scalars.Items("test/rew"):
                content.append([
                    round(test_rew.step, 4),
                    round(test_rew.value, 4),
                    round(test_rew.wall_time - initial_time, 4),
                ])
            csv.writer(open(output_file, 'w')).writerows(content)
            result[output_file] = content
    return result


def merge_csv(
    csv_files: Dict[str, List[List[Union[str, int, float]]]],
    root_dir: str,
    remove_zero: bool = False,
) -> None:
    """Merge result in csv_files into a single csv file."""
    assert len(csv_files) > 0
    if remove_zero:
        for k, v in csv_files.items():
            if v[1][0] == 0:
                v.pop(1)
    sorted_keys = sorted(csv_files.keys())
    sorted_values = [csv_files[k][1:] for k in sorted_keys]
    content = [["env_step", "rew", "rew:shaded"] + list(map(
        lambda f: "rew:" + os.path.relpath(f, root_dir), sorted_keys))]
    for rows in zip(*sorted_values):
        array = np.array(rows)
        assert len(set(array[:, 0])) == 1, (set(array[:, 0]), array[:, 0])
        line = [rows[0][0], round(array[:, 1].mean(), 4), round(array[:, 1].std(), 4)]
        line += array[:, 1].tolist()
        content.append(line)
    output_path = os.path.join(root_dir, f"test_rew_{len(csv_files)}seeds.csv")
    print(f"Output merged csv file to {output_path} with {len(content[1:])} lines.")
    csv.writer(open(output_path, "w")).writerows(content)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('--root-dir', type=str)
    parser.add_argument(
        '--refresh', action="store_true",
        help="Re-generate all csv files instead of using existing ones.")
    parser.add_argument(
        '--remove-zero', action="store_true",
        help="Remove the data point of env_step == 0.")
    args = parser.parse_args()
    csv_files = convert_tfevents_to_csv(args.root_dir, args.refresh)
    merge_csv(csv_files, args.root_dir, args.remove_zero)
```
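The core of `merge_csv` is aligning one row per seed by `env_step` and reducing the rewards to a mean and a std (the `rew:shaded` column). A toy version of that aggregation, with invented rewards:

```python
import numpy as np

# Rows are (env_step, rew); one list per seed, steps already aligned.
seed_runs = [
    [(5000, 90.0), (10000, 190.0)],
    [(5000, 110.0), (10000, 210.0)],
]

merged = []
for rows in zip(*seed_runs):
    array = np.array(rows)
    # Every seed must report the same env_step for this row.
    assert len(set(array[:, 0])) == 1
    merged.append([array[0, 0], array[:, 1].mean(), array[:, 1].std()])
# merged rows mirror the env_step, rew, rew:shaded columns of the output csv.
```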