compute_episodic_return
- Simplify the code.
- Apply value normalization (global) and advantage normalization (per batch) in on-policy algorithms.
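
A rough sketch of the two normalization scopes mentioned above, for reviewers who want the distinction spelled out. This is not the code in this change: `RunningMeanStd` and `normalize_advantages` are hypothetical names, and subtracting the mean during value normalization is an assumption of the sketch. The global statistics persist across batches, while the advantage statistics are recomputed from each batch alone.

```python
import numpy as np


class RunningMeanStd:
    """Hypothetical helper: global running mean/variance of returns."""

    def __init__(self, eps: float = 1e-8) -> None:
        self.mean = 0.0
        self.var = 1.0
        self.count = 0
        self.eps = eps

    def update(self, x: np.ndarray) -> None:
        # Merge batch statistics into the running statistics
        # (parallel variance combination).
        b_mean, b_var, b_count = x.mean(), x.var(), x.size
        delta = b_mean - self.mean
        total = self.count + b_count
        m2 = (
            self.var * self.count
            + b_var * b_count
            + delta**2 * self.count * b_count / total
        )
        self.mean = self.mean + delta * b_count / total
        self.var = m2 / total
        self.count = total

    def normalize(self, x: np.ndarray) -> np.ndarray:
        # Assumption: subtract the mean as well as scaling by the std.
        return (x - self.mean) / np.sqrt(self.var + self.eps)


def normalize_advantages(adv: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    # Per-batch: statistics come only from the batch being normalized.
    return (adv - adv.mean()) / (adv.std() + eps)
```

In this sketch, `update` would be called with the returns of every collected batch so the value targets share one scale over training, whereas `normalize_advantages` keeps no state between calls.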