* Remove flag `eval_mode` from Collector.collect
* Replace flag `is_eval` in BasePolicy with `is_within_training_step` (negating usages)
and set it appropriately in BaseTrainer
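
The inverted meaning of the renamed flag can be illustrated with a minimal sketch. It assumes the flag gates training-only behaviour such as exploration; `ToyPolicy`, `act`, and `training_step` are hypothetical names made up for this illustration, not the library's actual `BasePolicy` or `BaseTrainer` code.

```python
from contextlib import contextmanager


class ToyPolicy:
    """Hypothetical stand-in for a policy; not the library's BasePolicy."""

    def __init__(self) -> None:
        # Takes the place of the old `is_eval` flag, with inverted meaning.
        self.is_within_training_step = False

    def act(self, greedy_action: int, exploratory_action: int) -> int:
        # A check that used to read `if not self.is_eval:` becomes a
        # positive check on the new flag.
        if self.is_within_training_step:
            return exploratory_action
        return greedy_action


@contextmanager
def training_step(policy: ToyPolicy):
    """Hypothetical helper showing how a trainer might set and restore the flag."""
    previous = policy.is_within_training_step
    policy.is_within_training_step = True
    try:
        yield policy
    finally:
        # Restore the previous value even if the training step raises.
        policy.is_within_training_step = previous


policy = ToyPolicy()
outside = policy.act(greedy_action=0, exploratory_action=1)
with training_step(policy):
    inside = policy.act(greedy_action=0, exploratory_action=1)
```

In this sketch the caller no longer passes an evaluation flag when collecting data; the policy's behaviour depends only on whether the (assumed) trainer-side helper has marked a training step as active.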