Intrinsically Motivated Modular Multi-Goal RL
Curious can target multiple modular goals using a single policy. It is intrinsically motivated to choose its own goals. It tracks its own competence and competence progress, and focuses on goals with high progress. This enables efficient learning and makes the agent resistant to distracting goals, forgetting, and sensory failures.
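The goal-selection idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the algorithm's actual implementation: the class name `ProgressTracker`, the window size, the use of the absolute difference between recent and older success rates as "progress", and the `epsilon` mixing with a uniform distribution are all hypothetical choices made for this example.

```python
import random
from collections import deque


class ProgressTracker:
    """Hypothetical sketch: track per-goal competence and pick goals by progress."""

    def __init__(self, n_goals, window=20, epsilon=0.2):
        self.window = window      # number of outcomes in each half of the history
        self.epsilon = epsilon    # share of probability mass spread uniformly
        # Keep the last 2 * window outcomes (1.0 = success, 0.0 = failure) per goal.
        self.history = [deque(maxlen=2 * window) for _ in range(n_goals)]

    def record(self, goal, success):
        self.history[goal].append(1.0 if success else 0.0)

    def competence(self, goal):
        # Success rate over the most recent window of attempts.
        recent = list(self.history[goal])[-self.window:]
        return sum(recent) / len(recent) if recent else 0.0

    def progress(self, goal):
        # Absolute change in success rate between the older and newer half.
        # Using the absolute value means a drop in competence (forgetting)
        # also attracts attention. This is an assumption of the sketch.
        outcomes = list(self.history[goal])
        if len(outcomes) < 2 * self.window:
            return 0.0
        older = outcomes[:self.window]
        recent = outcomes[self.window:]
        return abs(sum(recent) / self.window - sum(older) / self.window)

    def probabilities(self):
        n = len(self.history)
        progress = [self.progress(g) for g in range(n)]
        total = sum(progress)
        if total == 0:
            return [1.0 / n] * n  # no signal yet: choose uniformly
        return [self.epsilon / n + (1 - self.epsilon) * p / total for p in progress]

    def sample_goal(self, rng=random):
        weights = self.probabilities()
        return rng.choices(range(len(weights)), weights=weights)[0]
```

A goal whose success rate stays flat, whether because it is already mastered or because it is impossible, receives only the small uniform share of attempts, which is one way to picture the resistance to distracting goals mentioned above.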
How Many Random Seeds?
Reproducibility in Machine Learning, and in Deep Reinforcement Learning in particular, has become a serious issue in recent years. In this blog post, we present a statistical guide to performing rigorous comparisons of RL algorithms.
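As a rough illustration of what such a comparison can look like, the sketch below computes Welch's t statistic and its approximate degrees of freedom for two sets of per-seed final scores. Welch's test is one common choice when the two algorithms may have different variances; its use here is an assumption of this example, and the function name `welch_t` is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(scores_a, scores_b):
    """Return (t, df) for Welch's unequal-variance t-test.

    Each argument is a list of final performances, one per random seed,
    with at least two seeds per algorithm.
    """
    n_a, n_b = len(scores_a), len(scores_b)
    # Squared standard error of each mean, using the sample variance.
    se2_a = variance(scores_a) / n_a
    se2_b = variance(scores_b) / n_b
    t = (mean(scores_a) - mean(scores_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df
```

The resulting statistic would then be compared against a Student t distribution with `df` degrees of freedom to obtain a p-value. With only a handful of seeds the standard error stays large, so small differences in mean performance are unlikely to reach significance.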
Bootstrapping Deep RL with Population-Based Diversity Search
Standard deep RL algorithms using continuous actions suffer from inefficient exploration when facing sparse or deceptive reward problems. Here we propose to decouple exploration and exploitation. An exploration algorithm first optimizes for diversity in the space of behaviors. Then, a state-of-the-art deep RL algorithm uses the collected trajectories for bootstrapping.