A research team from the University of Montreal and Max Planck Institute for Intelligent Systems constructs a reinforcement learning agent whose knowledge and reward function can be reused across tasks, along with an attention mechanism that dynamically selects unchangeable knowledge pieces to enable out-of-distribution adaptation and generalization.

Here is a quick read: Yoshua Bengio Team’s Recurrent Independent Mechanisms Endow RL Agents With Out-of-Distribution Adaptation and Generalization Abilities.

The paper Fast and Slow Learning of Recurrent Independent Mechanisms is on arXiv.



Source link