A research team from Carnegie Mellon University, Google Brain, and UC Berkeley proposes robust predictable control (RPC), a method for learning reinforcement learning policies that use fewer bits of information. The simple, theoretically justified algorithm achieves much tighter compression, is more robust, and generalizes better than prior methods, attaining up to 5× higher rewards than a standard information bottleneck.
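The core idea of an information bottleneck in RL is to penalize the policy for the number of bits its latent representation carries about the observation. As a rough, hypothetical sketch (not the paper's implementation), the objective trades off return against a KL divergence between a Gaussian encoder and a fixed standard-normal prior; the function names and the penalty weight `lam` here are illustrative assumptions:

```python
import numpy as np

def gaussian_kl(mu, sigma):
    # KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions.
    # This KL upper-bounds the information (in nats) the latent
    # carries about the observation.
    return float(np.sum(np.log(1.0 / sigma) + (sigma**2 + mu**2) / 2.0 - 0.5))

def bottlenecked_objective(rewards, mu, sigma, lam=0.1):
    # Maximize return while paying a price per bit of information used:
    #   J = sum(rewards) - lam * KL(encoder || prior)
    # lam is a hypothetical trade-off coefficient; larger values force
    # a more compressed (fewer-bit) policy.
    return float(np.sum(rewards)) - lam * gaussian_kl(np.asarray(mu), np.asarray(sigma))

# When the encoder matches the prior exactly (mu=0, sigma=1), the KL
# penalty is zero and the objective reduces to the plain return.
```

A compressed policy scores well on this objective only if it can keep rewards high while keeping `mu` near zero and `sigma` near one, i.e., while revealing little about the observation.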

Here is a quick read: CMU, Google & UC Berkeley Propose Robust Predictable Control Policies for RL Agents.

The paper Robust Predictable Control is on arXiv.
