Reinforcement Learning, Fast and Slow
Recent AI research has given rise to powerful techniques for deep reinforcement learning. In its combination of representation learning with reward-driven behavior, deep reinforcement learning would appear to have inherent interest for psychology and neuroscience.
One reservation has been that deep reinforcement learning procedures demand large amounts of training data, suggesting that these algorithms may differ fundamentally from those underlying human learning.
While this concern applies to the initial wave of deep reinforcement learning (RL) techniques, subsequent AI work has established methods that allow deep RL systems to learn more quickly and efficiently.
Two particularly interesting and promising techniques center, respectively, on episodic memory and meta-learning.
Beyond their value as AI techniques, deep RL methods that leverage episodic memory and meta-learning have direct implications for psychology and neuroscience.
One subtle but critically important insight that these techniques bring into focus is the fundamental connection between fast and slow forms of learning.
Deep reinforcement learning (RL) methods have driven impressive advances in artificial intelligence in recent years, exceeding human performance in domains ranging from Atari to Go to no-limit poker.
This progress has drawn the attention of cognitive scientists interested in understanding human learning.
However, the concern has been raised that deep RL may be too sample-inefficient – that is, it may simply be too slow – to provide a plausible model of how humans learn.
In the present review, we counter this critique by describing recently developed techniques that allow deep RL to operate more nimbly, solving problems much more quickly than previous methods.
Although these techniques were developed in an AI context, we propose that they may have rich implications for psychology and neuroscience.
A key insight, arising from these AI methods, concerns the fundamental connection between fast RL and slower, more incremental forms of learning.