Whole-Brain Neural Dynamics of Probabilistic Reward Prediction.
Bach DR., Symmonds M., Barnes G., Dolan RJ.
Predicting future reward is paramount to performing an optimal action. Although a number of brain areas are known to encode such predictions, a detailed account of how the associated representations evolve over time is lacking. Here, we address this question using human magnetoencephalography (MEG) and multivariate analyses of instantaneous activity in reconstructed sources. We overtrained participants on a simple instrumental reward learning task where geometric cues predicted a distribution of possible rewards, from which a sample was revealed 2000 ms later. We show that predicted mean reward (i.e., expected value) and predicted reward variability (i.e., economic risk) are encoded distinctly. Early on, representations of mean reward are seen in parietal and visual areas, and later in frontal regions, with orbitofrontal cortex emerging last. Strikingly, an encoding of reward variability emerges simultaneously in parietal/sensory and frontal sources, and later than mean reward encoding. An orbitofrontal variability encoding emerges around the same time as that seen for mean reward. Crucially, cross-prediction showed that mean reward and variability representations are distinct and also revealed that instantaneous representations become more stable over time. Across sources, the best-fitting metric for variability signals was the coefficient of variation (rather than SD or variance), but distinct best metrics were seen for individual brain regions. Our data demonstrate how a dynamic encoding of probabilistic reward prediction unfolds in the brain in both time and space.

SIGNIFICANCE STATEMENT: Predicting future reward is paramount to optimal behavior. To gain insight into the underlying neural computations, we investigate how reward representations in the brain arise over time. Using magnetoencephalography, we show that a representation of predicted mean reward emerges early in parietal/sensory regions and later in frontal cortex. In contrast, predicted reward variability representations appear in most regions at the same time, and slightly later than for mean reward. For both features, representations change dynamically for >1000 ms before stabilizing. The best metric for encoding variability is the coefficient of variation, with heterogeneity in this encoding seen between brain areas. The results provide novel insights into the emergence of predictive reward representations.
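
As an illustration of the three candidate variability metrics named above (variance, SD, and coefficient of variation), the following minimal Python sketch computes them for a discrete reward distribution. The function name `reward_statistics` and the example outcome values and probabilities are hypothetical and chosen only for illustration; the abstract does not specify the reward distributions used in the task. The coefficient of variation is taken here in its standard form, SD divided by the mean.

```python
import math


def reward_statistics(outcomes, probabilities):
    """Return mean, variance, SD, and coefficient of variation (CV)
    of a discrete reward distribution.

    Hypothetical helper for illustration only; not the authors' analysis code.
    """
    if len(outcomes) != len(probabilities):
        raise ValueError("outcomes and probabilities must have equal length")
    if not math.isclose(sum(probabilities), 1.0):
        raise ValueError("probabilities must sum to 1")

    # Expected value: probability-weighted mean of the outcomes.
    mean = sum(p * x for x, p in zip(outcomes, probabilities))
    # Variance: probability-weighted squared deviation from the mean.
    variance = sum(p * (x - mean) ** 2 for x, p in zip(outcomes, probabilities))
    sd = math.sqrt(variance)
    # CV scales the SD by the mean; it is undefined when the mean is zero.
    cv = sd / mean if mean != 0 else float("nan")
    return {"mean": mean, "variance": variance, "sd": sd, "cv": cv}


# Two hypothetical cues with the same spread but different expected value.
low_value_cue = reward_statistics([2, 4, 6], [0.25, 0.5, 0.25])
high_value_cue = reward_statistics([12, 14, 16], [0.25, 0.5, 0.25])

for name, stats in [("low", low_value_cue), ("high", high_value_cue)]:
    print(name, {k: round(v, 3) for k, v in stats.items()})
```

In this sketch the two cues share the same SD and variance but differ in coefficient of variation, because the same spread is smaller relative to a larger mean. This is the sense in which the three metrics can make different predictions about a variability signal.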