A new study in reinforcement learning theory shows that extending the temporal difference algorithm to unbiased learning under state uncertainty explains the observed ramping behaviour of dopamine neurons.
Journal article
Curr Biol
14/03/2022
32
R213 - R215