A key problem in reinforcement learning is how an animal can learn a sequence of movements when the reward signal arrives only at the end of the sequence. We describe how a hierarchical dynamical model of motor function solves this problem of delayed reward using associative (Hebbian) learning. At the lowest level, the motor system encodes simple movements, or primitives; at higher levels, it encodes sequences of primitives. During training, the network learns a high-level motor program composed of a specific temporal sequence of motor primitives. It does so even though the reward signal, which indicates whether the desired motor program has been performed correctly, is received only at the end of each trial. A continuous attractor network in the architecture enables the network to generate the motor outputs needed to produce the continuous movements that implement the motor sequence.
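
The idea of learning a sequence from a reward that arrives only at the end of a trial can be illustrated with a toy sketch. This is not the model from the paper: it is a minimal, discrete stand-in that assumes one weight row per sequence step, winner-take-all selection with uniform noise for exploration, and a per-trial co-activity trace that a terminal reward gates into a Hebbian-style weight update. The names `train_sequence` and `recall`, the learning rate, and the trial count are all hypothetical, and the continuous attractor network that generates continuous movements is omitted.

```python
import numpy as np


def train_sequence(target, n_primitives, n_trials=2000, lr=0.5, seed=0):
    """Learn a sequence of primitive indices from an end-of-trial reward.

    Illustrative assumption: weights[step, p] links a higher-level
    "sequence step" unit to motor primitive p.
    """
    rng = np.random.default_rng(seed)
    n_steps = len(target)
    weights = np.zeros((n_steps, n_primitives))

    for _ in range(n_trials):
        # Records which (step, primitive) pairs were co-active this trial.
        trace = np.zeros_like(weights)
        chosen = []
        for step in range(n_steps):
            # Noisy winner-take-all: the noise provides exploration.
            activation = weights[step] + rng.random(n_primitives)
            winner = int(np.argmax(activation))
            chosen.append(winner)
            trace[step, winner] = 1.0

        # Reward is available only after the whole sequence has been produced.
        reward = 1.0 if chosen == list(target) else 0.0

        # Reward-gated Hebbian-style update: strengthen co-active pairs
        # only when the completed trial was rewarded.
        weights += lr * reward * trace

    return weights


def recall(weights):
    """Read out the learned sequence without exploration noise."""
    return [int(i) for i in np.argmax(weights, axis=1)]


if __name__ == "__main__":
    target = [2, 0, 1]  # hypothetical motor program over 3 primitives
    learned = train_sequence(target, n_primitives=3)
    print(recall(learned))
```

Because unrewarded trials leave the weights unchanged, only the co-activations that occurred during a fully correct sequence are ever strengthened. That is the sense in which a single delayed reward can assign credit to every step of the trial in this sketch.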

Original publication

DOI

10.1016/j.neunet.2006.01.016

Type

Journal article

Journal

Neural Netw

Publication Date

03/2007

Volume

20

Pages

172–181

Keywords

Animals; Humans; Learning; Models, Neurological; Movement; Neural Networks (Computer); Reward; Time Factors