In their recent article, Sweeny, Guzman-Martinez, Ortega, Grabowecky, and Suzuki (2012) demonstrate that heard speech sounds modulate the perceived shape of briefly presented visual stimuli. Ovals, whose aspect ratio (relating width to height) varied on a trial-by-trial basis, were rated as looking wider when a /woo/ sound was presented, and as taller when a /wee/ sound was presented instead. On the one hand, these findings add to a growing body of evidence demonstrating that audiovisual correspondences can have perceptual (as well as decisional) effects. On the other hand, they prompt a question concerning their origin. Although the currently popular view is that crossmodal correspondences are based on the internalization of the natural multisensory statistics of the environment (see Spence, 2011), these new results suggest instead that certain correspondences may actually be based on the sensorimotor responses associated with human vocalizations. As such, the findings of Sweeny et al. help to breathe new life into Sapir's (1929) once-popular "embodied" explanation of sound symbolism. Furthermore, they pose a challenge for those psychologists wanting to determine which among a number of plausible accounts best explains the available data on crossmodal correspondences.

Original publication

Type: Journal article

Pages: 550-552

Keywords: audition, crossmodal correspondence, multisensory integration, sound symbolism, vision