Spatial and temporal factors during processing of audiovisual speech: a PET study.
Macaluso E., George N., Dolan R., Spence C., Driver J.
Speech perception can use not only auditory signals, but also visual information from seeing the speaker's mouth. The relative timing and relative location of auditory and visual inputs are both known to influence crossmodal integration psychologically, but previous imaging studies of audiovisual speech focused primarily on just temporal aspects. Here we used Positron Emission Tomography (PET) during audiovisual speech processing to study how temporal and spatial factors might jointly affect brain activations. In agreement with previous work, synchronous versus asynchronous audiovisual speech yielded increased activity in multisensory association areas (e.g., superior temporal sulcus [STS]), plus in some unimodal visual areas. Our orthogonal manipulation of relative stimulus position (auditory and visual stimuli presented at same location vs. opposite sides) and stimulus synchrony showed that (i) ventral occipital areas and superior temporal sulcus were unaffected by relative location; (ii) lateral and dorsal occipital areas were selectively activated for synchronous bimodal stimulation at the same external location; (iii) right inferior parietal lobule was activated for synchronous auditory and visual stimuli at different locations, that is, in the condition classically associated with the 'ventriloquism effect' (shift of perceived auditory position toward the visual location). Thus, different brain regions are involved in different aspects of audiovisual integration. While ventral areas appear more affected by audiovisual synchrony (which can influence speech identification), more dorsal areas appear to be associated with spatial multisensory interactions.