AUDITORY COGNITIVE NEUROSCIENCE EXPERIENCE LAB

 

 


 

 

RESEARCH

We are always recruiting participants for our studies.  If you have no history of speech and/or hearing disorders and would like to serve as a volunteer, please email Sarah Sullivan.


3-D reconstructions of static male and female vocal tract shapes during the production of an American English vowel (image provided by Dr. Story).

Project: Vocal Tract Normalization

 

We are currently studying listeners’ abilities to understand speech that is produced by a variety of speakers.  Acoustical analyses reveal large degrees of variation amongst speech produced by different talkers.  For example, the word ‘dog’ produced by two different people will be acoustically very different.  A listener, however, will likely perceive both productions as ‘dog.’  This perceptual constancy in light of acoustic variability is referred to as talker normalization.  The most commonly cited example of talker normalization is the perception of speech produced by males and females.  Despite widely different acoustic signals, resulting largely from differences in vocal tract size/shape, listeners have little difficulty understanding speech produced by either sex.  Dr. Brad Story, one of our collaborators here in the Speech, Language, and Hearing Sciences department at UA, has examined variance in vowel productions using Magnetic Resonance Imaging (MRI) and X-ray microbeam technology.  These images have shown that many of the differences between talkers result from differences in their neutral vocal tract shape (the shape of the airspace when producing a neutral vowel).  We will use Dr. Story’s model to investigate the role neutral vocal tract size/shape plays in talker normalization. 

 

Story and Titze (2002) paper on neutral vocal tracts

 

Story 2005 paper on vocal tract model

 


 

Stimuli were created by manipulating the duration of the silent gap between the “s” and “a” in the word “say.”

Project: Speech in Noise

 

We are conducting a series of experiments studying how different listening environments (e.g., quiet, noisy) affect speech perception.  It has been shown that the perception of words such as “say” or “stay” can be shifted in a predictable manner by changing the duration of the silent gap between the “s” and the vowel in “say” (Best, Morrongiello, and Robson, 1981).  A number of similar findings involving different words, syllables, and complex non-speech sounds are well documented in the area of speech perception.  An individual’s perception, however, might also be affected by the listening context in which the speech is presented (e.g., when the words are embedded in background noise).  Specifically, we predict that many classic perceptual shift studies may yield different findings if the same stimuli are presented in a variety of noises (e.g., speech-shaped noise, white noise).  It is typical for experimenters to present speech (or complex speech-like sounds) in quiet.  These tasks are generally easy for normal-hearing listeners and fail to accurately model real-life listening situations.  Most real-world speech perception takes place in non-ideal environments filled with varying degrees of background noise.  The results will inform us about how listeners integrate acoustic cues to categorize speech sounds phonemically.
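As a rough illustration of how stimuli of this kind could be built, the sketch below inserts a silent gap between a fricative and a vowel segment and then mixes the result with noise at a chosen signal-to-noise ratio (SNR).  It is not the lab's actual procedure: the function names, the 16 kHz sample rate, and the RMS-based SNR definition are assumptions made for the example, and the waveforms are plain lists of samples rather than recorded speech.

```python
import math


def insert_silent_gap(fricative, vowel, gap_ms, sample_rate=16000):
    """Return fricative + gap_ms milliseconds of silence + vowel.

    Hypothetical helper: the segments are lists of samples and the
    sample rate is an assumed value, not one taken from the study.
    """
    gap_samples = round(gap_ms * sample_rate / 1000)
    return list(fricative) + [0.0] * gap_samples + list(vowel)


def rms(samples):
    """Root-mean-square amplitude of a list of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaling the noise to reach snr_db.

    SNR is defined here (an assumption) as 20*log10(rms_speech / rms_noise),
    computed over the whole stimulus, silent gap included.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    target_noise_rms = rms(speech) / (10 ** (snr_db / 20))
    scale = target_noise_rms / rms(noise)
    return [s + scale * n for s, n in zip(speech, noise)]
```

A continuum could then be produced by calling insert_silent_gap over a range of gap_ms values and passing each result to mix_at_snr with the same noise type.  Because the RMS here includes the silent gap, longer gaps slightly lower the measured speech level; a real experiment might compute the level over the speech portions only.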

 

2008 AAS Poster

 

Best, Morrongiello, and Robson (1981) paper

Ambiguous “say-stay” stimuli were embedded in quiet and speech-shaped noise.

 


 

 

Neural correlates of speech perception

Project: High Density Electroencephalic Event-related Bandpower as a Biomarker for Disordered Speech Perception

 

This collaborative project between UA (Dr. Lotto’s Auditory Cognitive Neuroscience Experience Lab), ASU (Dr. Julie Liss’ Motor Speech Disorders Lab), and the Mayo Clinic in Scottsdale (Dr. John Caviness) seeks to merge three strong independent lines of research with emerging state-of-the-art functional brain mapping technology.  Our ultimate goal is to define the temporal-spatial cortical activation associated with the perception of intelligible speech.  The topic of the neural correlates of speech processing is of broad interest, with both contemporary clinical and basic science implications.  The recent surge of papers on the topic includes results of functional neuroimaging studies that define the anatomical substrates of speech processing.  However, the temporal sequence of activation of these structures, an aspect critical to processing degraded speech such as dysarthric speech, remains to be established.  We are in a unique position to tackle this question because of our constellation of expertise in perceptual processing of normal and disordered speech and state-of-the-art electroencephalography (EEG) mapping.  This project will incorporate behavioral data on speech perception and production with neurophysiologic data to develop and test a model of how listeners map from auditory to semantic representations.  The novel EEG modeling techniques, along with work on speech perception at the phonemic and sentence level, will be the basis for a unique research program with theoretical and clinical impact.

 


 

 

Speaker Normalization

Project: Talker/Speaker Normalization

 

Numerous studies have demonstrated that the perception of a single speech sound, typically a vowel or consonant, can be altered by the characteristics of surrounding sounds, sentences, or carrier phrases (Ladefoged and Broadbent, 1957; Mann, 1980; Summerfield, 1981; Lotto and Kluender, 1998; Holt, 2005).  It has been suggested that this perceptual shift allows listeners to account for individual speaker variability in speech production.  That is, the listener ‘tunes’ or ‘normalizes’ his/her speech sound categories in accordance with the particular characteristics of the talker.  The underlying assumption is that in order to have robust spoken language perception, one cannot rely on average categories for individual speech sounds across speakers.  However, in the 50 years of study regarding “talker normalization,” there has been no demonstration that the perceptual shifts are necessary or even helpful in the understanding of normal spoken language.  We are in the process of testing the importance of talker normalization for comprehension of spoken language in a number of ways.

 

Classic Ladefoged and Broadbent (1957) paper

 


 

 

Top-down lexical influences or general auditory effects?

Project: A General Auditory Explanation for Lexical (i.e., TRACE model) Context Effects

 

In 1988, Elman and McClelland presented data suggesting that context effects can be triggered by “illusory phonemes.”  In their study, listeners performed a phoneme identification task in which context words (e.g., “foolish” and “Christmas”) were followed by a target sound (an ambiguous /t/-/k/ or /d/-/g/).  The final sound of the context word was manipulated to create an intermediate “sh”/“s” sound.  The TRACE model was then used to accurately predict listeners’ phoneme identification shifts through the use of top-down lexical influences.  However, there may be a simpler explanation for these findings, one that relies on general auditory contrast effects like those obtained by Lotto and Kluender (1998).  This study tests whether acoustic characteristics of the context words, as opposed to the linguistic content, can account for the findings.
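One common way to quantify an identification shift of this kind is to compare the category boundary (the point on the target continuum where responses cross 50%) between two context conditions.  The sketch below shows that calculation using linear interpolation.  It is an illustration only, not the analysis used in the studies cited above: the function name is hypothetical, and the response proportions are invented placeholder numbers, labeled neutrally as contexts A and B.

```python
def category_boundary(steps, proportions):
    """Continuum value at which the response proportion crosses 0.5.

    Uses linear interpolation between the two neighbouring continuum
    steps that straddle 0.5 (a simplifying assumption; fitted logistic
    or probit functions are a common alternative).
    """
    pairs = list(zip(steps, proportions))
    for (x0, p0), (x1, p1) in zip(pairs, pairs[1:]):
        if (p0 - 0.5) * (p1 - 0.5) <= 0 and p0 != p1:
            return x0 + (0.5 - p0) * (x1 - x0) / (p1 - p0)
    raise ValueError("identification function never crosses 0.5")


# Invented placeholder data: proportion of one response category at each
# step of a 7-step target continuum, in two context conditions.
steps = [1, 2, 3, 4, 5, 6, 7]
context_a = [0.05, 0.10, 0.30, 0.60, 0.85, 0.95, 1.00]
context_b = [0.02, 0.05, 0.15, 0.40, 0.70, 0.90, 0.98]

# A nonzero difference indicates that the context moved the boundary.
boundary_shift = (category_boundary(steps, context_b)
                  - category_boundary(steps, context_a))
print(f"Boundary shift in continuum steps: {boundary_shift:.2f}")
```

The same comparison applies whether the shift is attributed to lexical feedback or to auditory contrast; the measure itself does not distinguish between the two accounts.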

 

Elman & McClelland (1988) paper

 

Lotto & Kluender (1998) paper

 

 

The University of Arizona

Tucson