Research


The world is filled with predictability, and sensory systems excel at capitalizing on this structure. This allows perceptual systems to process predictable stimuli efficiently and to emphasize how unpredictable stimuli differ from what came before. This has profound effects on how we discriminate, identify, recognize, and learn sounds, particularly speech sounds. Our research explores these ideas through the following projects:


Context effects in speech sound categorization

All of perception takes place in context. How we perceive any given sound is influenced by the acoustic properties of sounds heard earlier. If the spectral characteristics of earlier sounds (the context) differ from the spectrum of a later sound (the target), responses to the target sound become biased. This is known as a spectral contrast effect. We have shown that spectral contrast effects are quite general, occurring for a wide range of stimuli (consonants, vowels, musical instruments), in a wide variety of cases (robust as well as subtle spectral characteristics of earlier sounds), and for different listener populations (listeners with normal hearing, listeners with sensorineural hearing loss, simulations of cochlear implant processing). We continue to pursue the mechanisms behind spectral contrast effects in order to better understand how, when, and by how much they shape everyday speech perception. Additionally, we are extending these paradigms to temporal contrast effects, in which quickly spoken earlier sounds can make a target sound seem slower (e.g., heard as /w/) and slowly spoken earlier sounds can make it seem faster (e.g., heard as /b/). Finally, we are exploring effects of statistical context on speech perception, where stable statistical properties of earlier sounds also bias categorization of later speech sounds.
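
As a concrete illustration, here is a minimal, hypothetical sketch of how a spectral contrast bias might be quantified: compare a target sound's spectrum against the long-term average spectrum (LTAS) of the preceding context, on the assumption that frequencies under-represented in the context are perceptually emphasized in the target. The function names, the use of Welch's method, and all parameter values are illustrative assumptions, not our published analysis pipeline.

```python
# Hypothetical sketch: predict the direction of a spectral contrast bias by
# comparing a target's spectrum to the long-term average spectrum (LTAS) of
# the preceding context. All names and parameters are illustrative.
import numpy as np
from scipy.signal import welch

def long_term_average_spectrum(signal, fs, nperseg=1024):
    """Estimate the long-term average spectrum (in dB) via Welch's method."""
    freqs, psd = welch(signal, fs=fs, nperseg=nperseg)
    return freqs, 10 * np.log10(psd + 1e-12)

def predicted_contrast_bias(context, target, fs):
    """Target spectrum minus context LTAS. Positive values mark frequencies
    under-represented in the context, predicted to be perceptually
    emphasized (contrastively) in the target."""
    freqs, context_ltas = long_term_average_spectrum(context, fs)
    _, target_spectrum = long_term_average_spectrum(target, fs)
    return freqs, target_spectrum - context_ltas

# Toy usage: a context with extra low-frequency energy should make the
# target's higher frequencies relatively more prominent.
fs = 16000
rng = np.random.default_rng(0)
context = rng.standard_normal(fs)                        # 1 s stand-in context
context += np.sin(2 * np.pi * 500 * np.arange(fs) / fs)  # low-frequency peak
target = rng.standard_normal(fs // 4)                    # 250 ms stand-in target
freqs, bias = predicted_contrast_bias(context, target, fs)
```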


Talkers versus talker acoustics in speech perception

Hearing different talkers makes speech perception slower and less accurate than hearing a single talker. This is known as "talker normalization" and has been observed repeatedly over the last few decades of speech research. Some have gone so far as to claim that it is an obligatory part of speech perception. We don't think so. We are exploring these effects and whether they can be better explained by signal acoustics than by the old adage that "different talkers are harder to understand than one talker".
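
To make the acoustic alternative concrete, here is a hypothetical sketch: instead of coding consecutive trials categorically as "same talker" versus "different talker", compute a continuous acoustic distance between each stimulus and the one before it, and ask whether that distance predicts response slowdowns. The specific distance used below (Euclidean distance between long-term average spectra) is an illustrative assumption.

```python
# Hypothetical sketch of an acoustic-distance account of "talker" effects:
# score each trial by its acoustic distance from the preceding trial rather
# than by a binary same/different-talker label. Distance metric is assumed.
import numpy as np
from scipy.signal import welch

def ltas(signal, fs, nperseg=512):
    """Long-term average spectrum in dB."""
    _, psd = welch(signal, fs=fs, nperseg=nperseg)
    return 10 * np.log10(psd + 1e-12)

def trial_to_trial_distances(stimuli, fs):
    """Acoustic distance between each stimulus and the one before it;
    under this account, these values (not talker identity per se)
    would predict slowdowns."""
    spectra = [ltas(s, fs) for s in stimuli]
    return [float(np.linalg.norm(spectra[i] - spectra[i - 1]))
            for i in range(1, len(spectra))]
```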


Natural signal statistics in speech and speech perception

The speech signal is incredibly complex acoustically, but it is far from random. Speech acoustics exhibit many different types of predictability and structure (or nonrandomness). This leads to two questions: (1) What regularities and structure exist in speech acoustics? (2) More importantly, which regularities matter for speech perception? We could conduct acoustic analyses of the speech signal until the end of time, but these analyses are only worthwhile if they teach us about speech perception. So far, we are learning which signal statistics contribute to spectral context effects in speech categorization, but the reach of these regularities in the speech signal is very wide and will likely make contact with many different areas of speech perception.
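
As one example of the kinds of regularities one might measure, here is a minimal sketch that computes correlations between energy trajectories in different frequency bands over time. The plain STFT, uniform band edges, and band count are simplifying assumptions for illustration only.

```python
# Illustrative sketch: one measurable regularity in speech is that energy in
# different frequency bands is correlated over time. Band edges and STFT
# settings below are assumptions, not a published analysis.
import numpy as np
from scipy.signal import stft

def band_energy_correlations(signal, fs, n_bands=8, nperseg=512):
    """Correlation matrix of log-energy trajectories across frequency bands."""
    freqs, _, Z = stft(signal, fs=fs, nperseg=nperseg)
    power = np.abs(Z) ** 2
    # Pool STFT bins into n_bands equal-width bands (uniform for simplicity)
    edges = np.linspace(0, len(freqs), n_bands + 1, dtype=int)
    trajectories = np.stack([power[edges[b]:edges[b + 1]].sum(axis=0)
                             for b in range(n_bands)])
    log_traj = np.log(trajectories + 1e-12)
    return np.corrcoef(log_traj)   # (n_bands x n_bands) correlation matrix
```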


Adjusting to stable spectral properties in speech perception

Listening environments can alter the acoustic makeup of sounds, resulting in a series of sounds sharing a stable spectral property (e.g., a peak or the overall shape of the spectrum). When all sounds share a given spectral property, perception factors that property out and relies more heavily on other spectral properties for speech perception. This process is known as spectral calibration, and it is complementary to the spectral contrast effects described above. We continue to explore the mechanisms responsible for spectral calibration in order to better understand how it affects speech perception in everyday listening environments.
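
A minimal sketch of the idea, under the simplifying assumption that the stable spectral property can be estimated as the mean log-magnitude spectrum across the set of sounds: estimate the shared component and factor it out, leaving the sound-specific spectral detail that categorization would then rely on.

```python
# Hypothetical sketch of spectral calibration: estimate the spectral
# property shared by all sounds (here, their mean log spectrum) and remove
# it, so perception can rely on the properties that actually vary.
import numpy as np

def calibrate_spectra(log_spectra):
    """log_spectra: (n_sounds, n_freq_bins) array in dB.
    Returns spectra with the shared (mean) component factored out."""
    shared = log_spectra.mean(axis=0)      # stable property across all sounds
    return log_spectra - shared            # residual, sound-specific detail

# Toy usage: impose a common spectral peak on every sound, then remove it.
rng = np.random.default_rng(1)
spectra = rng.standard_normal((20, 256))                       # 20 sounds
peak = 12.0 * np.exp(-0.5 * ((np.arange(256) - 80) / 10.0) ** 2)
spectra += peak                            # every sound shares this peak
calibrated = calibrate_spectra(spectra)    # shared peak is factored out
```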


Informative spectral changes underlie sentence intelligibility

We continue to develop our metric of informative spectral changes in speech, Cochlea-scaled Spectral Entropy (CSE). Rapid changes in the speech envelope are highly informative, whereas more constant, unchanging regions of the envelope are far less informative. CSE captures these perceptually significant changes in the speech signal: replacing these acoustic changes with noise seriously impairs sentence intelligibility. Further questions include the frequency ranges (bandwidths) and timescales over which these changes are best calculated, and how well CSE corresponds to changes in neural firing rates (Neural Spectral Entropy).
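
In the spirit of CSE (though not the published algorithm), here is a simplified sketch: slice the signal into short frames, compute a coarse spectrum per frame, and take the Euclidean distance between consecutive spectra. Larger distances mark regions of rapid, and by hypothesis more informative, spectral change. The frame length, band count, and uniform band spacing are assumptions; the published metric uses cochlea-scaled (ERB-spaced) bands.

```python
# Simplified sketch in the spirit of CSE: Euclidean distance between
# consecutive short-time spectra as a profile of spectral change over time.
# Uniform bands are used here for brevity; CSE uses cochlea-scaled bands.
import numpy as np
from scipy.signal import stft

def spectral_change_profile(signal, fs, frame_ms=16, n_bands=33):
    nperseg = int(fs * frame_ms / 1000)
    freqs, times, Z = stft(signal, fs=fs, nperseg=nperseg, noverlap=0)
    power = np.abs(Z) ** 2
    # Pool FFT bins into coarse bands (uniform here, ERB-spaced in CSE)
    edges = np.linspace(0, len(freqs), n_bands + 1, dtype=int)
    bands = np.stack([power[edges[b]:edges[b + 1]].sum(axis=0)
                      for b in range(n_bands)])
    log_bands = 10 * np.log10(bands + 1e-12)
    # Distance between consecutive spectral slices: the change profile
    change = np.linalg.norm(np.diff(log_bands, axis=1), axis=0)
    return times[1:], change
```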


Speech perception for hearing-impaired listeners

We extend the above questions to listeners with impaired or atypical hearing to understand how they utilize spectral changes in the speech signal. One way we accomplish this is through acoustic simulations of hearing loss or cochlear implant processing, which allow us to efficiently test a number of parameters that might take on greater importance when perceiving speech with impaired or atypical hearing. These simulations then inform experiments that test hearing-impaired listeners directly. Understanding how speech perception differs across healthy and impaired hearing can inspire new signal processing techniques for digital hearing aids and cochlear implants that improve speech perception for these listeners.
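
One standard way to acoustically simulate cochlear implant processing is noise vocoding: band-pass filter the speech into a small number of channels, extract each channel's amplitude envelope, and use it to modulate band-limited noise. The sketch below assumes this technique; the channel count and filter settings are illustrative, and the default band edges assume a sampling rate of at least 16 kHz.

```python
# Minimal noise-vocoder sketch (a common cochlear implant simulation):
# per channel, band-pass filter, extract the envelope, and modulate
# band-limited noise. Channel count and filter settings are assumptions.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(signal, fs, n_channels=8, f_lo=100.0, f_hi=7000.0):
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)  # log-spaced channels
    rng = np.random.default_rng(0)
    out = np.zeros(len(signal))
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype='bandpass', fs=fs, output='sos')
        band = sosfiltfilt(sos, signal)
        envelope = np.abs(hilbert(band))              # amplitude envelope
        carrier = sosfiltfilt(sos, rng.standard_normal(len(signal)))
        out += envelope * carrier                     # envelope-modulated noise
    return out
```

Varying parameters such as the number of channels in simulations like this is one way to probe which spectral details matter most when fine spectral structure is degraded.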


This research is influenced by a wide range of fields: auditory neuroscience and physiology, neural network modeling and computational perception, high-level perception in other modalities (particularly vision), and information theory and the efficient coding hypothesis. See the Publications and Presentations pages for more information.