Listeners' sensitivity to fine-grained information is maintained over several seconds (Falandays et al., 2020).
In a series of experiments, we have investigated this issue by examining how long listeners maintain sensitivity to information carried by low-level acoustic differences in speech. These experiments use the visual-world eye-tracking paradigm to study how listeners process spoken sentences as they unfold over time: participants listen to sentences describing a visual scene while their eye movements to objects in the display are recorded.
We presented listeners with sounds varying along a continuum between "she" and "he" that referred to either a female or a male character in the display. We found that listeners maintained sensitivity to fine-grained differences between the "sh" and "h" sounds in the pronoun over long time periods, spanning multiple words and several seconds. This is much longer than existing models of speech perception would predict, and it suggests that listeners may be passing this fine-grained acoustic information up to higher-level linguistic representations.