Talker variability in audio-visual speech perception.

Shannon L M Heald, Howard C Nusbaum
Author Information
  1. Shannon L M Heald: Department of Psychology, The University of Chicago, Chicago, IL, USA.
  2. Howard C Nusbaum: Department of Psychology, The University of Chicago, Chicago, IL, USA.

Abstract

A change in talker is a change in the context for the phonetic interpretation of acoustic patterns of speech. Different talkers have different mappings between acoustic patterns and phonetic categories, and listeners need to adapt to these differences. Despite this complexity, listeners are adept at comprehending speech in multiple-talker contexts, albeit at a slight but measurable performance cost (e.g., slower recognition). So far, this talker variability cost has been demonstrated only in audio-only speech. Other research in single-talker contexts has shown, however, that when listeners are able to see a talker's face, speech recognition improves under adverse listening conditions (e.g., noise or distortion) that can increase uncertainty in the mapping between acoustic patterns and phonetic categories. Does seeing a talker's face reduce the cost of word recognition in multiple-talker contexts? We used a speeded word-monitoring task in which listeners made quick judgments about target word recognition in single- and multiple-talker contexts. Results show faster recognition in single-talker conditions than in multiple-talker conditions for both audio-only and audio-visual speech. However, recognition in a multiple-talker context was slower in the audio-visual condition than in the audio-only condition. These results suggest that seeing a talker's face during speech perception may slow recognition by increasing the importance of talker identification, signaling to the listener that a change in talker has occurred.

