Audiovisual perceptual learning with multiple speakers.

Aaron D Mitchel, Chip Gerfen, Daniel J Weiss
Author Information
  1. Aaron D Mitchel: Department of Psychology and Program in Neuroscience, Bucknell University, Lewisburg, PA 17837, USA.
  2. Chip Gerfen: Department of World Languages & Cultures, American University, Washington, DC, USA.
  3. Daniel J Weiss: Department of Psychology and Program in Linguistics, The Pennsylvania State University, University Park, PA, USA.

Abstract

One challenge for speech perception is between-speaker variability in the acoustic parameters of speech. For example, the same phoneme (e.g. the vowel in "cat") may have substantially different acoustic properties when produced by two different speakers, and yet the listener must be able to interpret these disparate stimuli as equivalent. Perceptual tuning, the use of contextual information to adjust phonemic representations, may be one mechanism that helps listeners overcome obstacles they face due to this variability during speech perception. Here we test whether visual contextual cues to speaker identity may facilitate the formation and maintenance of distributional representations for individual speakers, allowing listeners to adjust phoneme boundaries in a speaker-specific manner. We familiarized participants to an audiovisual continuum between /aba/ and /ada/. During familiarization, the "b-face" mouthed /aba/ when an ambiguous token was played, while the "d-face" mouthed /ada/. At test, the same ambiguous token was more likely to be identified as /aba/ when paired with a still image of the "b-face" than with an image of the "d-face." This was not the case in the control condition, in which the two faces were paired equally with the ambiguous token. Together, these results suggest that listeners may form speaker-specific phonemic representations using facial identity cues.
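The speaker-specific tuning described above can be illustrated with a minimal distributional-learning sketch. This is not the authors' model; the one-dimensional continuum, the Gaussian category assumption, the numeric values, and the update rate are all illustrative. Each speaker gets their own /b/ and /d/ category distributions, familiarization shifts the visually cued category toward the heard token, and the same ambiguous token is then classified differently depending on the face it is paired with.

```python
# Illustrative sketch of speaker-specific perceptual tuning as
# distributional learning along a 1-D /aba/-/ada/ continuum.
# All parameters are hypothetical, not taken from the study.
from statistics import NormalDist

class SpeakerModel:
    """Tracks per-speaker category distributions for /b/ and /d/."""
    def __init__(self):
        # Start from generic, speaker-neutral category distributions.
        self.categories = {"b": NormalDist(mu=3.0, sigma=1.0),
                           "d": NormalDist(mu=7.0, sigma=1.0)}

    def familiarize(self, token, label, rate=0.5):
        # Shift the visually labelled category's mean toward the heard
        # token, mimicking recalibration from the mouthed syllable.
        old = self.categories[label]
        new_mean = old.mean + rate * (token - old.mean)
        self.categories[label] = NormalDist(mu=new_mean, sigma=old.stdev)

    def classify(self, token):
        # Choose the category with the higher likelihood for this speaker.
        return max(self.categories,
                   key=lambda c: self.categories[c].pdf(token))

ambiguous = 5.0  # midpoint token on the continuum

b_face, d_face = SpeakerModel(), SpeakerModel()
b_face.familiarize(ambiguous, "b")  # "b-face" mouthed /aba/
d_face.familiarize(ambiguous, "d")  # "d-face" mouthed /ada/

print(b_face.classify(ambiguous))  # -> b  (token pulled toward /b/)
print(d_face.classify(ambiguous))  # -> d  (same token, other speaker)
```

Because each face maintains its own distributions, the identical acoustic token lands on opposite sides of the two speakers' boundaries, paralleling the test-phase result reported in the abstract.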

Keywords

Multisensory processes; Speech; Talker normalization

Grants

  1. R01 HD067250/NICHD NIH HHS
