Visual speech segmentation: using facial cues to locate word boundaries in continuous speech.

Aaron D Mitchel, Daniel J Weiss
Author Information
  1. Aaron D Mitchel: Department of Psychology, Bucknell University, Lewisburg, PA 17837, USA.
  2. Daniel J Weiss: Department of Psychology and Program in Linguistics, The Pennsylvania State University, 643 Moore Building, University Park, PA 16802, USA.

Abstract

Speech is typically a multimodal phenomenon, yet few studies have focused on the exclusive contributions of visual cues to language acquisition. To address this gap, we investigated whether visual prosodic information can facilitate speech segmentation. Previous research has demonstrated that language learners can use lexical stress and pitch cues to segment speech and that learners can extract this information from talking faces. We therefore created an artificial speech stream that contained minimal segmentation cues and paired it with two synchronous facial displays in which visual prosody was either informative or uninformative for identifying word boundaries. Across three familiarisation conditions (audio stream alone, facial streams alone, and paired audiovisual), learning occurred only when the facial displays were informative about word boundaries, suggesting that facial cues can help learners solve the early challenges of language acquisition.

Keywords

multisensory integration

Grants

  1. R01 HD067250/NICHD NIH HHS
  2. R03 HD048996/NICHD NIH HHS

