Speaker discrimination as a function of vowel realization: does focus affect perception?

Willemijn Heeren; Cesko Voeten; Tessi Marks

doi:10.51751/dujal9420

Author(s)

Willemijn Heeren, Leiden University
Cesko Voeten, Fryske Akademy; Utrecht University, https://orcid.org/0000-0003-4687-9973
Tessi Marks, Leiden University

Keywords:

speaker discrimination, focus, vowel quality, speech perception, Dutch

Abstract

The acoustic-phonetic characteristics of speech sounds are influenced by their linguistic position in the syllable or sentence. Because of acoustic-phonetic differences between different speech sounds, sounds vary in the amount of speaker information they contain. However, do spectral and durational differences between realizations of the same sound that were sampled from different linguistic positions also impact speaker information? We investigated speaker discrimination in [−focus] versus [+focus] word realizations. Twenty-one Dutch listeners participated in a same-different task, using stimuli varying in focus, vowel ([aː], [u]), and word context ([ɦ_k], [v_t]), spoken by 11 different speakers. Results showed that an effect of focus on speaker-dependent information was present, but limited to words containing [u]. Moreover, performance on [u] words was influenced by (interactions of) word context and trial type (same- vs. different-speaker). Context-dependent changes in a speech sound’s acoustics may affect its speaker-dependent information, albeit under specific conditions only.
