Function words, especially frequently occurring ones such as the, that, and, and of, vary widely in pronunciation. Understanding this variation is essential both for cognitive modeling of lexical production and for computer speech recognition and synthesis. This study investigates which factors affect the forms of function words, especially whether they have a fuller pronunciation (e.g., ði, ðæt, ænd, ʌv) or a more reduced or lenited pronunciation (e.g., ðə, ðɨt, n, ə). It is based on over 8000 occurrences of the ten most frequent English function words in a 4-hour sample of conversations from the Switchboard corpus. Ordinary linear and logistic regression models were used to examine variation in the length of the words, in the form of their vowel (basic, full, or reduced), and in whether final obstruents were present or not. For all these measures, after controlling for segmental context, rate of speech, and other important factors, there are strong independent effects that make high-frequency monosyllabic function words more likely to be longer or to have a fuller form (1) when neighboring disfluencies (such as the filled pauses uh and um) indicate that the speaker was encountering problems in planning the utterance; (2) when the word is unexpected, i.e., less predictable in context; and (3) when the word is either utterance initial or utterance final. Looking at the phenomenon in a different way, frequent function words are more likely to be shorter and to have less-full forms in fluent speech, in predictable positions or multiword collocations, and utterance internally. Also considered are other factors, such as sex (women are more likely to use fuller forms, even after controlling for rate of speech), and some of the differences among the ten function words in their response to these factors.
