A corpus-based investigation of pausing and the production of formulaic language in EAP speech fluency

Wang, Lifang (2019) A corpus-based investigation of pausing and the production of formulaic language in EAP speech fluency. PhD thesis, University of Nottingham.

Speech fluency and knowledge of formulaic language are interrelated. This thesis describes the relationship between the two in terms of the Holistic Hypothesis: formulaic sequences are believed to be produced as holistic units, without internal pauses, in naturally occurring discourse. However, this hypothesis is mainly based on impressionistic evidence from adult learner speech. Very few empirical studies have examined and compared the production of formulaic language and the presence of internal pauses in large-scale, authentic learner and native speaker data. To fill this gap, this thesis carries out an in-depth, mixed-methods, contrastive investigation of the Holistic Hypothesis in two corpora of academic spoken English, one contributed by Chinese EAP learners and the other by native speakers of American English.

The thesis addresses three main questions. First, it asks whether learners and native speakers place pauses in formulaic language, and if they do, in which formulaic sequences they are more likely to pause. Formulaic sequences were identified using a psycholinguistically informed, corpus-based approach and categorized by grammatical structure, for example as preposition-based or noun phrase formulaic sequences. Results show that approximately one third of the two-word and three-word formulaic sequences were produced without pauses by both groups of speakers, and nearly one third were used holistically by native speakers but not by learners. In the remaining third, most of the formulaic sequences had internal pauses in both corpora, but a small number were produced as wholes in learner speech and with pauses in native speech. The four-word formulaic sequences, although few in number in both corpora, were mostly produced without pauses. Moreover, internal pauses were found in all types of formulaic sequences, but they seem to have occurred more frequently in preposition-based and conjunction-based clusters. Instances of the subject-verb category appear more likely to be produced without pauses.

Second, the thesis asks what the patterns and causes of internal pausing are in learner and native speaker formulaic sequences. Internal pauses were examined according to their behavior, for example whether they occurred alone within a formulaic sequence or were accompanied by other pausing phenomena immediately preceding or following it. The formulaic sequences investigated comprised prepositional clusters, subject-verb clusters, verb phrase clusters, noun phrase clusters, and clusters made up of conjunctions co-occurring with noun phrase elements, adverbs, or pronouns. Five patterns of internal pausing were observed, and their causes were found to vary. Apart from serving as boundary markers between two neighboring grammatical constituents in their wider co-texts, internal pauses in both corpora may also occur because of speakers’ difficulty in lexical retrieval, overt monitoring, or individually preferred pausing schemes, or because pauses are used to perform discoursal functions. More internal pauses in learner speech appear to have been caused by difficulties in online speech formulation, whereas in native speech they seem to be more closely related to topic shifting and individual pausing preferences.

Third, the thesis examines high-frequency formulaic sequences and their pausing patterns in both corpora. Building on the second question, four formulaic sequences were selected from each grammatical category under investigation, that is, prepositional clusters, subject-verb clusters, VERB TO clusters, noun phrase clusters, and conjunction-pronoun clusters. These clusters were first investigated according to whether they were produced with or without pauses. Holistic production was then examined with regard to whether the clusters had pauses on both sides or on only one side, whether they involved repetitions, and whether they were embedded in longer fluent speech runs. Some of the clusters may also have been contracted into one word. As well as confirming the findings from the first and second questions, the results show that formulaic sequences were more likely to be produced without pauses in native speech than in learner speech. Most of the two-word clusters investigated were embedded in three- or four-word ones, which native speakers generally used as holistic units but which had internal pauses at different locations in learner speech, so learners’ fluent speech runs appear to be shorter than native speakers’. Additionally, native speakers seem to have contracted more high-frequency two-word clusters into one word than learners.

Overall, the thesis provides empirical evidence that the Holistic Hypothesis is more applicable to native speech than to learner speech. The findings can offer useful insights into the process of learning formulaic language and can inform the design of a speech fluency syllabus, for example by including pausing as an explicit teaching point or by teaching formulaic sequences instead of individual words.

Item Type: Thesis (University of Nottingham only) (PhD)
Supervisors: Dowens, Margaret
McKenny, John
Carter, Ronald
Stockwell, Peter
Keywords: Formulaic language, pausing, learner corpus research, academic spoken English, speech fluency
Subjects: P Language and literature > P Philology. Linguistics
Faculties/Schools: UNNC Ningbo, China Campus > Faculty of Humanities and Social Sciences > School of English
Item ID: 56971
Depositing User: WANG, Lifang
Date Deposited: 17 Jun 2019 03:15
Last Modified: 07 May 2020 13:03
URI: https://eprints.nottingham.ac.uk/id/eprint/56971