Variation across spoken genres: comparing the Spoken British National Corpus 2014 and the London-Lund Corpus 2

  • Love, R. (Speaker)
  • Nele Põldvere (Speaker)

Activity: Talk or presentation types: Oral presentation

Description

We present a case study which brings together two contemporary corpora of spoken British English: the Spoken British National Corpus 2014 (Spoken BNC2014; Love et al., 2017) and the London-Lund Corpus 2 (LLC-2; Põldvere et al., 2021). Both corpora comprise transcribed spoken discourse produced by L1 speakers of British English. The Spoken BNC2014 transcripts (totalling 11.5 million words) are derived from recordings gathered from 2012 to 2016, and the corpus exclusively samples the broad genre of casual conversation. The LLC-2 comprises 500,000 words from a variety of discourse contexts, and its recordings were gathered from 2014 to 2019. As synchronically overlapping samples of a national variety, the corpora are potentially complementary: while the Spoken BNC2014 has the advantage of size (relative to the LLC-2), it is assumed to be relatively homogeneous in terms of genre; on the other hand, the LLC-2 is considerably smaller but captures a much greater diversity of speech contexts.

Our case study evaluates the potential for using the corpora to supplement each other to gain a more comprehensive picture of contemporary spoken British English lexis than either corpus allows in isolation. Using multidimensional analysis (Biber, 1988; Nini, 2019), we plot the Spoken BNC2014 and LLC-2 texts against Dimension 1 (Involved vs. Informational Discourse) in order to evaluate the genre coverage of the corpora according to formality. We then identify a sub-sample of the most prototypically 'conversational' texts (those with high Dimension 1 scores) and compare how the features of these texts vary between the corpora, considering the potential influence of speaker gender. We examine two potential sources of variation, genre and speaker gender, in order to (a) explore the extent to which social variation is influenced by register, and vice versa, and (b) evaluate methodologically the complementarity of the Spoken BNC2014 and LLC-2.
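As an illustration only, the sub-sampling step described above could be sketched as follows. The sketch assumes that each text already has a Dimension 1 score (for example, as produced by a multidimensional analysis tagger); the records, the scores, the cut-off value and the function names are all hypothetical and do not represent results or procedures from the study.

```python
from statistics import mean

# Hypothetical records: (corpus, text_id, dimension_1_score). The scores are
# invented for illustration and are not results from either corpus.
texts = [
    ("BNC2014", "b1", 38.2),
    ("BNC2014", "b2", 41.5),
    ("BNC2014", "b3", 29.7),
    ("LLC-2", "l1", 35.0),
    ("LLC-2", "l2", 12.4),
    ("LLC-2", "l3", -3.1),
]


def select_conversational(records, threshold):
    """Keep texts whose Dimension 1 score is at or above the threshold."""
    return [r for r in records if r[2] >= threshold]


def mean_score_by_corpus(records):
    """Return the mean Dimension 1 score for each corpus in the records."""
    by_corpus = {}
    for corpus, _text_id, score in records:
        by_corpus.setdefault(corpus, []).append(score)
    return {corpus: mean(scores) for corpus, scores in by_corpus.items()}


THRESHOLD = 30.0  # assumed cut-off, chosen only for this example
subsample = select_conversational(texts, THRESHOLD)
summary = mean_score_by_corpus(subsample)
```

In practice the cut-off might be defined relative to the distribution of scores (for example, a top percentile) rather than as a fixed value, and the comparison would extend to individual linguistic features and speaker metadata rather than a single mean.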

References
Biber, D. (1988). Variation across Speech and Writing. Cambridge University Press.
Love, R., Dembry, C., Hardie, A., Brezina, V., & McEnery, T. (2017). The Spoken BNC2014: designing and building a spoken corpus of everyday conversations. International Journal of Corpus Linguistics, 22(3), 319-344. https://doi.org/10.1075/ijcl.22.3.02lov
Nini, A. (2019). The Multi-Dimensional Analysis Tagger. In T. Berber Sardinha & M. Veirano Pinto (Eds.), Multi-Dimensional Analysis: Research Methods and Current Issues (pp. 67-94). Bloomsbury Academic.
Põldvere, N., Johansson, V., & Paradis, C. (2021). On the London–Lund Corpus 2: Design, challenges and innovations. English Language and Linguistics, 25(3), 459-483. https://doi.org/10.1017/S1360674321000186
Period: 16 Jul 2024
Event title: 11th Inter-Varietal Applied Corpus Studies (IVACS) Biennial Conference
Event type: Conference
Location: Cambridge, United Kingdom
Degree of Recognition: International