Corpus Linguistics in Authorship Identification

Krzysztof J Kredens, Richard M Coulthard

Research output: Chapter in Book/Published conference output › Chapter (peer-reviewed)


Corpus linguistics is ‘an empirical approach to studying language, which uses observations of attested data in order to make generalisations about lexis, grammar, and semantics’, and in the context of forensic linguistics it offers much more than explanatory possibilities. It provides methods for processing naturally occurring language data with a view to describing the nature of particular instances of language use and the behavior of particular (groups of) language users. Language corpora can thus be used for a variety of forensic linguistic tasks. This article explores how corpora and corpus methodology can aid the forensic linguist in identifying authors; in analyzing texts comparatively in order to comment on the authorship of questioned documents; in interpreting the meaning of disputed utterances; and in investigating and describing language use in legal and forensic settings. After discussing authorship attribution, it looks at disputed meanings, corpora in language and law research, corpora for forensic applications, and the Internet as a corpus.
Original language: English
Title of host publication: Oxford Handbook of Language and Law
Editors: Lawrence M Solan, Peter M Tiersma
Place of Publication: Oxford
Publisher: Oxford University Press
ISBN (Print): 9780199572120
Publication status: Published - Mar 2012


  • forensic linguistics
  • forensic authorship analysis

