Ron Carter, Author at About Words - Cambridge Dictionary blog

Part 1 of 2

In the first part of a two-part blog entry, Prof. Ronald Carter of the University of Nottingham provides a brief introduction to corpora and corpus linguistics, exploring ways in which corpora are currently being used to inform language teaching and the development of teaching materials.

What is a corpus?

corpus noun (plural corpuses or corpora) the collection of a single writer’s work or of writing about a particular subject, or a large amount of written and sometimes spoken material collected to show the state of a language

Cambridge Advanced Learner’s Dictionary Third Edition (2008) Cambridge: Cambridge University Press

Many corpora these days run to millions of words. The British National Corpus (BNC), for example, consists of 100 million words of English. Its written part (90%) includes newspapers, magazines, journals, books, letters, memos, essays, etc. Its spoken part (10%) includes conversations, recorded in a way that achieves a demographic balance, as well as a range of spoken language from business or government meetings, radio shows, phone-ins, etc. These large collections of text are stored and read electronically, allowing researchers to employ a variety of software to reveal different patterns of language that exist within the corpus.
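To give a rough sense of what such software does, here is a minimal Python sketch of two common corpus operations: counting word frequencies and listing a keyword in context (a concordance). It is only an illustration under stated assumptions. The three-sentence sample text, the simple regular-expression tokenizer and the function names are all invented for this example. They are not taken from the BNC or from any particular corpus tool.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(tokens):
    """Count how often each token occurs."""
    return Counter(tokens)


def concordance(tokens, keyword, width=3):
    """List each occurrence of keyword with up to `width` tokens of context on either side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


# Hypothetical miniature "corpus", invented purely for illustration.
sample = (
    "The meeting ran late. After the meeting, the chair thanked everyone. "
    "A radio phone-in followed the meeting."
)

tokens = tokenize(sample)
freqs = word_frequencies(tokens)

print(freqs.most_common(3))
for line in concordance(tokens, "meeting"):
    print(line)
```

On a real corpus of millions of words, the same two ideas (frequency lists and concordance lines) would be applied at scale, usually with more careful tokenization than the one assumed here.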



Author: Ron Carter

A few words on corpus linguistics part 2

A few words on corpus linguistics