
Corpus Linguistics and Linguistic Theory

Founded by Gries, Stefan Th. / Stefanowitsch, Anatol

Ed. by Wulff, Stefanie

2 Issues per year


IMPACT FACTOR 2017: 1.200
5-year IMPACT FACTOR: 1.386

CiteScore 2017: 0.80

SCImago Journal Rank (SJR) 2017: 0.288
Source Normalized Impact per Paper (SNIP) 2017: 0.930

Online ISSN: 1613-7035

Language is never, ever, ever, random

Adam Kilgarriff
Published Online: 2005-11-04 | DOI: https://doi.org/10.1515/cllt.2005.1.2.263

Abstract

Language users never choose words randomly, and language is essentially non-random. Statistical hypothesis testing uses a null hypothesis, which posits randomness. Hence, when we look at linguistic phenomena in corpora, the null hypothesis will never be true. Moreover, where there is enough data, we shall (almost) always be able to establish that it is not true. In corpus studies, we frequently do have enough data, so the fact that a relation between two phenomena is demonstrably non-random does not support the inference that it is not arbitrary. We present experimental evidence of how arbitrary associations between word frequencies and corpora are systematically non-random. We review literature in which hypothesis testing has been used, and show how it has often led to unhelpful or misleading results.
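
The claim that enough data will (almost) always reject the null hypothesis can be illustrated with a small sketch. It is not the experiment reported in the article: the two corpora, the word rates (100 versus 105 occurrences per million words) and the choice of a Pearson chi-square test on a 2x2 table are all assumptions made for illustration. The relative difference between the corpora is held fixed while the corpus size grows.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction)
    for the 2x2 contingency table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_df1(chi2):
    """Upper-tail p-value of a chi-square distribution with 1 degree
    of freedom, using the identity P(Z**2 > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def compare(corpus_size, per_million_a=100, per_million_b=105):
    """Compare one word's frequency in two hypothetical corpora of equal
    size. The per-million rates are illustrative assumptions."""
    hits_a = corpus_size * per_million_a // 1_000_000
    hits_b = corpus_size * per_million_b // 1_000_000
    chi2 = chi_square_2x2(
        hits_a, corpus_size - hits_a,
        hits_b, corpus_size - hits_b,
    )
    return chi2, p_value_df1(chi2)


if __name__ == "__main__":
    # Same 5% relative difference at every size; only the amount of data changes.
    for size in (10**6, 10**7, 10**8, 10**9):
        chi2, p = compare(size)
        print(f"corpus size {size:>13,}: chi2 = {chi2:8.2f}, p = {p:.3g}")
```

Under these assumptions the statistic grows roughly in proportion to corpus size, so a difference that is far from significant at one million words per corpus is highly significant at one billion, even though the difference itself has not changed.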

About the article

Published in Print: 2005-11-18


Citation Information: Corpus Linguistics and Linguistic Theory, Volume 1, Issue 2, Pages 263–276, ISSN (Online) 1613-7035, ISSN (Print) 1613-7027, DOI: https://doi.org/10.1515/cllt.2005.1.2.263.


