
Corpus Linguistics and Linguistic Theory

Founded by Gries, Stefan Th. / Stefanowitsch, Anatol

Ed. by Wulff, Stefanie


IMPACT FACTOR 2017: 1.200
5-year IMPACT FACTOR: 1.386

CiteScore 2017: 0.80

SCImago Journal Rank (SJR) 2017: 0.288
Source Normalized Impact per Paper (SNIP) 2017: 0.930

Online ISSN: 1613-7035

Frequency data from corpora partially explain native-speaker ratings and choices in overabundant paradigm cells

Neil Bermel
  • Corresponding author
  • School of Languages and Cultures, University of Sheffield, Sheffield, South Yorkshire, United Kingdom of Great Britain and Northern Ireland
Luděk Knittl
  • School of Languages and Cultures, University of Sheffield, Sheffield, South Yorkshire, United Kingdom of Great Britain and Northern Ireland
Jean Russell
  • Corporate Information and Computing Services, University of Sheffield, Sheffield, South Yorkshire, United Kingdom of Great Britain and Northern Ireland
Published Online: 2018-08-31 | DOI: https://doi.org/10.1515/cllt-2016-0032

Abstract

If we can operationalize corpus frequency in multiple ways, using absolute values and proportional values, which of them is more closely connected with the behaviour of language users? In this contribution, we examine overabundant cells in morphological paradigms, and look at the contribution that frequency of occurrence can make to understanding the choices speakers make due to this richness. We look at ways of operationalizing the term frequency in data from corpora and native speakers: the proportional frequency of forms (i.e. the percentage of time that a variant is found in corpus data, considered as a proportion of all variants) and several interpretations of absolute frequency (i.e. the raw frequency of variants in data from the same corpus). Working with data from unmotivated morphological variation in Czech case forms, we show that different instantiations of frequency help interpret the way variation is perceived and maintained by native speakers. Proportional frequency seems most salient for speakers in forming their judgements, while certain types of absolute frequency seem to have a dominant role in production tasks.

Keywords: corpus linguistics; frequency; morphology; empirical research; surveys; questionnaires; Czech; overabundance
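
As a rough illustration of the distinction drawn in the abstract, the following Python sketch computes both measures for a single overabundant cell. The function name, the variant labels and all counts are invented for illustration; they are not taken from the article or from any corpus.

```python
def frequency_measures(variant_counts):
    """Return absolute and proportional frequency for each variant of one cell.

    variant_counts maps a variant label to its raw number of corpus hits.
    """
    total = sum(variant_counts.values())
    if total == 0:
        raise ValueError("no attestations for this cell")
    return {
        variant: {"absolute": count, "proportional": count / total}
        for variant, count in variant_counts.items()
    }


# Hypothetical counts for two nouns whose cell has two competing endings.
rare_noun = frequency_measures({"-u": 30, "-e": 10})
common_noun = frequency_measures({"-u": 3000, "-e": 1000})

for label, cell in (("rare noun", rare_noun), ("common noun", common_noun)):
    for variant, m in cell.items():
        print(f"{label} {variant}: absolute={m['absolute']}, "
              f"proportional={m['proportional']:.2f}")
```

In this sketch the two nouns have identical proportional frequencies for each ending while their absolute frequencies differ a hundredfold, which is the kind of divergence that makes the choice of operationalization matter.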

References

  • Baayen, R. Harald, Anna Endresen, Laura A. Janda, Anastasia Makarova & Tore Nesset. 2013. Making choices in Russian: Pros and cons of statistical methods for rival forms. Russian Linguistics 37(3). 253–291.

  • Baayen, R. Harald, Petar Milin & Michael Ramscar. 2016. Frequency in lexical processing. Aphasiology 30. 1–47.

  • Bermel, Neil. 1993. Sémantické rozdíly v tvarech českého lokálu [Semantic differences in the forms of the Czech locative]. Naše řeč 76. 192–198.

  • Bermel, Neil. 2004. V korpuse nebo v korpusu? Co nám řekne (a neřekne) ČNK o morfologické variaci v tvarech lokálu [V korpuse or v korpusu? What the Czech National Corpus will (and will not) tell us about morphological variation in locative case forms]. In Zdeňka Hladková & Petr Karlík (eds.), Čeština – univerzália a specifika 5, 163–171. Prague: Nakladatelství Lidové Noviny.

  • Bermel, Neil. 2010. Variace a frekvence variant na příkladu tvrdých neživotných maskulin [Variation and the frequency of variants in hard masculine inanimate nouns]. In Světla Čmejrková, Jana Hoffmannová & Eva Havlová (eds.), Užívání a prožívání jazyka, 135–140. Prague: Karolinum.

  • Bermel, Neil & Luděk Knittl. 2012a. Corpus frequency and acceptability judgments: A study of morphosyntactic variants in Czech. Corpus Linguistics and Linguistic Theory 8. 241–275.

  • Bermel, Neil & Luděk Knittl. 2012b. Morphosyntactic variation and syntactic environments in Czech nominal declension: Corpus frequency and native-speaker judgments. Russian Linguistics 36. 91–119.

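  • Bermel, Neil, Luděk Knittl & Jean Russell. 2014. Absolutní a proporcionální frekvence v Českém národním korpusu ve světle výzkumu morfosyntaktické variace v češtině [Absolute and proportional frequency in the Czech National Corpus in light of research into morphosyntactic variation in Czech]. Naše řeč 97. 216–227.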

  • Bermel, Neil, Luděk Knittl & Jean Russell. 2015. Morphological variation and sensitivity to frequency of forms among native speakers of Czech. Russian Linguistics 39. 283–308.

  • Brown, Dunstan. 2007. Peripheral functions and overdifferentiation: The Russian second locative. Russian Linguistics 31. 61–76.

  • Bybee, Joan. 2002. Word frequency and context of use in the lexical diffusion of phonetically conditioned sound change. Language Variation and Change 14. 261–290.

  • Bybee, Joan. 2006. From usage to grammar: The mind’s response to repetition. Language 82. 711–733.

  • Bybee, Joan. 2007. Frequency of use and the organization of language. Oxford: Oxford University Press.

  • Bybee, Joan & David Eddington. 2006. A usage-based approach to Spanish verbs of ‘becoming’. Language 82. 323–355.

  • Čech, Radek. 2012. Několik teoreticko-metodologických poznámek k Mluvnici současné češtiny [Some theoretical-methodological notes on the Grammar of Contemporary Czech]. Slovo a slovesnost 73. 208–216.

  • Čermák, František, Drahomíra Doležalová-Spoustová, Jaroslava Hlaváčová, Milena Hnátková, Tomáš Jelínek, Jan Kocek, Marie Kopřivová, Michal Křen, Renata Novotná, Vladimír Petkevič, Věra Schmiedtová, Hana Skoumalová, Michal Šulc & Zdeněk Velíšek. 2005. SYN2005: A genre-balanced corpus of written Czech. Prague: Ústav Českého národního korpusu FF UK. www.korpus.cz

  • Čermák, František, Jan Králík & Karel Kučera. 1997. Recepce současné češtiny a reprezentativnost korpusu (Výsledky a některé souvislosti jedné orientační sondy na pozadí budování Českého národního korpusu) [The reception of contemporary Czech and corpus representativity: Results and some relevant points of a preliminary sounding done during the building of the Czech National Corpus]. Slovo a slovesnost 58. 117–123.

  • Clark, Eve V. 1987. The Principle of Contrast: A constraint on acquisition. In B. MacWhinney (ed.), Mechanisms of language acquisition: The 20th annual Carnegie symposium on cognition, 1–33. Hillsdale NJ: Erlbaum.

  • Cochran, William G. & Gertrude M. Cox. 1957. Experimental designs, 2nd edn. New York: John Wiley and Sons.

  • Cowart, Wayne. 1997. Experimental syntax: Applying objective methods to sentence judgments. Thousand Oaks, CA: Sage Publishers.

  • Cummins, George. 1995. Locative in Czech: -u or -e: Choosing locative singular endings in Czech nouns. Slavic and East European Journal 39. 241–260.

  • Cvrček, Václav & Vilém Kodýtek. 2013. Ke klasifikaci morfologických variant [On classifying morphological variants]. Slovo a slovesnost 74. 139–145.

  • Cvrček, Václav, Vilém Kodýtek, Marie Kopřivová, Dominika Kováříková, Petr Sgall, Michal Šulc, Jan Táborský, Jan Volín & Martina Waclawičová. 2010. Mluvnice současné češtiny [A grammar of contemporary Czech]. Prague: Karolinum.

  • Dąbrowska, Ewa. 2005. Productivity and beyond: Mastering the Polish genitive inflection. Journal of Child Language 32. 191–205.

  • Dąbrowska, Ewa. 2006. Low-level schemas or general rules? The role of diminutives in the acquisition of Polish case inflections. Language Sciences 28. 120–135.

  • Dąbrowska, Ewa. 2008. The effects of frequency and neighbourhood density on adult speakers’ productivity with Polish case inflections: An empirical test of usage-based approaches to morphology. Journal of Memory and Language 58. 931–951.

  • Dąbrowska, Ewa. 2010. Naive v. expert intuitions: An empirical study of acceptability judgments. Linguistic Review 27. 1–23.

  • Dąbrowska, Ewa & Marcin Szczerbiński. 2006. Polish children’s productivity with case marking: The role of regularity, type frequency, and phonological diversity. Journal of Child Language 33. 559–597.

  • Divjak, Dagmar. 2016. The role of lexical frequency in the acceptability of syntactic variants: Evidence from that-clauses in Polish. Cognitive Science 40. 1–29.

  • Halliday, M. A. K. 2005 [1992]. Language as system and language as instance: The corpus as a theoretical construct. In J. Svartvik (ed.), Directions in corpus linguistics, 61–77. Berlin: Mouton de Gruyter. Reprinted in Webster, Jonathan J. (ed.), Computational and quantitative studies, 76–92. [Collected Works of M. A. K. Halliday 6]. London: Continuum.

  • Halliday, M. A. K. 2005 [1993]. Quantitative studies and probability in grammar. In Michael Hoey (ed.), Data, description and discourse, 61–77. London: HarperCollins. Reprinted in Webster, Jonathan J. (ed.), Computational and quantitative studies, 130–156. [Collected Works of M. A. K. Halliday 6]. London: Continuum.

  • Hare, Mary L., Michael Ford & William D. Marslen-Wilson. 2001. Ambiguity and frequency effects in regular verb inflection. In Joan Bybee & P. Hopper (eds.), Frequency and the emergence of linguistic structures, 181–200. Amsterdam and Philadelphia: John Benjamins.

  • Hebal-Jezierska, Milena. 2008. Wariantywność końcówek fleksyjnych rzeczowników męskich żywotnych w języku czeskim [Variation in the flexional endings of masculine animate nouns in Czech]. Warsaw: Wydział Polonistyki Uniwersytetu Warszawskiego.

  • Hebal-Jezierska, Milena & Neil Bermel. 2011. Frequency and oppositions in corpus-based research into morphological variation. In Marek Konopka, Jacqueline Kubczak, Christian Mair, František Štícha & Ulrich H. Waßner (eds.), Grammatik und Korpora 3, 373–388. Tübingen: Narr.

  • Heister, Julian & Reinhold Kliegl. 2012. Comparing word frequencies from different German text corpora. Lexical Resources in Psycholinguistic Research 3. 27–44.

  • Janda, Laura. 1996. Back from the brink: A study of how relic forms in language serve as source material for analogical extension. Munich and Newcastle: Lincom Europa.

  • Karlík, Petr, Marek Nekula, Zdena Rusínová, Miroslav Grepl, Zdeňka Hladká, Milan Jelínek, Marie Krčmová & Dušan Šlosar. 1995. Příruční mluvnice češtiny [A grammar handbook of Czech]. Prague: Nakladatelství Lidové Noviny.

  • Kasal, Jindřich. 1992. Dublety a jejich užití [Doublets and their usage]. Philologica 65. 107–114.

  • Klimeš, Lumír. 1953. Lokál singuláru a plurálu vzoru „hrad“ a „město“ [The locative singular and plural of the “hrad” and “město” paradigms]. Naše řeč 36. 212–219.

  • Kolařík, Josef. 1995. Dynamika ve flexi substantiv běžně mluveného jazyka ve Zlíně [Dynamics in the inflection of nouns in ordinary spoken language in Zlín]. In Dana Davidová (ed.), K diferenciaci současného mluveného jazyka [On differentiation in the contemporary spoken language], 79–83. Ostrava: Universitas Ostraviensis Facultas Philosophica.

  • Komárek, Miroslav, Jan Kořenský, Jan Petr & Jarmila Veselková (eds.). 1986. Mluvnice češtiny. Díl 2: Tvarosloví [A grammar of Czech, part II: Morphology]. Prague: Academia.

  • Králík, Jan & Michal Šulc. 2005. The representativeness of Czech corpora. International Journal of Corpus Linguistics 10. 357–366.

  • Křen, Michal, Tomáš Bartoň, Václav Cvrček, Milena Hnátková, Tomáš Jelínek, Jan Kocek, Renata Novotná, Vladimír Petkevič, Pavel Procházka, Věra Schmiedtová & Hana Skoumalová. 2010. SYN2010: A genre-balanced corpus of written Czech. Prague: Ústav Českého národního korpusu FF UK. www.korpus.cz.

  • Langacker, Ronald. 2008. Cognitive grammar: A basic introduction. Oxford: Oxford University Press.

  • McEnery, Tony & Andrew Hardie. 2012. Corpus linguistics: Method, theory and practice. Cambridge: Cambridge University Press.

  • Rusínová, Zdena. 1992. Některé aspekty distribuce alomorfů (genitiv a lokál sg. maskulin) [Some aspects of the distribution of allomorphs: Genitive and locative singular masculine]. Sborník prací filozofické fakulty brněnské univerzity, A 40. 23–31.

  • Schmid, Hans-Jörg. 2015. A blueprint of the Entrenchment-and-conventionalization model. In Peter Uhrig & Thomas Herbst (eds.), Yearbook of the German Cognitive Linguistics Association, 3–26. Berlin: Walter de Gruyter.

  • Schütze, Carson. 1996. The empirical base of linguistics: Grammaticality judgments and linguistic methodology. Chicago: University of Chicago Press.

  • Sedláček, Miloslav. 1982. V Záhřebě i v Záhřebu [Both “v Záhřebě” and “v Záhřebu”]. Naše řeč 65. 11–15.

  • Shannon, Claude E. & Warren Weaver. 1963 [1949]. The mathematical theory of communication. Urbana, IL: University of Illinois Press.

  • Šimandl, Josef. 2010. Dnešní skloňování substantiv typů kámen, břímě [The declension of nouns of the type kámen, břímě today]. Prague: Nakladatelství Lidové Noviny.

  • Štícha, František. 2009. Lokál singuláru tvrdých neživotných maskulin (ve vlaku vs. v potoce): úzus a gramatičnost [The locative singular of hard inanimate masculine nouns (ve vlaku vs. v potoce): Usage and grammaticality]. Slovo a slovesnost 70. 193–220.

  • Thornton, Anna. 2012. Reduction and maintenance of overabundance: A case study on Italian verb paradigms. Word Structure 5. 183–207.

About the article

Published Online: 2018-08-31

Published in Print: 2018-09-25


Research for this article was carried out as part of the project “Acceptability and forced-choice judgements in the study of linguistic variation”, funded by the Leverhulme Trust (RPG-407). An earlier version of some of this material was published in Czech in abbreviated form in Bermel et al. (2014).


Citation Information: Corpus Linguistics and Linguistic Theory, Volume 14, Issue 2, Pages 197–231, ISSN (Online) 1613-7035, ISSN (Print) 1613-7027, DOI: https://doi.org/10.1515/cllt-2016-0032.


© 2018 Walter de Gruyter GmbH, Berlin/Boston.
