Published by De Gruyter Mouton, November 8, 2016

Quantifying polysemy: Corpus methodology for prototype theory

Dylan Glynn
From the journal Folia Linguistica

Abstract

This study addresses the methodological problem of result falsification in Cognitive Semantics, specifically in the descriptive analysis of semasiological variation, or “polysemy”. It argues that manually analysed corpus data can be used to describe models of semantic structure. The proposed method is quantified, which permits repeat analysis. The operationalisation of semasiological structure employed in the study takes the principle of semantic features and applies it to a contextual analysis of the usage-events associated with the lexeme under scrutiny. The feature analysis, repeated on a large collection of occurrences, results in a set of metadata that constitutes the usage-profile of the lexeme. Multivariate statistics are then employed to identify patterns in those metadata. The case study examines 500 occurrences of the English lexeme annoy. Three basic senses are identified, as well as a more complex array of semantic variations linked to the morpho-syntactic context of usage.
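The first step of such a usage-profile analysis can be illustrated with a minimal sketch. It assumes that each occurrence has been manually coded for a small set of features; the feature names (`cause`, `construction`), their values, and the eight toy occurrences are hypothetical and are not taken from the study. The sketch cross-tabulates two features and computes a Pearson chi-square statistic as a simple stand-in for the multivariate techniques the abstract refers to, which it does not attempt to reproduce.

```python
from collections import Counter

# Hypothetical hand-annotated usage-events (invented for illustration only).
occurrences = [
    {"cause": "person", "construction": "transitive"},
    {"cause": "person", "construction": "transitive"},
    {"cause": "event", "construction": "passive"},
    {"cause": "event", "construction": "passive"},
    {"cause": "person", "construction": "passive"},
    {"cause": "event", "construction": "transitive"},
    {"cause": "person", "construction": "transitive"},
    {"cause": "event", "construction": "passive"},
]


def cross_tabulate(rows, row_feature, col_feature):
    """Count co-occurrences of two annotated features across all usage-events."""
    counts = Counter((r[row_feature], r[col_feature]) for r in rows)
    row_levels = sorted({r[row_feature] for r in rows})
    col_levels = sorted({r[col_feature] for r in rows})
    table = [[counts[(a, b)] for b in col_levels] for a in row_levels]
    return row_levels, col_levels, table


def chi_square(table):
    """Pearson chi-square statistic for a contingency table of raw counts."""
    total = sum(sum(row) for row in table)
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


row_levels, col_levels, table = cross_tabulate(occurrences, "cause", "construction")
statistic = chi_square(table)
```

With real data, the same contingency table would typically feed an exploratory multivariate technique rather than a single test statistic; the chi-square here only shows how annotated features become counts that can be analysed quantitatively and repeated by another analyst.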

Acknowledgement

The author wishes to sincerely thank two anonymous reviewers, the general editor Hubert Cuyckens, as well as the special issue editors for their observations, criticisms, and help. Evidently, all shortcomings remain my own.

Received: 2015-8-31
Received: 2016-2-20
Revised: 2016-3-10
Accepted: 2016-5-31
Published Online: 2016-11-8
Published in Print: 2016-11-1

©2016 by De Gruyter Mouton
