Corpus Linguistics and Linguistic Theory

Founded by Gries, Stefan Th. / Stefanowitsch, Anatol

Ed. by Wulff, Stefanie

2 Issues per year

IMPACT FACTOR 2015: 0.429
5-year IMPACT FACTOR: 0.849

SCImago Journal Rank (SJR) 2015: 0.281
Source Normalized Impact per Paper (SNIP) 2015: 0.971
Impact per Publication (IPP) 2015: 0.485

Verb similarity: Comparing corpus and psycholinguistic data

Lara Gil-Vallejo
  • Corresponding author
  • Department of Arts and Humanities, Universitat Oberta de Catalunya, Barcelona, Catalunya, Spain
/ Marta Coll-Florit
  • Department of Arts and Humanities, Universitat Oberta de Catalunya, Barcelona, Catalunya, Spain
/ Irene Castellón
  • Department of Linguistics, Universitat de Barcelona, Barcelona, Catalunya, Spain
/ Jordi Turmo
  • Department of Computer Science, Universitat Politecnica de Catalunya, Barcelona, Catalunya, Spain
Published Online: 2017-01-26 | DOI: https://doi.org/10.1515/cllt-2016-0045


Similarity, which plays a key role in fields like cognitive science, psycholinguistics and natural language processing, is a broad and multifaceted concept. In this work we analyse how two approaches that belong to different perspectives, the corpus view and the psycholinguistic view, articulate similarity between verb senses in Spanish. Specifically, we compare the similarity between verb senses based on their argument structure, which is captured through semantic roles, with their similarity defined by word associations. We address the question of whether verb argument structure, which reflects the expression of the events, and word associations, which are related to the speakers’ organization of the mental lexicon, shape similarity between verbs in a congruent manner, a topic which has not been explored previously. While we find significant correlations between verb sense similarities obtained from these two approaches, our findings also highlight some discrepancies between them and the importance of the degree of abstraction of the corpus annotation and psycholinguistic representations.

Keywords: similarity; semantic roles; word associations
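
The comparison described in the abstract can be illustrated with a minimal sketch. It assumes each verb sense is represented twice, once as a vector of semantic role counts and once as a vector of word association counts, and that pairwise similarities from the two sources are then compared with a rank correlation. The abstract does not say which similarity measure or correlation statistic the authors used, so cosine similarity and Spearman's rho are assumptions here, and every verb sense, role label, associate and count below is invented for illustration.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pairwise_similarities(vectors):
    """Cosine similarity for every unordered pair of keys, in sorted key order."""
    return [cosine(vectors[a], vectors[b])
            for a, b in combinations(sorted(vectors), 2)]


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical corpus view: semantic role counts per verb sense (invented).
role_counts = {
    "dar_1": {"Agent": 10, "Theme": 9, "Recipient": 8},
    "enviar_1": {"Agent": 9, "Theme": 8, "Recipient": 5, "Goal": 3},
    "correr_1": {"Agent": 10, "Path": 6},
    "caminar_1": {"Agent": 9, "Path": 5, "Goal": 2},
}

# Hypothetical psycholinguistic view: associate counts per verb sense (invented).
association_counts = {
    "dar_1": {"regalo": 6, "recibir": 5, "mano": 2},
    "enviar_1": {"carta": 7, "recibir": 4, "regalo": 2},
    "correr_1": {"rápido": 6, "pierna": 4, "deporte": 3},
    "caminar_1": {"pierna": 5, "lento": 4, "deporte": 2},
}

corpus_sims = pairwise_similarities(role_counts)
association_sims = pairwise_similarities(association_counts)
rho = spearman(corpus_sims, association_sims)
print(f"Spearman rho between the two similarity rankings: {rho:.3f}")
```

Because both similarity lists are built over the same sorted set of sense pairs, the rank correlation compares like with like. A coarser or finer role inventory would change the corpus vectors and therefore the correlation, which is one way to picture the abstract's point about the degree of abstraction of the annotation.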


About the article

Citation Information: Corpus Linguistics and Linguistic Theory, ISSN (Online) 1613-7035, ISSN (Print) 1613-7027, DOI: https://doi.org/10.1515/cllt-2016-0045.
