Search Results

verbal Frames in Functional Generative Description. Prague Bulletin of Mathematical Linguistics, 22:3–40, 1974. Panevová, Jarmila. More Remarks on Control. Prague Linguistic Circle Papers, 2(1):101–120, 1996. Popel, Martin and Zdeněk Žabokrtský. TectoMT: modular NLP framework. Advances in Natural Language Processing, pages 293–304, 2010. Przepiórkowski, Adam and Alexandr Rosen. Czech and Polish Raising/Control with or without Structure Sharing. Research in Language, 3:33–66, 2005. Šindlerová, Jana and Ondřej Bojar. Towards English-Czech Parallel Valency Lexicon

. NEALT Proceedings Series, Vol. 8, i-ii, Northern European Association for Language Technology (NEALT), Tartu University. [4] Collins COBUILD Advanced Learner’s Dictionary (2014). [5] Čermáková, A. (2009). Valence českých substantiv. Lidové noviny, Praha. [6] Hajič, J., Panevová, J., Urešová, Z., Bémová, A., Kolářová, V., and Pajas, P. (2003). PDT-VALLEX: Creating a Large-coverage Valency Lexicon for Treebank Annotation. In Proceedings of The Second Workshop on Treebanks and Linguistic Theories, pages 57–68, Växjö University Press, Sweden. [7] Herbst, T., Heath, D

References [1] Benko, V. (2014). Aranea: Yet Another Family of (Comparable) Web Corpora. In Sojka, P. et al., editors, TSD 2014. LNAI 8655, pages 247–256. Springer International Publishing. [2] Hajič, J. et al. (2003). PDT-VALLEX: Creating a Large-coverage Valency Lexicon for Treebank Annotation. In Proceedings of The Second Workshop on Treebanks and Linguistic Theories, pages 57–68. Växjö University Press. [3] Kolářová, V. (2010). Valence deverbativních substantiv v češtině (na materiálu substantiv s dativní valencí). Praha, Karolinum. [4] Kolářová, V. (2014a

A Corpus-Based Analysis of the Complementation Patterns of English Verbs, Nouns and Adjectives

References [1] Lopatková, M. (2003). Valency in the Prague Dependency Treebank: Building the Valency Lexicon. In The Prague Bulletin of Mathematical Linguistics, 79–80, pages 37–60, MFF UK. [2] Lopatková, M., Kettnerová, V., Bejček, E., Vernerová, A., Žabokrtský, Z., and Barančíková, P. (2018). VALLEX 3.5 – Valenční slovník českých sloves. Charles University, Prague, Accessible at: [3] Kettnerová, V., Lopatková, M., and Hrstková, K. (2008). Semantic Classes in Czech Valency Lexicon: Verbs of Communication and Verbs of Exchange

al. 2009), containing 354,529 words from Iliad, Odyssey, Hesiod, Aeschylus, Sophocles, and Plato; the PROIEL Project Greek treebank, which contains 187,486 words from Herodotus and the New Testament. In this article, we describe a valency lexicon for Latin verbs derived from the Latin Dependency Treebank. We will also show how we extended the same corpus-driven approach to create an Ancient Greek lexicon from the Ancient Greek Dependency Treebank, thus demonstrating the benefits of a collaborative perspective on the creation of language resources and the

framenet. International Journal of Lexicography, 16(3):235-250, 2003. Hajič, Jan, Eva Hajičová, Jarmila Panevová, Petr Sgall, Petr Pajas, Jan Štěpánek, Jiří Havelka, and Marie Mikulová. Prague Dependency Treebank 2.0. Linguistic Data Consortium, Philadelphia, PA, USA, 2006. Kettnerová, Václava, Markéta Lopatková, and Klára Hrstková. Semantic roles in valency lexicon of Czech verbs: Verbs of communication and exchange. In Ranta, Aarne and Bengt Nordström, editors, Advances in Natural Language Processing (6th International Conference on NLP, GoTAL 2008), volume 5221

from Scriptum super Sententiis Magistri Petri Lombardi and Summa Theologiae. The paper details the multi-layer annotation style of the IT-TB and its background theoretical motivations. The conversion process to the now widely used Universal Dependencies style is described as well. Across more than a decade, the project has developed a number of linguistic resources and NLP tools for Latin connected to the IT-TB. As for the resources, the paper presents the syntax-based subcategorization lexicon IT-VaLex and the valency lexicon Latin Vallex. As for the tools, the


This paper presents the Slovene Training Corpus ssj500k 2.2, which has been annotated on the levels of tokenization, sentence segmentation, part-of-speech tagging, lemmatization, syntactic dependencies, named entities, verbal multi-word expressions, and semantic role labeling. It describes the individual layers of annotation and shows the scope of using the training corpus in the production of various lexicons, such as the lexicon of multi-word units and the valency lexicon of modern Slovene. It concludes by presenting our future work, i.e. the annotation of multi-word expressions based on the Slovene Lexical Database.


Automatic word sense disambiguation (WSD) has proven to be an important technique in many natural language processing tasks. For many years the problem of sense disambiguation has been approached with a wide range of methods; however, it remains challenging, especially in the unsupervised setting. Knowledge-based methods that leverage lexical knowledge resources such as wordnets are among the well-known and successful approaches to WSD. As knowledge-based approaches mostly do not use any labelled training data, their performance relies strongly on the structure and quality of the knowledge sources used. However, a pure knowledge base such as a wordnet cannot reflect all the semantic knowledge necessary to correctly disambiguate word senses in text. In this paper we explore various expansions to plWordNet as knowledge bases for WSD. Semantic links extracted from a large valency lexicon (Walenty), glosses and usage examples, Wikipedia articles and the SUMO ontology are combined with plWordNet and tested in a PageRank-based WSD algorithm. In addition, we also analyse the influence of lexical semantics vector models extracted with the help of distributional semantics methods. Several new Polish test data sets for WSD are also introduced. All the resources, methods and tools are available under open licences.