
Lodz Papers in Pragmatics

Founded by Cap, Piotr

Editor-in-Chief: Chilton, Paul / Kopytowska, Monika

2 Issues per year


SCImago Journal Rank (SJR) 2016: 0.105

Online ISSN: 1898-4436

Towards equivalence links between senses in plWordNet and Princeton WordNet

Ewa Rudnicka / Francis Bond / Łukasz Grabowski / Maciej Piasecki / Tadeusz Piotrowski
Published Online: 2017-09-02 | DOI: https://doi.org/10.1515/lpp-2017-0002

Abstract

The paper addresses the creation of equivalence links in bilingual computational lexicography. The existing interlingual links between plWordNet and Princeton WordNet synsets (sets of synonymous lexical units, i.e. lemma and sense pairs) are re-analysed from the perspective of equivalence types as defined in traditional lexicography and translation. Special attention is paid to cognitive and translational equivalents. A proposal for mapping lexical units is presented. Three types of links are defined: super-strong equivalence, strong equivalence and weak implied equivalence. Super-strong and strong equivalence share a common set of formal, semantic and usage features, with some feature values slightly loosened for strong equivalence. Both types will be introduced manually by trained lexicographers. The sense mapping will partly draw on the results of the existing synset mapping. The lexicographers will analyse lists of synset pairs linked by interlingual relations such as synonymy, partial synonymy, hyponymy and hypernymy. They will also consult bilingual dictionaries and check translation probabilities in a parallel corpus. The results of the proposed mapping have application potential in natural language processing, translation and language learning.

Keywords: equivalence; wordnets; interlingual mapping; synset; lexical unit; sense-level

About the article

Ewa Rudnicka

Ewa Rudnicka is a Research Associate at the Department of Computer Science and Management, Wroclaw University of Technology, Poland. Her research interests include computational bilingual lexicography, comparative linguistics, formal semantics and translation studies. She coordinates the mapping of plWordNet onto Princeton WordNet. She is a member of the G4.19 Language Technology and Computational Linguistic Research Group.

Francis Bond

Francis Bond is an Associate Professor at the Division of Linguistics and Multilingual Studies, Nanyang Technological University, Singapore. He worked on machine translation and natural language understanding in Japan, first at Nippon Telegraph and Telephone Corporation and then at the National Institute of Information and Communications Technology, where his focus was on open source natural language processing. He is an active member of the Deep Linguistic Processing with HPSG Initiative (DELPHIN) and the Global WordNet Association. His main research interest is in natural language understanding. Francis has developed and released wordnets for Chinese, Japanese, Malay and Indonesian and coordinates the Open Multilingual Wordnet.

Łukasz Grabowski

Łukasz Grabowski is an Associate Professor at the Institute of English, University of Opole, Poland. His research interests include corpus linguistics, phraseology, formulaic language, translation studies and lexicography. He is also interested in computer-assisted methods of text analysis. He has published internationally in International Journal of Corpus Linguistics and English for Specific Purposes, among others. He is also Managing Editor of the journal Explorations: A Journal of Language and Literature.

Maciej Piasecki

Maciej Piasecki is an Associate Professor at the Department of Computer Science and Management, Wroclaw University of Technology, Poland. He is the Polish National Coordinator of CLARIN ERIC (www.clarin.eu) and a member of the Global WordNet Association Board. He initiated and leads the plWordNet project (a large wordnet of Polish) and leads the G4.19 Language Technology and Computational Linguistic Research Group. His research interests cover different areas of natural language processing and engineering, computational lexicography, data extraction and information retrieval.

Tadeusz Piotrowski

Tadeusz Piotrowski is a Professor at the English Department, University of Wrocław, Poland. His research interests include the theory, practice and history of monolingual and bilingual lexicography and dictionaries, corpus linguistics, and translation studies. He has participated in most major bilingual dictionary projects in Poland, working with such companies as PWN, OUP, Pons-Klett, Langenscheidt, Prószyński, Wiedza Powszechna and the Kościuszko Foundation, and has written a number of dictionaries for Spotkania. He is also interested in computational lexicography and computer-assisted text analysis. He has published three books and about 200 papers.


Ewa Rudnicka Department of Computer Science and Management Wybrzeże Wyspiańskiego 27 50-370 Wrocław, Poland



Published in Print: 2017-08-28


Citation Information: Lodz Papers in Pragmatics, Volume 13, Issue 1, Pages 3–24, ISSN (Online) 1898-4436, ISSN (Print) 1895-6106, DOI: https://doi.org/10.1515/lpp-2017-0002.


© 2017 Walter de Gruyter GmbH, Berlin/Boston.
