Poznan Studies in Contemporary Linguistics

Editor-in-Chief: Dziubalska-Kolaczyk, Katarzyna


Online ISSN: 1897-7499
Volume 55, Issue 2

Dependency parsing of Polish

Alina Wróblewska / Piotr Rybak
Published Online: 2019-08-17 | DOI: https://doi.org/10.1515/psicl-2019-0012

Abstract

The predicate-argument structure transparently encoded in dependency-based syntactic representations supports tasks such as machine translation, question answering, and information extraction. The quality of dependency parsing is therefore a crucial issue in natural language processing. In this paper, we discuss the fundamental ideas of dependency theory and provide an overview of selected dependency-based resources for Polish. Furthermore, we present several state-of-the-art dependency parsing systems whose models can be estimated on correctly annotated data. In the experimental part, we provide an in-depth evaluation of these systems on Polish data. Our results show that graph-based parsers, even those without any neural component, are better suited for Polish than transition-based parsing systems.

Keywords: Polish Dependency Bank; dependency parsing; evaluation

About the article

Alina Wróblewska Instytut Podstaw Informatyki Polskiej Akademii Nauk ul. Jana Kazimierza 5 01-248 Warszawa Poland


Published in Print: 2019-06-26


Citation Information: Poznan Studies in Contemporary Linguistics, Volume 55, Issue 2, Pages 305–337, ISSN (Online) 1897-7499, ISSN (Print) 0137-2459, DOI: https://doi.org/10.1515/psicl-2019-0012.

© 2019 Faculty of English, Adam Mickiewicz University, Poznań, Poland.
