Abstract
In this article we report the results for five POS taggers, namely the Mate tagger, the Hunpos tagger, RFTagger, the OpenNLP tagger, and the NLTK Unigram tagger, tested on the data of the Ancient Greek Dependency Treebank. The aim is to find the most efficient POS tagger to use for pre-annotation of new treebank data. A corrected 1-run 10-fold cross-validation t-test shows that the Mate tagger outperforms all the other taggers, with an accuracy score of 88%.
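As an illustration of the kind of test named above, the following is a minimal sketch of one commonly used form of the corrected resampled t-test for a single run of k-fold cross-validation. The abstract does not spell out the exact procedure, so the variance correction used here (the factor 1/k + n_test/n_train, with n_test/n_train = 1/(k-1) for equal folds) is an assumption, and the function name and the per-fold accuracies are hypothetical values made up for the example, not results from the article.

```python
import math
from statistics import mean, variance


def corrected_cv_t_statistic(acc_a, acc_b):
    """Corrected resampled t statistic for one run of k-fold cross-validation.

    acc_a and acc_b are per-fold accuracies of two taggers evaluated on the
    same folds. The statistic has k - 1 degrees of freedom.
    """
    if len(acc_a) != len(acc_b) or len(acc_a) < 2:
        raise ValueError("need two equally long lists with at least two folds")
    k = len(acc_a)
    diffs = [a - b for a, b in zip(acc_a, acc_b)]
    var = variance(diffs)  # sample variance (divides by k - 1)
    if var == 0:
        raise ValueError("fold differences have zero variance")
    # With k equal folds, each test set is 1/(k - 1) the size of its training set.
    test_train_ratio = 1.0 / (k - 1)
    # The extra ratio term inflates the variance to account for the overlap
    # between training sets, which the uncorrected paired t-test ignores.
    return mean(diffs) / math.sqrt((1.0 / k + test_train_ratio) * var)


if __name__ == "__main__":
    # Hypothetical per-fold accuracies, invented for illustration only.
    tagger_a = [0.88, 0.87, 0.89, 0.88, 0.86, 0.88, 0.89, 0.87, 0.88, 0.90]
    tagger_b = [0.85, 0.86, 0.85, 0.84, 0.85, 0.86, 0.86, 0.84, 0.85, 0.86]
    t_stat = corrected_cv_t_statistic(tagger_a, tagger_b)
    # Two-sided critical value of Student's t for 9 degrees of freedom, alpha = 0.05.
    critical = 2.262
    print(f"t = {t_stat:.3f}, significant at 0.05: {abs(t_stat) > critical}")
```

When several taggers are compared pairwise, the resulting p-values would normally also need a multiple-comparison adjustment, which this sketch leaves out.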
© 2016 G. G. A. Celano et al.
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.