Open Linguistics

Editor-in-Chief: Ehrhart, Sabine

Open Access
Part of Speech Tagging for Ancient Greek

Giuseppe G. A. Celano / Gregory Crane / Saeed Majidi
Published Online: 2016-10-14 | DOI: https://doi.org/10.1515/opli-2016-0020


In this article we report the results for five POS taggers, i.e., the Mate tagger, the Hunpos tagger, RFTagger, the OpenNLP tagger, and the NLTK Unigram tagger, tested on the data of the Ancient Greek Dependency Treebank. The aim is to find the most efficient POS tagger to use for pre-annotation of new treebank data. A corrected 1-run 10-fold cross-validation t-test shows that the Mate tagger outperforms all the other taggers, with an accuracy score of 88%.
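The corrected cross-validation t-test mentioned above can be illustrated with a minimal sketch. It assumes the variance correction usually attributed to Nadeau and Bengio, in which the 1/k term is augmented by the ratio of test-set size to training-set size (1/9 for 10-fold cross-validation). The function name and the per-fold accuracies are hypothetical and invented for illustration; they are not results from the article, and the abstract does not spell out the authors' exact procedure.

```python
import math
from statistics import mean, variance


def corrected_resampled_t(scores_a, scores_b, test_train_ratio):
    """Corrected resampled t statistic for paired per-fold scores.

    scores_a, scores_b: per-fold accuracies of two taggers on the same folds.
    test_train_ratio:   n_test / n_train for one fold (1/9 in 10-fold CV).
    The statistic is compared against a t distribution with k - 1 degrees
    of freedom, where k is the number of folds.
    """
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally long score lists with >= 2 folds")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    k = len(diffs)
    var = variance(diffs)  # sample variance, denominator k - 1
    if var == 0:
        raise ValueError("differences have zero variance")
    # The uncorrected test would use var / k; the correction adds the
    # test/train ratio to account for overlapping training sets.
    return mean(diffs) / math.sqrt((1.0 / k + test_train_ratio) * var)


# Hypothetical per-fold accuracies for two taggers (illustration only).
tagger_a = [0.881, 0.879, 0.884, 0.878, 0.882, 0.880, 0.883, 0.877, 0.885, 0.879]
tagger_b = [0.861, 0.858, 0.866, 0.859, 0.860, 0.863, 0.862, 0.857, 0.864, 0.860]

t_stat = corrected_resampled_t(tagger_a, tagger_b, test_train_ratio=1 / 9)
# Two-sided critical value of the t distribution for alpha = 0.05, 9 d.o.f.
T_CRIT_DF9 = 2.262
print(f"t = {t_stat:.3f}, significant at 0.05: {abs(t_stat) > T_CRIT_DF9}")
```

When several taggers are compared pairwise, the resulting p-values would normally also need a multiple-comparison adjustment; that step is left out of this sketch.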

Keywords: Ancient Greek; POS tagging; part of speech; morphology


About the article

Received: 2016-04-08

Accepted: 2016-07-20

Published Online: 2016-10-14

Citation Information: Open Linguistics, Volume 2, Issue 1, ISSN (Online) 2300-9969, DOI: https://doi.org/10.1515/opli-2016-0020.

© 2016 G. G. A. Celano et al. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License (CC BY-NC-ND 3.0).
