
Digital Classical Philology

Ancient Greek and Latin in the Digital Revolution

Ed. by Berti, Monica

Series: Age of Access? Grundfragen der Informationsgesellschaft 10

Open Access
eBook (PDF)
Publication Date: August 2019
Copyright year: 2019
ISBN: 978-3-11-059957-2

The Project of the Index Thomisticus Treebank

Passarotti, Marco

Abstract

The paper introduces the project of the Index Thomisticus Treebank (IT-TB). The IT-TB is a dependency-based treebank built on the corpus of the Index Thomisticus (IT) by Father Roberto Busa, which includes the opera omnia of Thomas Aquinas, for a total of approximately 11 million words. Currently, the IT-TB is the largest Latin treebank available, with more than 350,000 nodes in around 17,000 sentences. The annotation covers the whole of Books 1, 2 and 3 of Summa contra Gentiles, plus excerpts from Scriptum super Sententiis Magistri Petri Lombardi and Summa Theologiae. The paper details the multi-layer annotation style of the IT-TB and the theoretical motivations behind it. It also describes the process of converting the treebank to the now widely used Universal Dependencies style. Over more than a decade, the project has developed a number of linguistic resources and NLP tools for Latin connected to the IT-TB. As for the resources, the paper presents the syntax-based subcategorization lexicon IT-VaLex and the valency lexicon Latin Vallex. As for the tools, the paper describes the automatic dependency parsing process, highlighting the core issue of the portability of NLP tools across the wide diachronic and diatopic span of Latin texts. A section is dedicated to the automatic morphological analysis of Latin. It introduces the analyzer Lemlat and its recent enhancement with information on derivational morphology and with a new set of lexical entries covering a large Onomasticon (from the Forcellini dictionary) and Medieval Latin (from the Du Cange glossary).

Citation Information

Marco Passarotti (2019). The Project of the Index Thomisticus Treebank. In Monica Berti (Editor), Digital Classical Philology: Ancient Greek and Latin in the Digital Revolution (pp. 299–320). Berlin, Boston: De Gruyter. https://doi.org/10.1515/9783110599572-017

Book DOI: https://doi.org/10.1515/9783110599572

Online ISBN: 9783110599572

© 2019 Walter de Gruyter GmbH, Berlin/Munich/Boston. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License (BY-NC-ND 4.0).
