Digital Classical Philology

Ancient Greek and Latin in the Digital Revolution

Ed. by Berti, Monica

Series: Age of Access? Grundfragen der Informationsgesellschaft 10

Open Access
eBook (PDF)
Publication Date: August 2019

Character Encoding of Classical Languages

Tauber, James K.


Underlying any processing and analysis of texts is the need to represent the individual characters that make up those texts. For the first few decades, scholars pioneering digital classical philology had to adopt various workarounds for handling the scripts of historical languages on systems that were never intended for anything but English. The Unicode Standard addresses many of the issues with character encoding across the world’s writing systems, including those used by historical languages, but its practical use in digital classical philology is not without challenges. This chapter starts with a conceptual overview of character coding systems, and of the Unicode Standard in particular, and then turns to practical issues relating to the input, interchange, processing and display of classical texts. As well as providing guidelines for interoperability in text representation, it covers various aspects of text processing at the character level, including normalisation, search, regular expressions, collation, and alignment.

Citation Information

James K. Tauber (2019). Character Encoding of Classical Languages. In Monica Berti (Editor), Digital Classical Philology: Ancient Greek and Latin in the Digital Revolution (pp. 137–158). Berlin, Boston: De Gruyter. https://doi.org/10.1515/9783110599572-009

Book DOI: https://doi.org/10.1515/9783110599572

Online ISBN: 9783110599572

© 2019 Walter de Gruyter GmbH, Berlin/Munich/Boston. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License (CC BY-NC-ND 4.0).
