Structural Visualization of Manuscripts (StruViMan): Principles, Methods, Prospects

Abstract
This paper introduces a tool which offers scholars a new way to visualize the structure of manuscripts. The Structural Visualization of Manuscripts (or StruViMan) is a web-based application, developed as part of the Paratexts of the Greek Bible Project, a European Research Council project based in Munich. Drawing on the principles of structural codicology, StruViMan is able to translate the different stages of a manuscript’s development into a visual model based on the codex’s physical, historical layers and aims to facilitate the comparison of manuscripts. It can be used by any web-connected manuscript database from any cultural area and does not require the presence of electronic images. This presentation begins with a short survey of the principles underpinning the tool’s conception and development, followed by a demonstration of how manuscript data from both biblical and non-biblical Greek codices are transformed into interactive, customizable visualizations with varying display modes. We will also touch upon StruViMan’s technical aspects as an open-access web service, available to any software or database able to call its API using the correct parameters. We close with a preview of new features currently under development, including the ability to “reconstruct” a manuscript whose composite parts are presently in different repositories.


Introduction
In this paper we would like to introduce a tool which offers scholars a new way to visualize the structure of medieval manuscripts.1 The Structural Visualization of Manuscripts (StruViMan)2 came into being as an ERC proof-of-concept project and was developed between 2017 and 2018 as a practical extension of the Paratexts of the Greek Bible (ParaTexBib)3 project, a larger, five-year ERC project based at the Ludwig Maximilian University in Munich, begun in 2015 and led by Martin Wallraff and Patrick Andrist.4 Although StruViMan was developed within the context of a project dedicated to the Greek Bible, it has been our express goal from the outset to create a tool which can serve any database dealing with any kind of manuscripts. We will begin by looking briefly at the tool's genesis.
The ParaTexBib project has as its central aim to present a comprehensive survey of the paratextual material in manuscripts of the Greek Bible (mostly Gospels) from the 2nd to the 16th century. On the basis of digital reproductions each manuscript is carefully reviewed and the presence of biblical texts and paratexts is documented within the framework of the Pinakes database. As we compiled these descriptions it became apparent that while traditional cataloguing methods, focusing on building sequential lists of the pieces of content found in individual manuscripts, have a great deal to offer in terms of completeness and precision, they lack the intuitive view of the codex as a structured and multi-layered object that a visualization can convey. It is with this idea of the codex as an object bearing the physical traces of its evolution over time, both in its structure and content, that StruViMan was conceived as a supplement to more traditional manuscript databases and as a way to translate the different stages in a manuscript's development (what one may call its "stratigraphy") into a visual model. First, we will briefly describe the research context from which the tool emerged, as well as the methodologies that underpin its development. After a short exploration of the design concept, we will demonstrate its various features by applying it to two sample manuscripts, and finally we will touch on the tool's wider application and some future prospects.

Pinakes, NTVMR and La syntaxe du codex
In order to understand how the StruViMan tool came into being, let us first look briefly at the research project from which it emerged, since the ParaTexBib project's approach to working with biblical manuscripts proved formative in the tool's development. In an age when a great many key resources for studying such material (including digital images, databases, and catalogues) are available (sometimes exclusively) on the internet, there is an increasing need to organize and present such data in a way that is easy to access and understand. Accordingly, two decisions were made at the beginning of the project which gave shape to how the project data were handled. The first concerned the processing and storing of the data: rather than constructing a new database from scratch, the project leaders decided to seek partners with pre-existing databases, so that the data could be built up and stored in a well-established and curated environment. This policy paved the way for two fruitful partnerships. First, the Section grecque of the Institut de recherche et d'histoire des textes maintains Pinakes,5 the largest and most important online database of Greek manuscripts, and it is within their framework that we were able to build a series of new fields and functions that answered our data requirements. Secondly, the ParaTexBib project cooperates with The New Testament Virtual Manuscript Room (NTVMR) at the University of Münster,6 which houses the largest collection of images of New Testament manuscripts on the internet. Our own ParaTexBib members annotate the paratextual content clustered around the biblical material in each manuscript directly on the relevant digital image in NTVMR and this information is then imported into Pinakes through a bridge (in the form of an API) that was programmed specially for this purpose.
The second decision concerned the presentation and organization of the information gathered. The ParaTexBib project's approach to manuscript description is based on the methodologies set out in La syntaxe du codex, first published in 2013 and soon to appear in English as The Syntax of the Codex.7 A medieval manuscript is often a complex, dynamic, and layered object from its very beginning. Sometimes its journey through the centuries leaves an imprint that is not difficult to discern, but at other times more detective work is required to reveal the various layers and transformations that led to the object as it is preserved today. The authors of La syntaxe thus sought to develop a methodological approach that is both comprehensive and finely articulated in its ability to accommodate the many variables and intricacies of composition and structure that one may encounter in medieval codices. Another important feature of La syntaxe is its emphasis on the points of intersection between a manuscript's codicological and textual features, as well as the development of a descriptive language which can navigate and disentangle the often complicated diachronic interplay between them.

Summary descriptions according to the syntactical model
La syntaxe du codex is able to cater to a level of manuscript complexity that goes beyond the needs of the manuscript descriptions one commonly finds in online manuscript databases. These descriptions are partial in nature and focus largely on the content rather than on the codicological features. The important thing for understanding how our descriptions are organized in Pinakes (and how in turn the StruViMan tool takes manuscript data and transforms them into a visual representation) is the following: our descriptions are arranged according to a so-called syntactical model which allows the reader to approach the codex both "vertically" (meaning that one can see at a glance the different stages of development over time) and also "horizontally" (in the writing, or rather the reading order, meaning what pieces of content appear where in the manuscript).8 The "horizontal" survey of a manuscript's contents is realized in much the same way as in a traditional catalogue: each content item that has been annotated on the digital manuscript images in NTVMR is imported into Pinakes and listed in the reading order in the manuscript. For the "vertical" aspect of the description, each piece of content is then assigned to what is called a production unit, that is, all the parts of the codex which are "the result of one and the same act of production."9 Production units depend heavily on the quire structure of the codex but also define an intermediate historical structure between the quire and the complete codex. A description according to the syntactical model thus operates on three data levels: the data level related to the codex as it is today; the data level related to its constitutive production units (i.e. its historical parts); and the data level of the pieces of content (mostly texts, but also images or musical pieces), always situated within a production unit.
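The three data levels can be pictured as a small nested data structure. The following Python sketch is purely illustrative: the class and field names (Codex, ProductionUnit, ContentItem) are our own assumptions and do not reproduce the actual Pinakes or StruViMan schema, folios are simplified to plain integers, and the sample codex is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ContentItem:
    """A piece of content (text, image, musical piece) within a production unit."""
    title: str
    content_type: str
    first_folio: int
    last_folio: int


@dataclass
class ProductionUnit:
    """One historical layer: the result of a single act of production."""
    label: str
    date: str
    contents: List[ContentItem] = field(default_factory=list)


@dataclass
class Codex:
    """The manuscript as it is preserved today."""
    shelfmark: str
    units: List[ProductionUnit] = field(default_factory=list)

    def horizontal(self) -> List[str]:
        """Titles of all pieces of content in reading order."""
        items = [c for u in self.units for c in u.contents]
        return [c.title for c in sorted(items, key=lambda c: c.first_folio)]

    def vertical(self) -> Dict[str, List[str]]:
        """Titles of the pieces of content grouped by production unit."""
        return {u.label: [c.title for c in u.contents] for u in self.units}


# Hypothetical codex: an original layer (A) with later additions (B)
# placed before and after it.
codex = Codex("Example 1", [
    ProductionUnit("A", "10th c.", [ContentItem("Main text", "text", 3, 90)]),
    ProductionUnit("B", "14th c.", [
        ContentItem("Added tables", "table", 1, 2),
        ContentItem("Restored ending", "text", 91, 95),
    ]),
])
```

Calling codex.horizontal() lists the content in reading order regardless of layer, while codex.vertical() groups the same content by production unit; these correspond to the two ways of approaching the codex described above.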

Visualization design concept
Pieces of manuscript content arranged into their respective chronological layers thus make up the basis of the information which StruViMan translates into a visual model. We will now look more closely at the manuscript visualization that StruViMan can generate. This graphic was designed to reflect the image of a book lying on its side (figure 1).

Figure 1. Sample Visualization of a Manuscript in StruViMan10
The visualization has two parts:
Structure bar: the "spine" of the book consists of a bar divided into two differently-colored sections. It shows how many production units are present in the manuscript and where they occur.
Content bar: the "pages" of the book not only give an overview of the pieces of content within the codex, but also show roughly how much space each item occupies in the manuscript. The different colors show which pieces of content belong to the same types of content (in the case of biblical paratexts this would mean that all prologues have the same color, all the subscriptions another one, and so forth; the same holds for chapter lists, evangelist portraits, etc.).

Example 1: Codex Bodmer 115
Having briefly surveyed the methodology underpinning our descriptions and the ideas behind the tool's visual design concept, we will now look at the tool in action as it transforms manuscript data into a graphic visualization. Even though in the ParaTexBib project our focus is, as mentioned, exclusively on manuscripts with biblical content, we begin here with a non-biblical codex, in order to show the wide and intended applicability of the tool to manuscripts of all kinds. The Codex Bodmer 115 is a collection of treatises on military science written in Greek and currently held at the Fondation Martin Bodmer in Cologny, Switzerland. The catalogue for the Greek manuscripts of that collection, which appeared in 2016, applies the syntactical model of manuscript description set forth in La syntaxe du codex. It gives the following summary of the contents of Codex Bodmer 115 ahead of the full description (figure 2).11
The Codex Bodmer 115 contains four discrete production units: the first three (A-C) are all the work of the same scribe, Camillo Zanetti, and their time of production can be situated at different points in the second half of the 16th century with reasonable certitude. The last production unit (D) is dated to 1761. From a "horizontal" perspective we see that there are also four different pieces of content, organized by author and by work. These are a paraphrase of the emperor Maurice's Strategicon, the De velitatione bellica attributed to the emperor Nicephorus Phocas, excerpts from book 7 of the Cesti by Julius Africanus, and the second part of the Apparatus bellicus (it should be noted that the excerpts of book 7 of the Cesti form the first part of this work in the manuscript tradition).12
The process then unfolds in the following way: the calling software (in this case, Pinakes) translates the description information into an XML document.
Through an API it then communicates with the StruViMan web service and sends XML data corresponding to the manuscript that the user wishes to have displayed. This may be a single long string containing all the required XML data. As StruViMan opens, the manuscript visualization appears in a new browser window; here we see the visualization for Codex Bodmer 115 (figure 3). Beginning in the upper left-hand corner of the graphic is a bar called the workspace panel, which contains a clickable drop-down menu (manuscripts) with the manuscripts that have been sent to StruViMan by the calling software. In the example above this is a single manuscript. Next to it are the help menu and the language toggle button, which allows users to select an English, French, or German interface. These form the basic features of what is called the easy workspace panel, the default setting when the tool is opened. A greater range of display options for the more advanced user can be activated with a click on the green slider button on the right-hand side; we will return to that later.
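To make the first step of this process more concrete, the following Python sketch shows how a calling program might serialize a description into a single XML string. It is a minimal illustration only: the element and attribute names (manuscript, productionUnit, content, and so on) are hypothetical and are not the actual StruViMan format, the sample data are invented, and the HTTP call itself is omitted because the endpoint and its parameters are not specified here.

```python
import xml.etree.ElementTree as ET


def manuscript_to_xml(shelfmark, units):
    """Serialize a manuscript description to one XML string.

    `units` is a list of (label, date, contents) tuples, where `contents`
    is a list of (title, first_folio, last_folio) tuples.
    All element and attribute names are hypothetical.
    """
    root = ET.Element("manuscript", shelfmark=shelfmark)
    for label, date, contents in units:
        unit_el = ET.SubElement(root, "productionUnit", label=label, date=date)
        for title, first, last in contents:
            item_el = ET.SubElement(
                unit_el, "content", first=str(first), last=str(last)
            )
            item_el.text = title
    # encoding="unicode" returns a str rather than bytes
    return ET.tostring(root, encoding="unicode")


# Hypothetical two-layer manuscript
xml_string = manuscript_to_xml(
    "Example 1",
    [
        ("A", "16th c.", [("Treatise 1", 1, 40)]),
        ("B", "1761", [("Treatise 2", 41, 60)]),
    ],
)
```

The resulting xml_string is one long string without line breaks, which a calling program could then pass to the web service as a parameter.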

The manuscript panel
Below this is the manuscript panel, in which we see the manuscript's identifying information; in this case the repository, shelfmark, title assigned to it in Pinakes, and its Diktyon number (a unique numerical identifier for Greek manuscripts).14 It also contains another row of buttons, the first being the tag best readability. In easy mode the default (and only) visualization mode is this so-called best readability one, which means that if the manuscript features content of widely varying lengths, the longer pieces of content are scaled down while the size of any smaller piece of content is enhanced (more on this below).15 The blue button next to it allows the user to create a screenshot of the visualization which can be saved to their computer; the next button with the pencil icon lets the user choose whether or not the various labels appear in the visualization (and the screenshot). At the end of the manuscript panel is also the mode toggle alluded to above; when the user changes the mode in the manuscript panel, the changes apply only to that particular manuscript.

The graphic representation (StruViGraph)
Next, taking up the largest part of the screen, we find the book-shaped visualization with its two main parts. The structure bar in the book's "spine" is divided into four differently-colored sections, which represent the four production units contained in Codex Bodmer 115 and are labeled with their corresponding number and date. The production units each have a different color, since every unit is the result of a separate act of production. The content bar, located in the "pages" of the stylized "book," gives an overview of the pieces of content in the manuscript and how much space each occupies.

The information area
At the lower end of the graphic there is a table, the information area, with further information about the manuscript, including any known owners and scribes, comments (if any), and a link to the Pinakes entry.
What appears in the information area varies according to where the user clicks on the visualization. If a production unit is selected, the selection is highlighted in the visualization. The information in the table below changes to give a more detailed overview of the unit in question, including the folios it covers, the date, the scribe (if known), and any comments. Similarly, if one clicks on a piece of content, that piece of content is highlighted in the visualization and the table shows information about the work, including its relation to that production unit and any comments.

Example 2: Codex Vaticanus graecus 364
The Codex Bodmer 115 is a relatively straightforward manuscript in terms of its structure and content. Now we consider a manuscript in StruViMan with a greater number of production units and content of more varying sizes. The manuscript Vaticanus graecus 364, held in the Biblioteca Apostolica Vaticana, is a 10th-century Gospel book which saw the addition of a set of liturgical tables in the 11th century and the restoration of a number of leaves at the end of the manuscript in the 14th century to compensate for the loss of the final portion of the liturgical tables. In terms of content it has not only the four Gospels but also a modest selection of standard biblical paratexts comprising synoptic concordance tables at the beginning of the manuscript, along with a letter containing instructions on how to use them (both attributed to the church historian Eusebius of Caesarea), lists of chapter titles, some rather splendid miniatures (including evangelist portraits), and the aforementioned liturgical tables, which tell the reader what portions of the Gospel are read at different times during the liturgical year. In order to give a summary of the contents, we have opted for a highly abbreviated type of description called a skeleton box, which gives an overview of the manuscript's contents arranged into production units without any pretense to being exhaustive (figure 4).16
Most Gospel paratexts fall into two categories, either pertaining to all four Gospels as a collection or relating to one single Gospel in particular. As the brief description shows, some of the paratexts in this manuscript belong to the former category. The Eusebian material at the beginning (which shows concordances) and the liturgical tables at the end (which show lectionary readings) apply to the four Gospels as one textual unit. Around each Gospel, however, we find that the paratexts have been arranged into small, similarly-patterned clusters. These are the patterns which, in addition to delineating a manuscript's production units, we sought to emphasize most when developing the tool, and this is why it was important to have a visualization mode that could display small but relevant textual elements.
The visual advantages of this mode are especially apparent in the following visualization (figure 5), where smaller paratexts are enhanced, ensuring that the brightly colored slices of the images and chapters are not dwarfed by the larger, gray-toned Gospels around them. It acts as a visual supplement to the more traditional and lengthy description (or electronic manuscript), so that not only are the different layers and elements individually (and at a glance) revealed, but we also see graphically laid out their proximity and relative length and how they interact with each other. In the visualization the later interventions in the codex become clearly discernible: we see the liturgical tables distinct from the rest of the original production unit, while the restored folios at the end of the manuscript are thrown into relief by the change in color in the structure bar.

Brief survey of the features and functions in "advanced" mode
One of the particular challenges we faced in establishing a workable visualization was dealing with pieces of content of such varying sizes. In this manuscript the Gospel of Luke covers a little over 70 folios, while the chapter list and the evangelist portrait that precede it take up a mere three folios combined. This is where the advanced mode, alluded to earlier on, offers a greater degree of customizability in the display for those users wishing to highlight or minimize particular aspects of a manuscript in a visualization. Once again, the user can elect to do this in the workspace panel, which will apply all changes to all open manuscript visualizations, or they can make alterations to individual manuscript visualizations by changing the settings in the manuscript panel for each manuscript display window.
As we see in figure 5, a new button has appeared next to the help menu in the workspace panel: the settings button allows the user to fine-tune the customizations available to individual manuscripts in the easy workspace and then to apply these changes at once to all open manuscripts. This is followed by various buttons for tiling, showing, hiding or deleting multiple manuscripts if they are present on the workspace or in the manuscripts menu (more on this below).
The manuscript panel also offers new features: the visualization default mode remains "best readability," but the visualization mode tag has now become a drop-down menu where the user can select two alternative display modes in addition to the default: proportional, which displays content by the number of folios it occupies, and all content equal, which of course causes some loss to the sense of spatial distribution of different pieces of content, but which has the advantage of showing the sequence of pieces of content very clearly. Next to the screenshot button, the labels on/off button offers further options here as well: the user can opt to display only certain labels (for example, for only a single type of content) and hide all others.
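The difference between the three display modes can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a description of StruViMan's code: the function name segment_widths is our own, the square-root damping used here for "best readability" is an assumed stand-in for whatever scaling the tool actually applies, and the split of folios between chapter list and portrait is illustrative.

```python
import math


def segment_widths(folio_counts, mode="best readability"):
    """Return the share of the content bar given to each piece of content.

    The shares always sum to 1. The square-root damping used for
    "best readability" is an assumption made for this sketch.
    """
    if mode == "proportional":
        # width follows the number of folios exactly
        weights = [float(n) for n in folio_counts]
    elif mode == "all content equal":
        # every piece of content gets the same width
        weights = [1.0 for _ in folio_counts]
    elif mode == "best readability":
        # long pieces are scaled down, short ones enhanced
        weights = [math.sqrt(n) for n in folio_counts]
    else:
        raise ValueError(f"unknown mode: {mode}")
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative figures: a chapter list (2 folios), an evangelist
# portrait (1 folio), and a Gospel (72 folios).
counts = [2, 1, 72]
for mode in ("proportional", "best readability", "all content equal"):
    print(mode, [round(w, 3) for w in segment_widths(counts, mode)])
```

In the proportional mode the one-folio portrait receives barely more than one percent of the bar; any damping of this kind gives it a visibly larger slice while still leaving the Gospel as the dominant element.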
In the colors and visibility menu the user can choose, for example, to omit all production units but one, while also adjusting the color scheme to personal preference. This makes it easy to visualize, for instance, a manuscript as it was before it was restored or supplemented with extra material.
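Conceptually, hiding production units amounts to filtering the content by the layer to which it belongs. The short Python sketch below illustrates the idea with hypothetical data; the function name visible_items and the unit labels are assumptions made for the example.

```python
def visible_items(items, shown_units):
    """Keep only the pieces of content whose production unit is switched on.

    `items` is a list of (title, unit_label) pairs in reading order.
    """
    return [(title, unit) for title, unit in items if unit in shown_units]


# Hypothetical manuscript: unit A is the original layer, B a later
# addition, and C a restoration.
items = [
    ("Gospel text", "A"),
    ("Liturgical tables", "B"),
    ("Restored leaves", "C"),
]
original_state = visible_items(items, {"A"})
```

Here original_state contains only the content of the first layer, which corresponds to viewing the codex before the later interventions.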
It is also possible in the advanced workspace to change the size of the visualization window, so that in the case of manuscripts with a large amount of content one can see the entire visualization on the workspace.
If we return briefly to the workspace panel and press the tile button, we see that Codex Bodmer 115 and Vaticanus graecus 364 now appear next to each other on the workspace, allowing for a direct comparison of their relative manuscript structures (figure 6).
Up to fifteen additional manuscripts can be added to the manuscripts menu in the workspace panel, which can then be opened individually or all at once. Opening and tiling a great number of manuscripts at the same time exceeds the capacity and width of most computer monitors, but on a regular-sized screen one can easily open up to four manuscript visualizations for side-by-side comparisons. The session in the StruViMan web tool continues as long as the browser window remains open. If a browser tab with a particular manuscript visualization on the workspace is closed, StruViMan can be reopened in a new browser tab and the manuscript will be preserved in the manuscripts menu in the new tab. The same occurs if multiple manuscripts are opened on a single workspace: if an individual manuscript is clicked away, the visualization disappears but the manuscript remains stored in the manuscripts menu in the workspace panel. Closing the browser window effectively ends the session, along with its data and settings.

Programming and technical aspects
The programming and technical development of StruViMan were undertaken by Caroline Strolz of the IT-Gruppe Geisteswissenschaften17 at the Ludwig Maximilian University in Munich in collaboration with Frank Percival. StruViMan is a web application built using JavaScript and other common web technologies such as HTML5 and CSS3. Our programmers built the tool entirely using a JavaScript web framework called Vue. StruViMan is open source and will be made available to the public in 2019, through a web service installed on a server of the Ludwig Maximilian University in Munich.18 Thus, StruViMan does have a server-side component. This is necessary because it is not hosted on the same server as Pinakes and it is therefore easier for the StruViMan server to collect and parse the Pinakes XML data before forwarding it to the browser. As such, it can also be easily extended to interact with other systems or servers to provide similar visualization functionality. There are also plans to create a standalone download that can then be embedded into other platforms.
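The role of such a server-side component, namely receiving XML and parsing it into a form a browser application can consume, might be sketched as follows. The sketch is written in Python for illustration only and makes no claim about how the StruViMan server is actually implemented; the XML element and attribute names are hypothetical, as is the sample document.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical XML as it might arrive from a calling database
SAMPLE_XML = (
    '<manuscript shelfmark="Example 1">'
    '<productionUnit label="A" date="16th c.">'
    '<content first="1" last="40">Treatise 1</content>'
    '</productionUnit>'
    '</manuscript>'
)


def parse_manuscript(xml_string):
    """Turn the (hypothetical) XML into a plain dict that can be sent as JSON."""
    root = ET.fromstring(xml_string)
    return {
        "shelfmark": root.get("shelfmark"),
        "units": [
            {
                "label": unit.get("label"),
                "date": unit.get("date"),
                "contents": [
                    {
                        "title": item.text,
                        "first": int(item.get("first")),
                        "last": int(item.get("last")),
                    }
                    for item in unit.findall("content")
                ],
            }
            for unit in root.findall("productionUnit")
        ],
    }


# JSON string that a server could forward to the browser application
payload = json.dumps(parse_manuscript(SAMPLE_XML))
```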

General applicability and wider use
StruViMan was originally designed as an extension of the ParaTexBib project as a tool for visualizing the structure of the medieval Byzantine manuscripts that form the core of our project. The examples presented in this paper are drawn from descriptions in Pinakes, as this database hosts our project data. We would like to stress, however, that the descriptive parameters used in the visualizations presented in this paper are not binding. Instead of using the production unit as the main structuring element, one might elect to use quire structure, scribal hands, watermarks, or any number of different elements to distinguish between different "layers" in a manuscript.
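The point that the structuring element is interchangeable can be illustrated with a short Python sketch. It groups consecutive pieces of content by whichever attribute is chosen as the "layer"; the function name structure_bar, the attribute names, and the sample data are all hypothetical.

```python
from itertools import groupby


def structure_bar(items, key):
    """Collapse consecutive pieces of content that share the same value of
    `key` into segments of the structure bar: (value, number_of_items)."""
    return [
        (value, len(list(group)))
        for value, group in groupby(items, key=lambda item: item[key])
    ]


# Hypothetical pieces of content in reading order
items = [
    {"title": "Text 1", "unit": "A", "scribe": "Hand 1"},
    {"title": "Text 2", "unit": "B", "scribe": "Hand 1"},
    {"title": "Text 3", "unit": "C", "scribe": "Hand 2"},
]
by_unit = structure_bar(items, "unit")
by_scribe = structure_bar(items, "scribe")
```

Grouping by production unit yields three segments, whereas grouping by scribal hand yields only two, since the first two units share a scribe. The same content can thus be given a different structure bar depending on the descriptive parameter chosen.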
As such, StruViMan can generate a manuscript visualization for any software or database able to provide the appropriate XML data. The XML data format should allow the tool to be adaptable to manuscripts of all kinds and render it highly customizable. Given that the StruViMan tool is not a website with public screens or windows, but rather a web service with a web application available for interacting with any compatible program on the web, it is easily accessible. The parameters (in the technical sense) for calling the API will be described in greater detail on the StruViMan demonstration site.19

Conclusion
As it exists currently the tool represents a first version, which we hope in the future to enrich with further features and functions. Among these, some of the more promising avenues of development include integrating into the manuscript visualization a parallel display for pieces of content appearing next to each other on a single folio (one might think, for example, of a text with commentary written around it, as in the case of biblical catena manuscripts). Another feature under development for the second version is a reconstruction mode for the creation within the tool of an "empty" codex allowing for the "reconstitution" of codices whose parts are currently held in multiple repositories; it will be possible to save the reconstructed manuscripts. We believe that these additional features will allow scholars to visually define on a very granular level what content elements belong to a certain layer in a manuscript's history and where discontinuities, changes, and additions occur. While StruViMan was developed in the service of a project focusing on biblical paratexts, it is our hope that it will prove itself a useful tool for scholars working with manuscripts of every kind.