June 21, 2023

Interactive Dynamic Presentation (IDP) and Semantic Faceted Search and Browsing (SFB) of the Wittgenstein Nachlass

In 2000 the Wittgenstein Archives at the University of Bergen (WAB) published the CD-ROM edition of Wittgenstein’s Nachlass: The Bergen Electronic Edition (BEE). Since then, WAB has worked towards complementing the static CD-ROM edition with an interactive web platform that allows more user-specific and user-tailored uses of WAB’s Nachlass resources. The paper describes two specific web service tools of this platform: Interactive Dynamic Presentation (IDP) of the Wittgenstein Nachlass and Semantic Faceted Search and Browsing (SFB) of Wittgenstein domain metadata. The paper argues that it is only when these two tools are fully implemented and functional that WAB can adequately serve the scholarly needs of the Wittgenstein Nachlass user community. The paper discusses some selected features and functionalities of the two tools in detail.[1]

1 Introduction

The practice of bringing „the Wittgenstein papers“ or „Wittgenstein’s Nachlass“ (von Wright 1969) to digital users reached its first milestone in 1998 with Vol. 1 of the Bergen CD-ROM edition Wittgenstein’s Nachlass: The Bergen Electronic Edition, edited by the Wittgenstein Archives at the University of Bergen (WAB) under the direction of Claus Huitfeldt. The complete edition (BEE; Wittgenstein 2000) became notable for creating unprecedented access and research possibilities (cf. Meschini 2020, ch. 4).

Since its establishment in 1990, WAB has worked towards providing digital data and metadata for using the Wittgenstein Nachlass in research and education (cf. Huitfeldt 2006). This includes the creation of machine-readable transcriptions with specialized markup. The transcriptions were originally produced in MECS-WIT format (cf. Huitfeldt 1994), but since the early years of this century they have been maintained in XML TEI format (cf. Pichler 2010). Transcription samples of 5000 Nachlass pages were made available in HTML format under a CC BY-NC 4.0 license on WAB’s website within COST Action A32 (2006 – 10) and the Discovery project (2006 – 09). Most importantly, since 2015 transcriptions of the entire Nachlass, along with high-quality Nachlass facsimiles, have been available open access in HTML format (Wittgenstein 2015–, cf. Pichler 2019). In addition, WAB has for more than a decade now been working on the implementation of semantic web methods and technology. Since 2013, WAB has offered free download of a continuously growing computational ontology for the Wittgenstein domain from its website.[2] In order to provide a common and persistent system of reference for its Nachlass resources, WAB has, within the framework of the Discovery project, assigned unique identifiers to the following: (i) each of the (about 150) Nachlass manuscript and typescript items, (ii) each of the (about 20,000) Nachlass pages, and (iii) each of the (about 55,000) Nachlass „Bemerkungen“ (remarks).[3] A Wittgensteinian Bemerkung is typically no longer than half a page and separated from other Bemerkungen by one or more blank lines.

With the aforementioned reference system in place and the open access availability of content, metadata and ontology, it may seem that WAB has achieved its goal of sufficiently equipping the user community. But this is not the case: A static scholarly edition of Wittgenstein’s Nachlass, even if it is regularly updated for content, style and technical formats, will, by its very nature as a static edition, always be inadequate in meeting ever-evolving, dynamic user needs. Any static edition is necessarily the result of selection and decision processes. While a static scholarly edition of Wittgenstein’s Nachlass certainly remains indispensable as a source of stable, authoritative and easily citable text, much more than such an edition is required for the community to adequately use the resources in research and learning. The user community of the Wittgenstein Nachlass will always have needs and expectations that are not met even after all requirements of a static scholarly edition are fulfilled. These needs can in the end only be satisfied by complementing the static edition with a platform that offers (i) access to datasets and aspects of the source not available through the edition and (ii) access and use that are driven by and tailored to specific user needs.

The paper argues that it is only when two specific tools, namely Interactive Dynamic Presentation (IDP) of the Wittgenstein Nachlass (Wittgenstein 2016–) and Semantic Faceted Search and Browsing (SFB)[4] of Wittgenstein domain metadata, are fully implemented and functional that we begin to adequately address the needs and expectations of the users of Wittgenstein’s Nachlass.[5] Section 2 starts with a short description of the BEE. It then focuses on vital needs of the Wittgenstein Nachlass community that are not yet fulfilled by the BEE and in principle cannot be fulfilled by any static scholarly edition, whether on paper or digital. The two tools are intended to respond precisely to the ever-evolving and potentially unlimited user needs of the Wittgenstein Nachlass community. Sections 3 and 4 present the rationale and the advantages of the current WAB implementations of the IDP and the SFB tool, respectively. Section 5 endorses the fact that WAB makes its transcriptions of the Wittgenstein Nachlass also available in their XML TEI source format.

2 Open-ended User Needs

The BEE brings together three sub-editions – a facsimile, a normalized transcription and a diplomatic transcription edition – and can be called a „combined edition“ (Pichler & Haugen 2005). The diplomatic and the normalized transcription represent different modes of intervention on WAB’s source transcriptions of the originals. The diplomatic version, for example, retains deleted words and characters, marks insertions as insertions, and does not intervene in spelling and grammar, whereas the normalized version aims to provide a standardized, easy to read and easy to cite, stable authoritative text. To put it more theoretically, one could say that the diplomatic version primarily attends to the document or even to the document-carrier, while the normalized version is strongly text-focused. Thus, one could call the two formats two limiting cases of scholarly editing. The diplomatic version is an extremely helpful aid if one wants to start one’s Nachlass research by reading the facsimile but from time to time needs help with deciphering. For diplomatic and normalized transcription samples see Figures 1 – 4.[6]

The triple structure of facsimile, diplomatic and normalized version enabled the BEE to respond in one and the same publication to a spectrum of research needs, rather than simply fulfilling, for example, the request for only an easy to read final version of a text. The BEE demonstrated the significant advantages that digital editions have over print editions in that the former, for example, allow more user-flexible access to the edited material. But at the same time, the edition was still not dynamic enough for adequately responding to the full spectrum of research needs and interests that Nachlass users have and, furthermore, can legitimately expect digital editing to provide. The BEE was not dynamic enough by its very nature of being a static scholarly edition, the purpose of which is to provide a stable and citable authoritative text. No static edition alone will ever be dynamic enough to meet the challenge of accommodating the diverse and evolving needs of the research community. Therefore, while a static scholarly edition will always be required, it must at the same time be complemented by a dynamic research platform that not only offers additional resources, but also additional and interactively available and toggleable analysis tools and additional presentation and filtering options.[7] Let me illustrate the claim with a few examples.

Need for the possibility of chronological sorting: During his military service in WW1, Wittgenstein kept diaries – MSS 101 – 103 (1914 – 17) – where he not only wrote his philosophical reflections that eventually resulted in his first philosophical book, the Tractatus logico-philosophicus (1921/22), but also noted down deeply personal and private remarks. As a rule, he used the verso pages and a code for the personal and private entries, and the recto pages (and no code) for the logical and philosophical remarks (see Figure 1).[8] When editing the material for readers primarily interested in Wittgenstein’s philosophy, the two Wittgenstein trustees G.E.M. Anscombe and G.H. von Wright selected from the notebooks only what they considered the philosophically relevant portions and turned these into a normalized book edition, Tagebücher / Notebooks, in 1960. Many years later, in an unauthorized edition called Geheime Tagebücher (1985), Wilhelm Baum published the coded personal and private remarks. To date there is no German or English book edition of the Notebooks that contains both portions in one and the same book. While the BEE does contain both, it contains them as separate blocks. For each of the three notebooks, the BEE first presents the sequence of the personal and private remarks and then presents the sequence of the philosophical remarks. Naturally, it makes sense to separate the two types of remarks since they each belong to their own specific discourse. However, as a consequence of such editing practices, the Wittgenstein community has learned to receive the remarks as two entirely separate strands and discourses. Thus, while the division of the content of the notebooks into the philosophical and the personal may seem appropriate and satisfactory from one scholarly perspective, upon reflection, the practice clearly reveals disadvantages from another.
Namely, it makes equal sense to put in context all remarks that Wittgenstein wrote on a specific day. The separation not only splits the text sequence, but also splits the chronological sequence of the remarks into two and thus makes it cumbersome to connect Wittgenstein’s personal and private remarks with his simultaneous reflections on philosophy and logic (and vice versa) for a better understanding of Wittgenstein’s work (see Figure 2).

Need for the possibility of including / suppressing revision layers: After Wittgenstein’s return to Cambridge in 1929, an event that is often regarded as simultaneous with his return to philosophy, Wittgenstein wanted to publish a second philosophical book. He had different ideas of the book’s contents and form at different times. But the so-called Big Typescript, TS 213 (1933), is widely regarded as a definite and substantial stepping stone in this project, if not as an actual candidate for the envisaged book. In this typescript, Wittgenstein collected between two and three thousand remarks selected from the manuscript volumes he had written since 1929 and organized them into chapters and subchapters. He introduced each chapter with a philosophical topic heading (e. g., „Meaning“)[9] and each subchapter, typically, with a philosophical statement (e. g., „The concept of meaning originates in a primitive conception of language“).[10] However, Wittgenstein soon felt uncomfortable with the arrangement in the Big Typescript and started to not only reorganize the ordering but also to revise the text itself. Moreover, this reworking took place not only in the typescript itself but also in a number of new notebooks and manuscript volumes such that the resulting new text was spread over several items. When producing a book edition from this project, Philosophical Grammar (PG 1969), the third Wittgenstein trustee Rush Rhees tried to come as close as possible to Wittgenstein’s final intended revision of the text for the Big Typescript. However, his edition was criticized for blurring the distinction between the actual and the virtual text of the corpus, and it was remarked that Rhees should have edited the Big Typescript „as it stood“, i. e. without Wittgenstein’s later handwritten revisions (cf. Kenny 1976: 46; as a response, see Rhees 1996). 
The BEE wisely returned – as a documentary edition – to the actual documents and offered diplomatic and normalized transcriptions thereof. It included hyperlinks where Rhees actually carried out Wittgenstein’s instructions for arranging the text in a different order or replacing part of it altogether. Although this was a required step, at the same time it also deprived readers of an easy way to follow and cite the text in the sequence that resulted from Wittgenstein’s revision. This text was often only virtually given in the Nachlass but had earlier been offered by Rhees’ edition. It is clear that one should be able to have it both ways, and that it should precisely be the digital edition that gives it to the reader in both ways: the text before and after the revision – actually, any text before, with, and after all revisions. This is especially relevant for work with heavily revised typescripts, such as TS 213 or TS 226 (see Figure 3). However, this is not something that the BEE could achieve, and currently it is still not fully achieved by WAB’s Nachlass editions on the web.

Need for the possibility of filtering according to different parameters: From his earliest to his latest writings, whenever Wittgenstein revisited his remarks with an eye for further editing and processing, he marked them with symbols in the margins of the page, for example, a slash, an asterisk, a circle, a letter like a capital S, or a letter like lower case x. At WAB these symbols are called „section marks“. Similarly, when considering collecting his remarks in thematic clusters, Wittgenstein would add numbers or combinations of numbers and letters in the margins such that through these symbols the remarks were assembled in groups.[11] The meaning of the single symbols, especially when it comes to the section marks, is to date only partially known. Whenever they have been included in print or digital editions, the editors simply tried to reproduce them in their graphical appearance. This was also the case with (the diplomatic transcription of) the BEE. Against this practice one could object that the entire point of these symbols is to signalize that Wittgenstein wanted to do things with the remarks thus marked: dictate them, omit them, discard them, revise them, rearrange them, group them, etc. Accordingly, it is to these action intentions that the reader should be directed by the edition, rather than simply receiving only a visual representation of the symbol. For example, about the remarks marked with a slash in MSS 105 – 108 (1929 – 30) we know that most of them were dictated to a typist (cf. Pichler 1994), resulting in TS 208. About the number and letter combinations with which Wittgenstein marked several thousand remarks cut out from his typescripts of 1930 – 31, we know that they constituted the reference system according to which the „Zettel“ collection of TS 212 (1932 – 33), which subsequently became TS 213, was to be thematically organized (cf. Rothhaupt 2016). 
However, there is a great deal of such editorial symbols left in the Nachlass, and we do not have sufficient knowledge about their functions. Some of them will contain an instruction for how to proceed with, or from them; others will serve to express an evaluation of the remark tagged with the specific symbol. Now, users who want to study the meaning of these symbols further or even to convert their meanings to resulting text selections, groupings and arrangements have a non-negotiable requirement. The requirement is that one has an edition that not only renders the remarks in their original sequence with these symbols included, but also permits filtering and arranging the texts according to these symbols while retaining the possibility of including or omitting the symbols themselves in the resulting output. It should thus, for example, be possible to extract all remarks and only the remarks which in the Nachlass are marked by Wittgenstein with a slash, or an asterisk, or a backslash, etc., or a specific combination of them. It is only then that these users’ needs will be adequately addressed and the scope of the scholarly utilization of WAB’s Nachlass resources can be greatly widened. So, for example, users may become equipped to study specific genetic processes in the Nachlass or recognize thematic groups of which the symbols are often the „indices“. Or they may become enabled to perform more basic tasks such as learning about the function and meaning of the symbol itself. Rothhaupt (2013) has argued that the remarks which Wittgenstein marked with a circle / circle-like symbol (a „Kringel“) contain Wittgenstein’s attempts at a philosophy of culture. 
Again, it is only if the user has access to a filtering tool such as the one described here, permitting easy extraction of all Bemerkungen and only the Bemerkungen which are marked by Wittgenstein with a „Kringel“-symbol, that they are in the position to efficiently and reliably investigate this hypothesis or to discover other elements in these remarks that led Wittgenstein to mark them all with the „Kringel“-symbol. While WAB’s website (Wittgenstein 2016–) today already offers filtering of Nachlass documents according to section mark parameters, this feature still stands in need of improvement and does not yet fully meet all user requirements.
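The filtering requirement can be illustrated with a minimal sketch. It is not WAB's implementation: the record structure, the identifiers, the mark names ("slash", "asterisk", "kringel") and the placeholder texts are all assumptions made for illustration. The sketch extracts all and only the remarks that carry a given combination of section marks, and lets the user decide whether the marks themselves appear in the output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Remark:
    ref: str                          # hypothetical identifier, not a WAB siglum
    text: str                         # placeholder text, not a Nachlass quotation
    marks: frozenset = frozenset()    # hypothetical names for section marks


def filter_by_marks(remarks, required, include_marks=True):
    """Return all and only the remarks that carry every mark in `required`."""
    required = frozenset(required)
    output = []
    for remark in remarks:
        if required <= remark.marks:
            prefix = ""
            if include_marks and remark.marks:
                # Show the marks in a fixed (alphabetical) order.
                prefix = "[" + " ".join(sorted(remark.marks)) + "] "
            output.append(prefix + remark.text)
    return output


REMARKS = [
    Remark("r1", "first remark", frozenset({"slash"})),
    Remark("r2", "second remark", frozenset({"kringel", "slash"})),
    Remark("r3", "third remark"),
]

print(filter_by_marks(REMARKS, {"slash"}))
print(filter_by_marks(REMARKS, {"kringel"}, include_marks=False))
```

Passing a set of several marks selects a specific combination; the `include_marks` switch corresponds to retaining or omitting the symbols in the resulting output.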

Need for the possibility of conducting metadata search / combined text and metadata search: The BEE already offered some semantic search functionalities – e. g., search for references to persons, taxonomies for mathematical and logical notation as well as for graphics, possibility to focus on the coded passages or other groupings only. At the same time, many more valuable metadata had been recorded in the transcriptions or via stand-off markup that users could greatly benefit from if only they had processing access to them. Some users would, for example, not only want to search and browse the Nachlass by references to persons or works (for a sample, see Figure 4), but specifically all references to persons or works that have come about or, alternatively, got discarded by later revision (e. g., the revision of the Big Typescript). Or a reader, who has an interest in influences on Wittgenstein but is most acquainted with the book publications from the Nachlass only, may want to search for Wittgenstein’s references to persons and works exclusively in all the remarks which hitherto were not included in any of the book publications from the Nachlass. A reader most interested in Wittgenstein’s writing in code may in turn want to search for any passage in code that hitherto was not published in print. Or one may want to check whether there is a correlation between the remarks marked with a slash and eventual publication of the remarks by the trustees in one of the book publications. One may want to do so either to better understand Wittgenstein’s use of the section marks or to find out to what extent the Nachlass editors let themselves be guided by the section marks for their selection of materials to be published. Moreover, one may want to find out whether there is a correlation between the sequence of the remarks in a specific work by Wittgenstein, e. g., the Philosophical Investigations, and their chronological origin. 
Other user needs relate to a remark’s genetic path(s), its place of origin in the Nachlass corpus, references to places, events and other named entities, similarity to other remarks (cf. Ullrich 2019; Huitfeldt & Sperberg-McQueen 2020), adherence to text type and genre (philosophical remark, preface, motto, dedication, instruction, aphorism, diary entry, autobiographical remark, personal and private remark, coded remark, mathematical-logical notation, graphic, etc.), adherence to Nachlass group (notebook, loose sheet, „Zettel“, ledger, typescript, dictation, etc.), work status (first draft, elaborated version, final work, etc.), script type (shorthand, code, etc.), the language the remark is written in (German, English), writing and revision instrument (different kinds of pencil, black ink, red ink, blue ink, etc.), research literature referring to it, and so on. Finally, one may frequently also need to conduct searches with both the text and the metadata as one’s research base. To be able to combine text and metadata search becomes, for example, pertinent where one needs to find all and only those documents that contain both a specific word used by Wittgenstein and a specific reference to a person or work. Or one may remember only one or two words from a passage in the Philosophical Investigations and try to find the passage with the help of these words, restricting one’s search to precisely the Philosophical Investigations corpus only. These are all relevant and legitimate needs that can turn out to be pressing in either research or learning. The list in fact seems endless. To meet these and similar needs, efficient and selective access to metadata, and iteratively faceted processing of metadata becomes essential. Unfortunately, neither the BEE nor WAB’s Wittgenstein web services currently fully meet this challenge.
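A combined text and metadata search of the kind just described might be sketched as follows. The sketch is only illustrative and says nothing about how SFB is actually built: the record layout, the facet names (`persons`, `published`) and the sample records are invented. Because `search` returns records rather than identifiers, its result can be searched again or counted by facet, which is the iterative narrowing that faceted browsing relies on.

```python
from collections import Counter

MULTI = (list, tuple, set, frozenset)


def matches(record, words, facets):
    """True if the record contains every word and satisfies every facet."""
    tokens = record["text"].lower().split()
    if any(word.lower() not in tokens for word in words):
        return False
    for key, wanted in facets.items():
        value = record.get(key)
        if isinstance(value, MULTI):
            if wanted not in value:      # multi-valued facet, e.g. persons
                return False
        elif value != wanted:            # single-valued facet, e.g. published
            return False
    return True


def search(records, words=(), **facets):
    """Combined text and metadata search; returns the matching records."""
    return [r for r in records if matches(r, words, facets)]


def facet_counts(records, key):
    """Count the values of one facet over a result set."""
    counts = Counter()
    for record in records:
        value = record.get(key)
        if isinstance(value, MULTI):
            counts.update(value)
        elif value is not None:
            counts[value] += 1
    return counts


# Invented sample records; texts and names are placeholders.
RECORDS = [
    {"id": "r1", "text": "alpha beta", "persons": ["Person A"], "published": False},
    {"id": "r2", "text": "alpha gamma", "persons": ["Person A", "Person B"], "published": True},
    {"id": "r3", "text": "beta gamma", "persons": [], "published": False},
]

hits = search(RECORDS, words=["alpha"], persons="Person A")
unpublished = search(hits, published=False)   # narrow the previous result
print([r["id"] for r in hits], [r["id"] for r in unpublished])
print(facet_counts(hits, "persons"))
```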

3 Interactive Dynamic Presentation (IDP)

In the preceding section I have given some examples of the great variety of needs encountered in the user community. If one tries to respond to these needs by providing the requested functionalities and services in one and the same static edition, it is likely to fail either technically or in terms of usability (or both). Rather, what one needs is toggleable services that in dynamic and interactive ways produce highly adjustable outputs serving, hic et nunc, the wide spectrum of resources needed: from heavy apparatus to readers’ editions, from original to standardized orthography and grammar text, from physically to chronologically arranged document sequences, etc., etc.[12] Even users who in the beginning are satisfied with having at their disposal the standard presentation formats of the diplomatic (recording all deletions, insertions, overwritings, etc.) and the normalized transcription (giving the resulting text in standardized form) are likely to discover that these represent editorial intervention, or a lack thereof, which does not satisfy all the needs of research one can rightly expect scholarly digital resources to fulfill. Some may find out, for example, that a minor thing such as representation of the original’s line breaks can make a big difference for purposes of their research. Edited text outputs that follow the original line order have indeed many advantages, one of them being that they easily permit comparison of edited text version and facsimile. In other contexts, however, indication of the original’s line-breaks is completely inessential to one’s research interests or even distracting. Interactive Dynamic Presentation (IDP) relieves the editors from having to statically freeze a representation that comes with or without the line breaks; this decision can simply be left to the user who can have it both ways depending on what they judge to be best in a particular context.
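The line-break case lends itself to a small illustration. The following sketch is not WAB's implementation; it assumes a drastically simplified transcription in which original line breaks are recorded with the TEI element `lb`, leaves out XML namespaces, and uses an invented sample sentence. It shows how one and the same source can yield an output with or without the original line breaks, depending on a choice made by the user.

```python
import xml.etree.ElementTree as ET

# Invented sample; real TEI documents are namespaced and far richer.
SAMPLE = (
    "<p>first line of the remark<lb/>"
    "second line of the remark<lb/>third line</p>"
)


def render(xml_text, keep_line_breaks):
    """Render a paragraph with or without the original's line breaks."""
    root = ET.fromstring(xml_text)
    parts = [root.text or ""]
    for child in root:
        if child.tag == "lb":
            # The user's choice decides how a recorded line break is shown.
            parts.append("\n" if keep_line_breaks else " ")
        else:
            parts.append("".join(child.itertext()))
        parts.append(child.tail or "")
    return "".join(parts)


print(render(SAMPLE, keep_line_breaks=True))
print(render(SAMPLE, keep_line_breaks=False))
```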

One might want to entertain the belief that user needs that fall outside of what is already covered by the diplomatic and the normalized formats can be dealt with by simply increasing the number of formats pre-produced and offered. In this spirit one could suggest adding to the diplomatic and normalized versions, if they come without representation of original line breaks, diplomatic and normalized versions with representation of original line breaks. One might suggest adding the linear version that occupies a middle ground between the diplomatic and normalized version, or also a „typescript only“ version that omits handwritten revisions in the typescript (for an example of the latter see Figure 3). A linear version would, for example, include deleted portions of entire words or even entire sentences – in distinction to the diplomatic version where all deleted parts are included, and in distinction to a strongly normalized version where no deleted parts are included. And it could mark every editorial intervention into orthography and grammar – in distinction to a normalized version where only normalization and interventions on word and punctuation level may be marked, or where no such interventions are marked as such. If we continue along this line, we will sooner or later encounter the question: In the end, even if we were to meet only the most common user requests, how many additional versions must be added? Even for the „typescript only“ version one would again have to distinguish between at least its diplomatic and normalized variants: the former rendering the typescript „as it stood“, the latter permitting text search across unified orthography and grammar and including replacement symbols for logical and mathematical notation as well as other characters not available on the typewriter: „ss“ for „ß“, „ae“ for „ä“, „Ae“ for „Ä“, etc., etc. 
But all such amendments would not yet help with gaining access to, for example, chronologically sorted outputs of either a single Nachlass item or an entire Nachlass group. Such sorting is relevant for the whole of the Nachlass, for example, for the first Bände series MSS 105 – 122, where there are many jumps from the middle of one manuscript to another.

Attempting to answer the question, ‘How many versions must be added?’, will make one realize that the number of versions one is likely to end up with is too large for the strategy of offering pre-made static editions to remain feasible. Rather, one needs to provide something like an interactive and dynamic „laboratory“ setting that lets the user create the variety of outputs needed on the fly. All add-ons will thus eventually lead us exactly to the point where IDP already is. Instead of trying to add extra editions as static add-ons that are pre-made by the editors, it seems a much better strategy to let the user generate the edition(s) from the underlying text archive on the fly, as required in the hic et nunc situation. All that is needed for this strategy to work is, in addition to software, something like a text archive in the form of encoded transcriptions, stylesheets for their conversion to the specific output desired, and an interface for running on the transcriptions precisely those parameters of the stylesheets that are required for achieving this output. In this spirit, WAB’s IDP site (Wittgenstein 2016–)[13] allows the user to interactively produce – always from the latest version of the XML TEI (P5) source transcriptions text archive – more tailored and a greater variety of outputs than those already available from the pre-made static editions offered by WAB (e. g., Wittgenstein 2015–[14]).
These outputs are created in HTML through XSLT transformation from a single transcription source which contains for each Nachlass item, Nachlass page, Nachlass Bemerkung, Nachlass sentence, formula, drawing, word, letter and character detailed philological / structural / semantic information.[15] Through the site, the user is given access to the ever newest and improved version of the source transcription as well as the possibility to interactively process the source transcription into the presentation that seems best tailored to their specific research needs. This makes the output resource for the users a dynamic, adjustable, revisable and continuously updatable entity.
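How a single encoded source can feed several presentations may be sketched as follows. The sketch deliberately swaps XSLT for Python and plain text output, so it should be read as an analogy to a parameterized stylesheet rather than as WAB's code. It assumes the TEI elements `del`, `add`, `choice`, `orig` and `reg` without namespaces; the sample sentence and the bracket conventions for the diplomatic output are invented.

```python
import xml.etree.ElementTree as ET

MODES = {"diplomatic", "normalized"}

# Invented sample with a deletion, an insertion and a spelling regularization.
SAMPLE = ET.fromstring(
    "<p>This is <del>not </del>a <add>short </add>"
    "<choice><orig>exampel</orig><reg>example</reg></choice>.</p>"
)


def render(node, mode):
    """Produce one presentation of the source; `mode` acts as the parameter."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    # Render the children first; each child's tail text belongs to the parent.
    inner = (node.text or "") + "".join(
        render(child, mode) + (child.tail or "") for child in node
    )
    diplomatic = mode == "diplomatic"
    if node.tag == "del":
        return "[-" + inner + "-]" if diplomatic else ""
    if node.tag == "add":
        return "[+" + inner + "+]" if diplomatic else inner
    if node.tag == "orig":
        return inner if diplomatic else ""
    if node.tag == "reg":
        return "" if diplomatic else inner
    return inner


print(render(SAMPLE, "diplomatic"))   # This is [-not -]a [+short +]exampel.
print(render(SAMPLE, "normalized"))   # This is a short example.
```

Adding a further presentation then means adding a parameter value and a rendering rule, while the source transcription stays untouched.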

WAB’s IDP can fulfill its task thanks to its compliance with three principles (cf. Pichler & Bruvik 2014): (1) separation of matters of transcription (encoding, markup) from matters of presentation; (2) empowerment of users to let them interactively co-produce editions rather than being passive receivers of expert-editor produced editions only; (3) a dynamic and multi-relational view of the relation between the source document and potential presentations of it. These three principles are central ingredients of all text encoding based digital editing at WAB. Naturally, the more detailed and explicit the encoding of the source transcription, the more powerful and adjustable the IDP can become. With regard to WAB, chronological sorting of the Wittgenstein Nachlass, for example, can be implemented thanks to two features of WAB’s transcriptions. First, for each single Nachlass remark there exists a self-contained complete XML TEI transcription such that the entire Nachlass can be constructed out of the transcriptions of its single remarks. Second, for each of the single Nachlass remarks there also exist WAB metadata providing (albeit frequently merely alleged) datings for the remark. The two taken together provide for the possibility of a complete sorting of the entire Nachlass according to chronological parameters. Omission of handwritten revision in typescripts can be achieved thanks to explicit encoding of handwriting in typescripts. Filtering and sorting of the Nachlass texts according to Wittgenstein’s editorial marks or numbers can be put into practice thanks to the specific encoding WAB uses for them.
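The two preconditions for chronological sorting, self-contained remark units and a dating for each of them, can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not WAB's data model: the field names, the identifiers and the placeholder dates are invented. Remarks are ordered by date, with physical position as tie-breaker, so that entries written on the same day end up next to each other; undated remarks are kept at the end in physical order, and uncertain datings are flagged rather than hidden.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DatedRemark:
    ref: str                  # hypothetical identifier
    position: int             # physical order in the source document
    dated: Optional[date]     # dating from the metadata, if any
    certain: bool = True      # False where the dating is only conjectured


def chronological(remarks):
    """Sort by date, then physical position; undated remarks come last."""
    dated = sorted(
        (r for r in remarks if r.dated is not None),
        key=lambda r: (r.dated, r.position),
    )
    undated = sorted(
        (r for r in remarks if r.dated is None),
        key=lambda r: r.position,
    )
    return dated + undated


def label(remark):
    """Human-readable line that keeps uncertainty visible to the user."""
    if remark.dated is None:
        return f"{remark.ref} (undated)"
    suffix = "" if remark.certain else " (dating uncertain)"
    return f"{remark.dated.isoformat()} {remark.ref}{suffix}"


# Placeholder dates only; they do not reproduce any actual Nachlass dating.
REMARKS = [
    DatedRemark("verso-1", 1, date(1900, 1, 2)),
    DatedRemark("verso-2", 2, date(1900, 1, 3), certain=False),
    DatedRemark("recto-1", 3, date(1900, 1, 2)),
    DatedRemark("recto-2", 4, None),
]

for remark in chronological(REMARKS):
    print(label(remark))
```

In this toy data the verso and recto entries of the same day appear together, which is the effect asked for in the case of the wartime notebooks.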

It is a principle of the IDP model precisely not to provide a different source transcription for each of the different desired outputs. Rather, IDP works on the basis that one explicitly marks everything that is to be subsequently processed in the one and only master source transcription, possibly combined with additional standoff markup, and leaves questions of presentation, filtering, sorting, etc., to the stylesheet and user interface. For example, with regard to subsequent IDP manipulation of handwritten revisions in typescript, every hand-produced writing act is encoded in such a way that it can be filtered and processed independently of everything that is typewritten (and vice versa).[16] While WAB’s encoding is still far from complete enough to be adequately prepared for all legitimate IDP requests, these examples should suffice to show that the possibilities and capacities that the IDP model offers for responding to user needs are simply enormous. Sorting functionalities (sorting according to chronology; sorting according to physical sequence; sorting according to discourse sequence; sorting in order to more easily relate personal diary entries and philosophical remarks; sorting to better see the chronological sequence of Wittgenstein’s philosophical work, etc.), multiple presentation functionalities (inclusion and omission of handwritten revisions in a typescript such as insertions, deletions, overwritings, underlinings, etc.; presentation of typed text only, in order to study the vocabulary before the typescript was revised and investigate genetic processes, etc.), filtering functionalities (filtering of the Nachlass according to the marks and numbers that Wittgenstein assigns to his remarks; filtering in order to identify thematic groups; filtering in order to separate [according to Wittgenstein’s own judgment] ‘good’ from ‘not so good’ remarks; filtering in order to identify genetic processes, etc.) as well as their combinations (sorting, filtering and inclusion/omission according to text revision stage, etc.) offer benefits that Nachlass scholars, prior to the introduction of IDP, could ask for but could not expect to receive.
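The handwriting example can be sketched in a few lines. The sketch rests on assumptions: it uses the TEI elements `add` and `del` with the TEI attribute `hand` to mark hand-produced writing acts, omits namespaces, and works on an invented sentence; WAB's actual encoding may well differ. With handwriting switched off, handwritten insertions are dropped and handwritten deletions are undone, which yields the typescript as it stood; with handwriting switched on, the revised text results.

```python
import xml.etree.ElementTree as ET

# Invented sample: a typed sentence with one handwritten deletion and insertion.
SAMPLE = ET.fromstring(
    '<p>The typed <del hand="#hw">first </del>'
    '<add hand="#hw">revised </add>sentence.</p>'
)


def render_typescript(node, include_handwriting):
    """Render the text with or without hand-produced writing acts."""
    inner = (node.text or "") + "".join(
        render_typescript(child, include_handwriting) + (child.tail or "")
        for child in node
    )
    handwritten = node.get("hand") is not None
    if node.tag == "add" and handwritten:
        # A handwritten insertion exists only in the revised state.
        return inner if include_handwriting else ""
    if node.tag == "del":
        if handwritten and not include_handwriting:
            return inner          # undo a deletion that was made by hand
        return ""                 # otherwise the deletion takes effect
    return inner


print(render_typescript(SAMPLE, include_handwriting=False))
print(render_typescript(SAMPLE, include_handwriting=True))
```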

One of the most important effects of IDP is that it creates in the user an awareness that what they are dealing with does not simply fall from the sky: even pre-made editions result from selection and decision processes. Where the user previously, maybe with a sort of innocent and uncritical attitude, simply received and accepted what the editors – be it Wittgenstein’s heirs (cf. Erbacher 2020) or others – gave them (cf. Pichler, Biggs & Uffelmann 2011), with IDP they suddenly recognize that any edition is a product of human action deserving scrutiny and verification. The user may just as well take an active role themselves in the making of the edition; and through IDP they can indeed become a co-agent, taking on some of the editorial responsibilities themselves.

However, even putting all existing encodings to work and offering them for IDP toggling demands a substantial programming investment. While some of the technical features and functionalities mentioned above already work, many do not yet work flawlessly or on all required levels. Chronological sorting, for example, has long worked for single items, but it is needed most at higher levels, such as the chronological arrangement of Nachlass item groups or even the entire Nachlass corpus. With regard to the above-mentioned WW1 diary group MSS 101 – 103 (1914 – 17), where the chronological sequence is dispersed across different page sequences, this functionality will finally enable users to connect Wittgenstein’s personal and private remarks much more easily with his simultaneous reflections on philosophy and logic. Moreover, the feature of toggling handwritten revisions on and off, so that one can view one and the same typescript page, e. g., from the Big Typescript, with and without handwritten corrections and additions, does not yet fully work as it should. One example where this toggling becomes relevant is, as discussed earlier, the study of the sources for Philosophical Grammar. Another important case is Wittgenstein’s handwritten revisions to Rush Rhees’ translation of the early version of the Investigations, contained in TS 226 (cf. Pichler 2020a). Generally, the current inadequacies can be briefly summarized as follows: The IDP tool currently manages to offer access to (1) only a fraction of the encoding, (2) only a fraction of the combinatorial possibilities of the encoding, and (3) only a fraction of the presentation, sorting and filtering possibilities and needs; and (4) in all three of these fields it is susceptible to errors due to undesired interference. 
It is in fact a major challenge to provide combined filtering, sorting and presentation modes that work in tandem and do not interfere negatively with each other. This challenge results in the following limitations: with regard to (1), users cannot yet, for example, filter the transcriptions for insertions of a specific subtype; with regard to (2), users are not yet able to, for example, combine filtering of insertions with filtering of the encoding of text alternatives; and with regard to (3), it is not yet possible to render, for example, the type of insertions selected in ways other than what is set by WAB as the default for the IDP site. Moreover, with regard to (2), it is currently not possible for the user to combine, for example, a marking of Wittgenstein’s text alternatives with a diplomatic rendering, or with the inclusion / exclusion of his own markers for text alternatives, or with a toggling of including / excluding the alternatives discarded by Wittgenstein.[17]
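
How filtering and chronological sorting can be combined without interfering with each other may be sketched as follows. The sketch assumes that each remark is available as a record with a siglum, a date and a flag for code writing; the sigla, dates and the names `Remark` and `view` are invented for illustration and do not describe WAB's implementation.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Remark:
    siglum: str
    written: date
    coded: bool  # True if the remark is in code writing


# Invented records, listed in physical (page) sequence.
REMARKS = [
    Remark("Ms-X,1r[1]", date(1914, 10, 30), False),
    Remark("Ms-X,1v[1]", date(1914, 10, 29), True),
    Remark("Ms-X,9r[1]", date(1914, 10, 29), False),
    Remark("Ms-X,9r[2]", date(1914, 10, 30), True),
]


def view(remarks, *, chronological_order=False, coded=None):
    """Filter first, then sort, so that neither step disturbs the other."""
    selected = [r for r in remarks if coded is None or r.coded == coded]
    if chronological_order:
        # sorted() is stable: remarks sharing a date keep their physical order.
        selected = sorted(selected, key=lambda r: r.written)
    return [r.siglum for r in selected]


print(view(REMARKS, chronological_order=True))
print(view(REMARKS, chronological_order=True, coded=False))
```

Because the sort is stable, remarks written on the same day stay in page order, which is one simple way of co-arranging dispersed diary and philosophical entries by date.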

But equipping the user with the tools required to do this, and thus to make the best of WAB’s resources, will involve much more than cutting-edge digital editorial philology methods and tools. It also requires giving the user access to at least the most basic contemporary semantic web technology and methods.

Figure 1: Facsimile of MS 101 facing pages 51v–52r (dated October 29 – 30 and 20, 1914) (,51v_f and,52r_f; © The Master and Fellows of Trinity College, Cambridge, and the University of Bergen, Bergen), plus diplomatic transcription of p. 51v with original line breaks and italics indicating code writing (reproduced by kind permission of The Master and Fellows of Trinity College, Cambridge, and the University of Bergen, Bergen)

Figure 2: Normalized transcription of remark (Bemerkung) Ms-101,51v[2] „Erhielten heute …“ from MS 101, p.51v, chronologically co-arranged with remarks Ms-101,69r[4] and Ms-101,69r[5], equally dated October 30, 1914, but from p. 69r (see,69r_f); italics indicate code writing (reproduced by kind permission of The Master and Fellows of Trinity College, Cambridge, and the University of Bergen, Bergen)

Figure 3: Part of TS 226, p. 2 (see,2_f; © The Master and Fellows of Trinity College, Cambridge, and the University of Bergen, Bergen) with diplomatic transcription of remark Ts-226,2[2] „Augustine describes …“; for a diplomatic transcription of the entire remark see,2[2]_d (reproduced by kind permission of The Master and Fellows of Trinity College, Cambridge, and the University of Bergen, Bergen)

4 Semantic Faceted Search and Browsing (SFB)

WAB’s Nachlass reference system provides a URL for each single Nachlass component. It goes without saying that this is crucially important for working with IDP user-generated content; with the reference system, the component researched or cited not only becomes easy to refer to but also exactly describable, traceable and tractable throughout all filterings, sortings and rearrangements as well as throughout all research articles and annotations or semantic web environments it enters into. The same system of reference is also applied to the facsimiles; this contributes significantly to user-friendliness and the quality of research in terms of its coherence and consistency. Last but not least, the reference system also makes up the backbone of WAB’s computational ontology for the Wittgenstein domain and is thus also essential for working with „semantic Wittgenstein“.

While the Wittgenstein domain’s data and metadata can naturally be accessed via a direct string search, in section 2 a need was identified for conducting more organized metadata search as well as a need for search combining text and metadata. This is precisely what Semantic Faceted Search and Browsing (SFB) is about. SFB can be briefly explained as follows: First, SFB treats a source with regard to its data and metadata, i. e., its semantic, classificatory and also taxonomic aspects. Second, SFB applies digital semantic technologies to the source; SFB is thus about organizing and investigating a domain’s semantics rather than a field of editorial philology. Finally, SFB’s search and browsing works with facets as a vehicle. Facets are properties, dimensions or relations of the domain’s objects according to which these objects can be classified and thus include metadata. The term „faceted“ stands for the filtering of metadata and data through incremental faceting, and therefore, by extension, for the filtering of data and metadata down to the desired result (the „hit“). There can be a great many and a great variety of relations between the objects of a domain, and at different points there will be a need to focus on different objects and relations. SFB permits one to do precisely that and to identify the object(s) which match the focus one has at a particular place and time. Furthermore, each object can have a great number of relations to other objects which one may be aware of but for which one lacks an overview, as well as many additional relations which one might not yet be aware of. SFB helps to see and make explicit the relations which, though on the surface, previously remained unseen and thus helps the user achieve a synoptic view of the objects and their relations. SFB is also the tool for simply exploring and working with the relations that one is already aware of but for which one still needs highway-like routes to move easily from one known node to another in the semantic landscape. 
Finally, SFB also offers the possibility to view the remark hit resulting from one’s searching and browsing along with a linear transcription presentation of the remark.
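
Incremental faceting, as described above, can be sketched in a few lines. The records, the facet names `refers_to_person` and `year`, and the functions `facet_counts` and `refine` are assumptions made for this illustration only; they are not the metadata or the interface of WAB's SFB site.

```python
from collections import Counter

# Invented metadata records; every value is hypothetical.
RECORDS = [
    {"id": "remark-1", "refers_to_person": "Person A", "year": 1936},
    {"id": "remark-2", "refers_to_person": "Person A", "year": 1944},
    {"id": "remark-3", "refers_to_person": "Person B", "year": 1936},
    {"id": "remark-4", "refers_to_person": "Person B", "year": 1914},
]


def facet_counts(records, facet):
    """Show which values a facet offers for the current result set."""
    return Counter(r[facet] for r in records if facet in r)


def refine(records, facet, value):
    """Narrow the current result set by one facet value."""
    return [r for r in records if r.get(facet) == value]


step1 = refine(RECORDS, "refers_to_person", "Person A")
print(facet_counts(step1, "year"))  # remaining choices after the first step
step2 = refine(step1, "year", 1936)
print([r["id"] for r in step2])
```

Each refinement operates on the result of the previous one, so the user moves step by step from the whole domain down to the hit while seeing, at every step, which facet values remain available.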

The data model behind WAB’s SFB tool „Wittgenstein Ontology Explorer“[18] is an ontology in OWL (RDF) format that organizes not only the Nachlass but the entire Wittgenstein domain under three top classes: Source, Person and Subject (cf. Pichler & Zöllner-Weber 2013). The Source class houses primary and secondary sources relevant for Wittgenstein research; the subclass Primary Source further divides into Wittgenstein sources, such as the Tractatus, and external sources, such as Augustine’s Confessions. The lowest subclass of Wittgenstein primary sources is the remark, the Bemerkung. The Person class contains historical persons such as authors Wittgenstein refers to (cf. Biesenbach 2014). The Subject class contains subclasses such as Concept and Claim; Concept refers to concepts dealt with by Wittgenstein himself or in Wittgenstein research, such as ‘elementary proposition’, ‘picture’, ‘state of affairs’, ‘essence’, ‘logical analysis’, ‘logical independence’, ‘philosophy’, ‘proof’, etc. Instances of Bemerkung can be interlinked with instances of Concept via the property discusses. Claim refers to the point made or discussed by an individual Wittgenstein remark and may contain an entire statement, such as, „The elementary proposition is a picture of a state of affairs“ (TLP 2016: 4.22). Instances of Bemerkung can again be interlinked with instances of Claim via the property discusses. But before the Wittgenstein domain metadata are made available on the SFB front end, they must be entered into the tool; and before being entered, their relations to each other must be modeled precisely in the ontology. WAB’s reference system for the Nachlass is already implemented in the ontology and the SFB site such that the semantic faceted search and browsing metadata branch can fully communicate with WAB’s editorial philology branch.
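
The class structure and the property discusses can be pictured with a small sketch that uses plain subject-property-object statements instead of an OWL toolkit. The class names follow the description above, but their spelling, the shortened hierarchy, the helper functions and all instances (`Remark-A`, `Remark-B`) are assumptions of this illustration and not an excerpt from WAB's ontology.

```python
# Class hierarchy as child -> parent (simplified; intermediate classes omitted).
SUBCLASS_OF = {
    "PrimarySource": "Source",
    "Bemerkung": "PrimarySource",
    "Concept": "Subject",
    "Claim": "Subject",
}

# (subject, property, object) statements; all instances are invented.
TRIPLES = [
    ("Remark-A", "type", "Bemerkung"),
    ("Remark-B", "type", "Bemerkung"),
    ("picture", "type", "Concept"),
    ("proof", "type", "Concept"),
    ("Remark-A", "discusses", "picture"),
    ("Remark-B", "discusses", "picture"),
    ("Remark-B", "discusses", "proof"),
]


def is_subclass(cls, ancestor):
    """Walk up the hierarchy until the ancestor or the top is reached."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def instances_of(cls):
    """All instances of a class, including those of its subclasses."""
    return [s for s, p, o in TRIPLES if p == "type" and is_subclass(o, cls)]


def discussing(topic):
    """All remarks linked to a concept or claim via 'discusses'."""
    return [s for s, p, o in TRIPLES if p == "discusses" and o == topic]


print(instances_of("Source"))
print(discussing("picture"))
```

Querying the top class Source returns the remarks even though they are typed only as Bemerkung, which is the kind of inference a class hierarchy makes available to faceted browsing.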

WAB’s SFB at present permits search and browsing of the Nachlass along a number of facets, including reference to a person, reference to a work, a remark, its dating and its relation to „published works“. It displays any resulting remark hit along with a hyperlink to the corresponding facsimile in the Bergen Nachlass Edition on Wittgenstein Source (Wittgenstein 2015–).

Figure 4: Linear transcription with colour markup (see,2[2]_n) plus SFB-representation (,2%5C%5B2%5C%5D) of remark Ts-226,2[2]

One substantial and important part of the SFB tool is the WiTTFind lemmatized Wittgenstein Nachlass lexicon WiTTLex. WiTTLex contains a lemmatized index for all occurrences of words in the Nachlass (cf. Röhrer 2019). The lexicon is the outcome of a long-standing cooperation between WAB and the Centrum für Informations- und Sprachverarbeitung (CIS) at the Ludwig-Maximilians-Universität München on the search tool WiTTFind.[19] In this joint project, WAB contributed its facsimiles and encoded XML TEI transcriptions of the Wittgenstein Nachlass as well as XSLT stylesheets for their processing, while CIS provided programming and computational linguistics personnel resources as well as a grammatically encoded digital lexicon of the German language based on Franz Guenthner’s CISLEX system (cf. Hadersbeck et al. 2016). WiTTFind offers lemmatized online text search access to the entire Nachlass, displays each sentence containing any grammatical form of the word searched for within the context of the larger remark, and additionally highlights the segment of the facsimile corresponding to the remark hit. WiTTFind continues WAB’s reference system all the way down to sentence level and is a fine example of the added value created by making one’s data available for research and reuse by others. Implementing WiTTLex in the SFB tool made it possible to respond more adequately to one of the needs identified above: simultaneous and combined SFB of both metadata and lemmatized text data. Researchers interested in the genesis of Wittgenstein’s philosophy, for example, may want to know when Wittgenstein started to use the expression „game“ in places where earlier he had written „calculus“, and whether this development can be linked to any other development, e. g., reference to particular works of others, other changes in vocabulary, developments in letter correspondence, meetings and discussions with friends and colleagues, etc. 
Doing this kind of research becomes possible through the integration of WiTTLex into SFB. WAB additionally plans to ingest into SFB all words as they occur in the Nachlass, i. e., in their inflected forms.
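
What lemmatized search adds to plain string search can be shown with a toy example. The four-entry lexicon and the German sample sentences below are invented for this sketch; they are neither WiTTLex data nor Nachlass text, and the functions `lemma` and `search` are hypothetical names.

```python
import re

# Toy lexicon mapping inflected forms to their lemma (assumption of this sketch).
LEXICON = {
    "Spiel": "Spiel",
    "Spiele": "Spiel",
    "Spiels": "Spiel",
    "Spielen": "Spiel",
}

# Invented sample sentences.
SENTENCES = [
    "Das Spiel hat Regeln.",
    "Die Regeln des Spiels sind fest.",
    "Der Kalkül hat Regeln.",
    "Wir erfinden neue Spiele.",
]


def lemma(token):
    """Look up the lemma; unknown tokens count as their own lemma."""
    return LEXICON.get(token, token)


def search(word, sentences):
    """Return every sentence containing any grammatical form of the word."""
    target = lemma(word)
    return [
        s for s in sentences
        if any(lemma(t) == target for t in re.findall(r"\w+", s))
    ]


for hit in search("Spiele", SENTENCES):
    print(hit)
```

A search for one inflected form finds the sentences with the other forms as well, whereas a direct string search for „Spiele“ would miss the first two.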

In this section I could give only a brief glimpse of how the instances of the rich and comprehensive semantic Wittgenstein domain can be intertwined and their complex relations modeled so that both the instances and the relations between them can subsequently be made available for SFB. The tool goes far beyond only Nachlass-related sources and persons. As of December 2022, the SFB site offers search and browsing of more than 56,000 instances of Source and more than 1000 instances of Person from the Wittgenstein domain. However, the SFB tool still lacks basic functionalities. Tasks of improving SFB include adding and organizing missing facets, e. g., more refined facets for browsing and filtering mathematical and logical notation and graphics along the taxonomical categories that were developed for and already offered by the BEE. A functionality that was added only recently is the chronological sorting of a remark’s variants; they were previously displayable in alphanumeric order only.

5 Access to WAB’s Nachlass Transcriptions in XML TEI Format

While WAB has come far in producing and maintaining its transcriptions of the Wittgenstein Nachlass and offering them through the IDP site, and has also come far in producing, organizing, maintaining and offering metadata for the Wittgenstein domain more generally through the SFB site, both tools themselves need extension and optimization towards stronger interactive functionalities and a better match with user needs. Until this is achieved, some more help can be provided by directly offering WAB’s XML TEI transcriptions to the public. These were deposited in the „CLARINO Bergen Repository“ in 2022, and each year newer and better versions will be provided there.[20] This at least equips users who have XML programming competence to respond to their research needs themselves by directly processing and querying the XML TEI transcription files. But only a fraction of Wittgenstein Nachlass users master XML technology, and most will need an interface that offers the resources in a digital language and style that they understand. For these, IDP and SFB will remain central tools.
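
What such direct querying might look like is sketched below with Python's standard library. The TEI namespace and the elements `ab`, `add` and `del` are standard TEI; the stripped-down sample document, its `n` values and the function `units_with` are invented for illustration and do not reproduce the structure of WAB's deposited files.

```python
import xml.etree.ElementTree as ET

TEI = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented, stripped-down sample; real TEI files also carry a teiHeader.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<ab n="demo-1">Ein <add>neuer</add> Satz.</ab>
<ab n="demo-2">Noch ein Satz.</ab>
<ab n="demo-3">Ein <del>alter</del> <add>dritter</add> Satz.</ab>
</body></text></TEI>"""


def units_with(source, element):
    """List the 'n' values of all text blocks containing the given TEI element."""
    root = ET.fromstring(source)
    return [
        ab.get("n")
        for ab in root.iterfind(".//tei:ab", TEI)
        if ab.find(f".//tei:{element}", TEI) is not None
    ]


print(units_with(SAMPLE, "add"))
print(units_with(SAMPLE, "del"))
```

The namespace mapping is the detail most easily overlooked: without it, path expressions on TEI files match nothing.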

While at present WAB’s web services, including the IDP and SFB tools, already enjoy a large number of international users,[21] it is only when these and additional services are upgraded and optimized that researchers will be able to take full advantage of WAB’s resources. Only then will users be equipped to fully exploit the multifaceted interrelations between and within the Wittgenstein data and metadata provided by WAB for research and learning. At the same time, it is also only then that the deep issues about the relation between, on the one hand, the contents and forms of Wittgenstein’s philosophy and work and, on the other hand, their interpretation and application can properly begin to play out in their genuinely complex formats via interactive digital media.


This work is licensed under the Creative Commons Attribution 4.0 International License.

