Keyness measures are a set of statistical measures, developed mostly in corpus and computational linguistics and in information retrieval, that are generally expected to detect textual features accounting for differences between two texts or groups of texts. These measures are based on the frequency, distribution, or dispersion of words (or other features). Searching for relevant differences or similarities between two groups of texts is also characteristic of traditional literary studies, whenever two authors, two periods in the work of one author, two historical periods, or two literary genres are to be compared. Applying quantitative procedures to search for differences therefore seems promising for computational literary studies: it makes it possible to analyze large corpora and to base historical hypotheses about differences between authors, genres, and periods on broader empirical evidence. However, applying quantitative procedures to questions relevant to literary studies in many cases raises methodological problems, which have been discussed on a more general level in the context of integrating or triangulating quantitative and qualitative methods in mixed methods research in the social sciences. This paper aims to solve these methodological issues concretely for the concept of distinctiveness and thus to lay the methodological foundation for operationalizing quantitative procedures so that they can be used not only as rough exploratory tools but in a hermeneutically meaningful way for research in literary studies.
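To make the idea of a frequency-based keyness measure concrete, the following minimal sketch computes a signed log-likelihood ratio (G²) for each word in a target corpus relative to a reference corpus. Log-likelihood is only one commonly used keyness measure and is chosen here for illustration; the paper does not single it out. The function name, the sign convention, and the toy corpora are assumptions made for this example.

```python
import math
from collections import Counter


def log_likelihood_keyness(target_tokens, reference_tokens):
    """Return {word: signed G2} for a target corpus against a reference corpus.

    Assumes both token lists are non-empty. The sign is an illustrative
    convention: positive if the word is relatively more frequent in the target.
    """
    target = Counter(target_tokens)
    reference = Counter(reference_tokens)
    n_target = sum(target.values())
    n_reference = sum(reference.values())
    scores = {}
    for word in set(target) | set(reference):
        a, b = target[word], reference[word]
        # Expected counts if the word had the same relative frequency in both corpora.
        expected_a = n_target * (a + b) / (n_target + n_reference)
        expected_b = n_reference * (a + b) / (n_target + n_reference)
        g2 = 0.0
        if a > 0:
            g2 += a * math.log(a / expected_a)
        if b > 0:
            g2 += b * math.log(b / expected_b)
        g2 *= 2
        sign = 1 if a / n_target >= b / n_reference else -1
        scores[word] = sign * g2
    return scores


# Toy corpora (hypothetical data, for illustration only).
target_corpus = "the sea the ship the storm sea sea".split()
reference_corpus = "the house the garden the street house".split()
scores = log_likelihood_keyness(target_corpus, reference_corpus)
for word, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{word:8s} {score:6.2f}")
```

A score of this kind only ranks words by how unevenly they are distributed across the two corpora; whether a high-ranking word is distinctive in a sense relevant to literary studies is exactly the question the paper raises.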
Starting from a structural definition of potential candidate measures for analyzing distinctiveness, the first section offers a systematic description of the problem of integrating quantitative procedures into a hermeneutically meaningful understanding of distinctiveness, distinguishing its epistemological from its methodological side. The second section develops a systematic strategy for solving the methodological side of this problem. It is based on a critical reconstruction of the widespread non-integrative strategy in research on keyness measures, which can be traced back to Rudolf Carnap’s model of explication. We demonstrate that it is mandatory, in the first instance, to gain a comprehensive qualitative understanding of the actual task. We show that Carnap’s model of explication suffers from a shortcoming: it ignores the need for a systematic comparison of what he calls the explicatum and the explicandum. Only if there is a method of systematic comparison can the next task, that of evaluation, be addressed. Evaluation verifies whether the output of a quantitative procedure corresponds to the qualitative expectation, which must be clarified in advance. We claim that evaluation is necessary for integrating quantitative procedures into a qualitative understanding of distinctiveness. Our reconstruction shows that both steps, clarification and evaluation, are usually skipped in empirical research on keyness measures, which is the most important point of reference for the development of a measure of distinctiveness. Evaluation, which in turn requires thorough explication and conceptual clarification, is needed to verify the relation between a candidate measure and the qualitative concept of distinctiveness.
In the third section, we offer a qualitative clarification of the concept of distinctiveness by spanning a three-dimensional conceptual space. This flexible framework takes into account that there is no single proper concept of distinctiveness but rather a field of possible meanings that depend on research interest, theoretical framework, and access to the perceptibility or salience of textual features. Instead of stipulating a narrow and strict definition, we therefore treat each of these aspects (interest, theoretical framework, and access to perceptibility) as one dimension of the heuristic space of possible uses of the concept of distinctiveness.
The fourth section discusses two possible strategies of operationalization and evaluation that we consider complementary to the clarification provided in the third section and that complete the task of successfully establishing a candidate measure as a measure of distinctiveness in a qualitatively ambitious sense. We demonstrate that two different general strategies are worth considering, depending on the notion of distinctiveness and the research interest elaborated in the third section. If the interest is merely taxonomic, classification tasks based on multi-class supervised machine learning are sufficient. If the interest is aesthetic, more complex and intricate evaluation strategies are required; these have to rely on a thorough clarification of the concept of distinctiveness, in particular of the idea of salience or perceptibility. The challenge here is to correlate perceivable complex features of texts, such as plot, theme (aboutness), style, form, or the roles and constellation of fictional characters, with the unperceived frequency and distribution of word features that are calculated by candidate measures of distinctiveness. Existing research has not yet clarified how to correlate such complex features with individual word features.
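The taxonomic evaluation scenario can be sketched as follows: take the words that a candidate measure ranks as distinctive, use only those words as features, and check how well held-out texts are assigned to their class. The nearest-centroid classifier below is a deliberately simple stand-in for multi-class supervised learning and is not the procedure used in the paper; the function names, the vocabulary, and the toy texts are hypothetical.

```python
from collections import Counter


def relative_frequencies(tokens, vocabulary):
    """Represent a text by the relative frequency of each vocabulary word."""
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero for empty texts
    return [counts[word] / total for word in vocabulary]


def train_centroids(labelled_texts, vocabulary):
    """Average the feature vectors of all training texts per class label."""
    sums, sizes = {}, Counter()
    for label, tokens in labelled_texts:
        vector = relative_frequencies(tokens, vocabulary)
        previous = sums.get(label, [0.0] * len(vocabulary))
        sums[label] = [s + v for s, v in zip(previous, vector)]
        sizes[label] += 1
    return {label: [s / sizes[label] for s in vec] for label, vec in sums.items()}


def classify(tokens, centroids, vocabulary):
    """Assign a text to the class whose centroid is closest (squared Euclidean)."""
    vector = relative_frequencies(tokens, vocabulary)

    def squared_distance(centroid):
        return sum((v - c) ** 2 for v, c in zip(vector, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def accuracy(test_texts, centroids, vocabulary):
    """Share of held-out texts assigned to their true class."""
    hits = sum(classify(tokens, centroids, vocabulary) == label
               for label, tokens in test_texts)
    return hits / len(test_texts)


# Hypothetical output of a candidate measure: words ranked as distinctive.
vocabulary = ["ship", "storm", "street", "market", "farm", "field"]
training_texts = [
    ("sea", "the ship left the harbour in a storm".split()),
    ("sea", "a storm hit the ship".split()),
    ("city", "the street led to the market".split()),
    ("city", "a market on the street".split()),
    ("country", "the farm lay by the field".split()),
    ("country", "a field near the farm".split()),
]
held_out_texts = [
    ("sea", "the storm sank a ship".split()),
    ("city", "the market filled the street".split()),
    ("country", "a farm and a field".split()),
]
centroids = train_centroids(training_texts, vocabulary)
print(accuracy(held_out_texts, centroids, vocabulary))
```

High accuracy in such a setup would support a candidate measure only for taxonomic purposes; it says nothing about whether the selected words are salient to readers, which is the concern of the aesthetic scenario.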
The paper concludes with a general reflection on the possibility of mixed methods research in computational literary studies in terms of explanatory power and exploratory use. As our strategy of combining explication and evaluation shows, integration should be understood as a strategy of combining two different perspectives on the object area: in our evaluation scenarios, that of empirical reader response and that of a specific quantitative procedure. This does not imply that measures of distinctiveness that have proved to have explanatory power in one qualitative respect can be assumed to be successful in all fields of research. As long as evaluation is omitted, candidate measures of distinctiveness lack explanatory power and are limited to exploratory use. In contrast to the skepticism that literary scholars have sometimes expressed about the relevance of computational literary studies to genuine issues of the humanities, we believe that computational methods can be integrated into hermeneutic literary studies in a way that reaches higher explanatory power than the usual exploratory use of keyness measures. However, this can only be achieved individually for concrete tasks and not once and for all on the basis of a general theoretical demonstration.
Bondi, Marina, Perspectives on Keywords and Keyness. An Introduction, in: Marina Bondi/Mike Scott (eds.), Keyness in Texts, Amsterdam/Philadelphia 2010, 1–18. DOI: 10.1075/scl.41.01bon
Bruza, P.D. et al., Aboutness from a Commonsense Perspective, Journal of the American Society for Information Science 51:12 (2000), 1090–1105. DOI: 10.1002/1097-4571(2000)9999:9999<::AID-ASI1026>3.0.CO;2-Y
Carnap, Rudolf, Logical Foundations of Probability, Chicago/London/Toronto 1950.
Duncker, Axel, Gattungssystematiken, in: Rüdiger Zymner (ed.), Handbuch Gattungstheorie, Stuttgart/Weimar 2010, 12–15.
Fricke, Harald, Norm und Abweichung, München 1981.
Fricke, Harald, Definitionen und Begriffsformen, in: Rüdiger Zymner (ed.), Handbuch Gattungstheorie, Stuttgart/Weimar 2010, 7–10.
Gabrielatos, Costas, Keyness Analysis: Nature, Metrics and Techniques, in: Charlotte Taylor/Anna Marchi (eds.), Corpus Approaches to Discourse. A critical review, Oxford 2018, 225–258. DOI: 10.4324/9781315179346-11
Gymnich, Marion/Birgit Neumann/Ansgar Nünning (eds.), Gattungstheorie und Gattungsgeschichte, Trier 2007.
Hempfer, Klaus W., Zum begrifflichen Status der Gattungsbegriffe: Von ›Klassen‹ zu ›Familienähnlichkeiten‹ und ›Prototypen‹, Zeitschrift für französische Sprache und Literatur 120:1 (2010), 14–32.
Herrmann, Berenike J./Karina van Dalen-Oskam/Christof Schöch, Revisiting Style, a Key Concept in Literary Studies, Journal of Literary Theory 9:1 (2015), 25–52. DOI: 10.1515/jlt-2015-0003
Jauß, Hans Robert, Literaturgeschichte als Provokation der Literaturwissenschaft, Konstanz 1967.
Kelle, Udo, Die Integration qualitativer und quantitativer Forschung – theoretische Grundlagen von »Mixed Methods«, Kölner Zeitschrift für Soziologie und Sozialpsychologie 69:2 (2017), 39–61. DOI: 10.1007/s11577-017-0451-4
Klimek, Sonja/Ralph Müller, Vergleich als Methode? Zur Empirisierung eines philologischen Verfahrens im Zeitalter der Digital Humanities, Journal of Literary Theory 9:1 (2015), 53–78. DOI: 10.1515/jlt-2015-0004
Lamping, Dieter, Handbuch der literarischen Gattungen, Stuttgart 2009.
Lincoln, Yvonna S./Egon G. Guba, Paradigmatic Controversies, Contradictions, and Emerging Confluences, Revisited, in: Norman Denzin/Yvonna S. Lincoln (eds.), Handbook of Qualitative Research, Thousand Oaks, CA ⁵2018, 108–150.
Müller, Ralph, Kategorisieren, in: Rüdiger Zymner (ed.), Handbuch Gattungstheorie, Stuttgart/Weimar 2010, 21–23.
Paquot, Magali/Yves Bestgen, Distinctive Words in Academic Writing: A Comparison of three Statistical Tests for Keyword Extraction, DIAL – Digital Access to Libraries, https://dial.uclouvain.be/pr/boreal/object/boreal:76052 (17.09.2021), 1–23 (originally published in Language and Computers 68, 247–269).
Schmidt-Hidding, Wolfgang, Zur Methode wortvergleichender und wortgeschichtlicher Studien, in: Europäische Schlüsselwörter, Vol. I: Humor und Witz, ed. by Sprachwissenschaftlichen Colloquium (Bonn), München 1963, 18–33.
Scott, Mike, WordSmith Tools Manual. Version 3.0, Oxford 1998.
Šklovskij, Viktor, Die Kunst als Verfahren, in: Jurij Striedter (ed.), Russischer Formalismus, München 1969, 5–35.
Stamatatos, Efstathios, A Survey of Modern Authorship Attribution Methods, Journal of the American Society for Information Science and Technology 60:3 (2009), 538–556. DOI: 10.1002/asi.21001
Strube, Werner, Sprachanalytisch-philosophische Typologie literaturwissenschaftlicher Begriffe, in: Christian Wagenknecht (ed.), Zur Terminologie der Literaturwissenschaft, Stuttgart 1989, 35–49.
Stubbs, Michael, Three Concepts of Keywords, in: Marina Bondi/Mike Scott (eds.), Keyness in Texts. Corpus Linguistic Investigations, Amsterdam/Philadelphia 2010, 21–42. DOI: 10.1075/scl.41.03stu
Swales, John, Genre Analysis. English in Academic and Research Settings, Cambridge 1990.
Tukey, John W., Exploratory Data Analysis, London et al. 1977.
Voßkamp, Wilhelm, Gattungen als literarisch-soziale Institutionen, in: Walter Hinck/Alexander von Bormann (eds.), Textsortenlehre – Gattungsgeschichte, Heidelberg 1977, 27–44.
Williams, Raymond, Keywords. A Vocabulary of Culture and Society, revised edition, New York 1983.
© 2020 Walter de Gruyter GmbH, Berlin/Boston