
# Linguistics Vanguard

### A Multimodal Journal for the Language Sciences

Editor-in-Chief: Bergs, Alexander / Cohn, Abigail C. / Good, Jeff

1 Issue per year

Online
ISSN
2199-174X
Volume 1, Issue 1 (Dec 2015)

# The complexity of inflectional systems

Gregory Stump
/ Raphael A. Finkel
Published Online: 2015-01-05 | DOI: https://doi.org/10.1515/lingvan-2014-1007

## Abstract

We regard the complexity of any inflection-class system as the extent to which the similarities among its inflection classes tend to inhibit motivated inferences about the word forms realizing a paradigm’s cells. We propose ten objective measures of this sort of complexity. We apply these measures in comparing the declensional systems of Latin and Sanskrit, which we represent in a standard format that we call a “plat”; we execute these measurements with an online tool that is freely available for readers to use. We show that the ten measures are not equivalent; together, they show that the declensional systems of Latin and Sanskrit are roughly comparable in complexity. We discuss a number of methodological issues raised by this new approach to typological comparison.

This article offers supplementary material which is provided at the end of the article.

Keywords: complexity; inflection; morphology; typology

The notion of linguistic complexity has recently attracted a good deal of attention among linguists. Here, we examine one domain of linguistic complexity, that of a language’s inflectional system. Our goals are (i) to show that languages vary in their complexity in this domain, and (ii) to demonstrate that inflectional complexity is itself a complex notion, susceptible to a range of nonequivalent measures.

## 1 Linguistic complexity

The concept of linguistic complexity subsumes three logically independent dichotomies (Miestamo 2006, 2008; Ackerman and Malouf 2013). First, a linguistic system’s complexity might be conceived of in either formal or psycholinguistic terms; that is, its complexity may be seen either as some inherent property of its structure or as the degree of difficulty that it presents for its learners and users. Second, complexity might be conceived of either as a global property of entire languages or as a local property of well-defined linguistic subsystems. Third, complexity can be conceived of in quantitative or organizational terms: a linguistic system is quantitatively complex if it has many parts (where “parts” are essentially anything countable) but is organizationally complex if it is ambiguous with respect to the identity of its parts or their modes of interaction.

The global conception of linguistic complexity poses methodological problems: given varying degrees of complexity in different parts of a language’s grammar and lexicon, it is necessary for this variation to be resolved in determining the language’s global complexity, but there are no universally agreed-upon principles for the resolution of such variation. The psycholinguistic conception of complexity is also problematic: what is complex for a language learner may not be complex for a mature speaker; what is complex for a hearer may not be complex for a speaker. We accordingly restrict our attention to inflectional complexity of a formal and local nature, whose manifestations are in some cases quantitative and in other cases organizational.

## 2 Morphological complexity

Morphology combines two particular domains of complexity. On one hand, a language’s morphology defines syntagmatic associations among formatives at the level of the individual word; on the other hand, it defines paradigmatic associations among distinct words, including the words constituting a lexeme’s inflectional paradigm as well as lexemes related by principles of word formation. Each domain exhibits both quantitative and organizational complexity.

Traditionally, morphological typology has focused on morphotactic complexity at the level of individual words. In the classificatory system proposed by Schlegel (1808), extended by Humboldt (1836) and refined by Sapir (1921), individual words are compared with respect to degree of synthesis and degree of fusion. The more synthetic morphology a language has, the greater the quantitative complexity of its individual word forms (measured in their average number of bound morphemes). Because fusional morphology involves nonconcatenative processes, it engenders more nonautomatic allomorphy than agglutinative morphology does, and so endows individual words with comparatively greater organizational complexity; and because fusion involves cumulative exponence, it endows individual exponents with greater quantitative complexity (measured as the average number of morphosyntactic properties that they express) than is exhibited by words whose exponence relations are biunique.

Research in inflectional morphology has drawn attention to the importance of inflectional paradigms as a domain of morphological structure (Matthews 1972; Anderson 1992; Aronoff 1994; Stump 2001). Languages vary widely in the size of their inflectional paradigms (=their number of cells, a quantitative measure of complexity): most English verbs have no more than five inflected forms (e.g. sing, sings, sang, sung, singing), and none has more than thirteen (be, am, ain’t, are, aren’t, is, isn’t, was, wasn’t, were, weren’t, been, being) 1; in Latin, by contrast, a typical verb has well over a hundred forms. Recent research, however, has drawn particular attention to cross-linguistic variation in the organizational complexity of inflectional paradigms (Finkel and Stump 2007, 2009). The issue may be equated with what Ackerman et al. (2009: 54) term the Paradigm Cell Filling Problem (PCFP): What licenses reliable inferences about the inflected surface forms of a lexical item? The PCFP is clearly relevant to the task of learning a language’s inflectional morphology: because language learners do not hear a full paradigm of forms on their first exposure to a new lexeme, they must infer the forms of that lexeme that they have not yet encountered. (It is language learners’ capacity to do this that is demonstrated by the “wug test”; Berko 1958.)

In languages whose inflection is not sensitive to inflection-class (IC) distinctions, inferences of this sort are straightforward; but in languages with IC distinctions, a single form of a lexeme may fail to determine all of its other forms. Even so, it is rarely necessary for learners to hear every form of a lexeme in order to acquire its full paradigm, since the word forms in a lexeme’s paradigm are bound by a network of implicative relations that allow many forms to be inferred from one or more other forms, usually in more than one way. In language pedagogy, such networks account for the possibility of learning a lexeme’s entire paradigm by learning its “principal parts”.

Stump and Finkel (2013) relate the notion of morphological complexity to how dependably an inflectional system’s network of implicative relations allows one to infer unknown forms from known forms; specifically, we define the complexity of an IC system as the extent to which the similarities among its ICs tend to inhibit motivated inferences about the word forms realizing a paradigm’s cells. Such similarities may inhibit motivated inferences in various ways, making it desirable to employ a range of approaches to measuring their effects.

Investigating IC systems is not always easy. In many languages, identifying the number of ICs and the differences among them is complicated in a number of ways. If one lexeme inflects differently from every other member of its category, does it constitute its own IC? Does a heteroclite lexeme belong to a hybrid IC of its own, or does it belong partly to one IC, and partly to another? Do stem alternations participate in determining a lexeme’s IC membership, or is a lexeme’s pattern of stem alternation independent of its IC membership – for instance, do sell and smell belong to the same IC, or not? We return to difficulties of this sort in Section 5; for the moment, we simply assume that it is possible to arrive at a well-motivated conception of a language’s IC system and that its properties are measurable.

Table 1

Latin declensional plat

As a basis for such measurements, it is useful to represent a language’s IC system by what we call a plat: a table in which (i) each column corresponds to the morphosyntactic property set (MPS) σ associated with a given paradigm cell; (ii) each row corresponds to a particular inflection class C; and (iii) the distinguisher of σ in C is listed in row C of column σ. Given any inflection class C and any MPS σ for which members of C inflect, the distinguisher of σ in C is a form d such that for any lexeme L in C whose pairing with σ is realized by a word form w,

• a) d is part of w, and

• b) d distinguishes w from some or all of L’s other word forms.

Ordinarily, all of the word forms in a paradigm share a fixed phonological sequence, their theme; each word form can therefore be factored into two parts – its theme and its distinguisher. Very often, a word form’s theme is its stem and its distinguisher is an inflectional affix (or a sequence of such affixes), as in the case of walk-ed. But in many cases, a word form’s theme consists of only part of its stem, so that its distinguisher consists of the stem’s remainder together with any affixes (e.g. brought, with theme /br/ and distinguisher /ɔt/). To compare ICs in a language that exhibits stem ablaut, we often find it useful to construct a plat in which a word form’s distinguisher is represented as the rime of its stem’s final syllable plus any affixes, e.g. b|ite, b|it, b|itten.
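The factoring of word forms into theme and distinguisher can be sketched in a few lines of Python. This is only an illustration under a simplifying assumption that is not the authors' procedure: the theme is taken to be the longest initial sequence shared by all forms, and it is computed on spelling rather than on phonological forms. The function name `factor_paradigm` and the cell labels are hypothetical.

```python
from os.path import commonprefix

def factor_paradigm(forms):
    """Split each word form into (theme, distinguisher).

    Assumption: the theme is the longest string-initial sequence shared by
    all forms, computed on orthography rather than on phonology.
    """
    theme = commonprefix(list(forms.values()))
    return {cell: (theme, form[len(theme):]) for cell, form in forms.items()}

# Hypothetical cell labels; the forms are English examples cited in the text.
walk = {"base": "walk", "3sg": "walks", "past": "walked", "prog": "walking"}
sing = {"base": "sing", "3sg": "sings", "past": "sang",
        "pptc": "sung", "prog": "singing"}

for paradigm in (walk, sing):
    for cell, (theme, distinguisher) in factor_paradigm(paradigm).items():
        print(f"{cell}: {theme}|{distinguisher}")
```

For walk the theme coincides with the stem (walk|ed), whereas for sing the shared sequence is only the initial s, so the distinguishers carry the stem vowel together with any affix (s|ang, s|ung).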

A plat facilitates a variety of complexity calculations, including set-theoretic computations of a cell or paradigm’s predictiveness or predictability as well as information-theoretic measurements of a cell’s entropy or that of a full paradigm. To exemplify these different measures of an IC system’s complexity, we employ the plats in Tables 1 and 2, which represent substantial fragments of the Latin and Sanskrit declensional systems. (In these plats, we identify each declension not by its traditional name, but by a member lexeme, e.g. aqua names the Latin first declension.)

Table 2

Sanskrit declensional plat

## 3 Measuring inflectional complexity

Because motivated inferences about the word forms realizing a paradigm’s cells may be inhibited in various ways, there are, correspondingly, various ways to measure an IC system’s complexity. Here, we discuss ten such measurements, which we apply to our Latin and Sanskrit plats by means of the PPA (“Principal-Part Analyzer”) software (available for online use at http://www.cs.uky.edu/~raphael/linguistics/analyze.html). The supplementary material includes the Latin and Sanskrit declensional plats in plain-text format; selected computations based on the Latin and Sanskrit plats; and basic instructions for using the online PPA software. Although the ten measures are different, they are alike in their motivation: to quantify the extent to which the similarities among a language’s ICs limit the ease with which the realizations of a paradigm’s cells are interpredictable.

## 3.1 Distillations

In the Latin plat (Table 1), there are twelve MPSs, each containing a case (N[ominative], V[ocative], G[enitive], D[ative], A[ccusative], Ab[lative]) 2 and a number (sg, pl); in the Sanskrit plat, there are 24 MPSs, each containing a case (those of Latin plus the I[nstrumental] and L[ocative]) and a number (sg, du, pl). 3 The Latin plat shows that in each IC, the distinguisher of the Vpl is always identical to that of the Npl. Moreover, the distinguisher of the Apl, though not always identical to that of the Npl, is always isomorphic to it – that is, there is a one-to-one and onto correspondence between the distinguishers of the Npl and those of the Apl. We use the term distillation to refer to a set of MPSs whose distinguishers are isomorphic; our practice is to give a distillation the name of its first member. Distillations are the basis for our first complexity measure:

Complexity measure 1. The more distillations an IC system has, the more complex it is.

In an IC system whose MPSs all belong to the same distillation, every member of every paradigm uniquely determines every other member; but the more distillations there are, the greater the number of potential obstacles to inferring one member of a paradigm from another. By this measurement, the declensional system of Sanskrit (with the 13 distillations in Table 3) is more complex than that of Latin (with 9 distillations).

| Latin | Sanskrit |
|---|---|
| Nsg | Apl |
| Vsg | Nsg |
| Gsg | Vsg |
| Dsg | Asg |
| Asg | Isg |
| Absg | {Dsg, Absg, Gsg} |
| {Npl, Vpl, Apl} | Lsg |
| Gpl | {Ndu, Vdu, Adu} |
| {Dpl, Abpl} | {Idu, Ddu, Abdu} |
| | {Gdu, Ldu} |
| | {Npl, Vpl} |
| | {Ipl, Dpl, Abpl, Lpl} |
| | Gpl |

Table 3

Distillations in Latin and Sanskrit declensional morphology
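The grouping of MPSs into distillations can be sketched as follows. The plat here is an invented four-class toy, not the Latin or Sanskrit data, and the function names are hypothetical. Two columns count as isomorphic when they sort the ICs into the same groups, whatever the shapes of the distinguishers themselves.

```python
# Invented toy plat (not the Latin or Sanskrit data): rows are inflection
# classes, columns are MPSs, and entries are distinguishers.
TOY_PLAT = {
    "IC1": {"Nsg": "a", "Gsg": "e", "Npl": "i", "Apl": "o"},
    "IC2": {"Nsg": "a", "Gsg": "u", "Npl": "i", "Apl": "o"},
    "IC3": {"Nsg": "b", "Gsg": "e", "Npl": "i", "Apl": "o"},
    "IC4": {"Nsg": "c", "Gsg": "e", "Npl": "y", "Apl": "w"},
}
MPSS = ["Nsg", "Gsg", "Npl", "Apl"]

def partition_signature(plat, mps):
    """Encode how one column groups the ICs, ignoring the distinguishers' shapes."""
    first_seen = {}
    return tuple(first_seen.setdefault(row[mps], len(first_seen))
                 for row in plat.values())

def distillations(plat, mpss):
    """Group MPSs whose columns induce the same partition of the ICs."""
    groups = {}
    for mps in mpss:
        groups.setdefault(partition_signature(plat, mps), []).append(mps)
    return list(groups.values())

print(distillations(TOY_PLAT, MPSS))  # [['Nsg'], ['Gsg'], ['Npl', 'Apl']]
```

In the toy plat, Npl and Apl never share a distinguisher, yet they fall into one distillation because each Npl distinguisher corresponds to exactly one Apl distinguisher; the toy system therefore has three distillations.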

## 3.2 Principal parts

The notion of principal parts, familiar from language pedagogy, affords additional measures of inflectional complexity. We define a principal-part set as in (1).

(1) A set of principal parts for a lexeme L is any set of cells in L’s paradigm $P_L$ from whose realizations one can reliably deduce the realization of every remaining cell in $P_L$.

This conception of principal parts is more flexible than the traditional pedagogical notion, which is tacitly subject to three restrictions. Traditionally, principal-part sets are
• uniform (in the paradigms of lexemes belonging to the same syntactic category, the same members always serve as principal parts),

• unique (the label “principal parts” is, by convention, reserved for one of potentially many possible candidates for a lexeme’s principal-part set), and

• optimal (there is no smaller set of forms that could serve just as adequately as principal parts).

The definition in (1) dispenses with these three requirements but leaves open the possibility that for some kinds of measurements, we may opt to adhere to one or more of them. For example, we distinguish between static principal-part sets, which adhere to the requirement of uniformity, and dynamic principal-part sets, which do not.

An inflectional system that requires fewer principal parts is less complex than one that requires more. In applying this criterion, we must examine optimal principal-part sets of either the static type or the dynamic type:

Complexity measure 2. The larger the number of static principal parts an IC system optimally requires, the more complex that system is.

Complexity measure 3. The larger the average number of dynamic principal parts that the ICs in a system optimally require, the more complex that system is.

Of the two measures, 3 gives a clearer picture of an inflectional system’s network of implicative relations in those cases in which different ICs exhibit different implicative patterns. As it happens, the Latin declensional system and the Sanskrit declensional system require essentially the same number of principal parts on an optimal analysis: Latin requires no fewer than 4 static principal parts and an average of 1.24 dynamic principal parts (Table 4); Sanskrit requires no fewer than 4 static principal parts and an average of 1.21 dynamic principal parts (Table 5).
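The difference between static and dynamic principal parts can be illustrated with a brute-force search over an invented toy plat. This is a sketch, not the PPA algorithm. It assumes that a set of cells "works" for an IC exactly when no other IC shows the same distinguishers in those cells, which holds whenever the plat's rows are pairwise distinct. All names are hypothetical.

```python
from itertools import combinations

# Invented toy plat (not the Latin or Sanskrit data).
TOY_PLAT = {
    "IC1": {"Nsg": "a", "Gsg": "e", "Npl": "i", "Apl": "o"},
    "IC2": {"Nsg": "a", "Gsg": "u", "Npl": "i", "Apl": "o"},
    "IC3": {"Nsg": "b", "Gsg": "e", "Npl": "i", "Apl": "o"},
    "IC4": {"Nsg": "c", "Gsg": "e", "Npl": "y", "Apl": "w"},
}
MPSS = ["Nsg", "Gsg", "Npl", "Apl"]

def identifies(plat, ic, cells):
    """True if no other IC has the same distinguishers as `ic` in `cells`."""
    key = tuple(plat[ic][c] for c in cells)
    return all(tuple(row[c] for c in cells) != key
               for other, row in plat.items() if other != ic)

def optimal_static_size(plat, mpss):
    """Smallest n such that one n-cell set works for every IC (uniformity)."""
    for n in range(1, len(mpss) + 1):
        for cells in combinations(mpss, n):
            if all(identifies(plat, ic, cells) for ic in plat):
                return n
    return None

def optimal_dynamic_size(plat, mpss, ic):
    """Smallest n such that some n-cell set works for this IC alone."""
    for n in range(1, len(mpss) + 1):
        if any(identifies(plat, ic, cells) for cells in combinations(mpss, n)):
            return n
    return None

dynamic = {ic: optimal_dynamic_size(TOY_PLAT, MPSS, ic) for ic in TOY_PLAT}
print(optimal_static_size(TOY_PLAT, MPSS))   # 2
print(dynamic)                               # {'IC1': 2, 'IC2': 1, 'IC3': 1, 'IC4': 1}
print(sum(dynamic.values()) / len(dynamic))  # 1.25
```

The toy system needs two static principal parts, but only IC1 needs two dynamic ones; this per-class variation is what measure 3 captures and measure 2 hides.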

| Declension | Number of optimal dynamic principal parts |
|---|---|
| AGILE_N, ALACRIS_F, ANIMAL, AQUA, ATRŌX_MF, ATRŌX_N, AUXILIUM, BŌS, CELERE_N, CELERIS_F, CĪVIS, CORNŪ, CORPUS, DIĒS, FĪLIUS, FRŪCTUS, HOMŌ, MARE, MISER_M, MISERA_F, MISERUM_N, OPUS, PARS, PATER, PRĪNCEPS, RĒS, RĒX, SŪS | 1 |
| AGILIS_MF, ALACER_M, ALACRE_N, CELER_M, DOMINUS, DŌNUM, ŪLLA_F, ŪLLUM_N, ŪLLUS_M | 2 |
| Average | 1.24 |

Table 4

The optimal number of dynamic principal parts for Latin declensions

| Declension | Number of optimal dynamic principal parts |
|---|---|
| ALI, AŚVA, BALIN_M, BALIN_N, BHŪ, DĀNA, DĀTṚ_N, DHĪ, DĪRGHĀYUS_MF, DĪRGHĀYUS_N, JAGAT, MADHU, MANAS, MARUT, NADĪ, NĀMAN, PAŚU, PRATYAÑC_M, PRATYAÑC_N, RĀJAN, SĒNĀ, ŚREYAS_M, ŚUKRAŚOCIS_MF, ŚUKRAŚOCIS_N, SUMANAS_MF, TUDAT_M, TUDAT_N, VADHŪ, VIDVĀṀS_M, VIDVĀṀS_N | 1 |
| BHAGAVAT_M, DĀTṚ_M, DHENU, MATI, MĀTṚ, PITṚ, SVASṚ, VĀRI | 2 |
| Average | 1.21 |

Table 5

The optimal number of dynamic principal parts for Sanskrit declensions
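The averages in Tables 4 and 5 can be checked by counting the declensions listed in each row. Table 4 lists 28 Latin declensions with one optimal dynamic principal part and 9 with two; Table 5 lists 30 and 8 for Sanskrit. The worked arithmetic is as follows:

$$\text{Latin: } \frac{28 \cdot 1 + 9 \cdot 2}{28 + 9} = \frac{46}{37} \approx 1.24 \qquad \text{Sanskrit: } \frac{30 \cdot 1 + 8 \cdot 2}{30 + 8} = \frac{46}{38} \approx 1.21$$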

By relaxing the uniqueness requirement, we can examine the number of alternative principal-part analyses to which an inflectional system is subject. In an IC system requiring no fewer than n principal parts, we can think of the density of the system’s viable principal-part sets as the proportion of n-member subsets of a paradigm’s cells that “work” as principal parts. The higher the density of viable principal-part sets among a paradigm’s cells, the greater the number of ways to infer the realization of one cell from that of other cells. Thus:

Complexity measure 4. The lower the density of an IC system’s optimal static principal-part sets, the more complex it is.

Complexity measure 5. The lower the average density of an IC system’s optimal dynamic principal-part sets, the more complex it is.

The Latin plat allows eight optimal static principal-part analyses, while the Sanskrit plat affords only six; these optimal analyses are listed in Table 6. As this table shows, a Latin noun’s Absg cell is a particularly important predictor, appearing in all eight of the optimal static principal-part analyses for Latin; in Sanskrit, a noun’s Nsg and Apl cells are comparably important as predictors.

| | Latin | Sanskrit |
|---|---|---|
| Analyses | {Nsg, Gsg, Asg, Absg} | {Nsg, Vsg, Asg, Apl} |
| | {Nsg, Gsg, Absg, Npl} | {Nsg, Vsg, Ndu, Apl} |
| | {Nsg, Dsg, Asg, Absg} | {Nsg, Vsg, Npl, Apl} |
| | {Nsg, Dsg, Absg, Npl} | {Nsg, Asg, Apl, Gpl} |
| | {Vsg, Gsg, Asg, Absg} | {Nsg, Ndu, Apl, Gpl} |
| | {Vsg, Gsg, Absg, Npl} | {Nsg, Npl, Apl, Gpl} |
| | {Vsg, Dsg, Asg, Absg} | |
| | {Vsg, Dsg, Absg, Npl} | |
| Density | 8 analyses with 4 principal parts out of 126 possible 4-member analyses for 9 distillations: density = 0.063 | 6 analyses with 4 principal parts out of 715 possible 4-member analyses for 13 distillations: density = 0.008 |

Table 6

Optimal static principal-part analyses of the Latin and Sanskrit declensional systems and their density
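The density figures in Table 6 follow from simple counting: the number of possible 4-member analyses is the number of ways of choosing 4 distillations from 9 (Latin) or from 13 (Sanskrit). A short check, with the hypothetical helper name `density`:

```python
from math import comb

def density(viable_sets, n_distillations, set_size):
    """Proportion of set_size-member subsets of distillations that are viable."""
    return viable_sets / comb(n_distillations, set_size)

print(comb(9, 4), round(density(8, 9, 4), 3))    # 126 0.063
print(comb(13, 4), round(density(6, 13, 4), 3))  # 715 0.008
```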

Complexity measure 5 is complicated by the fact that different ICs typically have different numbers of optimal dynamic principal parts and different numbers of alternative principal-part analyses. While this variation might seem to constitute variation in quantitative complexity, it actually reflects variation in organizational complexity: an IC that requires few dynamic principal parts and has many alternative analyses inhibits motivated inferences about unlisted forms to a much lower degree than an IC that requires more dynamic principal parts and has few alternative analyses. Table 7 reveals this variation in organizational complexity. By the density measures in Tables 6 and 7, the Sanskrit declensional system is more complex than that of Latin: in Sanskrit, there are fewer ways to choose viable sets of optimal principal parts.

Table 7

The density of optimal dynamic principal-part analyses for Latin and Sanskrit declensions

Principal-part analysis seeks to isolate the predictive from the predictable on the scale of whole paradigms: where the inflectional paradigm $P_L$ of lexeme L is a set of cells (each the pairing of L with an appropriate MPS), principal-part analysis seeks a subset Q of $P_L$ such that the realization of Q’s members is predictive of the realization of the remaining cells in $P_L$ (i.e. those constituting Q’s complement in $P_L$). Thus, what is predictive and what is predictable are the realizations of complementary sets of cells. But a paradigm’s network of implicative relations can also be scrutinized in a finer-grained way, by examining the predictability and predictiveness of a paradigm’s individual cells – that is, by asking what may be predicted from the realization of a single cell, and what suffices to predict a single cell’s realization. Like principal-part analysis, this approach is the basis for various kinds of complexity measures.

One such measure is an IC’s cell predictor number – the number of optimal dynamic principal parts required to determine the realization of a particular cell in a paradigm P belonging to that IC, averaged across all of P’s cells:

Complexity measure 6. The higher an IC system’s cell predictor number, the more complex it is.

This measure yields very similar results for Latin and Sanskrit: given any cell C in a Sanskrit declensional paradigm P belonging to any IC, there is always at least one other cell in P whose realization by itself determines the realization of C. The same is nearly true in Latin. The only IC deviating from this pattern is that of ALACER_M “lively”: on average, the realization of a given cell in a paradigm belonging to this declension can only be inferred by reference to the realization of 1.22 other cells (Table 8). By this measure, the Latin declensional system is only negligibly more complex than that of Sanskrit.

| Latin | Sanskrit |
|---|---|
| ALACER_M class: 1.22 | All classes: 1.00 |
| All other classes: 1.00 | |
| Average over ICs: 1.01 | Average over ICs: 1.00 |

Table 8

The cell predictor numbers of Latin and Sanskrit declension classes

Another measure of complexity at the level of individual paradigm cells is that of cell predictiveness – for any cell C, the fraction of the other cells in the paradigm whose realization is fully determined by that of C. A given cell in a declensional paradigm P may be more or less predictive depending on the IC to which P belongs. Consider, for example, the Gsg cell in a Latin nominal paradigm; as Table 9 shows, the realization of the Gsg cell is very predictive in some declensions (e.g. the AQUA declension, in which it predicts the realization of every other cell) but comparatively unpredictive in others (e.g. the OPUS declension, in which it only determines the realization of the Dsg and Dpl cells).

Table 9

The predictiveness of the Gsg cell in Latin declensional paradigms (In column α, 1 means that the realization of the Gsg cell determines that of α; 0 means that it does not.)

Averaging over ICs, the Gsg cell in a Latin declensional paradigm has an overall predictiveness of 0.598; averaging again over all distillations, a given cell in a Latin declensional paradigm has a predictiveness of 0.603. In Sanskrit, this final average is 0.591. Thus, by the following measurement, the Sanskrit declensional system is marginally more complex than that of Latin.

Complexity measure 7. The lower an IC system’s average cell predictiveness, the more complex it is.
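Cell predictiveness can be sketched on an invented toy plat as follows. One assumption goes beyond the text: the realization of a cell is taken to determine that of another cell when every IC sharing the first distinguisher also shares the second. The sketch computes over raw MPSs rather than over distillations, and the function names are hypothetical.

```python
# Invented toy plat (not the Latin or Sanskrit data).
TOY_PLAT = {
    "IC1": {"Nsg": "a", "Gsg": "e", "Npl": "i", "Apl": "o"},
    "IC2": {"Nsg": "a", "Gsg": "u", "Npl": "i", "Apl": "o"},
    "IC3": {"Nsg": "b", "Gsg": "e", "Npl": "i", "Apl": "o"},
    "IC4": {"Nsg": "c", "Gsg": "e", "Npl": "y", "Apl": "w"},
}
MPSS = ["Nsg", "Gsg", "Npl", "Apl"]

def determines(plat, ic, source, target):
    """True if every IC matching `ic` in `source` also matches it in `target`."""
    return all(row[target] == plat[ic][target]
               for row in plat.values()
               if row[source] == plat[ic][source])

def cell_predictiveness(plat, mpss, ic, source):
    """Fraction of the other cells whose realization `source` determines."""
    others = [m for m in mpss if m != source]
    return sum(determines(plat, ic, source, t) for t in others) / len(others)

for ic in TOY_PLAT:
    print(ic, round(cell_predictiveness(TOY_PLAT, MPSS, ic, "Gsg"), 3))
```

As with the Latin Gsg in Table 9, the toy Gsg cell is fully predictive in one class (IC2, whose Gsg distinguisher is unique) and unpredictive in the classes that share the distinguisher e.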

A final complexity measurement at the cell level is that of cell predictability. Intuitively, the predictability of cell C in paradigm P is the ratio of (a) to (b), where (a) is the number of nonempty subsets of P’s other cells whose realization uniquely determines that of C and (b) is the number of all nonempty subsets of P’s other cells. 4 (For a more precise definition, see Stump and Finkel 2013: 99ff.) Cell predictability correlates with complexity in the following way:

Complexity measure 8. The lower an IC system’s average cell predictability, the more complex it is.

Thus, consider the cell predictability figures for declensional paradigms in Latin (Table 10) and Sanskrit (Table 11).

Table 10

Cell predictabilities and IC predictabilities for the Latin declensional system

Table 11

Cell predictabilities and IC predictabilities for the Sanskrit declensional system

In Latin declensions, the predictability of individual paradigm cells is highly variable. The realization of the Absg is completely unpredictable in four declensions (those of CĪVIS, MARE, AGILIS_MF and ALACRE_N); most cells, however, exhibit substantially higher predictability. Cell predictability averaged across ICs varies from a low of 0.812 in the Vsg to a high of 0.980 in the Dpl. (This means that 81.2% of cell sets of up to four members suffice to predict a paradigm’s Vsg form, but that 98.0% suffice to predict its Dpl.) Cell predictability averaged across distillations varies from a low of 0.670 in the ALACER_M declension to a high of 0.994 in the HOMŌ, PRĪNCEPS, ANIMAL, BŌS and PARS declensions, each of whose cells attains this same level of predictability. The overall average cell predictability exhibited by the Latin declensions is 0.865.

In Sanskrit, the non-neuter r-stem declensions (those of PITṚ, MĀTṚ, DĀTṚ_M and SVASṚ) are completely unpredictable in the Apl; the TUDAT_M and BHAGAVAT_M declensions are likewise unpredictable in the Nsg. Cell predictability averaged across ICs varies from a low of 0.814 in the NVpl to a high of 0.992 in the GLdu. Cell predictability averaged across distillations varies from a low of 0.805 in the non-neuter r-stem declensions to a high of 0.996 in the DHĪ declension. The overall average cell predictability exhibited by the Sanskrit declensions is 0.903. Thus, by complexity measure 8, the Sanskrit declensional system is slightly simpler than that of Latin.
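The intuitive definition of cell predictability given above can be sketched on an invented toy plat. This follows the ratio of (a) to (b) only and makes no attempt at the more precise definition in Stump and Finkel (2013). The optional `max_size` cap on subset size is an assumption, as are the function names.

```python
from itertools import combinations

# Invented toy plat (not the Latin or Sanskrit data).
TOY_PLAT = {
    "IC1": {"Nsg": "a", "Gsg": "e", "Npl": "i", "Apl": "o"},
    "IC2": {"Nsg": "a", "Gsg": "u", "Npl": "i", "Apl": "o"},
    "IC3": {"Nsg": "b", "Gsg": "e", "Npl": "i", "Apl": "o"},
    "IC4": {"Nsg": "c", "Gsg": "e", "Npl": "y", "Apl": "w"},
}
MPSS = ["Nsg", "Gsg", "Npl", "Apl"]

def determines(plat, ic, sources, target):
    """True if every IC matching `ic` on all `sources` also matches on `target`."""
    key = tuple(plat[ic][s] for s in sources)
    return all(row[target] == plat[ic][target]
               for row in plat.values()
               if tuple(row[s] for s in sources) == key)

def cell_predictability(plat, mpss, ic, target, max_size=None):
    """Share of nonempty subsets of the other cells that determine `target`."""
    others = [m for m in mpss if m != target]
    max_size = max_size or len(others)  # assumption: no cap unless one is given
    subsets = [s for k in range(1, max_size + 1)
               for s in combinations(others, k)]
    return sum(determines(plat, ic, s, target) for s in subsets) / len(subsets)

print(cell_predictability(TOY_PLAT, MPSS, "IC1", "Nsg"))            # 0.0
print(round(cell_predictability(TOY_PLAT, MPSS, "IC1", "Npl"), 3))  # 0.857
```

The Nsg of the toy class IC1 is completely unpredictable: IC3 matches IC1 in every other cell, so no subset of those cells settles the Nsg. Its Npl, by contrast, is determined by six of the seven subsets.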

## 3.4 Predictability of ICs

Each of the four complexity measures discussed in Section 3.2 above is based on optimal principal-part analyses. But the network of implicative relations that relate the cells of a paradigm might comprise relations of many sorts, not all of which are necessarily optimal in this sense. The measure of IC predictability is sensitive to this wider range of implicative relations.

Intuitively, the IC predictability of a lexeme L’s IC is the fraction of adequate (though not necessarily optimal) dynamic principal-part sets among all nonempty subsets of cells in L’s paradigm. (For a more precisely developed definition, see Stump and Finkel 2013: 94ff.) This notion is the basis for the following complexity measure:

Complexity measure 9. The lower a system’s average IC predictability, the more complex it is.

The IC predictabilities of the Latin declensions are listed in the rightmost columns of Table 10. As this table shows, the HOMŌ, PRĪNCEPS, ANIMAL, BŌS and PARS declensions are maximally predictable: no matter which cell we pick in the paradigms of these nouns, its realization predicts that of every other cell in its paradigm. Two declensions, AGILIS_MF and ALACRE_N, exhibit the lowest IC predictability (0.306). Averaged across ICs, the IC predictability of Latin declensions is 0.709.

In Sanskrit (Table 11), the non-neuter r-stem declensions exhibit the lowest IC predictability (0.155), and the DHĪ declension exhibits the highest (0.997). Averaged, Sanskrit declensions exhibit an IC predictability of 0.702, strikingly close to the corresponding average in Latin.
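The intuitive definition of IC predictability can be sketched in the same brute-force style on an invented toy plat. The sketch assumes that a set of cells is an adequate dynamic principal-part set for an IC when no other IC shares its distinguishers in those cells, and it enumerates all nonempty subsets without any size cap. Neither assumption is claimed to match the precise definition in Stump and Finkel (2013), and the names are hypothetical.

```python
from itertools import combinations

# Invented toy plat (not the Latin or Sanskrit data).
TOY_PLAT = {
    "IC1": {"Nsg": "a", "Gsg": "e", "Npl": "i", "Apl": "o"},
    "IC2": {"Nsg": "a", "Gsg": "u", "Npl": "i", "Apl": "o"},
    "IC3": {"Nsg": "b", "Gsg": "e", "Npl": "i", "Apl": "o"},
    "IC4": {"Nsg": "c", "Gsg": "e", "Npl": "y", "Apl": "w"},
}
MPSS = ["Nsg", "Gsg", "Npl", "Apl"]

def identifies(plat, ic, cells):
    """True if no other IC has the same distinguishers as `ic` in `cells`."""
    key = tuple(plat[ic][c] for c in cells)
    return all(tuple(row[c] for c in cells) != key
               for other, row in plat.items() if other != ic)

def ic_predictability(plat, mpss, ic):
    """Share of nonempty cell subsets that are adequate principal-part sets."""
    subsets = [s for k in range(1, len(mpss) + 1)
               for s in combinations(mpss, k)]
    return sum(identifies(plat, ic, s) for s in subsets) / len(subsets)

scores = {ic: ic_predictability(TOY_PLAT, MPSS, ic) for ic in TOY_PLAT}
for ic, score in scores.items():
    print(ic, round(score, 3))
print("average", round(sum(scores.values()) / len(scores), 3))
```

Unlike the measures based on optimal analyses, this count credits every adequate subset, including redundant ones: in the toy plat IC4 scores 14/15 because any subset other than {Gsg} identifies it.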

## 3.5 Entropy

The complexity of a language’s inflection-class system can also be measured in information-theoretic terms as the entropy of the system’s variables (Shannon 1948, 1951; cf. Moscoso del Prado Martín et al. 2004; Milin et al. 2009; Ackerman et al. 2009). Each of the MPSs in a plat may be seen as a variable with a limited range of values (its distinguishers); a MPS’s entropy is a measure of uncertainty as to its value in a given IC. If a MPS σ has the same value in every IC, then its entropy is zero – there is no uncertainty about its value in any given IC. If it has two possible values (the distinguishers a and b) and these are evenly distributed across ICs, its entropy is one bit. (The intuition is that one bit of information is needed to eliminate all uncertainty about σ’s value in a given IC; this bit of information might be thought of as an answer to one yes-no question, e.g. “In this IC, does σ have the value a?”) Because the distinguishers realizing a MPS σ may not be evenly distributed across ICs, the entropy of σ is sensitive to the variable probabilities of its range of values. Thus, Shannon’s formula for computing the entropy H(σ) of a variable σ takes account of the probability P(x) of each member x of the set $D_\sigma$ of possible values for σ:

$$H(\sigma) = -\sum_{x \in D_\sigma} P(x)\,\log_2 P(x)$$

A variable’s entropy can be reduced by knowing the value of some other variable; in the case at hand, if one knows the distinguisher realizing the MPS τ in a given IC, that information may narrow the range of possible distinguishers realizing the MPS σ in that IC. Thus, the entropy H(σ|τ) of the MPS σ conditional on the MPS τ is defined as follows:

$$H(\sigma \mid \tau) = -\sum_{x \in D_\tau} P(x) \sum_{y \in D_\sigma} P(y \mid x)\,\log_2 P(y \mid x)$$

Conditional entropy is a crucial component of a complexity measure that Stump and Finkel (2013: 300) call n-MPS entropy. The n-MPS entropy of an MPS σ is the average of the entropy of σ conditional on members of $C_\sigma$, the collection of MPSs not including σ with up to n members. The formula for the n-MPS entropy $H_n(\sigma)$ of a MPS σ is

$$H_n(\sigma) = \frac{\sum_{\tau \in C_\sigma} H(\sigma \mid \tau)}{|C_\sigma|}.$$

The larger the value chosen for n, the lower a MPS’s n-MPS entropy; setting n at a lower number (arbitrarily, 4) has the effect of heightening the contrast among the n-MPS entropy measurements for a given set of MPSs. Thus, our final complexity measurement:

Complexity measure 10. The higher an IC system’s 4-MPS entropy, the more complex it is.
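The three entropy formulas can be sketched directly over an invented toy plat. The sketch assumes that all ICs are equally probable, so that P(x) is the share of ICs showing distinguisher x, and it reads $C_\sigma$ as the collection of all sets of up to n MPSs other than σ. Both are assumptions of this illustration, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2

# Invented toy plat (not the Latin or Sanskrit data).
TOY_PLAT = {
    "IC1": {"Nsg": "a", "Gsg": "e", "Npl": "i", "Apl": "o"},
    "IC2": {"Nsg": "a", "Gsg": "u", "Npl": "i", "Apl": "o"},
    "IC3": {"Nsg": "b", "Gsg": "e", "Npl": "i", "Apl": "o"},
    "IC4": {"Nsg": "c", "Gsg": "e", "Npl": "y", "Apl": "w"},
}
MPSS = ["Nsg", "Gsg", "Npl", "Apl"]

def entropy(values):
    """Shannon entropy in bits of a list of equally weighted observations."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())

def conditional_entropy(plat, sigma, taus):
    """H(sigma | taus): entropy of sigma within each group of ICs that agree
    on all of `taus`, weighted by the group's share of the ICs."""
    groups = {}
    for row in plat.values():
        groups.setdefault(tuple(row[t] for t in taus), []).append(row[sigma])
    return sum(len(g) / len(plat) * entropy(g) for g in groups.values())

def n_mps_entropy(plat, mpss, sigma, n):
    """Average of H(sigma | taus) over all sets of up to n other MPSs."""
    others = [m for m in mpss if m != sigma]
    cond_sets = [c for k in range(1, n + 1) for c in combinations(others, k)]
    return sum(conditional_entropy(plat, sigma, c) for c in cond_sets) / len(cond_sets)

print(entropy([row["Nsg"] for row in TOY_PLAT.values()]))  # 1.5
print(conditional_entropy(TOY_PLAT, "Gsg", ("Nsg",)))      # 0.5
print(conditional_entropy(TOY_PLAT, "Npl", ("Apl",)))      # zero: same distillation
print(n_mps_entropy(TOY_PLAT, MPSS, "Gsg", 1),
      n_mps_entropy(TOY_PLAT, MPSS, "Gsg", 2))
```

In the toy plat, raising n from 1 to 2 lowers the n-MPS entropy of Gsg, in line with the observation that larger values of n yield lower entropies.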

The 4-MPS entropy of the Latin declensional system is 0.16; that of the Sanskrit declensional system is 0.10. These values are quite low; they reveal that notwithstanding their complexity by other criteria, these two declensional systems possess a large measure of redundancy, enough to allow many of the forms in an inflectional paradigm to be deduced in many different ways.

These facts support the Low Conditional Entropy Conjecture – “the hypothesis that enumerative morphological complexity [=quantitative complexity] is effectively unrestricted, as long as the average conditional entropy, a measure of integrative complexity [=organizational complexity], is low” (Ackerman and Malouf 2013: 436). The average conditional entropy measurement is effectively that of 1-MPS entropy. The average conditional entropy of our Latin plat is 0.49, and that of the Sanskrit plat is 0.46; these figures tend toward the low end of the spectrum of average conditional entropy measurements reported by Ackerman and Malouf for their sample of ten inflectional systems.

## 4 Summary

The ten complexity measures discussed above afford a close comparison of the declensional systems of Latin and Sanskrit; in applying our complexity measures to these systems, we have arrived at the figures summarized in Table 12. These results reveal that overall, the declensional systems of Latin and Sanskrit are very similar in complexity. This is a nontrivial conclusion, since the application of our ten complexity measures to other inflectional systems has revealed a good deal of cross-linguistic variation. Elsewhere, we have employed our ten complexity measures in comparing IC systems from twelve genetically and typologically diverse languages; see Stump and Finkel (2013: 314ff). The extent of the variation that we have observed reveals that the complexity of IC systems is a significant typological variable.

The results in Table 12 also substantiate our claim that complexity is itself complex and that the dimensions of this complexity are most clearly revealed by drawing on a range of subtly different metrics. 5 One may, of course, be inclined to choose one of the ten measures as the “right” one, but we question whether this is a productive impulse. By the criterion of average cell predictability, the Latin declensional system is more complex than that of Sanskrit; by the criteria of density, Sanskrit is more complex than Latin. No one measure is fully revealing, because complexity can take more than one form.

Table 12

Summary of complexity measurements of the Latin and Sanskrit declensional systems (In each row, the shaded column exhibits higher complexity by the criterion at hand.)

## 5 Constructing plats

Each measure of an IC system’s complexity presupposes a precise representation of the system’s parts. But the same system may be represented in more than one way, and the precise way in which it is represented affects the results of complexity measurements. Consider, for example, the construction of a plat representing English conjugations. How should the distinguishers be represented in this plat? One possibility is to represent all distinguishers in phonetic transcription, as in Table 13(a); in this mode of representation, the CAST and PASS conjugations exhibit the same distinguisher in the past tense, as do the LEAD and SAY conjugations. Another possibility is to represent all distinguishers morphophonologically, as in Table 13(b); in this mode of representation, the past-tense distinguisher of PASS differs from that of CAST, while LEAD and SAY continue to have the same distinguisher in the past tense. Still another possibility is to include some grammatical information in the representation of a word form’s distinguisher; for instance, one might represent morphological boundaries (as in Table 13(c)), so that LEAD and SAY have distinct distinguishers.

Table 13

Alternative representations for distinguishers in a plat of English conjugations
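The effect of the choice of representation can be sketched in a few lines of code. The distinguisher strings below are hypothetical stand-ins chosen only to mirror the pattern described in the text (they are not the contents of Table 13), and the function name is likewise an assumption. The sketch simply reports which conjugations share a past-tense distinguisher under each mode of representation.

```python
from collections import defaultdict

# Hypothetical past-tense distinguishers for four English conjugations.
# The strings are illustrative assumptions, not the entries of Table 13.
PAST_TENSE = {
    "phonetic": {"CAST": "æst", "PASS": "æst", "LEAD": "ɛd", "SAY": "ɛd"},
    "morphophonological": {"CAST": "æst", "PASS": "æsd", "LEAD": "ɛd", "SAY": "ɛd"},
    "with_boundaries": {"CAST": "æst", "PASS": "æs-d", "LEAD": "ɛd", "SAY": "ɛ-d"},
}

def shared_distinguishers(column):
    """Return the groups of conjugations that have an identical distinguisher."""
    groups = defaultdict(list)
    for conjugation, distinguisher in column.items():
        groups[distinguisher].append(conjugation)
    return sorted(sorted(members) for members in groups.values() if len(members) > 1)

if __name__ == "__main__":
    for mode, column in PAST_TENSE.items():
        print(mode, shared_distinguishers(column))
```

Under these assumed strings, the phonetic representation groups CAST with PASS and LEAD with SAY, the morphophonological one groups only LEAD with SAY, and the representation with boundaries groups none of them.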

Stump and Finkel (2013: 51f) distinguish between two logical extremes for the construction of plats. A hearer-oriented plat contains only that inflectional information that is perceptually available to the hearer. By contrast, a speaker-oriented plat is one whose distinguishers carry additional grammatical information that mature speakers of a language would have access to, including information about a word form’s morphophonology, its morphological boundaries, its gender-class membership or valence, and so on. Grammatical properties of these sorts have a disambiguating function (cf. again Table 13); they enrich the information upon which inferences about a lexeme’s IC membership are based. For this reason, complexity measurements based on a speaker-oriented plat make an IC system look less complex than measurements based on a hearer-oriented plat of that system.
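The disambiguating effect of added grammatical information can be illustrated with a single cell. The sketch below computes the uncertainty about a lexeme's IC given only its past-tense distinguisher, once for a hearer-oriented column and once for a speaker-oriented one. The distinguisher strings, the equal weighting of ICs, and the function name are all assumptions made for illustration, and the quantity computed is a simple conditional entropy rather than one of the ten measures discussed above.

```python
import math
from collections import Counter

# Hypothetical past-tense distinguishers (illustrative assumptions only).
HEARER_ORIENTED = {"CAST": "æst", "PASS": "æst", "LEAD": "ɛd", "SAY": "ɛd"}
SPEAKER_ORIENTED = {"CAST": "æst", "PASS": "æs-d", "LEAD": "ɛd", "SAY": "ɛ-d"}

def ic_uncertainty(column):
    """H(IC | distinguisher) in bits, assuming every IC is equally probable.

    If k ICs share a distinguisher, that distinguisher occurs with
    probability k/n and leaves log2(k) bits of uncertainty about the IC.
    """
    n = len(column)
    group_sizes = Counter(column.values()).values()
    return sum(k / n * math.log2(k) for k in group_sizes)

if __name__ == "__main__":
    print("hearer-oriented: ", ic_uncertainty(HEARER_ORIENTED))
    print("speaker-oriented:", ic_uncertainty(SPEAKER_ORIENTED))
```

With these assumed columns, the hearer-oriented representation leaves one bit of uncertainty about IC membership, while the speaker-oriented representation leaves none, which is the sense in which the latter makes the system look less complex.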

This difference does not mean that speaker-oriented plats are “better than” hearer-oriented plats, because they represent different kinds of things. A hearer-oriented plat might be seen as representing inflectional contrasts that are directly perceptible to a language learner; a plat of this sort might be useful for modeling inferences in the acquisition of a language’s morphology. By contrast, a speaker-oriented plat may be seen as encoding all of the information upon which mature speakers base deductions about inflected forms that are not directly listed in the lexicon. Consider, for example, the possibility of representing a lexeme L’s IC membership, not diacritically, but by a set of “principal parts”; L’s principal parts would in that case be embedded in a lexical entry for L containing additional information about L’s gender (or valence, etc.), about its stems (hence about its representation in compounds, but also about the segmentation of its principal parts), and so on. The principles for deducing L’s unlisted forms would then be based on L’s principal parts but could be streamlined by reference to other information in L’s lexical entry.

Hearer-oriented plats and speaker-oriented plats may differ in other ways as well. A hearer-oriented plat might include impostors – inflectional patterns that are ambiguous in their implications. Consider again the past-tense form cast, whose inflectional pattern is a subtype of the IC to which set belongs (in which infinitive, past tense and past participle are realized by the same, invariant form). Because of the similarity of /kæst/ to /pæst/, the inflectional pattern of cast is an impostor with respect to the IC of pass. The inclusion of impostor patterns in a hearer-oriented plat raises its complexity because it heightens the ambiguity of certain forms’ IC membership and thus increases the opacity of those forms for the hearer/learner. In a speaker-oriented plat that includes information about morphological boundaries, many patterns cease to be impostors at all; the inflectional pattern of suffixless /kæst/ does not parallel that of the suffixed form /pæs‐d/ in such a plat. Logically, impostors should be omitted from a speaker-oriented plat in any event: mature speakers of English recognize that cast participates in precisely the same implicative relations as set, a fact in no way diminished by the accidental phonetic similarity of its past-tense and past-participial forms to those of pass. Wholly irregular patterns embodied by a single lexeme (e.g. those of be or go) should also seemingly be omitted from a speaker-oriented plat, since they do not constitute generalizations about nontrivial sets of lexemes. One can, of course, imagine plats that do not strictly fall on the continuum from speaker-oriented to hearer-oriented plats: to examine the ways in which a particular mode of exponence contributes to an inflectional system’s complexity, one might construct plats whose distinguishers are limited in a particular way, e.g. to tone patterns, ablaut patterns, patterns of vowel intercalation, or patterns of referral. 
Many kinds of plats are imaginable, each a potentially informative basis for analysis.

There is a moral here. Claims about inflectional complexity are not always precise about the kind of representation they presuppose for a language’s IC system; in the absence of this information, such claims cannot be taken at face value. We urge those who investigate inflectional complexity to make their data sets publicly available. See Stump and Finkel (2013) for further discussion of the importance of this representational issue, and see the link below for access to the data sets on which that discussion is based.

Online PPA engine

http://www.cs.uky.edu/~raphael/linguistics/analyze.html

Companion website for Stump and Finkel (2013) Morphological typology: From word to paradigm. Cambridge University Press. This site includes all of the plats analyzed in the book.

http://morphologicaltypology.as.uky.edu/

## References

• Ackerman, Farrell, James P. Blevins & Robert Malouf. 2009. Parts and wholes: Implicative patterns in inflectional paradigms. In James P. Blevins & Juliette Blevins (eds.), Analogy in grammar: Form and acquisition, 54–82. Oxford: Oxford University Press.

• Ackerman, Farrell & Robert Malouf. 2013. Morphological organization: The low conditional entropy conjecture. Language 89. 429–464.

• Anderson, Stephen R. 1992. A-morphous morphology. Cambridge: Cambridge University Press.

• Aronoff, Mark. 1994. Morphology by itself: Stems and inflectional classes. Cambridge, MA & London: MIT Press.

• Berko, Jean. 1958. The child’s learning of English morphology. Word 14. 150–177.

• Finkel, Raphael & Gregory Stump. 2007. Principal parts and morphological typology. Morphology 17. 39–75.

• Finkel, Raphael & Gregory Stump. 2009. Principal parts and degrees of paradigmatic transparency. In James P. Blevins & Juliette Blevins (eds.), Analogy in grammar: Form and acquisition, 13–53. Oxford: Oxford University Press.

• Humboldt, Wilhelm von. 1836. Über die Verschiedenheit des menschlichen Sprachbaues und ihren Einfluss auf die geistige Entwickelung des Menschengeschlechts. Berlin: F. Dümmler.

• Matthews, P. H. 1972. Inflectional morphology: A theoretical study based on aspects of Latin verb conjugation. Cambridge: Cambridge University Press.

• Miestamo, Matti. 2006. On the feasibility of complexity metrics. In K. Kerge & M.-M. Sepper (eds.), Finest linguistics: Proceedings of the Annual Finnish and Estonian Conference of Linguistics, Tallinn, May 6–7, 2004, 11–26. Tallinn: Tallinn University Department of Estonian.

• Miestamo, Matti. 2008. Grammatical complexity in a cross-linguistic perspective. In Matti Miestamo, Kaius Sinnemäki & Fred Karlsson (eds.), Language complexity: Typology, contact, change, 23–41. Amsterdam & Philadelphia, PA: John Benjamins.

• Milin, Petar, Victor Kuperman, Aleksandar Kostić & R. Harald Baayen. 2009. Words and paradigms bit by bit: An information-theoretic approach to the processing of paradigmatic structure in inflection and derivation. In James P. Blevins & Juliette Blevins (eds.), Analogy in grammar: Form and acquisition, 214–252. Oxford: Oxford University Press.

• Moscoso del Prado Martín, Fermín, Aleksandar Kostić & R. Harald Baayen. 2004. Putting the bits together: An information-theoretical perspective on morphological processing. Cognition 94(1). 1–18.

• Sapir, Edward. 1921. Language: An introduction to the study of speech. New York: Harcourt, Brace & Co.

• Schlegel, Friedrich von. 1808. Über die Sprache und Weisheit der Indier: Ein Beitrag zur Begründung der Alterthumskunde. Heidelberg: Mohr & Zimmer.

• Shannon, Claude E. 1948. A mathematical theory of communication. Bell System Technical Journal 27(3). 379–423.

• Shannon, Claude E. 1951. Prediction and entropy of printed English. Bell System Technical Journal 30(1). 50–64.

• Stump, Gregory. 2001. Inflectional morphology. Cambridge: Cambridge University Press.

• Stump, Gregory & Raphael Finkel. 2013. Morphological typology: From word to paradigm. Cambridge: Cambridge University Press.

• Zwicky, Arnold & Geoff Pullum. 1983. Cliticization vs inflection: English n’t. Language 59. 502–513.

## Footnotes

• 1. On the inflectional status of -n’t, see Zwicky and Pullum (1983).

• 2. For present purposes, we omit the locative case, which is not distinct from the ablative case in the paradigms of most lexemes in Classical Latin.

• 3. In Table 2, we save space by grouping identical columns of distinguishers into single columns; for example, the NVAdu column is a conflation of the identical Ndu, Vdu and Adu columns.

• 4. Because large subsets of the other cells in C’s paradigm almost always suffice to determine the realization of C, we find it useful to set a limit number on the cardinality of subsets employed in this calculation; this has the effect of making differences in cell predictability more pronounced. Here, we employ a limit number of 4. We set a similar limit on the cardinality of subsets employed in the calculation of IC predictability (Section 3.4).

• 5. A reviewer asked why a system’s number of ICs shouldn’t be seen as an additional measure of complexity. If one construes an IC system’s complexity as we do (as the extent to which the similarities among its ICs tend to inhibit motivated inferences about the word forms realizing a paradigm’s cells), then a system’s number of ICs is not a relevant measure. For instance, if two IC systems are maximally simple (allowing the realization of every cell in a paradigm to be deduced from that of every other cell), then they are equal in complexity regardless of whether one has fifty ICs and the other, only five.

Published Online: 2015-01-05

Published in Print: 2015-12-01

Citation Information: Linguistics Vanguard, ISSN (Online) 2199-174X,
