This paper presents the hypothesis that words (and not morphemes) are the minimal units of connection between sound and meaning in human languages. Such a proposal implies the definition of the word as a categorized syntactic derivation that is linked in memory to a representation in the sensorimotor system. The main implications of the hypothesis are the following: (i) A non-lexicalist model is compatible with the phenomena of lexical integrity and lack of productivity that motivate lexicalist models. (ii) It can be concluded that bound morphemes (roots and affixes) are neither syntactic nor conceptual entities, but purely morphophonological ones. Morphemes are side effects of linguistic change operating as resources to optimize the processing and memorization of words. And (iii) a neo-constructionist conception of words is made compatible with a paradigmatic morphology.
There are two different propositions in the title of this article: that everything is syntax, and that words are important. Its conditional formulation includes the additional assumption that if everything were syntax, words would not be interesting for the study of natural languages. And indeed, one can predict that those who reject the protasis – that everything is syntax – tend to accept the apodosis, while those who accept the former tend to reject the latter.
My main concern is to outline a model in which both statements are true; that is, a model in which the title’s question makes perfect sense and to which the model itself offers a specific answer. Thus, whereas this approach fits non-lexicalist models, in the sense that it rejects the existence of a morphological component capable of creating words, it does not share the typical conclusion of these models, according to which words are epiphenomenal (e.g. Julien 2007). Quite the contrary, the present contribution places the word as a central feature of human language. The apparent paradox that this entails is solved with the central hypothesis to be developed here, which is twofold: (i) that words are not lexical units, but rather syntactic constructions produced by the computational system; and (ii) that such syntactic constructions are the smallest units of connection between the conceptual-intentional system and the sensorimotor system (in the terminology of the influential model of Hauser et al. 2002).
The first part of the hypothesis – that words are built in the syntax – is not original at all (e.g. Baker 1988; Marantz 1997; Julien 2002; Starke 2009; Borer 2013) and it faces the same serious objections that can be raised against any theory that seeks to ignore the clearly uneven behavior of words (semantic, syntactic and phonological) in comparison with other syntactic constructions (see Anderson 1992; Aronoff 2007; Williams 2007). The second part of the hypothesis, which as far as I know is original, seeks to avoid these objections by claiming that what makes the syntactic constructions we call words special (i.e. “lexical”) is that words are the smallest units of human language in which a connection between sense and sound is established; the connection between meaning and sound (or hand gestures), then, is established only through syntax.
The principal conclusion of this is that the lexicon of a language is not formed by previous pairings of meaning and sound, which are later handled by syntax, but consists of phonological words that have an essential role in externalizing the syntactic derivations produced by the computational system in its interaction with the conceptual-intentional system. The crucial feature of the word is, at the same time, to be a syntactic construction and the minimum unit of connection with the sensorimotor system. This feature would then be the source of the properties that robustly point to a certain discontinuity between so-called “phrasal syntax” and “lexical syntax” (and which underlie the lexicalist position).
Although this proposal is non-lexicalist (or neo-constructionist) in that it recognizes only one generative engine, it is also true to say that if words are systematic links between syntactic computations and stored phonological forms, there somehow emerges a new form of lexicalism, in that the apparent discontinuity between the (alleged) two types of syntax is only a consequence of the architecture of the connection in human language between the computational system and the sensorimotor system. The connection of the syntactic derivations that we call words with the sensorimotor system implies memorization. The consequence of this fact is that, although the structure of words is purely syntactic, their syntax is not ephemeral (using the appealing expression of Aronoff 2007), but is stored, and thus automatically exposed to idiosyncrasy.
Moreover, this conception of the word makes a non-lexicalist model compatible with a morphological theory based on the word-and-paradigm model (e.g. Anderson 1992 or Stump 2001), which is more empirically adequate than models that consider morphemes as objects manipulated by syntax.
In Section 2 I will consider the architecture of the Faculty of Language in which the proposal is integrated. In Section 3 I will define the concept of I-Lexicon (internalized lexicon) as an interface system for the externalization of language, and I will show that this is the appropriate framework for the hypothesis that the word is the minimal connection between sense and sound. In Section 4 I underline the importance of the proposed hypothesis as a solution to the lexicalist / anti-lexicalist controversy. Section 5 develops an additional assumption (that categorization is the minimum condition for a syntactic word to be a phonological word) and a more radical proposal (that what is categorized are not roots, but concepts). Section 6 presents a model of the syntactic structure of words, and Section 7 addresses the ability of this model to predict the phenomena of lexical integrity and the characteristic tension between compositionality and idiosyncrasy in complex words. In Section 8 I address the core concepts of paradigm and analogy assuming the conception of morphemes as elements without syntactic or semantic features, while in Section 9 I develop the hypothesis that morphological structure is the legacy of the historical reanalysis of syntactic constructions. Section 10 presents the main conclusions.
2 The asymmetrical architecture of the faculty of language
According to Hauser et al.’s (2002) model, the human faculty of language (FL) can be conceived as a system integrated minimally by three independent components: a conceptual-intentional system (CI) related to meaning and interpretation, a sensorimotor system (SM) related to the perception and production of linguistic signals, and a computational system (CS), responsible for the creation of the syntactic structure underlying linguistic expressions.
Note that in this model there is no obvious place for morphology. In fact, there is not even a clear place for the core lexical component of human languages. In the interpretation of the architecture of the FL I will defend in this article, the lexicon is to be considered as an interface system between the computational system (Narrow Syntax) and the sensorimotor system. In order that there be reasonable grounds for this proposal we need to take into account a significant asymmetry in the relationship between the three essential components of FL.
Following the proposals of Chomsky (2007) and Berwick and Chomsky (2011, 2016), I will assume that the computational system has an asymmetrical relationship with the two “external” components (CI and SM), such that the computational system would be optimized for its interaction with the CI system, while the relationship with the SM system would be ancillary or secondary. It is then implied that the computational system is coupled with the CI system to form a kind of “internal language of thought” that would be essentially homogeneous within the species and that would not be evolutionarily designed for communication, but for thought. Chomsky has suggested that from an evolutionary point of view, the computational system was initially (and of course remains) a “language of thought” independent of communication and of the systems of externalization: “the earliest stage of language would have been just that: a language of thought, used internally” (Chomsky 2007: 13).
The connection of the computational system with the SM system is what would allow the “externalization” of language for interaction and communication with others. Since the connection of the internal language of thought with the externalization systems is posterior or secondary, it would be precisely within this process that the principal (if not exclusive) source of the structural diversity among human languages (understood as knowledge systems or I-languages) would emerge:
Parameterization and diversity, then, would be mostly – possibly entirely – restricted to externalization. That is pretty much what we seem to find: a computational system efficiently generating expressions interpretable at the semantic/pragmatic interface, with diversity resulting from complex and highly varied modes of externalization, which, furthermore, are readily susceptible to historical change. (Berwick and Chomsky 2011: 37–38)
The externalization of the computational system (which is supposedly unchanging and universal in its structure and in its connection to the CI system) gives rise, however, to different I-languages (Spanish, Russian, Chinese, etc.). This would be because the externalization process essentially involves learning “the language” of the environment. Note that what is implied is that what humans have to learn from the environment are actually the patterns of externalization of the internal language of thought (which I assume is strictly conditioned and guided in its development by our own biological design). Like any cognitive system, or any other organ, language development is conditioned by two types of factors: general factors, common to all languages (biological ones or those following from natural laws, factors 1 and 3 of Chomsky 2005) and specific factors for each language (environmental ones, factor 2 of Chomsky 2005). The central hypothesis of this model is that the internal language of thought (that is, the computational system and the conceptual-intentional system) is essentially conditioned in its development by general factors (biological or physical), while the connection to the sensorimotor system (externalization) is especially sensitive to environmental linguistic stimuli. To put it more simply: when we learn the language of the environment, what we essentially learn is how to externalize the internal language of thought in the same way that other people in our linguistic community do.
Obviously, there is an apparent contradiction in the statement that the internal language of thought is externalized in a given internal language (I-language). The adjective internal is being used here in reference to different “containers”. An I-language is internal (as well as individual and intensional) insofar as it is a mental organ, a system of knowledge internal to the mind and brain of a person, not just a public or shared social object. The expression internal language (of thought), meanwhile, refers to the language of thought formed by the conceptual-intentional system in its interaction with the computational system. Therefore, any I-language includes a naturally conditioned component (which in turn includes the CS), and also a component “internalized” from the environment through the acquisition process, which is precisely what distinguishes languages from each other, and where linguistic changes can occur. We will call this component the lexical interface or I-lexicon (internalized lexicon).
A central idea henceforth is that this asymmetry in the relationship between the computational system and the CI and SM systems would explain why, according to our hypothesis, the connection between sense and sound in natural languages is characteristically post-syntactic.
3 The I-Lexicon as an externalization interface of the language of thought
When we speak about the externalization of the internal language of thought we are not talking about the externalization of I-language (i.e. the use of any language) but about the stable connection of the computational system with the SM component that allows syntactic computations to be materialized as sounds (or as visual signs) and to be stored in long-term memory (see Figure 1).
The establishment of the connection between the language of thought and the SM system produces different I-languages, so in principle we can suppose that it is during the development of this connection that language change occurs (and hence where the diversity of languages arises). The interface between the computational system and the SM system must include at least a lexicon or repertoire of linguistic forms for linking syntactic computations with a phonological system that produces chains of articulated sounds (or, where appropriate, visual signs). Language development in an individual speaker (i.e. the process of language acquisition from the environment) therefore consists of the development of an I-lexicon in the individual’s brain.
From this schematic presentation it follows that syntax is the only source of compositionality in human language. If we return to the architecture of language reflected in Figure 1, we see that it is assumed that there is only one “generative engine” in language: syntax. There is no place for a morphological module for word formation. In this model words (interpreted as phonological words) form part of the I-lexicon and are dedicated to externalizing syntactic derivations. Therefore, morphology “belongs” to the sensorimotor system, and indeed the model predicts that morphology is merely the outcome of the history of each language. The fact that the range of variation in languages is much greater in morphology and phonology than in semantics and syntax supports such a view (see Section 9 for discussion).
The I-lexicon of Figure 1, then, is similar to the Vocabulary in Distributed Morphology, i.e. “the list of Vocabulary Items, objects that provide phonological content to functional morphemes” (Embick 2015: 20), with one crucial difference: in the present model vocabulary items have the minimal size of phonological or prosodic words, which externalize fragments of syntactic derivations. In this regard it is important to consider Embick’s remark: “Admitting the Vocabulary into the ontology of the theory is a consequence of denying that functional heads are ‘traditional’ morphemes (= morphemes that possess both sound and meaning from the beginning) in favor of a realizational approach to sound/meaning connections” (Embick 2015: 19). The model I advocate goes further and claims that there are no traditional morphemes at all. It is not only so-called “functional morphemes” that lack phonological features, but also so-called “lexical morphemes” (i.e. roots). Therefore, it makes no sense to assert that syntax combines morphemes. I will argue in the following sections that syntax combines functional categories and conceptual elements (and that it is inadequate to call these entities morphemes). It is thus a radically “late insertion” model, in which the exponents are phonological words and in which morphemes belong to phonology. Morphemes are the building blocks of externalization and they lack syntactic and semantic properties. In this sense the model adopts what I call a “generalized nonanalytic listing” (using nonanalytic listing in the sense of Bermúdez-Otero 2012).
In the framework of Distributed Morphology each language has three lists: the Vocabulary, the list of Syntactic Terminals (i.e. “The list containing the Roots and the Functional morphemes”, Embick 2015: 20) and the Encyclopedia (i.e. “the list of special semantic information”, Embick 2015: 20). In the model I am going to develop only two lists would be specific to each language (the Vocabulary and the Encyclopedia). The “list of syntactic terminals” is conceived as universal, because, as I will argue, it is not formed by morphemes (language specific units), but is made up of elements of the conceptual-intentional system (C and F), and these elements are unrelated to possible differences in externalization. In this model the “presyntactic lexicon” is universal, not language-specific, and belongs to the conceptual system.
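The division of labor just described can be made concrete with a small sketch. It is only an illustration under simplifying assumptions that are mine and not part of the proposal itself: a categorized derivation is encoded as a (functional category, concept) pair, the labels PAIN, DOG, n and v are invented, and the names ILanguage and UNIVERSAL_TERMINALS are hypothetical. What the sketch is meant to show is merely that the inventory of terminals is shared by all I-languages, whereas the Vocabulary and the Encyclopedia are stored per language.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Assumption: a categorized derivation is encoded, very crudely, as a
# (functional category, concept) pair.
SWord = Tuple[str, str]

# Universal presyntactic inventory: conceptual elements (C) and functional
# categories (F). The particular labels are invented for illustration.
UNIVERSAL_TERMINALS = {
    "C": frozenset({"PAIN", "DOG"}),
    "F": frozenset({"n", "v"}),
}


@dataclass
class ILanguage:
    """Hypothetical container for the language-specific part of an I-language."""
    name: str
    # Language-specific list 1: phonological words that externalize s-words.
    vocabulary: Dict[SWord, str] = field(default_factory=dict)
    # Language-specific list 2: special (idiosyncratic) semantic information.
    encyclopedia: Dict[SWord, str] = field(default_factory=dict)

    @property
    def terminals(self):
        # Not stored per language: every I-language sees the same inventory.
        return UNIVERSAL_TERMINALS


spanish = ILanguage("Spanish", vocabulary={("n", "PAIN"): "dolor"})
english = ILanguage("English", vocabulary={("n", "PAIN"): "pain"})
```

On this toy encoding, Spanish and English differ only in what they list as the exponent of the same categorized concept; nothing in the terminal inventory distinguishes them.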
In the scheme of Figure 1 I have assumed that functional categories also belong to the conceptual system. This is a plausible possibility in a model in which functional categories are always interpretable. An interesting possibility is suggested in Sigurðsson (2011), according to which UG provides a Universal Lexicon formed only by an initial root (Root0) and an initial functional feature (Feature0), analogous to Chomsky’s (2008) Edge-Feature (EF). Both Sigurðsson and Chomsky seem to assume that F0 or EF “belong” to the computational system. In the following I will assume that there is a universal inventory of interpretable functional categories, as well as a universal set of atomic conceptual elements. The phylogenetic (or ontogenetic) origin of these elements is not relevant here.
As already mentioned, the hypothesis of the single engine is not original. It is based on the already long (and robust) non-lexicalist tradition of generative grammar: the lexical syntax of Hale and Keyser (1993, 2002), Distributed Morphology (DM) (Halle and Marantz 1993; Marantz 1997; Embick and Noyer 2007; Embick 2015), the Exo-skeletal model (XS) of Borer (2005a, 2005b, 2013) or Nanosyntax (NS) (Starke 2009). The essential feature of all these approaches is the (minimalist) idea that words with internal structure are formed by the same principles and mechanisms as other syntactic structures. The main problem that these traditions have to face is to account for the marked syntactic, semantic and phonological differences that exist in all languages between, on the one hand, complex words and, on the other hand, phrases and sentences, as work in the lexicalist tradition has shown. The proposal I am developing aims to reconcile both these views.
4 Beyond the lexicalism / anti-lexicalism controversy: Three auxiliary hypotheses
The idea that a language consists of the systematic matching of sounds and meanings is common sense. In all theoretical approaches the lexicon is precisely this: a collection of phonological representations paired with a collection of meanings. This property of the lexicon, exemplified by the Saussurean theory of the linguistic sign, is repeated in all approaches to language, regardless of their orientation. Even in generative grammar the role of the lexicon is similar; in any of its various models the claim is made that syntax is nourished by a lexicon from which syntax takes the items that are combined to build larger structures. Of course, how this happens is the subject of lively controversy, and has been the focus of the dispute between lexicalist and anti-lexicalist approaches at least since Chomsky (1970). In any case, it is clear that even anti-lexicalist approaches assume a lexicon (a systematic pairing of meanings and sounds) before syntactic computation, although in this case not formed by words, but by smaller units, typically morphemes. The central idea of the current contribution, in line with the nanosyntactic approach, is that such an assumption is wrong. The specific alternative I suggest is that in human language there is no meaning-sound matching before syntax. In other words, meaning and sound do not match directly, but only through the computational system.
The present proposal draws on the nanosyntactic concept of phrasal spell-out. NS focuses on the findings of contemporary syntactic theory according to which basic syntactic nodes are often sub-morphemic, with the result that in such a case: “morphemes and words can no longer be the spellout of a single terminal. Rather, a single morpheme must ‘span’ several syntactic terminals, and therefore corresponds to an entire syntactic phrase”. If morphemes are the realization of complex syntactic structures, then it follows that “entire syntactic phrases are stored in the lexicon (not just terminals) and it also means that there cannot be any lexicon before the syntax – i.e. syntax does not ‘project from the lexicon’”.
Williams (2007) notes that the lexicalist hypothesis does not necessarily imply renouncing syntax as a means of explaining the internal structure of words, but states that the syntax which does so is at least partly different from the syntax explaining sentence structure. The approach of Ackerman and Neeleman (2007) is explicit: “Are the generative systems that produce words and phrases identical or distinct?” (2007: 325). They conclude in favor of the second option. Aronoff (2007) expresses a similar opinion: “the scope of syntax-based logical compositionality should not be extended below the lexeme because lexeme-internal structure (however it is described) is different from lexeme-external structure” (Aronoff 2007: 805).
It is important to note that Aronoff and others propose a lexicalist (lexeme-based) approach in contrast to an anti-lexicalist one, identifying the latter view with a morpheme-based approach. And what I want to argue is that in this sense Aronoff, Anderson, and others are right. I suggest that what is empirically inadequate is not anti-lexicalism, but the fact that it is based on the morpheme. My proposal here is both an anti-lexicalist and also a non-morpheme-based approach (i.e. “a-morphous” in Anderson’s 1992 expression).
Clearly, the null hypothesis is that there is only one generative component in language. Since there is no doubt about the generative capacity of syntax, the introduction of generative mechanisms in morphology and the lexicon is redundant. Of course, the adoption of the lexicalist hypothesis is not a suicidal renunciation of theoretical elegance, but the result of the need for descriptive adequacy, and a means of limiting the inadequate over-generation of neo-constructionist models. Among the main arguments in favor of a lexicalist theory, two can be highlighted: the phenomena of lexical integrity (which make part of the internal structure of words invisible to syntactic operations), and the limited productivity of many derivational and compounding processes (i.e. the typically degraded compositionality of complex words). The present approach suggests that the hypothesis that the word is the smallest unit of pairing with SM could provide an explanatory framework for this (apparently) restricted and idiosyncratic dimension of the internal syntax of words within a purely non-lexicalist context.
This general proposal is based on three auxiliary hypotheses. The first is that (i) the units forming the I-lexicon in Figure 1 are actually phonological words and not morphological units (affixes or roots). This would follow from the second auxiliary hypothesis, which states that (ii) syntactic derivations (resulting from the interaction between the conceptual-intentional system and the computational system) may only be connected to the sensorimotor system after syntactic words have been built. That is, for something to be a phonological word (p-word) it needs to be a syntactic word (s-word). This could be represented as s-word > p-word, to be read as ‘every p-word is an expression of at least one s-word’. The motivation for such a hierarchy follows directly from the FL asymmetry described in Section 2, and is consistent with the idea that words are fragments of structure which are materialized with prosodically defined units, p-words. The third auxiliary hypothesis proposes that (iii) the s-word is a categorized fragment of syntactic derivation. Thus, I assume that a requirement to be a phonological word (that is, to have a “signifier”) is to be a syntactic word, and a requirement to be a syntactic word is to belong to a syntactic category, that is, a functional category must have attached to a syntactic derivation, thereby categorizing it (see Section 5).
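The way the three auxiliary hypotheses constrain one another can be sketched as a toy data structure. The sketch rests on assumptions of my own: the names Derivation and ILexicon, the methods link and externalize, and the representation of a derivation as a tuple of labels plus an optional category are all hypothetical and serve only to make the hierarchy s-word > p-word explicit.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Derivation:
    """A fragment of syntactic derivation (hypothetical toy representation)."""
    material: Tuple[str, ...]        # conceptual and functional content
    category: Optional[str] = None   # categorizing functional head, if any

    def is_s_word(self) -> bool:
        # Hypothesis (iii): an s-word is a categorized fragment of derivation.
        return self.category is not None


class ILexicon:
    """Hypothesis (i): the listed units are whole p-words, not roots or affixes."""

    def __init__(self) -> None:
        self._entries: Dict[Derivation, str] = {}

    def link(self, derivation: Derivation, p_word: str) -> None:
        # Hypothesis (ii): only s-words may be connected to the SM system.
        if not derivation.is_s_word():
            raise ValueError("only categorized derivations (s-words) can be linked to a p-word")
        self._entries[derivation] = p_word

    def externalize(self, derivation: Derivation) -> Optional[str]:
        # Returns the stored p-word, or None if nothing is listed.
        return self._entries.get(derivation)
```

In this encoding an uncategorized fragment can never acquire an entry, so every listed p-word is by construction the expression of an s-word, while nothing forces every s-word to have a p-word of its own.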
A frequent criticism of lexicalist models is that when describing words they tend to favor the prosodic over the semantic dimension (since their meaning need not be atomic) or over the syntactic dimension (because their structure need not be atomic). My proposal is in keeping with this conclusion, but avoids its problems, in that it postulates that the phonological word is the only listed entity (in the I-lexicon), and therefore it is naturally devoid of any special semantic or syntactic status. In fact, the hypothesis formulated is perfectly consistent with the scheme of Figure 2, taken from Borer (2013: 16), which is the scheme that the author expressly develops in her monograph.
In such a scheme, P-RaD stands for “Phonological Rule Application Domain” which, according to Borer, would correspond to the prosodic domain of a single main stress. Our hypothesis adds that the syntactic elements to the left of the P-RaD have no prior direct connection with the SM system, and that only categorized derivations (s-words) can be P-RaDs.
To properly evaluate the overall hypothesis and the auxiliary hypotheses we need an explicit theory of what syntactic categories are, and of what is categorized. This is the aim of the next section.
5 Categorizing concepts
A crucial assumption on which the present proposal is based is that the computational system does not combine lexical units (whether morphemes or words), but only computes functional categories. However, since syntax ends up producing derivations readable by the conceptual-intentional system (i.e. interpretable), it is logical to think that a basic syntactic operation must be to connect the computational system itself with conceptual elements. Let’s call this operation categorization. In more explicit terms, categorization consists of the merging of a concept (taken from the conceptual system) and a basic functional category.
5.1 Why concepts and not roots?
At this point it would be reasonable to ask why I am assuming that what is categorized are concepts and not, as has become standard in the literature, roots. Up to now I have been closely following the (more or less divergent) neo-constructionist models (such as DM and XS), but with a crucial difference: instead of the categorization of roots I propose the categorization of concepts. The reason for this discrepancy is clear: because I take seriously the asymmetry described in Section 2, the theory I am outlining is not morphological, but purely syntactic. Perhaps a problem with neo-constructionist models is that they are not sufficiently emancipated from morphology. Recall that my central hypothesis is that there is no link between sound and meaning below the level of the syntactic word, so it makes no sense to introduce purely morphological elements (such as roots and affixes) into syntactic derivations. In any case, the idea that what is syntactically categorized are conceptual elements is implicit in the discussion about the nature of words in the discourse of anti-lexicalist authors: “Natural language syntax operates on units that are standardly characterized as bundles of features. Such features are lexicalized concepts. Syntax creates ever-larger molecules by combining featural atoms through iterated use of Merge” (Boeckx 2008: 63, [my italics]).
Let us recall now that as early as the thirteenth century the speculative grammarians known as Modistae (see Bursill-Hall 1971) claimed that the Latin terms doleo ‘to hurt’ and dolor ‘pain’ had the same meaning but different modi significandi (‘modes of meaning’). At the same time, they said that meanings are not relevant for grammar, which focused on the different modes of meaning. When I argue that a functional category takes a concept as a complement and makes it syntactically computable, I mean that the essential function of syntax is to establish computations between concepts by means of functional categories. In other words, syntax is the only way to combine independent concepts to produce new and more complex concepts (and, ultimately, thought). See Pietroski (2011, 2012) for a development of these ideas.
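The doleo/dolor contrast lends itself to a minimal sketch of categorization understood as the merging of a functional category with a concept. Everything in the sketch beyond that idea is an assumption made for illustration: the names Concept, SWord and categorize are hypothetical, the category labels n and v are placeholders, and the conceptual label PAIN simply stands in for whatever the two Latin words share.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """An element of the conceptual system; it has no phonology of its own."""
    label: str


@dataclass(frozen=True)
class SWord:
    """A categorized concept: the output of merging a category with a concept."""
    category: str
    concept: Concept


def categorize(category: str, concept: Concept) -> SWord:
    # Toy stand-in for Merge(functional category, concept).
    return SWord(category, concept)


PAIN = Concept("PAIN")
dolor = categorize("n", PAIN)   # nominal mode of meaning
doleo = categorize("v", PAIN)   # verbal mode of meaning

# Same concept, different modes of meaning, hence distinct syntactic objects.
assert dolor.concept == doleo.concept
assert dolor != doleo
```

Note that the Concept object carries no phonological information at all; on the view defended here, a link to sound could only be stated over the SWord objects.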
In fact, Embick and Marantz’s (2008) categorization assumption (CA) establishes conditions for roots very similar to those we are taking now for the concepts that the computational system recruits from the conceptual system:
Roots cannot appear (cannot be pronounced or interpreted) without being categorized; they are categorized by merging syntactically with category-defining functional heads. If all category-defining heads are phase heads in Chomsky’s (2001) sense – that is, if they are heads that initiate spell-out – the categorization assumption would follow from the general architecture of the grammar. (Embick and Marantz 2008: 6)
But note that this condition is much more natural if applied to conceptual elements than if applied to roots, because roots are defined as fragments of words that do not include categorizing morphology, a circular definition. Moreover, as I have suggested, it is in fact the architecture of FL which imposes the CA, and does so in the hierarchy s-word > p-word by virtue of the asymmetry discussed in Section 2.
Allow me to persist with this idea for a moment. If we consider the very reasonable observations of Panagiotidis (synthesizing the literature on root categorization), it can be seen that they are much more applicable to concepts than to roots: “interpretable categorial features, borne by categorizing heads, define the fundamental interpretive perspective of their complement, thus licensing root material” (Panagiotidis 2015: 78). In fact, later on he says: “grammatical categories, such as ‘noun’ and ‘verb’, are particular interpretive perspectives on concepts” (Panagiotidis 2015: 84), and he suggests that given that “uncategorized roots are FLN-extraneous” they “would not be recognized at the interface between syntax and the Conceptual-Intentional/SEM systems” (Panagiotidis 2015: 94). But again, this restriction seems more reasonable if applied to concepts than to roots. Stating that “roots are essentially ‘imported’ into the syntactic derivation” (Panagiotidis 2015: 94) fits perfectly with the idea that his roots are actually concepts, as indeed he himself seems to suggest: “the ability of FLN to manipulate roots enables it to denote concepts and, ultimately, to be used to ‘refer’” (Panagiotidis 2015: 94).
In the present proposal, an essential feature of concepts is precisely that they are not directly connected to the sensorimotor system. What dolor and doleo share from the semantic point of view, although it is obviously crucial to understand these words, is not a linguistic unit, that is, not a sound/meaning pairing, because the concept only connects with sound when it is within a word. Note that this implies, contrary to the usual models, that the computational system (syntax) is not fed on lexical units (be they conceived as roots, morphemes or words), but directly computes concepts by means of functional categories. Consequently, the assumption I make is that the connection between concepts and sounds (i.e. phonological representations) is necessarily mediated by syntax, and therefore that only categorized concepts (i.e. syntactic words) can have a relation to the phonological component. Borer, although proposing a similar model in many respects, reasons in the opposite direction. She defines roots as phonological indexes that have no meaning. In Borer’s model roots only receive meaning when realized as phonological forms. But there is something contradictory in such a concept of root, as Borer herself seems to appreciate: “there is also a theoretical claim put forth here which is likewise less than self-evident: that a root in itself need not, and possibly cannot, serve as an independent domain for phonological spellout” (Borer 2013: 24).
Indeed, in the world’s languages it is common that roots are not pronounceable in isolation, which lends credence to the hypothesis that what enters the syntactic computation is the “concept” and not the “phonology” of the “root”. In fact, considering that roots have only phonological form clearly implies that roots are purely morphological objects, not syntactic ones, which is what our model predicts for roots (and morphemes in general).
On the other hand, if categorization has the effect of imposing an “interpretive perspective” on what is categorized, then it seems more reasonable to think that what is categorized is a concept, and not a root. For a concept it is natural to receive an interpretive perspective, whereas for a root it is not (especially if defined as a phonological index). The reason for this state of affairs, as I have suggested, relates to the asymmetric architecture of the essential components of FL. If a concept is only interpretable if categorized (assuming the arguments about roots of the DM tradition), then the categorization requirement is imposed by the CI system, which is primary with respect to the SM system.
5.2 Why are concepts categorized?
The categorization assumption is indeed a stipulation, but has some interesting (although speculative) theoretical foundation. The basic idea is that categorization makes formally homogeneous concepts that by their very nature are heterogeneous and incompatible. As Boeckx (2008) has suggested, categorization (lexicalization) imposes a common format to all concepts. This would allow lexical items to combine under their new shared format, rather than being limited to their natural affinity (i.e. purely semantic or conceptual). This in turn would mean that concepts that originally reside in different mental modules (and that possibly would be opaque to each other) can be combined to give rise to new concepts:
It is quite possible that what is at first a formal restriction on lexical items is the source of a cross-modular syntax of thought – giving rise to a full-blown language of thought, arguably the source of our Great (mental) Leap Forward at the evolutionary scale. (Boeckx 2008: 78)
Following the same line of thought, Ott (2009) notes a certain consensus in the field of comparative psychology on the fact that humans and other species share much of their conceptual systems. As Hurford has observed, “some (not all) of a human system of common-sense understanding precedes a system of language, both ontogenetically and phylogenetically” (Hurford 2007: 87, cited in Ott 2009), which would imply that such systems are not part of the evolution of human language, but predate such a faculty. It seems that both human neonates and other animals have considerable conceptual abilities, but animals, unlike what happens as humans develop, seem unable to integrate these various “mental languages”. This is what Ott (2009) calls Hauser’s paradox. Ott suggests that the ability to associate concepts with words (lexicalization) is the key to explaining this paradox, in the sense that concepts could be combined beyond their modular constraints, resulting in a productive system that would transcend the limits of core-knowledge domains, such as social relationships or spatial reasoning:
If these speculations are on the right track, the significant cognitive gap between humans and non-linguistic animals is not the result of a profound remodeling of the pre-linguistic mind. Rather, the sudden addition of recursive syntax, paired with a capacity for lexicalization, plausibly led to the explosive emergence of symbolic thought that paved the way for modern human behavior. (Ott 2009: 267 [my italics])
That “capacity for lexicalization”, which is precisely what in this paper I have identified with categorization, is thus a syntactic operation on concepts. In this context, it is not strange that the first phase of computation (categorization) is the minimal phase connected with the phonological component. And it is therefore expected that the first phase is memorized for the purpose of interaction with other speakers. But what is matched is not a “sense” and “sound”, but a morphophonemic structure (a p-word) and a syntactic structure (including an s-word).
The lexicon, thus understood, is not the input of the processes of syntactic derivation, but part of the output. In a certain sense one could say that syntax describes a kind of loop, because it produces derivations from concepts and “returns” them to the CI system creating new concepts (as shown by the vertical arrows in Figure 1). As Boeckx and Ott suggest, syntax decapsulates conceptual systems of various types and puts them at the service of a single computational system. In our specific version, syntactic categories take concepts belonging to different cognitive niches and make them formally homogeneous and computable. That is, s-words “rescue” concepts from “darkness”, multiplying the computational power of our Kind in an unexpected way in the natural realm.
5.3 How are concepts categorized?
Henceforth I will assume Baker’s (2003) theory of syntactic categories, with some modifications adopted from Panagiotidis’ (2015) model. Although these two models are different, they share a crucial trait: in both of them categories (or categorial features) are interpretable, in the sense that they have a specific function within the CI system (to provide “interpretive perspectives on concepts”, according to Panagiotidis’ formulation).
Following Baker (2003) I assume that nouns, verbs and adjectives (N, V and A) are the only lexical categories that exist, but following Panagiotidis’ (2015) arguments, I do not assume that A is a category of the same level as N and V, or that it can be considered the default lexical category, as proposed by Baker. Thus, N and V are basic functional categories that take a concept and turn it into a syntactic word. It could be said, then, that the functional categories N and V are different “flavors” of what Chomsky (2008) calls an Edge-Feature. In this sense, the syntactic category is the property that makes a concept computable. In the likely event that categories themselves are interpretable, then we could say that the computational system (the Merge function implemented through an Edge-Feature) has the mission of building more complex concepts (such as dolor or doleo) which previously would not be available for the mind, and, of course, for language.
The computational system (as indicated by the vertical arrows in Figure 1) links concepts together and creates more complex concepts without adding anything new, apart from the structure. The rationale for the s-word > p-word hierarchy is based precisely on this fact: if only the concepts included in s-words are relevant for the language of thought (a primary relationship), it is expected that only what is (at least) an s-word will be susceptible of externalization (a secondary relationship). Another way of putting it would be to say that s-words exist prior to p-words and independently of them.
The first operation of syntax is therefore that of converting concepts into syntactic words (N or V). Following Panagiotidis (2015) I will assume that N and V assign to the concept they categorize a particular interpretive perspective (e.g. “extension in space” in the case of N and “extension in time” in the case of V). Syntactic words, under conditions which vary from language to language, are externalized, that is, are paired with phonological representations (phonological words), and these representations will eventually be stored.
The relevant question now is whether this global vision can explain the typically idiosyncratic character of words, that is, whether it is the fact that words are the minimum limit for externalization which explains their idiosyncrasy.
6 Words are regular syntactic constructions
This model predicts that the interpretation of every s-word always has a compositional part (at least concept and category). No word is an atom syntactically and semantically, though it may be so morphologically. However, this does not imply that all complex words are fully compositional. In fact, typically they are not (consider, for example, recital or irascible). This fact (well known but still surprising) would follow from the property that singles out words in this model: that they are the smallest units of connection between meaning and sound (i.e. the smallest units of linguistic memorization).
6.1 What makes words special?
The externalization of syntactic derivations requires that certain fragments of syntactic derivation be stored in the motor system, i.e. that they be integrated into memory linked to phonological forms. Traditional models (and many current ones) tend to identify the stored fragments with the terminal nodes (as shown on the right side of Figure 3), while the model I am proposing, along with the NS model, posits a different match, shown on the left side of Figure 3. In this diagram we see that the syntactic “atoms” (A, B, C, D, E) do not match the “phonological atoms” (W1, W2). Here I assume that the “phonological atoms” accessible to the computational system are not acoustic features or phonemes, but the smallest units of the prosodic hierarchy which are not sensitive to purely phonological information, that is, p-words.
Again I invoke the asymmetric linking of CS with CI and SM to explain this discrepancy: the relation between CS and CI is “optimal” and primary, while the relation between CS and SM is “imperfect” and secondary. The “size” of the smallest units of externalization (p-words) would be determined by inherent properties of SM (e.g. respiration) and is unrelated to the internal language of mind.
The main difference between this model and the DM, NS and XS approaches is that I assume that lexical entries have the minimum size of phonological words, that is, I conclude that traditional morphemes lack syntactic and semantic properties. I think it can also be argued that the lexical limit of “idiosyncrasy” is better predicted with this model than with the “first phase model” (e.g. Marantz 1997; Embick and Marantz 2008; Arad 2003 or Panagiotidis 2015). In these approaches it is stipulated that the limit of lexical idiosyncrasy is demarcated by categorization, understood as a first phase of syntactic derivation. Without prejudging here whether or not categorizing nodes are phases in the Chomskyan sense (see Fábregas 2014 for reasonable objections), it is important to note that the exception to compositionality cannot follow solely from the fact that categorization constitutes a phase, since higher phases (v, C) have no such effect, at least not systematically. In the present model the limit of idiosyncrasy is justified because the first connection of the computational system with SM is set in the syntactic word, and it is this first connection that involves memorization.
Note that in purely conceptual and computational terms (i.e. in the language of thought) no idiosyncrasy is possible, since all elements (conceptual and functional) are compositionally interpretable. There can only be idiomaticity when the meaning of a linguistic form (e.g. recital or irascible) does not follow from its (presumed) structure; but that can only happen, obviously, when the structure is associated with a phonological form. Idiomaticity does not exist in the inner language of thought, it only exists in the mismatches that occur between the syntactic structure (the meaning, ultimately) and the morphological and phonological structure, part of the SM system.
6.2 Deriving words
Adapting the vision of root categorization of DM to our model, we can say that basic functional categories (F) take a concept from the conceptual system and project through merge in a construction, as shown in Figure 4:
By definition F can only be V or N, which impose an interpretive perspective on the concept. Following Baker’s (2003) model, it can be assumed that the category V takes a concept as a complement and adds a specifier (a subject). Therefore, if F is V, we have the scheme of Figure 5:
This would be the basic scheme of an unaccusative verb or a fragment of a transitive verb, in the sense that the specifier (X) is the internal argument of the verb. To simplify the argument we assume that V is the category that in some theories is represented as Aspect (Borer 2005b) or Process (Ramchand 2008) and that establishes the relation between the internal argument and the process phase of the event. Although for clarity and convenience I use the traditional labels (V, N), these represent interpretable functional categories and are not only categorial labels. The syntactic categorization of the concept has two simultaneous consequences: it converts a previously isolated concept into a computable unit, and it categorizes it, determining its subsequent behavior in the derivation (and, ultimately, of course, its interpretation).
Let’s assume that the concept of the above scheme is the concept underlying the verb destroy. If the construction of Figure 5 is selected by v (corresponding to the category v of Chomsky 1995, voice of Kratzer 1996 or Init in Ramchand 2008), then we have a new verbalization of the derivation, with the addition of a new specifier (the external argument or initiator), as shown in Figure 6.
So far we have a derivation that roughly means that an argument Y initiates a process that happens to the argument X, a process that is defined by the selected concept (that of ‘destroy’). Note that the terminal nodes of this structure (without considering specifiers) are not words or morphemes, that is, they are neither listemes (the sound/meaning pairings of Borer’s 2005a, 2005b model) nor roots or affixes (as in DM and in Borer 2013). In fact, beyond the inventory of functional categories (which we assume to be universal and limited), there are no lexical entities involved in the derivation. The structure of Figure 6 is purely syntactic and could be externalized as a sentence fragment or as a single word. In other words, on this approach there is still no systematic matching of sounds and meanings; there are only conceptual entities and functional categories.
The hypothesis that the matching between sense and sound is produced only through syntax implies that only once a concept is categorized (i.e. once it is an s-word) is it a candidate to be linked to a phonological form (p-word). The structure in Figure 6, once the arguments have been evacuated, could be externalized in English with the p-word destroy, but not, for example, in Spanish. According to Spanish morphology such a sequence is unpronounceable as a p-word, since finite verbs in Spanish express (at least) tense and agreement features, which do not appear in the scheme. Thus, the derivation of a finite verb in Spanish must also reach the functional node T, as in Figure 7:
I have suggested the hypothesis that only categorized derivations can have phonological realization. This is a necessary but not sufficient condition. The particular range of materialization of heads in a structure like that of Figure 7 will depend on the morphological structure of p-words (i.e. on the extent of their paradigms). According to this hypothesis, the concept shown in Figure 7 would not be a possible candidate for externalization (in any language), so we assume that it is incorporated into its head (V in this case). V would already be a possible candidate, but not in Spanish. Therefore V incorporates into v, but the result remains unsatisfactory in Spanish. Only an incorporation into T will bring together in the same continuous span of heads all the features expressed by a Spanish verbal form, such as, for example, destruyó ‘(he/she) destroyed’ (ignoring for the moment singular and third person phi features). Only then can the materialization (and linearization) of the span [T[v[V concept]]] with the p-word destruyó happen in Spanish. Note that we can consider the structure in Figure 7 as a schematization of the computational processes that allow interpreting the p-word destruyó. Therefore, we can say that the diagram in Figure 7 is also the “meaning” (the signifié in the Saussurean sign) of the “signifier” destruyó. Thus, the I-lexicon entry of destruyó would include, minimally, the morphological and phonological elements needed to build the exponents (p-words) of the verb paradigm, that is, the “signifier”, connected to the structure fragment shown in Figure 7, that is, its “meaning” (which includes a link to a conceptual element, a perspective of interpretation, an agentive head, and a tense specification). In this model the meaning of lexical units (words) is not stored in the I-lexicon, but is always “calculated” in the interaction between the computational system and the conceptual-intentional system.
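The incorporation procedure just described can be made concrete with a small sketch. It is only an illustration under simplifying assumptions, not an implementation of the model: the class `Span`, the function `externalize` and the toy `I_LEXICON` are hypothetical names introduced here, each language is given a single stored entry, and tense and phi features are collapsed into the bare label T.

```python
from dataclasses import dataclass

# A derivation fragment is modelled as a span of functional heads
# (highest head first) plus the concept they categorize.
@dataclass(frozen=True)
class Span:
    heads: tuple   # e.g. ("T", "v", "V")
    concept: str   # label for a conceptual element, e.g. "DESTROY"

# Toy I-lexicons (hypothetical entries): each p-word is linked to the
# whole span it externalizes, not to individual heads or morphemes.
I_LEXICON = {
    "English": {Span(("v", "V"), "DESTROY"): "destroy"},
    "Spanish": {Span(("T", "v", "V"), "DESTROY"): "destruyó"},
}

def externalize(concept, hierarchy, language):
    """Incorporate the concept into successively higher heads until the
    resulting span matches a stored p-word.

    `hierarchy` lists the available heads from the bottom up. The bare
    concept is never looked up, so an uncategorized concept can never
    be externalized. Returns (span, p_word), or None if no span matches.
    """
    heads = ()
    for head in hierarchy:
        heads = (head,) + heads          # incorporation into the next head
        span = Span(heads, concept)
        p_word = I_LEXICON[language].get(span)
        if p_word is not None:
            return span, p_word
    return None

if __name__ == "__main__":
    for language in ("English", "Spanish"):
        print(language, externalize("DESTROY", ("V", "v", "T"), language))
```

In this toy setting the English entry is found at the span [v[V concept]], whereas the Spanish entry is found only once the span reaches T; if the hierarchy stops at v, nothing can be externalized in Spanish. The syntactic steps are the same in both cases, and only the stored pairings differ.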
To complete the perspective, let us consider the projection of concepts as nouns, i.e. nominalization. According to Baker’s (2003) theory of lexical categories, nouns are characterized by what he calls the identity criterion, that is, sameness. Baker proposes that only nouns have a semantic component that makes it legitimate to ask whether X is the same as Y. Baker notes that nouns, verbs and adjectives have application criteria, so that by knowing what dog means we can identify which objects are dogs, knowing what blue means we identify what things are blue, and knowing what crying means we recognize who is crying. But, according to him, only nouns “set standards by which one can judge whether two things are the same or not” (Baker 2003: 101).
Moreover, Borer (2005a) develops a detailed and sophisticated theory of the functional categories that transform a root into a noun. Borer basically argues that nouns are the listemes dominated by three functional categories: CL(assifier), Q(uantity) and D(eterminer). The closest category to the listeme (the root) in her model is CL, which is related to plural number. According to Borer, plurality has the effect of fragmenting a continuous interpretation into what we might think of as reticles or partitions. The subsequent specification of the Quantity head (with a cardinal or a quantifier) selects a number of cells fragmenting the denotation of the noun.
It is interesting to note that Baker’s model, based on the (more diffuse) idea of the identity criterion, is also related to plurality and quantity. Indeed, by definition, plurality is a requirement to count. But it is not the only one. To count dogs, for example, it is not only relevant that there is a plurality of dogs, but that we are sure about whether the next dog is different or equal to the previous one (whether it is the same or not). Meanwhile, mass nouns cannot be counted, but can be measured. And in order to know how much water there is, we must be sure that the water we are measuring now is not the same water that we have already measured. Therefore, it seems reasonable to assume that the basic nominal category is the category that provides the identity criterion or sameness to nouns, and that legitimizes the upper category of plural number.
It does not seem unreasonable to assert that a requirement for plurality is singularity, so I assume that the first nominalizing category is singular number, which will be responsible for the identity criterion (sameness).  I also assume without further discussion that morphological gender in languages such as Spanish is the expression of singular number, that is, the categorizer.  Therefore, I will represent nominalization as the merge of a concept to category N (for singular number, which also – e.g. in Spanish – can be masculine or feminine). Plural number (n), equivalent to Borer’s CL head, will therefore be another superior functional category selecting N as a complement. Thus, the derivation of a singular noun such as libro ‘book’ in Spanish (or book in English) will be like the scheme on the left in Figure 8; the derivation of libros (or books) would imply the scheme on the right. As in Figure 7, the concept is incorporated into its head (N), where it gets singular number (i.e. identity criterion, nominality), and the resulting complex is incorporated into n, where it obtains plural number and count reading.
Ignoring for now the complex semantics of nouns, the key idea is that spell-out is always phrasal, regardless of the morphological complexity of words in a given language. Compare the representation of the words libro and book:
Note that unlike what is assumed in other neo-constructionist models, I am not assuming that morphemes are realizations of syntactic nodes, as would be tempting in the case of Spanish where there is a clear correspondence between N and the suffix -o and between the root libr- and the concept, a correspondence that is expandable to plural (with n being -s):
This is essentially the strategy of neo-constructionist models. However, the proposal I have set out here is based on the fact that syntactic nodes are typically sub-morphemic (Starke 2009). It is well known that the phenomena of fusion and suppletion, typical of many languages, challenge the claim that morphemes correspond to terminal syntactic nodes (see Anderson 1992, Chapter 3, for a full review). Moreover, although I am arguing that words are syntactic structures, it is also known that the claim that the internal structure of words can be explained as the result of the application of syntactic rules to constituent morphemes is inadequate (again, see Anderson 1992, Chapter 10 in this case, for a detailed discussion). This morphemic view, characteristic of models such as Selkirk (1982) or DM, implies that morphemes are manipulated by syntax to create words, and that the same syntax then continues to operate with words to create larger structures. No doubt these models are the most theoretically attractive and elegant, but perhaps they are not the most empirically adequate.
I therefore suggest that the schemes in Figure 9 are the correct ones, while the diagram in Figure 10 involves an inappropriate mix of syntactic and morphophonological entities. The central idea now is that the greater correspondence between both levels in Spanish is a simple historical accident, and therefore it is alien to the computational system. It is certainly true that the morphology of Spanish uses the scheme in Figure 7 to produce the p-word destruyó (and the diagram in Figure 10 to produce the p-word libros) in a different way from how the morphology of English does in producing destroyed (or books), but the syntax is alien to these differences, something that is clearly reflected only in the schemes in Figure 9.
The hypothesis that I propose is that the correlation between syntactic complexity (compositionality) and morphological complexity (analyzability) is contingent, accidental, and in a certain sense weak, something expectable if we take the asymmetric architecture of FL seriously.
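The contingency of this correlation can be illustrated with a toy comparison. The sketch below is merely expository and rests on assumptions: nested tuples stand in for syntactic structure, the names `P_WORDS`, `MORPHS` and `depth` are hypothetical, and the English segmentations are supplied only for the sake of the example (the Spanish ones follow the libr-, -o, -s correspondence mentioned above).

```python
# Nested tuples stand in for syntactic structure: (head, complement).
SINGULAR = ("N", "BOOK")          # concept categorized by N (singular number)
PLURAL = ("n", ("N", "BOOK"))     # plural n selects N as its complement

# Hypothetical I-lexicon entries: the same structures are linked to
# different p-words in each language.
P_WORDS = {
    "Spanish": {SINGULAR: "libro", PLURAL: "libros"},
    "English": {SINGULAR: "book", PLURAL: "books"},
}

# Morphological analyzability of each p-word (a purely morphophonological
# fact, invisible to the syntax in the present model).
MORPHS = {
    "libro": ("libr", "o"),
    "libros": ("libr", "o", "s"),
    "book": ("book",),
    "books": ("book", "s"),
}

def depth(structure):
    """Count the functional heads stacked over the concept."""
    count = 0
    while isinstance(structure, tuple):
        count += 1
        structure = structure[1]
    return count

if __name__ == "__main__":
    for language, entries in P_WORDS.items():
        for structure, p_word in entries.items():
            print(language, p_word, depth(structure), len(MORPHS[p_word]))
```

The syntactic depth of each structure is identical in both languages (one head for the singular, two for the plural), while the number of morphs per p-word matches that depth in the Spanish forms and not in the English ones.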
A brief description of so-called derivational morphology will better enable us to show the plausibility and scope of the present model.
6.3 Deriving derived words
I anticipated that the present model, although it is not lexicalist, implies a paradigmatic morphology, and this is true both as regards inflectional morphology and derivational morphology, unlike Borer’s XS model. However, I will assume Borer’s (2013) analysis (common with DM) of derivational affixes as (re-)categorizing functional heads, that is, as more specific versions (more complex in their constituent features) of the basic categories (N, V, and A). But I do not assume that syntax manipulates affixes, because in this model syntax only handles functional categories and concepts, not morphemes. The relation between (re-)categorizing heads and affixes, as stated at the end of the previous section, is not direct or deterministic, although it does often reveal a certain isomorphism and (as discussed in Section 8) it may have an important role in the acquisition, storage and use of p-words.
Thus, the structure that would provide interpretation to the p-word destruction (similarly for Spanish destrucción) would be that in Figure 11, consisting of the merge of the derivation of Figure 7 with the head N, instead of T:
The representation in Figure 11, inspired by Borer’s (2005b, 2013) proposal for so-called complex event nominals, is intended to show that destruction, as used in Hannibal’s destruction of Rome (or in the similar Spanish La destrucción de Aníbal de Roma) is not a noun formed from a verb, as a lexicalist theory would argue. Thus, there is no need to postulate a complex system of morphological adjustment rules, or a mechanism of argument inheritance. The eventive use of destruction implies that the concept merges with V, which creates an event on an argument (Rome), and which is then merged with v, creating an initiation by Hannibal of the event, this then being categorized as a noun, with identity criterion, i.e. with singular number (and, in Spanish, feminine gender). And this is exactly what the diagram in Figure 11 represents. Apart from many relevant details of the derivation (such as the system of case assignment to the arguments), the representation in Figure 11 shows that the meaning of destruction is not obtained from the lexical entry of the verb destroy and the subsequent application of rules of word formation, but is obtained, as in “normal” syntax, from the computation of concepts without grammatical properties through the basic operations of syntax and its functional categories. Figure 11 shows the subordination of an event to a noun, as would happen in the construction Hannibal’s action of destroying Rome (or in Spanish La acción de Aníbal de destruir Roma): in fact, the underlying structure of these NPs (in both English and Spanish) would be essentially the same as in Figure 11.
At this point I would like to make clear a common sense idea that, in light of the proposed model, may be tractable in a formal theory of language: complex words are actually the result of the reanalysis of regular syntactic constructions as “lexical units” (i.e. as syntactic structures associated with single p-words). The compositional meaning of destruction is action of destroying and indeed both expressions have roughly the same underlying syntactic structure (that in Figure 11).
If it is true that language change is limited to the component internalized from the linguistic environment (the I-lexicon, which is the only area in which realignments between meaning and sound can be produced), and if it is true, as our model establishes, that morphology is part of the internalized component (and is therefore the outcome of linguistic change), then it is inevitable to conclude that (inflected and derived) complex words are historical accidents, and therefore that morphemes are purely morphological residues of history. I said before that morphemes are not signs. We might now add that morphemes are actually “ghosts” (or “fossils”) of ancient words.
I will return to this in Section 9, but first we must address the crucial question of how a neo-constructionist model can account for “lexical integrity” effects and for the typical (and unexpected) lack of compositionality (and of productivity) of derivational morphology.
7 On the non-ephemeral syntax of words
Every neo-constructionist conception of complex words makes two predictions which are not met: (i) there should be a computational continuity between the internal and the external syntax of words, and (ii) the meaning of words with internal structure should be compositional. We will consider these separately, although they are closely connected.
7.1 The lexical integrity hypothesis
The apparent lack of derivational continuity between the internal and the external syntax of words (i.e. on the left and right sides of P-RaD in the scheme of Figure 2) is one of the most robust objections to a syntactic view of the internal structure of words. Ackerman and Neeleman (2007: 332), for example, argue that if the lexical and phrasal syntax were the same, then it would be expected that a noun that is incorporated into a higher affix could strand its complements or modifiers. Indeed, this is often impossible. Consider the Spanish example of (1):
If the internal structure of this derived word (zapatero ‘cobbler’) is built in syntax, then there is no way to explain why the example of (1b) is ungrammatical (which indeed it is, if intended to represent the sense ‘Cobbler that mends leather shoes’). The objection is relevant, but it only affects “morphemic” theories postulating that morphological word formation is performed in the syntax; that is, in our example, theories that derive zapatero ‘cobbler’ from zapato ‘shoe’, and that derive zapato from the root zapat-, according to the scheme in Figure 12.
Indeed, if the scheme in Figure 12 is a legitimate syntactic object, nothing should prevent the incorporation of N (zapato) into the superior N (-ero), producing (1b).  Any stipulation to prevent this process would add complexity to the system, eliminating the advantage over a lexicalist interpretation, or making the former indistinguishable from the latter. As Bermúdez-Otero (2012: 50) points out, this is like turning a component into the waste bin of another one.  However, the system I describe here, based on the hypothesis that word formation, even if it is done in the syntax, does not work with words, roots or affixes, can avoid such an objection while maintaining the null hypothesis that there is only one generative component in human language. Consider the derivation in Figure 13 as an alternative to the “morphemic” derivation of Figure 12:
The selected concept (the same concept associated with the Spanish word zapato ‘shoe’) is incorporated into V forming a verb (which would explain that the meaning of zapatero ‘person who mends shoes’ involves the activity of repairing shoes). V, in turn, is selected by v, introducing an initiator of the event (represented by an empty pronoun PRO in the scheme). Following Fábregas’ (2012) analysis of agentive derivatives with the Spanish suffix -dor (‘-er’), we could stipulate that the affix -ero corresponds to a set of features that include a D feature (which enables it to saturate an argument position, in this case the event initiator) and an N feature (which makes it a nominalizer). Following the derivation proposed by Fábregas, N is “re-projected” taking v as a complement and nominalizing it.
In line with the left diagram in Figure 3, above, I postulate that a structure like that in Figure 13 is associated in memory with the phonemic form zapatero, which allows us to say that this structure is the meaning of the p-word. Note, crucially, that a similar structure would underlie the noun phrase Persona que repara zapatos ‘Person who mends shoes’ (which includes a relative clause). What matters now is that the speaker links such a fragment of syntactic structure to a single p-word (zapatero). Or to put it another way, that such a fragment of derivation corresponds to a p-word in the speaker’s I-lexicon. Syntax is thus a uniform computational system that allows us to “calculate” the meaning of linguistic expressions combining functional categories and concepts, while ignoring the issue of whether or not there are p-words available for particular structure fragments, or whether they are morphologically complex or not. The Spanish word zapatero is a nominal element with a “relative clause” which is materialized as a single p-word. We might also point out here that there is no place in Figure 13 for a possible prepositional complement of zapato, precisely because zapato does not appear as a noun in the derivation. And this fact explains the impossibility of Example (1b) without throwing weeds into the neighbor’s yard.
The most important implication is that, contrary to intuition (that of both speakers and linguists), I am assuming that zapatero is not derived from zapato, but that the two words share the same concept, part of their structure, and also part of their phonological form (i.e. they form part of a paradigm), while they differ in the functional categories involved in their internal structure.
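The relation between the two words can be sketched in a few lines. This is only a schematic illustration under stated assumptions: nested tuples stand in for the structures, the specifier PRO and the D feature of the nominalizer are omitted, and the names `ZAPATO`, `ZAPATERO`, `constituents` and `concept_of` are hypothetical.

```python
# Nested tuples stand in for syntactic structure: (head, complement).
# Specifiers (e.g. PRO) and the D feature of the nominalizer are omitted.
ZAPATO = ("N", "SHOE")                    # concept categorized directly by N
ZAPATERO = ("N", ("v", ("V", "SHOE")))    # concept categorized by V, then v, then N

def constituents(structure):
    """Yield the structure itself and every constituent embedded in it."""
    yield structure
    if isinstance(structure, tuple):
        yield from constituents(structure[1])

def concept_of(structure):
    """Return the concept at the bottom of a structure."""
    while isinstance(structure, tuple):
        structure = structure[1]
    return structure

if __name__ == "__main__":
    # Both words share the same concept ...
    print(concept_of(ZAPATO) == concept_of(ZAPATERO))
    # ... but the noun zapato is not a constituent of zapatero.
    print(ZAPATO in constituents(ZAPATERO))
```

Under these assumptions the two structures share the concept, but the structure of zapato is not contained in that of zapatero, so there is no nominal constituent inside zapatero that could license (and strand) a complement of its own.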
The prediction of this model is, therefore, that the interpretation of words, beyond Saussurean arbitrariness, is always compositional. The explanation for the abundant cases in which this expectation is not met is that in such examples the structure associated with the phonological form is different, regardless of morphological structure. The present approach then, following the DM framework, adheres to the principle of Full Decomposition: “No complex objects are stored in memory; i.e. every complex object must be derived by the grammar” (Embick 2015: 17). Nevertheless, I disagree with the notion of “complex object” used in this framework. Embick argues that complex objects are those objects “that consist of more than one morpheme” (Embick 2015: 17), assuming that they are derived syntactically “every time they are employed” (Embick 2015: 17). Under the present approach, the principle of Full Decomposition applies to the internal elements of words, but assumes that they are purely syntactic (functional categories and categorized concepts) and not morphological. Thus, I will show that the hypothesis that “every word and every phrase is built every time it is used” (Embick 2015: 17), as well as being theoretically interesting, is in fact more empirically adequate once we adopt the (“a-morphous”) alternative notion of “complex object”.
Indeed, it may be objected that Full Decomposition only works for compositional derivatives (or for simple words), which does not avoid the fact that we have to postulate a specific lexical entry when the meaning of a derivative includes features not present in the structure or when parts of the (alleged) structure are not interpreted. This, as we know, is extremely common. Indeed, the Spanish word zapatero typically means ‘person who mends shoes’, but not things like ‘person who buys shoes’, ‘person who draws shoes’ or ‘person who falsifies shoes’. However, a scheme like that in Figure 13 predicts that zapatero should also have these meanings (and anything else that involves a transitive verb). 
It is important at this point to return to the essential function of syntax according to the diagram in Figure 1: to create more complex concepts from simpler concepts. The specific meaning of zapatero ‘person who mends shoes’ (as is the case with the meaning of most derived words) is a complex mixture of compositionality and idiosyncrasy. Therefore, if (as we have assumed) interpretation is always compositional, we must stipulate that the structure in Figure 13 has additional information, for example, assuming that the verbalizer V is associated with a certain conceptual element and not others, as in the scheme in Figure 14:
Figure 14 is intended to show that the structure associated with the p-word zapatero in Spanish (i.e. its meaning) specifies that the event involved in the verbalization (the Concept-2 in the scheme) is related to the concept ‘repair’ (or ‘make’) but not to ‘falsify’ (or ‘destroy’, etc.). Note that no complexity is added to the system, as Figure 14 also would be the underlying structure of the analytical version Persona que repara zapatos ‘Person who mends shoes’. Syntax is capable of building very specific and refined concepts (which is no surprise, in that this is basically its raison d’être), with the peculiarity that only a very small number of these complex and specific concepts, contingently, are associated with memorized phonological forms. Thus, in Spanish, the complex concept ‘person who repairs shoes’ (whose derivation is that in Figure 14) has an associated p-word (zapatero), while the complex concept ‘person who destroys shoes’ does not, although both complex concepts are perfectly legitimate and compositionally generated by syntax. As I noted above, the diagram in Figure 14 is actually the non-lexicalist version of the Saussurean linguistic sign: the structure (including the basic concepts) is the signifié and the p-word (zapatero) is the signifiant. The potential signifié ‘person who destroys shoes’ has no specific signifiant, and therefore it is not usually considered a sign.
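The contrast between complex concepts with and without an associated p-word can be made concrete with a small sketch. It is only an illustration under strong simplifying assumptions: the class `Derivation`, the labels `SHOE`, `REPAIR`, `DESTROY` and `N[+D]`, and the function `externalize` are hypothetical names introduced here, and a flat triple is a crude stand-in for the hierarchical structure of Figure 14.

```python
from typing import NamedTuple, Optional


class Derivation(NamedTuple):
    """Toy stand-in for a syntactic structure (assumed encoding, not the author's)."""
    concept: str      # object concept, e.g. SHOE
    event: str        # event concept tied to the verbalizer (Concept-2 in Figure 14)
    categorizer: str  # outermost categorizer, e.g. an agentive nominalizer


# Hypothetical fragment of a Spanish I-lexicon: only some derivations
# happen to be linked in memory to a p-word.
I_LEXICON = {
    Derivation("SHOE", "REPAIR", "N[+D]"): "zapatero",
}


def externalize(derivation: Derivation) -> Optional[str]:
    """Return the memorized p-word for a derivation, or None if there is none."""
    return I_LEXICON.get(derivation)


# Both derivations are equally well formed; only one has a p-word.
repairer = Derivation("SHOE", "REPAIR", "N[+D]")
destroyer = Derivation("SHOE", "DESTROY", "N[+D]")
```

In this toy setting `externalize(repairer)` yields the p-word zapatero, while `externalize(destroyer)` yields nothing, although both derivations are built in the same way; the second could only be externalized analytically, as a phrase.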
7.3 Morphological compositionality?
The problems of lack of compositionality begin when the interpretation of a morphologically complex word does not correspond to its supposed structure. But here the emphasized predicate is crucial. In a purely syntactic approach (as is the case here) the interpretation of words is always compositional (i.e. fully decomposable) because the morphological structure is irrelevant (invisible) for syntax. In fact, a morphologically complex word may have a relatively simple syntactic structure. For example, transmission with the meaning ‘gearbox’ would have a syntactic structure similar to that in Figure 8 (left side), while its compositional version would have a structure similar to that in Figure 11. Of course, the relation between morphological structure and syntactic structure is not arbitrary, but the fact that the relation is not arbitrary does not necessarily imply that it is isomorphic or that one structure determines the other.
In fact, they are structures of a different nature. Syntactic structure is hierarchical (the result of iterated merge), while the structure of derivative morphology is linear (i.e. concatenative).  The claim that morphological structure is the result of the historical reanalysis of syntactic constructions predicts such a difference, since the syntactic structure “belongs” to the computational system, while the morphological structure “belongs” to the sensorimotor system. And this is so because the morphological structure is the result of the externalization (linearization) of syntactic structure (as reflected in the left side of Figure 3, in which a flat sequence, W, corresponds to a fragment of hierarchical structure). If we consider the formulation made by Everaert et al. (2015) about the asymmetry discussed in Section 2, that is, “what reaches the mind is unordered, what reaches the ear is ordered” (Everaert et al. 2015: 740), we can conclude that morphological structure is part of the linearization of the hierarchical structures generated by syntax, and therefore it is not itself hierarchical, but linear.
This model predicts that there is a (non-deterministic, but psychologically real) correlation between morphological complexity and syntactic complexity, so that a greater degree of morphological complexity corresponds to greater syntactic complexity, and vice versa. This is clearly seen if we contrast the representation of book (Figure 9) and that of destruction (Figure 11). These representations clearly illustrate the correlation between the complexity of syntactic and morphological structures, and also allow us to understand the temptation of correlating these structures deterministically, a common feature of morpheme-based theories. However, it is easy to find, in all languages that have derivative morphology, morphologically complex expressions with unexpectedly simple underlying syntactic structures. Thus, we can compare the structure associated with transmission in its reading of a complex event nominal (left side of Figure 15) with the structure associated with the same p-word in its idiomatic reading ‘gearbox’ (right side of Figure 15, in which it is assumed that the concept involved is different):
Borer (2013: 500) proposes an explanation for the phonological form of the complex event nominal through an incremental derivation by phase from a hierarchical morphological structure, but it then remains unexplained why the non-compositional versions (or those with smaller internal structure, as in the case of R-nominals) have exactly the same morphological and phonological structure. The phrasal spell-out model that we assume, along with a paradigmatic conception of derivational morphology, explains this unexpected identity more naturally. On the one hand, the conception of derivatives as the externalization of phrases explains the tendency towards the correlation between syntactic complexity and morphological complexity (both in inflectional and derivative morphology). On the other hand, the fact that this externalization is the result of a historical reanalysis (see Section 9) explains the possible disparity of complexity in pairs such as those in Figure 15. Thus, the morphologically complex structure of transmission is explained as a consequence of the fact that it is the materialization of a complex derivation (‘the action of transmitting’), whereas the same morphological structure found in transmission ‘gearbox’ is explained as the result of the impoverishment of the original syntactic structure, again the result of a reanalysis. It is important to note that the impoverishment of the original syntactic structure involves the association of the derivation of the new interpretation with a different concept (in this case, the same one that would underlie the compound gearbox).
The vision of the lexicon that emerges from this model coincides with the notion of I-lexicon of Figure 1: a “passive” vocabulary of formants (p-words) associated with syntactic derivations produced by the computational system.  But if there are no word formation rules in the lexicon, and if syntax does not manipulate morphemes to create words, what is the source of complex words? What is the source of the intuition that some words are created from others, and that different words share a smaller number of constituents? A possible answer to these questions comes from two notions of traditional grammar: paradigm and analogy.
8 Paradigms: Productivity and analogy
Baker used the fact that “there is not always a simple relationship between the size of a morphological unit and the complexity of the syntactic node it corresponds to” (2003: 277) as an excuse not to reject the difference between morphology and syntax. My anti-lexicalist model does not deny that syntax and morphology are independent. On the contrary, it claims that they are radically different. It is true that, in a sense, the morphological structure of a word “tells a story” about its internal syntactic structure, but morphological structure neither determines syntactic structure nor is derived from it. The (unsurprising) general prediction is that the more complex the morphological structure, the more complex the syntax associated with a word, but nothing else. Of course, the morphological structure of (inflected and derived) complex words is not arbitrary, but the ultimate explanation of the internal arrangement of morphemes, though it stems from the syntactic structure, is conditioned by the historical evolution of p-words (see Section 9).
One way to capture this “weak” relationship between the two types of structures is by adopting a paradigmatic conception of the morphology of languages, in the traditional sense (see Robins 1959). In fact, hereinafter I use the expression paradigmatic in a general associative sense, one which dates back at least to Saussure (1916). The illustration in Figure 16, itself taken from Saussure (1916: 175), links paradigmatically enseignement ‘education’ with words that have the same root, with words of the same semantic field, with words that share the derivative suffix, and with words that have the same rhyme.
One possible way to understand this scheme is to argue that the associative relationships expressed with dotted lines represent areas of shared memory. Thus, enseignement shares the root with enseigner, enseignons, etc., the suffix with changement, armement, etc., (at least) part of the concept with apprentissage, éducation, etc., and part of phonology with clément, justement, etc. These shared memory areas are relevant in the processing and use of words, as experimental psycholinguistics has shown over the last half century.
Although the scheme in Figure 16 was published a hundred years ago, it expresses visually the same perception of reality that underlies modern (and more sophisticated) lexicalist conceptions of the relations between words, such as Jackendoff (1975), Aronoff (1976) or Anderson (1992), and paradigmatic versions of morphology (Stump 2001), all of them compatible (except in the conception of the underlying syntax of words) with the non-lexicalist model defended in the current contribution. Thus, one could say that the paradigmatic links reflected in Saussure’s scheme have the same function as Jackendoff’s (1975) lexical redundancy rules (updated in Jackendoff and Audring 2018). The vision of the I-lexicon of a language as a set of p-words paradigmatically associated with each other can capture in a natural and elegant way the analogical processes underlying the (usually limited) productivity of derivative processes and the role and nature of bound morphemes in languages.
Recall that, according to the model I have outlined, bound morphemes (roots and affixes) are neither signs nor do they have syntactic properties (they are invisible to the computational system), but are in fact fragments of phonology in the service of the storage and use of the exponents that make up the I-lexicon. Jackendoff’s notion of lexical redundancy rules, according to which each rule “expresses a relation among items stored in the lexicon” (Jackendoff and Audring 2018: 9) nicely captures this conception of morphemes, in the sense, for example, that the relation between transmit and transmission (in any of its readings) is never a derivative relation, but a paradigmatic relation between both words. Such a relation will be closer in the case of the complex event reading of transmission, in that in this case both forms also share much of the syntactic structure they materialize (something similar to the diagram in Figure 6). Similarly, the relation between impetuous and other -ous derived words in English is a (non-derivative) paradigmatic relationship, which explains speakers’ intuition that such words are part of the same paradigm whether or not there is (or they know) a base term, and at the same time it also explains the processes of spontaneous innovation (and subsequent possible historical changes) that characterize the derivational morphology of languages. In this sense, morphemes can be conceived as side products of linguistic changes that contribute to enhanced efficiency in the storage and processing of the phonological exponents that make up the I-lexicon of every language.
Jackendoff and Audring consider the terms cognitive and cognition as “sisters” in the sense that it cannot be determined which one derives from the other. The same applies to cases of “un-directionality” of the type in assassin and assassinate (Jackendoff and Audring 2018: 17–18): the apparent paradox that a morphologically derived term (assassinate) is involved in the definition of the alleged base (assassin ‘person who assassinates’) is resolved naturally in our model, since the conclusion that emerges from it is that all lexical relations are actually “fraternal”.
In Jackendoff and Audring’s model, along with the usual lexical entries, there are schemes (understood as impoverished lexical entries). According to the authors, “the role of a scheme […] is not to permit material to be omitted from other entries, but to confirm or codify or motivate generalizations among lexical entries” (Jackendoff and Audring 2018: 16). They argue that schemes can be of two types: generative and relational. But if all relations are fraternal, then relational schemes are redundant. Moreover, since we do not accept the “lexical” (i.e. “constructional”) conception of syntax of such a model, generative schemes are also dispensable in favor of the merge process of syntax, the only source of compositionality in human language.
But if the I-lexicon, as we have conceived it, consists solely of p-words clustered in dense networks of shared memory, and if syntax operates only with functional categories, then what is the source of the productive and compositional processes of inflectional and derivative morphology?
The only possible alternative is the (no less traditional) concept of analogy. It is not my intention to develop a coherent model of the concept of analogy (see Fertig 2013 for an updated review), but only to note that it is no less suitable for explaining the processes of lexical creation because it is a traditional concept. Of course, saying that analogy is the essential mechanism of lexical creation does not imply accepting that analogy has any relevance in explaining the syntax of human languages. In the model I am outlining, analogy has a range of action limited to the I-lexicon, in the sense that it is a process that creates forms, not syntactic structures. 
In simplified terms, an analogical process involves the extension of a similarity between two forms (A-A’) to a third form (B), creating a new one (B’), as shown in the traditional scheme in Figure 17, illustrated with a common case of paradigmatic regularization (the replacement of the etymological plural form kine, C in the scheme, with the analogical form cows).
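The proportional scheme A : A’ :: B : B’ can be sketched as a simple operation on forms. The following is a minimal illustration, not a model of analogy: the function name `solve_proportion` is hypothetical, the model pair stone/stones is an assumed stand-in for the A-A’ pair of the figure, and the treatment is purely suffixal and orthographic, so it ignores everything that makes real analogical change sensitive to paradigms and phonology.

```python
from typing import Optional


def solve_proportion(a: str, a_prime: str, b: str) -> Optional[str]:
    """Solve a : a_prime :: b : ? by copying the change in word-final material.

    Returns None when the ending that a loses is not found on b,
    i.e. when the similarity cannot be extended to b.
    """
    # Length of the part that a and a_prime share from the left edge.
    shared = 0
    for x, y in zip(a, a_prime):
        if x != y:
            break
        shared += 1

    removed = a[shared:]      # ending present on a but not on a_prime
    added = a_prime[shared:]  # ending present on a_prime but not on a

    if not b.endswith(removed):
        return None
    return b[: len(b) - len(removed)] + added


# Assumed model pair: a regular singular/plural pair extended to "cow".
analogical_plural = solve_proportion("stone", "stones", "cow")
```

Under these assumptions `analogical_plural` is the form cows, the regularized competitor of kine. Note that the function manipulates forms only: it says nothing about the syntactic structure assigned to the new form, which in the present model is the business of syntax.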
Analogy is therefore consubstantial to the concept of paradigm. In fact, it is arguably the essential mechanism of paradigm creation. The process of analogy exploits profusely the associative relations shown in Figure 16, but it is not a recursive generative mechanism. Only syntax can create a structure like that in Figure 11, associated, for example, with the word formation; that is, syntax is the only source of compositionality.
I have assumed that complex words are the result of the reanalysis of phrases as single p-words.  But obviously these processes do not occur constantly in the minds of speakers. In the case of the English word formation these processes did not even occur in the minds of Latin speakers: they were limited to learning and transmitting the p-word formation (or any other in the paradigm) for tens of centuries. Learning the word formation means building and memorizing a phonological form (which for the sake of simplicity we identify with a p-word) and assigning it a structure (i.e. a meaning), something that can only be done with a computational system. The computational system, as I proposed in Section 6, builds a derivation around a concept and determines its interpretation compositionally. It is the linking of this phonological form to a syntactic structure like that in Figure 11, together with the paradigmatic linkage to the word form (and its structure), that may explain how it is that the speaker, without having heard it before, can create a word such as causation (assuming he/she also knows the word cause).  What matters now is that the productive use of causation does not imply a rule of word formation, but rather the extension of the paradigm of cause by analogy with the paradigm of form, formation, forming, etc.
A possible objection here is that there is not much difference between an analogical process and a rule of word formation. However, I think it is contradictory to use the notion of rule and then to set many restrictions to block its application. Of course we can call any productive analogical process (with a very wide output) a “rule”, in the trivial sense that a rule expresses a regularity, but without assuming that it is the same kind of rule as “syntactic rules” (which, on the other hand, are not used in modern syntactic theories). Recall that I am not claiming that analogy creates the syntactic structure underlying causation, but only the p-word. Its structure is created by syntax. In addition, analogical processes are independently required. Consider, for example, back-formation processes (as in the case of sculpt from sculptor, by analogy with write-writer, etc.). It does not make much sense to speak in these cases of word formation rules.
Returning to the (imagined) innovation of causation, I argue that when that word is innovated, its meaning has to be strictly compositional, since the formation of the p-word causation occurs in the I-lexicon (in which there is no meaning, but only phonological forms linked to syntax). The fact that, in our example, causation is formed by analogy with formation means that causation is assigned exactly the same syntactic structure, the only difference being the identification of the basic concept, which is common with the word cause, again as a result of paradigmatic organization. The products of analogy are by definition compositional and only the “permanence” of p-words in the I-lexicon (i.e. memory) may involve changes in their structure due to the effect of reanalysis, typically structure impoverishment (as reflected in the diagrams in Figure 15 for the two meanings of transmission).
In fact, it is also possible that the syntactic analysis of a p-word is enriched, even during the life of a speaker, for example if the paradigm is extended. If a speaker of Spanish recently arrived in Aragon learns the Aragonese Spanish word laminero ‘sweet tooth’, he/she will surely analyze it as a concept plus a categorizer (as on the left side of Figure 18). But if the speaker later discovers that there is also the word lamín ‘sweet food’, he/she may introduce more structural complexity (perhaps an analysis in which a name is adjectivized, as on the right side of Figure 18).
It is plausible that the process of lexical acquisition follows this pattern, proceeding as far as possible to assign rich structure to morphologically complex words, freeing up memory, and rearranging and expanding the paradigms and the conceptual system itself.
Let us return to the structure in Figure 13 above, proposed for zapatero ‘cobbler’. I have assumed that it includes the same basic concept that would be involved in the word zapato ‘shoe’. In this case one might ask why the structure is not simply materialized with the word zapato itself. At this point the paradigmatic nature of the morphological theory involved can be appreciated. The structure of Figure 13 is not a system of word formation, but a syntactic structure which the speaker brings, so to speak, in order to interpret the meaning of the word zapatero in a given context. In the model I have presented (as in NS) the materialization of sequences of terminal nodes involves a competition between the forms “anchored” to a given concept. So, zapato ‘shoe’, zapatería ‘shoe store’, zapatero ‘cobbler’, zapatilla ‘slipper’, etc., are all p-words stored as part of a paradigm, which in turn is associated (through syntax) to a particular conceptual area. The form zapatero would be the most compatible with such a structure among those that make up the paradigm, in this case according to the mechanisms developed in the NS model (e.g. Fábregas 2007).
Anderson (1992: 186) suggests that word formation rules primarily serve to establish the relations between words in the lexicon that are part of the speaker’s linguistic knowledge. This vision, although lexicalist, is compatible with the model we are developing, to the extent that it can be said that derivative families (zapato ‘shoe’, zapatero ‘cobbler’, zapatear ‘toe-tapping’, zapatilla ‘slipper’, zapateado ‘tap dance’, etc.) form complex paradigms with purely morphological and phonological structure which are interpreted in light of the syntactic structure with which they are related. In this model the role of word formation rules is played by syntax itself, but without interacting with morphemes. The degree of compositionality of derivatives depends on the “amount” of structure with which they are interpreted. It seems reasonable to assume a certain correlation between the morphological and the syntactic complexity of a word. But this correlation is not deterministic, which is a common problem for theories based on word formation rules and for morpheme-based syntactic theories, such as DM. The idea I suggest is that the morphological structure of phonological words serves as a record (or indication) of underlying syntactic complexity, with a sort of “mnemonic” value. Anderson (1992: 189 et seq.) concludes that the rules of word formation have a dual mission: to form new words and to serve as a model to analyze others. This suspicious dual role can be simplified by assuming that the basic process of word formation is analogy. Note that the most serious problem of word formation rules is that they typically overgenerate. But analogy, as with morphology in general, is an accident, not a truly generative process, and hence it has limited productivity.
I have argued that the relation between syntactic structure and morphological structure is weak and indirect. This is a consequence of taking seriously the structural asymmetry of the language faculty as presented in Section 2, and it also follows from the different nature of the two types of structures (hierarchical in one case, linear in another). In fact, the separation of syntactic structure and morphological structure is a common independent conclusion in contemporary morphology, for both theoretical and empirical reasons (see Stewart and Stump 2007 for a full and clearly reasoned review). But if (against morphemic models such as DM or, in part, XS) the syntactic structure does not determine the morphological structure, two important questions arise: (i) what is the origin of morphological structure? and (ii) why is there some isomorphism between the two types of structures in the world’s languages? The model I have put forward can give a unique answer for both questions: morphological structure comes from syntax, but it is mediated by history. To use Givón’s (1971) felicitous expression, “today’s morphology is yesterday’s syntax”.
9 Today’s morphology is yesterday’s syntax
The model presented in Section 2 postulates that all linguistic diversity is located in the externalizing component of language, assuming that syntax (i.e. Narrow Syntax in terms of Hauser et al. 2002) is common to all languages. Against lexicalist models, I have also assumed that syntax is the only generative engine of language, and that there is not a specific module of word formation. The usual conclusion drawn from such premises is that morphological structure comes from syntax, which has led to morphemic models like DM (and more weakly, XS and NS), which stipulate that syntax operates on morphemes to build words.
The fact that there is a certain isomorphism between the order of morphemes and the structural hierarchy of the syntactic categories that morphemes (allegedly) externalize provides important empirical support for such a proposal. Thus, Julien (2002) reviewed 530 languages from 280 different family groups and concluded that her consideration of verbal morphology supports the hypothesis that the order of morphemes is determined exclusively by syntax. For his part, Stump (2001) presents numerous examples in which morphological structure is not correlated with syntactic structure, such as Albanian lahesha and Latin lavabar (both meaning ‘I was washed’) (Stump 2001: 25). In Albanian the voice morpheme precedes the tense morpheme, while in Latin tense precedes voice, but this does not mean that the voice (v) and tense (T) heads necessarily have different relations of structural domain in the two languages.
The hypothesis I have proposed, in which p-words are the externalization of phrasal fragments (through the mechanism of historical reanalysis), is consistent with both groups of empirical evidence. Thus, if it is true that morphemes are old p-words reanalyzed as part of a new p-word (as I claim), then it is expected that the internal linear structure of complex words preserves with some transparency the syntactic hierarchy that the old words externalized. The historical formation of a p-word has, so to speak, the effect of “freezing” that sequence in the I-lexicon, and analogy can then keep it “alive” in (semi)productive processes, producing the illusion that there are rules of word formation or that syntax combines morphemes to produce them.
Without getting into specifics, it may be interesting to compare two (simplified) alternative analyses for centralization, as shown in Figure 19.
The type of analysis I propose, on the left, does not assume that the morphemes forming the word centralization are the realization of the nodes of the syntactic structure, notwithstanding a certain correspondence between the two entities. Note that the proposed structure is ordinary syntax, similar to that which would underlie the analytical expression action of making central. The morphemic proposal, on the right, has to postulate either that the heads are on the right or that, if they are on the left, there must be a massive upwards movement. According to the present model, the “fossilized” nature of morphological structure (and its subsequent drift due to new reanalysis and sound changes) leads to the expectation that there are severe discrepancies between the morphological and the (current) syntactic structure in some languages (or in some paradigms of some languages).
Therefore, according to the proposed model, bound morphemes (roots and affixes) are not primitive entities of the faculty of language, in that they are “invisible” to the computational system, which is solely responsible for the compositionality of linguistic expressions. Accordingly, it makes sense that morphemes are a by-product of processes of historical reanalysis in the crucial phase of the construction (the “internalization”) of the I-lexicon. Consider Figure 20.
In this scheme it is observed that the computational system generates derivations through the unlimited, binary and endocentric merge of the elements A, B, C … J (these are either concepts of the CI system or functional categories, also interpretable in CI, such as number, tense, quantity, definiteness, etc.). The p-words (W1, W2, etc.) of the I-lexicon (specific to each language) externalize structure fragments. Typically, each W represents more than one syntactic node. So-called “lexical words” (nouns, verbs, adjectives) always externalize at least two syntactic nodes (the concept and the categorizer) and they always have phonological expression. Functional categories may not have phonological expression, which is an important source of structural diversity in languages (see Roberts and Roussou 2003).
The more syntactic nodes that W materializes, the more likely that its morphological structure is complex. This is so because when a historical process of reanalysis (indicated by the two cases in Figure 20) causes a sequence of words to be reanalyzed as a single p-word (W1–W2 as W5), then W1 and W2 (formerly p-words) are left as recurring fragments, associated more or less transparently with the syntactic nodes they materialize (J-I and H-G-F respectively). W1 and W2 are morphemes.
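The reanalysis step just described can be sketched as an operation on p-words. This is a rough illustration under stated assumptions: the class `PWord` and the function `reanalyze` are hypothetical names, the node labels follow the text (J-I for W1, H-G-F for W2), and representing a hierarchical fragment as a flat tuple of node labels is a deliberate simplification.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PWord:
    """A p-word: a stored form linked to the syntactic nodes it externalizes."""
    form: str
    nodes: Tuple[str, ...]
    morphemes: Tuple[str, ...] = ()  # recurring fragments left by reanalysis


def reanalyze(first: PWord, second: PWord, new_form: str) -> PWord:
    """Reanalyze a sequence of two p-words as a single new p-word.

    The new p-word externalizes the nodes of both sources, and the old
    forms survive only as morphological fragments inside it.
    """
    return PWord(
        form=new_form,
        nodes=first.nodes + second.nodes,
        morphemes=(first.form, second.form),
    )


# Node assignments taken from the text; the tuple encoding is an assumption.
w1 = PWord("W1", ("J", "I"))
w2 = PWord("W2", ("H", "G", "F"))
w5 = reanalyze(w1, w2, "W5")
```

In this sketch W5 covers five syntactic nodes and contains two morphemes, whereas W1 and W2 taken separately have no internal morphological structure, which is the sense in which more materialized nodes make complex morphology more likely.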
If the above model is reasonably correct, then morphemes are neither primitive elements of the computational system, nor of the conceptual-intentional system. They are, then, elements of the sensorimotor system. They are side effects of linguistic change operating as resources to optimize the processing and memorization of words. Although words have an internal syntactic structure by definition, there are no rules of word formation, and no generative lexicon. The foundation of word formation is syntax, but the mechanism is analogy.
As I have mentioned, the existence of affixation (and especially non-compositional affixation) is an indication of the historical antiquity of languages, which is why McWhorter (2011) uses this criterion to determine if a language is a creole. If morphology is a side effect of history, then it belongs to the externalization component of language (in the SM component), which makes it less plausible and consistent that there is a generative module to create words, and that it operates with morphemes.
10.1 Why are words so important?
Non-lexicalist theories often conclude that words are epiphenomena. Julien (2007), for example, states that “the discussion of whether complex words are formed in the syntax or prior to syntax is futile, because words as such are not formed in the grammar at all. They are not grammatical entities” (Julien 2007: 210), which leads her to conclude that “the concept of ‘word’ has no theoretical significance in the grammar at all” (Julien 2007: 212). It seems reasonable to claim that there are no word formation procedures, but this does not mean that words are not created. Words are created in syntax when concepts are categorized (s-word), and they are created in morphology when syntactic structures are materialized (p-word). The two types of words do not necessarily coincide, as there is a hierarchy s-word > p-word and the matching range depends on the morphological structure (i.e. on the uninterrupted history) of each language. In the model I have outlined in this contribution, syntax determines the internal structure of words, but crucially it does not determine their morphological structure, a task for morphology and history. 
What people learn, use, recognize, utter, forget, recall, and miss are words, not morphemes. Julien herself (2007) notes that words have a greater appearance of psychological reality than morphemes, which is surprising if it is true that syntax operates with morphemes and that words do not exist. Julien's explanation is that words are more accessible to speakers because of their distributional properties: “since words are the minimal morpheme strings that can be used as utterances and that may be permuted more or less freely, words are the minimal linguistic units that speakers can manipulate consciously” (2007: 234). In light of this, “word-internal morphemes, by contrast, cannot be consciously manipulated in the same way, and consequently, word internal morphemes are less salient than words in the awareness of speakers” (Julien 2007: 234). Such an explanation is clever and interesting, but fails to explain why words have the distributional privilege that morphemes lack. The most natural explanation is that words acquire their distributional independence by virtue of being the smallest units of syntax that connect to the SM component. Morphemes are just that: pure forms that more or less accurately “recapitulate” the internal structure of words. The morphological rules of a language do not determine the internal syntactic structure of a word, but only its shape. In this sense, morphology is morphology.
The model I have presented is actually a variant of DM, on which it is based. The essential difference is that DM is a morphemic model, with the difficulty inherent in all models that mix syntactic and morphological entities. In the present approach syntax does not work with morphemes, but only with syntactic categories, incorporating concepts through merge (categorization). However, the morphological theory involved is realizational (late insertion) in the sense that morphology operates with fragments of derivations to produce or select forms within paradigms. Thus, the words cloud and cloudy are related, but not derivatively. Both are constructions on the same concept, though certainly they do not express the same meaning, in accordance with the syntactic derivation of each one.
The terminal nodes in the syntactic trees of lexicalist models are usually words, while they are morphemes in non-lexicalist accounts, so that sentences are made directly with morphemes without the intervention of the notion of word. The model outlined in these pages vindicates the relevance of words, not as lexical units, but as fragments of syntactic derivation linked to phonology. The central question is to what extent this conception of the word (i) really explains those aspects that support the lexicalist hypothesis, and at the same time (ii) permits us to dispense with it.
Williams (2007: 356) summarizes these issues as follows:
Williams argues that either the facts of (2) are not so, or “we need something like the Lexical Hypothesis” (Williams 2007: 356). The non-lexicalist model I have outlined suggests that (2a) and (2c) are not really indisputable facts, showing that the internal structure of words is just regular syntax (although synthetically externalized, so to speak). And both (2b) and (2d), to the extent that they are different, are explained through the hypothesis that words are fragments of derivation sensitive to phonological spell-out. The early materialization (spell-out) makes these derivations opaque to other syntactic processes, predicting lexical atomicity effects.
Lexicalist theories, like traditional theories based on Indo-European morphology, start from words and project them in syntax. Syntactic theories unmake the word, pushing it aside as an epiphenomenon, and focus on the syntactic construction of sentences by means of morphemes. The present model aims to combine the best of both traditions: on the one hand, the word is constructed syntactically, it is syntax, but on the other hand, the word (as a phonological/morphological form) exists independently as a system of materialization and linearization of syntactic structures. Words therefore do not exist as stored lexical units prior to syntax, but exist as phonological forms stored and organized in (more or less extensive) paradigms. Syntax forms derivations using functional categories and concepts; an essential part of the derivation is the categorization (lexicalization) of concepts, at which point the phonological component is accessed and a word (that is, a fragment of derivation associated with a phonological form or a paradigm) is selected or created.
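The selection step just described can be made concrete with a small illustrative sketch. It is not part of the formal proposal: the class name `SWord`, the `PARADIGMS` table, the feature labels and the function `spell_out` are all hypothetical, and the entries for go and went are added only to show a suppletive form. The sketch assumes that a categorized concept (a fragment of derivation) serves as the key to a whole stored form, so that no form is assembled from morphemes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SWord:
    """A syntactic word: a concept categorized by a functional category.

    Hypothetical encoding used only for illustration.
    """
    concept: str                        # pointer into the conceptual-intentional system
    category: str                       # "N", "V" or "A"
    features: frozenset = frozenset()   # further functional features, e.g. {"past"}


# Hypothetical paradigm store in the sensorimotor system: whole phonological
# forms (p-words) indexed by derivation fragments, not built from morphemes.
PARADIGMS = {
    SWord("CLOUD", "N"): "cloud",
    SWord("CLOUD", "N", frozenset({"pl"})): "clouds",
    SWord("CLOUD", "A"): "cloudy",
    SWord("GO", "V"): "go",
    SWord("GO", "V", frozenset({"past"})): "went",
}


def spell_out(s_word):
    """Select the stored p-word for a derivation fragment.

    Returns None when no form is stored; creating a new form by analogy
    is deliberately left out of this sketch.
    """
    return PARADIGMS.get(s_word)


print(spell_out(SWord("CLOUD", "A")))
print(spell_out(SWord("GO", "V", frozenset({"past"}))))
```

On this toy encoding, cloud and cloudy share the concept CLOUD but are keyed by different derivations, so neither form is derived from the other. Likewise, went is selected directly, without any reference to a root or a suffix.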
For all these reasons, it is perhaps not surprising that, as Anderson (2015) notes, Saussure did not use the term morpheme once in his posthumous and celebrated Cours. He did not perceive the complex syntactic structure underlying words, but he did understand that morphemes were not an indispensable part of the essence of languages, as suggested by the fact that bound morphemes, unlike phonemes, words or sentences, are not universal in languages (although remarkably common).
10.2 Where is morphology?
The model developed over the course of these pages, although it vindicates the word, is radically anti-lexicalist in the sense that morphology is located in the sensorimotor system, that is, outside the FLN. Moreover, as reflected in Figure 1, in this model the lexicon (defined as I-lexicon) is also “external” and belongs to the sensorimotor system. Following the important insight of DM, I conclude that what is distributed is not the morphology, but the lexicon. The famous Saussurean signifiant belongs to SM, and the no less famous (and elusive) signifié belongs to CI. According to the present framework only syntax (CS) establishes a link between them. The relation between syntax and CI is strictly compositional, while the relation with SM is, although sometimes isomorphic, accidental and variable.
The conclusion that morphology is part of SM is consistent with recent theoretical developments in the NS framework: “Morphology, we argue, may be reduced entirely to the function that spells out the syntactic tree by choosing and inserting phonologically contentful lexical items” (Bye and Svenonius 2012: 428). But not surprisingly it is also consistent with findings from avowedly lexicalist perspectives:
Modern Hebrew verb roots and their alternation classes, like those of English and Latin, thus furnish yet another example of morphology by itself, leaving lexical meaning to reside where it belongs, not in roots, which are purely grammatical objects, but in lexemes, where language meets the world. (Aronoff 2007: 828)
The possibility that traditionally opposing positions can be considered (to a certain degree) complementary, capturing the empirical advantages of both, is a direct consequence of the hypothesis that the syntactic word is the minimal connection between CI and SM.
I have assumed a paradigmatic morphology without entering into its formal details. In general the model described here is consistent with Bermúdez-Otero’s vision according to which “morphology connects the output representations generated by the syntax with the underlying forms in the input to the phonology” (Bermúdez-Otero 2012: 50). According to his strictly modular model of the relationship between syntax, morphology and phonology, the morpheme (the morph in his terms) constitutes “the representational currency of morphology”. In this approach, morphological operations are strictly modular only “if they treat morphemes as inalterable units, and never change their syntactic specifications or their phonological content”. As the author notes, “in this overall conception of grammar, only syntax manipulates syntactic features, and only phonology manipulates phonological features” (Bermúdez-Otero 2012: 50). 
The image Bermúdez-Otero proposes of morph(eme)s as insects trapped in amber is indeed an elegant metaphor, and provides a satisfying reflection of the vision I have proposed of morphemes as a result of history (and as tools of memory):
[A] morph is like a translucent droplet of amber encasing a fossilized insect, the phonological content of the morph being like the body of the insect itself, and the morphology like the laboratory of an entomologist working with a collection of such specimens. (Bermúdez-Otero 2012: 51)
Beard, a representative of another influential lexicalist account, concludes that “the simplest and most consistent universal theory of morphology will represent grammatical morphemes as purely phonological operations on lexemes or phrasal positions, not as listable objects” (Beard 1995: 378), a position that is also compatible with the vision of morphology set out in my proposal.
I have suggested that the lexicon of languages is a paradigmatic collection of p-words stored in the sensorimotor system (but see footnote 18). The idea that the lexicon is distributed implies that the meaningful part of the traditional linguistic sign belongs to the conceptual-intentional system. Note that on this view, the traditional central components of human languages (meaning and sound) actually belong to the “external” components of the FL in Hauser et al.’s (2002) model, and not to its (supposedly) specifically human and specifically linguistic core, which may be surprising. The hypothesis that words are always syntactic constructions may allow us to better understand why the human computational system creates, in its interaction with the “external” systems, a specifically human system of knowledge and communication. The key idea, then, is that the meaning of words (like the meaning of phrases) is not encoded in the lexicon (which is purely morphological and phonological), but is the result of the interaction between the computational system and the conceptual-intentional system.
10.3 What is the meaning of words?
According to the present model, words do not copy or translate extra-linguistic concepts, but simply compute them syntactically. When a person sees a book, we assume that he/she accesses his/her concept of book, not the meaning of the word book. Therefore, when instead of seeing a book a person hears the word book, we should continue to assume that he/she accesses the book concept, not the meaning of the word. But when we use language we do not access the concept directly, but through the syntactic structure of the word book. It could be said then that syntax contributes to the creation of a substitute for the perceptual representation of the book itself. Language actually builds substitutes of perceptions and emotions. The sentence You saw that man provides sophisticated and complex instructions to replace in our interlocutor’s brain his/her past perception of such a man. Language allows the brain to experience perceptions and states that are not real, and what is interesting is that it does so through operations that emulate perceptual operations. Such operations are modulated by functional categories, including the basic ones (V, N, A), profiling concepts as events, objects or properties, and others, such as tense, aspect, definiteness, etc.
People accumulate concepts because it is the only thing that their brains understand, and when they think, what they do is to manipulate concepts, to relate them, to compare them, to position themselves with respect to them. But to do this, concepts are not enough; functional categories and syntax are needed. We do not normally seek to communicate concepts; rather, we want others to reproduce in their brains the operations that we perform on concepts: we want to communicate the relations between concepts, how we perceive them, how we understand them, how concepts move us. And for this reason words do not correspond to concepts, but to syntactic computations. When we tell someone I’ve seen the book we do not want to talk about the concept of book or about the concept of seeing (unless we are philosophers): we want to reproduce in his/her mind the scene of ourselves seeing the book as if our interlocutor had been there. The only strategy to do so is to compute concepts with functional categories and translate such computations phonologically (or visually). We select the concepts most related to the objects and events involved, and we build structures using functional categories that provide an emulation of the perceptual stimuli that the listener would have had if she/he had been there. Perhaps this explains why when we understand the word walk the brain circuits that we actually use to walk are activated (Pulvermüller et al. 2005): because the word walk, besides its syntax, includes the “motor concept” walk itself, not a meaning or a sign of it.
A language that is not used to speak does not require words (p-words): concepts and syntax would do the job. But such a language would allow us to access only our own concepts and our own experiences. It is conceivable that we might in this way even have a rich inner life, but it would almost certainly be different. When connecting the CI system with the SM system we associate a computation (which includes concepts and instructions for interpreting them) with a phonological form. This can be interpreted as an additional system of memory (and, if Jackendoff 2012 is right, as an enhanced access of thought to the conscious mind). In the process of syntactic derivation we associate a piece of computation with a phonological form (e.g. book). But that is not a concept; it is, as it were, a point of view on a concept, a set of instructions for computing a concept (see Pietroski 2011). The minimal computation (the syntactic word) enters a new memory system: the phonological form. The phonological form (morphology and phonology) is a “translation” of a syntactic computation into the motor system. This allows the storage and use of computations, and also the learning of computations from others. When we learn a language, we learn how to do computations on concepts from what others have done before. And that makes the difference, too.
I wish to thank José Francisco Val, Mamen Horno, David Serrano-Dolader, Carlos Piera, Antonio Fábregas, Juan Carlos Moreno, Bárbara Marqueta, Mark Dingemanse, two anonymous Linguistics reviewers, and the audience at the 27th Colloquium on Generative Grammar (Alcalá de Henares, May 2017) for their helpful comments on earlier versions of this paper. Any remaining errors are the responsibility of the author. The research underlying this paper was supported by the Spanish State Research Agency (AEI) & FEDER (EU) grant FFI2017-82460-P, and by the Gobierno de Aragón (Spain) through its support of the research group Psylex (Language and Cognition).
Ackema, Peter & Ad Neeleman. 2007. Morphology ≠ syntax. In Gillian Ramchand & Charles Reiss (eds.), The Oxford handbook of linguistic interfaces, 324–352. Oxford: Oxford University Press. doi:10.1093/oxfordhb/9780199247455.013.0011
Anderson, Stephen R. 2015. The morpheme: Its nature and use. In Matthew Baerman (ed.), The Oxford handbook of inflection, 11–33. Oxford: Oxford University Press. doi:10.1093/oxfordhb/9780199591428.013.2
Aronoff, Mark. 1976. Word formation in generative grammar. Cambridge, MA: The MIT Press.
Baker, Mark. 1988. Incorporation. Chicago: University of Chicago Press.
Beard, Robert. 1995. Lexeme–Morpheme base morphology: A general theory of inflection and word formation. Albany, NY: State University of New York Press.
Bermúdez-Otero, Ricardo. 2012. The architecture of grammar and the division of labor in exponence. In Joachim Trommer (ed.), The morphology and phonology of exponence, 8–83. Oxford: Oxford University Press. doi:10.1093/acprof:oso/9780199573721.003.0002
Berwick, Robert C. & Noam Chomsky. 2011. The biolinguistic program: The current state of its development. In Anna-Maria Di Sciullo & Cedric Boeckx (eds.), The biolinguistic enterprise, 19–41. Oxford: Oxford University Press.
Boeckx, Cedric. 2008. Bare syntax. Oxford: Oxford University Press.
Bye, Patrik & Peter Svenonius. 2012. Non-concatenative morphology as epiphenomenon. In Joachim Trommer (ed.), The morphology and phonology of exponence, 427–495. Oxford: Oxford University Press. doi:10.1093/acprof:oso/9780199573721.003.0013
Chomsky, Noam. 1970. Remarks on nominalization. In Roderick Jacobs & Peter Rosenbaum (eds.), Readings in English transformational grammar, 184–221. Waltham, MA: Ginn & Co.
Chomsky, Noam. 1995. The minimalist program. Cambridge, MA: The MIT Press.
Chomsky, Noam. 2001. Derivation by phase. In Michael Kenstowicz (ed.), Ken Hale: A life in language, 1–52. Cambridge, MA: MIT Press.
Chomsky, Noam. 2007. Approaching UG from below. In Uri Sauerland & Hans-Martin Gärtner (eds.), Interfaces + recursion = language? Chomsky’s minimalism and the view from semantics, 1–30. Berlin & New York: Mouton de Gruyter. doi:10.1515/9783110207552-001
Chomsky, Noam. 2008. On phases. In Robert Freidin, Carlos P. Otero & María Luisa Zubizarreta (eds.), Foundational issues in linguistic theory, 133–166. Cambridge, MA: MIT Press.
Déchaine, Rose-Marie & Calisto Mudzingwa. 2014. Phono-semantics meets phono-syntax: A formal typology of ideophones. Paper presented at the University of Rochester, 2 May 2014.
Dingemanse, Mark & Kimi Akita. 2017. An inverse relation between expressiveness and grammatical integration: On the morphosyntactic typology of ideophones, with special reference to Japanese. Journal of Linguistics 53. 501–532. doi:10.1017/S002222671600030X
Embick, David & Rolf Noyer. 2007. Distributed morphology and the syntax-morphology interface. In Gillian Ramchand & Charles Reiss (eds.), The Oxford handbook of linguistic interfaces, 289–324. Oxford: Oxford University Press. doi:10.1093/oxfordhb/9780199247455.013.0010
Everaert, Martin B. H., Marinus A. C. Huybregts, Noam Chomsky, Robert C. Berwick & Johan J. Bolhuis. 2015. Structures, not strings: Linguistics as part of the cognitive sciences. Trends in Cognitive Sciences 19(12). 729–743. doi:10.1016/j.tics.2015.09.008
Fábregas, Antonio. 2005. The definition of the grammatical category in a syntactically oriented morphology: The case of nouns and adjectives. Madrid: Universidad Autónoma de Madrid dissertation.
Fábregas, Antonio. 2012. Evidence for multidominance in Spanish agentive nominalizations. In Miriam Uribe-Etxebarría & Vidal Valmala (eds.), Ways of structure building, 66–92. Oxford: Oxford University Press. doi:10.1093/acprof:oso/9780199644933.003.0004
Fábregas, Antonio. 2014. On a grammatically relevant definition of word and why it belongs to syntax. In Iraide Ibarretxe-Antuñano & José-Luis Mendívil-Giró (eds.), To be or not to be a word: New reflections on the definition of word, 94–130. Newcastle: Cambridge Scholars Publishing.
Givón, T. 1971. Historical syntax and synchronic morphology: An archeologist’s field trip. CLS Proceedings 7. Chicago, IL: University of Chicago.
Hale, Kenneth & Samuel J. Keyser. 1993. On argument structure and the lexical expression of syntactic relations. In Kenneth Hale & Samuel J. Keyser (eds.), The view from building 20, 53–110. Cambridge, MA: The MIT Press.
Halle, Morris & Alec Marantz. 1993. Distributed morphology and the pieces of inflection. In Kenneth Hale & Samuel J. Keyser (eds.), The view from building 20, 111–176. Cambridge, MA: The MIT Press.
Hauser, Marc D., Noam Chomsky & W. Tecumseh Fitch. 2002. The faculty of language: What is it, who has it, and how did it evolve? Science 298. 1569–1579. doi:10.1017/CBO9780511817755.002
Hildebrandt, Kristine A. 2015. The prosodic word. In John R. Taylor (ed.), The Oxford handbook of the word, 221–245. Oxford: Oxford University Press. doi:10.1093/oxfordhb/9780199641604.013.035
Hurford, John R. 2007. The origins of meaning. Oxford: Oxford University Press.
Jackendoff, Ray. 2012. A user’s guide to thought and meaning. Oxford: Oxford University Press.
Jackendoff, Ray & Jenny Audring. 2018. Morphology in the parallel architecture. In Jenny Audring & Francesca Masini (eds.), The Oxford handbook of morphological theory. Oxford: Oxford University Press. doi:10.1093/oxfordhb/9780199668984.013.33
Julien, Marit. 2002. Syntactic heads and word formation. Oxford: Oxford University Press.
Julien, Marit. 2007. On the relation between morphology and syntax. In Gillian Ramchand & Charles Reiss (eds.), The Oxford handbook of linguistic interfaces, 209–238. Oxford: Oxford University Press.
Kratzer, Angelika. 1996. Severing the external argument from its verb. In Johan Rooryck & Laurie Zaring (eds.), Phrase structure and the lexicon, 109–137. Dordrecht: Kluwer. doi:10.1007/978-94-015-8617-7_5
Marantz, Alec. 1997. No escape from syntax. University of Pennsylvania Working Papers in Linguistics 4(2). 201–225.
Moreno, Juan Carlos. 2014. From agglutination to polysynthesis: Towards a biological characterization of the spoken word. In Iraide Ibarretxe-Antuñano & José-Luis Mendívil-Giró (eds.), To be or not to be a word: New reflections on the definition of word, 131–163. Newcastle: Cambridge Scholars Publishing.
Pietroski, Paul M. 2011. Minimal semantic instructions. In Anna-Maria Di Sciullo & Cedric Boeckx (eds.), The biolinguistic enterprise, 472–498. Oxford: Oxford University Press. doi:10.1093/oxfordhb/9780199549368.013.0021
Pietroski, Paul M. 2012. Language and conceptual reanalysis. In Anna-Maria Di Sciullo (ed.), Towards a biolinguistic understanding of grammar: Essays on interfaces, 57–86. Amsterdam & Philadelphia: John Benjamins. doi:10.1075/la.194.04pie
Pulvermüller, Friedemann, Olaf Hauk, Vadim V. Nikulin & Risto J. Ilmoniemi. 2005. Functional links between motor and language systems. European Journal of Neuroscience 21. 793–797. doi:10.1111/j.1460-9568.2005.03900.x
de Saussure, Ferdinand. 1916. Cours de linguistique générale. Paris: Payot.
Selkirk, Elisabeth O. 1982. The syntax of words. Cambridge, MA: The MIT Press.
Sigurðsson, Halldór Ármann. 2011. On UG and materialization. Linguistic Analysis 37. 367–388.
Stewart, Thomas & Gregory Stump. 2007. Paradigm function morphology and the morphology-syntax interface. In Gillian Ramchand & Charles Reiss (eds.), The Oxford handbook of linguistic interfaces, 383–421. Oxford: Oxford University Press. doi:10.1093/oxfordhb/9780199247455.013.0013
Williams, Edwin. 2007. Dumping lexicalism. In Gillian Ramchand & Charles Reiss (eds.), The Oxford handbook of linguistic interfaces, 353–381. Oxford: Oxford University Press. doi:10.1093/oxfordhb/9780199247455.013.0012
© 2019 Walter de Gruyter GmbH, Berlin/Boston