Less is more: why all paradigms are defective, and why that is a good thing

Laura A. Janda 1 and Francis M. Tyers 2
  • 1 HSL, UiT Norges arktiske universitet, Tromsø, Norway
  • 2 School of Linguistics, National Research University Higher School of Economics, Moscow, Russia
Laura A. Janda
  • Corresponding author
  • HSL, UiT Norges arktiske universitet, Tromsø, Norway
  • Laura A. Janda (born 1957, Ph.D., UCLA, 1984) is Professor of Russian Linguistics at UiT the Arctic University of Norway. Her special areas of interest are the complex factors associated with the grammatical categories of case and aspect and how these can be investigated using corpus data and experiments.
  
  
Francis M. Tyers
  • School of Linguistics, National Research University Higher School of Economics, Moscow, Russia
  • Francis M. Tyers (born 1983, Ph.D., Universitat d’Alacant, 2013) is Assistant Professor of Linguistics at the Higher School of Economics in Moscow. He is passionate about language technology for lesser-resourced languages and has co-organised workshops on machine translation in a number of countries, including Russia and Finland.
  
  


Only a fraction of lexemes are encountered in all their paradigm forms in any corpus, or even in the lifetime of any speaker. This raises the question of how native speakers confidently produce and comprehend word forms that they have never witnessed. We present the results of an experiment using a recurrent neural network as a computational learning model. In particular, we compare the model’s production of unencountered forms under two types of training data for Russian nouns, verbs, and adjectives: full paradigms vs. single word forms. In the long run, the model performs better when given the more naturalistic training on single word forms, even though the full-paradigm training data is much larger, since it includes every form of every word. We discuss why “defective” paradigms may be better for human learners as well.

