Russian velar palatalization changes velars into alveopalatals before certain suffixes, including the stem extension -i and the diminutive suffixes -ok and -ek/ik. While velar palatalization applies without exception before the relevant suffixes in the established lexicon, it often fails with nonce loanwords before -i and -ik, but not before -ok or -ek. This pattern is shown to be predicted by three models: the Minimal Generalization Learner (MGL), a model of rule induction and weighting developed by Albright and Hayes (Cognition 90: 119–161, 2003); a novel version of Network Theory (Bybee, Morphology: A study of the relation between meaning and form, John Benjamins, 1985; Phonology and language use, Cambridge University Press, 2001), which uses competing unconditional product-oriented schemas weighted by type frequency, together with paradigm uniformity constraints; and stochastic Optimality Theory with language-specific constraints learned using the Gradual Learning Algorithm (GLA; Boersma, Proceedings of the Institute of Phonetic Sciences of the University of Amsterdam 21: 43–58, 1997). The successful models predict that a morphophonological rule will fail if its triggering suffix comes to attach to inputs that are not eligible to undergo the rule; this prediction is confirmed in an artificial grammar learning experiment. In all of these models, the choice among generalizations or output forms is stochastic, which requires that known word-forms be retrieved from the lexicon as wholes rather than generated by the grammar. Furthermore, the MGL and the GLA are shown to succeed only if the suffix and the stem shape are chosen simultaneously, as opposed to the suffix being chosen first and then triggering (or failing to trigger) a stem change.
In addition, the GLA is shown to require output-output faithfulness to be ranked above markedness at the beginning of learning (Hayes, Phonological acquisition in Optimality Theory: the early stages, Cambridge University Press, 2004) to account for the present data.
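To make the GLA mechanism concrete, the following is a minimal toy sketch of stochastic OT evaluation with Gradual Learning Algorithm updates, in the spirit of Boersma (1997). The two constraints ("Markedness", "OO-Faith") and the two candidates are invented placeholders, not the paper's actual grammar; the sketch only illustrates how error-driven promotion and demotion can raise output-output faithfulness above markedness when the learning data consistently show faithful (unpalatalized) forms.

```python
import random

# Hypothetical ranking values for two illustrative constraints (not the
# paper's constraint set); higher value = higher ranked on average.
constraints = {"Markedness": 100.0, "OO-Faith": 100.0}

# Each candidate maps constraint name -> number of violations.
candidates = {
    "palatalized": {"Markedness": 0, "OO-Faith": 1},
    "faithful":    {"Markedness": 1, "OO-Faith": 0},
}

def evaluate(ranking_values, noise=2.0, rng=random):
    """Stochastic OT evaluation: add Gaussian noise to each ranking value,
    rank constraints by the noisy values, and pick the candidate whose
    violation profile is best under that ranking."""
    noisy = {c: v + rng.gauss(0, noise) for c, v in ranking_values.items()}
    order = sorted(noisy, key=noisy.get, reverse=True)  # highest-ranked first
    return min(candidates, key=lambda cand:
               tuple(candidates[cand][c] for c in order))

def gla_update(ranking_values, learner_output, observed_output, plasticity=0.1):
    """GLA update on a mismatch: demote constraints violated more by the
    observed form, promote constraints violated more by the learner's error."""
    if learner_output == observed_output:
        return
    for c in ranking_values:
        diff = candidates[observed_output][c] - candidates[learner_output][c]
        if diff > 0:
            ranking_values[c] -= plasticity  # penalizes observed form: demote
        elif diff < 0:
            ranking_values[c] += plasticity  # penalizes learner error: promote

# Training on data where the faithful form is always observed gradually
# pushes OO-Faith above Markedness.
random.seed(0)
for _ in range(1000):
    gla_update(constraints, evaluate(constraints), "faithful")
```

After training, OO-Faith ends up ranked above Markedness, so the learner's grammar mostly produces the faithful output; because each evaluation perturbs the ranking values with noise, occasional palatalized outputs can still surface, mirroring the stochastic variation the abstract describes.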