Journal of Artificial Intelligence and Soft Computing Research

The Journal of Polish Neural Network Society, the University of Social Sciences in Lodz & Czestochowa University of Technology

4 Issues per year

Open Access

Can Learning Vector Quantization be an Alternative to SVM and Deep Learning? - Recent Trends and Advanced Variants of Learning Vector Quantization for Classification Learning

Thomas Villmann / Andrea Bohnsack / Marika Kaden
  • Computational Intelligence Group, University of Applied Sciences Mittweida, Germany
  • Staatliche Berufliche Oberschule Kaufbeuren, Germany
Published Online: 2016-12-17 | DOI: https://doi.org/10.1515/jaiscr-2017-0005


Learning vector quantization (LVQ), introduced by Kohonen on intuitive grounds, is one of the most powerful approaches for prototype-based classification of vector data. The prototype adaptation scheme relies on attraction and repulsion of the prototypes during learning, which gives both the learning process and the classification decision an easy geometric interpretation. Although deep learning architectures and support vector classifiers frequently achieve comparable or even better results, LVQ models are smart alternatives with low complexity and computational cost, making them attractive for many industrial applications such as intelligent sensor systems or advanced driver assistance systems.

Nowadays, the mathematical theory developed for LVQ delivers sufficient justification of the algorithm, making it an appealing alternative to other approaches such as support vector machines and deep learning techniques.

This review article reports current developments and extensions of LVQ, starting from generalized LVQ (GLVQ), which is known as the most powerful cost-function-based realization of the original LVQ. The cost function minimized in GLVQ is a soft approximation of the standard classification error, which allows gradient descent learning techniques. The GLVQ variants considered in this contribution cover many aspects, such as border-sensitive learning, application of non-Euclidean metrics like kernel distances or divergences, relevance learning, and optimization of advanced statistical classification quality measures beyond accuracy, including sensitivity and specificity or the area under the ROC curve.
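As an illustration of the soft error approximation and the attraction/repulsion updates mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes the squared Euclidean distance and a logistic sigmoid as the monotone transfer function; both are common choices but are not fixed by this abstract. For a sample x with label y, d_plus is the distance to the closest prototype with the correct label, d_minus the distance to the closest prototype with any other label, and mu = (d_plus - d_minus) / (d_plus + d_minus) is negative exactly when x is classified correctly. All function names (`glvq_mu`, `glvq_cost`, `glvq_step`), the toy data, and the learning rate are hypothetical.

```python
import math

def sq_dist(x, w):
    """Squared Euclidean distance (assumed dissimilarity measure)."""
    return sum((xi - wi) ** 2 for xi, wi in zip(x, w))

def sigmoid(z):
    """Logistic function, used here as the monotone transfer function."""
    return 1.0 / (1.0 + math.exp(-z))

def nearest(x, y, prototypes, labels, same_class):
    """Index and distance of the closest prototype whose label
    equals y (same_class=True) or differs from y (same_class=False)."""
    candidates = [
        (sq_dist(x, w), k)
        for k, (w, c) in enumerate(zip(prototypes, labels))
        if (c == y) == same_class
    ]
    d, k = min(candidates)
    return k, d

def glvq_mu(x, y, prototypes, labels):
    """Classifier function in [-1, 1]; negative means correct classification.
    Assumes x does not coincide with both closest prototypes at once."""
    _, d_plus = nearest(x, y, prototypes, labels, True)
    _, d_minus = nearest(x, y, prototypes, labels, False)
    return (d_plus - d_minus) / (d_plus + d_minus)

def glvq_cost(X, Y, prototypes, labels):
    """Soft approximation of the classification error: sum of sigmoid(mu)."""
    return sum(sigmoid(glvq_mu(x, y, prototypes, labels)) for x, y in zip(X, Y))

def glvq_step(x, y, prototypes, labels, lr=0.05):
    """One stochastic gradient descent step on sigmoid(mu(x)), in place."""
    kp, dp = nearest(x, y, prototypes, labels, True)
    km, dm = nearest(x, y, prototypes, labels, False)
    s = sigmoid((dp - dm) / (dp + dm))
    fprime = s * (1.0 - s)            # derivative of the sigmoid
    denom = (dp + dm) ** 2
    wp, wm = prototypes[kp], prototypes[km]
    # Attraction: the correct prototype moves towards x.
    prototypes[kp] = [w + lr * fprime * 4.0 * dm / denom * (xi - w)
                      for w, xi in zip(wp, x)]
    # Repulsion: the closest wrong prototype moves away from x.
    prototypes[km] = [w - lr * fprime * 4.0 * dp / denom * (xi - w)
                      for w, xi in zip(wm, x)]

# Toy usage with two classes and one prototype per class (made-up data).
X = [[0.0, 0.0], [0.0, 1.0], [3.0, 3.0], [3.0, 4.0]]
Y = [0, 0, 1, 1]
prototypes = [[1.0, 1.0], [2.0, 2.0]]
labels = [0, 1]

before = glvq_cost(X, Y, prototypes, labels)
for _ in range(20):                   # a few epochs of stochastic updates
    for x, y in zip(X, Y):
        glvq_step(x, y, prototypes, labels)
after = glvq_cost(X, Y, prototypes, labels)
print(before, after)
```

The factors 4·d_minus/(d_plus + d_minus)² and 4·d_plus/(d_plus + d_minus)² follow from differentiating mu with respect to the two prototypes under the squared Euclidean distance; a different dissimilarity measure would change only the inner derivative.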

According to these topics, the paper highlights the basic motivation for these variants and extensions, together with the mathematical prerequisites and treatments for integration into the standard GLVQ scheme, and compares them to other machine learning approaches. For a detailed description and the mathematical theory behind each variant, the reader is referred to the respective original articles.

Thus, the intention of the paper is to provide a comprehensive overview of the state of the art, serving as a starting point for finding an appropriate LVQ variant for a given classification problem, as well as a reference to recently developed variants and improvements of the basic GLVQ scheme.

Keywords: classification learning; vector quantization; prototype based learning


About the article

Published in Print: 2017-01-01

Citation Information: Journal of Artificial Intelligence and Soft Computing Research, ISSN (Online) 2083-2567, DOI: https://doi.org/10.1515/jaiscr-2017-0005.

© 2016. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License (CC BY-NC-ND 4.0).
