Journal of Artificial Intelligence and Soft Computing Research

The Journal of Polish Neural Network Society, the University of Social Sciences in Lodz & Czestochowa University of Technology

4 Issues per year

Open Access
Online ISSN: 2083-2567

Characterization of Symbolic Rules Embedded in Deep DIMLP Networks: A Challenge to Transparency of Deep Learning

Guido Bologna
  • Corresponding author
  • Department of Computer Science, University of Applied Science of Western Switzerland, Rue de la Prairie 4, Geneva 1202, Switzerland
Yoichi Hayashi
Published Online: 2017-05-03 | DOI: https://doi.org/10.1515/jaiscr-2017-0019

Abstract

Rule extraction from neural networks is an active research topic. Over the last 20 years, many authors have presented techniques for extracting symbolic rules from Multi Layer Perceptrons (MLPs). Nevertheless, very few of these techniques addressed ensembles of neural networks, and even fewer addressed networks trained by deep learning. On several datasets we performed rule extraction from ensembles of Discretized Interpretable Multi Layer Perceptrons (DIMLPs) and from DIMLPs trained by deep learning. The results obtained on the Thyroid dataset and the Wisconsin Breast Cancer dataset show that the predictive accuracy of the extracted rules compares very favorably with state-of-the-art results. Finally, in the last classification problem, on digit recognition, the rules generated from the MNIST dataset can be viewed as discriminatory features in particular digit areas. Qualitatively, with respect to rule complexity, measured as the number of generated rules and the number of antecedents per rule, deep DIMLPs and DIMLPs trained by arcing give similar results on a binary classification problem involving digits 5 and 8. On the whole MNIST problem we showed that it is possible to determine the feature detectors created by neural networks, and also that the complexity of the extracted rulesets can be well balanced between accuracy and interpretability.

Keywords: ensembles; Deep Learning; rule extraction; feature detectors

About the article

Received: 2017-02-14

Accepted: 2017-03-20

Published Online: 2017-05-03

Published in Print: 2017-10-01


Citation Information: Journal of Artificial Intelligence and Soft Computing Research, ISSN (Online) 2083-2567, DOI: https://doi.org/10.1515/jaiscr-2017-0019.

© 2017. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. BY-NC-ND 4.0
