Foundations of Computing and Decision Sciences

The Journal of Poznan University of Technology

4 Issues per year

Open Access
Software Measurement and Defect Prediction with Depress Extensible Framework

Lech Madeyski
  • Corresponding author
  • Lech Madeyski is with the Faculty of Computer Science and Management, Wroclaw University of Technology, Poland.
/ Marek Majchrzak
  • Marek Majchrzak is with the Faculty of Computer Science and Management, Wroclaw University of Technology and Capgemini Poland.
Published Online: 2014-12-20 | DOI: https://doi.org/10.2478/fcds-2014-0014


Context. Software data collection precedes analysis, which in turn requires data science skills. Software defect prediction is rarely used in industrial projects as a means of quality assurance and cost reduction. Objectives. Many studies and several tools support various data analysis tasks, but there is still neither an open source tool nor a standardized approach. Results. We developed Defect Prediction for software systems (DePress), an extensible software measurement and data integration framework that can be used for prediction purposes (e.g., defect prediction, effort prediction) and for the analysis of software changes (e.g., release notes, bug statistics, commit quality). DePress is based on the KNIME project and allows workflows to be built in a graphical, end-user-friendly manner. Conclusions. We present the main concepts and the development state of the DePress framework. The results show that DePress can be used to analyse both open source and industrial projects.

Keywords: mining in software repositories; software metrics; KNIME; defect prediction


About the article

Published in Print: 2014-12-01

Citation Information: Foundations of Computing and Decision Sciences, Volume 39, Issue 4, Pages 249–270, ISSN (Online) 2300-3405, DOI: https://doi.org/10.2478/fcds-2014-0014.

© by Lech Madeyski. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License (CC BY-NC-ND 3.0).
