
Journal of Intelligent Systems

Editor-in-Chief: Fleyeh, Hasan


Online ISSN: 2191-026X
Volume 26, Issue 1


A State-of-the-Art Review of Knowledge Discovery in Multiple Databases

Animesh Adhikari / Lakhmi C. Jain
  • School of Electrical and Information Engineering, University of South Australia, Mawson Lakes Campus, Australia
/ Bhanu Prasad
  • Department of Computer and Information Sciences, Florida A&M University, Tallahassee, FL 32307, USA
Published Online: 2016-05-20 | DOI: https://doi.org/10.1515/jisys-2015-0154

Abstract

Knowledge discovery in multiple databases offers many opportunities and challenges. We present a number of motivating points for knowledge discovery in multiple databases. To support further study, we highlight some domains that have generated numerous problems involving multiple related databases. Activities related to data preprocessing in a multi-database mining environment are also discussed. Important techniques for mining multiple databases are outlined, and many interesting patterns that originate in multi-database environments are highlighted. We expect to see more research outcomes and investigations as the number of multi-database domains rises.

Keywords: Data preparation in multi-database environment; multi-database mining; patterns in multiple databases

MSC 2010: 68T10


About the article

Received: 2015-12-02

Published Online: 2016-05-20

Published in Print: 2017-01-01


Citation Information: Journal of Intelligent Systems, Volume 26, Issue 1, Pages 23–34, ISSN (Online) 2191-026X, ISSN (Print) 0334-1860, DOI: https://doi.org/10.1515/jisys-2015-0154.


©2017 Walter de Gruyter GmbH, Berlin/Boston.
