
Journal of Artificial Intelligence and Soft Computing Research

The Journal of Polish Neural Network Society, the University of Social Sciences in Lodz & Czestochowa University of Technology

4 Issues per year

Open Access
Online ISSN: 2083-2567

Clustering Large-Scale Data Based On Modified Affinity Propagation Algorithm

Ahmed M. Serdah / Wesam M. Ashour
Published Online: 2016-01-13 | DOI: https://doi.org/10.1515/jaiscr-2016-0003

Abstract

Traditional clustering algorithms are no longer suitable for data mining applications that use large-scale data. Many large-scale data clustering algorithms have been proposed in recent years, but most of them do not achieve high-quality clustering. Although Affinity Propagation (AP) is effective and accurate for clustering data sets of ordinary size, it is not effective for large-scale data. This paper proposes two methods for large-scale data clustering that rely on a modified version of the AP algorithm. The proposed methods are designed to ensure both low time complexity and good clustering accuracy. First, the data set is divided into several subsets using one of two methods: random fragmentation or K-means. Second, each subset is clustered into K clusters using the K-Affinity Propagation (KAP) algorithm to select local cluster exemplars in that subset. Third, the inverse weighted clustering algorithm is run on all local cluster exemplars to select well-suited global exemplars for the whole data set. Finally, every data point is assigned to a cluster according to its similarity to each global exemplar. Results show that the proposed methods significantly reduce clustering time and produce more accurate clustering results than the AP, KAP, and HAP algorithms.
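
The four stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: a plain alternating k-medoids routine stands in for both KAP (local exemplars) and inverse weighted clustering (global exemplars), only the random fragmentation variant is shown, and similarity is assumed to be negative squared Euclidean distance. All function names (`sq_dist`, `k_medoids`, `random_fragmentation`, `cluster_large`) are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance; similarity is taken as its negative."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_medoids(points, k, rng, iters=20):
    """Stand-in exemplar selector (NOT KAP or IWC): alternating k-medoids.

    Returns k exemplars that are actual members of `points`.
    """
    medoids = rng.sample(points, k)
    for _ in range(iters):
        # Assign every point to its nearest current medoid.
        clusters = [[] for _ in medoids]
        for p in points:
            j = min(range(k), key=lambda m: sq_dist(p, medoids[m]))
            clusters[j].append(p)
        # Re-pick each medoid as the member with the least total distance.
        new_medoids = []
        for j, members in enumerate(clusters):
            if not members:  # keep the old medoid if its cluster emptied
                new_medoids.append(medoids[j])
            else:
                new_medoids.append(
                    min(members, key=lambda c: sum(sq_dist(c, q) for q in members))
                )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids


def random_fragmentation(points, n_subsets, rng):
    """Stage 1: shuffle the data and deal it into n_subsets parts."""
    shuffled = points[:]
    rng.shuffle(shuffled)
    return [shuffled[i::n_subsets] for i in range(n_subsets)]


def cluster_large(points, k, n_subsets, seed=0):
    rng = random.Random(seed)
    subsets = random_fragmentation(points, n_subsets, rng)
    # Stage 2: k local exemplars per subset (the paper uses KAP here).
    local = [e for s in subsets for e in k_medoids(s, k, rng)]
    # Stage 3: k global exemplars from the local ones (the paper uses IWC here).
    global_exemplars = k_medoids(local, k, rng)
    # Stage 4: assign every point to its most similar global exemplar.
    labels = [
        min(range(k), key=lambda j: sq_dist(p, global_exemplars[j])) for p in points
    ]
    return global_exemplars, labels


# Usage on synthetic 2-D data (illustrative only).
demo_rng = random.Random(42)
data = [(demo_rng.gauss(0, 1), demo_rng.gauss(0, 1)) for _ in range(60)]
data += [(demo_rng.gauss(10, 1), demo_rng.gauss(10, 1)) for _ in range(60)]
exemplars, labels = cluster_large(data, k=2, n_subsets=3)
```

The point of the structure is that the expensive exemplar search only ever runs on one subset, or on the small pool of local exemplars, while the full data set is touched only by the cheap final assignment step.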

About the article

Published Online: 2016-01-13

Published in Print: 2016-01-01


Citation Information: Journal of Artificial Intelligence and Soft Computing Research, ISSN (Online) 2083-2567, DOI: https://doi.org/10.1515/jaiscr-2016-0003.


© 2016 Academy of Management (SWSPiZ), Lodz. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License (CC BY-NC-ND 3.0).

Citing Articles


[1]
Yuan Jiang, Yuliang Liao, and Guoxian Yu
Algorithms, 2016, Volume 9, Number 3, Page 46
