BY-NC-ND 4.0 license Open Access Published by De Gruyter Open Access August 18, 2017

A simple spectral algorithm for recovering planted partitions

Sam Cole, Shmuel Friedland and Lev Reyzin
From the journal Special Matrices

Abstract

In this paper, we consider the planted partition model, in which n = ks vertices of a random graph are partitioned into k “clusters,” each of size s. Edges between vertices in the same cluster and in different clusters are included with constant probabilities p and q, respectively (where 0 ≤ q < p ≤ 1). We give an efficient algorithm that, with high probability, recovers the clusters as long as the cluster sizes are at least Ω(√n). Informally, our algorithm constructs the projection operator onto the dominant k-dimensional eigenspace of the graph’s adjacency matrix and uses it to recover one cluster at a time. To our knowledge, our algorithm is the first purely spectral algorithm that runs in polynomial time and works even when s = Θ(√n), though there have been several non-spectral algorithms that accomplish this. Our algorithm is also among the simplest of these spectral algorithms, and its proof of correctness illustrates the usefulness of the Cauchy integral formula in this domain.
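The informal description above can be illustrated with a short NumPy sketch. It is not the paper's algorithm: the function names, the choice of which column to inspect, the rule "keep the s largest entries of that column," and the recomputation of the projection on the remaining vertices are all assumptions made for illustration. The idea it relies on is that the projection onto the dominant eigenspace of the expected adjacency matrix has entries close to 1/s within a cluster and 0 across clusters, so a single column of the empirical projection should roughly single out one cluster.

```python
import numpy as np


def planted_partition(k, s, p, q, rng):
    """Sample an adjacency matrix from the planted partition model.

    Vertices 0..s-1 form cluster 0, s..2s-1 form cluster 1, and so on.
    (Hypothetical helper, written only for this illustration.)
    """
    n = k * s
    labels = np.repeat(np.arange(k), s)
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, p, q)
    # Sample the strict upper triangle, then symmetrize (no self-loops).
    upper = np.triu(rng.random((n, n)) < probs, 1).astype(float)
    return upper + upper.T, labels


def dominant_projection(A, k):
    """Orthogonal projection onto the span of the k eigenvectors of A
    whose eigenvalues are largest in absolute value."""
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(np.abs(vals))[-k:]
    V = vecs[:, top]
    return V @ V.T


def recover_clusters(A, k, s):
    """Peel off one cluster at a time (simplified, assumed procedure)."""
    remaining = np.arange(A.shape[0])
    clusters = []
    for j in range(k, 0, -1):
        sub = A[np.ix_(remaining, remaining)]
        P = dominant_projection(sub, j)
        # Assumption: inspect the column of an arbitrary remaining vertex
        # and keep its s largest entries as that vertex's cluster.
        idx = np.argsort(P[:, 0])[-s:]
        clusters.append(np.sort(remaining[idx]))
        remaining = np.delete(remaining, idx)
    return clusters


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A, labels = planted_partition(k=3, s=100, p=0.9, q=0.1, rng=rng)
    for c in recover_clusters(A, k=3, s=100):
        print(len(c), "vertices, true labels:", set(labels[c].tolist()))
```

The parameters in the demo (k = 3, s = 100, p = 0.9, q = 0.1) are chosen so that the gap between p and q is large relative to the noise; they are far from the s = Θ(√n) regime the abstract addresses, where a more careful selection rule would be needed.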

Received: 2016-11-24
Accepted: 2017-07-14
Published Online: 2017-08-18
Published in Print: 2017-01-26

© 2017

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
