
Statistical Applications in Genetics and Molecular Biology

Editor-in-Chief: Sanguinetti, Guido


Volume 15, Issue 1



Using persistent homology and dynamical distances to analyze protein binding

Violeta Kovacev-Nikolic / Peter Bubenik / Dragan Nikolić
  • Department of Mechanical Engineering, University of Alberta and National Institute for Nanotechnology, Canada
/ Giseon Heo
  • Corresponding author
  • School of Dentistry; Department of Mathematical and Statistical Sciences, University of Alberta, Edmonton, Canada, T6G 2N8
Published Online: 2016-01-19 | DOI: https://doi.org/10.1515/sagmb-2015-0057


Persistent homology captures the evolution of topological features of a model as a parameter changes. The most commonly used summary statistics of persistent homology are the barcode and the persistence diagram. Another summary statistic, the persistence landscape, was recently introduced by Bubenik. Because it is a functional summary, sample means and variances are easy to calculate and various test statistics are straightforward to construct. Using a permutation test, we detect conformational changes between the closed and open forms of the maltose-binding protein, a large biomolecule consisting of 370 amino acid residues. Furthermore, persistence landscapes can be used with machine learning methods: a hyperplane from a support vector machine clearly separates the closed and open protein conformations. Moreover, because our approach captures dynamical properties of the protein, our results may help in identifying residues susceptible to ligand binding; we show that the majority of active site residues and allosteric pathway residues are located in the vicinity of the most persistent loop in the corresponding filtered Vietoris-Rips complex. This finding was not observed in the classical anisotropic network model.

Keywords: dynamical distance; persistence landscape; persistent homology; simplicial complex; support vector machine
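
To make the abstract's central idea concrete, the following minimal sketch shows how a persistence diagram can be turned into a persistence landscape, and how a two-sample permutation test can then be run on the landscapes. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy diagrams, the sampling grid, and the choice of the Euclidean distance between sampled mean landscapes as the test statistic are all hypothetical.

```python
import math
import random


def landscape(diagram, k, t):
    """Value at t of the k-th persistence landscape layer (k = 1 is the top layer).

    Each (birth, death) pair contributes a tent function that peaks at its midpoint;
    the k-th layer is the k-th largest tent value at t.
    """
    tents = sorted((max(0.0, min(t - b, d - t)) for b, d in diagram), reverse=True)
    return tents[k - 1] if k <= len(tents) else 0.0


def landscape_vector(diagram, grid, depth):
    """Sample the first `depth` layers on `grid` and flatten them into one vector."""
    return [landscape(diagram, k, t) for k in range(1, depth + 1) for t in grid]


def mean_vector(vectors):
    """Pointwise sample mean; well defined because landscapes are functions."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def distance(u, v):
    """Euclidean distance between two sampled landscapes (assumed test statistic)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def permutation_test(group_a, group_b, n_perm=2000, seed=0):
    """Return (observed statistic, permutation p-value) for two groups of vectors."""
    rng = random.Random(seed)
    observed = distance(mean_vector(group_a), mean_vector(group_b))
    pooled = group_a + group_b
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled sample
        stat = distance(mean_vector(pooled[:n_a]), mean_vector(pooled[n_a:]))
        if stat >= observed:
            hits += 1
    # The +1 terms count the observed labelling itself, so p is never exactly 0.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical toy data: one loop per diagram, dying later in the second group.
    grid = [0.5 * j for j in range(13)]
    group_1 = [landscape_vector([(1.0, 3.0 + 0.1 * i)], grid, 2) for i in range(7)]
    group_2 = [landscape_vector([(1.0, 5.0 + 0.1 * i)], grid, 2) for i in range(7)]
    stat, p_value = permutation_test(group_1, group_2)
    print(f"statistic = {stat:.3f}, p-value = {p_value:.4f}")
```

The same flattened vectors could also serve as feature vectors for a support vector machine, which is the sense in which landscapes can be combined with machine learning methods.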


References

  • Ahmad, S., M. Gromiha, H. Fawareh and A. Sarai (2004): “Asa-view: solvent accessibility graphics for proteins,” Available at http://www.abren.net/asaview/. Accessed on December 3, 2011.

  • Amitai, G., A. Shemesh, E. Sitbon, M. Shklar, D. Netanely, I. Venger and S. Pietrokovski (2004): “Network analysis of protein structures identifies functional residues,” J. Mol. Biol., 344, 1135–1146.

  • Atilgan, A. R., S. R. Durell, R. L. Jernigan, M. C. Demirel, O. Keskin and I. Bahar (2001): “Anisotropy of fluctuation dynamics of proteins with an elastic network model,” Biophys. J., 80, 505–515.

  • Bandulasiri, A., R. N. Bhattacharya and V. Patrangenaru (2009): “Nonparametric inference for extrinsic means on size-and-(reflection)-shape manifolds with applications in medical imaging,” J. Multivariate Anal., 100, 1867–1882.

  • Bendich, P., T. Galkovskyi and J. Harer (2011): “Improving homology estimates with random walks,” Inverse Probl., 27, 124002.

  • Bernstein, F. C., T. F. Koetzle, G. J. Williams, E. E. Meyer Jr., M. D. Brice, J. R. Rodgers, O. Kennard, T. Shimanouchi and M. Tasumi (1977): “The protein data bank: a computer-based archival file for macromolecular structures,” J. Mol. Biol., 112, 535.

  • Bhattacharya, A. (2008): “Statistical analysis on manifolds: a nonparametric approach for inference on shape spaces,” Sankhya Ser. A., 70-A, 223–266.

  • Bhattacharya, R. and V. Patrangenaru (2003): “Large sample theory of intrinsic and extrinsic sample means on manifolds I,” Ann. Stat., 31, 1–29.

  • Boos, W. and H. Shuman (1998): “Maltose/maltodextrin system of Escherichia coli: transport, metabolism, and regulation,” Microbiol. Mol. Biol. Rev., 62, 204–229.

  • Bradley, M. J., P. T. Chivers and N. A. Baker (2008): “Molecular dynamics simulation of the Escherichia coli NikR protein: equilibrium conformational fluctuations reveal interdomain allosteric communication pathways,” J. Mol. Biol., 378, 1155–1173.

  • Bubenik, P. (2015): “Statistical topological data analysis using persistence landscapes,” J. Mach. Learn. Res., 16, 77–102.

  • Bubenik, P., G. Carlsson, P. T. Kim and Z.-M. Luo (2010): “Statistical topology via Morse theory persistence and nonparametric estimation,” In: Algebraic methods in statistics and probability II, Contemp. Math., volume 516, Providence, RI: Amer. Math. Soc., 75–92.

  • Bubenik, P. and J. Scott (2014): “Categorification of persistent homology,” Discrete Comput. Geom., 51, 600–627.

  • Cavasotto, C. N., J. A. Kovacs and R. A. Abagyan (2005): “Representing receptor flexibility in ligand docking through relevant normal modes,” J. Am. Chem. Soc., 127, 9632–9640.

  • Chazal, F., B. Fasy, F. Lecci, A. Rinaldo, A. Singh and L. Wasserman (2014a): “On the bootstrap for persistence diagrams and landscapes,” Model. Anal. Inform. Syst., 20, 96–105.

  • Chazal, F., B. T. Fasy, F. Lecci, A. Rinaldo and L. Wasserman (2014b): “Stochastic convergence of persistence landscapes and silhouettes,” In: Proceedings of the Thirtieth Annual Symposium on Computational Geometry, SOCG’14, New York, NY, USA: ACM, 474–483.

  • Collins, A., A. Zomorodian, G. Carlsson and L. J. Guibas (2004): “A barcode shape descriptor for curve point cloud data,” Comput. Graph., 28, 881–894.

  • de Silva, V. and P. Perry (2005): “Plex: A MATLAB library for studying simplicial homology,” Available at http://comptop.stanford.edu/programs/plex. Accessed on January 19, 2012.

  • Dijkstra, E. (1959): “A note on two problems in connexion with graphs,” Numer. Math., 1, 269–271.

  • Dryden, I. L. and K. V. Mardia (1998): Statistical shape analysis, New York: John Wiley and Sons.

  • Duan, X., J. A. Hall, H. Nikaido and F. A. Quiocho (2001): “Crystal structures of the maltodextrin/maltose-binding protein complexed with reduced oligosaccharides: flexibility of tertiary structure and ligand binding,” J. Mol. Biol., 306, 1115–1126.

  • Duan, X. and F. A. Quiocho (2002): “Structural evidence for a dominant role of nonpolar interactions in the binding of a transport/chemosensory receptor to its highly polar ligands,” Biochemistry, 41, 706–712.

  • Edelsbrunner, H. and J. Harer (2010): Computational topology: an introduction, Providence, Rhode Island: American Mathematical Society.

  • Edelsbrunner, H., D. Letscher and A. Zomorodian (2002): “Topological persistence and simplification,” Discrete Comput. Geom., 28, 511–533.

  • Eyal, E., L.-W. Yang, I. Bahar and A. Tramontano (2006): “Anisotropic network model: systematic evaluation and a new web interface,” Bioinformatics, 22, 2619–2627.

  • Fasy, B. T., F. Lecci, A. Rinaldo, L. Wasserman, S. Balakrishnan and A. Singh (2014): “Confidence sets for persistence diagrams,” Ann. Statist., 42, 2301–2339.

  • Gameiro, M., Y. Hiraoka, S. Izumi, M. Kramar, K. Mischaikow and V. Nanda (2012): “Topological measurement of protein compressibility via persistence diagrams,” In: The Global COE Program, MI Preprint Series, volume 6, Math for Industry Education & Research Hub, Fukuoka, Japan: Kyushu University, 1–10.

  • Gekko, K. and Y. Hasegawa (1986): “Compressibility–structure relationship of globular proteins,” Biochemistry, 25, 6563–6571.

  • Gould, A. D. and B. H. Shilton (2010): “Studies of the maltose transport system reveal a mechanism for coupling ATP hydrolysis to substrate translocation without direct recognition of substrate,” J. Biol. Chem., 285, 11290–11296.

  • Hatcher, A. (2002): Algebraic topology, Cambridge: Cambridge University Press.

  • Heo, G., J. Gamble and P. T. Kim (2012): “Topological analysis of variance and the maxillary complex,” J. Am. Stat. Assoc., 107, 477–492.

  • HKF (2013): “How to plot a hyper plane in 3D for the SVM results?” http://stackoverflow.com/a/19969412. Accessed on November 14, 2013.

  • Hudault, S., J. Guignot and A. L. Servin (2001): “Escherichia coli strains colonising the gastrointestinal tract protect germfree mice against Salmonella typhimurium infection,” Gut, 49, 47–55.

  • Inkscape (2010): “Inkscape: open source vector graphics editor,” Free Software Foundation, Inc., Available at http://inkscape.org/.

  • Kasahara, K., I. Fukuda and H. Nakamura (2014): “A novel approach of dynamic cross correlation analysis on molecular dynamics simulations and its application to Ets1 dimer–DNA complex,” PLoS ONE, 9, e112419.

  • Kobryn, A. E., D. Nikolić, O. Lyubimova, S. Gusarov and A. Kovalenko (2014): “Dissipative particle dynamics with an effective pair potential from integral equation theory of molecular liquids,” J. Phys. Chem. B, 118, 12034–12049.

  • Kovacev-Nikolic, V. (2012): Persistent homology in analysis of point-cloud data, Master’s thesis, Department of Mathematical and Statistical Sciences, University of Alberta.

  • Ledoux, M. and M. Talagrand (2002): Probability in Banach spaces: isoperimetry and processes, A Series of Modern Surveys in Mathematics Series, Springer, first reprint 2002 edition.

  • Lockless, S. W. and R. Ranganathan (1999): “Evolutionarily conserved pathways of energetic connectivity in protein families,” Science, 286, 295–299.

  • Marvin, J. S., E. E. Corcoran, N. A. Hattangadi, J. V. Zhang, S. A. Gere and H. W. Hellinga (1997): “The rational design of allosteric interactions in a monomeric protein and its applications to the construction of biosensors,” P. Natl. Acad. Sci., 94, 4366–4371.

  • MATLAB (2005): “Matlab release 14,” The MathWorks Inc., Natick, Massachusetts, USA.

  • MATLAB (2011): “Matlab and statistics toolbox release 2011a,” The MathWorks Inc., Natick, Massachusetts, USA.

  • McNaught, A. D. and A. Wilkinson (1997): IUPAC compendium of chemical terminology, 2nd ed., Oxford: Blackwell Scientific Publications.

  • Mileyko, Y., S. Mukherjee and J. Harer (2011): “Probability measures on the space of persistence diagrams,” Inverse Probl., 27, 124007.

  • Morris, G. M., R. Huey, W. Lindstrom, M. F. Sanner, R. K. Belew, D. S. Goodsell and A. J. Olson (2009): “Autodock4 and autodocktools4: automated docking with selective receptor flexibility,” J. Comp. Chem., 15, 2785–2791.

  • Nikolić, D., N. Blinov, D. Wishart and A. Kovalenko (2012): “3d-rism-dock: a new fragment-based drug design protocol,” J. Chem. Theory Comput., 8, 3356–3372.

  • Nikolić, D. and V. Kovacev-Nikolic (2013): “Dynamical model of the maltose-binding protein,” unpublished, 11 pages, Research Gate. DOI: 10.13140/2.1.3269.8883.

  • Quiocho, F. A., J. C. Spurlino and L. E. Rodseth (1997): “Extensive features of tight oligosaccharide binding revealed in high-resolution structures of the maltodextrin transport/chemosensory receptor,” Structure, 5, 997–1015.

  • R Development Core Team (2008): R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, URL http://www.R-project.org, ISBN 3-900051-07-0.

  • Reininghaus, J., S. Huber, U. Bauer and R. Kwitt (2015): “A stable multi-scale kernel for topological machine learning,” In: Proc. 2015 IEEE Conf. Comp. Vision & Pat. Rec. (CVPR ’15), Boston, MA, USA, 4741–4748.

  • Rizk, S. S., M. Paduch, J. H. Heithaus, E. M. Duguid, A. Sandstrom and A. A. Kossiakoff (2011): “Allosteric control of ligand-binding affinity using engineered conformation-specific effector proteins,” Nat. Struct. Mol. Biol., 18, 437–442.

  • Rubin, S. M., S.-Y. Lee, E. J. Ruiz, A. Pines and D. E. Wemmer (2002): “Detection and characterization of xenon-binding sites in proteins by 129Xe NMR spectroscopy,” J. Mol. Biol., 322, 425–440.

  • Seeliger, D. and B. L. de Groot (2010): “Conformational transitions upon ligand binding: holo-structure prediction from apo conformations,” PLoS Comput. Biol., 6, e1000634.

  • Sharff, A. J., L. E. Rodseth, J. C. Spurlino and F. A. Quiocho (1992): “Crystallographic evidence of a large ligand-induced hinge-twist motion between the two domains of the maltodextrin binding protein involved in active transport and chemotaxis,” Biochemistry, 31, 10657–10663.

  • Shilton, B. H., H. A. Shuman and S. L. Mowbray (1996): “Crystal structures and solution conformations of a dominant-negative mutant of Escherichia coli maltose-binding protein,” J. Mol. Biol., 264, 364–376.

  • Szmelcman, S., M. Schwartz, T. J. Silhavy and W. Boos (1976): “Maltose transport in Escherichia coli K12,” Eur. J. Biochem., 65, 13–19.

  • Tamal, K. D., S. Jian and W. Yusu (2011): “Approximating cycles in a shortest basis of the first homology group from point data,” Inverse Probl., 27, 124004.

  • Tang, S., J.-C. Liao, A. R. Dunn, R. B. Altman, J. A. Spudich and J. P. Schmidt (2007): “Predicting allosteric communication in myosin via a pathway of conserved residues,” J. Mol. Biol., 373, 1361–1373.

  • Tausz, A., M. Vejdemo-Johansson and H. Adams (2011): “JavaPlex: a research software package for persistent (co)homology,” Available at http://code.google.com/javaplex.

  • Tenenbaum, J. B., V. de Silva and J. C. Langford (2000): “Isomap: a global geometric framework for nonlinear dimensionality reduction,” Science, 290, 2319–2323.

  • Tobi, D. and I. Bahar (2005): “Structural changes involved in protein binding correlate with intrinsic motions of proteins in the unbound state,” P. Natl. Acad. Sci. USA, 102, 18908–18913.

  • Turner, K., Y. Mileyko, S. Mukherjee and J. Harer (2014): “Fréchet means for distributions of persistence diagrams,” Discrete Comput. Geom., 52, 44–70.

  • Van Houdt, R. and C. W. Michiels (2005): “Role of bacterial cell surface structures in Escherichia coli biofilm formation,” Res. Microbiol., 156, 626–633.

  • Xia, K. and G.-W. Wei (2014): “Persistent homology analysis of protein structure, flexibility, and folding,” Int. J. Numer. Meth. Biomed. Eng., 30, 814–844.

  • Xia, K. and G.-W. Wei (2015a): “Multidimensional persistence in biomolecular data,” J. Comput. Chem., 36, 1502–1520.

  • Xia, K. and G.-W. Wei (2015b): “Persistent topology for cryo-EM data analysis,” Int. J. Numer. Meth. Biomed. Eng., 31. DOI: 10.1002/cnm.2719.

  • Zomorodian, A. and G. Carlsson (2005): “Computing persistent homology,” Discrete Comput. Geom., 33, 249–274.

About the article

Corresponding author: Giseon Heo, School of Dentistry; Department of Mathematical and Statistical Sciences, University of Alberta, Edmonton, Canada, T6G 2N8

Published Online: 2016-01-19

Published in Print: 2016-03-01

Citation Information: Statistical Applications in Genetics and Molecular Biology, Volume 15, Issue 1, Pages 19–38, ISSN (Online) 1544-6115, ISSN (Print) 2194-6302, DOI: https://doi.org/10.1515/sagmb-2015-0057.


©2016 by De Gruyter.

Citing Articles

Crossref-listed publications in which this article is cited:

  • Menglun Wang, Zixuan Cang, and Guo-Wei Wei, Nature Machine Intelligence, 2020, Volume 2, Number 2, Page 116
  • Peter Bubenik, Michael Hull, Dhruv Patel, and Benjamin Whittle, Inverse Problems, 2020, Volume 36, Number 2, Page 025008
  • Anubha Goel, Puneet Pasricha, and Aparna Mehra, Expert Systems with Applications, 2020, Page 113222
  • Paul Bendich, Peter Bubenik, and Alexander Wagner, Journal of Applied and Computational Topology, 2019
  • Lee Steinberg, John Russo, and Jeremy Frey, Journal of Cheminformatics, 2019, Volume 11, Number 1
  • Bernadette Stolz, Heather Harrington, and Mason Alexander Porter, SSRN Electronic Journal, 2016
  • Vic Patrangenaru, Peter Bubenik, Robert L. Paige, and Daniel Osborne, Sankhya A, 2018
  • Francisco Belchi, Mariam Pirashvili, Joy Conway, Michael Bennett, Ratko Djukanovic, and Jacek Brodzki, Scientific Reports, 2018, Volume 8, Number 1
  • Larry Wasserman, Annual Review of Statistics and Its Application, 2018, Volume 5, Number 1, Page 501
  • Peter Bubenik and Paweł Dłotko, Journal of Symbolic Computation, 2017, Volume 78, Page 91
  • Bernadette J. Stolz, Heather A. Harrington, and Mason A. Porter, Chaos: An Interdisciplinary Journal of Nonlinear Science, 2017, Volume 27, Number 4, Page 047410
