
Bio-Algorithms and Med-Systems

Editor-in-Chief: Roterman-Konieczna, Irena

CiteScore 2018: 0.29

SCImago Journal Rank (SJR) 2018: 0.129
Source Normalized Impact per Paper (SNIP) 2018: 0.324


Protein intrachain contact prediction with most interacting residues (MIR)

Ruben Acuña
  • Scientific Data Management Laboratory, School of Electrical, Computer and Energy Engineering (ECEE), Arizona State University, Tempe, AZ 85282-5706, USA
/ Zoé Lacroix
  • Corresponding author
  • Scientific Data Management Laboratory, School of Electrical, Computer and Energy Engineering (ECEE), Arizona State University, Tempe, AZ 85282-5706, USA
/ Nikolaos Papandreou / Jacques Chomilier
  • Protein Structure Prediction group, IMPMC, Sorbonne University, UPMC, CNRS, MNHN, IRD, Paris, France
  • RPBS, Paris, France
Published Online: 2014-11-27 | DOI: https://doi.org/10.1515/bams-2014-0015


The transition state ensemble during the folding process of globular proteins occurs when a sufficient number of intrachain contacts are formed, mainly, but not exclusively, due to hydrophobic interactions. These contacts are related to the folding nucleus, and they contribute to the stability of the native structure, although they may disappear after the energetic barrier of transition states has been passed. A number of structure and sequence analyses, as well as protein engineering studies, have shown that the signature of the folding nucleus is surprisingly present in the native three-dimensional structure, in the form of closed loops, and also in the early folding events. These findings support the idea that the residues of the folding nucleus become buried in the very first folding events, thereby helping the formation of closed loops that act as anchor structures, speed up the process, and overcome the Levinthal paradox. We present here a review of an algorithm intended to simulate, in a discrete space, the early steps of the folding process. It is based on a Monte Carlo simulation where perturbations, or moves, are randomly applied to residues within a sequence. In contrast with many technically similar approaches, this model is not intended to fold the protein but to calculate the number of non-covalent neighbors of each residue during the early steps of the folding process. Amino acids along the sequence are categorized as most interacting residues (MIRs) or least interacting residues. The MIR method can be applied under a variety of circumstances. In the cases tested thus far, MIR has successfully identified the exact residue whose mutation causes a switch in conformation. This is consistent with the idea that MIR identifies residues that are important in the folding process. Most MIR positions correspond to hydrophobic residues; correspondingly, MIRs have zero or very low accessible surface area.
Alongside the review of the MIR method, we present a new postprocessing method called smoothed MIR (SMIR), which refines the original MIR method by exploiting the knowledge of residue hydrophobicity. We review known results and present new ones, focusing on the ability of MIR to predict structural changes and secondary structure, and on the improved precision obtained with the SMIR method.
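
The neighbor-counting idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the simple cubic lattice, the end and corner-flip moves, the toy hydrophobic contact energy, the Metropolis temperature, and the top-fraction cutoff in `classify` are all assumptions chosen for brevity, and every function name is hypothetical. The sketch only shows the general principle of applying random moves to a chain and averaging the number of non-covalent neighbors of each residue.

```python
import math
import random

# Assumption: a crude hydrophobic alphabet, not the potential used by MIR.
HYDROPHOBIC = set("VILFMWYC")
NEIGHBOR_STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def add(a, b):
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def noncovalent_neighbors(occupied, coords, i):
    """Residues on sites adjacent to residue i, excluding its chain neighbors."""
    found = []
    for step in NEIGHBOR_STEPS:
        j = occupied.get(add(coords[i], step))
        if j is not None and abs(i - j) > 1:
            found.append(j)
    return found


def energy(seq, coords):
    """Toy energy: -1 per hydrophobic-hydrophobic non-covalent contact."""
    occupied = {pos: j for j, pos in enumerate(coords)}
    total = 0
    for i, residue in enumerate(seq):
        if residue in HYDROPHOBIC:
            for j in noncovalent_neighbors(occupied, coords, i):
                if j > i and seq[j] in HYDROPHOBIC:
                    total -= 1
    return total


def propose_move(coords, rng):
    """Random end move or corner flip; returns None if the move is not allowed."""
    n = len(coords)
    i = rng.randrange(n)
    if i == 0 or i == n - 1:
        anchor = coords[1] if i == 0 else coords[n - 2]
        new_pos = add(anchor, rng.choice(NEIGHBOR_STEPS))
    else:
        p, c, q = coords[i - 1], coords[i], coords[i + 1]
        new_pos = (p[0] + q[0] - c[0], p[1] + q[1] - c[1], p[2] + q[2] - c[2])
    if new_pos in coords:  # occupied site, or a collinear segment that cannot flip
        return None
    trial = list(coords)
    trial[i] = new_pos
    return trial


def mean_neighbor_counts(seq, steps=20000, temperature=1.0, seed=0):
    """Average number of non-covalent neighbors per residue over a Metropolis run."""
    if steps <= 0:
        raise ValueError("steps must be positive")
    rng = random.Random(seed)
    coords = [(i, 0, 0) for i in range(len(seq))]  # start from an extended chain
    current = energy(seq, coords)
    totals = [0] * len(seq)
    for _ in range(steps):
        trial = propose_move(coords, rng)
        if trial is not None:
            delta = energy(seq, trial) - current
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                coords, current = trial, current + delta
        occupied = {pos: j for j, pos in enumerate(coords)}
        for i in range(len(seq)):
            totals[i] += len(noncovalent_neighbors(occupied, coords, i))
    return [t / steps for t in totals]


def classify(mean_counts, fraction=0.2):
    """Hypothetical cutoff: the top `fraction` of residues by mean count are labeled MIR."""
    k = max(1, round(fraction * len(mean_counts)))
    ranked = sorted(range(len(mean_counts)), key=lambda i: mean_counts[i], reverse=True)
    return sorted(ranked[:k])
```

A call such as `classify(mean_neighbor_counts("AVLAGIFKAVLE", steps=5000))` returns the zero-based positions that this toy model would label as most interacting. With a hydrophobic contact energy, those positions tend to be hydrophobic, which mirrors the qualitative observation in the abstract but says nothing about the accuracy of the published method.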

Keywords: globular proteins; hydrophobic core; prediction method; protein folding nucleus; protein structure



About the article

Corresponding author: Zoé Lacroix, Scientific Data Management Laboratory, School of Electrical, Computer and Energy Engineering (ECEE), Arizona State University, Tempe, AZ 85282-5706, USA

Received: 2014-09-11

Accepted: 2014-10-16

Published Online: 2014-11-27

Published in Print: 2014-12-19

Citation Information: Bio-Algorithms and Med-Systems, Volume 10, Issue 4, Pages 227–242, ISSN (Online) 1896-530X, ISSN (Print) 1895-9091, DOI: https://doi.org/10.1515/bams-2014-0015.


©2014 by De Gruyter.
