
Cellular and Molecular Biology Letters


Volume 16, Issue 2 (Jun 2011)

PPI_SVM: Prediction of protein-protein interactions using machine learning, domain-domain affinities and frequency tables

Piyali Chatterjee / Subhadip Basu / Mahantapas Kundu / Mita Nasipuri / Dariusz Plewczynski
Published Online: 2011-03-26 | DOI: https://doi.org/10.2478/s11658-011-0008-x

Abstract

Protein-protein interactions (PPI) control most of the biological processes in a living cell. To fully understand protein function, knowledge of these interactions is necessary. Predicting PPI is challenging, especially when the three-dimensional structures of the interacting partners are not known. Recently, a novel prediction method was proposed that exploits the physical interactions of constituent domains. Here we propose a knowledge-based prediction method, PPI_SVM, which predicts interactions between two protein sequences by exploiting their domain information. We trained a two-class support vector machine on a benchmarking set of pairs of interacting proteins extracted from the Database of Interacting Proteins (DIP). Unlike most existing approaches, the method considers all possible combinations of constituent domains between two protein sequences. Moreover, it handles both single-domain and multi-domain proteins, so it can be applied to the whole proteome in high-throughput studies. Our machine learning classifier, following a brainstorming approach, achieves an accuracy of 86%, with a specificity of 95% and a sensitivity of 75%. These results are better than those of most previous methods, which sacrifice recall in order to boost overall precision. On the benchmarking dataset, our method has on average better sensitivity combined with good selectivity. The PPI_SVM source code, train/test datasets and supplementary files are freely available in the public domain at: http://code.google.com/p/cmater-bioinfo/.

Keywords: Protein-protein interaction; Domain-frequency values; Domain-domain interaction affinity value; Proteome; Interactome; Brainstorming; Machine learning; Consensus; DIP; Protein domains; Sequences; Structures; Protein-protein complexes
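
The two ideas the abstract relies on, enumerating every combination of constituent domains between two proteins and reporting accuracy, sensitivity and specificity, can be illustrated with a minimal Python sketch. This is not the PPI_SVM implementation: the function names, the Pfam-style domain identifiers and the confusion-matrix counts are all hypothetical and chosen only for illustration. The metric formulas are the standard definitions for a two-class classifier.

```python
from itertools import product


def domain_pairs(domains_a, domains_b):
    """List every (domain of protein A, domain of protein B) combination.

    Works for single-domain and multi-domain proteins alike: a protein
    with one domain simply contributes a one-element list.
    """
    return list(product(domains_a, domains_b))


def binary_metrics(tp, tn, fp, fn):
    """Standard two-class metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # also called recall
        "specificity": tn / (tn + fp),
    }


# Hypothetical domain identifiers: a two-domain protein and a one-domain protein.
pairs = domain_pairs(["PF00069", "PF00017"], ["PF00018"])

# Hypothetical counts, NOT the paper's confusion matrix.
metrics = binary_metrics(tp=75, tn=95, fp=5, fn=25)

print(pairs)
print(metrics)
```

In a pipeline of the kind the abstract describes, each domain pair would presumably be mapped to numeric features (for example frequency or affinity values) before being passed to the classifier; that mapping is not specified in the abstract and is left out here.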

About the article

Published Online: 2011-03-26

Published in Print: 2011-06-01


Citation Information: Cellular and Molecular Biology Letters, ISSN (Online) 1689-1392, DOI: https://doi.org/10.2478/s11658-011-0008-x.

© 2011 Versita Warsaw. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License (CC BY-NC-ND 3.0).
