BY-NC-ND 3.0 license. Open Access. Published by De Gruyter Open Access, November 26, 2015.

Prediction of protein structural class based on Linear Predictive Coding of PSI-BLAST profiles

Yufang Qin, Xiaoqi Zheng, Jun Wang, Ming Chen and Changjie Zhou
From the journal Open Life Sciences

Abstract

Knowledge of protein structure plays a key role in the analysis of protein function and protein binding, in rational drug design, and in many related fields and applications. In this study, a novel feature extraction model based on linear predictive coding (LPC) and position-specific scoring matrices (PSSMs) was proposed to predict structural class from protein sequences. First, the PSI-BLAST tool was used to transform the original protein sequences into PSSMs. Then LPC, a signal processing technique, was applied to extract features from the PSSMs. Finally, the selected features were fed to a support vector machine to perform the prediction. Cross-validation tests on four benchmark datasets (Z277, Z498, 1189 and 25PDB) showed a marked improvement in overall accuracy with the proposed method. Compared with existing methods, our method achieved better performance in predicting protein structural class.
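The LPC feature extraction step can be illustrated with a minimal sketch. It treats each PSSM column as a one-dimensional signal along the sequence and fits a linear predictor to it using the autocorrelation method with the Levinson-Durbin recursion. The abstract does not state the LPC order, the estimation method, or any PSSM normalization, so the order of 4, the use of raw scores, and the function names `autocorr`, `lpc` and `pssm_lpc_features` are all assumptions made for illustration. The PSI-BLAST and support vector machine steps are not shown.

```python
def autocorr(signal, max_lag):
    """Autocorrelation sums r[0..max_lag] of a 1-D sequence (no mean removal)."""
    n = len(signal)
    return [
        sum(signal[t] * signal[t + k] for t in range(n - k))
        for k in range(max_lag + 1)
    ]


def lpc(signal, order):
    """LPC coefficients a[1..order] via the Levinson-Durbin recursion.

    The coefficients approximate signal[t] by sum_k a[k] * signal[t - k].
    """
    r = autocorr(signal, order)
    err = r[0]
    if err == 0:
        # An all-zero signal carries no information; return zeros.
        return [0.0] * order
    a = [0.0] * (order + 1)
    for i in range(1, order + 1):
        # Reflection coefficient for the step from order i-1 to order i.
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
        if err <= 0:
            # Signal is perfectly predictable; higher-order terms stay zero.
            break
    return a[1:]


def pssm_lpc_features(pssm, order=4):
    """Concatenate the LPC coefficients of every PSSM column.

    pssm is a list of rows (one per residue), each with one score per
    amino acid type (20 for a standard PSSM). The order is an assumed value.
    """
    n_cols = len(pssm[0])
    features = []
    for c in range(n_cols):
        column = [row[c] for row in pssm]
        features.extend(lpc(column, order))
    return features
```

With a standard 20-column PSSM this yields a fixed-length vector of 20 × order values regardless of sequence length, which is what makes such features suitable as input to a classifier like a support vector machine.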

Received: 2014-4-5
Accepted: 2015-2-17
Published Online: 2015-11-26

©2015 Yufang Qin et al.

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.
