Protein Fold Prediction Based on Multi-Feature Fusion (基于多特征融合的蛋白质折叠子预测)
Cited by: 2
Abstract: Protein fold prediction provides useful information for the heuristic search of protein tertiary structure. Most existing fold prediction methods rely on a single feature or on a simple combination of several features. This paper applies a multi-feature fusion (MFF) method to predict 27 fold classes starting from the primary sequence of a protein. A support vector machine (SVM) serves as the classifier, with the all-versus-all strategy for multi-class classification. Each sample is described by six features: amino acid composition, polarity, polarizability, van der Waals volume, hydrophobicity, and predicted secondary structure. Sixteen fusion schemes were applied to the independent test set, and an overall accuracy of 59.22% was achieved, 3.2% higher than the result of Ding et al. This result and the comparison with the work of Ding et al. indicate that MFF is an effective method for protein fold prediction.
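The pipeline the abstract outlines (per-feature classifiers, all-versus-all voting, then fusion) can be sketched roughly as follows. This is only an illustration under stated assumptions: the record does not say which fusion rule or software the authors used, so the sum-of-votes rule below is an assumed stand-in, the trained SVMs are replaced by toy callables, and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return the 20-dimensional vector of residue fractions for a sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def all_versus_all_vote(pairwise_winner, classes):
    """Ask one binary classifier per class pair and tally the votes per class."""
    votes = {c: 0 for c in classes}
    for a, b in combinations(classes, 2):
        votes[pairwise_winner(a, b)] += 1
    return votes


def fuse_by_sum(votes_per_feature):
    """Assumed fusion rule: add the vote tallies of all feature sets and
    return the class with the most votes (ties go to the smallest label)."""
    fused = Counter()
    for votes in votes_per_feature:
        fused.update(votes)
    return max(sorted(fused), key=lambda c: fused[c])


# A 27-class all-versus-all scheme needs one classifier per class pair.
n_pairwise = len(list(combinations(range(27), 2)))

# Toy stand-ins for the trained pairwise SVMs of three feature sets.
classes = [1, 2, 3]
toy_classifiers = [
    lambda a, b: min(a, b),
    lambda a, b: max(a, b),
    lambda a, b: 2 if 2 in (a, b) else min(a, b),
]
tallies = [all_versus_all_vote(clf, classes) for clf in toy_classifiers]
predicted = fuse_by_sum(tallies)

composition = amino_acid_composition("ACCA")
```

In a real experiment each toy callable would be replaced by a binary SVM trained on one feature set for one pair of fold classes; the other five features named in the abstract would each need their own encoding, which is not specified in this record.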
Source: Beijing Biomedical Engineering (《北京生物医学工程》), 2006, No. 5, pp. 482-485, 519 (5 pages in total)
Funding: Supported by the National Natural Science Foundation of China (60372085, 60404011)
Keywords: fold prediction; multi-feature fusion; support vector machine; multi-class classification
References (9)

  • 1. Ding CHQ, Dubchak I. Multi-class protein fold recognition using support vector machines and neural networks. Bioinformatics, 2001, 17: 349-358
  • 2. Zhang Shaowu, Pan Quan, Chen Runsheng, Zhang Hongcai. Classification of protein homo-oligomers based on support vector machines [J]. Progress in Biochemistry and Biophysics, 2003, 30(6): 879-883. Cited by: 15
  • 3. Nakashima H, Nishikawa K, Ooi T. The folding type of a protein is relevant to the amino acid composition. J Biochem, 1986, 99: 153-162
  • 4. Vapnik V. Statistical Learning Theory. Wiley-Interscience, 1998
  • 5. Kittler J, Hatef M, Duin RPW, et al. On combining classifiers. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998, 20(3): 226-239
  • 6. Statnikov A, Aliferis CF, Tsamardinos I, et al. A comprehensive evaluation of multicategory classification methods for microarray gene expression cancer diagnosis. Bioinformatics, 2005, 21(5): 631-643
  • 7. Joachims T. Making large-scale SVM learning practical. In: Schölkopf B, Burges C, Smola A, eds. Advances in Kernel Methods: Support Vector Learning. MIT Press, 1999
  • 8. Platt J, Cristianini N, Shawe-Taylor J. Large margin DAGs for multiclass classification. Advances in Neural Information Processing Systems, 2000, 12: 547-553
  • 9. Hsu CW, Lin CJ. A comparison of methods for multi-class support vector machines. IEEE Transactions on Neural Networks, 2002, 13(2): 415-425
