
A Strategy for Parameter Selecting in KNN Based on Greedy Search (一种基于贪婪算法的KNN参数选择策略)

Cited by: 2
Abstract: The k-nearest neighbors (KNN) algorithm is one of the best text categorization algorithms based on the vector space model. When KNN is used, its parameters are usually selected by a greedy search, and the final parameter values depend not only on each parameter's initial value and candidate values but also on the order in which the parameters are tuned. Different parameter selection strategies can therefore differ considerably in their results; through experiments, this paper identifies a better parameter selection strategy for KNN-based text categorization.
Source: Journal of Guangxi Normal University: Natural Science Edition (CAS; Peking University Core Journal), 2008, Issue 1, pp. 182-185 (4 pages).
Funding: National High-Tech R&D Program of China (863 Program) (2006AA01Z143, 2006AA01Z139); National Social Science Foundation of China (07BYY051); Natural Science Foundation of Jiangsu Province (BK2006117).
Keywords: text categorization; KNN; parameter tuning; greedy search
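
To make the tuning procedure described in the abstract concrete, below is a minimal, hypothetical sketch of greedy, one-parameter-at-a-time selection for a KNN text classifier in the vector space model. It is not the authors' code: the scikit-learn usage, the parameter names, the candidate grids, and the toy corpus are assumptions made for illustration only.

```python
# Minimal illustrative sketch (not the paper's code): greedy, coordinate-wise
# parameter selection for a KNN text classifier in the vector space model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Toy corpus standing in for a real text-categorization benchmark (assumption).
docs = ["cheap meds online", "buy pills now", "project meeting today",
        "lunch with the team", "discount pills cheap", "team project update"]
labels = [1, 1, 0, 0, 1, 0]
X = TfidfVectorizer().fit_transform(docs)   # documents as TF-IDF vectors

# Candidate values for each tunable parameter; the dict order is the tuning order.
candidates = {
    "n_neighbors": [1, 3],
    "weights": ["uniform", "distance"],
    "metric": ["cosine", "euclidean"],
}
# Initial values: the greedy outcome depends on these and on the tuning order.
current = {"n_neighbors": 3, "weights": "uniform", "metric": "cosine"}

def score(params):
    """Cross-validated accuracy of KNN under one parameter setting."""
    return cross_val_score(KNeighborsClassifier(**params), X, labels, cv=3).mean()

# Greedy search: tune one parameter at a time while the others stay fixed at
# their current values, so earlier choices constrain later ones.
for name in candidates:
    best_val, best_score = current[name], score(current)
    for val in candidates[name]:
        trial_score = score(dict(current, **{name: val}))
        if trial_score > best_score:
            best_val, best_score = val, trial_score
    current[name] = best_val

print("greedy choice:", current)
```

Running the same loop with a different tuning order (for example, metric before n_neighbors) or with different initial values can settle on a different final combination, which is the order dependence the abstract highlights.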
Related Literature

References (11)

  • 1 AAS K, EIKVIL L. Text categorization: a survey[R]. Oslo: Norwegian Computing Center, 1999.
  • 2 YANG Yi-ming. An evaluation of statistical approaches to text categorization[J]. Information Retrieval, 1999, 1(1/2): 69-90.
  • 3 XU Xiao-ying, WANG Xiao-ye, DU Tai-hang. An improved K-nearest neighbor classification algorithm based on Fuzzy ART[J]. Journal of Hebei University of Technology, 2004, 33(6): 1-5. (cited by 4)
  • 4 QIAN Xiao-dong, WANG Zheng-ou. A text classification method based on improved KNN[J]. Information Science, 2005, 23(4): 550-554. (cited by 19)
  • 5 CHEN Zhen-zhou, LI Lei, YAO Zheng-an. A feature-weighted KNN algorithm based on SVM[J]. Acta Scientiarum Naturalium Universitatis Sunyatseni, 2005, 44(1): 17-20. (cited by 52)
  • 6 MITCHELL T M. Machine Learning[M]. New York: McGraw-Hill, 1996.
  • 7 LEWIS D D, SCHAPIRE R E, CALLAN J P, et al. Training algorithms for linear text classifiers[C]//Proceedings of the Nineteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. New York: ACM Press, 1996: 298-306.
  • 8 YANG Yi-ming, PEDERSEN J O. A comparative study on feature selection in text categorization[C]//Proceedings of the Fourteenth International Conference on Machine Learning. San Francisco, CA: Morgan Kaufmann Publishers Inc., 1997: 412-420.
  • 9 LEWIS D D, YANG Yi-ming, ROSE T G, et al. RCV1: a new benchmark collection for text categorization research[J]. Journal of Machine Learning Research, 2004, 5: 361-397.
  • 10 YANG Yi-ming. A study of thresholding strategies for text categorization[C]//Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. New York: ACM Press, 2001: 137-145.

Secondary References (23)

  • 1 COVER T M, HART P E. Nearest neighbor pattern classification[J]. IEEE Transactions on Information Theory, 1967, IT-13: 21-27.
  • 2 CHO T H, CONNERS R W, ARAMAN P A. A comparison of rule-based, K-nearest neighbor, and neural net classifiers for automation[C]//Proceedings, Developing and Managing Expert System Programs, 1991: 202-209.
  • 3 DUDANI S A. The distance-weighted k-nearest-neighbor rule[J]. IEEE Transactions on Systems, Man, and Cybernetics, 1976, 6: 325-327.
  • 4 VAPNIK V N. The nature of statistical learning theory[M]. New York: Springer-Verlag, 1995. Chinese translation by ZHANG Xue-gong. Beijing: Tsinghua University Press, 1999.
  • 5 BURGES C J C. A tutorial on support vector machines for pattern recognition[M]. Boston: Bell Laboratories, Lucent Technologies, 1997.
  • 6 KEERTHI S S, SHEVADE S K, BHATTACHARYYA C, et al. Improvements to Platt's SMO algorithm for SVM classifier design[J]. Neural Computation, 2001, 13(3): 637-649.
  • 7 LIN C J. A formal analysis of stopping criteria of decomposition methods for support vector machines[J]. IEEE Transactions on Neural Networks, 2002, 13(5): 1045-1052.
  • 8 LEE J H, LIN C J. Automatic model selection for support vector machines[EB/OL]. http://www.csie.ntu.edu.tw/~cjlin/papers.html, 2000.
  • 9 SHIN C, YUN U, KIM H, et al. A hybrid approach of neural network and memory-based learning to data mining[J]. IEEE Transactions on Neural Networks, 2000, 11(3): 637-646.
  • 10 DUDA R O, HART P E. Pattern Classification and Scene Analysis[M]. New York: John Wiley & Sons, 1991.

Co-citing Literature (70)

Co-cited Literature (21)

  • 1 ZHANG Shi-chao. Parimputation: from imputation and null-imputation to partially imputation[J]. IEEE Intelligent Informatics Bulletin, 2008, 9(1): 32-38.
  • 2 ZHANG Shi-chao. Shell-neighbor method and its application in missing data imputation[J]. Applied Intelligence, 2010 (in press).
  • 3 QIN Yong-song, ZHANG Shi-chao, ZHU Xiao-feng, et al. Semi-parametric optimization for missing data imputation[J]. Applied Intelligence, 2007, 27(1): 79-88.
  • 4 BATISTA G, MONARD M C. An analysis of four missing data treatment methods for supervised learning[J]. Applied Artificial Intelligence, 2003, 17(5): 519-533.
  • 5 GEDIGA G, DUNTSCH I. Maximum consistency of incomplete data via non-invasive imputation[J]. Artificial Intelligence Review, 2003, 19(1): 93-107.
  • 6 WANG Qi-hua, RAO J N K. Empirical likelihood-based inference under imputation for missing response data[J]. The Annals of Statistics, 2002, 30(3): 896-924.
  • 7 BATISTA G E, MONARD M C. A study of k-nearest neighbor as a model-based method to treat missing data[C]//Proceedings of the Argentine Symposium on Artificial Intelligence. Berlin, Germany: Springer, 2001, 30: 1-9.
  • 8 ZHU Xiao-feng. Research on several problems of missing value imputation[D]. Guilin: College of Computer Science and Information Engineering, Guangxi Normal University, 2007.
  • 9 ZHOU Nian-cheng, LIAO Jian-quan, WANG Qiang-gang, LI Chun-yan, LI Jian. Analysis and prospect of the application of deep learning in smart grid[J]. Automation of Electric Power Systems, 2019, 43(4): 180-191. (cited by 190)
  • 10 HUANG Jing-guang, DING Jing, ZHENG Shu-wen, LIN Xiang-ning. A new principle of adaptive overcurrent protection based on abrupt change of current[J]. Power System Protection and Control, 2018, 46(7): 49-55. (cited by 15)

Citing Literature (2)

Secondary Citing Literature (5)
