Abstract
Although accuracy (or the overall misclassification rate) is widely used as a performance measure for classification algorithms, it has several deficiencies: it is sensitive to the class prior distribution and to misclassification costs, and it ignores the posterior probability and ranking information produced by the classification algorithm. The area under the receiver operating characteristic (ROC) curve (AUC), by contrast, measures overall classification performance across the entire range of class prior distributions and misclassification costs, as well as posterior probability and ranking performance. It has therefore attracted growing attention in classification learning and has given rise to a large body of research. This paper presents a relatively comprehensive review and summary of that work, covering the advantages of AUC as a performance measure, the design of algorithms based on AUC optimization, the relationship between accuracy-maximizing and AUC-maximizing algorithms, and the deficiencies of AUC together with proposed improvements.
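As a small illustration of the ranking information that accuracy ignores, the sketch below computes AUC as the fraction of (positive, negative) pairs that a scorer orders correctly, which is the standard Wilcoxon-Mann-Whitney reading of AUC. The function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for this example and do not come from the paper.

```python
def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(scores, labels, threshold=0.5):
    """Accuracy after thresholding the scores (threshold is an assumption)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical toy data: two scorers that make the same thresholded decisions.
labels = [1, 1, 1, 0, 0, 0]
scores_a = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
scores_b = [0.9, 0.8, 0.4, 0.95, 0.3, 0.2]  # one negative is ranked above every positive

print(accuracy(scores_a, labels), accuracy(scores_b, labels))
print(auc(scores_a, labels), auc(scores_b, labels))
```

On this toy data both scorers have the same accuracy (4 of 6 correct), while their AUC values differ (8/9 versus 2/3), because only AUC reflects how the scores order positives relative to negatives.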
Source
《模式识别与人工智能》
EI
CSCD
PKU Core Journals (北大核心)
2011, No. 1, pp. 64-71 (8 pages)
Pattern Recognition and Artificial Intelligence
Funding
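Supported by:
National Natural Science Foundation of China (No. 60773061)
Natural Science Foundation of Jiangsu Province (No. BK2008381)
Doctoral Program Foundation of Higher Education Institutions (No. 200802870003)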