摘要
提出了一种基于纠错编码的支持向量机多类分类算法 ( ECC- SVM) ,并在理论上研究了该算法的推广性与编码长度、码间汉明距离、编码顺序以及每个 SVM推广性之间的关系 ,给出了这种关系的数学表达 ,为提高该算法的推广能力指明了方向。把目前广泛使用的 1 - v- R SVM多类分类算法作为该算法的一个特例 ,分析了它的推广性。计算机仿真数据和多光谱遥感图像分类实验结果表明 ,ECC- SVM具有更快的分类速度和更高的分类精度 。
It is very difficult for SVM to perform multiclass classification. Going further than Refs.3~6, we propose using what we call ECC SVM——SVMs combined with ECCs——to perform multiclass classification. The first thing we do is that we provide each of K classes with an independent binary ECC. Thus we obtain a code matrix with K (class number) rows and L (ECC length) columns. For example, in Table 1, K=5 and L=10 bit. The value of each class in a column is either 1 or 0. We put all the “1” value classes in a column in one category, and put all the “0” value classes in a column in the second category. In this way, the multiclass classification problem is decomposed into L binary classification problems. We construct L SVMs to solve the L binary classification problems. The second thing we do is that, in section 2, we discuss theoretically in some depth the generalization performance of ECC SVM. Classification capability of ECC SVM depends on its generalization performance. We draw on VC (Vapnik Chervonenkis) theory to develop the relationship existing between ECC properties——code length, Hamming distance between codes, and margin of each SVM——and the generalization performance of the newly proposed ECC SVM algorithm. In particular, section 2 gives Theorems 1 and 2 for this relationship. For example, according to Theorem 2, the generalization performance of ECC SVM is not determined by SVM with the worst generalization performance. Instead, it depends on M SVMs with better generalization performances. M is equal to the integer of the number expressed by [L-(d-1)/2] , where L is code length, and d is the minimum Hamming distance between codes. Section 2 is important because it indicates how to improve the generalization performance of ECC SVM. Incidentally, the 1 v R SVM (one versus rest SVM), which is widely used to solve multiclass problem, can be treated as a special case of ECC SVM. Compared with 1 v R SVM, ECC SVM has two desirable advantages: (1) shorter codes can be used to raise the speed of classification; (2) the minimum Hamming distances between ECCs can be increased to improve its generalization performance. Section 3 gives experimental results. Tables 3, 4 and 5 confirm the validity of Theorems 1 and 2. We used remote multispectral images to compare the performances of ECC SVM, 1 v R SVM and 1 v 1 SVM. Table 6 shows that our ECC SVM can provide better classification accuracy and faster classification speed than traditional algorithm.
出处
《西北工业大学学报》
EI
CAS
CSCD
北大核心
2003年第4期443-448,共6页
Journal of Northwestern Polytechnical University
基金
国家 973计划
教育部博士点基金
关键词
支持向量机(SVM)
纠错编码(ECC)
多类分类
推广性
1-v-R
SVM
support vector machine (SVM), error correcting code (ECC), multiclass classification, generalization performance, VC (Vapnik Chervonenkis) theory, 1 v R SVM (one versus rest SVM)