Abstract
Spam has long plagued email users. Anti-spam techniques not only suppress spam but also offer lessons for combating unsolicited SMS messages and unwanted VoIP calls. This paper introduces Bayesian spam filtering, describes the implementation of a Chinese spam filtering system, and presents evaluation results. The results show that both the number of features (tokens) used to compute the final probability and the number of training samples have an optimal value. As the number of training samples grows beyond this optimum, filtering performance declines slightly and then levels off.
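The two parameters the abstract discusses, the number of features combined into the final probability and the size of the training set, can be made concrete with a minimal sketch of a Graham-style Bayesian filter. This is an illustration under stated assumptions, not the system described in the paper: the function names `train` and `spam_probability`, the add-one smoothing, the default of 15 features, and the 0.5 score for unseen tokens are all hypothetical choices, and messages are assumed to be already segmented into tokens (for Chinese mail, by a separate word segmenter).

```python
import math
from collections import Counter


def train(spam_docs, ham_docs):
    """Estimate P(spam | token) for every token seen in the training mail.

    Each document is a list of tokens (assumed to be pre-segmented).
    """
    # Count each token once per message, so long messages do not dominate.
    spam_counts = Counter(t for doc in spam_docs for t in set(doc))
    ham_counts = Counter(t for doc in ham_docs for t in set(doc))
    n_spam, n_ham = len(spam_docs), len(ham_docs)
    probs = {}
    for tok in set(spam_counts) | set(ham_counts):
        # Add-one smoothing (an assumption) keeps estimates strictly in (0, 1).
        p_tok_spam = (spam_counts[tok] + 1) / (n_spam + 2)
        p_tok_ham = (ham_counts[tok] + 1) / (n_ham + 2)
        probs[tok] = p_tok_spam / (p_tok_spam + p_tok_ham)
    return probs


def spam_probability(tokens, probs, n_features=15, unknown=0.5):
    """Combine the n_features most decisive token probabilities."""
    ps = [probs.get(t, unknown) for t in set(tokens)]
    # "Most decisive" here means farthest from the neutral value 0.5.
    ps.sort(key=lambda p: abs(p - 0.5), reverse=True)
    top = ps[:n_features]
    # prod(p) / (prod(p) + prod(1 - p)), evaluated in log space.
    log_spam = sum(math.log(p) for p in top)
    log_ham = sum(math.log(1 - p) for p in top)
    diff = log_ham - log_spam
    if diff > 0:
        # Numerically stable branch for strongly ham-like messages.
        e = math.exp(-diff)
        return e / (1 + e)
    return 1 / (1 + math.exp(diff))


# Toy training data, invented for illustration only.
spam_docs = [["free", "prize", "click"], ["free", "offer"]]
ham_docs = [["meeting", "tomorrow"], ["project", "meeting", "notes"]]
probs = train(spam_docs, ham_docs)
```

In this sketch, `n_features` corresponds to the number of features used to compute the final probability, and the lengths of `spam_docs` and `ham_docs` correspond to the training set size; varying them on a labeled corpus is one way to look for the kind of optimum the abstract reports.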
Source
《大连理工大学学报》
EI
CAS
CSCD
PKU Core Journals (北大核心)
2005, Issue z1, pp. 189-195 (7 pages)
Journal of Dalian University of Technology