Abstract
When mining sensitive information on the network, sensitive information differs in its features from normal information and is highly concealed. With traditional mining methods, the inherent sensitive information is obscured and therefore cannot be mined accurately. This paper proposes a method for mining sensitive network information based on a clustering algorithm improved with TF-IDF. TF-IDF is used to obtain sensitive network text, valuable sensitive-information features are extracted from that text, and these features drive the clustering algorithm, which performs cluster analysis over all sensitive-information features to complete the mining. The experimental results indicate that the proposed method mines sensitive network information with high efficiency and precision.
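The abstract does not spell out how TF-IDF and clustering are combined, so the following is only a minimal sketch of the general idea, not the authors' improved algorithm. It assumes documents are already tokenized, weights terms with the textbook TF-IDF formula (term frequency times log of inverse document frequency), and groups the resulting vectors with a plain k-means-style loop under cosine similarity. The function names, the toy tokens, and the choice of seed documents are all hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in counts.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = {}
    for vec in vectors:
        for t, w in vec.items():
            total[t] = total.get(t, 0.0) + w
    return {t: w / len(vectors) for t, w in total.items()}


def cluster(vectors, seeds, max_iter=20):
    """k-means-style clustering; `seeds` are indices of the initial centroids."""
    centroids = [dict(vectors[i]) for i in seeds]
    labels = None
    for _ in range(max_iter):
        # Assign every vector to its most similar centroid.
        new_labels = [
            max(range(len(centroids)), key=lambda k: cosine(v, centroids[k]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Recompute each centroid; an empty cluster keeps its old centroid.
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                centroids[k] = mean_vector(members)
    return labels


# Toy corpus (hypothetical tokens): two fraud-like texts, two ordinary texts.
docs = [
    ["scam", "fraud", "transfer"],
    ["fraud", "scam", "account"],
    ["weather", "sunny", "park"],
    ["park", "picnic", "sunny"],
]
vectors = tf_idf(docs)
labels = cluster(vectors, seeds=[0, 2])
print(labels)
```

In this sketch, terms that occur in fewer documents receive higher weights, so they dominate the similarity computation. A real system would also need word segmentation for Chinese text and a principled way to choose the number of clusters and the initial centroids, none of which the abstract specifies.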
Source
Modern Electronics Technique (《现代电子技术》)
Peking University Core Journal (北大核心)
2015, Issue 24, pp. 44-46, 49 (4 pages)
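Funding
2015 Key Scientific Research Project of Higher Education Institutions of Henan Province: Research on Counter-Terrorism Intelligence Analysis Technology Based on Data Mining (15B520027)
2015 Key Scientific Research Project of Higher Education Institutions of Henan Province: Research on Public Security Informatization Application Technology Based on Big Data (15A120014)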
Keywords
TF-IDF
clustering analysis
sensitive network information
information mining