Abstract
Existing classification methods yield low silhouette coefficients and low Calinski-Harabasz (CH) index values when applied to digital archives. To address this, this paper introduces improved K-means clustering and designs an intelligent classification method for digital archives. Features are extracted from the digital archives and subjected to principal component analysis. Improved K-means clustering is then used to compute the similarity between archive features, and, based on the similarity results, archive keywords are clustered automatically to achieve intelligent classification. Experiments show that, with the new method, both the silhouette coefficient and the CH index of the classification results improve significantly. The method achieves high classification accuracy and can also be widely applied to the classification of similar resources.
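The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions of the silhouette coefficient and the Calinski-Harabasz index with Euclidean distance, which is an assumption, since the abstract does not state the distance measure or the exact formulas used in the paper. The function names and the toy data are hypothetical and do not come from the article.

```python
import math


def mean_point(points):
    """Component-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def silhouette_coefficient(points, labels):
    """Mean silhouette over all samples (assumes at least two clusters).

    Samples in singleton clusters are scored 0, a common convention.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    scores = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the sample's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


def calinski_harabasz_index(points, labels):
    """Ratio of between-cluster to within-cluster dispersion.

    Assumes at least two clusters, more samples than clusters,
    and non-zero within-cluster dispersion.
    """
    n = len(points)
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)
    overall = mean_point(points)
    between = 0.0
    within = 0.0
    for members in clusters.values():
        center = mean_point(members)
        between += len(members) * sq_dist(center, overall)
        within += sum(sq_dist(p, center) for p in members)
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical 2-D feature vectors forming two well-separated groups.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]  # matches the visible grouping
bad_labels = [0, 1, 0, 1, 0, 1]   # mixes the two groups

print(silhouette_coefficient(toy_points, good_labels))
print(silhouette_coefficient(toy_points, bad_labels))
print(calinski_harabasz_index(toy_points, good_labels))
print(calinski_harabasz_index(toy_points, bad_labels))
```

For both metrics a higher value indicates tighter, better-separated clusters, which is the sense in which the abstract reports an improvement. The labeling that matches the visible grouping of the toy data scores higher on both than the mixed labeling.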
Author
LI Jia (李嘉), Kunming Archives, Kunming, Yunnan 650216
Source
Software (《软件》), 2023, No. 11, pp. 103-105 (3 pages)
Keywords
improved K-means clustering
archives
classification
intelligence
digitization