Abstract
[Purpose/Significance] By analyzing texts related to government data privacy, this paper designs a scheme for identifying sensitive data and builds a privacy measurement model that quantifies the privacy value of sensitive data, providing a theoretical basis for government data privacy protection. [Method/Process] First, texts related to government data privacy are screened to build a sample corpus. Then, based on the syntactic structure of the texts, sensitive data items, core verbs, degree words, negation words, and similar terms are extracted to construct a semantic vocabulary of government data privacy. Finally, a privacy measurement model is built on sensitive data units composed of these terms. [Result/Conclusion] Drawing on privacy-related texts, the method accurately extracts the sensitive data in government data and objectively measures the privacy value of government data objects, which can support privacy risk prevention and the standardization of privacy protection for government data.
Authors
臧国全
王家振
毕崇武
耿瑞利
Zang Guoquan; Wang Jiazhen; Bi Chongwu; Geng Ruili (School of Information Management, Zhengzhou University, Zhengzhou 450001; Research Institute of Data Science, Zhengzhou City, Zhengzhou 450001)
Source
《图书情报工作》
CSSCI
Peking University Core Journal
2022, Issue 15, pp. 66-75 (10 pages)
Library and Information Service
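Funding
A research output of the Major Project of the National Social Science Fund of China, "Research on Privacy Risk Measurement and Innovation of Protection Mechanisms for Government Data" (Project No. 21&ZD338).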
Keywords
government data
data privacy
personal privacy
semantic vocabulary
privacy measurement