Abstract
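The internet holds massive amounts of data, so finding useful information within it has become a critical problem. The Semantic Web offers hope for solving this problem. However, there is a large gap between the current state of the web and the Semantic Web: the vast body of unstructured page content is hard to convert directly into semantic knowledge. This paper proposes a semantic annotation method based on document content. It uses the semantic environment expressed by an ontology, that is, the frequency with which ontology-related terms and their semantic context occur in a document, to annotate the document semantically. Experiments show that the method achieves good results, but that it is strongly affected by two factors: the quality of the ontology knowledge and the quality of the annotated documents.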
The internet today holds a huge amount of data, which makes it very difficult for different users to retrieve useful and personalized information. The Semantic Web offers an opportunity to deal with this problem. However, a great gulf separates the Semantic Web from the current internet, because it is impossible to convert all of this data into semantic knowledge. This paper proposes a semantic annotation method based on document content and an ontology. Using several heuristic rules, it annotates documents with ontology entities. These rules calculate the degree of relevance between an ontology entity and a document by taking the entity's semantic context into account rather than only the entity itself. Two factors affect the effectiveness of the method: the first is the quality of the domain ontology we built, and the second is the quality of the document.
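The abstract does not spell out the heuristic rules, so the following is only a minimal sketch of the general idea: score each ontology entity against a document by how often the entity's label and its semantic-context terms occur. The toy ontology, the weights, the threshold, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import re
from collections import Counter

# Hypothetical toy ontology (not from the paper): each entity has a label
# and a set of context terms, e.g. labels of related classes or properties.
ONTOLOGY = {
    "Camera": {"label": "camera", "context": {"lens", "aperture", "shutter"}},
    "Bank": {"label": "bank", "context": {"loan", "account", "interest"}},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def relevance(entity, counts, total, label_weight=1.0, context_weight=0.5):
    """Weighted frequency of the entity label and its context terms.

    The weights are illustrative assumptions, not values from the paper.
    """
    if total == 0:
        return 0.0
    label_freq = counts[entity["label"]]
    context_freq = sum(counts[term] for term in entity["context"])
    return (label_weight * label_freq + context_weight * context_freq) / total


def annotate(text, ontology=ONTOLOGY, threshold=0.05):
    """Return (entity, score) pairs at or above the threshold, best first."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    scored = (
        (name, relevance(entity, counts, len(tokens)))
        for name, entity in ontology.items()
    )
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    doc = "The bank approved the loan and opened an account with low interest."
    print(annotate(doc))
```

Because context terms contribute to the score, a document can be annotated with an entity whose label never appears in it, which is one way to read the abstract's emphasis on semantic context. The sketch also makes the two stated sensitivities visible: a sparse or inaccurate context set (ontology quality) or a short, noisy text (document quality) changes the scores directly.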
Source
《微计算机信息》
2011, No. 1, pp. 298-300 (3 pages)
Control & Automation
Keywords
Ontology (本体)
Semantic Web (语义网)
Semantic Annotation (语义标注)
Document (文档)