Abstract: Future use of heterogeneous databases will take place in WWW and CORBA environments. This paper discusses the integration of WWW databases with the CORBA standard. The two technologies need to be merged to make distributed use of heterogeneous databases user-friendly. In an environment that integrates WWW databases and CORBA technologies, CORBA can be used to access heterogeneous data sources on the Internet. Applications of this kind can use distributed transactions to ensure data consistency and integrity. The technology has good application prospects.
Abstract: Aim: To develop a heterogeneous database united system (HDBUS) that combines local Oracle, Sybase, and SQL Server databases distributed across different servers into a global database, and that supports global transaction management and parallel query over the intranet. Methods: The design and implementation of HDBUS rest on two important concepts, including the heterogeneous table join. Results and Conclusion: The first concept can be used to process parallel queries across multiple database servers; the second is the key technology of heterogeneous distributed databases.
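The parallel query and heterogeneous table join mentioned above can be illustrated with a minimal sketch. It is not the HDBUS implementation: two in-memory SQLite databases stand in for the Oracle and Sybase servers, and the table names, the helper names (`make_source`, `run_query`, `heterogeneous_join`, `global_query`) and the hash-join strategy are all assumptions made for illustration.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def make_source(ddl, insert_sql, rows):
    """Create an in-memory SQLite database standing in for one local DBMS (assumption)."""
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    conn.commit()
    return conn


def run_query(conn, sql):
    """Run one local sub-query and return all of its rows."""
    return conn.execute(sql).fetchall()


def heterogeneous_join(left_rows, right_rows):
    """Hash join on the first column of each row set; the right key column is dropped."""
    index = {}
    for row in right_rows:
        index.setdefault(row[0], []).append(row[1:])
    return [left + rest for left in left_rows for rest in index.get(left[0], [])]


# Two hypothetical local databases with different schemas.
customers = make_source(
    "CREATE TABLE customer (id INTEGER, name TEXT)",
    "INSERT INTO customer VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob")],
)
orders = make_source(
    "CREATE TABLE orders (customer_id INTEGER, item TEXT)",
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "disk"), (1, "cable"), (3, "fan")],
)


def global_query():
    """Send both sub-queries in parallel, then join the results at the global level."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        left = pool.submit(run_query, customers, "SELECT id, name FROM customer ORDER BY id")
        right = pool.submit(run_query, orders, "SELECT customer_id, item FROM orders ORDER BY rowid")
        return heterogeneous_join(left.result(), right.result())


print(global_query())
```

In a real deployment each connection would come from the vendor's own driver, and the join would also have to reconcile data types between systems, which this sketch ignores.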
Abstract: Sharing heterogeneous databases is a problem that must be addressed when accessing different educational resources. This study implements heterogeneous database sharing for educational resources, taking multimedia educational resources as the research object. XML is applied as middleware to meet the practical requirements of education. The study has practical significance for making educational and teaching resource platforms more intelligent.
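As a rough illustration of XML acting as middleware, the sketch below maps records from two sources with different field names onto one shared XML vocabulary. The source names, field names, and the `resources`/`resource` element names are hypothetical; the abstract does not specify a schema.

```python
import xml.etree.ElementTree as ET


def to_common_xml(source_name, records, field_map):
    """Serialize local records into a shared XML layout.

    field_map maps each local field name to the common element name (both hypothetical).
    """
    root = ET.Element("resources", {"source": source_name})
    for record in records:
        item = ET.SubElement(root, "resource")
        for local_field, common_field in field_map.items():
            child = ET.SubElement(item, common_field)
            child.text = str(record[local_field])
    return ET.tostring(root, encoding="unicode")


def read_titles(xml_text):
    """A consumer only needs to know the common vocabulary, not the source schema."""
    return [node.text for node in ET.fromstring(xml_text).iter("title")]


video_xml = to_common_xml(
    "video_db",
    [{"vname": "Algebra lecture", "fmt": "mp4"}],
    {"vname": "title", "fmt": "format"},
)
doc_xml = to_common_xml(
    "doc_db",
    [{"doc_title": "Geometry notes", "type": "pdf"}],
    {"doc_title": "title", "type": "format"},
)
print(read_titles(video_xml) + read_titles(doc_xml))
```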
Funding: Supported by the Monitoring and Early Warning System for Grain Security in Henan (0613024000).
Abstract: Data nodes with heterogeneous databases in the early warning system for grain security seriously hamper effective data collection in the system. This article analyzes existing middleware technologies and discusses a middleware-based approach to heterogeneous data sharing. Based on this approach, and according to the characteristics of the early warning system for grain security, data sharing technology for the system is investigated in order to solve the problem of collecting and sharing heterogeneous data.
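Abstract: Existing event extraction methods struggle to make full use of contextual information in the entity extraction subtask, which lowers event extraction accuracy. To address this problem, a document-level event extraction model based on span and graph convolutional network (DEESG) is proposed. First, an intermediate linear layer is designed to linearly transform the encoded vectors, and the best span is computed with the help of annotation information; improving the accuracy of span start and end position prediction raises the precision of entity extraction. Next, a method for constructing a heterogeneous graph is proposed: a pooling strategy represents entities and sentences as graph nodes, and the graph is built according to the proposed edge construction rules so that global information can interact. A multi-layer graph convolutional network (GCN) then convolves the heterogeneous graph to obtain context-aware entity and sentence representations, which addresses the insufficient use of contextual information. Then, a multi-head attention mechanism is used to detect event types. Finally, argument roles are assigned to the entities in each combination, completing the event extraction task. Experiments were conducted on the Chinese financial announcements (ChFinAnn) dataset. The results show that, compared with the graph-based interaction model with a tracker (GIT), the DEESG model improves the F1 score by 1.3 percentage points. The study confirms that the DEESG model can be applied effectively to document-level event extraction.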
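The graph convolution step can be sketched with the widely used GCN propagation rule, H' = ReLU(Â H W), where Â is the symmetrically normalized adjacency matrix with self-loops. This is a generic illustration, not the DEESG code: the toy graph (two sentence nodes and one entity node), the edges, and the identity weight matrix are assumptions.

```python
import math


def normalize_adjacency(edges, n):
    """Build A + I for an undirected graph and scale it by D^-1/2 on both sides."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(x, y):
    """Plain list-of-lists matrix product."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def gcn_layer(a_hat, h, w):
    """One graph convolution: aggregate neighbor features, project, apply ReLU."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]


# Toy heterogeneous graph: nodes 0 and 1 are sentences, node 2 is an entity.
# Edge (0, 1) links adjacent sentences; edge (0, 2) links the entity to its sentence.
a_hat = normalize_adjacency([(0, 1), (0, 2)], n=3)
features = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]  # the entity node starts with no signal
weights = [[1.0, 0.0], [0.0, 1.0]]               # identity weights keep the example readable
out = gcn_layer(a_hat, features, weights)
print(out[2])  # the entity node now carries features from sentence 0
```

Stacking several such layers lets information travel more than one hop, which is how an entity representation can absorb context from sentences it does not appear in.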
Funding: Supported in part by the National Key R&D Program of China (No. 2020AAA0106600) and the Key Laboratory of Science, Technology and Standard in Press Industry (Key Laboratory of Intelligent Press Media Technology).
Abstract: Entity linking (EL) aims to automatically link the mentions in unstructured documents to corresponding entities in a knowledge base (KB), and has recently been dominated by global models. Although many global EL methods attempt to model the topical coherence among all linked entities, most of them fail to exploit the correlations among the manifold knowledge that helps linking, such as the semantics of mentions and their candidates, the neighborhood information of candidate entities in the KB, and the fine-grained type information of entities. As we show in the paper, interactions among these types of information are very useful for better characterizing the topic features of entities and for more accurately estimating the topical coherence among all the referred entities within the same document. In this paper, we present a novel HEterogeneous Graph-based Entity Linker (HEGEL) for global entity linking, which builds an informative heterogeneous graph for every document to collect various linking clues. HEGEL then uses a novel heterogeneous graph neural network (HGNN) to integrate the different types of manifold information and model the interactions among them. Experiments on the standard benchmark datasets demonstrate that HEGEL captures global coherence well and outperforms prior state-of-the-art EL methods.
Funding: This work was supported by the National Natural Science Foundation of China (Grant Nos. 61806020, 61772082, 61972047, 61702296), the National Key Research and Development Program of China (2017YFB0803304), the Beijing Municipal Natural Science Foundation (4182043), the CCF-Tencent Open Fund, and the Fundamental Research Funds for the Central Universities.
Abstract: Entity set expansion (ESE) aims to expand an entity seed set to obtain more entities that share common properties. ESE is important for many applications such as dictionary construction and query suggestion. Traditional ESE methods relied heavily on the text and Web information of entities. Recently, some ESE methods have employed knowledge graphs (KGs) to extend entities. However, they failed to effectively and efficiently utilize the rich semantics contained in a KG and ignored the text information of entities in Wikipedia. In this paper, we model a KG as a heterogeneous information network (HIN) containing multiple types of objects and relations. Fine-grained multi-type meta paths are proposed to capture the hidden relations among seed entities in a KG and thus to retrieve candidate entities. We then rank the entities according to meta-path-based structural similarity. Furthermore, to utilize the text descriptions of entities in Wikipedia, we propose an extended model, CoMeSE++, which combines the structural information revealed by a KG with the text information in Wikipedia for ESE. Extensive experiments on real-world datasets demonstrate that our model achieves better performance by combining structural and textual information of entities.
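To make the idea of retrieving and ranking candidates along a meta path concrete, here is a minimal sketch that counts path instances from the seeds to each candidate. It uses a simple path-count score, which is a stand-in rather than the similarity measure of the paper; the toy triples, the `~` prefix for inverse relations, and the function names are assumptions.

```python
from collections import Counter, defaultdict


def build_index(triples):
    """Index a KG by (node, relation); '~relation' denotes the inverse direction."""
    index = defaultdict(set)
    for head, relation, tail in triples:
        index[(head, relation)].add(tail)
        index[(tail, "~" + relation)].add(head)
    return index


def follow(index, start, meta_path):
    """Count how many path instances of meta_path lead from start to each end node."""
    counts = Counter({start: 1})
    for relation in meta_path:
        step = Counter()
        for node, n in counts.items():
            for neighbor in index.get((node, relation), ()):
                step[neighbor] += n
        counts = step
    return counts


def expand(index, seeds, meta_path):
    """Rank non-seed entities by the total number of path instances from the seeds."""
    scores = Counter()
    for seed in seeds:
        scores.update(follow(index, seed, meta_path))
    for seed in seeds:
        scores.pop(seed, None)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


triples = [
    ("FilmA", "starring", "Ann"),
    ("FilmA", "starring", "Ben"),
    ("FilmB", "starring", "Ann"),
    ("FilmB", "starring", "Ben"),
    ("FilmB", "starring", "Cal"),
]
index = build_index(triples)
# Meta path: actor -> film -> actor, i.e. co-starring.
ranking = expand(index, ["Ann"], ["~starring", "starring"])
print(ranking)
```

Here `Ben` ranks above `Cal` because two co-starring path instances connect him to the seed, against one for `Cal`.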
Funding: This work was supported in part by the National Science Foundation of USA under Grant Nos. CCF-0913050, OCI-1147522, and CNS-1162165.
Abstract: With recent advances in hardware technologies, new general-purpose high-performance devices have been widely adopted, such as the graphics processing unit (GPU) and the solid state drive (SSD). A GPU may offer an order of magnitude higher throughput than a multicore CPU for applications with massive data parallelism. Moreover, an SSD is capable of offering much higher I/O throughput and lower latency than a traditional hard disk drive (HDD). These new hardware devices can significantly boost the performance of many applications; thus the database community has been actively adopting them in database systems. However, the performance benefit cannot be easily reaped if the new hardware is used improperly. In this paper, we propose Hetero-DB, a high-performance database system that exploits both the characteristics of the database system and the special properties of the new hardware devices in the system's design and implementation. Hetero-DB develops a GPU-aware query execution engine with GPU device memory management and a query scheduling mechanism to support concurrent query execution. Furthermore, with the SSD-HDD hybrid storage system, we redesign the storage engine by organizing HDD and SSD into a two-level caching hierarchy in Hetero-DB. To best utilize the hybrid hardware devices, the semantic information that is critical for storage I/O is identified and passed to the storage manager, which has great potential to improve efficiency and performance. Hetero-DB aims to maximize the performance benefits of GPU and SSD, and demonstrates its effectiveness for designing next-generation database systems.
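The two-level caching hierarchy and the idea of passing semantic hints to the storage manager can be sketched as follows. This is not Hetero-DB's storage engine: the LRU eviction policy, the `cacheable` hint, and the class name `TwoLevelStore` are assumptions chosen to keep the example small.

```python
from collections import OrderedDict


class TwoLevelStore:
    """SSD modeled as a small LRU cache in front of a large HDD (both simulated in memory)."""

    def __init__(self, hdd_blocks, ssd_capacity):
        self.hdd = dict(hdd_blocks)
        self.ssd = OrderedDict()  # insertion order doubles as recency order
        self.ssd_capacity = ssd_capacity
        self.ssd_hits = 0
        self.hdd_reads = 0

    def read(self, block_id, cacheable=True):
        """Read a block; cacheable is a hypothetical semantic hint from the query engine,
        e.g. False for a one-off sequential scan that should not pollute the SSD."""
        if block_id in self.ssd:
            self.ssd.move_to_end(block_id)  # mark as most recently used
            self.ssd_hits += 1
            return self.ssd[block_id]
        data = self.hdd[block_id]
        self.hdd_reads += 1
        if cacheable:
            self.ssd[block_id] = data
            if len(self.ssd) > self.ssd_capacity:
                self.ssd.popitem(last=False)  # evict the least recently used block
        return data


store = TwoLevelStore({1: "a", 2: "b", 3: "c"}, ssd_capacity=2)
store.read(1)
store.read(2)
store.read(1)                    # served from the SSD level
store.read(3, cacheable=False)   # scan-style read bypasses the SSD
store.read(2)                    # still cached because block 3 did not evict it
print(store.ssd_hits, store.hdd_reads)
```

Without the hint, the read of block 3 would have evicted block 2, and the final read would have gone to the HDD.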
Funding: Supported by the National Natural Science Foundation of China Youth Fund under Grant No. 61902001.
Abstract: Heterogeneous information networks, which consist of multi-typed vertices representing objects and multi-typed edges representing relations between objects, are ubiquitous in the real world. In this paper, we study the problem of entity matching for heterogeneous information networks based on distributed network embedding and a multi-layer perceptron with a highway network, and we propose a new method named DEM, short for Deep Entity Matching. In contrast to traditional entity matching methods, DEM uses the multi-layer perceptron with a highway network to explore hidden relations and improve matching performance. Importantly, we combine DEM with the network embedding methodology, enabling highly efficient computation in a vectorized manner. DEM's generic modeling of both the network structure and the entity attributes enables it to model various heterogeneous information networks flexibly. To illustrate its functionality, we apply the DEM algorithm to two real-world entity matching applications: user linkage in the social network analysis scenario, which predicts the same or matched users across different social platforms, and record linkage, which predicts the same or matched records across different citation networks. Extensive experiments on real-world datasets demonstrate DEM's effectiveness and rationality.
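For readers unfamiliar with highway networks, the sketch below shows one highway layer in its standard form, y = T(x) * H(x) + (1 - T(x)) * x, where H is a ReLU transform and T is a sigmoid gate. It is a generic illustration with hand-picked weights, not the DEM architecture; all parameter values are assumptions.

```python
import math


def affine(w, b, x):
    """Compute W x + b for a list-of-lists weight matrix."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def highway_layer(x, w_h, b_h, w_t, b_t):
    """One highway layer: the gate t mixes the transformed input h with the raw input x."""
    h = [max(0.0, v) for v in affine(w_h, b_h, x)]                   # transform H(x)
    t = [1.0 / (1.0 + math.exp(-v)) for v in affine(w_t, b_t, x)]    # gate T(x)
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]


x = [0.5, -1.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]

# A strongly negative gate bias closes the gate, so the layer carries x through unchanged.
carried = highway_layer(x, identity, [0.0, 0.0], zeros, [-50.0, -50.0])
# A strongly positive gate bias opens the gate, so the layer outputs the transform ReLU(x).
transformed = highway_layer(x, identity, [0.0, 0.0], zeros, [50.0, 50.0])
print(carried, transformed)
```

The carry path is what lets deep stacks of such layers train without the signal from the input embedding being lost.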