Urban sustainability assessment is an effective method for objectively presenting the current state of sustainable urban development and diagnosing sustainability-related issues. As the global community intensifies its efforts to implement the Sustainable Development Goals (SDGs), the demand for assessing progress in urban sustainable development has increased. This has led to the emergence of numerous indicator systems with varying scales and themes published by different entities. Cities participating in these evaluations often have difficulty matching indicators or find that certain indicators are absent. In this context, urban decision makers and planners urgently need to identify substitute indicators that express the semantic meaning of the original indicators and take into account the availability of indicators for participating cities. Hence, this study explores the substitution relationships between indicators and constructs a collection of substitute indicators to serve as a reference for sustainable urban development assessment. Specifically, building on a review of international and Chinese indicators related to urban sustainability assessment, this study employs natural semantic analysis methods based on the Word2Vec model and the cosine similarity algorithm to calculate the similarity between indicators related to sustainable urban development. The results show that the Skip-gram algorithm with a word vector dimensionality of 600 performs best in calculating the similarity between sustainable urban development assessment indicators. The findings provide valuable insights into selecting substitute indicators for future sustainable urban development assessment, particularly in China.
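The cosine similarity step named in the abstract above can be sketched as follows. This is a minimal illustration, not the study's implementation: the three-dimensional vectors are invented toy values rather than trained Word2Vec output, and representing a multi-word indicator by the mean of its word vectors (`mean_vector`) is an assumption made here for the example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def mean_vector(words, embeddings):
    """Average the vectors of the words that have an embedding (assumed pooling)."""
    known = [embeddings[w] for w in words if w in embeddings]
    if not known:
        return None
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# Toy embeddings, invented for illustration only.
toy_embeddings = {
    "green": [0.9, 0.1, 0.0],
    "space": [0.7, 0.3, 0.1],
    "park": [0.8, 0.2, 0.1],
    "area": [0.6, 0.4, 0.1],
    "unemployment": [0.0, 0.2, 0.9],
    "rate": [0.1, 0.3, 0.8],
}

green_space = mean_vector(["green", "space"], toy_embeddings)
park_area = mean_vector(["park", "area"], toy_embeddings)
unemployment_rate = mean_vector(["unemployment", "rate"], toy_embeddings)

# A candidate substitute should score higher than an unrelated indicator.
print(cosine_similarity(green_space, park_area))
print(cosine_similarity(green_space, unemployment_rate))
```

With real embeddings, the candidate substitutes for an indicator would be those whose similarity exceeds a cutoff chosen by the analyst.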
In order to achieve adaptive and efficient service composition, a task-oriented algorithm for discovering services is proposed. The traditional process of service composition is divided into semantic discovery and functional matching, with tasks treated as the objects of operation. Semantic similarity is used to discover services matching a specific task and then to generate a corresponding task-oriented web service composition (TWC) graph. Moreover, an algorithm for new services is designed to update the TWC. The approach is applied to the composition model, in which the TWC is searched to obtain an optimal path and the final service composition is output. The model can also update in real time as the environment changes. Experimental results demonstrate the feasibility and effectiveness of the algorithm and indicate that the maximum searching radius can be set to 2 to achieve an equilibrium point of quality and quantity.
Service discovery based on syntactic matching cannot adapt to the open and dynamic environment of the web. To select the proper service from the candidate set of web services provided by syntactic matching, a service selection method based on semantic similarity is proposed. First, the method defines a web services ontology, including QoS and context, as semantic support; the ontology also provides a set of terms to describe the interfaces of web services. Second, the degree of similarity between two web services is evaluated by computing the semantic distances between the terms used to describe their interfaces. Compared with existing methods, the interfaces of web services can be interpreted under the ontology, because it provides a formal and semantic specification of the conceptualization. In addition, the efficiency and accuracy of service selection are improved.
To address inadequate semantic processing in intelligent question answering systems, this paper presents an integrated semantic similarity model that calculates semantic similarity using geometric distance and information content. Using the interrelationships between concepts, the information content of concepts, and the strength of the edges in the ontology network, we calculate the semantic similarity between two concepts and provide information for the further calculation of the semantic similarity between a user's question and the answers in the knowledge base. Experiments on the prototype show that the semantic problem in natural language processing can also be addressed with the help of the knowledge and abundant semantic information in an ontology. The ontology-based intelligent question answering prototype system reached more than 90% accuracy with an average search time of less than 50 ms, which is a satisfactory result.
Key words: intelligent question answering system; ontology; semantic similarity; geometric distance; information content
CLC number: TP39
Foundation item: Supported by the important science and technology item of China of “The 10th Five-year Plan” (2001BA101A05-04)
Biography: LIU Ya-jun (1953-), female, Associate professor, research direction: software engineering, information processing, database application.
In recent years, many types of semantic similarity measures have been used to measure the similarity between two concepts. It is necessary to characterize the differences between these measures, their performance, and their evaluation. The major contribution of this paper is to identify, among different similarity measures, the one that gives good results with a low error rate. The experiment was done on a taxonomy built to measure the semantic distance between two concepts in the health domain, which are represented as nodes in the taxonomy. The similarity measures were evaluated relative to human experts' ratings. Our experiment was applied to the ICD10 taxonomy to determine the similarity value between two concepts. The similarity between 30 pairs of health-domain concepts was evaluated using different types of semantic similarity equations. The experimental results discussed in this paper show that the measure of Hoa A. Nguyen and Hisham Al-Mubaid achieved a high matching score against the experts' judgment.
The Internet of Things (IoT) is an important and ubiquitous service paradigm, and one of the most important issues in IoT applications is providing terminal users with effective and efficient services based on service communities. This paper presents a semantic-based similarity algorithm for building the IoT service community. First, the algorithm exploits the wealth of semantic information contained in IoT nodes and organizes the nodes into a concept tree. Then, it mines the similarity of the semantic information based on the concept tree. Finally, the service community is optimized through a greedy algorithm, and its size is controlled by adjusting the threshold. Simulation results show the effectiveness and feasibility of the algorithm.
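The effect of the threshold on community size, as described in the abstract above, can be illustrated with a small sketch. Everything here is hypothetical: the abstract does not specify the greedy procedure, so the single-pass assignment below is only one plausible reading, and `jaccard` over label sets stands in for the concept-tree similarity that the paper computes.

```python
def jaccard(a, b):
    """Overlap of two label sets; a stand-in for a concept-tree similarity."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def greedy_communities(services, similarity, threshold):
    """Assign each service to the most similar existing community, or open a new one.

    A community is represented by its first member (its seed).
    """
    communities = []
    for name, features in services.items():
        best, best_score = None, 0.0
        for community in communities:
            score = similarity(features, services[community[0]])
            # Keep the best-scoring community that clears the threshold;
            # ties go to the community created first.
            if score >= threshold and (best is None or score > best_score):
                best, best_score = community, score
        if best is None:
            communities.append([name])
        else:
            best.append(name)
    return communities


# Hypothetical IoT services described by invented semantic labels.
services = {
    "thermometer": {"sensor", "temperature"},
    "thermostat": {"sensor", "temperature", "actuator"},
    "door_lock": {"actuator", "security"},
    "camera": {"sensor", "security", "video"},
}

strict = greedy_communities(services, jaccard, threshold=0.5)
loose = greedy_communities(services, jaccard, threshold=0.2)
print(len(strict), strict)
print(len(loose), loose)
```

Lowering the threshold merges more services into fewer, larger communities, which is the size control the abstract refers to.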
This paper proposes an improved hybrid semantic matching algorithm that combines Input/Output (I/O) semantic matching with text lexical similarity. It addresses a weakness of existing semantic matching algorithms in semantic web service discovery: because they perform only I/O-based service signature matching, they cannot distinguish services that share the same I/O. The improved algorithm consists of two steps. The first is logic-based I/O concept ontology matching, which yields the candidate service set. The second is service name matching by lexical similarity against the candidate service set, which yields the final, precise matching result. We tested the hybrid algorithm on the Ontology Web Language for Services (OWL-S) test collection and compared it with OWL-S Matchmaker-X (OWLS-MX). The experimental results show that the proposed algorithm can pick out, from among very similar services, the advertised service that best matches the user's request, and that it provides better matching precision and efficiency than OWLS-MX.
Most questions from users lack the context needed to thoroughly understand the problem at hand, which makes the questions impossible to answer. Semantic Similarity Estimation relates a user's questions to the context from previous Conversational Search Systems (CSS) in order to provide answers without requesting the user's context. It imposes constraints on the time needed to produce an answer for the user. The proposed model enables the use of contextual data associated with previous Conversational Searches (CS). On receiving a question in a new conversational search, the model determines the question that refers to more past CS. The model then infers past contextual data related to the given question and predicts an answer based on the inferred context, without engaging in multi-turn interactions or requesting additional data from the user for context. The model shows the ability to use the limited information in user queries for best context inference, based on closed-domain CS and Bidirectional Encoder Representations from Transformers for textual representations.
In this paper, a finite state machine approach is followed to find the semantic similarity of two sentences. The approach exploits the concept of bi-directional logic along with a semantic ordering approach. The core of the approach is the bi-directional logic of artificial intelligence, which is implemented using a slightly modified finite state machine algorithm. Keywords play a crucial role in finding the semantic similarity: with the keyword approach, similarity can be found easily at the sentence level. The algorithm is proposed especially for Nepali texts. The finite state machine is built from the polarity of the individual keywords, and its final state determines the polarity of the sentence. If two sentences are both negatively polarized, they are said to be coherent; likewise, if two sentences are both positive, they are said to be coherent. Otherwise they are not. For measuring coherence (similarity), the contextual concept is taken into consideration. The semantic approach in this research is an entirely context-based method: two sentences are said to be semantically similar if they bear the same context. The total accuracy obtained with this algorithm is 90.16%.
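The idea of driving a finite state machine with keyword polarities and reading the sentence polarity from its final state can be sketched as below. This is an illustration under stated assumptions, not the authors' algorithm: the two-state design, the rule that a negative keyword flips the current state, and the small English lexicon `KEYWORD_POLARITY` (used in place of Nepali keywords) are all hypothetical.

```python
POSITIVE, NEGATIVE = "positive", "negative"

# Hypothetical keyword lexicon; the paper targets Nepali text.
KEYWORD_POLARITY = {
    "good": POSITIVE,
    "happy": POSITIVE,
    "not": NEGATIVE,
    "never": NEGATIVE,
    "bad": NEGATIVE,
}

# Assumed transition table: (current state, keyword polarity) -> next state.
# A negative keyword flips the state, a positive keyword keeps it.
TRANSITIONS = {
    (POSITIVE, POSITIVE): POSITIVE,
    (POSITIVE, NEGATIVE): NEGATIVE,
    (NEGATIVE, POSITIVE): NEGATIVE,
    (NEGATIVE, NEGATIVE): POSITIVE,
}


def sentence_polarity(sentence):
    """Run the machine over the keywords of a sentence and return its final state."""
    state = POSITIVE  # assumed start state
    for token in sentence.lower().split():
        keyword = KEYWORD_POLARITY.get(token.strip(".,!?"))
        if keyword is not None:
            state = TRANSITIONS[(state, keyword)]
    return state


def coherent(first, second):
    """Two sentences count as coherent here when their final states agree."""
    return sentence_polarity(first) == sentence_polarity(second)


print(sentence_polarity("The food is not bad."))
print(coherent("The food is not bad.", "The food is good."))
print(coherent("The food is bad.", "The food is good."))
```

Under these assumed transitions a double negation ("not bad") ends in the positive state, so it is judged coherent with a plainly positive sentence.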
As one of the essential topics in proteomics and molecular biology, protein subcellular localization has been extensively studied in previous decades. However, most methods are limited to the prediction of single-location proteins. In many studies, multi-location proteins are either not considered or assumed not to exist. This paper proposes a novel multi-label subcellular-localization predictor based on the semantic similarity between Gene Ontology (GO) terms. Given a protein, the accession numbers of its homologs are obtained via BLAST search. The homologous accession numbers are then used as keys to search the gene ontology annotation database and obtain a set of GO terms. The semantic similarity between GO terms is used to formulate semantic similarity vectors for classification. A support vector machine (SVM) classifier with a new decision scheme is proposed to classify the multi-label GO semantic similarity vectors. Experimental results show that the proposed multi-label predictor significantly outperforms state-of-the-art predictors such as iLoc-Plant and Plant-mPLoc.
Measuring the similarity between biomedical terms/concepts is a very important task for biomedical information extraction and knowledge discovery. Measures and tests are the tools used to assess the goodness of an ontology or its resources. Techniques for measuring semantic similarity can be classified into three classes: first, those that use an ontology/taxonomy; second, those that use training corpora and information content; and third, combinations of the two. Some semantic similarity measures are based on the path length between the concept nodes as well as the depth of the LCS node in the ontology tree or hierarchy, and these measures assign high similarity when the two concepts are in the lower levels of the hierarchy. Most semantic similarity measures can be adapted for use in the health (biomedical) domain, and many experiments have been conducted to check the applicability of these measures. In this paper, we investigate measuring the semantic similarity between two concepts within a single ontology or across multiple ontologies in the UMLS Metathesaurus (MeSH, SNOMED-CT, ICD), and we compare our results with human experts' scores using the correlation coefficient.
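The abstract above refers to measures built from path length and the depth of the LCS node. One well-known member of that family is the Wu-Palmer measure, sim = 2 * depth(LCS) / (depth(c1) + depth(c2)); the sketch below uses it as an example, which is a choice made here and not necessarily one of the measures the paper evaluates. The small `PARENT` taxonomy is invented for illustration and is not taken from MeSH, SNOMED-CT, or ICD.

```python
# Hypothetical child -> parent taxonomy (the root has parent None).
PARENT = {
    "disease": None,
    "infectious disease": "disease",
    "cardiovascular disease": "disease",
    "viral infection": "infectious disease",
    "bacterial infection": "infectious disease",
    "influenza": "viral infection",
}


def ancestors(concept):
    """Path from the concept up to the root, starting with the concept itself."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def depth(concept):
    """Number of nodes on the path to the root, so the root has depth 1."""
    return len(ancestors(concept))


def lowest_common_subsumer(c1, c2):
    """Deepest node that is an ancestor of (or equal to) both concepts."""
    seen = set(ancestors(c1))
    for node in ancestors(c2):
        if node in seen:
            return node
    raise ValueError("concepts share no ancestor")


def path_length(c1, c2):
    """Number of edges between the two concepts through their LCS."""
    lcs_depth = depth(lowest_common_subsumer(c1, c2))
    return (depth(c1) - lcs_depth) + (depth(c2) - lcs_depth)


def wu_palmer(c1, c2):
    """Wu-Palmer similarity: 2 * depth(LCS) / (depth(c1) + depth(c2))."""
    lcs = lowest_common_subsumer(c1, c2)
    return 2 * depth(lcs) / (depth(c1) + depth(c2))


deep_pair = ("viral infection", "bacterial infection")
shallow_pair = ("infectious disease", "cardiovascular disease")
print(path_length(*deep_pair), wu_palmer(*deep_pair))
print(path_length(*shallow_pair), wu_palmer(*shallow_pair))
```

Both pairs are two edges apart, yet the pair lower in the hierarchy scores higher, which matches the abstract's remark that such measures favor concepts at the lower levels.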
In order to improve the effectiveness of semantic web service discovery, the semantic bias between an interface parameter and an annotation is reduced by extracting semantic restrictions for the annotation from the description context and generating refined semantic annotations, which in turn refines the semantics of the web service. These restrictions are dynamically extracted from the parse tree of the description text, guided by the restriction template extracted from the ontology definition. New semantic annotations are then generated by combining the original concept with the restrictions and are represented as refined concept expressions. In addition, a novel semantic similarity measure for refined concept expressions is proposed for semantic web service discovery. Experimental results show that a matchmaker based on this method improves the average precision of discovery and exhibits low computational complexity. Reducing the semantic bias by utilizing the restriction information of annotations can refine the semantics of the web service and improve discovery effectiveness.
In order to mine production and security information from security supervising data and to ensure security and safety in production and decision-making, a clustering analysis algorithm for security supervising data in coal mines based on a semantic description is studied. First, a hybrid semantic and numerical description method for security supervising data in coal mines is described. Second, similarity measurement methods for semantic data and for numerical data are given separately, and a weight-based hybrid similarity measurement method for the semantically described security supervising data is presented. Third, taking the hybrid similarity measurement method as the distance criterion and drawing on a grid methodology, an improved grid-based CURE clustering algorithm is presented. Finally, simulation results on a security supervising data set from coal mines validate the efficiency of the algorithm.
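A weight-based combination of a semantic similarity and a numerical similarity, as outlined in the abstract above, might look like the following sketch. The specific component measures are assumptions: the ordered status levels in `LEVELS`, the range-normalized numeric similarity, the record fields `status` and `gas`, and the convex combination with a single `weight` are all hypothetical choices made for illustration.

```python
# Hypothetical ordered semantic labels for a monitoring record.
LEVELS = ["normal", "warning", "alarm"]


def semantic_similarity(a, b):
    """1.0 for identical labels, decreasing linearly with their rank distance."""
    gap = abs(LEVELS.index(a) - LEVELS.index(b))
    return 1.0 - gap / (len(LEVELS) - 1)


def numeric_similarity(x, y, lo, hi):
    """1.0 for equal readings, 0.0 when they span the whole range [lo, hi]."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return 1.0 - abs(x - y) / (hi - lo)


def hybrid_similarity(r1, r2, weight, lo, hi):
    """Convex combination: weight on the semantic part, the rest on the numeric part."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    sem = semantic_similarity(r1["status"], r2["status"])
    num = numeric_similarity(r1["gas"], r2["gas"], lo, hi)
    return weight * sem + (1.0 - weight) * num


# Invented records: a semantic status label and a gas concentration reading.
first = {"status": "normal", "gas": 0.2}
second = {"status": "warning", "gas": 0.4}

print(hybrid_similarity(first, second, weight=0.5, lo=0.0, hi=1.0))
print(1.0 - hybrid_similarity(first, second, weight=0.5, lo=0.0, hi=1.0))
```

A clustering algorithm that expects a distance could use one minus this similarity, as the second print shows.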
In order to improve the efficiency and quality of service composition, a service composition algorithm based on semantic constraints is proposed. First, a user's requirements and the services from a service repository are compared with the help of a matching algorithm. The algorithm has two levels and filters out the services that do not match the user's personalized constraint requirements. This mechanism reduces the search scope at the beginning of the service composition algorithm. Second, the satisfaction of the selected services with respect to the user's personalized requirements is computed, and the services with the greatest satisfaction values are used to make up the service composition. The algorithm is evaluated analytically and experimentally in terms of the efficiency of service composition and the satisfaction of the user's personalized requirements.
In Chinese question answering systems, because questions carry more semantic relations than query words do, precision can be improved by expanding the query when natural language questions are used to retrieve documents. This paper proposes a new approach to query expansion based on semantics and statistics. First, an automatic relevance feedback method is used to generate a candidate expansion word set. Then, the expanded query words are selected from the set based on the semantic similarity and semantic relevancy between the candidate words and the original words. Experiments show that the new approach is effective for Web retrieval and outperforms conventional expansion approaches.
A reputation mechanism is introduced into the P2P-based Semantic Web to solve the problem of lacking trust. It enables the Semantic Web to utilize reputation information based on the semantic similarity of peers in the network. The approach is evaluated in a simulation of a content sharing system, and the experiments show that the system with the reputation mechanism outperforms the system without it.
Network security policy and the automated refinement of its hierarchies aim to simplify the administration of security services in complex network environments. The semantic gap between the policy hierarchies reflects the validity of the policy hierarchies yielded by the automated policy refinement process. However, little attention has been paid to evaluating the compliance between the derived lower-level policy and the higher-level policy. We present an ontology based on Ontology Web Language (OWL) to describe the semantics of security policy and their implementation. We also propose a method of estimating the semantic similarity between a given ...
The meaning of a word includes a conceptual meaning and a distributional meaning. Word embedding based on distribution suffers from insufficient conceptual semantic representation caused by data sparsity, especially for low-frequency words. In knowledge bases, manually annotated semantic knowledge is stable and the essential attributes of words are accurately denoted. In this paper, we propose a Conceptual Semantics Enhanced Word Representation (CEWR) model, which computes the synset embedding and hypernym embedding of Chinese words based on the Tongyici Cilin thesaurus and aggregates them with the distributed word representation, so that both distributed information and conceptual meaning are encoded in the representation of words. We evaluate the CEWR model on two tasks: word similarity computation and short text classification. The Spearman correlation between model results and human judgement is improved to 64.71%, 81.84%, and 85.16% on Wordsim297, MC30, and RG65, respectively. Moreover, CEWR improves the F1 score by 3% in the short text classification task. The experimental results show that CEWR represents words more informatively than distributed word embedding. This indicates that conceptual semantics, especially hypernymous information, is a good complement to distributed word representation.
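The Spearman correlation used in the evaluation above is the Pearson correlation of the two rank sequences. The sketch below computes it with the standard library only; the human ratings and model scores are invented numbers, not data from Wordsim297, MC30, or RG65.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0.0 or spread_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return pearson(average_ranks(x), average_ranks(y))


# Invented scores for four word pairs.
human_ratings = [4.0, 3.5, 1.0, 0.5]
model_scores = [0.9, 0.6, 0.7, 0.1]
print(spearman(human_ratings, model_scores))
```

Because only ranks matter, the human ratings and the model's similarity scores can be on different scales.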
During new product development, reusing existing CAD models avoids designing from scratch and reduces labor cost. With the advent of big data, how to rapidly and efficiently find suitable 3D CAD models for design reuse is receiving more attention. Currently, the sketch-based retrieval approach makes search more convenient, but its accuracy is not high enough; the semantic-based retrieval approach, on the other hand, fully utilizes high-level semantic information and brings search much closer to engineers' intent. However, effectively extracting and representing semantic information from data sets is difficult. To address these problems, we propose a sketch-based semantic retrieval approach for reusing 3D CAD models. First, a fine-grained semantic descriptor is designed to represent 3D CAD models. Second, several heuristic rules are adopted to recognize 3D features from a 2D sketch, and the correspondences between 3D features and 2D loops are built. Finally, semantic and shape similarity measurements are combined to match the input sketch to 3D CAD models, which improves retrieval accuracy. A sketch-based prototype system is developed. Experimental results validate the feasibility and effectiveness of the proposed approach.
Funding (urban sustainability indicator study): Supported by the National Key Research and Development Program of China under the theme “Key technologies for urban sustainable development evaluation and decision-making support” [Grant No. 2022YFC3802900] and the Guangxi Key Research and Development Program [Grant No. Guike AB21220057].
Funding (task-oriented service composition study): The National Key Technology R&D Program of China during the 11th Five-Year Plan Period (No. 2007BAF23B0302); the Major Research Plan of the National Natural Science Foundation of China (No. 90818028).
Funding (semantic-similarity service selection study): The National Natural Science Foundation of China (No. 70471090, 70472005); the Natural Science Foundation of Jiangsu Province (No. BK2004052, BK2005046).
Funding (IoT service community study): Supported by the China Postdoctoral Science Foundation (No. 20100480701) and the Ministry of Education of Humanities and Social Sciences Youth Fund Project (No. 11YJC880119).
Funding (hybrid semantic matching study): Supported by the National Natural Science Foundation of China (No. 60872018), the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20070293001), and the 973 Project (No. 2007CB310607).
Abstract: Most user questions lack the context needed to thoroughly understand the problem at hand, which makes them impossible to answer. Semantic Similarity Estimation relates a user's question to the context from previous Conversational Search Systems (CSS) in order to provide answers without requesting context from the user. It imposes constraints on the time needed to produce an answer for the user. The proposed model enables the use of contextual data associated with previous Conversational Searches (CS). When it receives a question in a new conversational search, the model determines the question that refers to more past CS. The model then infers past contextual data related to the given question and predicts an answer based on the inferred context, without engaging in multi-turn interactions or requesting additional data from the user. The model shows the ability to use the limited information in user queries to make the best context inferences, based on closed-domain CS and Bidirectional Encoder Representations from Transformers (BERT) for textual representations.
Abstract: This paper follows a finite state machine approach to find the semantic similarity of two sentences. The approach exploits the concept of bi-directional logic together with a semantic ordering approach. Its core is the bi-directional logic of artificial intelligence, which is implemented using a slightly modified Finite State Machine algorithm. Keywords play a crucial role in finding the semantic similarity: with the keyword approach, similarity can be determined easily at the sentence level. The algorithm is proposed especially for Nepali texts. The finite state machine is built from the polarity of the individual keywords, and its final state determines the polarity of the sentence. If two sentences are both negatively polarized, they are said to be coherent; likewise, if two sentences are both of a positive nature, they are said to be coherent; otherwise they are not. Contextual information is taken into consideration when measuring coherence (similarity). The semantic approach in this research is entirely context-based: two sentences are said to be semantically similar if they bear the same context. The overall accuracy obtained with this algorithm is 90.16%.
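One way to picture the polarity machine described above is a two-state automaton whose final state gives the sentence polarity. The sketch below is an assumption-laden illustration, not the paper's method: it uses a hypothetical English keyword set `NEGATING` instead of Nepali keywords, and it assumes that each negating keyword simply flips the state.

```python
from typing import Iterable

POSITIVE, NEGATIVE = "POSITIVE", "NEGATIVE"

# Hypothetical lexicon of polarity-flipping keywords (the paper targets Nepali).
NEGATING = {"not", "no", "never", "bad"}

def sentence_polarity(tokens: Iterable[str]) -> str:
    """Run the two-state machine: start in POSITIVE and flip the state on
    every negating keyword. The final state is the sentence polarity."""
    state = POSITIVE
    for token in tokens:
        if token.lower() in NEGATING:
            state = NEGATIVE if state == POSITIVE else POSITIVE
    return state

def coherent(sentence_a: str, sentence_b: str) -> bool:
    """Two sentences are treated as coherent when both end in the same state."""
    return sentence_polarity(sentence_a.split()) == sentence_polarity(sentence_b.split())

same = coherent("the road is good", "the view is nice")
different = coherent("the road is not good", "the view is nice")
```

Flipping on each keyword means that a double negation such as "not bad" ends in the positive state, which is one reason a state machine is a natural fit for keyword polarity.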
Abstract: As one of the essential topics in proteomics and molecular biology, protein subcellular localization has been studied extensively in previous decades. However, most methods are limited to predicting single-location proteins. In many studies, multi-location proteins are either not considered or assumed not to exist. This paper proposes a novel multi-label subcellular-localization predictor based on the semantic similarity between Gene Ontology (GO) terms. Given a protein, the accession numbers of its homologs are obtained via BLAST search. These homologous accession numbers are then used as keys to search the Gene Ontology annotation database and obtain a set of GO terms. The semantic similarity between GO terms is used to form semantic similarity vectors for classification. A support vector machine (SVM) classifier with a new decision scheme is proposed to classify the multi-label GO semantic similarity vectors. Experimental results show that the proposed multi-label predictor significantly outperforms state-of-the-art predictors such as iLoc-Plant and Plant-mPLoc.
Abstract: Measuring the similarity between biomedical terms or concepts is a very important task for biomedical information extraction and knowledge discovery. Measures and tests are the tools used to assess the quality of an ontology or its resources. Semantic similarity measuring techniques can be classified into three classes: first, those that use an ontology or taxonomy; second, those that use training corpora and information content; and third, combinations of the two. Some semantic similarity measures are based on the path length between the concept nodes as well as the depth of the least common subsumer (LCS) node in the ontology tree or hierarchy, and these measures assign high similarity when the two concepts are in the lower levels of the hierarchy. Most semantic similarity measures can be adapted for use in the health (biomedical) domain, and many experiments have been conducted to check the applicability of these measures. In this paper, we investigate measuring the semantic similarity between two concepts within a single ontology or across multiple ontologies in the UMLS Metathesaurus (MeSH, SNOMED-CT, ICD), and compare our results with human expert scores using the correlation coefficient.
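As an illustration of a depth-based measure of the kind described above, the sketch below uses the well-known Wu and Palmer form, sim(a, b) = 2 * depth(LCS) / (depth(a) + depth(b)). The abstract does not say which measures were evaluated, so this choice is an assumption, and the small taxonomy `PARENT` is a hypothetical toy, not an extract of MeSH, SNOMED-CT, or ICD.

```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy, stored as child -> parent (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "disease": None,
    "infection": "disease",
    "cardiovascular_disease": "disease",
    "bacterial_infection": "infection",
    "viral_infection": "infection",
}

def ancestors(concept: str) -> List[str]:
    """Return the concept itself followed by its ancestors up to the root."""
    chain = [concept]
    while PARENT[chain[-1]] is not None:
        chain.append(PARENT[chain[-1]])
    return chain

def depth(concept: str) -> int:
    """Depth counted in nodes, so the root has depth 1."""
    return len(ancestors(concept))

def least_common_subsumer(a: str, b: str) -> str:
    """Deepest concept that is an ancestor of (or equal to) both a and b."""
    above_b = set(ancestors(b))
    for concept in ancestors(a):  # walks upward, so the first hit is deepest
        if concept in above_b:
            return concept
    raise ValueError("concepts do not share a root")

def wu_palmer(a: str, b: str) -> float:
    """Wu-Palmer style similarity: 2 * depth(LCS) / (depth(a) + depth(b))."""
    lcs = least_common_subsumer(a, b)
    return 2.0 * depth(lcs) / (depth(a) + depth(b))

deep_pair = wu_palmer("bacterial_infection", "viral_infection")
shallow_pair = wu_palmer("infection", "cardiovascular_disease")
```

Both pairs are siblings with the same path length between them, but the deeper pair receives the higher score, which matches the observation that such measures favor concepts in the lower levels of the hierarchy.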
Funding: The National Basic Research Program of China (973 Program) (No. 2005CB321802), the Program for New Century Excellent Talents in University (No. NCET-06-0926), and the National Natural Science Foundation of China (No. 60403050, 90612009).
Abstract: To improve the effectiveness of semantic web service discovery, the semantic bias between an interface parameter and its annotation is reduced by extracting semantic restrictions for the annotation from the description context and generating refined semantic annotations, which in turn refine the semantics of the web service. The restrictions are extracted dynamically from the parse tree of the description text, guided by a restriction template extracted from the ontology definition. New semantic annotations are then generated by combining the original concept with the restrictions and are represented as refined concept expressions. In addition, a novel semantic similarity measure for refined concept expressions is proposed for semantic web service discovery. Experimental results show that a matchmaker based on this method improves the average precision of discovery and exhibits low computational complexity. Reducing the semantic bias by using the restriction information of annotations refines the semantics of the web service and improves discovery effectiveness.
Funding: The National Natural Science Foundation of China (No. 50674086), the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20060290508), and the Postdoctoral Scientific Program of Jiangsu Province (No. 0701045B).
Abstract: To mine production and security information from security supervising data, and to ensure security and safety in production and decision-making, a clustering analysis algorithm for coal mine security supervising data based on a semantic description is studied. First, a hybrid semantic and numerical description method for security supervising data in coal mines is described. Second, similarity measurement methods for semantic data and for numerical data are given separately, and a weight-based hybrid similarity measurement method for semantically described security supervising data is presented. Third, taking the hybrid similarity measurement method as the distance criterion and drawing on grid methodology, an improved grid-based CURE clustering algorithm is presented. Finally, simulation results on a coal mine security supervising data set validate the efficiency of the algorithm.
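The weight-based hybrid measure can be illustrated with a small sketch. Everything concrete here is an assumption, since the abstract gives no formulas: the hypothetical `Record` fields, the ordinal alarm levels in `LEVELS`, the range-normalized numerical similarity, and the single weight `w` that blends the two parts. The grid-based CURE clustering step is not shown.

```python
from dataclasses import dataclass

# Hypothetical ordered semantic labels for a monitoring record.
LEVELS = ["normal", "warning", "alarm"]

@dataclass
class Record:
    level: str      # semantic attribute
    reading: float  # numerical attribute, e.g. a gas concentration

def semantic_similarity(a: str, b: str) -> float:
    """Assumed measure: 1 minus the normalized distance between label ranks."""
    gap = abs(LEVELS.index(a) - LEVELS.index(b))
    return 1.0 - gap / (len(LEVELS) - 1)

def numerical_similarity(x: float, y: float, value_range: float) -> float:
    """Assumed measure: 1 minus the range-normalized absolute difference."""
    return 1.0 - min(abs(x - y) / value_range, 1.0)

def hybrid_similarity(a: Record, b: Record, w: float, value_range: float) -> float:
    """Weighted blend: w for the semantic part, (1 - w) for the numerical part."""
    sem = semantic_similarity(a.level, b.level)
    num = numerical_similarity(a.reading, b.reading, value_range)
    return w * sem + (1.0 - w) * num

def hybrid_distance(a: Record, b: Record, w: float, value_range: float) -> float:
    """Distance usable as a clustering criterion: 1 minus the similarity."""
    return 1.0 - hybrid_similarity(a, b, w, value_range)

r1 = Record("normal", 0.2)
r2 = Record("warning", 0.6)
score = hybrid_similarity(r1, r2, w=0.5, value_range=1.0)
```

In this toy case the semantic part contributes 0.5 and the numerical part 0.6, so an equal weighting gives a similarity of about 0.55; shifting `w` toward 1 would make the alarm level dominate the clustering distance.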
Funding: The National Natural Science Foundation of China (No. 60673130) and the Natural Science Foundation of Shandong Province (No. Y2006G29, Y2007G24, Y2007G38).
Abstract: To improve the efficiency and quality of service composition, a service composition algorithm based on semantic constraints is proposed. First, the user's requirements are compared with the services in a service repository by means of a matching algorithm. The matching algorithm has two levels and filters out services that do not match the user's personalized constraint requirements, which reduces the search scope at the beginning of the service composition algorithm. Second, the degree to which each selected service satisfies the user's personalized requirements is computed, and the services with the greatest satisfaction values are used to make up the service composition. The algorithm is evaluated analytically and experimentally in terms of the efficiency of service composition and the satisfaction of the user's personalized requirements.
Funding: The Specialized Research Fund for the Doctoral Program of Higher Education of China (20050007023) and the Natural Science Foundation of Shandong Province (Y2004G04).
Abstract: In a Chinese question answering system, questions carry more semantic relations than query words do, so precision can be improved by expanding the query when natural language questions are used to retrieve documents. This paper proposes a new approach to query expansion based on semantics and statistics. First, an automatic relevance feedback method is used to generate a candidate expansion word set. Then, the expanded query words are selected from the set based on the semantic similarity and semantic relevancy between the candidate words and the original words. Experiments show that the new approach is effective for Web retrieval and outperforms conventional expansion approaches.
Funding: Supported by the National Natural Science Foundation of China (60173026), the Ministry of Education Key Project (105071), and the Foundation of E-Institute of Shanghai High Institutions (200301).
Abstract: A reputation mechanism is introduced into the P2P-based Semantic Web to address the lack of trust. It enables the Semantic Web to use reputation information based on the semantic similarity of peers in the network. The approach is evaluated in a simulation of a content sharing system, and the experiments show that the system with the reputation mechanism outperforms the system without it.
Funding: The National Natural Science Foundation of China.
Abstract: Network security policy and the automated refinement of its hierarchies aim to simplify the administration of security services in complex network environments. The semantic gap between the policy hierarchies reflects the validity of the policy hierarchies yielded by the automated policy refinement process. However, little attention has been paid to evaluating the compliance of the derived lower-level policy with the higher-level policy. We present an ontology based on Ontology Web Language (OWL) to describe the semantics of security policies and their implementation. We also propose a method of estimating the semantic similarity between a given
Funding: This research is supported by the National Science Foundation of China (grant number: 61772278, author: Qu, W.; grant number: 61472191, author: Zhou, J.; http://www.nsfc.gov.cn/), the National Social Science Foundation of China (grant number: 18BYY127, author: Li, B.; http://www.cssn.cn), the Philosophy and Social Science Foundation of Jiangsu Higher Institution (grant number: 2019SJA0220, author: Wei, T.; https://jyt.jiangsu.gov.cn), and Jiangsu Higher Institutions' Excellent Innovative Team for Philosophy and Social Science (grant number: 2017STD006, author: Gu, W.; https://jyt.jiangsu.gov.cn).
Abstract: The meaning of a word includes a conceptual meaning and a distributive meaning. Word embedding based on distribution suffers from insufficient conceptual semantic representation caused by data sparsity, especially for low-frequency words. In knowledge bases, manually annotated semantic knowledge is stable, and the essential attributes of words are accurately denoted. In this paper, we propose a Conceptual Semantics Enhanced Word Representation (CEWR) model, which computes the synset embedding and hypernym embedding of Chinese words based on the Tongyici Cilin thesaurus and aggregates them with the distributed word representation, so that both distributed information and conceptual meaning are encoded in the representation of words. We evaluate the CEWR model on two tasks: word similarity computation and short text classification. The Spearman correlation between model results and human judgement is improved to 64.71%, 81.84%, and 85.16% on Wordsim297, MC30, and RG65, respectively. Moreover, CEWR improves the F1 score by 3% in the short text classification task. The experimental results show that CEWR can represent words more informatively than distributed word embedding. This indicates that conceptual semantics, especially hypernymous information, is a good complement to distributed word representation.
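The word similarity evaluation above reports Spearman correlation between model scores and human judgement. A minimal pure-Python sketch of that metric follows; the score lists are made-up illustrative values, not data from Wordsim297, MC30, or RG65, and the helper names are hypothetical.

```python
from math import sqrt
from typing import List, Sequence

def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

# Made-up scores for four word pairs: model cosine scores vs. human ratings.
model_scores = [0.10, 0.35, 0.30, 0.90]
human_scores = [0.5, 1.8, 2.9, 3.9]
rho = spearman(model_scores, human_scores)
```

Because the metric depends only on rank order, the model's scores do not need to be on the same scale as the human ratings; in this toy case one swapped pair lowers the correlation from 1.0 to about 0.8.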
Funding: Supported by the National Natural Science Foundation of China (61502129, 61572432, 61163016), the Zhejiang Natural Science Foundation of China (LQ16F020004, LQ15F020011), and the University Scientific Research Projects of Ningxia Province of China (NGY2015161).
Abstract: During new product development, reusing existing CAD models avoids designing from scratch and decreases human cost. With the advent of big data, how to find suitable 3D CAD models for design reuse rapidly and efficiently has attracted more attention. Currently, the sketch-based retrieval approach makes search more convenient, but its accuracy is not high enough. The semantic-based retrieval approach, on the other hand, fully uses high-level semantic information and brings search much closer to the engineer's intent; however, effectively extracting and representing semantic information from data sets is difficult. To address these problems, we propose a sketch-based semantic retrieval approach for reusing 3D CAD models. First, a fine-grained semantic descriptor is designed for representing 3D CAD models. Second, several heuristic rules are adopted to recognize 3D features from a 2D sketch, and the correspondences between 3D features and 2D loops are built. Finally, semantic and shape similarity measurements are combined to match the input sketch to 3D CAD models, which improves the retrieval accuracy. A sketch-based prototype system is developed. Experimental results validate the feasibility and effectiveness of the proposed approach.