This paper proposes an incremental text clustering algorithm based on semantic sequences. Using the similarity relation between semantic sequences and computing the cover of the set of similar semantic sequences, the algorithm selects the candidate cluster with the minimum entropy overlap value as a result cluster at each step. Experimental comparison shows that, under the same conditions, the algorithm achieves higher precision than other algorithms, and the advantage is especially clear on sets of long documents.
Internetware is envisioned as a general software paradigm for applications that integrate and share resources on open, dynamic, and uncertain platforms such as the Internet. Continuing the agent-based Internetware model presented in a previous paper, this paper analyzes the behavioral patterns and technical challenges of environment-driven applications and proposes a software-structuring model for environment-driven Internetware applications. A series of explorations of the enabling techniques for the model is presented, especially the modeling, management, and utilization of context information. Several prototype systems have also been built to prove the concepts and evaluate the techniques. These research efforts take a further step toward the Internetware paradigm by providing an initial framework for constructing context-aware and self-adaptive software application systems in the open network environment.
In recent years, multimedia annotation has attracted significant research attention in the multimedia and computer vision communities, especially automatic image annotation, whose purpose is to provide an efficient and effective search environment in which users can query their images more easily. In this paper, a probabilistic latent semantic analysis (PLSA) model based on semi-supervised learning is presented for automatic image annotation. Since it is often hard to obtain or create labeled images in large quantities while unlabeled ones are easier to collect, a transductive support vector machine (TSVM) is exploited to enhance the quality of the training image data. In addition, image features with different magnitudes lead to different annotation performance. To address this, a Gaussian normalization method is used to normalize the features extracted from effective image regions segmented by the normalized cuts algorithm, so as to preserve the intrinsic content of images as completely as possible. Finally, a PLSA model with asymmetric modalities is constructed based on the expectation maximization (EM) algorithm to predict a candidate set of annotations with confidence scores. Extensive experiments on the general-purpose Corel5k dataset demonstrate that the proposed model significantly improves on the performance of traditional PLSA for automatic image annotation.
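The feature-normalization step in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes the common "3-sigma" form of Gaussian normalization, in which each feature is centered on its mean, divided by three standard deviations, and clipped to [-1, 1]. The function name `gaussian_normalize` and the parameter `k` are hypothetical.

```python
from statistics import mean, pstdev


def gaussian_normalize(columns, k=3.0):
    """Scale each feature column to roughly [-1, 1].

    Assumed form (not taken from the paper): x' = (x - mu) / (k * sigma),
    clipped to [-1, 1]. `columns` is a list of feature columns, each a
    list of numbers.
    """
    normalized = []
    for col in columns:
        mu = mean(col)
        sigma = pstdev(col)  # population standard deviation
        if sigma == 0:
            # A constant feature carries no information; map it to zero.
            normalized.append([0.0 for _ in col])
            continue
        normalized.append(
            [max(-1.0, min(1.0, (x - mu) / (k * sigma))) for x in col]
        )
    return normalized


# Two features with very different magnitudes end up on the same scale.
color = [1.0, 2.0, 3.0]
texture = [100.0, 200.0, 300.0]
scaled = gaussian_normalize([color, texture])
print(scaled)
```

After normalization, features whose raw magnitudes differ by orders of magnitude contribute comparably to any distance or likelihood computed downstream.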
This paper presents a new method for refining image annotation by integrating probabilistic latent semantic analysis (PLSA) with a conditional random field (CRF). First, a PLSA model with asymmetric modalities is constructed to predict a candidate set of annotations with confidence scores; then the semantic relationships among the candidate annotations are modeled with a conditional random field. In the CRF, the confidence scores generated by the PLSA model and the Flickr distance between pairwise candidate annotations are treated as local evidence and contextual potentials, respectively. The novelty of the method lies mainly in two aspects: exploiting PLSA to predict a candidate set of annotations with confidence scores, and using the CRF to further explore the semantic context among candidate annotations for precise image annotation. To demonstrate the effectiveness of the proposed method, an experiment is conducted on the standard Corel dataset, and its results compare favorably with several state-of-the-art approaches.
Automatic image annotation has been an active research topic in computer vision and pattern recognition for decades. A two-stage automatic image annotation method based on a Gaussian mixture model (GMM) and a random walk model (abbreviated as GMM-RW) is presented. First, a GMM fitted by the rival penalized expectation maximization (RPEM) algorithm is employed to estimate the posterior probabilities of each annotation keyword. Subsequently, a random walk over the constructed label similarity graph is used to further mine the potential correlations among the candidate annotations and so obtain the refined results, which play a crucial role in semantic-based image retrieval. The contributions of this work are threefold. First, a GMM is exploited to capture the initial semantic annotations; in particular, the RPEM algorithm is used to train the model, which can determine the number of GMM components automatically. Second, a label similarity graph is constructed from a weighted linear combination of label similarity and the visual similarity of the images associated with the corresponding labels, which helps to avoid the problems of polysemy and synonymy during image annotation. Third, the random walk is run over the constructed label graph to further refine the candidate set of annotations generated by the GMM. Experiments on the standard Corel5k dataset demonstrate that GMM-RW is significantly more effective and efficient than several state-of-the-art methods for automatic image annotation.
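The refinement stage of GMM-RW can be sketched as a random walk with restart over a label graph. The abstract does not specify the update rule, so this is an illustration under stated assumptions: the combined label similarities are taken as given, transitions are row-normalized similarities, and the restart weight `alpha`, the iteration count, and the function name `random_walk_refine` are all hypothetical.

```python
def random_walk_refine(initial, similarity, alpha=0.5, iterations=50):
    """Refine candidate annotation scores with a random walk with restart.

    initial:    {label: score}, e.g. first-stage posterior probabilities.
    similarity: {label: {other_label: weight}}, an assumed precomputed
                combination of label and visual similarity.
    alpha:      weight of the propagated score versus the initial score.
    """
    labels = list(initial)
    # Row-normalize similarities into transition probabilities.
    transition = {}
    for a in labels:
        total = sum(similarity[a][b] for b in labels if b != a)
        transition[a] = {
            b: (similarity[a][b] / total if total else 0.0)
            for b in labels
            if b != a
        }
    scores = dict(initial)
    for _ in range(iterations):
        scores = {
            b: alpha * sum(scores[a] * transition[a][b] for a in labels if a != b)
            + (1 - alpha) * initial[b]
            for b in labels
        }
    return scores


initial = {"sky": 0.5, "cloud": 0.3, "car": 0.2}
similarity = {
    "sky": {"cloud": 0.9, "car": 0.1},
    "cloud": {"sky": 0.9, "car": 0.1},
    "car": {"sky": 0.1, "cloud": 0.1},
}
refined = random_walk_refine(initial, similarity)
print(refined)
```

In this toy graph, "cloud" is strongly tied to the high-scoring "sky", so its score rises, while the weakly connected "car" loses score. That is the intended effect of exploiting label correlations.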
A novel image auto-annotation method is presented based on a probabilistic latent semantic analysis (PLSA) model and multiple Markov random fields (MRF). A PLSA model with asymmetric modalities is first constructed to estimate the joint probability between images and semantic concepts. A subgraph is then extracted to serve as the corresponding structure of the Markov random fields, and inference over it is performed by iterated conditional modes to obtain the final annotation for the image. The novelty of the method lies mainly in two aspects: exploiting PLSA to estimate the joint probability between images and semantic concepts, and using multiple MRFs to further explore the semantic context among keywords for accurate image annotation. To demonstrate the effectiveness of this approach, an experiment is conducted on the Corel5k dataset, and its results compare favorably with current state-of-the-art approaches.
Based on an analysis of existing test data generation methods, a new automated test data generation approach is presented. The linear predicate functions on a given path are used directly to construct a linear constraint system over the input variables. A linear arithmetic representation needs to be computed only when a predicate function is nonlinear. If all the predicate functions on the given path are linear, solving the constraint system yields either the desired test data or a guarantee that the path is infeasible. Otherwise, iterative refinement of the input is required to obtain the desired test data. Theoretical analysis and test results show that the approach is simple and effective and requires less computation. The scheme can also be used to generate path-based test data for programs with arrays and loops.
This paper summarizes the principles and ideas behind the design of a graphical specification language called GSPEC. Based on a requirements analysis of specification languages, a new software decomposition model is proposed, and GSPEC adopts an abstract data type definition method that combines algebraic and first-order logic descriptions. With its graphical representation, the correctness of its specifications can be guaranteed or verified to some extent. The language is powerful and easy to understand. It has been implemented on IBM PC/AT computers and SUN-3/160C workstations.
Because software services in the open environment are independent, variable, and tailorable, research on middleware that supports multi-mode interaction among software services is of great importance. In this paper, an agent-based multi-mode interaction middleware model for software services and its supporting system are proposed. The model includes an interaction feature decomposition and configuration model that enables interaction programming, an agent-based middleware model, and a programmable coordination medium based on reflection technology. The decomposition and configuration model assists programmers in interaction programming by analyzing and synthesizing interaction features. The agent-based middleware model provides a runtime framework for multi-mode service interaction. The programmable coordination medium effectively supports software service coordination based on multi-mode interaction. To verify the feasibility and efficiency of the method, the design, implementation, and performance analysis of Artemis-M3C, a multi-mode interaction middleware for software services, are introduced. The results show that the method is feasible and that the Artemis-M3C system is practical and effective for multi-mode interaction.
A functional specification decomposition tree model of the algorithm design process is presented, and the properties of functional specifications and the correctness criteria for algorithm design are discussed. The correctness of some major rules used in NDADAS is verified.
Inheritance is regarded as the hallmark of object-oriented programming languages. A mathematical model of inheritance is presented. In this model, the graph-sorted signature is introduced to represent the algebraic structure of a program, and an extension function on graph-sorted signatures is used to formally describe the semantics of inheritance. The program's algebraic structure reflects the syntactic constraints of the language, and the corresponding extension function exposes the character of the language's inheritance.
FGSPEC is a wide-spectrum specification language intended to facilitate software specification and to express the transformation process from a functional specification, which describes "what to do", to the corresponding design (operational) specification, which describes "how to do it". The design emphasizes the coherence of its multi-level specification mechanisms, and a tree structure model is provided that unifies the wide-spectrum specification styles from "what" to "how".
This paper proposes two definitions of analogy, from the perspectives of epistemology and methodology respectively: ?-analogy, based on a common model, and A_c-analogy, based on analogy correspondence. The relations between them are discussed. An analysis method for finding analogy correspondences is derived from the definition of A_c-analogy.
Users are vulnerable to privacy risks when they provide their location information to location-based services (LBS). Existing work sacrifices the quality of LBS by degrading spatial and temporal accuracy to ensure user privacy. In this paper, we propose a novel approach, Complete Bipartite Anonymity (CBA), which aims to achieve both user privacy and quality of service. The theoretical basis of CBA is as follows: if the bipartite graph of k nearby users' paths can be transformed into a complete bipartite graph, then these users achieve k-anonymity, since the set of "end points connecting to a specific start point in a graph" is an equivalence class. To achieve CBA, we design a Collaborative Path Confusion (CPC) protocol that enables nearby users to discover and authenticate each other without knowing their real identities or accurate locations, to predict the encounter location using information about users' movement patterns, and to generate fake traces that obfuscate the real ones. We evaluate CBA on a real-world dataset and compare its privacy performance with an existing path confusion approach. The results show that, in areas of low user density, CBA enhances location privacy by making a user 4 to 16 times more likely to have his or her path confused with others'. We also demonstrate that CBA is secure under the trace identification attack.
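The graph condition behind CBA can be made concrete with a small sketch. It treats each user's path as a (start point, end point) edge and lists the edges that would have to be added, for example as fake traces, for the bipartite graph to become complete. This only illustrates the completeness check; it does not reproduce the CPC protocol, and the function names `missing_edges` and `is_complete_bipartite` are hypothetical.

```python
def missing_edges(paths):
    """Return the (start, end) pairs absent from a bipartite path graph.

    paths: iterable of (start_point, end_point) pairs, one per real path.
    The graph is complete bipartite exactly when this list is empty.
    """
    real = set(paths)
    starts = {s for s, _ in real}
    ends = {e for _, e in real}
    return sorted((s, e) for s in starts for e in ends if (s, e) not in real)


def is_complete_bipartite(paths):
    """True if every start point is connected to every end point."""
    return not missing_edges(paths)


# Two users: one travels A -> X, the other B -> Y.
real_paths = [("A", "X"), ("B", "Y")]
print(is_complete_bipartite(real_paths))  # False
print(missing_edges(real_paths))          # [('A', 'Y'), ('B', 'X')]

# Once the two missing traces are added, each start point reaches both
# end points, so an observer cannot tell which end belongs to which user.
print(is_complete_bipartite(real_paths + missing_edges(real_paths)))  # True
```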
Sound and complete rules for data reification in the algebraic framework are discussed. Based on these rules, the retrieve function approach in VDM is extended so that biased models and non-determinacy can be treated in some sense.
As a continuation of the last three years' special sections on software systems, this special section encourages and promotes research that addresses challenges from the perspective of software systems. Its goal is to present state-of-the-art, high-quality original research in the area of software systems. Like the last three years' special sections, this special section covers two major themes: data-driven software engineering, and software testing and analysis.
Funding (incremental text clustering paper): Supported by the National Natural Science Foundation of China (60173058).
Funding (Internetware software-structuring model paper): Supported by the National 973 Program (Grant No. 2002CB312002); the National 863 Program (Grant Nos. 2007AA01Z178, 2007AA01Z140 and 2006AA01Z159); the Program for New Century Excellent Talents in University (Grant No. NCET-07-0419); the National Natural Science Foundation of China (Grant Nos. 60403014, 60721002 and 60736015); and the Jiangsu Nature Science Foundation (Grant No. BK2006712).
Funding (semi-supervised PLSA image annotation paper): Supported by the National Program on Key Basic Research Project (No. 2013CB329502); the National Natural Science Foundation of China (No. 61202212); the Special Research Project of the Educational Department of Shaanxi Province of China (No. 15JK1038); and the Key Research Project of Baoji University of Arts and Sciences (No. ZK16047).
Funding (PLSA and CRF annotation refinement paper): Supported by the National Basic Research Priorities Programme (No. 2013CB329502); the National High Technology Research and Development Programme of China (No. 2012AA011003); the Natural Science Basic Research Plan in Shanxi Province of China (No. 2014JQ2-6036); and the Science and Technology R&D Program of Baoji City (No. 203020013, 2013R2-2).
Funding (GMM-RW image annotation paper): Supported by the National Basic Research Program of China (No. 2013CB329502); the National Natural Science Foundation of China (No. 61202212); the Special Research Project of the Educational Department of Shaanxi Province of China (No. 15JK1038); and the Key Research Project of Baoji University of Arts and Sciences (No. ZK16047).
Funding (PLSA and multiple MRF auto-annotation paper): Supported by the National Basic Research Priorities Program (No. 2013CB329502); the National High-tech R&D Program of China (No. 2012AA011003); the National Natural Science Foundation of China (No. 61035003, 61072085, 60933004, 60903141); and the National Science and Technology Support Program of China (No. 2012BA107B02).
Funding (multi-mode interaction middleware paper): Supported by the Major State Basic Research Development Program of China (973 Program) (Grant No. 2002CB312002); the National Natural Science Foundation of China (Grant No. 60403014); and the Nature Science Foundation of Jiangsu Province Project (Grant No. BK2006712).
Funding (definitions of analogy paper): Project supported in part by the National Natural Science Foundation of China.
Funding (Complete Bipartite Anonymity paper): Supported by the National Natural Science Foundation of China under Grant Nos. 61373011, 91318301, and 61321491.
Funding (data reification paper): Project supported by the National Science Foundation for Excellent Young Scientists, the Trans-Century Training Programme Foundation for the Talents, and the National Climbing Program.