Funding: This work is supported by the National Natural Key product Foundations of China (10231060) and by the Youth Key Foundation of UESTC (JX04042).
Abstract: An improved randomized algorithm for the equivalent 2-catalog segmentation problem is presented. By analyzing this algorithm with a performance guarantee, the result makes some progress toward answering the open problem. A 0.6378-approximation for the equivalent 2-catalog segmentation problem is obtained.
Abstract: Using GIS, GPS and GPRS, an intelligent monitoring and dispatch system for trucks and shovels in an open pit has been designed and developed. The system can monitor and dispatch open-pit trucks and shovels and play back their historical paths. An intelligent data algorithm is proposed for practical application; it counts the deliveries made by trucks and the loadings made by shovels. Experiments on real scenes show that the system performs stably and can satisfy production standards in open pits.
Funding: the Ph.D. Program Research Foundation from MOE of China (20060147004); the Research Foundation from Liaoning Technical University (04A02001)
Abstract: The checking survey is one of the most frequent and important tasks in an open mine. It forms a connecting link between open mine planning and production. The traditional checking method has disadvantages such as long time consumption, heavy workload, a complicated calculating process, and low automation. GPS and GIS technologies were used to systematically study the core issues of the checking survey in open mines. A detailed GPS data acquisition coding scheme was presented. Based on the scheme, an algorithm for computer semiautomatic cartography was developed. Three methods for eliminating gross errors from the raw data needed to create a DEM were discussed. Two algorithms were researched and implemented, which can be used to create a fine open mine DEM with constrained conditions and to dynamically update the model. The precision of the created model was analyzed and evaluated.
Funding: the National Natural Science Foundation of China (70572070); the Liaoning Province Talents Fund Projects (2005219005); the Technology Key Project of Liaoning Province (2006220019)
Abstract: A data mining technique was proposed to predict gas disasters, in view of the characteristics of coal mine gas disasters and feature knowledge about them. Rough set theory was used to establish a data mining model for gas disaster prediction, and the relations among rough set attributes in the prediction model were discussed, using information entropy criteria to make up for the shortcomings of the rough intensive reduction method. The effectiveness and practicality of data mining technology in the prediction of gas disasters are confirmed through practical application.
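The abstract does not state which entropy criterion is used. As a minimal sketch of one common choice, the following computes the conditional entropy of a decision attribute given a set of condition attributes, and ranks an attribute by how much it reduces that entropy. The record fields (`gas`, `vent`, `depth`, `risk`) and the function names are hypothetical illustrations, not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def conditional_entropy(rows, attrs, decision):
    """H(decision | attrs) in bits, over the equivalence classes induced by attrs."""
    blocks = defaultdict(list)
    for row in rows:
        # Rows with identical values on attrs fall into the same equivalence class.
        blocks[tuple(row[a] for a in attrs)].append(row[decision])
    total = len(rows)
    entropy = 0.0
    for values in blocks.values():
        p_block = len(values) / total
        for count in Counter(values).values():
            p = count / len(values)
            entropy -= p_block * p * math.log2(p)
    return entropy


def significance(rows, base, attr, decision):
    """Entropy reduction obtained by adding attr to the attribute list base."""
    before = conditional_entropy(rows, base, decision)
    after = conditional_entropy(rows, base + [attr], decision)
    return before - after


# Hypothetical monitoring records, invented for illustration only.
rows = [
    {"gas": "high", "vent": "poor", "depth": "deep", "risk": "yes"},
    {"gas": "high", "vent": "good", "depth": "deep", "risk": "yes"},
    {"gas": "low", "vent": "poor", "depth": "shallow", "risk": "no"},
    {"gas": "low", "vent": "good", "depth": "deep", "risk": "no"},
]

for name in ("gas", "vent", "depth"):
    print(name, significance(rows, [], name, "risk"))
```

In this toy table, `gas` alone determines `risk`, so it removes all of the uncertainty, while `vent` removes none. An entropy-guided reduction would keep the former and could drop the latter.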
Abstract: Rockburst is an important phenomenon that has affected many deep underground mines around the world. An understanding of this phenomenon is relevant to the management of such events, which can lead to saving both costs and lives. Laboratory experiments are one way to obtain a deeper and better understanding of the mechanisms of rockburst. In a previous study by these authors, a database of rockburst laboratory tests was created; in addition, with the use of data mining (DM) techniques, models to predict rockburst maximum stress and rockburst risk indexes were developed. In this paper, we focus on the analysis of a database of in situ cases of rockburst in order to build influence diagrams, list the factors that interact in the occurrence of rockburst, and understand the relationships between these factors. The in situ rockburst database was further analyzed using different DM techniques ranging from artificial neural networks (ANNs) to naive Bayesian classifiers. The aim was to predict the type of rockburst (that is, the rockburst level) based on geologic and construction characteristics of the mine or tunnel. Conclusions are drawn at the end of the paper.
Abstract: To increase the exploration depth of the Rayleigh wave method, a new idea in data acquisition, different from former principles, was applied. Suitable data acquisition parameters were determined on the basis of a large number of experiments. By reducing the group interval, the low-frequency signals are enhanced instead of being attenuated. Furthermore, because the precision of the Rayleigh wave exploration method depends heavily on the signal-to-noise ratio, some preprocessing methods were put forward. By using zero-shift rectification, digital F-K filtering and cutting, noise can be effectively eliminated.
Abstract: The authors designed a spatial data mining system for ore-forming prediction based on the theory and methods of data mining and on spatial database techniques, in combination with the characteristics of geological information data. The system consists of data management, data mining and knowledge discovery, and knowledge representation. It can effectively integrate multi-source geoscience data, such as geology, geochemistry, geophysics and RS. The system digitizes geological information data as data layer files consisting of two numerical values and stores these files in the system database. Based on combinations of geological information characteristics, metallogenic prognosis was carried out for an example area in Heilongjiang Province, and the prospect area of a hydrothermal copper deposit was determined.
Abstract: A data mining method for quality prediction using association rules (DMAR) is presented in this paper. Association rules are used to mine valuable relations among items in large amounts of textile process data for an ANN prediction model. DMAR consists of three main steps: setting up the knowledge data set; cleaning and converting the data; and finding the item sets with large support and generating the expected rules. DMAR effectively improves the precision of yarn-break prediction. It rapidly removes the negative influence of training parameters on the prediction model, so that a more satisfactory quality prediction result can be reached.
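The third step above (finding item sets with large support and generating rules) can be sketched with a generic Apriori-style procedure. This is not the DMAR implementation, which the abstract does not describe. The transactions, item names and thresholds below are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    total = len(transactions)
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    result = {}
    size = 1
    while candidates:
        level = {}
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t) / total
            if support >= min_support:
                level[cand] = support
        result.update(level)
        # Join frequent itemsets of the current size into candidates one item larger.
        size += 1
        candidates = list(
            {a | b for a, b in combinations(list(level), 2) if len(a | b) == size}
        )
    return result


def rules(freq, min_confidence):
    """Return (lhs, rhs, support, confidence) for rules meeting min_confidence."""
    found = []
    for itemset, support in freq.items():
        for k in range(1, len(itemset)):
            for lhs_items in combinations(sorted(itemset), k):
                lhs = frozenset(lhs_items)
                # Every subset of a frequent itemset is frequent, so freq[lhs] exists.
                confidence = support / freq[lhs]
                if confidence >= min_confidence:
                    found.append((lhs, itemset - lhs, support, confidence))
    return found


# Hypothetical, already cleaned and discretized process records.
transactions = [
    {"high_speed", "low_humidity", "break"},
    {"high_speed", "low_humidity", "break"},
    {"high_speed", "normal_humidity"},
    {"low_speed", "low_humidity"},
    {"high_speed", "low_humidity", "break"},
]

freq = frequent_itemsets(transactions, min_support=0.5)
for lhs, rhs, support, confidence in rules(freq, min_confidence=0.9):
    print(sorted(lhs), "->", sorted(rhs), support, confidence)
```

Rules whose right-hand side is the quality outcome (here the hypothetical `break` item) are the ones that would plausibly be passed on as inputs or constraints to a prediction model.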
Abstract: To detect DoS attacks in networks by applying association rule mining techniques, we propose that association rules and frequent itemsets can be employed to find DoS patterns in packet streams, which describe traffic and user behaviors. The method extracts information from the log analysis of submitted packets using an algorithm that depends on the definition of the intrusion. Large itemsets were extracted to represent the super facts used to build the association analysis for the intrusion. Network data files were analysed in the experiments. The analysis and experimental results are encouraging, with better performance as the packet frequency number increases.
Abstract: The problem of scalable classification by clustering in large databases was discussed. A clustering-based classification method first generates clusters using clustering algorithms. To classify a new data point, it finds the κ nearest clusters of the data point as neighbors and assigns the data point to the dominant class of these neighbors. Existing algorithms incorporate class information in making clustering decisions and produce pure clusters (each cluster associated with only one class). We presented hybrid cluster-based algorithms, which produce clusters by unsupervised clustering and allow each cluster to be associated with multiple classes. Experimental results show that hybrid cluster-based algorithms outperform pure ones in both classification accuracy and training speed.
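A minimal sketch of the hybrid idea follows. It assumes plain k-means as the unsupervised clustering step and a summed class count over the κ nearest clusters as the "dominant class" vote. The abstract fixes neither choice, and the data and function names are hypothetical.

```python
import math
from collections import Counter


def kmeans(points, k, iters=20):
    """Plain k-means with a deterministic start (the first k points)."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            groups[nearest].append(p)
        updated = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def train(points, labels, k):
    """Cluster without using labels, then record the class counts of each cluster."""
    centroids = kmeans(points, k)
    counts = [Counter() for _ in centroids]
    for p, label in zip(points, labels):
        nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
        counts[nearest][label] += 1  # a cluster may hold several classes
    return centroids, counts


def classify(model, x, kappa=1):
    """Vote with the class counts of the kappa nearest cluster centroids."""
    centroids, counts = model
    order = sorted(range(len(centroids)), key=lambda i: math.dist(x, centroids[i]))
    votes = Counter()
    for i in order[:kappa]:
        votes.update(counts[i])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D training data with three classes and two clusters.
points = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
labels = ["a", "c", "a", "c", "b", "c"]

model = train(points, labels, k=2)
print(classify(model, (0.5, 0.5)), classify(model, (9, 9)))
```

Because labels are ignored during clustering, training cost is that of the clustering step alone. Classification compares a point against a handful of centroids rather than against every training point, which is where the scalability comes from.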
Abstract: The integration of remote sensing (RS) with geographical information systems (GIS) is a hotspot in geographical information science. A good database structure is important to this integration: it should support the complete integration of RS with GIS, handle the disagreement between the resolution of remote sensing images and the precision of GIS data, and help knowledge discovery and exploitation. In this paper, a database structure that stores spatial data based on a semantic network is presented. This database structure has several advantages. Firstly, the spatial data is stored as raster data with a space index, so image processing can be done directly on the GIS data, which is stored hierarchically according to the distinguishing precision. Secondly, simple objects are aggregated into complex ones. Thirdly, because an indexing tree depicts the aggregation relationship and indexing pictures expressed by 2-D strings describe the topological structure of the objects, the concepts of surrounding and region are expressed clearly and the semantic content of the landscape can be illustrated well. All the factors that affect the recognition of the objects are depicted in the factor space, which provides a uniform mathematical frame for the fusion of semantic and non-semantic information. Lastly, the object node, knowledge node and indexing node are integrated into one node, which enhances the system's ability in knowledge expression, intelligent inference and association. The application shows that this database structure can benefit the interpretation of remote sensing images with GIS information.
Funding: Supported by the National 973 Program of China (No. 2006CB701305, No. 2007CB310804); the National Natural Science Foundation of China (No. 60743001); the Best National Thesis Foundation (No. 2005047); the National New Century Excellent Talent Foundation (No. NCET-06-0618)
Abstract: Advanced data mining technologies and the large quantities of remotely sensed imagery provide a data mining opportunity with high potential for useful results. Extracting interesting patterns and rules from data sets composed of images and associated ground data can be of importance in object identification, community planning, resource discovery and other areas. In this paper, a data field is presented to express the observed spatial objects and conduct behavior mining on them. First, the most important aspects of behavior mining and its implications for the future of data mining are discussed, and an ideal framework for a behavior mining system in a network environment is proposed. Second, a model of behavior mining on the observed spatial objects is given, including the objects described by the first feature data field and the main feature data field by means of the potential function. Finally, a case study of object identification in public is given and analyzed. The experimental results show that the new model is feasible for behavior mining.
Funding: This work was supported in part by the Special Funds for Major State Basic Research Projects, the National Natural Science Foundation of China (Grants No. 60372033 and 9901936), and NSF CCR9901986 and DMS 0311800.
Abstract: We present our recent work on both linear and nonlinear data reduction methods and algorithms. For the linear case, we discuss results on the structure analysis of the SVD of column-partitioned matrices and on sparse low-rank approximation; for the nonlinear case, we investigate methods for nonlinear dimensionality reduction and manifold learning. The problems we address have attracted a great deal of interest in data mining and machine learning.
Funding: Supported by the Scientific Research Project of the Education Department of Hubei Province, China (No. Q20181508); the Youths Science Foundation of Wuhan Institute of Technology (No. k201622); the Surveying and Mapping Geographic Information Public Welfare Scientific Research Special Industry (No. 201412014); the Educational Commission of Hubei Province, China (No. Q20151504); the National Natural Science Foundation of China (Nos. 41501505, 61502355, 61502355, and 61502354); the China Postdoctoral Science Foundation (No. 2015M581887); the Key Program of Higher Education Institutions of Henan Province, China (No. 17A520040); and the Natural Science Foundation of Henan Province, China (No. 162300410177)
Abstract: Automatic protocol mining is a promising approach for inferring accurate and complete API protocols. However, just as with any data-mining technique, this approach requires sufficient training data (object usage scenarios). Existing approaches resolve the problem by analyzing more programs, which may cause significant runtime overhead. In this paper, we propose an inheritance-based oversampling approach for object usage scenarios (OUSs). Our technique is based on the inheritance relationship in object-oriented programs. Given an object-oriented program p, the OUSs that can be collected from a run of p are generally no more numerous than the objects used during the run. With our technique, a maximum of n times more OUSs can be achieved, where n is the average number of super-classes of all general OUSs. To investigate the effect of our technique, we implement it in our previous prototype tool, ISpec Miner, and use the tool to mine protocols from several real-world programs. Experimental results show that our technique can collect 1.95 times more OUSs than general approaches. Additionally, accurate and complete API protocols are more likely to be achieved. Furthermore, our technique can mine API protocols for classes never even used in programs, which are valuable for validating software architectures, program documentation, and understanding. Although our technique introduces some runtime overhead, the overhead is trivial and acceptable.
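The abstract does not spell out how one scenario yields several. One plausible reading, sketched below as an assumption, is that the call sequence recorded on an object is projected onto each of its super-classes by keeping only the methods that the super-class itself offers. The class hierarchy and the `oversample` function are hypothetical and are not taken from ISpec Miner.

```python
class Stream:
    def open(self): ...
    def close(self): ...


class Reader(Stream):
    def read(self): ...


class BufferedReader(Reader):
    def fill(self): ...


def oversample(cls, calls):
    """Project one recorded call sequence onto cls and each of its super-classes.

    Assumption: a call belongs to a super-class scenario when that super-class
    defines or inherits a method of the same name.
    """
    scenarios = {}
    for klass in cls.__mro__:
        if klass is object:
            continue  # the universal root carries no protocol of interest here
        projected = [name for name in calls if hasattr(klass, name)]
        if projected:
            scenarios[klass.__name__] = projected
    return scenarios


# One scenario observed on a BufferedReader instance during a hypothetical run.
observed = ["open", "fill", "read", "close"]
scenarios = oversample(BufferedReader, observed)
for name, calls in scenarios.items():
    print(name, calls)
```

Here a single observed scenario produces three, including one for `Stream`, which the run never instantiates directly. This matches the abstract's claim that protocols can be mined for classes never used in the program.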