Rising competition, economic recession, and financial crises have increased business failures, prompting researchers to develop new approaches that yield more accurate and more reliable results. The classification and regression tree (CART) is one of the modeling techniques developed for this purpose. This study explains the CART method and tests its power to predict financial failure. CART is applied to data from industrial companies traded on the Istanbul Stock Exchange (ISE) between 1997 and 2007. The results show that CART has high predictive power for financial failure one, two, and three years prior to failure, and that profitability ratios are the most important ratios in predicting failure.
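As an illustration of how a CART classifier picks a split, the following minimal sketch scans the thresholds of one numeric feature and keeps the one with the lowest weighted Gini impurity. The function names, the toy profitability ratio `roa`, and the labels are hypothetical and are not taken from the study.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, weighted_gini) of the best binary split x <= threshold."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))  # no split yet: impurity of the parent node
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical values cannot be separated
        t = (pairs[i - 1][0] + pairs[i][0]) / 2  # midpoint candidate threshold
        left = [y for x, y in pairs if x <= t]
        right = [y for x, y in pairs if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical toy data: a profitability ratio and a failed (1) / healthy (0) label.
roa = [-0.12, -0.05, -0.01, 0.03, 0.08, 0.15]
failed = [1, 1, 1, 0, 0, 0]
threshold, impurity = best_split(roa, failed)
print(threshold, impurity)
```

A full CART implementation would apply this search to every feature, recurse on the two child nodes, and prune the resulting tree; only the split criterion is sketched here.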
This study investigates the use of a decision tree classification model, combined with Principal Component Analysis (PCA), to distinguish between Assam and Bhutan ethnic groups based on specific anthropometric features, including age, height, tail length, hair length, bang length, reach, and earlobe type. The dataset was reduced using PCA, which identified height, reach, and age as key features contributing to variance. However, while PCA effectively reduced dimensionality, it faced challenges in clearly distinguishing between the two ethnic groups, a limitation noted in previous research. In contrast, the decision tree model performed significantly better, establishing clear decision boundaries and achieving high classification accuracy. The decision tree consistently selected Height and Reach as the most important classifiers, a finding supported by existing studies on ethnic differences in Northeast India. The results highlight the strengths of combining PCA for dimensionality reduction with decision tree models for classification tasks. While PCA alone was insufficient for optimal class separation, its integration with decision trees improved both the model’s accuracy and interpretability. Future research could explore other machine learning models to enhance classification and examine a broader set of anthropometric features for more comprehensive ethnic group classification.
Flood disasters can seriously affect people's production and lives and can cause huge losses of life and property. Based on multi-source remote sensing data, this study established decision tree classification rules through multi-source and multi-temporal feature fusion, classified ground objects before the disaster, and extracted flood information in the disaster area from optical images acquired during the disaster, so as to rapidly determine the disaster situation of each disaster-bearing object. In the case of Qianliang Lake, which suffered from flooding in 2020, the results show that decision tree classification algorithms based on multi-temporal features can effectively integrate multi-temporal and multispectral information to overcome the shortcomings of single-temporal image classification and achieve ground-truth object classification.
As one of the main geographical elements in urban areas, buildings are closely related to the development of the city. Quickly and accurately extracting building information from remote sensing images is therefore of great significance for urban map updating, urban planning and construction, and related tasks. Extracting building information around power facilities, especially from high-resolution images, has become one of the current hot topics in remote sensing research. This study made full use of the characteristics of GF-2 satellite remote sensing images, adopted an object-oriented classification method combined with multi-scale segmentation and the CART classification algorithm, and successfully extracted the buildings in the study area. The results showed that the overall classification accuracy reached 89.5% and the Kappa coefficient was 0.86. Building extraction with the object-oriented CART classification algorithm matched actual ground objects more closely and had higher accuracy. The extraction of buildings in the city contributes to urban development planning and provides decision support for management.
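The two accuracy figures reported above, overall accuracy and the Kappa coefficient, are both derived from a confusion matrix. The sketch below shows the standard calculation; the 2x2 matrix is a hypothetical example and does not reproduce the study's data.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

def cohen_kappa(cm):
    """Cohen's kappa: agreement beyond what is expected by chance."""
    total = sum(sum(row) for row in cm)
    po = overall_accuracy(cm)  # observed agreement
    row_tot = [sum(row) for row in cm]
    col_tot = [sum(cm[i][j] for i in range(len(cm))) for j in range(len(cm))]
    pe = sum(r * c for r, c in zip(row_tot, col_tot)) / total ** 2  # chance agreement
    return (po - pe) / (1 - pe)

# Hypothetical confusion matrix: rows = reference class, columns = predicted class
# (building, non-building).
cm = [[45, 5],
      [5, 45]]
acc = overall_accuracy(cm)
kappa = cohen_kappa(cm)
print(acc, kappa)
```

Kappa is lower than overall accuracy because it discounts the agreement that a random classifier with the same class totals would reach.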
Using groundwater level monitoring data from the Shuping landslide in the Three Gorges Reservoir area, the factors influencing groundwater level were selected based on the response relationship between factors such as rainfall and reservoir level and the change in groundwater level. A classification and regression tree (CART) model was then constructed from this subset of factors and used to predict the groundwater level. In verification, the predictions for the test sample were consistent with the measured values, with a mean absolute error of 0.28 m and a relative error of 1.15%. For comparison, a support vector machine (SVM) model constructed using the same set of factors gave a mean absolute error of 1.53 m and a relative error of 6.11%. This indicates that the CART model not only has better fitting and generalization ability but also has strong advantages in analyzing the dynamic characteristics of landslide groundwater and in screening important variables. It is an effective method for predicting groundwater level in landslides.
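The two error measures used for this comparison can be computed as in the sketch below. The abstract does not define "relative error", so taking it as the mean of |observed - predicted| / |observed| is an assumption, and the water levels are hypothetical values rather than the study's measurements.

```python
def mean_absolute_error(observed, predicted):
    """Average absolute difference, in the unit of the data (here metres)."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def mean_relative_error(observed, predicted):
    """Average of |o - p| / |o| (assumed definition; observed values must be non-zero)."""
    return sum(abs(o - p) / abs(o) for o, p in zip(observed, predicted)) / len(observed)

# Hypothetical groundwater levels in metres.
observed = [160.0, 162.0, 165.0, 170.0]
predicted = [160.4, 161.8, 165.5, 169.5]
mae = mean_absolute_error(observed, predicted)
mre = mean_relative_error(observed, predicted)
print(f"MAE = {mae:.2f} m, relative error = {mre:.2%}")
```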
Understanding the underlying structure of phylogenetic trees is very important, as it informs the methods that should be employed during phylogenetic inference. The methods used for a structured population differ from those needed when a population is not structured. In this paper, we compared two supervised machine learning techniques, namely artificial neural network (ANN) and logistic regression models, for predicting the underlying structure of phylogenetic trees. We carried out parameter tuning to identify optimal models and then performed 10-fold cross-validation on the optimal logistic regression and ANN models. We also applied an unsupervised technique, clustering, to determine the number of clusters that could be identified from simulated phylogenetic trees. The trees came from both structured and non-structured populations. Clustering and classification-based prediction were done using tree statistics such as the Colless, Sackin, and cophenetic indices, among others. Results from 10-fold cross-validation revealed that the logistic regression and ANN models performed comparably, with both achieving average accuracy rates of over 0.75. Most of the clustering indices used gave 2 or 3 as the optimal number of clusters.
This paper presents a supervised learning algorithm for retinal vascular segmentation based on the classification and regression tree (CART) algorithm and an improved adaptive boosting (AdaBoost) algorithm. Local binary pattern (LBP) texture features and local features are obtained by extracting, reversing, dilating, and enhancing the green component of retinal images to construct a 17-dimensional feature vector. A dataset is constructed from the feature vectors and the data manually marked by experts. The features are used to generate CART binary trees, which serve as the AdaBoost weak classifiers, and AdaBoost is improved by adding re-judgment functions to form a strong classifier. The proposed algorithm is evaluated on the Digital Retinal Images for Vessel Extraction (DRIVE) dataset. The experimental results show that the proposed algorithm has higher segmentation accuracy for blood vessels and that the result contains essentially complete blood vessel details. Moreover, the segmented blood vessel tree has good connectivity and largely reflects the distribution trend of the blood vessels. Compared with the traditional AdaBoost classification algorithm and the support vector machine (SVM) based classification algorithm, the proposed algorithm has a higher average accuracy and reliability index, and its results are similar to those of state-of-the-art segmentation algorithms.
To solve multi-class fault diagnosis tasks, a decision tree support vector machine (DTSVM), which combines SVM and a decision tree using the concept of dichotomy, is proposed. Since the classification performance of DTSVM depends heavily on its structure, a genetic algorithm is introduced into the formation of the decision tree to cluster the classes so that the distance between the cluster centers of the two sub-classes is maximized; the most separable classes are thus separated at each node of the decision tree. Numerical simulations on three datasets, compared with the "one-against-all" and "one-against-one" methods, demonstrate that the proposed method has better performance and higher generalization ability than these two conventional methods.
Machine learning algorithms are an important means of performing landslide susceptibility assessments, but most studies use GIS-based classification methods to conduct susceptibility zonation. This study presents a machine learning approach based on the C5.0 decision tree (DT) model and the K-means cluster algorithm to produce a regional landslide susceptibility map. Yanchang County, a typical landslide-prone area located in northwestern China, was taken as the area of interest to introduce the proposed application procedure. A landslide inventory containing 82 landslides was prepared and then randomly partitioned into two subsets: training data (70% of landslide pixels) and validation data (30% of landslide pixels). Fourteen landslide influencing factors were considered in the input dataset and were used to calculate the landslide occurrence probability with the C5.0 decision tree model. Susceptibility zonation was implemented according to the cut-off values calculated by the K-means cluster algorithm. The model performance analysis showed that the AUC (area under the receiver operating characteristic (ROC) curve) of the proposed model was the highest, reaching 0.88, compared with traditional models (support vector machine (SVM) = 0.85, Bayesian network (BN) = 0.81, frequency ratio (FR) = 0.75, weight of evidence (WOE) = 0.76). The landslide frequency ratio and frequency density of the high susceptibility zones were 6.76/km² and 0.88/km², respectively, much higher than those of the low susceptibility zones. The top 20% interval of landslide occurrence probability contained 89% of the historical landslides but accounted for only 10.3% of the total area. Our results indicate that the distribution of high susceptibility zones was more focused, without containing more "stable" pixels. Therefore, the obtained susceptibility map is suitable for application to landslide risk management practices.
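The idea of deriving zonation cut-offs by clustering predicted probabilities can be sketched with a one-dimensional K-means. The abstract does not say how the cut-off values are obtained from the clusters; placing them at the midpoints between adjacent cluster centres, the initialisation scheme, and the probability values are all assumptions made for illustration.

```python
def kmeans_1d(values, k, iters=100):
    """Cluster scalar values into k groups; return (sorted centres, cut-offs)."""
    xs = sorted(values)
    # Assumed initialisation: centres at evenly spaced positions in the sorted data.
    centers = [xs[(2 * i + 1) * len(xs) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda c: abs(x - centers[c]))
            clusters[nearest].append(x)
        # Recompute each centre as its cluster mean; keep the old centre if empty.
        new = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new == centers:
            break  # assignments are stable
        centers = new
    centers.sort()
    # Assumed cut-off rule: midpoint between neighbouring centres.
    cutoffs = [(a + b) / 2 for a, b in zip(centers, centers[1:])]
    return centers, cutoffs

# Hypothetical landslide occurrence probabilities for nine pixels.
probs = [0.05, 0.08, 0.10, 0.45, 0.50, 0.55, 0.90, 0.92, 0.95]
centers, cutoffs = kmeans_1d(probs, k=3)
print(centers, cutoffs)
```

With three clusters, the two cut-offs divide the probability range into low, moderate, and high susceptibility zones.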
This paper focuses on improving decision tree induction algorithms when a tie appears during the rule generation procedure for specific training datasets. The tie occurs when the target class outcomes occur in equal proportions in a leaf node's records, so that majority voting cannot be applied. To resolve this exception, we propose to base the prediction on the naive Bayes (NB) estimate, k-nearest neighbour (k-NN), and association rule mining (ARM). The other features used for splitting the parent nodes are also taken into consideration.
This article presents two approaches for automatically building knowledge bases for soil resource mapping. The methods use decision tree and Bayesian predictive modeling, respectively, to generate knowledge from training data. With these methods, building a knowledge base for automated soil mapping is easier than with the conventional knowledge acquisition approach. The knowledge bases built by the two methods were used by the knowledge classifier for soil type classification of the Longyou area, Zhejiang Province, China, using bi-temporal TM imagery and GIS data. To evaluate the performance of the resulting knowledge bases, the classification results were compared with an existing soil map based on field survey. The accuracy assessment and analysis of the resulting soil maps suggested that the knowledge bases built by these two methods were of good quality for mapping the distribution of soil classes over the study area.
Karst rocky desertification is a form of land degradation resulting from the interaction of natural and human factors. In the past, supervised and unsupervised classification were often used to classify remote sensing images of rocky desertification areas. However, these methods use only pixel brightness characteristics, so the classification accuracy is low and cannot meet the needs of practical application. Decision tree classification is a new technology for remote sensing image classification. In this study, we select Kaizuo Township, a rocky desertification area, as a case study and use ASTER image data, DEM, and lithology data. By extracting the normalized difference vegetation index, ratio vegetation index, terrain slope, and other data, we establish classification rules and build decision trees. With the support of ENVI software, we obtain the classified images. By calculating the classification accuracy and kappa coefficient, we find that better classification results can be obtained and that desertification information can be extracted automatically. If more remote sensing image bands are used, a higher-resolution DEM is employed, and data errors are reduced during processing, classification accuracy can be improved further.
The traditional dormitory allocation system neglects students' personality traits, causes dormitory disputes, and affects students' quality of life and academic quality; the trend has therefore shifted toward designing an intelligent allocation system based on students' individual differences and needs. This paper collects freshmen's data on personal preferences, conducts a classification comparison, and uses a decision tree classification algorithm based on the information gain principle as the core algorithm for dormitory allocation. It determines the description rules for students' personal preferences and decision tree classification preferences, completes the conceptual design of the database, including entity relations and data dictionaries, meets students' personality classification requirements for dormitories, and lays the foundation for an intelligent dormitory allocation system.
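The information gain principle mentioned above scores a candidate attribute by how much it reduces the entropy of the class label. The sketch below shows the calculation for categorical attributes; the attribute names (`bedtime`, `noise`), the target `group`, and the four records are hypothetical and are not drawn from the paper's data.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, feature, target):
    """Entropy of the target minus its expected entropy after splitting on feature."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[feature], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

# Hypothetical freshman preference records.
rows = [
    {"bedtime": "early", "noise": "quiet", "group": "A"},
    {"bedtime": "early", "noise": "quiet", "group": "A"},
    {"bedtime": "late", "noise": "quiet", "group": "B"},
    {"bedtime": "late", "noise": "lively", "group": "B"},
]
gain_bedtime = information_gain(rows, "bedtime", "group")
gain_noise = information_gain(rows, "noise", "group")
print(gain_bedtime, gain_noise)
```

An ID3-style tree would split first on the attribute with the larger gain, which is `bedtime` in this toy example, and then repeat the calculation within each branch.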
Classification in handwritten Chinese character recognition can be viewed as a transformation in discrete vector space. From the viewpoint of discrete vector space transformation, this paper presents a new 4-corner code classifier for handwritten Chinese characters based on the decision tree inductive learning algorithm ID3. With a feature extraction controller, the classifier can reduce the number of extracted features and speed up classification. Experimental results show that the 4-corner code classifier performs well in both recognition accuracy and speed.
In many decision-making tasks, the features and the decision are ordinal. Several ordinal classification learning algorithms have been developed in recent years; however, these algorithms have been shown to be sensitive to noisy samples and do not work in real-world applications. In this work, we propose a new measure of feature quality, called rank mutual information. We then design an ordinal decision tree (REOT) construction technique based on rank mutual information. Theoretical and experimental analyses show that the proposed algorithm is effective.
Based on a discussion of the basic concepts of data mining technology and the decision tree method, and using data samples of wind and hailstorm disasters in several counties of the Mudanjiang region, a forecasting model of agro-meteorological disaster grade was established with the C4.5 decision tree classification algorithm. The model can forecast the degree of direct economic loss, providing a rational data mining model and effective analysis results.
Funding (GF-2 building extraction study): Research on Algorithm Model for Monitoring and Evaluating Typical Disaster Situations of Electric Power Equipment Based on Remote Sensing Imaging Technology of Heaven and Earth, South Grid Guangxi Power Grid Company Science and Technology Project (GXKJXM20222160).
Funding (Shuping landslide groundwater study): supported by the China Earthquake Administration, Institute of Seismology Foundation (IS201526246).
Funding (retinal vascular segmentation study): National Natural Science Foundation of China (No. 61163010).
Funding (DTSVM fault diagnosis study): supported by the National Natural Science Foundation of China (60604021, 60874054).
Funding (Yanchang County landslide susceptibility study): This research is funded by the National Natural Science Foundation of China (Grant Nos. 41807285 and 51679117); the Key Project of the State Key Laboratory of Geohazard Prevention and Geoenvironment Protection (SKLGP2019Z002); the National Science Foundation of Jiangxi Province, China (20192BAB216034); the China Postdoctoral Science Foundation (2019M652287 and 2020T130274); the Jiangxi Provincial Postdoctoral Science Foundation (2019KY08); and the Fundamental Research Funds for National Universities, China University of Geosciences (Wuhan).
Funding: Project supported by the National Natural Science Foundation of China (No. 40101014) and by the Science and Technology Committee of Zhejiang Province (No. 001110445), China.
Abstract: This article presents two approaches for the automated building of knowledge bases for soil resource mapping. The methods use decision tree and Bayesian predictive modeling, respectively, to generate knowledge from training data. With these methods, building a knowledge base for automated soil mapping is easier than with the conventional knowledge acquisition approach. The knowledge bases built by the two methods were used by the knowledge classifier for soil type classification of the Longyou area, Zhejiang Province, China, using bi-temporal TM imagery and GIS data. To evaluate the performance of the resulting knowledge bases, the classification results were compared with an existing soil map based on field survey. The accuracy assessment and analysis of the resulting soil maps suggested that the knowledge bases built by these two methods were of good quality for mapping the distribution of soil classes over the study area.
Abstract: Karst rocky desertification is a form of land degradation caused by the interaction of natural and human factors. In the past, supervised and unsupervised classification were often used to classify remote sensing images of rocky desertification areas. However, these methods use only pixel brightness characteristics, so their classification accuracy is low and cannot meet the needs of practical application. Decision tree classification is a new technology for remote sensing image classification. In this study, we select Kaizuo Township, a rocky desertification area, as a case study. Using ASTER image data, DEM and lithology data, we extract the normalized difference vegetation index, ratio vegetation index, terrain slope and other data to establish classification rules and build decision trees. The classified images are produced with the ENVI software. By calculating the classification accuracy and kappa coefficient, we find that better classification results can be obtained and that desertification information can be extracted automatically. Classification accuracy can be improved further if more remote sensing image bands are used, a higher-resolution DEM is employed, and data errors are reduced during processing.
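For reference, the two vegetation indices named above have the following standard definitions. The symbols $\rho_{\mathrm{NIR}}$ and $\rho_{\mathrm{Red}}$ (near-infrared and red reflectance) are notation assumed here; the abstract does not state which ASTER bands or which rule thresholds were used.

```latex
\mathrm{NDVI} = \frac{\rho_{\mathrm{NIR}} - \rho_{\mathrm{Red}}}{\rho_{\mathrm{NIR}} + \rho_{\mathrm{Red}}},
\qquad
\mathrm{RVI} = \frac{\rho_{\mathrm{NIR}}}{\rho_{\mathrm{Red}}}
```

NDVI ranges from -1 to 1, with higher values indicating denser green vegetation, which is why it is a natural splitting variable when separating vegetated land from exposed rock.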
Abstract: Traditional dormitory allocation systems neglect students' personality traits, which causes dormitory disputes and affects students' quality of life and academic performance. Designing an intelligent allocation system based on students' individual differences and needs has therefore become a priority. This paper collects freshmen's data on personal preferences, conducts a classification comparison, and uses the decision tree classification algorithm based on the information gain principle as the core algorithm of dormitory allocation. It determines the description rules for students' personal preferences and the decision tree classification of those preferences, and completes the conceptual design of the database, including entity relations and data dictionaries. The design meets students' personality classification requirements for the dormitory and lays the foundation for an intelligent dormitory allocation system.
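The information gain principle mentioned above can be sketched as follows. The computation is the standard one (entropy of the target minus the weighted entropy after splitting on an attribute); the attribute names and the four sample records are hypothetical and do not come from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical freshman preference records; "group" is the class to predict.
rows = [
    {"sleeps_early": "yes", "plays_games": "yes", "group": "quiet"},
    {"sleeps_early": "yes", "plays_games": "no", "group": "quiet"},
    {"sleeps_early": "no", "plays_games": "yes", "group": "lively"},
    {"sleeps_early": "no", "plays_games": "no", "group": "lively"},
]
gain_sleep = information_gain(rows, "sleeps_early", "group")
gain_games = information_gain(rows, "plays_games", "group")
```

In this toy data `sleeps_early` separates the two groups perfectly, so a tree built on information gain would split on it first, while `plays_games` carries no information about the group.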
Abstract: Classification for handwritten Chinese character recognition can be viewed as a transformation in discrete vector space. In this paper, from the viewpoint of discrete vector space transformation, a new 4-corner code classifier for handwritten Chinese characters, based on the decision tree inductive learning algorithm ID3, is presented. With a feature extraction controller, the classifier can reduce the number of extracted features and accelerate classification. Experimental results show that the 4-corner code classifier performs well in both recognition accuracy and speed.
Funding: Supported by the National Natural Science Foundation of China under Grants 60703013 and 10978011; the Key Program of the National Natural Science Foundation of China under Grant 60932008; the National Science Fund for Distinguished Young Scholars under Grant 50925625; and the China Postdoctoral Science Foundation.
Abstract: In many decision-making tasks, the features and the decision are ordinal. Several ordinal classification learning algorithms have been developed in recent years, but they have been shown to be sensitive to noisy samples and do not work well in real-world applications. In this work, we propose a new measure of feature quality, called rank mutual information. We then design an ordinal decision tree (REOT) construction technique based on rank mutual information. Theoretical and experimental analysis shows that the proposed algorithm is effective.
Funding: Supported by the Science and Technology Plan of Mudanjiang City (G200920064) and the Teaching Reform Construction of Mudanjiang Normal University (10-xj11080).
Abstract: Based on a discussion of the basic concepts of data mining technology and the decision tree method, and using data samples of wind and hailstorm disasters in several counties of the Mudanjiang region, a forecasting model of agro-meteorological disaster grade was established with the C4.5 decision tree classification algorithm. The model can forecast the degree of direct economic loss, providing a rational data mining model and effective analysis results.