Logistic regression is often used to solve linear binary classification problems in areas such as machine vision, speech recognition, and handwriting recognition. However, it usually fails on certain nonlinear multi-class problems, such as those with non-equilibrium (imbalanced) samples. Many methods have been proposed to address this, including neural networks, least-squares support vector machines, and the AdaBoost meta-algorithm; these essentially belong to the machine learning category. In this work, based on probability theory and statistical principles, we propose an improved logistic regression algorithm based on kernel density estimation for nonlinear multi-class classification. We compared our approach with other methods on non-equilibrium samples; the results show that our approach preserves sample integrity and achieves superior classification performance.
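To make the role of kernel density estimation concrete, the following is a minimal sketch, not the authors' algorithm: it estimates each class-conditional density with a Gaussian kernel and turns the densities into class posteriors with Bayes' rule. The function names, the one-dimensional feature, and the fixed `bandwidth` are all assumptions made for illustration.

```python
import numpy as np

def gaussian_kde_density(x, samples, bandwidth):
    """Gaussian kernel density estimate at points x from 1-D samples."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    samples = np.asarray(samples, dtype=float)
    # Standardized distance between every query point and every sample.
    z = (x[:, None] - samples[None, :]) / bandwidth
    norm = len(samples) * bandwidth * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * z ** 2).sum(axis=1) / norm

def kde_posteriors(x, samples_by_class, bandwidth=0.5):
    """P(class | x) via Bayes' rule, with KDE class-conditional densities
    and class priors taken from the (possibly imbalanced) sample counts."""
    n_total = sum(len(s) for s in samples_by_class)
    joint = np.stack(
        [(len(s) / n_total) * gaussian_kde_density(x, s, bandwidth)
         for s in samples_by_class],
        axis=1,
    )
    return joint / joint.sum(axis=1, keepdims=True)

# Hypothetical imbalanced three-class data (20, 5, and 3 samples).
classes = [np.linspace(-1, 1, 20), np.linspace(4.5, 5.5, 5), np.linspace(9.8, 10.2, 3)]
posteriors = kde_posteriors([0.0, 5.0, 10.0], classes)
predicted = posteriors.argmax(axis=1)
```

Because each class density is estimated from its own samples, no samples need to be discarded or duplicated to rebalance the classes, which is one way to read the abstract's claim about sample integrity.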
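In new energy output forecasting, the volatility and intermittency of new energy sources make it difficult to guarantee reliable predictions. To address this, a high-precision forecasting method for new energy output based on XGBoost and QRLSTM is proposed. The eXtreme Gradient Boosting (XGBoost) algorithm is used to establish an objective function for the new energy output data, and the objective function is approximated with a second-order Taylor expansion. Quantile regression (QR) is then combined with a long short-term memory (LSTM) recurrent neural network to build the QRLSTM model; the approximated data are fed into this model, and the output forecast is produced through its logic gates. In the test results, the proposed method's predictions of new energy unit output under different environmental conditions fit the actual values closely, indicating high accuracy.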
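The abstract does not state which loss the QRLSTM model minimizes. Quantile regression is conventionally trained with the pinball (quantile) loss, so the sketch below assumes that choice; the function name and the sample values are hypothetical.

```python
import numpy as np

def pinball_loss(y_true, y_pred, tau):
    """Mean pinball loss for quantile level tau in (0, 1).

    Under-prediction (y_true > y_pred) is penalized by tau,
    over-prediction by (1 - tau).
    """
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))

# For the 0.9 quantile, under-predicting by 1 costs more than over-predicting by 1.
under = pinball_loss([1.0], [0.0], tau=0.9)
over = pinball_loss([0.0], [1.0], tau=0.9)
```

Training one output per quantile level with this loss yields a prediction interval rather than a single point forecast, which suits a fluctuating and intermittent signal.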
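With the development of computer technology, simulation can be used to study the rheological properties of fresh concrete. The discrete element method (DEM) is well suited to the large-deformation flow of fresh concrete. Setting the physical parameters and contact parameters of the particles is the key to obtaining realistic and reliable simulation results. In this study, concrete is divided into two phases: manufactured stone and mortar. First, the physical parameters were measured, including density, coefficient of restitution, coefficient of static friction, and coefficient of rolling friction. The Hertz-Mindlin (no slip) contact model was used to represent the coarse aggregate-boundary and coarse aggregate-coarse aggregate interactions, and it was validated by angle-of-repose experiments. The Hertz-Mindlin with JKR contact model was used to describe the coarse aggregate-mortar, mortar-mortar, and mortar-boundary interactions; the JKR parameters were calibrated with slump tests, and the response surface method was used to determine the optimal parameter combination. Finally, the discrete element simulation method for fresh concrete was validated with an L-box test.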
The distribution of data has a significant impact on classification results. Data imbalance occurs when the distribution of one class is negligible compared with that of another. This increases outliers and noise, which can greatly affect the speed and performance of classification. To address these problems, this paper starts from the motivation and mathematical representation of classification and proposes a new classification method based on the relationship between different classification formulations. Taking into account the vector characteristics of the actual problem and the choice of matrix characteristics, we first analyze ordinal regression and introduce slack variables to solve the constraint problem of isolated points. We then introduce fuzzy factors, on the basis of the support vector machine, to solve the problem of the gap between isolated points, and introduce cost control to solve the problem of sample skew. Finally, based on the bi-boundary support vector machine, a two-step weight-setting twin classifier is constructed. This helps identify multiple tasks with feature-selected patterns without additional optimizers, addressing the problem that large-scale classification cannot deal effectively with a very low category distribution gap.
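The cost-control idea for skewed samples can be illustrated with a small sketch. This is not the paper's twin classifier: it assumes the common inverse-frequency weighting heuristic and applies it to a plain hinge loss, and both function names are hypothetical.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}

def weighted_hinge_loss(scores, labels, weights):
    """Mean hinge loss with each sample's error scaled by its class weight.

    Labels are assumed to be +1 or -1; scores are raw decision values.
    """
    total = 0.0
    for s, y in zip(scores, labels):
        total += weights[y] * max(0.0, 1.0 - y * s)
    return total / len(labels)

# Hypothetical 10:90 skew between the positive and negative class.
labels = [1] * 10 + [-1] * 90
weights = balanced_class_weights(labels)
loss_at_zero = weighted_hinge_loss([0.0] * 100, labels, weights)
```

With these weights, the ten minority samples contribute as much to the loss as the ninety majority samples, so a classifier cannot reduce its loss simply by ignoring the rare class.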
Funding (logistic regression paper): The authors would like to thank all anonymous reviewers for their suggestions and feedback. This work was supported by the National Natural Science Foundation of China (Grant No. 61379103).
Funding (imbalanced data classification paper): Hebei Province Key Research and Development Project (No. 20313701D); Hebei Province Key Research and Development Project (No. 19210404D); Mobile computing and universal equipment for the Beijing Key Laboratory Open Project; The National Social Science Fund of China (17AJL014); Beijing University of Posts and Telecommunications Construction of World-Class Disciplines and Characteristic Development Guidance Special Fund "Cultural Inheritance and Innovation" Project (No. 505019221); National Natural Science Foundation of China (No. U1536112); National Natural Science Foundation of China (No. 81673697); National Natural Science Foundation of China (61872046); The National Social Science Fund Key Project of China (No. 17AJL014); "Blue Fire Project" (Huizhou) University of Technology Joint Innovation Project (CXZJHZ201729); Industry-University Cooperation Cooperative Education Project of the Ministry of Education (No. 201902218004); Industry-University Cooperation Cooperative Education Project of the Ministry of Education (No. 201902024006); Industry-University Cooperation Cooperative Education Project of the Ministry of Education (No. 201901197007); Industry-University Cooperation Collaborative Education Project of the Ministry of Education (No. 201901199005); The Ministry of Education Industry-University Cooperation Collaborative Education Project (No. 201901197001); Shijiazhuang science and technology plan project (236240267A); Hebei Province key research and development plan project (20312701D).