Abstract
Real-time evaluation of lake water quality based on monitoring data and machine learning algorithms is of great significance for the management, maintenance, and protection of lake water resources. To evaluate the water quality class of Chaohu Lake, this paper uses the random forest (RF) classification algorithm to determine the water quality category in this area. Compared with other algorithms, random forest offers high accuracy and strong tolerance to noise, among other advantages. Test results show that with the number of decision trees ntree = 300 and the number of attributes in the split attribute set mtry = 2, the water quality classification accuracy reaches 96.15% at the Hefei Hubin monitoring section and 100% at the Chaohu Yuxikou monitoring section. The method is robust, practical, and generalizes well, and can effectively evaluate water quality.
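The roles of ntree and mtry can be illustrated with a minimal standard-library sketch of a random forest classifier. It is not the authors' implementation: the data are synthetic, the four indicator columns and three class labels are hypothetical, and the helper names (`gini`, `best_split`, `build_tree`, `fit_forest`, `predict_forest`, `make_data`) are introduced here only for illustration. Each tree is grown on a bootstrap sample, each node considers only mtry randomly chosen attributes, and the forest predicts by majority vote.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, features):
    """Best (feature, threshold) among the candidate features, or None."""
    best, best_score = None, gini(labels)
    for f in features:
        values = sorted({r[f] for r in rows})
        for t in values[:-1]:  # splitting at the max would leave one side empty
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, mtry, rng):
    """Grow one unpruned tree; leaves are labels, inner nodes are tuples."""
    if len(set(labels)) == 1:
        return labels[0]
    # Split attribute set: mtry attributes drawn at random for this node.
    features = rng.sample(range(len(rows[0])), mtry)
    split = best_split(rows, labels, features)
    if split is None:  # no candidate lowers impurity: majority-class leaf
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            build_tree([rows[i] for i in left], [labels[i] for i in left], mtry, rng),
            build_tree([rows[i] for i in right], [labels[i] for i in right], mtry, rng))

def predict_tree(node, row):
    """Walk from the root to a leaf and return its label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(rows, labels, ntree=300, mtry=2, seed=0):
    """Train ntree trees, each on a bootstrap sample of the training set."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(ntree):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], mtry, rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]

def make_data(n, rng):
    """Synthetic stand-in for monitoring data (NOT the Chaohu measurements)."""
    rows, labels = [], []
    for _ in range(n):
        cls = rng.randrange(3)  # hypothetical water quality classes 0, 1, 2
        rows.append([rng.gauss(cls * 3.0, 0.5) for _ in range(4)])
        labels.append(cls)
    return rows, labels

rng = random.Random(42)
train_x, train_y = make_data(90, rng)
test_x, test_y = make_data(60, rng)
forest = fit_forest(train_x, train_y, ntree=300, mtry=2, seed=1)
hits = sum(predict_forest(forest, x) == y for x, y in zip(test_x, test_y))
accuracy = hits / len(test_y)
print(f"held-out accuracy: {accuracy:.2%}")
```

The accuracy printed here reflects only the synthetic, well-separated data and says nothing about the figures reported for the Hefei Hubin or Yuxikou sections. In practice a library implementation would normally be preferred over a hand-written forest.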
Source
Chinese Journal of Environmental Engineering (《环境工程学报》)
Indexed in: CAS, CSCD, Peking University Core Journals (北大核心)
2016, Issue 2, pp. 992-998 (7 pages)
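Funding
National Natural Science Foundation of China (61273068)
Natural Science Foundation of Shanghai (12ZR1412600)
Scientific Research Innovation Program of the Shanghai Municipal Education Commission (13YZ084)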
Keywords
random forest algorithm
decision tree
split attribute set
water quality assessment