Article Abstract
A random forest model for predicting soil heavy metal content based on SMOTE-assisted partition error control
Submitted: 2024-03-27
DOI:10.13254/j.jare.2024.0192
Keywords: soil heavy metal prediction, random forest, SMOTE oversampling, ordinary kriging, co-kriging, inverse distance weighting
Funding: "Belt and Road" Innovative Talent Exchange Foreign Expert Project (DL2022051004L)
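Author, Affiliation, E-mail
Chen Min (陈敏): Agro-Environmental Protection Institute, Ministry of Agriculture and Rural Affairs, Tianjin 300191
Xiangtan Comprehensive Experimental Station, Chinese Academy of Agricultural Sciences, Xiangtan, Hunan 411100

Dong Zexin (董泽馨): Agro-Environmental Protection Institute, Ministry of Agriculture and Rural Affairs, Tianjin 300191
Xiangtan Comprehensive Experimental Station, Chinese Academy of Agricultural Sciences, Xiangtan, Hunan 411100

Qin Li (秦莉): Agro-Environmental Protection Institute, Ministry of Agriculture and Rural Affairs, Tianjin 300191
Xiangtan Comprehensive Experimental Station, Chinese Academy of Agricultural Sciences, Xiangtan, Hunan 411100
E-mail: ql-tj@163.com

Zhang Chenchen (张晨晨): Agro-Environmental Protection Institute, Ministry of Agriculture and Rural Affairs, Tianjin 300191
Xiangtan Comprehensive Experimental Station, Chinese Academy of Agricultural Sciences, Xiangtan, Hunan 411100

Zhang Yanru (张彦儒): Agro-Environmental Protection Institute, Ministry of Agriculture and Rural Affairs, Tianjin 300191
Xiangtan Comprehensive Experimental Station, Chinese Academy of Agricultural Sciences, Xiangtan, Hunan 411100

Sun Sijia (孙思佳): Agro-Environmental Protection Institute, Ministry of Agriculture and Rural Affairs, Tianjin 300191
Xiangtan Comprehensive Experimental Station, Chinese Academy of Agricultural Sciences, Xiangtan, Hunan 411100
College of Resources and Environment, Northeast Agricultural University, Harbin 150030
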
Abstract:
      Accurate prediction of the spatial distribution of heavy metals in soil is a key step in formulating sound land use plans and effective risk management measures. This study explores a random forest (RF) model that combines the synthetic minority oversampling technique (SMOTE) with a partition error control hybrid strategy. Using data on eight heavy metals (As, Cd, Cr, Cu, Hg, Ni, Pb, and Zn) and 29 environmental auxiliary variables from the Chang-Zhu-Tan region (Changsha, Zhuzhou, and Xiangtan), it compares the accuracy of spatial predictions of regional soil heavy metals. The proposed model was compared with whole-region and partitioned RF modeling approaches, and with three classical geostatistical methods: ordinary kriging (OK), co-kriging (CK), and inverse distance weighting (IDW). Compared with whole-region modeling, the proposed model increased R² by 15.87% to 35.39% for six heavy metals (Cd, Cr, Hg, Ni, Pb, and Zn). Compared with partitioned modeling, prediction accuracy improved significantly for all eight heavy metals, with R² increases ranging from 3.03% to 66.86%. Compared with the geostatistical methods, the proposed model performed better for five heavy metals (Cd, Cr, Hg, Pb, and Zn), with R² increases of 2.45% to 13.80% over OK, 15.09% to 89.95% over CK, and 1.57% to 102.91% over IDW. The hybrid strategy model explored in this study significantly improved the prediction accuracy for the eight heavy metals in soils of the Chang-Zhu-Tan region, indicating that combining SMOTE with a partition error control strategy has great potential in environmental science. The model not only surpasses traditional models and methods in prediction accuracy but also provides an effective new tool for environmental monitoring and management.
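
The R² gains quoted in the abstract can be illustrated with a minimal sketch. It assumes the reported percentages are relative increases over the baseline R², which the abstract does not state explicitly; a gain such as 102.91% could not be an absolute difference between two R² values in the 0 to 1 range. The function names and the sample values below are hypothetical and are not taken from the paper.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared errors of the model predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: squared deviations from the observed mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


def relative_gain_percent(r2_baseline, r2_new):
    """Relative R^2 increase in percent (assumed reading of the abstract)."""
    if r2_baseline <= 0:
        raise ValueError("relative gain needs a positive baseline R^2")
    return (r2_new - r2_baseline) / r2_baseline * 100.0


# Hypothetical illustrative values, NOT data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
baseline_pred = [2.0, 2.0, 3.0, 3.0]  # stand-in for a baseline model
hybrid_pred = [1.0, 2.0, 3.0, 5.0]    # stand-in for the proposed model

r2_baseline = r_squared(observed, baseline_pred)
r2_hybrid = r_squared(observed, hybrid_pred)
gain = relative_gain_percent(r2_baseline, r2_hybrid)
print(f"baseline R2={r2_baseline:.2f}, hybrid R2={r2_hybrid:.2f}, gain={gain:.2f}%")
```

Under this reading, a small absolute change in R² yields a large percentage when the baseline R² is low, so the percentage gains are best interpreted alongside the underlying R² values reported in the full text.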