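Article abstract
Feng Feng, Wang Yuhong, Zuo Yufang. A study on factors that influence the spatial distribution of soil cadmium pollution based on RF-XGBoost[J]. Journal of Agro-Environment Science, 2023, 42(4): 811-819.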
A study on factors that influence the spatial distribution of soil cadmium pollution based on RF-XGBoost
Submitted: 2022-09-02
DOI:10.11654/jaes.2022-0884
Keywords: soil heavy metal pollution; Random Forest; eXtreme Gradient Boosting; influencing factors; spatial distribution
Funding: National Natural Science Foundation of China (U1304401); Jiangsu Province Postgraduate Research and Innovation Program (KYCX21_2572)
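Author; Affiliation; E-mail
Feng Feng; School of Geography, Geomatics and Planning, Jiangsu Normal University, Xuzhou, Jiangsu 221000
Wang Yuhong; School of Geography, Geomatics and Planning, Jiangsu Normal University, Xuzhou, Jiangsu 221000; wyhhyk@126.com
Zuo Yufang; School of Geography, Geomatics and Planning, Jiangsu Normal University, Xuzhou, Jiangsu 221000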
Abstract:
      To reduce the cost of manual sampling in soil heavy metal research and to capture the spatial distribution of soil heavy metal pollution across a large study area, this study took Guiyang, Zunyi, and Bijie as the study area and cadmium (Cd) as the target element. Random Forest (RF) was used to evaluate the contribution rate of each influencing factor; the factors were then screened by contribution rate and used to build an eXtreme Gradient Boosting (XGBoost) model. The resulting RF-XGBoost model was used to predict the spatial distribution of soil Cd pollution in the study area. The results showed that the mean soil Cd content in the study area was only 0.02 mg·kg⁻¹ higher than the background value of Guizhou Province, indicating a low degree of pollution, while the coefficient of variation was 125.37%, indicating strong variability. The factors contributing most to soil Cd pollution were soil erosion degree, elevation, and annual mean temperature, with contribution rates of 0.100, 0.088, and 0.084, respectively, which suggests that the natural environment has the greatest influence on soil Cd enrichment at a large scale. The RF-XGBoost model was more accurate and more stable than the RF and XGBoost models: accuracy increased by 0.0393, and the Kappa coefficient and F1_score increased by 0.0592 and 0.0914 and by 0.2504 and 0.2701 over the RF and XGBoost models, respectively. Soil Cd pollution in the study area was low overall, but several moderate to moderately strong pollution zones appeared in southwestern Bijie. These results indicate that the RF-XGBoost model can accurately predict the spatial distribution of soil Cd pollution over a large area and can support a macro-level understanding of that distribution.
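The abstract compares the models by accuracy, Kappa coefficient, and F1_score. As a reading aid, the sketch below shows how these three metrics are commonly computed for a multi-class pollution-grade prediction. It is an illustration under stated assumptions, not the authors' code: the function names, the macro averaging of F1, and the eight sample labels are all hypothetical, and the abstract does not say which F1 averaging the study used.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Share of samples whose predicted class equals the observed class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement beyond what chance alone would produce.

    Undefined (division by zero) when chance agreement equals 1, i.e. when
    every observed and predicted label belongs to a single class.
    """
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: product of the marginal class counts, summed over classes.
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up pollution grades for eight hypothetical sampling points
# (0 = clean, 1 = light, 2 = moderate); not data from the study.
observed = [0, 0, 0, 0, 1, 1, 2, 2]
predicted = [0, 0, 0, 1, 1, 1, 2, 0]

print(accuracy(observed, predicted))
print(cohen_kappa(observed, predicted))
print(macro_f1(observed, predicted))
```

Kappa discounts the agreement expected by chance, so it is a useful complement to raw accuracy when most samples fall into one class, as a low overall pollution level would imply.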