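| DENG Huiting, ZHAO Wanyu, LI Yingyuting, HU Tian, LI Wenyan, WANG Jinjin. Machine learning coupled with multi-source variables predicts the distribution of heavy metals in farmland and ecological risks[J]. Journal of Agro-Environment Science, 2025, 44(11): 2783-2795. |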
| Title: Machine learning coupled with multi-source variables predicts the distribution of heavy metals in farmland and ecological risks (Chinese title: 机器学习耦合多源变量预测农田重金属分布与生态风险) |
| Received: 2025-05-18 |
| DOI:10.11654/jaes.2025-0458 |
| Keywords: machine learning; multi-source variables; risk assessment; spatial prediction; soil heavy metals |
| Funding: National Key Research and Development Program of China (2023YFD1700103); National Specialty Oilseed Industry Technology System (CARS-14) |
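| Author | Affiliation | E-mail |
| --- | --- | --- |
| DENG Huiting (邓辉婷) | Guangdong Provincial Engineering Technology Research Center for Farmland Soil Pollution Prevention and Control, College of Natural Resources and Environment, South China Agricultural University, Guangzhou 510642; Key Laboratory of Arable Land Conservation in South China, Ministry of Agriculture and Rural Affairs, Guangzhou 510642 | |
| ZHAO Wanyu (赵婉妤) | Guangdong Provincial Engineering Technology Research Center for Farmland Soil Pollution Prevention and Control, College of Natural Resources and Environment, South China Agricultural University, Guangzhou 510642; Key Laboratory of Arable Land Conservation in South China, Ministry of Agriculture and Rural Affairs, Guangzhou 510642 | |
| LI Yingyuting (李应裕庭) | Guangdong Provincial Engineering Technology Research Center for Farmland Soil Pollution Prevention and Control, College of Natural Resources and Environment, South China Agricultural University, Guangzhou 510642; Key Laboratory of Arable Land Conservation in South China, Ministry of Agriculture and Rural Affairs, Guangzhou 510642 | |
| HU Tian (胡甜) | Guangdong Provincial Engineering Technology Research Center for Farmland Soil Pollution Prevention and Control, College of Natural Resources and Environment, South China Agricultural University, Guangzhou 510642; Key Laboratory of Arable Land Conservation in South China, Ministry of Agriculture and Rural Affairs, Guangzhou 510642 | |
| LI Wenyan (李文彦) | Guangdong Provincial Engineering Technology Research Center for Farmland Soil Pollution Prevention and Control, College of Natural Resources and Environment, South China Agricultural University, Guangzhou 510642; Key Laboratory of Arable Land Conservation in South China, Ministry of Agriculture and Rural Affairs, Guangzhou 510642 | |
| WANG Jinjin (王进进) | Guangdong Provincial Engineering Technology Research Center for Farmland Soil Pollution Prevention and Control, College of Natural Resources and Environment, South China Agricultural University, Guangzhou 510642; Key Laboratory of Arable Land Conservation in South China, Ministry of Agriculture and Rural Affairs, Guangzhou 510642 | wangjinjin@scau.edu.cn |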
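| Abstract (translated from the Chinese version): |
| To identify the spatial variability of heavy metals and their potential ecological risk, this study collected 1,166 farmland soil samples in Taishan City, Guangdong Province, measured their chromium (Cr), lead (Pb), arsenic (As), mercury (Hg), and cadmium (Cd) contents, and obtained 17 covariates. Recursive feature elimination (RFE) was used to select 10 important factors, which were combined with random forest (RF), support vector machine (SVM), and artificial neural network (ANN) models to choose the best prediction model and factor subset. The geo-accumulation index was then used to assess the degree of pollution risk and to map its distribution. The R² of the RF model exceeded 0.940 on the training set for all metals and, except for Cr, ranged from 0.583 to 0.766 on the test set; RF performed best among the three models. SHAP analysis showed that precipitation, nighttime light intensity, distance to exploited mineral deposits, and distance to enterprises were the main driving factors. According to the geo-accumulation index, areas with moderate or higher pollution accounted for 5.6% of the study area for Cd and 35.5% for Hg. Soil heavy metal pollution in Taishan City is therefore jointly influenced by natural and anthropogenic factors, with Cd and Hg as the main pollutants. High-risk areas lie mainly in the southwest and north of the city and require priority attention. |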
| Abstract (English version): |
| To identify the spatial variation of heavy metals and their potential ecological risks, we collected 1,166 farmland soil samples in Taishan City, Guangdong Province, measured the concentrations of five heavy metals (Cr, Pb, As, Hg, Cd), and obtained 17 co-factors. Using recursive feature elimination (RFE), the 10 most influential factors were identified and combined with three machine learning models (RF, SVM, ANN) to select the best prediction model. The geo-accumulation index was then used to assess pollution risk and generate a distribution map. The RF model performed best, with R² values above 0.940 on the training set and of 0.583 to 0.766 on the test set (except for Cr). In contrast, the SVM model had R² values of 0.275 to 0.533 on the training set and 0.226 to 0.461 on the test set, and the ANN model had R² values of 0.156 to 0.587 on the training set and 0.183 to 0.489 on the test set. SHAP analysis identified the key factors influencing predictions: precipitation, night light intensity, and distance to exploited metal mines for Cd and Pb; precipitation, distance to exploited metal mines, and distance to industrial enterprises for As and Hg. The study area had no strong pollution, but Cd and Hg showed moderate pollution over 5.6% and 35.5% of the area, respectively, indicating a need for focused pollution control. The RF model showed excellent prediction performance with strong generalization and application potential. Soil heavy metals in Taishan City were influenced by both natural and anthropogenic factors, with Cd and Hg being the main pollutants, primarily distributed in the southwest and northern areas. These results highlight the need to prioritize these regions for pollution control and remediation. |
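The abstract reports the share of the area at moderate pollution or above from the geo-accumulation index. As a minimal sketch of that calculation, the Python below uses the standard Müller form, I_geo = log2(C / (1.5 × B)), with its usual seven classes. The function names, the handling of class boundaries, and every number in the demo are illustrative assumptions; the abstract does not give the background values or the implementation the authors used.

```python
import math

# Mueller's geo-accumulation classes as (upper bound of I_geo, label).
# Whether a boundary value belongs to the lower or upper class varies
# between papers; treating it as the lower class is an assumption here.
IGEO_CLASSES = [
    (0.0, "unpolluted"),
    (1.0, "unpolluted to moderately polluted"),
    (2.0, "moderately polluted"),
    (3.0, "moderately to strongly polluted"),
    (4.0, "strongly polluted"),
    (5.0, "strongly to extremely polluted"),
]


def igeo(concentration, background, k=1.5):
    """Return I_geo = log2(C / (k * B)); k = 1.5 is the conventional factor."""
    if concentration <= 0 or background <= 0:
        raise ValueError("concentration and background must be positive")
    return math.log2(concentration / (k * background))


def igeo_class(value):
    """Map an I_geo value to its class label."""
    for upper, label in IGEO_CLASSES:
        if value <= upper:
            return label
    return "extremely polluted"


def share_moderate_or_above(values, threshold=1.0):
    """Fraction of values with I_geo above the threshold (moderate or worse)."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(v > threshold for v in values) / len(values)


if __name__ == "__main__":
    # Hypothetical numbers for illustration only, not data from the study.
    background_mg_kg = 0.5
    predicted_mg_kg = [0.4, 1.2, 3.0, 6.5]
    scores = [igeo(c, background_mg_kg) for c in predicted_mg_kg]
    for c, s in zip(predicted_mg_kg, scores):
        print(f"{c} mg/kg: I_geo = {s:.2f} ({igeo_class(s)})")
    print("share moderate or above:", share_moderate_or_above(scores))
```

Applied to a grid of model-predicted concentrations, `share_moderate_or_above` gives an area fraction only if every grid cell covers the same area; cells of unequal size would need area weights.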