A study on factors that influence the spatial distribution of soil cadmium pollution based on RF-XGBoost
Received: September 02, 2022
Keywords: soil heavy metal pollution; Random Forest; eXtreme Gradient Boosting; influencing factors; spatial distribution
Author Name | Affiliation | E-mail
FENG Feng | School of Geography, Geomatics and Planning, Jiangsu Normal University, Xuzhou 221000, China |
WANG Yuhong | School of Geography, Geomatics and Planning, Jiangsu Normal University, Xuzhou 221000, China | wyhhyk@126.com
ZUO Yufang | School of Geography, Geomatics and Planning, Jiangsu Normal University, Xuzhou 221000, China |
Abstract:
      To reduce the cost of manual sampling in soil heavy metal studies and to characterize the spatial distribution of soil heavy metal pollution over a large study area, this study examined soil cadmium (Cd) in the Guiyang, Zunyi, and Bijie regions. Random Forest (RF) analysis was used to evaluate the contribution rate of each influencing factor. After the influencing factors were screened by contribution rate, an eXtreme Gradient Boosting (XGBoost) model was built on the retained factors; this combined RF-XGBoost model was used to predict the spatial distribution of soil Cd pollution in the study area. The average soil Cd content in the study area was only 0.02 mg·kg⁻¹ higher than the background value for Guizhou Province, indicating a low overall degree of pollution, while the coefficient of variation was 125.37%, indicating strong variability. The factors contributing most to soil Cd pollution were soil erosion degree, elevation, and annual average temperature, with contribution rates of 0.100, 0.088, and 0.084, respectively, which indicates that the natural environment has the greatest impact on soil Cd enrichment in the study area. The RF-XGBoost model outperformed the standalone RF and XGBoost models: accuracy increased by 0.0393, the Kappa coefficient increased by 0.0592 and 0.0914, and the F1 score increased by 0.2504 and 0.2701, respectively. Although the overall degree of soil Cd pollution in the study area is low, several moderate to moderately strong pollution zones occur in the southwest of Bijie City. These results show that the RF-XGBoost model can accurately predict the spatial distribution of soil Cd pollution at a large scale, which supports a macro-level understanding of that distribution.
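The abstract compares models by accuracy, Kappa coefficient, and F1 score. As a reading aid, the following is a minimal sketch of how these three metrics can be computed from true and predicted pollution classes. The class labels and sample values are hypothetical and are not data from the study, and macro-averaging of F1 is an assumption, since the abstract does not state which averaging the authors used.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)  # observed agreement
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Expected chance agreement from the marginal class frequencies.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Kappa is undefined when chance agreement is total;
        # returning 0.0 is a convention chosen for this sketch.
        return 0.0
    return (p_o - p_e) / (1.0 - p_e)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging, an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical pollution classes: 0 = low, 1 = moderate, 2 = moderately strong.
y_true = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 0, 2, 2, 1]

acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.4f} kappa={kappa:.4f} macro_f1={f1:.4f}")
```

Kappa and F1 are more informative than accuracy alone when one class dominates, as is likely in a study area where most sites have low pollution.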