A machine learning approach to geochemical mapping

Kirkwood, Charlie and Cave, Mark and Beamish, David and Grebby, Stephen and Ferreira, Antonio (2016) A machine learning approach to geochemical mapping. Journal of Geochemical Exploration, 167. pp. 49-61. ISSN 1879-1689

Available under Licence Creative Commons Attribution Non-commercial No Derivatives.


Geochemical maps provide invaluable evidence to guide decisions on issues of mineral exploration, agriculture, and environmental health. However, the high cost of chemical analysis means that the ground sampling density will always be limited. Traditionally, geochemical maps have been produced through the interpolation of measured element concentrations between sample sites using models based on the spatial autocorrelation of data (e.g. semivariogram models for ordinary kriging). In their simplest form such models fail to consider potentially useful auxiliary information about the region and the accuracy of the maps may suffer as a result. In contrast, this study uses quantile regression forests (an elaboration of random forest) to investigate the potential of high resolution auxiliary information alone to support the generation of accurate and interpretable geochemical maps. This paper presents a summary of the performance of quantile regression forests in predicting element concentrations, loss on ignition and pH in the soils of south west England using high resolution remote sensing and geophysical survey data.

Through stratified 10-fold cross-validation we find the accuracy of quantile regression forests in predicting soil geochemistry in south west England to be a general improvement over that offered by ordinary kriging. Concentrations of immobile elements whose distributions are most tightly controlled by bedrock lithology are predicted with the greatest accuracy (e.g. Al with a cross-validated R² of 0.79), while concentrations of more mobile elements prove harder to predict. In addition to providing a high level of prediction accuracy, models built on high resolution auxiliary variables allow informative, process-based interpretations to be made. In conclusion, this study has highlighted the ability to map and understand the surface environment with greater accuracy and detail than previously possible by combining information from multiple datasets. As the quality and coverage of remote sensing and geophysical surveys continue to improve, machine learning methods will provide a means to interpret the otherwise-uninterpretable.
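As a rough illustration of how a cross-validated R² of the kind quoted above can be computed, the following Python sketch runs stratified 10-fold cross-validation on synthetic data. It is not the authors' code. The function names (`stratified_folds`, `fit_line`, `cross_validated_r2`), the synthetic `aux` and `conc` values, and the fold-assignment scheme are all assumptions made for this example, and a one-variable least-squares line stands in for the quantile regression forest so that the example needs only the standard library.

```python
import random
import statistics


def stratified_folds(y, k=10, seed=0):
    """Assign each sample to one of k folds, stratified on the target.

    Samples are sorted by y and dealt out in consecutive blocks of k, one
    per fold in shuffled order, so every fold spans the full range of y.
    (Hypothetical scheme; the paper does not specify its stratification.)
    """
    rng = random.Random(seed)
    order = sorted(range(len(y)), key=lambda i: y[i])
    fold_of = [0] * len(y)
    for start in range(0, len(order), k):
        block = order[start:start + k]
        labels = list(range(k))
        rng.shuffle(labels)
        for idx, fold in zip(block, labels):
            fold_of[idx] = fold
    return fold_of


def fit_line(x, y):
    """Ordinary least-squares line: a stand-in model, NOT a regression forest."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def cross_validated_r2(x, y, k=10, seed=0):
    """R^2 computed from out-of-fold predictions pooled over all k folds."""
    fold_of = stratified_folds(y, k, seed)
    preds = [0.0] * len(y)
    for fold in range(k):
        train = [i for i in range(len(y)) if fold_of[i] != fold]
        test = [i for i in range(len(y)) if fold_of[i] == fold]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in test:
            preds[i] = slope * x[i] + intercept
    mean_y = statistics.fmean(y)
    ss_res = sum((y[i] - preds[i]) ** 2 for i in range(len(y)))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - ss_res / ss_tot


# Synthetic data: one auxiliary variable and a noisy "concentration".
rng = random.Random(42)
aux = [rng.uniform(0, 10) for _ in range(200)]
conc = [2.0 * a + rng.gauss(0, 2.0) for a in aux]

r2 = cross_validated_r2(aux, conc)
print(f"cross-validated R2 = {r2:.2f}")
```

Because every prediction is made by a model that never saw the sample in question, the pooled R² estimates performance at unsampled locations rather than goodness of fit to the training data.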

Item Type: Article
Keywords: Uncertainty, Modelling, Soil geochemistry, Quantile regression, Random forest, South west England
Schools/Departments: University of Nottingham UK Campus > Faculty of Engineering
Identification Number: https://doi.org/10.1016/j.gexplo.2016.05.003
Depositing User: Grebby, Dr Stephen
Date Deposited: 10 Jun 2016 09:01
Last Modified: 19 Sep 2016 17:33
URI: http://eprints.nottingham.ac.uk/id/eprint/33879
