Date of Award


Degree Name

MS in Environmental Sciences and Management


Natural Resources Management


College of Agriculture, Food, and Environmental Sciences


Gordon Rees

Advisor Department

Natural Resources Management

Advisor College

College of Agriculture, Food, and Environmental Sciences


Soil characterization provides the basic information necessary for understanding the physical, chemical, and biological properties of soils. Knowledge about soils can in turn be used to inform management practices, optimize agricultural operations, and ensure the continuation of ecosystem services provided by soils. However, current analytical standards for identifying each distinct property are costly and time-consuming. Advances in a proximal soil sensing technique known as portable X-ray fluorescence spectrometry (pXRF) demonstrate how laboratory-grade technology can be optimized for wide-scale use. pXRF analyzers use high-energy X-rays that interact with a sample to cause characteristic fluorescence, whose energy and intensity the analyzer measures to determine the chemical composition of the sample. While pXRF only measures total elemental abundance, the concentrations of certain elements have been used as proxies to develop models capable of predicting soil characteristics. This study aimed to evaluate existing models and model-building techniques for predicting soil pH, texture, cation exchange capacity (CEC), soil organic carbon (SOC), total nitrogen (TN), and C:N ratio from pXRF spectra and to assess their suitability for California soils by comparing predictions to results from laboratory methods. Multiple linear regression (MLR) and random forest (RF) models were created for each property using a training subset of data and evaluated by R2, RMSE, RPD, and RPIQ on an unseen test set. The California soils sample set comprised 480 soil samples from across the state that were subjected to laboratory analysis and to pXRF analysis in GeoChem mode. Results showed that existing data models applied to the CA soils dataset lacked predictive ability.
In comparison, data models generated using MLR with 10-fold cross validation for variable selection improved predictions, while algorithmic modeling produced the best estimates for all properties besides pH. The best models produced for each property gave RMSE values of 0.489 for pH, 10.8 for sand %, 6.06 for clay % (together predicting the correct texture class 74% of the time), 6.79 for CEC (cmolc/kg soil), 1.01 for SOC %, 0.062 for TN %, and 7.02 for C:N ratio. Whereas R2 and RMSE fluctuated inconsistently across random train/test splits, RPD and RPIQ were more stable, which may indicate that they better represent out-of-sample applicability. RF modeling for TN content provided the best predictive model overall (R2 = 0.782, RMSE = 0.062, RPD = 2.041, and RPIQ = 2.96). RF models for CEC and TN % achieved RPD values >2, indicating stable predictive models (Cheng et al., 2021). Lower RPD values between 1.75 and 2, with RPIQ >2, were also found for MLR models of CEC and TN %, as well as RF models for SOC. Better estimates for chemical properties (CEC, N, SOC) than for physical properties (texture) may be attributable to a correlation between elemental signatures and organic matter. All models were improved by the addition of categorical variables (land use and sample set), but this came at a considerable statistical cost (9 extra predictors). Separating models by land type and lab characterization method revealed some improvements within land types, but these effects could not be fully untangled from sample set. Thus, relying on a consortium of characterizing bodies for the ‘true’ lab data may have hindered model performance by confounding inter-lab errors with predictive errors. Future studies using pXRF analysis for soil property estimation should investigate how predictive models are affected by characterization method and characterizing laboratory.
While statewide models for California soils provided what may be an acceptable level of error for some applications, models calibrated for a specific site using consistent lab characterization methods likely provide a higher degree of accuracy for indirect measurements of some key soil properties.
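The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas used in the study, so the definitions below are the commonly used ones and are assumptions here: RPD is taken as the sample standard deviation of the observed values divided by RMSE, and RPIQ as the interquartile range of the observed values divided by RMSE. The function name `regression_metrics` and the example values are hypothetical and are not data from the study.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, RPD and RPIQ for observed vs. predicted values.

    Assumed definitions (not confirmed by the source):
      RMSE = sqrt(mean((y_true - y_pred)^2))
      R2   = 1 - SS_residual / SS_total
      RPD  = sample standard deviation of y_true / RMSE
      RPIQ = (Q3 - Q1) of y_true / RMSE
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    residuals = y_true - y_pred
    rmse = float(np.sqrt(np.mean(residuals ** 2)))

    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot

    # ddof=1 gives the sample standard deviation (an assumption).
    sd = float(np.std(y_true, ddof=1))
    q1, q3 = np.percentile(y_true, [25, 75])

    return {
        "R2": r2,
        "RMSE": rmse,
        "RPD": sd / rmse,
        "RPIQ": float(q3 - q1) / rmse,
    }


if __name__ == "__main__":
    # Hypothetical lab-measured and model-predicted values.
    observed = [1.0, 2.0, 3.0, 4.0, 5.0]
    predicted = [1.5, 2.5, 3.5, 4.5, 5.5]
    for name, value in regression_metrics(observed, predicted).items():
        print(f"{name}: {value:.3f}")
```

Because RPD and RPIQ scale the error by the spread of the test-set observations, they partly account for how variable a given random split happens to be. This is one plausible reading of why they appeared more stable across splits than R2 and RMSE.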