The use of machine learning algorithms to design a generalized simplified denitrification model
Abstract. We propose to use machine learning (ML) algorithms to design a simplified denitrification model. Boosted regression trees (BRT) and artificial neural networks (ANN) were used to analyse the relationships and the relative influences of different input variables towards total denitrification, and an ANN was designed as a simplified model to simulate total nitrogen emissions from the denitrification process. To calibrate the BRT and ANN models and test this method, we used a database obtained collating datasets from the literature. We used bootstrapping to compute confidence intervals for the calibration and validation process. Both ML algorithms clearly outperformed a commonly used simplified model of nitrogen emissions, NEMIS, which is based on denitrification potential, temperature, soil water content and nitrate concentration. The ML models used soil organic matter % in place of a denitrification potential and pH as a fifth input variable. The BRT analysis reaffirms the importance of temperature, soil water content and nitrate concentration. Generalization, although limited to the data space of the database used to build the ML models, could be improved if pH is used to differentiate between soil types. Further improvements in model performance and generalization could be achieved by adding more data.