31 Aug 2023
Status: this preprint is currently under review for the journal Biogeosciences (BG).

Using automated machine learning for the upscaling of gross primary productivity

Max Gaber, Yanghui Kang, Guy Schurgers, and Trevor Keenan

Abstract. Estimating gross primary productivity (GPP) over space and time is fundamental for understanding the response of the terrestrial biosphere to climate change. Eddy-covariance flux towers provide in situ estimates of GPP at the ecosystem scale, but their sparse geographical distribution limits inference at larger scales. Machine learning (ML) techniques have been used to address this problem by extrapolating local GPP measurements over space using satellite remote sensing data. However, the accuracy of the regression model can be affected by uncertainties introduced by model selection, parametrization, and choice of predictor features. Recent advances in automated ML (AutoML) provide a novel automated way to select and synthesize different ML models. In this work, we explore the potential of AutoML by training three major AutoML frameworks on eddy-covariance measurements of GPP at 243 globally distributed sites. We compared their ability to predict GPP and its spatial and temporal variability based on different sets of remote sensing predictor variables. Predictor variables derived solely from MODIS surface reflectance data and photosynthetically active radiation explained over 70 % of the monthly variability in GPP, while satellite-derived proxies for land surface temperature, evapotranspiration, soil moisture and plant functional types, and climate variables from reanalysis (ERA5-Land) further improved the frameworks' predictive ability. We found that the AutoML framework AutoSklearn consistently outperformed the other AutoML frameworks as well as a classical Random Forest regressor in predicting GPP, reaching an overall r² of 0.75. In addition, we deployed AutoSklearn to generate global wall-to-wall maps highlighting GPP patterns in good agreement with satellite-derived reference data. This research benchmarks the application of AutoML in GPP estimation and assesses its potential and limitations in quantifying global photosynthetic activity.
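The abstract reports model skill as an r² value. As a minimal sketch of how such a score can be computed, the example below assumes r² means the coefficient of determination (1 minus the ratio of residual to total sum of squares). The abstract does not state which definition the authors use (the squared Pearson correlation is another common convention), and the function name `r_squared` and the GPP values are hypothetical, made up for illustration only.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared error of the predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: variance of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Made-up monthly GPP values, for illustration only (not data from the study).
observed = [1.0, 2.0, 4.0, 6.0, 5.0, 3.0]
predicted = [1.5, 2.5, 3.5, 5.0, 5.5, 2.5]
print(round(r_squared(observed, predicted), 3))
```

Under this definition, a score of 0.75 means the predictions account for 75 % of the variance in the observed GPP, which matches the wording used in the short summary.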

Max Gaber et al.

Status: open (until 12 Oct 2023)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor


Model code and software

AutoML for GPP upscaling v1.0 Max Gaber



Total article views: 169 (including HTML, PDF, and XML)
  • HTML: 115
  • PDF: 50
  • XML: 4
  • Total: 169
  • BibTeX: 4
  • EndNote: 4
Views and downloads (calculated since 31 Aug 2023)

Viewed (geographical distribution)

Total article views: 246 (including HTML, PDF, and XML) Thereof 246 with geography defined and 0 with unknown origin.
Latest update: 24 Sep 2023
Short summary
Gross primary productivity (GPP) describes photosynthetic carbon assimilation, which plays an important role in the carbon cycle. We can measure GPP locally, but it is challenging to produce continuous estimates over larger areas. Here, we present an approach to extrapolate GPP to the global scale using satellite imagery and automated machine learning. We benchmark different models and predictor variables and achieve an estimate that captures 75 % of the variation in GPP.