25 Jan 2024
Status: a revised version of this preprint is currently under review for the journal BG.

Predicting dominant terrestrial biomes at a global scale using machine learning algorithms, climate variable indices, and extreme climate indices

Hisashi Sato

Abstract. Several methods have been proposed for modelling global biome distribution. Climate data are typically summarised in terms of a few climate indices. However, with the recent advancement of machine learning algorithms, such summarisation is no longer required. Extreme climate events such as intense droughts and very low temperatures cannot be captured by monthly mean climate data, which may limit the applicability of biome boundaries. In this study, I assessed the influences of machine learning algorithms, climate variable indices, and extreme climate indices on the accuracy and robustness of global biome modelling. I found that the random forest and convolutional neural network algorithms produced highly accurate models for reconstructing the global biome distribution. However, the convolutional neural network algorithm was preferable, because the random forest algorithm substantially overfit the training data relative to the other machine learning algorithms examined. Including indexed climate data slightly reduced model accuracy, whereas including extreme climate data slightly improved it. However, there were significant deviations in the distribution of values between the observed and predicted climate when extreme climate data were included; this fatally reduced the robustness of the models, which was evaluated in terms of prediction consistency. Therefore, I recommend that extreme climate data not be considered in global-scale biome prediction applications.
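The abstract refers to accuracy, overfitting, and prediction consistency without defining them on this page. As a minimal sketch only, assuming accuracy means per-grid-cell agreement with the observed biome, overfitting is read as the gap between training and test accuracy, and prediction consistency is the agreement between predictions driven by two different climate datasets, the three quantities could be computed as follows. The function names and the toy biome labels are hypothetical and are not taken from the preprint.

```python
from typing import Sequence


def accuracy(predicted: Sequence[str], observed: Sequence[str]) -> float:
    """Fraction of grid cells whose predicted biome equals the observed biome."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if len(observed) == 0:
        raise ValueError("at least one grid cell is required")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


def overfitting_gap(train_accuracy: float, test_accuracy: float) -> float:
    """Training accuracy minus test accuracy (assumed overfitting measure)."""
    return train_accuracy - test_accuracy


def prediction_consistency(pred_a: Sequence[str], pred_b: Sequence[str]) -> float:
    """Fraction of grid cells given the same biome by two prediction runs,
    e.g. the same model driven by two different climate datasets (assumption)."""
    return accuracy(pred_a, pred_b)


# Hypothetical toy labels for four grid cells, for illustration only.
observed = ["forest", "desert", "tundra", "forest"]
run_a = ["forest", "desert", "forest", "forest"]
run_b = ["forest", "grassland", "forest", "forest"]

acc_a = accuracy(run_a, observed)
consistency = prediction_consistency(run_a, run_b)
gap = overfitting_gap(train_accuracy=1.0, test_accuracy=acc_a)
```

Under these assumptions, a model can score well on accuracy yet poorly on consistency, which is the trade-off the abstract reports for extreme climate data.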

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Status: final response (author comments only)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on bg-2023-106', Anonymous Referee #1, 08 Feb 2024
  • RC2: 'Comment on bg-2023-106', Anonymous Referee #3, 16 Feb 2024
    • AC2: 'Reply on RC2', Hisashi Sato, 15 Mar 2024
  • EC1: 'Comment on bg-2023-106', Semeena Valiyaveetil Shamsudheen, 28 Mar 2024
    • AC3: 'Reply on EC1', Hisashi Sato, 28 Mar 2024


Total article views: 390 (including HTML, PDF, and XML)
  • HTML: 298
  • PDF: 62
  • XML: 30
  • Total: 390
  • Supplement: 20
  • BibTeX: 17
  • EndNote: 19
Latest update: 16 May 2024
Short summary
Modelling potential natural biome distribution is one of the classic problems in biogeoscience. This study shows how accurate models can be constructed without simplifying climate data by employing machine-learning techniques. While extreme climate data slightly improve prediction accuracy, their inclusion can significantly reduce model reliability. With the convolutional neural network algorithm emerging as the preferred choice, this research paves the way for more robust global-scale biome predictions.