https://doi.org/10.5194/bg-2021-323

  15 Dec 2021

Review status: this preprint is currently under review for the journal BG.

Reviews and syntheses: The promise of big soil data, moving current practices towards future potential

Katherine E. O. Todd-Brown1, Rose Z. Abramoff2, Jeffrey Beem-Miller3, Hava K. Blair4, Stevan Earl5, Kristen J. Frederick1, Daniel R. Fuka6, Mario Guevara Santamaria7, Jennifer W. Harden8, Katherine Heckman9, Lillian J. Heran1, James R. Holmquist10, Allison M. Hoyt8, David H. Klinges11, David S. LeBauer12, Avni Malhotra8,13, Shelby C. McClelland14, Lucas E. Nave15, Katherine S. Rocci16, Sean M. Schaeffer17, Shane Stoner3,18, Natasja van Gestel19, Sophie F. von Fromm3,18, and Marisa L. Younger1
  • 1Department of Environmental Engineering Science, University of Florida, Gainesville, Florida, USA
  • 2Laboratoire des Sciences du Climat et de l’Environnement, Gif-sur-Yvette, France
  • 3Max Planck Institute for Biogeochemistry, Jena, Germany
  • 4Department of Soil, Water, and Climate, University of Minnesota, St Paul, MN, USA
  • 5Global Institute of Sustainability and Innovation, Arizona State University, Tempe, AZ, USA
  • 6Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
  • 7Centro de Geociencias, Universidad Nacional Autónoma de México, Juriquilla, Querétaro, Mexico
  • 8Department of Earth System Science, Stanford University, Stanford, CA, USA
  • 9Northern Research Station, USDA Forest Service, Houghton, MI, USA
  • 10Smithsonian Environmental Research Center, Edgewater, Maryland, USA
  • 11School of Natural Resources and Environment, University of Florida, Gainesville, Florida, USA
  • 12Arizona Experiment Station, College of Agriculture and Life Sciences, University of Arizona, Tucson, AZ, USA
  • 13Department of Geography, University of Zürich, Zürich, Switzerland
  • 14Department of Soil and Crop Sciences, Graduate Degree Program in Ecology, Colorado State University, Fort Collins, CO, USA
  • 15Biological Station and Dept. of Ecology and Evolutionary Biology, University of Michigan, Pellston, MI, USA
  • 16Natural Resource Ecology Laboratory, Department of Soil and Crop Sciences, Graduate Degree Program in Ecology, Colorado State University, Fort Collins, CO, USA
  • 17Biosystems Engineering and Soil Science Department, University of Tennessee, Knoxville, TN, USA
  • 18ETH Zürich, Zürich, Switzerland
  • 19Department of Biological Sciences & TTU Climate Center, Texas Tech University, Lubbock, Texas, USA

Abstract. In the age of big data, soil data are more available than ever but, outside of a few large soil survey resources, remain largely unusable for informing soil management and understanding Earth system processes outside of the original study. Data science has promised a fully reusable research pipeline where data from past studies are used to contextualize new findings and reanalyzed for global relevance. Yet synthesis projects encounter challenges at all steps of the data reuse pipeline, including unavailable data, labor-intensive transcription of datasets, incomplete metadata, and a lack of communication between collaborators. Here, using insights from a diversity of soil, data, and climate scientists, we summarize current practices in soil data synthesis across all stages of database creation: data discovery, input, harmonization, curation, and publication. We then suggest new soil-focused semantic tools to improve existing data pipelines, such as ontologies, vocabulary lists, and community practices. Our goal is to provide the soil data community with an overview of current practices in soil data and where we need to go to fully leverage big data to solve soil problems in the next century.

Status: open (until 23 Feb 2022)

Viewed

Total article views: 731 (including HTML, PDF, and XML)
  • HTML: 592
  • PDF: 135
  • XML: 4
  • Total: 731
  • BibTeX: 1
  • EndNote: 6
Views and downloads are calculated since 15 Dec 2021.

Viewed (geographical distribution)

Total article views: 698 (including HTML, PDF, and XML), of which 698 have geography defined and 0 are of unknown origin.
Latest update: 24 Jan 2022
Short summary
Research data are becoming increasingly available online, with tantalizing possibilities for reanalysis. However, harmonizing data from different sources remains challenging. Using the soils community as an example, we walk through the various strategies that researchers currently use to integrate datasets for reanalysis. We find that manual data transcription is still extremely common and that there is a critical need for community-supported informatics tools like vocabularies and ontologies.