From simple labels to semantic image segmentation: leveraging citizen science plant photographs for tree species mapping in drone imagery

Soltani, Salim; Ferlian, Olga; Eisenhauer, Nico; Feilhauer, Hannes; Kattenborn, Teja

doi:https://doi.org/10.5194/bg-21-2909-2024

Articles | Volume 21, issue 11

© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Research article | 14 Jun 2024

Final revised paper (published on 14 Jun 2024)
Preprint (discussion started on 05 Dec 2023)

Interactive discussion

Status: closed

RC1: 'Comment on egusphere-2023-2576', Anonymous Referee #1, 12 Jan 2024

Thank you for the opportunity to review this manuscript. In this article the authors present an innovative method to incorporate citizen science photographs of trees to segment and classify ten deciduous tree species from aerial images, using a Convolutional Neural Network. The two-step approach of using simple labels of citizen science data to create masks for a segmentation model is innovative and highly relevant. I think that the paper fits well within the scope of this journal and presents an application of an interesting new approach to remote sensing.
The manuscript as a whole is very well structured. Only the first part of the abstract could be shortened significantly.
Comments
1) The first part of the abstract, that presents an overview of the problem could be shortened to make it more concise (try to summarise each section of the manuscript in 1-3 sentences).
2) l. 250 why did you choose EfficientNetV2L over the other tested backbone architectures?
3) l. 261 What percentage of the images were assigned NA? Did this influence the model training?
4) Could you explain the term “replacements” (e.g. l. 240)?
5) Do you think the amount of misclassified data could be a problem for the training of the segmentation model? (l. 297-298)
6) 0.22 cm already seems like a very high resolution. Many remote sensing studies focus on making high resolution reference data more usable over large areas (i.e. by adapting it to satellite data). You argue for the use of even finer resolution data in the future. What research objectives could be studied using this very high resolution of UAV data? Is there a research gap for very high prediction accuracy over relatively small areas? Could multispectral/hyperspectral sensors be more useful than higher resolution?

Minor comments
l. 29 Please remove the “and” between “data” and “by”
l. 51 “unleash” might not be the right word; “harness” might be better suited; “provided” might be better instead of “given”
l. 56-60 This sentence is not completely clear to me. Maybe you can reformulate it to make it easier to read.
l. 63 Please remove “similar”, as it is unnecessary
l. 66 Consider combining sentence “[…] costly, as training data […]”
l. 81 Is the training data limited or just costly/time consuming to generate?
l. 89 “platforms”
l. 90/95 “mil” or “M”; please remove “of”
l. 97 Please remove “The” before “Pl@ntNet”
l. 109 “Ideally, for species mapping applications […]”
l. 115-120 This part might fit better in the Methods section
l. 198 Please remove “Accordingly”
l. 235 “were afterward rasterized”
l. 240-241 What does “sampled with replacement” mean?
l. 317 Please replace “while” with “although”, or similar
l. 337-341 This might fit better in the Discussion section
l. 367 “varying”
l. 373 “partially relatively inaccurate” → This is a little vague. Maybe expand upon it a little.
l. 387-389 Please remove one instance of “plots with more species (two or four)”
l. 393 “higher value” than what?
l. 442 Maybe you can find a better phrasing than “diversity of human behaviour”
l. 457 “often costly”
l. 484 “large” instead of “excessive” (which means unreasonably much)
l. 485 “good transferability”
Figure 2: The text font is very small. It would also be better if the labels matched the ones used in the text: “Ortho_July” and “Ortho_September” instead of “Ortho 1” and “Ortho 2”.
Figure 4: The text font here is also very small.
Figure 6: The height of the transects seems to be different between plots (e.g. plot 29 and plot 33). If they are all the same (2 m), please show them with the same extents in the figure as well.

Citation: https://doi.org/10.5194/egusphere-2023-2576-RC1
- AC1: 'Response to the first reviewer's comment', Salim Soltani, 28 Mar 2024
  
  Salim Soltani, Remote Sensing Center for Earth System Research
  University of Leipzig, salim.soltani@uni-leipzig.de
  
  Ref. No.: egusphere-2023-2576, “From simple labels to semantic image segmentation: Leveraging citizen science plant photographs for tree species mapping in drone imagery”
  
  Dear reviewer,
  We would like to thank you for your constructive comments that allowed us to improve the quality of the manuscript and for the time that you spent commenting on the manuscript.
  We have addressed the first reviewer's comments. We hope that the revised manuscript addresses all the shortcomings of the earlier version.
  
  Kind regards,
  Salim Soltani
  (on behalf of the co-authors, Olga Ferlian, Nico Eisenhauer, Hannes Feilhauer, Teja Kattenborn)
  
  Citation: https://doi.org/10.5194/egusphere-2023-2576-AC1
RC2: 'Comment on egusphere-2023-2576', Anonymous Referee #2, 04 Apr 2024

I enjoyed reading the manuscript and its rigorous approach to image segmentation and have no comments beyond those of Reviewer #1.

Citation: https://doi.org/10.5194/egusphere-2023-2576-RC2
- AC2: 'Response to the second reviewer's comment', Salim Soltani, 05 Apr 2024
  
  Dear reviewer,
  We would like to thank you for your positive evaluation of the manuscript.
  We thoroughly addressed the constructive suggestions of reviewer 1.
  
  Kind regards,
  Salim Soltani
  (on behalf of the co-authors, Olga Ferlian, Nico Eisenhauer, Hannes Feilhauer, Teja Kattenborn)
  
  Citation: https://doi.org/10.5194/egusphere-2023-2576-AC2

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload

ED: Publish subject to minor revisions (review by editor) (08 Apr 2024) by Paul Stoy

AR by Salim Soltani on behalf of the Authors (11 Apr 2024): Author's response, Author's tracked changes, Manuscript

ED: Publish as is (22 Apr 2024) by Paul Stoy

AR by Salim Soltani on behalf of the Authors (01 May 2024) Manuscript

Short summary

In this research, we developed a novel method that uses citizen science data as alternative training data for computer vision models to map plant species in unoccupied aerial vehicle (UAV) images. We trained models on citizen science plant photographs and applied them to UAV images. We tested our approach on UAV images of a test site with 10 different tree species, yielding accurate results. This research shows the potential of citizen science data to advance our ability to monitor plant species.