Ideas and perspectives: enhancing the impact of the FLUXNET network of eddy covariance sites

Over the last 20 years, the FLUXNET network has provided unique measurements of CO2, energy and other greenhouse gas exchanges between ecosystems and the atmosphere, made with the eddy covariance technique. These data have been widely used in heterogeneous applications, and FLUXNET has become a reference source of information not only for ecological studies but also for modeling and remote sensing applications. The data are generally collected, processed and shared by regional networks or by single sites, which makes it difficult for users interested in multi-site analyses to access a coherent and standardized dataset. For this reason, FLUXNET collections have been released periodically over the last 15 years, every 5 to 10 years, with data standardized and shared under the same data use policy. However, the new tools available for data analysis and the need to constantly monitor the relations between ecosystem behavior and climate change require a reorganization of FLUXNET in order to increase data interoperability, reduce the delay in data sharing and facilitate data use, while keeping in mind the great effort made by the site teams to collect these unique data and respecting the different regional and national network organizations and data policies. Here a proposal for a new organization of FLUXNET is presented, with the aim of stimulating a discussion on the needed developments. In this new scheme, the regional and national networks become the pillars of the global initiative, organizing clusters and becoming responsible for the processing, preparation and distribution of datasets that users will be able to access in real time through a machine-to-machine tool, always obtaining the most up-to-date collection possible while keeping a high level of standardization and a common data policy. This will also increase the FAIRness (Findability, Accessibility, Interoperability and Reusability) of the FLUXNET data, ensuring a larger impact of the unique data produced as well as proper data management and traceability.

Since the first examples of year-long measurements (e.g. Black et al. 1996), the use of eddy covariance (EC) data has become more and more common, not only to study single ecosystems from an ecological and physiological point of view (e.g. Reichstein et al. 2007, Law et al. 2002, Mahecha et al. 2010, Luyssaert et al. 2007, Besnard et al. 2018) but also as ground observations in model development and validation and in remote sensing applications (e.g. Bonan et al. 2011, Friend et al. 2007, Williams et al. 2009, Balzarolo et al. 2014, Jung et al. 2020). The large range of possible applications and the wide interest in these measurements led first to the creation of regional and continental networks such as CarboEurope (Dolman et al. 2006) and AmeriFlux (Novick et al. 2018), followed on other continents by, for example, AsiaFlux, OzFlux, LBA and ChinaFlux (see Yamamoto et al., 2005, Beringer et al., 2016, Restrepo-Coupe et al., 2013, Yu et al., 2006), and then to the organization of the FLUXNET network-of-networks, to which all the regional networks contribute a variable number of sites and years of data. In the context of FLUXNET there have been different initiatives to facilitate discussion and cooperation across networks, with specific conferences and meetings (starting in 1995, see Baldocchi et al. 1996) and the preparation of FLUXNET synthesis data collections with the aim of making the data available to wider communities. The main FLUXNET collections were produced in 2001 (Marconi dataset, Falge et al. 2005), 2007 (LaThuile dataset) and 2016 (FLUXNET2015 dataset, Pastorello et al. 2017), including an ever larger number of site-years (97 in Marconi, 965 in LaThuile and more than 1500 in FLUXNET2015) and providing standardized data ready for a large range of heterogeneous applications.
These collections were needed because each regional network applies its own processing and formatting scheme (including different variable names and units), which prevents an easy use of data across sites on different continents. In recent years AmeriFlux and the European networks have worked toward a standardization that also highlighted the uncertainty introduced by the data processing (Pastorello et al. 2020), but this is still not sufficient to replace a global initiative. The preparation of a FLUXNET collection requires a large effort that involves data collection, data policy agreement, common data quality controls, feedback with the site owners for corrections, processing and finally preparation of the products and their distribution. This is why 6 and 9 years passed between one FLUXNET synthesis collection and the next.
The heterogeneity across regional networks is, however, difficult to avoid. These networks are in fact based on general goals and scientific aims that can differ and can require specific design and processing. For example, the NEON network was planned using a hierarchical system to represent different ecoregions (Schimel et al. 2007) and the sites are highly standardized in terms of setup. In ICOS (Integrated Carbon Observation System) the stations are also highly standardized, but the design is driven by the decisions and priorities of each country. In AmeriFlux, instead, participation is open and anyone can register their sites in the network, without an overall design or standardization of the tower setup but allowing diversity and bringing under the same network sites designed for specific and heterogeneous research projects. In addition, single sites can be linked to other national or regional initiatives that could impose specific ways to prepare and distribute the data collected. Finally, and often one of the most important aspects, there are different views, sensitivities and levels of readiness with respect to data sharing and data use policies, often linked to the need for visibility that ensures proper funding to sustain the activities.
These are key aspects, fully justified and difficult to change at the global level in the short or medium term, which therefore need to be considered in a reorganization of the FLUXNET network structure.

New needs and the role of FLUXNET
The need for ground observation data is increasing continuously, and there are new examples of modelling and synthesis applications that require (or would require) frequently updated direct measurements. One example of such activities is the FLUXCOM initiative (Jung et al. 2020), where satellite and spatialized meteorological data are used as input to a machine-learning (ML) ensemble to predict Net Ecosystem Exchange, Gross Primary Production, Ecosystem respiration and energy fluxes at continental and global scale. The ML algorithms need observations for their parameterization and the FLUXNET data have been successfully used in the training (e.g. Tramontana et al. 2016). Although the relations between drivers and fluxes can be "learned" by the ML algorithms also using past data, the availability of new stations, in particular when covering under-sampled areas (Papale et al. 2015), and of more recent years that represent the recent climate variability and management effects, would be important to improve the quality of the predictions and reduce their uncertainty. An annual production of these bottom-up, empirically upscaled estimates could for example be used as additional input to the Global Carbon Project (www.globalcarbonproject.org) annual report on the carbon balance of the globe (e.g. see Friedlingstein et al. 2019), where currently the FLUXNET data are in general not sufficiently used.
The same is valid for the remote sensing community, which needs ground validation data frequently and with high quality standards, as in the case of the Ground-Based Observations for Validation (GBOV) of Copernicus Global Land Products (https://land.copernicus.eu/global/gbov/home/) or the CEOS Land Product Validation (LPV) subgroup (https://lpvs.gsfc.nasa.gov/), which already cite FLUXNET as a potential source of data but currently cannot find a valid contribution because the data do not overlap in time with the most recent sensors (e.g. the Sentinel constellation).
If we want the FLUXNET data to be more widely used and integrated with other scientific disciplines, and to start new cross-disciplinary collaborations based on recent or even near-real-time data, we need to change the way in which the data are shared in order to make their use easier and suitable for new applications. In particular, we need to work to ensure fast updates of the collection and easy and direct machine-to-machine data access and data use capabilities, with a clear and easy-to-apply data use policy.
The characteristics that a dataset needs in order to be machine findable and readable, with clear rules for its use, have been described by the FAIR principles (Wilkinson et al., 2016), and a new scheme should move in this direction (e.g. Collins et al. 2018). In particular, following the FAIR principles, the FLUXNET data should be easy to find (Findable) through common metadata searchable by a tool; easy to access (Accessible), also through a machine-to-machine system and with a common and clear data use policy; processed in the same way and distributed in the same format in order to simplify merging and synthesis (Interoperable); and clearly identified and permanently referenced in order to allow multiple uses and reproducibility of the studies and results (Reusable). All this should be done while keeping the system robust and sustainable, and for this reason not dependent on the capabilities and resources of a single network or group (as it has been until now).
https://doi.org/10.5194/bg-2020-211 Preprint. Discussion started: 16 June 2020 © Author(s) 2020. CC BY 4.0 License.
The FLUXNET members would also benefit from a system able to process, standardize and distribute their data rapidly and in a clear and traceable way. The site teams would obtain a set of products as output of the centralized processing, which in some cases could be difficult and time- and resource-consuming to apply individually. In addition, and more important in my opinion, a FLUXNET network with these characteristics would provide new opportunities to the FLUXNET members for collaboration and joint activities, facilitating synthesis studies at continental and global scales. For example, the ICOS community promptly prepared and shared a collection of in situ measurements from 52 sites in Europe (www.icos-cp.eu) that are used to analyse the effect of the 2018 European drought on terrestrial ecosystems (e.g. Bastos et al. 2019). This fast data release, however, was possible only thanks to an extra effort for the data processing by ICOS (in addition to the effort by the site teams to collect and share the data), and it is difficult to imagine this as a standard way to proceed in future and globally. In fact, ICOS was created and funded as a Research Infrastructure designed to sustain an organized observation network with prompt data delivery, but this is not common across all the regional networks that compose FLUXNET.

A new FLUXNET organization
In order to answer the new needs and opportunities described above, a new FLUXNET organization is necessary, one that takes into consideration the complexity of the system and the peculiarities of all the participants. The solution should involve all the participating regional networks in order to increase robustness and sustainability and, at the same time, preserve the autonomy and internal flexibility they need to answer additional specific research questions, respect the organizational and political structures governing them and answer specific needs in terms of data processing, format and sharing.
For this reason, a new FLUXNET organization should be based on an agreement among the different regional networks. In the proposed scheme, the networks are grouped in Continental clusters that agree to share data following a common procedure when the participating networks and the single sites are ready, interested or available to share (Figure 1).
With this organization, the Continental clusters become the pillars of the FLUXNET system, coordinating the participation and data sharing in FLUXNET by different national and regional networks. In order to ensure the needed standardization in terms of processing, format, accessibility and data policy, the Continental clusters must agree to prepare and maintain a specific database structure (the "FLUXNET version" baskets in Figure 1) where a common and agreed data product (including all the needed metadata and versioning information) is loaded and made available.
The FLUXNET product creation requires that all the participating networks agree on its characteristics (for example minimal requirements about the variables, standard processing to apply, (meta)data format, common data policy, mechanism for data access etc.) and contribute to its development. However, we do not have to start from scratch: in recent years, for the preparation of the FLUXNET collections, standards have already been defined and implemented also at regional level (e.g. AmeriFlux, the European Database and ICOS already produce the same output). These include format, units, processing schemes and codes that are openly accessible, as in the case of the ONEFlux suite (Pastorello et al. 2019).
Clearly the methods, standards and needs evolve in time, and for this reason it is important to discuss and agree on a plan and strategy to coordinate the efforts and define the common set of rules to apply in the Continental clusters. FLUXNET has worked well as a bottom-up initiative, community driven and without rigid and formal governing bodies, allowing people to participate, propose and use the FLUXNET organization in a democratic way. To keep this spirit, a light coordination committee made up of representatives of the Regional networks and Continental clusters who work directly on data processing could serve as a tool for governing the process of defining the new standards to apply and the new products to introduce. It is also important to define a strategy to evaluate and decide on the implementation of changes or additions to the standards. In general, there is no reason to change established methods and formats without motivation, since this has an impact on the users, who have to adapt their tools (in particular users interested in continuous data use). For the processing, the requirements could be, as in the last FLUXNET releases, that the processing tools should be at least 1) published in peer-reviewed journals, 2) easily applicable to large and heterogeneous datasets, 3) implemented in open-source code and 4) different enough from what is already implemented to justify their addition to the processing flow.
The regional and national networks and the single sites that are part of a Continental cluster can continue to keep their specific databases and interfaces if needed (the Data portals in Figure 1) to distribute their data. This could be needed in case of different formats (e.g. when linked to other observation networks with different standards) or in case of different processing (e.g. additional variables calculated centrally from raw data, or products of regionally specific processing tools). It should be noted that standard processing has the advantage of making all the data more comparable, but at the same time it may fail in specific conditions or at specific sites; in that case an ad hoc processing is needed and the results could be shared through the network Data portals. Differences in the data policies applied to specific sites or specific portions of the database can also be handled through regional data portals, which can define licences different from the common one used in FLUXNET. Then, when a dataset becomes ready to be shared in the FLUXNET system, it is also processed following the agreed FLUXNET standard and loaded in the FLUXNET version basket.
The FLUXNET collection is then no longer a large dataset stored in one location but a set of sub-collections stored in the FLUXNET version baskets of the different Continental clusters and accessible by visiting all of them to get the latest version available. The access can be implemented through a common query system (the FLUXNET shuttle in Figure 1) that points automatically to the different FLUXNET version baskets and, using standardized metadata that include versioning information, gets the latest version of the Continental cluster collections to create an updated FLUXNET collection for the user. In this way, each single user could create at any time (on demand) a collection that is built using the most recent data provided by the FLUXNET network, allowing applications that require updated collections. At the same time, the system gives the possibility to promptly correct possible errors if needed and to continuously include new sites as soon as they are ready to share, making FLUXNET even more inclusive. In order to help the teams responsible for the sites plan their work, fixed "FLUXNET shuttle" runs can be scheduled for the main operational activities, e.g. before a FLUXCOM training or periodically when satellite product validation tasks are planned.
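The shuttle logic described above (visit every version basket, use the versioning metadata to keep the latest release of each site, and merge the results into one collection) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `SiteRecord` fields, the integer version numbers, the cluster names and the site identifiers are hypothetical, and the baskets are passed in as in-memory lists, whereas a real shuttle would query remote catalogues whose interface still has to be agreed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteRecord:
    """One site dataset as listed in a (hypothetical) basket catalogue."""
    site_id: str   # made-up identifier, e.g. "AA-001"
    version: int   # assumed to increase with every re-release
    cluster: str   # Continental cluster hosting the basket


def latest_per_site(records):
    """Keep only the highest version of each site within one basket."""
    newest = {}
    for rec in records:
        current = newest.get(rec.site_id)
        if current is None or rec.version > current.version:
            newest[rec.site_id] = rec
    return newest


def run_shuttle(baskets):
    """Visit every basket and merge the latest records into one collection.

    `baskets` maps a cluster name to the list of SiteRecord entries that
    its catalogue would return.
    """
    collection = {}
    for cluster in sorted(baskets):
        for site_id, rec in latest_per_site(baskets[cluster]).items():
            # In this sketch a site may be published by one cluster only.
            if site_id in collection:
                raise ValueError(f"{site_id} is published by two clusters")
            collection[site_id] = rec
    # Sort by site identifier so the assembled collection has a stable order.
    return [collection[site_id] for site_id in sorted(collection)]


if __name__ == "__main__":
    demo_baskets = {
        "cluster_a": [
            SiteRecord("AA-001", 1, "cluster_a"),
            SiteRecord("AA-001", 2, "cluster_a"),
        ],
        "cluster_b": [SiteRecord("BB-001", 1, "cluster_b")],
    }
    for rec in run_shuttle(demo_baskets):
        print(rec.site_id, "version", rec.version, "from", rec.cluster)
```

Rejecting a site that appears in two baskets is only one possible design choice; the clusters could equally agree on a precedence rule.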
Clearly, one of the prerequisites for the FLUXNET shuttle to work correctly and for users to be able to use the data is a common and clear data policy. The Continental clusters must agree on a common data licence that simplifies and promotes the use of the data. With the aim of having FLUXNET used and promoted by different communities, standard data licences should be considered because they are common across disciplines and for this reason well known. Currently most of the monitoring networks are moving to the Creative Commons CC BY 4.0 licence (https://creativecommons.org/licenses/by/4.0/), which ensures attribution and promotes data use. All this, however, must also take into account the need for recognition and the advantages for the scientists working at the sites, which are discussed below.

Advantages and risks of the proposed new organization
The proposed FLUXNET scheme would have a number of advantages. First, users would not have to wait for dataset releases every 5 or 10 years but could get the most up-to-date version of the shared data in real time. This would stimulate the use of the data by scientific communities that need recent measurements (e.g. for early detection of anomalies). The data would also increase their level of FAIRness, improving their Findability through the use of standard metadata across the Continental clusters, their Accessibility through a common open data policy and a single tool to retrieve all the data (the FLUXNET shuttle), and their Interoperability thanks to the standardization. With a system that creates a new (and potentially different) collection at every user's request, it is crucial to clearly identify the data included (and their versions), also to ensure reproducibility of the results. This is achievable through a specific persistent identifier (PID) that users should always report and that will improve the data Reusability in case of study reproduction and verification.
In terms of robustness, sustainability and flexibility, the proposed system would also substantially improve the current situation thanks to the overlap of data processing capacities and responsibilities among the Continental clusters. In fact, sharing the workload will stimulate collaboration across networks and promote interchangeability of roles, since each Continental cluster could process the data of another cluster if needed. This can be periodically tested through a verification system similar to a "round-robin test", where all the clusters process the same set of data with the standard procedure and the results are compared. In addition, the Continental clusters could also invest in common and shared computing resources where all the data are processed with the same codes. All this would be achieved while keeping the full flexibility of each single network to decide what to share in FLUXNET and when, and the possibility to distribute different formats and versions through their Data portals.
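To make the identification of an on-demand collection concrete, the sketch below derives a reproducible fingerprint from the list of sites and versions that a collection contains. It is only an illustration under stated assumptions: a true PID (for example a DOI or a Handle) is minted and resolved by a registry, which this code does not do, and the function name, the (site, version) pairs and the choice of SHA-256 are hypothetical. The point is simply that two users who retrieve exactly the same data obtain the same identifier, while any change of version yields a different one.

```python
import hashlib
import json


def collection_fingerprint(members):
    """Return a hex digest identifying a set of (site_id, version) pairs.

    `members` is any iterable of (site_id, version) tuples. The pairs are
    de-duplicated and sorted so that the order of retrieval does not matter.
    """
    canonical = json.dumps(sorted(set(members)), separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Made-up site identifiers and version numbers, for illustration only.
    first = collection_fingerprint([("AA-001", 2), ("BB-001", 1)])
    same = collection_fingerprint([("BB-001", 1), ("AA-001", 2)])
    updated = collection_fingerprint([("AA-001", 3), ("BB-001", 1)])
    print(first == same)      # same content, different order
    print(first == updated)   # one site re-released with a new version
```

In practice such a fingerprint could be stored in the metadata that a registry associates with the PID, so that the exact content of a collection can be retrieved again later.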
It is however crucial to analyse the concerns that a new FLUXNET organization like the one proposed could raise. In particular, there is the risk of losing control of the data (who accessed them, where they are used etc.), and this is directly linked to an important aspect: the visibility of the people. The large amount of work and investment made by the single stations and networks participating in FLUXNET must be fully recognized and should have an effect on the funding to continue the work and data provision and on the careers of the people involved. The contribution of data to FLUXNET is in most cases on a voluntary basis, so the proposed system would not force participation. It is however important to try to get as many people and networks as possible engaged, and the analysis of the benefits that data sharing can bring is the natural step toward taking a decision. Although this has been discussed in different frameworks (e.g. Papale et al. 2012) and studies have demonstrated that people sharing data get more recognition due to the collaborations established (Bond-Lamberty, 2018; Dai et al., 2018), it is out of scope here to enter into the details of the benefits and convenience of data sharing.
What a reorganized and truly international FLUXNET system can do is ensure full traceability of data access and data use, allowing each data owner to have an exact quantification of the use of the data shared. From a technical point of view, the compilation of a list of downloads per site is something that can be easily implemented using the FLUXNET shuttle and can provide important information about the use of the data. However, this is not enough: it would be important for all the papers that use these data to cite the datasets, so that the impact and usefulness of each single site can be quantified and recognized. This would require the help of the journals, which should request, during the review, that the DOI or PID of the dataset used be clearly cited, and this should not be affected by the limit on the number of citations that is often imposed. In this way it would be possible to evaluate and show the importance of the data collected and distributed by FLUXNET and which communities are using them.
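The per-site download list mentioned above can be sketched in a few lines of Python. The structure of the access log is an assumption made for illustration: each request is represented as a dictionary with a hypothetical `user` field and the list of `sites` it retrieved, which is not a format defined by FLUXNET.

```python
from collections import Counter


def downloads_per_site(access_log):
    """Count how many requests included each site.

    `access_log` is an iterable of dictionaries with a "sites" entry
    listing the site identifiers retrieved by one request. A site that
    appears twice in the same request is counted once.
    """
    counts = Counter()
    for request in access_log:
        counts.update(set(request["sites"]))
    return counts


if __name__ == "__main__":
    # Made-up users and site identifiers, for illustration only.
    demo_log = [
        {"user": "user_1", "sites": ["AA-001", "BB-001"]},
        {"user": "user_2", "sites": ["AA-001"]},
    ]
    for site_id, n_requests in sorted(downloads_per_site(demo_log).items()):
        print(site_id, n_requests)
```

Such counts quantify access only; as noted above, they do not replace the citation of the dataset in the papers that use it.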

Conclusions
A reorganization of FLUXNET along the lines presented here would lead to a number of benefits: 1) an increase in the robustness of the global network thanks to the sharing of workload and responsibilities, 2) a strengthening of the collaborations and links among networks and colleagues across the world and 3) an increase in visibility thanks to the continuously updated products. The solution is also scalable once implemented, giving the possibility to include new measurements (e.g. new GHGs like CH4 or N2O, see Knox et al. 2019, Nemitz et al. 2018) or new processing, also starting from raw data. In fact, new tools developed by a Continental cluster, already designed to be generally applicable, can be made available to all the others easily and without duplicating efforts. The proposed scheme would also move FLUXNET in the direction that was already defined 20 years ago, developing a collaborative, self-organized and bottom-up network, able to answer new requests thanks to the continuous updates. The evolution of the regional networks toward more organized and stable infrastructures, the large number of eddy covariance researchers who are now sharing data and collaborating in FLUXNET and the new spirit of collaboration among regional networks are solid bases for taking this step.
Competing interests: the author declares that they have no conflict of interest.
Acknowledgements: the author thanks all the colleagues and friends who shared with him ideas and comments on the development of FLUXNET and thanks the whole FLUXNET community for the very constructive and open spirit that helped to build such a nice bottom-up coalition.
Financial support: the author thanks the support of the RINGO (Grant Agreement 730944) and ENVRIFAIR (Grant Agreement 824068) H2020 European projects for the development of a new and more integrated scheme of FLUXNET and the e-shape (Grant Agreement 820852) H2020 European project to support a first case study on the operational use of FLUXNET data.