Exploring Constraints on a Wetland Methane Emission Ensemble (WetCHARTs) using GOSAT Satellite Observations

Wetland emissions contribute the largest uncertainties to the current global atmospheric CH4 budget, and how these emissions will change under future climate scenarios is still poorly understood. Bloom et al. (2017b) developed WetCHARTs, a simple, data-driven, ensemble-based model that produces estimates of CH4 wetland emissions constrained by observations of precipitation and temperature. This study performs the first detailed global and regional evaluation of the WetCHARTs CH4 emission model ensemble against 9 years of high-quality, validated atmospheric CH4 observations from the GOSAT satellite. A 3-D chemical transport model is used to estimate atmospheric CH4 mixing ratios based on the WetCHARTs emissions and other sources. Across all years and all ensemble members, WetCHARTs typically underestimates the observed global seasonal cycle amplitude by 7.4 ppb, but the correlation coefficient of 0.83 shows that the seasonality is well reproduced at a global scale. The Southern Hemisphere has less of a bias (-1.9 ppb) than the Northern Hemisphere (-9.3 ppb), and our findings show that it is typically in the North Tropics where this bias is worst (-11.9 ppb). We find that WetCHARTs generally performs well in reproducing the observed wetland CH4 seasonal cycle for the majority of wetland regions although, for some regions, WetCHARTs does not reproduce the observed seasonal cycle well regardless of the ensemble configuration. To investigate this, we performed a detailed analysis of some of the more challenging exemplar regions (Paraná River, Congo, Sudd and Yucatán). Our results show that certain ensemble members are more suited to specific regions, either due to deficiencies in the underlying data driving the model or complexities in representing the processes involved. In particular, incorrect definition of the wetland extent is found to be the most common reason for the discrepancy between the modelled and observed CH4 concentrations. The remaining driving data (i.e. heterotrophic respiration and temperature) are also shown to contribute to the mismatch to observations, with the details differing on a region-by-region basis but generally showing that some degree of temperature dependency is better than none.


The WetCHARTs Ensemble
WetCHARTs (Bloom et al., 2017b) is a wetland CH4 emission dataset derived from satellite-based surface inundation extent and precipitation reanalyses, model-based heterotrophic respiration and a range of temperature dependencies. WetCHARTs has been used in a range of studies (including Parker et al. (2018); Treat et al. (2018); Sheng et al. (2018); Lunt et al. (2019); Maasakkers et al. (2019)), typically as the wetland CH4 a priori in global/regional flux inversion experiments. This study uses v1.2.1 of WetCHARTs, which extends the ensemble to 2017. In addition to the extension in time, the wetland extents for the Lehner and Döll (2004) wetland complex classes 0-25%, 25-50% and 50-100% were scaled by 12.5%, 37.5% and 75%, respectively. Fundamentally, WetCHARTs works by calculating spatially (x) and temporally (t) resolved CH4 fluxes, F(t, x), at a 0.5° × 0.5° resolution globally using the following equation:

$$F(t, x) = s \, A(t, x) \, R(t, x) \, Q_{10}^{T(t, x)/10} \qquad (1)$$

where A(t, x) is the wetland extent fraction, itself given by A(t, x) = w(x)h(t, x), with w(x) being the static wetland extent fraction and h(t, x) being the temporal variability. R(t, x) is the heterotrophic carbon respiration per unit area. The term $Q_{10}^{T(t, x)/10}$ represents the temperature dependence of the CH4:C ratio, with Q10 being the relative CH4:C respiration for a 10 °C increase and T(t, x) being the surface skin temperature. s is a global scale factor. Many other studies and wetland emission models utilise some form of this equation to estimate methane wetland fluxes (Gedney et al., 2004; Eliseev et al., 2008; Clark et al., 2011; Xu et al., 2016; Poulter et al., 2017; Comyn-Platt et al., 2018). Figure 1 shows the configurations of the WetCHARTs ensemble members used in this study and also provides guidance on the 4-digit identification which will be used to describe the individual ensemble members from here on.
The global scale factor (s in Equation 1) is a model-specific scaling factor (Bloom et al., 2017b), derived such that model annual emissions amount to either 124.5, 166 or 207.5 Tg CH4 yr⁻¹, which represent the mean 2000-2009 wetland emission estimates from Saunois et al. (2016) along with a ±25% uncertainty. The full ensemble (FE) utilises nine heterotrophic respiration models for 2010 (Huntzinger et al., 2013), but the extended ensemble (EE) used here (2009-2017) is limited to the CARDAMOM data-constrained terrestrial carbon cycle analysis (Bloom et al., 2016) to calculate values for R. ERA-Interim skin temperature is used as the underlying temperature driving data and the temperature dependence spans three values for Q10, ranging from 1 (i.e. no temperature dependence) to 3 (high temperature dependence). Finally, the wetland extent parametrisation (A) for the extended ensemble uses normalised monthly mean ERA-Interim precipitation (Dee et al., 2011) with a static wetland map taken from either the Global Lake and Wetland Database (GLWD; Lehner and Döll, 2004) or GlobCover (Bontemps et al., 2011) to give spatially and temporally varying wetland extent.
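As a minimal sketch of how the flux equation described above could be evaluated, the following Python snippet multiplies the scale factor, extent fraction, respiration and Q10 temperature term element-wise. The function names, the toy inputs and the assumption that T is expressed in degrees Celsius are illustrative choices rather than part of WetCHARTs; `global_scale_factor` is a hypothetical helper showing one simple way a target annual total could fix s.

```python
import numpy as np


def wetland_ch4_flux(s, A, R, T, q10):
    """Evaluate F(t, x) = s * A * R * Q10 ** (T / 10) element-wise.

    A: wetland extent fraction, R: heterotrophic respiration per unit area,
    T: skin temperature (assumed here to be in degrees Celsius),
    q10: temperature-dependence parameter, s: global scale factor.
    """
    A, R, T = (np.asarray(v, dtype=float) for v in (A, R, T))
    return s * A * R * q10 ** (T / 10.0)


def global_scale_factor(unscaled_annual_total, target_annual_total):
    """Hypothetical helper: factor that rescales an unscaled global annual
    total to a target total (e.g. 124.5, 166 or 207.5 Tg CH4 per year)."""
    return target_annual_total / unscaled_annual_total


# Toy inputs for two grid cells (illustrative numbers, not WetCHARTs data).
A = np.array([0.1, 0.4])
R = np.array([2.0, 2.0])
T = np.array([20.0, 30.0])

flux_q1 = wetland_ch4_flux(1.0, A, R, T, q10=1.0)  # no temperature dependence
flux_q2 = wetland_ch4_flux(1.0, A, R, T, q10=2.0)  # factor of 2 per 10 degrees
```

With Q10 = 1 the temperature term is identically one, so the flux follows extent and respiration alone; larger Q10 values weight warm grid cells and months more heavily.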
In total, for the extended ensemble covering 2009-2017, the above results in 18 ensemble members (3×s, 1×R, 3×Q10, 2×A). For more extensive details on the ensemble members see Bloom et al. (2017b).
Throughout this study we refer to each ensemble member by a 4-digit code (Figure 1) and highlight these in bold italics in the text. For example, the ensemble member with a global scale factor of 166 Tg CH4 yr⁻¹, using CARDAMOM for its heterotrophic respiration, with a temperature dependence of Q10 = 3 and using precipitation with the Global Lake and Wetland Database to define the extent would be identified as 2933. When referring to the parameter but not a specific configuration, the xxCx nomenclature is used (e.g. temperature dependency), and when referring to a specific value for a set of configurations the xx1x nomenclature is used (e.g. all ensemble members with a Q10 = 1 temperature dependency).
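The 4-digit naming scheme can be sketched in a few lines of Python. The digit-to-value tables below are assumptions inferred from the examples in the text (2933 for 166 Tg CH4 yr⁻¹, CARDAMOM, Q10 = 3 and GLWD; codes such as 2924 for the other extent map), with the lowest and highest scale factors assigned to digits 1 and 3 by ordering; Figure 1 remains the authoritative key.

```python
from itertools import product

# Assumed digit tables (see Figure 1 of the paper for the definitive mapping).
SCALE = {1: 124.5, 2: 166.0, 3: 207.5}  # global total, Tg CH4 per year
RESP = {9: "CARDAMOM"}                  # heterotrophic respiration model
Q10 = {1: 1.0, 2: 2.0, 3: 3.0}          # temperature dependence
EXTENT = {3: "GLWD", 4: "GlobCover"}    # static wetland map


def decode(code):
    """Translate a 4-digit member code such as '2933' into its settings."""
    s, r, q, a = (int(c) for c in str(code))
    return {
        "scale_Tg_per_yr": SCALE[s],
        "respiration": RESP[r],
        "q10": Q10[q],
        "extent": EXTENT[a],
    }


def all_members():
    """Enumerate every code in the extended ensemble (3 x 1 x 3 x 2)."""
    return [f"{s}{r}{q}{a}" for s, r, q, a in product(SCALE, RESP, Q10, EXTENT)]


def matches(code, pattern):
    """True if `code` fits a pattern such as 'xx1x' ('x' is a wildcard)."""
    return all(p == "x" or p == c for c, p in zip(code, pattern))


members = all_members()
no_temperature_dependence = [m for m in members if matches(m, "xx1x")]
```

Selecting by pattern in this way mirrors the xx1x shorthand used in the text for subsets of the ensemble.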

TOMCAT Model Simulations
In order to compare the CH4 emissions from WetCHARTs with atmospheric CH4 observations, we process the emissions through a global 3-D atmospheric chemistry transport model, TOMCAT (Chipperfield, 2006). Throughout this study, when we refer to CH4 concentrations from WetCHARTs, we are referring to the output from TOMCAT simulations using the WetCHARTs wetland emissions. These TOMCAT simulations are performed globally at 1.125° horizontal resolution between 2009 and 2017 using ERA-Interim meteorology to force the model (Dee et al., 2011). The WetCHARTs ensemble is used as the surface wetland emissions. Each ensemble member from WetCHARTs, as described in Section 2, is used to simulate a separate CH4 tracer, along with a reference CH4 tracer that includes all other CH4 emission sectors apart from wetland emissions, resulting in 19 model tracer simulations. The non-wetland CH4 fluxes are kept consistent between all simulations, using EDGAR (v4.2) for anthropogenic emissions and GFED (v4.1s) for biomass burning. The EDGARv4.2 database runs up to 2012, and we repeat the 2012 emissions for the remaining years, with no seasonal cycle applied. As we focus primarily on wetland emission areas, the local seasonal cycle due to anthropogenic fluxes is likely very small compared to these natural sources. We do, however, note the possibility that this assumption could be a source of uncertainty. We use the GFED emissions for the corresponding year up to and including 2016, and a climatology for 2017 and 2018.
Prescribed annually-repeating values taken from Yan et al. (2009) are used for rice paddy emissions, with the remaining emissions (oceans, termites) used as described in Patra et al. (2011). The atmospheric sink is included via annually-repeating atmospheric OH and O(¹D) fields and the methanotrophic soil sink is included as in McNorton et al. (2016).

GOSAT Proxy XCH4 Data
This study uses satellite observations of total column dry air mole fractions of CH4 (XCH4) generated by the University of Leicester Proxy XCH4 GOSAT retrieval (Parker et al., 2011, 2020) as part of the ESA Climate Change Initiative (Buchwitz et al., 2017) and the Copernicus Climate Change Service (Buchwitz et al., 2018).
The GOSAT XCH4 data has been extensively validated (Dils et al., 2014; Parker et al., 2015; Buchwitz et al., 2017; Parker et al., 2020), primarily using data from the Total Carbon Column Observing Network (TCCON). TCCON is a global network of ground-based, high-resolution Fourier transform spectrometers recording direct solar spectra in the near-infrared spectral region. The TCCON data are tied to World Meteorological Organization (WMO) standards (Wunch et al., 2010) and are the primary validation data for satellite observations of greenhouse gases (Cogan et al., 2012; Wunch et al., 2011; Dils et al., 2014). After performing extensive validation against TCCON, we subtract one global offset from the GOSAT data. This value is typically small. For v7.2 of the data (as used in this study), the value was 7.71 ppb. For our latest data, v9.0, the value is 9.06 ppb (see Parker et al. (2020), under review).
This version of the GOSAT Proxy XCH4 data (v7.2) agrees well with TCCON data, with an overall bias of 0.32 ppb, a standard deviation of 13.64 ppb and a correlation coefficient of 0.91 between the 73,304 coincident GOSAT-TCCON measurements. GOSAT XCH4 has additionally been validated against aircraft measurements over the Amazon (Webb et al., 2016).
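The three validation statistics quoted above (bias, standard deviation and correlation coefficient) can be computed from paired satellite and reference values as in the sketch below. It assumes that the standard deviation is taken over the satellite-minus-reference differences; the function name and the four toy pairs are illustrative and are not GOSAT or TCCON data.

```python
import numpy as np


def validation_stats(satellite, reference):
    """Summary statistics for coincident satellite/reference pairs (ppb)."""
    sat = np.asarray(satellite, dtype=float)
    ref = np.asarray(reference, dtype=float)
    diff = sat - ref
    return {
        "bias": diff.mean(),                 # mean difference
        "std": diff.std(ddof=1),             # scatter of the differences
        "r": np.corrcoef(sat, ref)[0, 1],    # Pearson correlation
        "n": diff.size,                      # number of coincident pairs
    }


# Toy example with four coincident pairs (illustrative values only).
reference = np.array([1800.0, 1810.0, 1820.0, 1830.0])
satellite = reference + np.array([1.0, -1.0, 1.0, -1.0])
stats = validation_stats(satellite, reference)
```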
This data has been heavily utilised by the atmospheric inversion community and many studies (Fraser et al., 2013; Cressot et al., 2014; Turner et al., 2016; Alexe et al., 2015; Berchet et al., 2015; Feng et al., 2017; Ganesan et al., 2017; Sheng et al., 2018; Maasakkers et al., 2019; Saunois et al., 2020; Lunt et al., 2019) have used this data to successfully infer regional and global emissions of CH4.

Global Evaluation of WetCHARTs
To assess how representative the WetCHARTs emission ensemble is of the observed wetland CH4 seasonal cycle, we calculate the smoothed detrended seasonal cycle by applying the NOAA CurveFit (Thoning et al., 1989) to the XCH4 data for the GOSAT observations and for each ensemble member. We also apply this routine to the TOMCAT model simulation that has no wetland CH4 emissions included. The seasonal cycle from this "no wetland" simulation is subtracted from all other seasonal cycles, resulting in data representing only the wetland component of the seasonal cycle. This makes the assumption that wetlands dominate the uncertainty in interannual variability of the CH4 emissions and that the remaining CH4 sources are, in comparison, far less uncertain.
It should be noted that there is the potential for our assumptions regarding biomass burning emissions to interfere with our derived wetland seasonal cycle. However, we have carried out full global inversions of CH4 flux using this GOSAT data product and this chemical transport model (paper in prep), which suggest that this is not a significant issue. Whilst it is difficult to separate out the fire emissions from other emissions sectors using this methodology, our findings suggest that fluxes in burning regions in South America and Africa during the burning season generally change relatively little from the prior, compared to nearby wetland regions. However, it is true that in some extreme years (e.g. the 2010 drought in S. America), there are more significant changes to the GFED prior derived by the inversion. Although wetland and burning regions are often spatially distinct, this could affect some of our results to a small extent.

Figure 2. Global correlation coefficient between the wetland CH4 seasonal cycle derived from the GOSAT Proxy XCH4 and each WetCHARTs ensemble member (left-most column) and also between individual ensemble members.

We first perform this analysis globally and in later sections separately for data within each region of interest. Figure 2 shows the correlation coefficients between the wetland CH4 seasonal cycles when calculated globally. The left column shows the correlation against the GOSAT-derived seasonal cycle whilst the remaining columns show the model-model correlations for different pairs of ensemble members. It should be noted that the high correlation between certain groups of ensemble members is expected (e.g. for members 1913, 2913, 3913 where the only configuration difference is the global scaling factor).
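The wetland seasonal cycle extraction described in this section can be sketched as follows. This is a deliberately simplified stand-in for the NOAA CurveFit routine: it fits a polynomial trend plus annual harmonics by ordinary least squares and omits any filtering of the residuals, and the number of polynomial and harmonic terms is an assumption made for illustration. The synthetic time series are not GOSAT or TOMCAT output.

```python
import numpy as np


def seasonal_cycle(t_years, values, n_poly=3, n_harm=4):
    """Detrended seasonal cycle from a polynomial-plus-harmonics fit."""
    t = np.asarray(t_years, dtype=float)
    y = np.asarray(values, dtype=float)
    t0 = t - t.mean()
    cols = [t0 ** k for k in range(n_poly)]           # trend terms
    for k in range(1, n_harm + 1):                    # annual harmonics
        cols += [np.sin(2 * np.pi * k * t), np.cos(2 * np.pi * k * t)]
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X[:, n_poly:] @ coef[n_poly:]              # harmonic part only


def wetland_seasonal_cycle(t_years, xch4, xch4_no_wetland):
    """Seasonal cycle minus that of the 'no wetland' simulation."""
    return seasonal_cycle(t_years, xch4) - seasonal_cycle(t_years, xch4_no_wetland)


# Synthetic monthly series for 2009-2017 (illustrative only).
t = 2009.0 + (np.arange(108) + 0.5) / 12.0
trend = 6.0 * (t - t.mean())
no_wetland = 1790.0 + trend + 3.0 * np.sin(2 * np.pi * t)
observed = 1800.0 + trend + 10.0 * np.sin(2 * np.pi * t)
modelled = 1795.0 + trend + 3.0 * np.sin(2 * np.pi * t) + 5.0 * np.sin(2 * np.pi * t - 0.3)

wet_obs = wetland_seasonal_cycle(t, observed, no_wetland)
wet_mod = wetland_seasonal_cycle(t, modelled, no_wetland)
r = np.corrcoef(wet_obs, wet_mod)[0, 1]               # phase agreement
amplitude_bias = np.ptp(wet_mod) - np.ptp(wet_obs)    # negative: underestimate
```

In this sketch the correlation coefficient captures agreement in phase, while the difference in peak-to-peak range captures the amplitude mismatch; the two metrics are deliberately kept separate.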

The correlation against observations highlights that, when considered at a global scale, the temperature dependence is clearly important. The ensemble members where there is no temperature dependence (i.e. Q10 = 1, ensemble members xx1x) all correlate far more poorly with observations than the same configuration including temperature dependency (e.g. r = 0.69 for 2913 vs r = 0.88 for 2923). While it is clear that some temperature dependence is important, it is not clear from analysis at a global scale what degree of temperature dependence gives the best agreement with observations. Instead, regional analyses as performed in Section 6 are required.
The other feature of note in Figure 2 is the variability in the inter-ensemble correlation coefficients. For example, correlations between pairs of WetCHARTs-based TOMCAT simulations can be as low as 0.63 (3934 vs 1913). In fact, the correlation coefficient is extremely poor (r = 0.63-0.65) for all simulation pairs where the temperature dependency is at the opposite extreme and the alternate wetland extent dataset is used. This further reinforces the need to continue this analysis at a regional scale to understand the driving factors for these differences over a range of wetland ecosystems.

Regional Evaluation of WetCHARTs
Before performing a detailed regional evaluation, it is useful to consider how the varying WetCHARTs ensemble parameters are responsible for the variation in the subsequent WetCHARTs emissions. For this purpose we perform a variance analysis by calculating the partial correlation (Vallat, 2018) between timeseries for each parameter and the resulting WetCHARTs emissions. To clarify, this analysis is purely an assessment of the WetCHARTs CH4 emissions against the driving data used to generate those emissions. It is not intended to be interpreted as a general statement about the importance of these parameters in explaining the variance in the real world. This assessment is useful because, if a certain driver dominates the response in WetCHARTs emissions and we subsequently observe discrepancies to the CH4 measurements, it indicates that further evaluation of that driving data may be useful. It should be noted, however, that for the extended time period examined here we only have one heterotrophic respiration model available; the contribution of heterotrophic respiration uncertainty within the WetCHARTs Full Ensemble is considerable due to model disparities in mean emission rates and the corresponding seasonal cycles (see Figure 6 in Bloom et al. (2017b)). Ultimately, further expansion and exploration of the heterotrophic respiration model ensemble may prove useful for robustly representing the terrestrial C cycling uncertainty.

Figure 3. [...] emissions and the seasonal cycles in the wetland extent fraction, temperature dependency and heterotrophic respiration for each region that were used to derive the WetCHARTs emissions. These values are averaged across all ensemble members to give an indicative value for the sensitivity of each region to the different driving parameters. Note that due to potential cross-correlations, it is not necessarily expected that the percentages sum to 100%.
We average the result across all ensemble members to derive an indication of which drivers (extent, temperature and respiration) are important in which regions (Figure 3). We also use this figure as an opportunity to identify the geographic extent of the different wetland regions used throughout this study.
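A partial correlation between one driver and the emissions, controlling for other variables, can be sketched with plain NumPy by correlating the residuals left after regressing both series on the covariates. The paper cites Vallat (2018) for its calculation; the hand-rolled function below is only an illustrative equivalent of the general technique, and the choice of which drivers to treat as covariates is an assumption.

```python
import numpy as np


def _residuals(y, covariates):
    """Residuals of y after least-squares regression on the covariates."""
    X = np.column_stack([np.ones(len(y))] + [np.asarray(c, float) for c in covariates])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ coef


def partial_correlation(x, y, covariates):
    """Correlation between x and y with the covariates regressed out."""
    rx = _residuals(np.asarray(x, dtype=float), covariates)
    ry = _residuals(np.asarray(y, dtype=float), covariates)
    return np.corrcoef(rx, ry)[0, 1]


# Synthetic check: y depends linearly on x and z, so once z is removed
# the remaining relationship between x and y is exact.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
z = rng.normal(size=200)
y = 2.0 * x + 3.0 * z
r_partial = partial_correlation(x, y, [z])
r_squared_percent = 100.0 * r_partial ** 2  # variance explained, in percent
```

Because each driver is assessed with the others held fixed, and the drivers are themselves correlated, the resulting percentages for different drivers need not add up to 100%.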

When considered globally, no one parameter is found to be the primary driver of the variation in the WetCHARTs emissions, with the R² value ranging from 29% to 37%. For the Southern Hemisphere, however, the wetland extent fraction itself is found to explain 83% of the variation in the emissions, with the temperature dependency and respiration both explaining approximately a third on their own.
When performing this analysis in smaller regions, individual behaviours become more apparent. For example, variations of heterotrophic respiration are found to be much more important in some regions such as the West Amazon, where they can explain 45% of the variations in the WetCHARTs emissions, whereas for the Sudd region they only explain 6% of the variation in the emissions. Likewise, the temperature dependency explains 63% of the variance in China and over 40% in many regions (West Amazon, Pantanal, Paraná, Sudd and Southern Africa) but only a few percent in other regions (Yucatán, East Amazon, Indonesia, Papua, S.E. Australia).

The wetland extent is found to be the dominant explanation for the variance in all regions, often by a large margin, explaining > 95% of the variance in the Yucatán, West Amazon, East Amazon, Pantanal, Congo, Indo-Gangetic, S.E. Asia and N. Australia regions. Even for the regions where the wetland extent explains the least variance, such as East US (72%), Sudd (64%), China (79%) and S.E. Australia (75%), it still explains more than the other parameters.

The capability of the WetCHARTs emissions to successfully represent the observed wetland CH4 seasonal cycle amplitude is a vital component in the assessment of the utility of the emissions. In addition to assessing the seasonal amplitude, we assess the magnitude and phase of the emissions using the correlation coefficient between the simulated and observed seasonal cycles.

There is a clear hemispheric distinction in the seasonal cycle amplitude difference, with the underestimation being far more pronounced in the Northern Hemisphere (9.3 ppb) than in the Southern Hemisphere (1.9 ppb). This is further emphasised by considering the tropical regions only: the seasonal cycle amplitude is underestimated by 11.9 ppb in the North Tropics, compared to just 2.4 ppb in the South Tropics.
When considering comparisons on a regional scale, it is possible to characterise the different regions into groups that exhibit similar behaviour.
For some regions the seasonal cycle amplitude is always significantly underestimated for all years and for all ensemble members [...] Finally, the underestimation of the seasonal cycle amplitude can be similar to those above (-11.2 ppb) but with a very poor correlation to observations (r = 0.20), as observed for the Sudd region.
In order to understand the effect that the different WetCHARTs parametrisations have on the different regional emissions, the correlation coefficient and standard deviation between the simulated and observed seasonal cycle is calculated for each region for each ensemble member. The full table of correlation coefficients per region per ensemble member is presented in Appendix A (Figure A1). To highlight and isolate the individual effects of adjusting the 3 driving parameters (global scale factor, temperature dependency and wetland extent map), we plot the correlation coefficient for each configuration of the ensemble and link together data points where the only change between the ensemble members is the change in the specified parameter. This is demonstrated for the Tropics region in Figure 5. In this figure we see that the correlation coefficient is largely unaffected by a change in the global scale factor, when controlling for the other parameters. In contrast, a temperature dependency of Q10 = 2 clearly leads to a higher correlation coefficient for the Tropics, when controlling for the other parameters, with Q10 = 3 leading to the worst correlation coefficient in all cases. Similarly, it is evident that the GLWD wetland extent map performs significantly better than the GlobCover map, significantly increasing the correlation coefficient in all cases.
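The idea of linking ensemble members that differ in only one parameter can be sketched as a simple grouping on the 4-digit codes. In the snippet below, members sharing all digits except the one under study fall into the same group, and each member's value is expressed as a gain over the lowest value in its group, which is one way of putting the groups on a common footing. The correlation values are made-up placeholders, not results from the study, and the function name is hypothetical.

```python
from collections import defaultdict


def isolate_parameter(metric_by_member, position):
    """Group 4-digit codes that differ only at `position` (0-based) and
    return each member's gain over the lowest value in its group."""
    groups = defaultdict(dict)
    for code, value in metric_by_member.items():
        key = code[:position] + "x" + code[position + 1:]
        groups[key][code[position]] = value
    return {
        key: {digit: value - min(values.values()) for digit, value in values.items()}
        for key, values in groups.items()
    }


# Placeholder correlation coefficients for six members (illustrative only).
toy_correlations = {
    "2913": 0.60, "2923": 0.90, "2933": 0.80,
    "2914": 0.50, "2924": 0.70, "2934": 0.65,
}

# Position 2 is the temperature-dependence digit in the 4-digit code.
q10_effect = isolate_parameter(toy_correlations, position=2)
# Position 3 is the wetland-extent digit.
extent_effect = isolate_parameter(toy_correlations, position=3)
```

The same grouping applies unchanged to any per-member metric, such as the standard deviation between the modelled and observed seasonal cycles.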
To emphasise the effect related to the relative change of each parameter when considering the behaviour across multiple regions, we subtract as a baseline the lowest correlation coefficient from each set of lines. This change in correlation coefficient therefore gives an indicator of the improvement obtained by the change in each parameter, while keeping the rest of the configuration for that ensemble member the same. Figure 6 shows the distribution of this change in correlation coefficient across all regions. These results illustrate that the choice of global scale factor makes little difference to the correlation between the observed and modelled wetland CH4 seasonal cycle. In contrast, the choice of Q10 value is found to make a substantial difference to the correlation coefficient. Setting Q10 = 1 typically produces the worst correlation coefficient (as indicated by a median value of 0 improvement), with Q10 = 2 typically leading to the largest improvement in correlation coefficient (with a [...] yr⁻¹ is much more consistent across regions with a smaller spread (75th-25th range of 0.42 ppb vs 0.82 ppb) but on average produces a larger standard deviation than the lower scale factor (median value of 0.31 ppb). For temperature dependency, the picture is clearer, with Q10 = 2 producing the smallest median standard deviation (0.016 ppb) and the smallest spread (75th-25th range of 0.13 ppb). Finally, the GLWD wetland map is found to perform emphatically better than the GlobCover map for nearly every region, with GlobCover worsening the standard deviation by 0.34 ppb on average and by over 4 ppb in some cases. When considering the two metrics (correlation coefficient and standard deviation) in unison, a consistent conclusion can be drawn that the temperature dependency can cause large changes in the agreement between model and observation, with a Q10 = 2 value typically performing better on average but Q10 = 3 performing better for some regions.
Both results suggest that some temperature dependency is necessary, i.e. that Q10 = 1 performs worse than the alternatives. Likewise, the GLWD wetland extent map is consistently found to perform better than the GlobCover map. Whilst the global scale factor is found to have little influence on the correlation coefficient, it does influence the standard deviation between the modelled and observed wetland CH4 seasonal cycle.

This summary of the regional analysis points towards complex interactions and highlights the difficulty in using a simple parameterisation to represent many complex inter-related processes. A more detailed analysis of exemplar regions as case studies is valuable in explaining the above behaviours in more detail.

This case highlights one limitation of WetCHARTs, namely that there is no underlying hydrological model to account for river flow or inundation; instead, local precipitation determines the wetland extent variability. As such, WetCHARTs may not capture the behaviour where wetland extent is determined by behaviour upstream. This, however, allows WetCHARTs to act as a baseline against which to assess such behaviour in land surface models and determine if they are out-performing the simpler WetCHARTs precipitation-driven approach.
Figure 10. The standard deviation between the modelled and observed wetland CH4 seasonal cycle for each of the three parameters that vary within the ensemble (global scale factor, temperature dependency and wetland map). Data points are joined together where the other two parameters are kept constant and the only change is due to the specified parameter. This allows the influence of the change in each individual parameter to be assessed.

Case Study 2: The Congo
The Congo is perhaps one of the most important wetland regions to characterise well, as it has the potential to dominate African methane wetland emissions but is still relatively poorly understood (Lee et al., 2011; Melton et al., 2013; Zhang et al., 2017a; Becker et al., 2018; Lunt et al., 2019). [...] increasing from a median of 4.91 ppb to 7.68 ppb) are increased. This is indicative of there already being too much CH4 from WetCHARTs in this region, so that any parameterisation that enhances it further (either by scaling or adding a temperature dependency) exacerbates the discrepancy. This points to the wetland extent fraction being too large, and the wetland extent mask clearly plays a significant role, as the standard deviation differs substantially depending on which extent mask is used, ranging from 4.22-6.70 ppb for the GLWD-based ensemble members but 5.08-10.77 ppb for the GlobCover-based simulations. The different wetland extent masks, however, do not affect the correlation coefficient between the simulated and observed seasonal cycles (Fig. A1), suggesting that both wetland masks are equally capable of parameterising the observed seasonal cycle but differ in the magnitude of the resulting emissions. Figure 12 compares both wetland extent datasets against the JRC Surface Water Occurrence and Maximum Extent datasets (Pekel et al., 2016), confirming that the wetland extent used in WetCHARTs is significantly higher than suggested by the JRC data.
Although the spatial resolution of GOSAT is relatively coarse (∼250 km), Figure 11 shows that it is possible to identify the spatial signature of the wetland signal in both the GOSAT observations and the model simulations (sampled at the GOSAT sounding locations). The GOSAT wetland signal (i.e. the difference to the simulation without any wetland emissions) is relatively weak, with a maximum (95th percentile) value of 11.9 ppb.
We compare this to WetCHARTs ensemble members 2923 and 2924 (chosen to be illustrative of the wider ensemble with the medium global scaling factor and medium temperature dependency). The maximum wetland signal from the WetCHARTs 2923 and 2924 ensemble members is much stronger than that derived from GOSAT, with values of 20.5 and 23.6 ppb respectively. As well as having a much smaller maximum signal, the spatial standard deviation of the wetland signal for GOSAT is found to be 6.8 ppb, much smaller than the standard deviations from WetCHARTs of 11.3 ppb (2923) and 15.1 ppb (2924).
This demonstrates that while the spatial signature of the Congo wetland emissions generated by WetCHARTs is reasonably consistent with that from observations, the magnitude and variability of the emissions are far higher than those we observe from GOSAT. This remains the case for the smallest global scale factor and for no temperature dependency, leaving only the wetland fraction or heterotrophic carbon respiration per unit area as tuning parameters to bring the emissions closer to observations. Various published atmospheric inversions of our CH4 data that have used WetCHARTs as the prior all indicate that the Congo emissions are over-estimated by WetCHARTs and are reduced when confronted with observations. This highlights the large uncertainty over this region. Once the necessary MsTMIP (or similar) model data becomes available and it is possible to extend the WetCHARTs Full Ensemble (i.e. all respiration models) to this time period, we will revisit this question in a future study.
This case highlights that future WetCHARTs development would benefit from further exploration of the characterisation of, and sensitivity to, the heterotrophic respiration, with the extended ensemble currently only featuring a single member (CARDAMOM).

Case Study 3: Sudd
In this section we examine the Sudd wetland region. The Sudd wetlands, located in South Sudan, are one of the largest tropical wetlands in the world and are the largest wetland ecosystem in the Nile basin. They are fed via the White Nile, originating at Lake Victoria to the south, with flow through the region ultimately leading into the Nile to the north. These wetlands therefore play a major role in regional hydrology, and understanding their behaviour is of vital importance for environmental, economic and humanitarian reasons.
The extent of these wetlands is driven by seasonal inundation and outflow from Lake Victoria (Rebelo et al., 2012), albeit significantly affected by the complexities of the regional hydrology. The wetland extent exhibits a maximum each year between August and November, coincident with the rainy season. This seasonal flooding has been estimated by Rebelo et al. (2012) to increase the wetland extent in the region by a factor of 4, with the total wetland area split between permanent (18%) and seasonally inundated (82%) wetlands. MODIS NDWI data (not shown) shows that, as well as the increase in surface water directly over the Sudd wetland region, there is also increased surface water evident further to the north of the region around Lake Tana and the Blue Nile Basin.
The results for this region are of particular interest because, whilst the discrepancy between the magnitude of the simulated and observed seasonal cycles is comparable to other regions (-11.2 ppb), the correlation coefficient is extremely poor at just 0.2.
This suggests that, unlike many of the other regions where it is the magnitude of the seasonal cycle that WetCHARTs does not fully represent, this is one of the few regions where the seasonality is also poorly represented. This is illustrated by Figure 13, which shows the GOSAT CH4 seasonal cycle over the Sudd region (black) along with the range of the WetCHARTs ensemble (red) for the full CH4 seasonal cycle (top panel) and with the "no wetland" simulation subtracted, resulting in a wetland CH4 seasonal cycle (2nd panel). This wetland CH4 seasonal cycle shows that while observations have a clear seasonal cycle, with the signal ranging from -10.8 to 14.3 ppb, the typical seasonal cycle for the WetCHARTs ensemble is less than half of this (ranging from -6.7 to 6.3 ppb) and, importantly, does not seem to have any temporal agreement with the observations, resulting in the very poor correlation coefficients we obtain for all ensemble members. The reason for this lack of sufficient seasonality in the WetCHARTs ensemble is evident when examining the time series of underlying driving data for the emissions (Figure 13, lower panels). The seasonality of both the temperature and heterotrophic respiration are out of phase with the wetland extent. This results in either there being sufficient temperature and respiration to produce methane but no wetland area from which to produce it or, alternatively, a large wetland area but insufficient temperature/respiration to produce the correct magnitude of emissions. The effect of this is very limited seasonality in the CH4 emissions throughout the entire timeseries. This is exemplified in Figure 14, which shows the average wetland signal for August-November (the time period where the satellite CH4 wetland signal peaks) between 2009-2017 for the GOSAT data (left) and two WetCHARTs ensemble members, 2923 (centre) and 2924 (right).
Despite the very strong wetland signal observed over this area, directly over the Sudd wetlands, the WetCHARTs signal is extremely low. From Figure 13, the seasonality of the two wetland extent parametrisations is in agreement with the seasonality of the observed signal, identifying that the issue in this area is not the dynamics of the wetland itself, but rather the seasonality of the temperature and respiration, which results in the magnitude of the emissions being far too small even though both wetland extent fraction databases allow WetCHARTs to form wetlands in this area. One limitation of using skin temperature is the implicit assumption that heterotrophic respiration is sensitive to top-of-soil temperature. We advocate for an expansion of the WetCHARTs ensemble to include subsurface soil temperature estimates, in place of surface skin temperatures, to explicitly account for the uncertainty associated with the soil temperature dependency of methanogenesis.
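The damping effect of out-of-phase drivers can be illustrated with a toy calculation based on the multiplicative form of the flux (extent × respiration × Q10 temperature term). The sketch below uses entirely synthetic sinusoidal monthly cycles with arbitrary means, amplitudes and peak months; it is not Sudd data and is meant only to show the qualitative behaviour.

```python
import numpy as np

months = np.arange(12)


def monthly_cycle(peak_month, mean, amplitude):
    """Synthetic sinusoidal seasonal cycle peaking at `peak_month`."""
    return mean + amplitude * np.cos(2 * np.pi * (months - peak_month) / 12.0)


def flux(A, R, T, q10=2.0, s=1.0):
    """Multiplicative flux: s * A * R * Q10 ** (T / 10)."""
    return s * A * R * q10 ** (T / 10.0)


def seasonal_range(x):
    """Peak-to-peak range of a monthly cycle."""
    return x.max() - x.min()


# Wetland extent peaks in month index 9 in both experiments.
extent = monthly_cycle(peak_month=9, mean=0.5, amplitude=0.4)

# Experiment 1: respiration and temperature peak together with the extent.
in_phase = flux(extent, monthly_cycle(9, 1.0, 0.5), monthly_cycle(9, 25.0, 5.0))

# Experiment 2: respiration and temperature peak six months earlier.
out_of_phase = flux(extent, monthly_cycle(3, 1.0, 0.5), monthly_cycle(3, 25.0, 5.0))

damping_ratio = seasonal_range(out_of_phase) / seasonal_range(in_phase)
```

In this toy setup the out-of-phase case yields a much smaller seasonal range than the in-phase case, even though the extent cycle is identical in both experiments.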
This case highlights an example where, although the wetland extent is sufficient to lead to the correct seasonality in CH4 emissions, the temperature and respiration are out of phase and, as such, WetCHARTs cannot reproduce the observed variability. This emphasises the importance of the interplay between the different driving parameters and the large discrepancy that can be caused if these are not consistent or sufficiently localised.
The Yucatán area of Mexico contains a variety of wetland ecosystems including mangroves, swamps, marshes and forests, with the watershed containing the Grijalva and Usumacinta rivers in the Tabasco/Campeche region forming the largest wetland complex in the country. The Pantanos de Centla region, located in the Usumacinta/Grijalva delta, is classified as tropical moist forest and includes permanent wetlands as well as seasonally inundated swamp forests. Mangroves are present between the Pantanos de Centla and the Laguna de Términos to the north, with moist tropical forests to the south, east and west.
For the Yucatán region, the correlation between the observed and simulated wetland CH4 seasonal cycles is reasonable, with WetCHARTs ensemble members ranging from 0.76 to 0.89 (Fig. A1), suggesting that WetCHARTs is capable of representing the phase of the wetland CH4 seasonal cycle. However, Figure 4 shows that the wetland CH4 seasonal cycle amplitude is consistently underestimated compared to observations, with a median difference of -15.8 ppb and 25/75 percentile values of -19.1 and -11.5 ppb respectively. This underestimation of the seasonal cycle can be attributed to the very low wetland extent fraction from both the GLWD and GlobCover datasets (Figure 15), which fail to represent the wetlands in this region, particularly the large Tabasco/Campeche wetland complexes in the centre of this region, the Alvarado Lagoon system to the west and the Sian Ka'an coastal wetlands to the east. These are barely included in either wetland extent dataset but are clearly identified as being significant from the JRC Surface Water Extent (Figure 15).

This case is of particular interest as it is one of the few examples where the Global Lake and Wetland Database (GLWD) performs particularly poorly, not featuring significant wetland extent where a strong wetland signal is observed. The large difference between the wetland extent fraction from GLWD and GlobCover is also striking for this region (Figure 16), more so than in the other regions examined. Furthermore, both wetland extent datasets suggest a double peak in the maximum extent, leading to two peaks in the emission data. The GOSAT observations, however, typically show only the second of these peaks in most years.

This case highlights the importance of the wetland extent data. While for the majority of regions it is the detail of the variability in extent that is of concern, for some regions the extent is even more poorly constrained, with large wetland regions still not being represented.

Discussion and Conclusions
In this study we have assessed the ensemble of WetCHARTs global wetland CH4 emissions against satellite observations. In particular, we have evaluated how well the magnitude and phase of the seasonal cycle of atmospheric CH4 driven by the individual WetCHARTs ensemble members agree with the seasonal cycle of CH4 observed from the GOSAT satellite. Figure 4 provides an overall summary of how well the phase and magnitude of the observed wetland CH4 seasonal cycle can be reproduced by WetCHARTs, both globally and at a regional scale. Across all years and all ensemble members, WetCHARTs typically underestimates the observed global seasonal cycle amplitude (bias of -7.4 ppb), but the correlation coefficient of 0.83 shows that the seasonality is well reproduced at a global scale. The Southern Hemisphere has a smaller bias (-1.9 ppb) than the Northern Hemisphere (-9.3 ppb), and our findings show that the bias is typically worst in the North Tropics (-11.9 ppb). When examining such large geographic areas, there is the possibility that significant positive and negative regional biases cancel each other out. While we find that WetCHARTs underestimates the seasonal cycle to some degree for the majority of individual wetland regions, we find compensatory effects over central Africa, where an underestimation in the seasonal cycle amplitude over the Sudd wetlands (-11.28 ppb) is partially compensated for by an overestimation in the Congo (4.97 ppb). Such an effect has implications for flux inversions over central Africa and we advise caution when interpreting such results.
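The two summary metrics used throughout (correlation and seasonal cycle amplitude difference) can be sketched as follows. The sketch assumes that the wetland signal is obtained by subtracting the "no wetland" simulation, as described for Figure 13. Treating the seasonal cycle as the mean-removed monthly series and the amplitude as its peak-to-trough range are assumptions made here for illustration. The function names and the synthetic series are hypothetical.

```python
import numpy as np

def wetland_signal(xch4, xch4_no_wetland):
    """Mean-removed wetland signal (ppb): series minus the 'no wetland' run."""
    signal = np.asarray(xch4, dtype=float) - np.asarray(xch4_no_wetland, dtype=float)
    return signal - signal.mean()

def evaluate(obs_signal, model_signal):
    """Return (Pearson r, amplitude difference, model minus observed, in ppb)."""
    r = float(np.corrcoef(obs_signal, model_signal)[0, 1])
    amp_obs = obs_signal.max() - obs_signal.min()
    amp_model = model_signal.max() - model_signal.min()
    return r, float(amp_model - amp_obs)

# Synthetic monthly example (hypothetical numbers, in ppb).
months = np.arange(24)
background = 1800.0 + 0.5 * months          # stand-in for the 'no wetland' run
cycle = np.sin(2 * np.pi * months / 12)
obs = background + 10.0 * cycle             # stand-in for GOSAT
model = background + 6.0 * cycle            # stand-in for one ensemble member

r, amp_diff = evaluate(wetland_signal(obs, background),
                       wetland_signal(model, background))
print(f"r = {r:.2f}, amplitude difference = {amp_diff:.1f} ppb")
```

A negative amplitude difference corresponds to the model underestimating the observed seasonal cycle, matching the sign convention of the biases quoted in the text.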

In our global evaluation, the most significant finding was that the correlation between the modelled and observed CH4 seasonal cycle was substantially higher when the WetCHARTs ensemble includes a temperature dependency (i.e. when the Q10 value is not 1). For equivalent ensemble members (e.g. Q10 = 1 vs Q10 = 2 vs Q10 = 3 with all other parameters the same), the correlation coefficient increased, for example, from 0.68 (1913) to 0.87 (1923) to 0.90 (1933). As expected, this behaviour at a global scale is not necessarily reproduced for all individual wetland regions, with Figure 3 showing that for some regions the temperature dependence is far more of a factor in driving the variation in the seasonal cycle than for other regions.
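For reference, a Q10 temperature dependency is conventionally written as a multiplicative factor that scales the emissions by Q10 for every 10 K of warming. The expression below is the standard Q10 form and is given here as an assumption about the notation rather than quoted from the WetCHARTs formulation; T_ref is a hypothetical reference temperature.

```latex
q(T) = Q_{10}^{\,(T - T_{\mathrm{ref}})/10},
\qquad
\frac{q(T + \Delta T)}{q(T)} = Q_{10}^{\,\Delta T/10}
```

With Q10 = 1 the factor equals 1 at all temperatures, so temperature contributes no seasonality to the emissions. For a seasonal temperature swing of ΔT = 10 K, Q10 = 2 and Q10 = 3 change the factor by 2 and 3 respectively.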
Globally, we also find that the correlation to observations is less reliant on which wetland fraction dataset is used (GLWD vs GlobCover), with both performing reasonably well in representing the global seasonal cycle (correlation coefficients of 0.88 for both 2923 and 2924). However, the choice of wetland fraction can dominate at regional scales, with very significant differences in the correlation coefficient between paired ensemble members (e.g. r = 0.76 for 2923 vs r = 0.46 for 2924 for the Indo-Gangetic region).
These results all indicate that WetCHARTs is capable of sufficiently reproducing the phase and magnitude of the wetland CH4 seasonal cycle in the wider sense, highlighting its utility as an a priori constraint on atmospheric flux inversions (i.e. the use for which it was developed). The results do indicate, however, that for certain regions specific ensemble members perform significantly better than others, whether due to the temperature dependence or the wetland extent parametrisation. This highlights that for focused regional studies the ensemble mean (the most commonly used configuration of the data) is not ideal, and some care needs to be given to assessing whether an individual ensemble member is a more appropriate representation for that region. It is our intention that this study is useful when making this determination in future.
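As a minimal illustration of this kind of region-specific selection, the sketch below picks the ensemble member with the highest correlation for each region, using only the correlation coefficients quoted above for members 2923 and 2924. Selecting on correlation alone is a simplifying assumption (the amplitude bias would also need to be considered), and the helper name `best_member` is hypothetical.

```python
def best_member(correlations):
    """Return (member_id, r) for the member with the highest correlation."""
    member = max(correlations, key=correlations.get)
    return member, correlations[member]

# Correlation coefficients quoted in the text for two paired ensemble members.
regional_r = {
    "global": {"2923": 0.88, "2924": 0.88},
    "Indo-Gangetic": {"2923": 0.76, "2924": 0.46},
}

for region, r_by_member in regional_r.items():
    member, r = best_member(r_by_member)
    spread = max(r_by_member.values()) - min(r_by_member.values())
    print(f"{region}: best member {member} (r = {r:.2f}), spread = {spread:.2f}")
```

The spread between members is zero at the global scale but substantial for the Indo-Gangetic region, which is the situation in which the choice of member matters most.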

Our results also indicate that regardless of the ensemble configuration, WetCHARTs performs poorly at reproducing the observed seasonal cycle in some regions. When no ensemble member is capable of reproducing the observed seasonal cycle signal it suggests a deficiency in the parametrisations used. Understanding this behaviour is valuable as it can be used to identify processes that occur in a particular region that are not captured by a simple data-driven approach. This then informs the development of more complex land-surface models where such processes will need to be explicitly included. In order to address this, detailed analysis of some of the more challenging exemplar regions (Paraná River, Congo, Sudd and Yucatán) was performed.

For the Paraná River region in South America, a region we identified in Parker et al. (2018) as having the potential for significant CH4 emissions driven by overbank inundation, we find that WetCHARTs typically reproduces the seasonality (r = 0.93) but underestimates the magnitude (-10.5 ppb). This underestimation is found to be most severe in specific years (2010, 2016, 2017) when the Paraná River and the Paraná Delta are flooded. This case highlights one deficiency of a data-driven approach in which the variability in wetland extent is forced by precipitation, as in WetCHARTs, namely that the effects of significant river flow upstream of the wetland area (e.g. during a strong El Niño event) are not captured. In this instance WetCHARTs could act as a valuable benchmark against which to evaluate more complex land surface models which include lateral river flow and subsequent overbank inundation (Dadson et al., 2010).
[...] confirm that the magnitude of emissions from the WetCHARTs ensemble is inconsistent with atmospheric CH4 observations, with a much stronger rainy season wetland signal from WetCHARTs (20.5 and 23.6 ppb) than from observations (11.9 ppb), and that the observed seasonal cycle is poorly represented (r <= 0.67). We do, however, find that the spatial extent of the wetland emissions is largely in agreement with observations and that neither wetland extent parametrisation outperforms the other.
Our results point to the wetland fraction (Figure 12) being far too large compared to the JRC Surface Water Extent. When coupled with strong heterotrophic respiration from CARDAMOM, this results in excess emissions. This region highlights the importance (and uncertainty) of the underlying heterotrophic respiration and is a strong argument for the ensemble-based approach that WetCHARTs takes, utilising 9 heterotrophic respiration models in its full ensemble (FE) configuration. Utilising the same approach in the extended ensemble (EE) configuration used here (upon availability of suitable model data) would help to further constrain the wetland emissions and be a useful addition to WetCHARTs.
The Sudd is the second region in central Africa that we focused on, as it provided a stark contrast to the Congo. Whilst the Congo had significant variability with very strong emissions, we found that the Sudd region had very low emissions with very little variability in the WetCHARTs seasonal cycle, which is inconsistent with the knowledge that this region is dominated by seasonally inundated wetlands. Indeed, our satellite observations showed the strong seasonal cycle that was expected, pointing to a deficiency in WetCHARTs in this region, leading to an extremely poor correlation (r = 0.2) to the observations and making this an interesting case study. Our investigation showed that the lack of seasonality in the WetCHARTs data was due to a strong anti-correlation between the respiration/temperature and the wetland extent. The seasonality of the wetland extent is in agreement with the observed CH4 signal, both peaking during the August-November rainy season, which leads us to conclude that a lack of sufficient temperature/respiration is the reason for the overall lack of a strong CH4 seasonal cycle.
This result again places WetCHARTs in the position to act as a useful benchmark when assessing these underlying fundamental processes within more complex land surface models.
The Yucatán region is our final region of focus. While the agreement between the emissions and observations is reasonable, WetCHARTs does underestimate the observed emissions during their peak each year. Furthermore, WetCHARTs produces a double peak in the seasonality that is not present in the observations. We attribute both of these discrepancies to the wetland extent parametrisations used. Both wetland extent parametrisations exhibit a double peak driven by the variability in precipitation, and by examining the spatial extent of the wetland datasets (Figure 15) we conclude that neither represents the large wetland complexes in this region, with GLWD doing particularly poorly. This result is of interest as the GLWD-based ensemble members are generally found to outperform GlobCover for the majority of regions, and overall we would conclude that GLWD provides a better representation of wetlands, but that is not the case in this region. This highlights the ongoing need for further improvements to global wetland extent datasets.
To conclude, we have performed the first detailed global and regional evaluation of the WetCHARTs CH4 emission model ensemble against a long time series of high-quality, validated satellite CH4 observations. Our findings provide confidence that WetCHARTs is generally very capable of reproducing the observed wetland CH4 seasonal cycle for the majority of wetland regions, but that certain ensemble members are more suited to specific regions, either due to deficiencies in the underlying data driving the model or complexities in representing the processes involved. The need for more reliable, validated, long-term wetland extent observations is clear, as many of the discrepancies we observed are attributed to deficiencies in our knowledge of wetland extent. The remaining driving data (i.e. heterotrophic respiration and temperature) are shown to also contribute to the mismatch to observations, with the details differing on a region-by-region basis but generally showing that some degree of temperature dependency is better than none. Utilisation of an ensemble of heterotrophic respiration models for the full WetCHARTs period would prove particularly valuable in this respect.
Finally, the data-driven approach utilised to produce WetCHARTs is well-suited to produce an ensemble dataset against which to evaluate more complex process-based land surface models that explicitly model the hydrological behaviour of these complex wetland regions.
Data availability. The latest version of the University of Leicester GOSAT Proxy v9.0 XCH4 data (Parker and Boesch, 2020) [...]
Appendix A: Regional Correlation Coefficients
A detailed breakdown of the correlation coefficients between the modelled and observed seasonal cycle for all regions and all ensemble members is presented in Figure A1.
Several regions have a very poor correlation to the observations (Sudd, S.E. Asia, Indonesia, N. Australia and S.E. Australia) across all ensemble members. It is also apparent that the temperature dependency is clearly significant for some regions, as demonstrated by the improved correlation coefficient for the xxCx ensemble members for West Amazon, China, N. Australia and S.E. Australia when comparing no temperature dependency (xx1x) against an increased temperature dependency (xx2x/xx3x). However, the temperature dependency seems to have little effect on other regions (East Amazon, East US, Yucatán, Pantanal and Paraná), and for some regions the strongest correlation is found when there is no temperature dependency and worsens when the temperature dependency is increased (e.g. Indonesia, Congo, Southern Africa).

The importance of the wetland extent parametrisation is region-specific, with the correlation to observations for the Tropics (especially the North Tropics) being much better for the GLWD extent (0.80-0.90 for the North Tropics) than for the GlobCover extent (0.62- [...]
Author contributions. RJP generated the GOSAT XCH4 retrievals, performed the analysis and wrote the manuscript. AAB produced an updated version of the WetCHARTs dataset for use in this study. CW and MPC produced the TOMCAT model simulations. All co-authors contributed to the planning and discussion of this study and to refining the manuscript.
Competing interests. We declare no knowledge of any competing interests.

JM acknowledges financial support from the Horizon 2020 CHE Project (776186). We acknowledge the support of the UK Natural Envi-