Interactive comment on “Sensitivity of simulated historical burned area to environmental and anthropogenic controls: A comparison of seven fire models”

Abstract. Understanding how fire regimes change over time is of major importance for understanding their future impact on the Earth system, including society. Large differences in simulated burned area between fire models show that there is substantial uncertainty associated with modelling global change impacts on fire regimes. We draw here on sensitivity simulations made by seven global dynamic vegetation models participating in the Fire Model Intercomparison Project (FireMIP) to understand how differences in models translate into differences in fire regime projections. The sensitivity experiments isolate the impact of the individual drivers of fire, which are prescribed in the simulations. Specifically these drivers are atmospheric CO2, population density, land-use change, lightning and climate. The seven models capture spatial patterns in burned area. However, they show considerable differences in the burned area trends since 1900. We analyse the trajectories of differences between the sensitivity and reference simulation to improve our understanding of what drives the global trend in burned area. Where it is possible, we link the inter-model differences to model assumptions. Overall, these analyses reveal that the strongest differences leading to diverging trajectories are related to the way anthropogenic ignitions and suppression, as well as the effects of land-use on vegetation and fire, are incorporated in individual models. This points to a need to improve our understanding and model representation of the relationship between human activities and fire to improve our abilities to model fire for global change applications. Only two models show a strong response to CO2 and the response to lightning on global scale is low for all models. The sensitivity to climate shows a spatially heterogeneous response and globally only two models show a significant trend. 
It was not possible to attribute the climate-induced changes in burned area to model assumptions or specific climatic parameters. However, the strong influence of climate on the inter-annual variability in burned area, shown by all the models, shows that we need to pay attention to the simulation of fire weather but also meteorological influences on biomass accumulation and fuel properties in order to better capture extremes in fire behavior.


humans suppress fires in regions with high population density, observational studies are less clear about what happens in areas of low population density and show both increases and decreases due to human activities (see for instance Marlon et al., 2008; Bowman et al., 2011; Marlon et al., 2013; Vannière et al., 2016; Andela et al., 2017; Balch et al., 2017). Studies of the covariation between population density and number of fires have shown that increasing population density leads to an increase in the number of ignitions or in the number of individual fires, which peaks at intermediate population densities and drops subsequently (Syphard et al., 2009; Archibald et al., 2010). The increase in burned area for low population density is expected to differ from the one found for number of fires, as the largest fires occur in unpopulated areas (Hantson et al., 2015a) and burned area can be expressed as number of fires times fire size. Global analyses find that the net effect of population density is a decrease in burned area (Bistinas et al., 2014; Knorr et al., 2014), with high uncertainties for low population density if the method allows for non-monotonic relationships (Knorr et al., 2014). Regional analyses tend to confirm this, but positive relationships between burned area and population density have been shown, for instance, for the least disturbed areas in the USA (Parisien et al., 2016).
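The decomposition of burned area into number of fires and fire size can be written out to make the argument explicit. The symbols below are illustrative assumptions and do not appear in the original text: BA is burned area, N the number of fires, S-bar the mean fire size, and P population density.

```latex
\[
\mathrm{BA}(P) = N(P)\,\bar{S}(P)
\quad\Longrightarrow\quad
\frac{\mathrm{d}\ln \mathrm{BA}}{\mathrm{d}P}
  = \frac{\mathrm{d}\ln N}{\mathrm{d}P}
  + \frac{\mathrm{d}\ln \bar{S}}{\mathrm{d}P}
\]
```

Under this reading, burned area can decline with population density even where the number of fires still increases, provided that mean fire size decreases fast enough that the second term outweighs the first.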
Fire was used to manage croplands in pre-industrial times (e.g. Dumond, 1961; Otto and Anderson, 1982; Johnston, 2003) and it is still common practice, mainly in non-industrialized areas (i.e. Sub-Saharan Africa, parts of South East Asia, Indonesia and Latin America; e.g. Conklin, 1961; Rasul and Thapa, 2003). However fires in agricultural areas are common on all over 2017a; Lasslop et al., 2014). Wind speed, for instance, varies strongly between datasets and, although wind speed is an obvious driver of fire spread, it is difficult to extract this influence at the spatial resolution of global models (Lasslop et al., 2015).
Fire-enabled vegetation models generally simulate observed global patterns of burned area and fire emissions reasonably well (Kloster et al., 2010; Prentice et al., 2011; Li et al., 2012; Lasslop et al., 2014; Yue et al., 2014), but there are large differences between models in terms of regional patterns, fire seasonality and interannual variability, and historical trends (Kelley et al., 2013; Andela et al., 2017). A recent evaluation of the FireMIP models indicates that the relationship with climatic parameters is captured well by models, the response to human factors is captured by some models and the response to vegetation productivity needs refinement for most models (Forkel et al., 2019).
In this study we briefly assess how well the FireMIP models simulate present-day burned area. We then document how simulated burned area responds to individual forcing factors and relate inter-model differences of the burned area response to differences in model assumptions or parametrisation. We finally discuss the model limitations and implications of our results for model development and application.

Methods
The baseline FireMIP experiment (SF1) is a transient simulation from 1700 to 2013, in which CO2, population density, land-use, lightning, and climate change through time according to prescribed datasets (see Rabin et al. (2017a) for details of the experimental protocol). The five sensitivity experiments (SF2) are designed to isolate differences in model behaviour associated with individual forcing factors. The model inputs and setup are the same as in SF1, but one of the forcings is kept constant throughout the simulation in each experiment (see tab. 1). Thus, for example, in SF2_CO2, population density, land-use, lightning and climate inputs change each year, but CO2 is held constant at 277.33 ppm for the whole of the simulation. Not all models performed every sensitivity experiment due to limitations in model structure (see tab. 2). Two of the models (CLASS-CTEM and CLM) started the simulations later than the others (1861 and 1850, respectively). Since our analyses are confined to differences in behavior during the 20th century, this difference in the length of the simulations between the models should have little impact.

Table caption (beginning not recovered): (Rabin et al., 2017a). Rptd indicates the forcing was repeated over the given years. SF2_CO2 stands for fixed CO2, SF2_FPO for fixed population density, SF2_FLA for fixed land use, SF2_FLI for fixed lightning, and SF2_CLI for fixed climate.
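The experimental design can be summarised as a small lookup structure. The following is a minimal sketch: the experiment names are those used in the text, while the forcing labels, the dictionary layout and the helper `transient_forcings` are hypothetical and serve only as illustration.

```python
# Forcing labels are illustrative; only the experiment names come from the text.
FORCINGS = ("co2", "population_density", "land_use", "lightning", "climate")

# Each SF2 experiment holds exactly one forcing constant; SF1 holds none.
FIXED_FORCING = {
    "SF1": None,
    "SF2_CO2": "co2",
    "SF2_FPO": "population_density",
    "SF2_FLA": "land_use",
    "SF2_FLI": "lightning",
    "SF2_CLI": "climate",
}


def transient_forcings(experiment):
    """Return the forcings that vary through time in the given experiment."""
    fixed = FIXED_FORCING[experiment]
    return tuple(f for f in FORCINGS if f != fixed)
```

Comparing SF1 with a given SF2 experiment then isolates the effect of the single forcing that differs between the two runs.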

Data processing and analysis of simulation results
Our analyses of the SF1 and SF2 simulations focus on the simulation of burned area but are complemented by effects on vegetation carbon pools for the SF2_CO2 simulation. We focus on the time series of global burned area over the historical simulation and the spatial patterns of differences in burned area between 1900 and 2013, as in this period all forcings are transient. Annual global values are an area weighted average using the grid cell area. We quantify the sensitivity of the models to each driving factor using the relative difference in burned area between the baseline and the respective sensitivity experiment ((SF1-SF2_i)/SF2_i, with i in CO2, FPO, FLA, FLI, CLI). We use the climate data operators (CDO version 2018: Climate Data Operators. Available at: http://www.mpimet.mpg.de/cdo) to process and remap the simulated outputs. We test the relative difference time series for trends over the period from 1900 to 2013 using the Mann-Kendall test, implemented in the R package Kendall (McLeod, 2011). We quantify the global trend as the slope of a linear regression and summarize the spatial distribution of trends by quantifying the area with significant positive trends and the area with significant negative trends.
Due to a postprocessing error, INFERNO lacks two years in SF2_CO2 (2002 and 2003).
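The processing chain described above (area-weighted global value, relative difference, trend test, regression slope) can be sketched as follows. This is an illustrative Python stand-in and not the CDO and R (package Kendall) workflow used in the study. It assumes a regular latitude-longitude grid, so that cell area is proportional to the cosine of latitude, and the Mann-Kendall variant shown omits the correction for ties. All function names are hypothetical.

```python
import math


def cell_area_weights(lats_deg):
    """Relative cell areas on a regular lat-lon grid (proportional to cos(lat))."""
    return [math.cos(math.radians(lat)) for lat in lats_deg]


def area_weighted_mean(values, weights):
    """Area-weighted average of one field, given as flat lists."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def relative_difference(sf1, sf2):
    """(SF1 - SF2_i) / SF2_i for each year of two global time series."""
    return [(a - b) / b for a, b in zip(sf1, sf2)]


def mann_kendall(series):
    """Two-sided Mann-Kendall trend test without tie correction: (S, z, p)."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of each pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # normal approximation, two-sided
    return s, z, p


def ols_slope(years, series):
    """Slope of an ordinary least-squares regression of series on years."""
    n = len(years)
    mx = sum(years) / n
    my = sum(series) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(years, series))
    sxx = sum((x - mx) ** 2 for x in years)
    return sxy / sxx
```

With the relative difference expressed as a fraction, multiplying the slope by 100 gives a trend in percent per year.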
To evaluate the realism of the simulations of burned area, we compare the simulated burned area with remote sensing data products. We used three satellite products: GFED4 (Giglio et al., 2013), GFED4s and FireCCI50 (Chuvieco et al., 2018). These three data sets use different retrieval algorithms, which cause differences in spatial and temporal patterns in burned area (Hantson et al., 2016; Humber et al., 2018). Since there is no agreement about which is most reliable, using all three products provides a measure of the uncertainty in the observations. For the comparison we use temporally averaged burned

In spite of major advances in mapping burned area based on satellite data, these data products include major uncertainties.
GFED4 and FireCCI50 provide uncertainty estimates for the burned area. Applying Gaussian error propagation, which assumes that errors are independent and normally distributed, yields uncertainty estimates of 0.01 and 0.2 % of the global burned area, which is certainly an underestimation. The assumptions of normal distribution and independence are likely violated. The spread between global burned area data sets is probably a more realistic estimate. Since all the products rely on the MODIS sensor, this approach will, however, also not capture the full uncertainty. Nevertheless, to investigate the effect of data quality in the observations on the model-data comparison we use the burned area product uncertainty estimates (aggregated to model resolution assuming independence) to group the observations into points with low, medium and high uncertainty (low: within the 0-33rd percentile, medium: within the 33rd-66th percentile, and high: within the 66th-99th percentile of the relative uncertainty estimates = uncertainty / burned area). We then compute the correlations for data points with low, medium and high uncertainty separately.
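A minimal sketch of the two steps described above, error propagation under independence and the grouping of grid cells by relative uncertainty, is given below. It is illustrative only: the function names, the linear-interpolation percentile and the use of a Pearson correlation are assumptions, and cells without observed burned area are skipped to avoid division by zero.

```python
import math


def propagate_independent(sigmas):
    """Uncertainty of a sum of independent errors: sqrt(sum of sigma_i^2)."""
    return math.sqrt(sum(s * s for s in sigmas))


def percentile(sorted_vals, q):
    """Percentile (q in 0..100) by linear interpolation on sorted values."""
    pos = q / 100.0 * (len(sorted_vals) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def pearson(x, y):
    """Pearson correlation coefficient; NaN if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def correlation_by_uncertainty(sim, obs, sigma):
    """Correlate sim and obs separately for low, medium and high uncertainty."""
    cells = [i for i in range(len(obs)) if obs[i] > 0]
    rel = {i: sigma[i] / obs[i] for i in cells}  # relative uncertainty
    ranked = sorted(rel.values())
    p33, p66, p99 = (percentile(ranked, q) for q in (33, 66, 99))
    groups = {"low": [], "medium": [], "high": []}
    for i in cells:
        if rel[i] <= p33:
            groups["low"].append(i)
        elif rel[i] <= p66:
            groups["medium"].append(i)
        elif rel[i] <= p99:
            groups["high"].append(i)  # cells above the 99th percentile are dropped
    return {
        name: pearson([sim[i] for i in idx], [obs[i] for i in idx])
        for name, idx in groups.items()
        if len(idx) >= 3
    }
```

The first function also illustrates why the propagated global uncertainty is so small: for 10 000 independent cells that each carry a 50 % error on equal burned area, the relative error of the global sum is only 0.5 %.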
Results and discussion

Evaluation of the baseline experiment
The models show magnitudes of annual global burned area between 354 and 530 Mha/yr for present day. This is comparable to the estimates obtained from the satellite products, which range from 345 to 480 Mha/yr (see fig. 1, tab. 3). The correlation coefficients between all of the simulations and the satellite observations are reasonable, with values ranging from 0.51 (CLASS-CTEM, GFED4s) to 0.8 (ORCHIDEE-SPITFIRE, GFED4; see tab. 3). In general, the correlations with GFED4 are highest and with GFED4s lowest for almost all models, which may reflect the fact that most models do not explicitly simulate agricultural fires or may reflect an overestimation or not sufficiently precise estimation of the contribution of such fires to burned area in the GFED4s data set. The correlation coefficients strongly decrease with increasing observational relative uncertainty (see tab. A1),
showing that part of the mismatch in the spatial patterns between simulations and observations is a consequence of uncertainties in the satellite products themselves. The FireMIP models simulate the broad scale patterns in burned area reasonably well (see fig. A1), with maxima in the major fire-affected regions of the Sahel, southern Africa, northern Australia and the western USA. All of the models tend to overestimate the burned area in South America and also in the temperate regions of the USA.

Table residue (global burned area of the satellite products, presumably in Mha/yr): GFED4s 480, FireCCI50 389.

The simulated trend in burned area of the historical reference simulation differs between the models (see fig. 1). All models, et al., 2017). The short length of the satellite record leads to uncertainties in the trends, which are in most regions not statistically significant (Andela et al., 2017). Charcoal data are a proxy for fire occurrence over longer time scales (Marlon et al., 2008, 2016). These charcoal records show a global decrease in biomass burning over most of the 20th century (Marlon et al., 2008, 2016), which is consistent with carbon monoxide data from ice-core records. However, the charcoal records appear to show an increase in burning since 2000 CE, contrary to the decline shown by satellite-based records of burned area (Andela et al., 2017). This discrepancy might reflect sampling or taphonomic issues of the charcoal record. For instance, the continent that contributes most to the global burned area (Africa) is heavily undersampled. Uncertainties in the charcoal record have been discussed before (e.g. Arora and Melton, 2018). A decline in global burned area over the 20th century might be more realistic than the increase shown by several models. Better evaluation of historical trends in fire proxies and longer satellite time series will help to gain more confidence in observed trends of fire regimes.
The better understanding of the drivers of simulated trends that we provide below can inform us on how certain trends can be achieved in models.

Sensitivity of models to individual drivers
There are large differences in sign and magnitude between models in the temporal response of global burned area from 2 a). The differences between models are increasing over the 20th century for these first three experiments. The response to changes in lightning and climate generally shows much smaller trends: only two models have a significant trend for climate, with increases in burned area due to changing climate (0.054 and 0.028 % year⁻¹). Three models show significant (but incon-

The spatial patterns of trends in burned area are mostly heterogeneous (see supplement figures A3-A7). Limited areas of the world can dominate a global trend or the trends can cancel out when aggregating to the global burned area sum. A regional analysis is beyond the scope of this study, but we provide an alternative global view on the trends by quantifying the area affected by positive or negative trends (see fig. 3). This comparison shows that for most models larger areas show significant positive trends for the reference simulation (5 models), rising atmospheric CO2 (5 models) and varying climate (all models),

In the following paragraphs we detail the inter-model differences and their causes for each sensitivity experiment.

Sensitivity of models to CO2
The overall changes in burned area in individual simulations as a result of atmospheric CO2 changes are a complex response to multiple changes in vegetation: changes in land cover, fuel load, fuel characteristics and fuel moisture. Burned area can either increase due to higher availability of fuel loads or decrease due to changes in flammability caused by different fuel characteristics including moisture (Rabin et al., 2017a). The FireMIP-models react to increasing CO2 in different ways: some models (JSBACH-SPITFIRE and LPJ-GUESS-SPITFIRE) show a strong increase in burned area, some (CLM and INFERNO) show a moderate increase, CLASS-CTEM shows a slight decrease, and LPJ-GUESS-SIMFIRE-BLAZE and ORCHIDEE-SPITFIRE show a non-monotonic response (see fig. 2a). For all models, the trends over the 20th century are significant (see tab. 4).
We use changes in vegetation carbon to understand changes in fuel load and composition because information on the amount of fuel used within the fire models was not available for individual plant functional types (PFTs). All models show an increase in total vegetation biomass ('total', solid lines; see fig. 4). The response of specific types of vegetation carbon to increasing CO2 varies between the vegetation models. The biomass of C3 vegetation (trees and C3 grasses) increases in all of the models. The biomass of C4 grasses increases in CLASS-CTEM, INFERNO, and JSBACH-SPITFIRE, but does not change in ORCHIDEE-SPITFIRE. Since ORCHIDEE-SPITFIRE was run with fixed vegetation distribution, changes in the extent of different PFTs can be ruled out as a cause of changes in vegetation carbon. There is a decrease in burned area in regions with abundant C4 grasses (Sahel and North Australia) in this model, suggesting that increased C3 tree biomass results in changes in flammability in these regions.
The carbon stored in C4 grasses is reduced in response to increasing CO2 in CLM and LPJ-GUESS-SIMFIRE-BLAZE and is fairly constant in LPJ-GUESS-SPITFIRE. This can be a result of a decrease in C4 grass cover in LPJ-GUESS-SIMFIRE-BLAZE and LPJ-GUESS-SPITFIRE. However, since CLM was run with prescribed vegetation cover, the reduction in C4 carbon must reflect the fact that any increase in C4 grass biomass due to higher CO2 is offset by greater losses through burning due to the increased total fuel load.
CLM and LPJ-GUESS-SIMFIRE-BLAZE include an interactive nitrogen cycle, CLASS-CTEM a non-interactive nitrogen down-regulation. Effects of CO2 on vegetation biomass for these three models are therefore at the lower end of the model ensemble.
Soil moisture is used by several models to compute fuel moisture (see fig. 5). Soil moisture can be influenced by different atmospheric CO2 as reductions in stomatal conductance can lead to increases in soil moisture, whereas increases in LAI caused by increased biomass or increased tree cover lead to higher transpiration and therefore lower soil moisture. Soil mois-

The strength of CO2 effects on productivity and allocation is still uncertain. Comparisons with experimental data suggest that models that do not include the nitrogen cycle overestimate the effect on productivity (Hickler et al., 2015). However, an analysis using an observation-based emergent constraint on the long-term sensitivity of land carbon storage shows that models from the Coupled Model Intercomparison Project (CMIP5) ensemble that included an interactive nitrogen cycle underestimate the impact of CO2 on productivity (Wenzel et al., 2016).

The models all agree that for high population density fire is suppressed, but differ in their assumptions about what happens at low population density, about the threshold at which humans start to suppress fire, and about whether explicit suppression is included. This leads to some similarities in the spatial patterns of the effect of population changes (see fig. A4). The net or emerging effect of humans on burned area in models, however, also depends on the presence of lightning ignitions. As soon as lightning ignitions are present, the net effect of humans is to suppress fires, even when the underlying relationship assumes an increase in ignitions with population density (Arora and Melton, 2018, supplement).
This may explain why global models assuming an increase of ignitions with increases in population density are able to capture the burned area variation along population density gradients (Lasslop and Kloster, 2017; Arora and Melton, 2018) although global statistical analyses support a net human suppression also for low population density (Bistinas et al., 2014).

Sensitivity of models to land-use change
The land-use change imposed in SF2_FLA over the recent centuries is characterized by a strong decrease in forested areas, and shows a decrease in burned area but this change is comparatively muted (see fig. 2c).
The FireMIP-models handle land-cover dynamics, the expansion of agricultural areas and fire in agricultural areas differently. In LPJ-GUESS-SIMFIRE-BLAZE pastures are harvested; this reduction in biomass leads to a decrease in burned area in addition to the decrease caused by exclusion of fire in croplands. In JSBACH-SPITFIRE, the expansion of pastures occurs preferentially at the expense of natural grassland and does not affect tree cover until all the natural grassland has been replaced (Reick et al., 2013). This assumption decreases the effect of land cover conversion on tree cover. Additionally, in JSBACH-SPITFIRE the fuel bulk density of pastures is higher than that of natural grass by a factor of two, which decreases fire spread and thus burned area (Rabin et al., 2017b). This difference reduces burned area in pastures compared to natural grassland. In CLASS-CTEM, which also shows a decline, pastures are not included; the only land conversion is due to the expansion of croplands.

Some of the models (CLASS-CTEM, CLM, JSBACH-SPITFIRE, ORCHIDEE-SPITFIRE) prescribe the vegetation distribu-
LPJ-GUESS-SPITFIRE and ORCHIDEE-SPITFIRE react with an increase in burned area to the expansion of land-use since they treat pastures as natural grasslands. The SPITFIRE fire module is very sensitive to the vegetation type, with very high burned area for natural grasslands due to higher flammability compared to woody PFTs (Lasslop et al., 2014). Fuel bulk density is an important parameter but, additionally, grass fuels dry out faster, leading to an increase in flammability and therefore burned area if forested areas are converted to grasslands. LPJ-GUESS-SPITFIRE computes the vegetation cover dynamically, so that an increase in burned area reduces the cover fraction of woody types, which might explain the stronger response compared to ORCHIDEE-SPITFIRE. In CLM pastures are represented by increased grass cover. The biomass scaling function does not distinguish fuel types (see fig. 5); therefore the lower fuel amount of grasslands could lead to a decrease in fire probability, while the maximum fire spread rate depends on the vegetation type and is higher for grasslands (Rabin et al., 2017b). The inclusion of cropland and deforestation fires dampens the effect of land-cover change on global burned area. In INFERNO, agricultural regions are not defined explicitly. Instead, woody PFTs are excluded on agricultural area (Clark et al., 2011). INFERNO includes an average burned area for each PFT in the calculation of the burned area per PFT, which leads directly to increasing grass cover resulting in higher burned area (Mangeon et al., 2016; Rabin et al., 2017b).
Land-use was already identified as a main reason for inter-model spread in the CMIP5 ensemble (Kloster and Lasslop, 2017).
We have shown that this largely reflects the way pastures are treated, as most models used here (except CLM and INFERNO) simply exclude croplands from burning.

Sensitivity of models to lightning
Most of the models show a low sensitivity of burning rates to lightning (see fig. 2), although lightning rates increase by 20% over the simulation period. ORCHIDEE-SPITFIRE shows an increase in burned area between 1940 and 1960 and towards the end of the simulation. The reason can most reasonably be found in comparison to the other SPITFIRE-models and seems to be related to two points. Firstly, it uses a 12 times higher factor to convert lightning strikes to actual ignitions and anthropogenic ignitions that are 100 times lower than for the other models. Therefore, the partitioning of natural and anthropogenic ignitions is different from other SPITFIRE models (see Rabin et al., 2017b). Secondly, although a partitioning factor (SGFED) varies regionally, the per-capita ignition frequency is constant; in JSBACH-SPITFIRE and LPJ-GUESS-SPITFIRE, the per-capita ignition frequency varies regionally. This results in strong differences in the spatial patterns of burned area (see fig. A1). In consequence, the strength of regions contributing to the global burned area varies between the models; ORCHIDEE-SPITFIRE shows much more burning in the tropical and far less burning in the temperate region. Our results show that even a substantial increase (20%) in lightning has little influence on global burned area. However, lightning is known to be an important cause of ignitions regionally and is potentially involved in more complex interactions between fire, vegetation and climate, which can speed up the northward expansion of trees in boreal regions (Veraverbeke et al., 2017). Thus, although we have shown that the influence of increasing lightning is negligible at a global scale, it is a potentially important factor for regional impacts.

Sensitivity of models to climate
Simulated burned area in FireMIP responds to changes in climate with strong interannual variability but only weak trends in burned area (see fig. 2e). Only three models show a statistically significant trend in the global burned area according to a Mann-Kendall test (CLM, LPJ-GUESS-SIMFIRE-BLAZE, ORCHIDEE-SPITFIRE; see tab. 4). However, in all models the area showing an increase in burned area in response to climate is larger than the area showing a decrease (see fig. 3). Agreement in spatial patterns of trends between the models is, however, low (see fig. A7).
The influence of climate on burned area is complex; it influences burned area through the meteorological conditions and through effects on fuel load and fuel characteristics (Scott et al., 2014). We therefore correlated, for each grid cell, changes in physical parameters (precipitation, temperature and soil moisture) and vegetation parameters (litter, vegetation carbon and grass biomass) with changes in burned area. We find that the correlation between the individual parameters and burned area is low (see fig. A8). The absolute rank correlations are lower at the monthly scale than at the annual scale. However, at the monthly scale the number of grid cells showing significant correlations with physical parameters is higher than the number showing significant correlations with vegetation parameters, indicating that changes in physical parameters have more influence at shorter time scales than changes in vegetation parameters. This difference disappears with the aggregation to annual time scale. On the annual time scale, however, the mean absolute rank correlation is slightly higher for the vegetation parameters.
Soil moisture, which is also influenced by vegetation, likewise has a slightly higher correlation than precipitation and temperature. This indicates that vegetation parameters are more influential on the longer annual time step and physical parameters on the monthly time step. The relationship between precipitation or soil moisture and burned area is expected to be negative, while the impact of temperature is expected to be positive. This is clearly reflected in the percentage of positively significant correlations at the annual scale, but is less clear at the monthly time step. This might reflect that the seasonality of temperature, precipitation and vegetation parameters is often synchronized and therefore the effects of the parameters cannot be separated.
The low correlation between individual parameters and burned area reflects the complex interactions between the climatic drivers, vegetation conditions and fire weather.
The impact of climate on the interannual variability is, however, strongly expressed in the simulated burned area. This is
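The per-grid-cell rank correlation and its summary as a mean absolute value can be sketched as follows. This is an illustrative implementation of a Spearman rank correlation with average ranks for ties. The text does not state which rank correlation or software was used, so the choice of Spearman and all function names are assumptions.

```python
import math


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def mean_abs_rank_correlation(driver_by_cell, burned_by_cell):
    """Mean |rho| over grid cells; each cell holds one time series per variable."""
    rhos = [spearman(d, b) for d, b in zip(driver_by_cell, burned_by_cell)]
    valid = [abs(r) for r in rhos if not math.isnan(r)]
    return sum(valid) / len(valid)
```

Averaging absolute values rather than signed correlations prevents cells with opposite responses, for instance to precipitation in fuel-limited and moisture-limited regions, from cancelling each other out.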

Implications for model development and applications
The huge spread of simulated burned area trends for any of the forcing factors indicates the high uncertainties in burned area trajectories. With the current state of knowledge, the use of a model ensemble that covers the model structural uncertainties is 5 clearly the best approach for projections. Nevertheless, our analyses suggest a number of promising avenues for further model development and indicates which analysis of observational data would be useful to constrain global models. Improvements of global models will be particularly important to improve the future projections of fire-enabled models to support land management strategies for instance in the context of climate change mitigation.
Representing human influence on fire is the major challenge for long-term projections. Our analyses of the controls on the 10 variability of fire suggest that human activities drive the long term (decadal to centennial) trajectories, while considering climate variability may be sufficient for short-term projections. The large divergence in the response to human activities between the FireMIP models shows that the human impact on fires is still insufficiently understood and therefore poorly represented in current models. There is strong inter-model agreement that burned area is suppressed at high population densities, which means that most models show a similar spatial distribution of fire-prone areas (see fig. A4) and a reduction of the burned area 15 in the last decades of the simulation due to increases in population density. However, the reduction in global burned area in the reference simulation is for most models still much smaller than shown by satellite observations (Andela et al., 2017). This could be solved by increasing the suppression effect of humans through population density in the models, however, it could also be related to land-use and for LPJ-GUESS-SPITFIRE and JSBACH-SPITFIRE to overestimation of the CO 2 fertilization effect. The level of socioeconomic development also modifies the relationship between population density and burned area 20 (Andela et al., 2017;Forkel et al., 2017); further analyses are required to better disentangle the balance of the different driving factors.
We have identified land-use change as the major cause of inter-model spread. Only one model included fires associated with land use and land cover change (cropland and deforestation fires), all the other models only included such effects through changes in vegetation parameters and structure. Croplands are simply excluded from burning in all but one model. The spread 25 of the other models is therefore likely related to the treatment of pastures. The inclusion of cropland fires is certainly important to understand and predict changes in emissions, air pollution and the carbon cycle (Li et al., 2018). Cropland fires are due to their small extent and low intensity still a major uncertainty in remote sensing datasets . High resolution remote sensing may help to improve the detection. But increased understanding in regional differences why and when people burn croplands may help to find an adequate representation of cropland fires within models. Pastures contribute over 30 40% of the global burned area (Rabin et al., 2015). Pasture fires are not treated explicitly in any of the models, although some models slightly modify the vegetation on pastures, by harvesting or changing the fuel bulk density (see 5). Since most models implement expansion of pastures simply by increasing the area of grasslands, information on how fuel properties differ between pastures and natural grasslands could help to improve model parametrisations. Grazing intensity was found to be related to decreases in burned area (Andela et al., 2017). It therefore may be necessary to include information on grazing intensity, or better information on pasture management in general, to represent pastures realistically within global fire models.
In contrast to many model simulations that use a lightning climatology based on satellite observations, the FireMIP experiments were driven by a transient dataset of lightning activity created by scaling a mean monthly climatology of lightning activity using convective available potential energy (CAPE) anomalies. Although we do not detect large signals in global burned area due to 5 changes in lightning, the impact of changes in lightning at a regional scale (and particularly in boreal regions) is considerable.
Since climate changes can be expected to cause changes in lightning, it will be important to develop transient lightning datasets for climate change studies on fire. Using present day lightning patterns, for example, will certainly lead to an overestimation of lightning strikes in regions with drier climate projected in the future. The covariation with climate as well as the temporal resolution are important (Felsberg et al., 2018). The FireMIP dataset was developed using only a limited amount of information 10 about the covariation of precipitation, CAPE and lightning; further analyses of these relationships would be useful.
Remotely sensed burned area datasets alone are clearly not a sufficient basis to evaluate fire models, as many model structures can lead to reasonable burned area patterns. It is important to test how well current models represent the number of fires, the size of individual fires and fire intensity. Both the effects of fire on vegetation (combustion of biomass and tree mortality; Williams et al., 1999; Wooster et al., 2005) and the plume heights of fire emissions to the atmosphere (Veira et al., 2016) are a function of fire intensity. The emergence of longer records of burned area and the increasing availability of information on other aspects of the fire regime should considerably improve opportunities to evaluate and improve our models.
The FRY database and the global fire atlas (Andela et al., 2018), for example, provide information on fire size, number of fires, and the characteristics of fire patches. Exploiting such datasets should help to constrain the internal mechanisms of fire models and hopefully allow the balance of different drivers to be improved.

Summary and conclusions
The analysis presented here improves our understanding of global modelling of burned area and of the uncertainties associated with specific drivers and process representations in the models. The identified differences between fire models also help to focus analyses of observations that aim to provide constraints for global fire models. Although burned area in most models compares reasonably well with satellite observations, there is a large spread between models both in transient simulations before the satellite era and in the influence of the driving factors.
The analysis of the sensitivity experiments showed that: (1) The increase in atmospheric CO2 concentration over the 20th century leads to increased burned area in regions where fuel loads increase, but to decreased burned area in regions where tree density or coarse fuels with lower flammability increase, or where increases in soil moisture decrease flammability. Although models agree that the available fuel increases, the type of fuel and the vegetation composition are critical to understanding the influence of CO2 on simulated burned area.
(2) Most models link the number of ignitions to population in such a way that ignitions initially increase with population at low population densities. In densely populated regions, all models assume that the effect of anthropogenic ignitions is outweighed by fire suppression and the increased fragmentation of the landscape by anthropogenic land use. Whether a model shows an overall increase, a decrease or an initial increase followed by a decrease in burned area over the 20th century depends largely on the population threshold assumed for the transition from increasing ignitions to increasing suppression, and on the complexity of the treatment of fire suppression.

(3) The simulated response of burned area to land-use and land cover change depends on how fires in cropland and pastureland are treated in each model. Most models simply exclude croplands from the burnable area; the treatment of pastures therefore contributes the largest part of the model spread. Models that do not allow fire in croplands, and either harvest biomass in pastures or assume specific vegetation parameters, show a reduction in burned area. Models that treat pastures as natural grasslands and distinguish different fuel types or strongly increase burned area for grasslands show an increase in burned area.

(4) The models are comparatively insensitive to changes in lightning, likely because lightning ignitions are not a limiting factor in many regions with very extensive burning. Previous studies, however, show the importance of lightning and of changes in lightning for burned area in the boreal region. Regional studies in particular should therefore pay attention to this factor.
(5) None of the models shows a strong trend due to changing climate, but all of them show a strong influence of climate on the interannual variability. Climatic and ecosystem parameters are able to explain only a rather small part of this variation and are likely more important for the longer-term changes in fire that are needed, for instance, in Earth system models.
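The population dependence summarised in point (2) can be illustrated with a minimal sketch. It assumes a hump-shaped response in which ignitions grow linearly with population density while the fraction of ignitions that escape suppression decays exponentially. The functional form, the function name `effective_ignitions` and both parameters are hypothetical and do not reproduce the formulation of any of the seven FireMIP models.

```python
import math


def effective_ignitions(pop_density, ignitions_per_person=1.0, peak_density=10.0):
    """Hypothetical hump-shaped link between population and ignitions.

    pop_density: inhabitants per unit area (must be >= 0).
    ignitions_per_person: assumed ignitions per inhabitant without suppression.
    peak_density: assumed density at which effective ignitions peak,
        i.e. the transition from increasing ignitions to
        dominant suppression.
    """
    if pop_density < 0:
        raise ValueError("pop_density must be non-negative")
    # Ignitions before suppression rise linearly with population
    unsuppressed = ignitions_per_person * pop_density
    # Fraction of ignitions that escape suppression falls with population
    escaping_fraction = math.exp(-pop_density / peak_density)
    return unsuppressed * escaping_fraction
```

In this form the maximum lies exactly at `peak_density`. A grid cell whose population rises over the 20th century therefore shows increasing, decreasing or hump-shaped effective ignitions depending on where its densities fall relative to that assumed threshold, which is the behaviour the sensitivity experiments attribute to the differing thresholds in the models.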
The uncertainties in global fire models need to be taken into account in model applications, for instance if model simulations are to be used to design climate adaptation strategies. Model ensembles are one suitable way to provide estimates of these uncertainties.
Data availability. Datasets will be available after acceptance of the paper.
Figure A2. Regression slope of a grid cell for the baseline experiment SF1 over 1901-2013.
Figure A8. Spearman rank-order correlation coefficient for each grid cell over 1901-2013 between the relative difference in annual burned area fraction between the baseline experiment SF1 and the sensitivity experiment SF2_CLI (see Table 1) and precipitation, temperature, carbon stored in litter, carbon stored in vegetation, carbon stored in grass and soil moisture, respectively. The upper panel shows the mean absolute rank correlation, i.e. the spatial average over the absolute and significant Spearman rank-order correlation coefficients where the relative difference in burned area fraction is > 0.1. The second panel shows the proportion of grid cells with a significant correlation. The lowest panels indicate the percentage of significant grid cells with a positive correlation.
Table A1. Correlation coefficients between burned area simulated by the FireMIP models within the baseline experiment SF1 and the respective observation data. Due to the very skewed distribution of burned area, we use a square root transformation on both model and observations. Numbers in brackets show the Pearson correlation coefficients for untransformed data. GFED4 and FireCCI50 provide uncertainty estimates. Correlation coefficients for 33% show the correlation between all grid points that lie within the 0-33% percentile of the relative standard error; values for 66% lie within the 33-66% percentile of the relative standard error and values for 99% lie within the 66-99% percentile. Bold numbers indicate correlation coefficients that are significant (p-value < 0.001).