Reconciling different approaches to quantifying land surface temperature 1 impacts of afforestation using satellite observations

16 Satellite observations have been widely used to examine afforestation effects on local surface 17 temperature at large spatial scales. Different approaches, which potentially lead to differing 18 definitions of the afforestation effect, have been used in previous studies. Despite their large

2 estimates the actual effect and the other two the potential effect) and use large-scale 26 afforestation over China as a test case to examine whether the different approaches can be 27 reconciled. We found that the actual effect (ΔTa) often relates to incomplete afforestation over 28 a medium resolution satellite pixel (1km). ΔTa increased with the afforestation fraction, which 29 explained 89% of its variation. One potential effect approach quantifies the impact of quasi-full 30 afforestation (ΔTp1), whereas the other quantifies the potential impact of full afforestation (ΔTp2) 31 by assuming a shift from 100% openland to 100% forest coverage. An initial paired-samples t-32 test shows that ΔTa < ΔTp1 < ΔTp2 for the cooling effect of afforestation ranging from 0.07K to 4 uncertainty for climate model evaluation, with the uncertainty range being as big as, or even an 76 order of magnitude larger than, the afforestation effect. It remains unclear whether the 77 differences arising from these different methods can be reconciled.

79
Until now, studies using satellite data to investigate afforestation impact on surface temperature 80 have mainly focused on three methods. The first method, termed the 'space-and-time' approach 81 ( Fig. 1, red box), aims to examine the actual, realized effect of afforestation ('actual effect') by 82 isolating the forest cover change effect from the gross temperature change over time in places 83 where forest cover change actually occurred (Alkama and Cescatti, 2016;Li et al., 2016a). The 84 second method, termed the 'space-for-time' approach ( Fig. 1, orange box), compares the 85 surface temperature of forest with adjacent 'openland' (i.e., cropland or grassland) under similar 86 environmental conditions (e.g., background climate and topography) and estimates the 87 'potential effect' of afforestation if afforestation were to occur (Ge et al., 2019;Li et al., 2015; 88 Peng et al., 2014). Note that such effects are often quantified using medium-resolution land-89 cover datasets (typical resolution = 1km), which do not necessarily represent 100% ground 90 coverage, and we therefore term such a potential effect a 'mixed potential effect'.

92
The third method, recently used by Duveiller et al. (2018), uses the 'singular value 93 decomposition' technique ( Fig. 1 green box), which is claimed to extract the hypothetical LST 94 for different land-cover types by assuming a 100% coverage of the target cover type. The 95 afforestation effect on LST is then quantified as the difference between the LST of a pixel with 96 a hypothetical 100% forest coverage and the LST of an adjacent pixel with 100% openland 97 coverage. As with the second method, such an approach quantifies the 'potential effect' of 98 afforestation, but in this case, it quantifies the 'full potential effect' by assuming transitions 99 between land-cover types with 100% complete ground coverage.  at time t2 (i.e., having experienced afforestation) and the other unchanged. Note, neither of these 129 pixels will have 100% complete coverage of either openland (i.e., grassland or cropland) or 130 forest, but they will have been classified as either openland or forest by medium-resolution 131 satellite products. (c) and (d) represent pixels with 100% forest or 100% openland coverage 132 whose temperature can be derived from pixels of mixed land cover types by using the singular 133 value decomposition (SVD) technique (Duveiller et al., 2018). The red dotted box describes the 134 quantification of the 'actual effect' of afforestation (ΔTa) occurring from t1 to t2 by the 'space-135 and-time' method. The orange box represents the 'mixed potential effect' determined by 136 hypothesizing potential shifts between openland and forest based on the 'space-for-time' 137 approach (ΔTp1). The green box represents the 'full potential effect' of afforestation (ΔTp2) 138 7 derived by hypothesizing a transition from 100% complete openland coverage to 100% 139 complete forest coverage.  The 'space-and-time' approach assumes that the gross change in land surface temperature (ΔT) 151 over a given time period during which afforestation occurred, contains both signals of 152 temperature change due to afforestation (ΔTa) and background temperature variation (ΔTres) 153 due to changes in large-scale circulation patterns (Alkama and Cescatti, 2016;Li et al., 2016a The 'space-for-time' method relies on the 'space-substitute-for-time' assumption to obtain the 172 potential impact of afforestation on local temperature (Zhao and Jackson, 2014). By assuming 173 that forest and openland share the same environmental conditions (background climate, 174 topography, etc.) within a small spatial domain, the potential temperature effect of afforestation 175 is examined using the search window method with a window size of up to 40km×40km (Ge et 176 al., 2019;Li et al., 2015;Peng et al., 2014). Here, to be consistent with our 'actual effect' 177 approach, a more conservative window size of 11km×11km was used, smaller than that used in 178 the majority of previous studies (Ge et al., 2019;Li et al., 2015;Peng et al., 2014). In most 179 9 previous studies, existing medium resolution (1km) land-cover maps were used directly. Such 180 land-cover products rely on certain thresholds to classify satellite pixels into discrete land-cover 181 types. Given the widespread spatial heterogeneity in land-cover distribution, it is to be expected 182 that only in rare cases will these medium-resolution pixels have 100% coverage of a given land-183 cover type. Therefore, when determined in this way, the potential effect of afforestation has 184 been named the 'mixed potential effect', in contrast to the 'full potential effect', on which we 185 will focus in the next section, which assumes a potential transition between land-cover types of 186 100% coverage.

188
To ensure consistency with the land-cover data used in the 'full potential effect' approach (i.e., 189 the SVD method), the GlobeLand30 land-cover map was aggregated from its original resolution 190 (30m) to 1km resolution. The land-cover type assigned to a given 1km pixel during aggregation 191 was based on the land-cover type with an area fraction >50% within that pixel, to be consistent 192 with the rationale behind the generation of medium-resolution land-cover products (Section   The full potential effect represents the temperature change due to hypothesizing a shift from 213 100% openland to 100% forest coverage, and was determined here by employing the singular 214 value decomposition (SVD) method used in Duveiller et al. (2018). The SVD technique 215 assumes that the temperature observed for a pixel at 1km scale is a linear composition of the 216 temperatures of different land-cover types at a finer resolution (in our study at a 30m resolution).

217
For each 1km pixel, the observed temperature can be written as the composition of the 218 temperature of each component land-cover type and its corresponding fraction, based on the 219 land-cover fractions derived from the 30m-resolution GlobeLand30 map (Section 2.2). The 220 temperature of each type of land cover was assumed constant within a search window of 11km 221 × 11km. For each given search window, the following equations can be obtained: where n is the total number of 1km pixels within the window, after accounting for the elevation to the temperature of land cover type i. To minimize elevation impacts, the linear regression 227 relationship for a given 1km pixel was included only when the elevation difference between 228 this pixel and the central pixel of the search window was smaller than 100m. Using matrix 229 notation, Eq. (5) can be simplified to: where the matrix X contains land-cover fraction for the n 1km pixels as an explanatory variable, column (Zhang et al., 2007), the matrix X was transformed, by applying the SVD technique, to 237 another matrix, Z, of reduced dimension (more details in Duveiller et al., 2018). After this 238 transformation, we have the following: in which the β' coefficient can be obtained from equation (8): However, the β' vector calculated from the transformed matrix Z cannot directly provide 243 surface temperatures for corresponding land-cover types. To obtain temperatures for each land-244 cover type by assuming 100% ground coverage, an identity matrix Y with its dimension equal 245 to the number of land-cover types must be constructed to represent the hypothetical case in 246 which each 1km pixel was 100% covered by a single land-cover type. The same transformation 247 as applied to the matrix X was then applied to Y, to obtain a transformed matrix Z'. Finally, the 248 predicted temperature ( ' 100% LST ) for each land-cover type assuming a 100% coverage is 249 calculated as: For the central pixel of the local search window, ΔTp2 is defined as the difference between the  China was selected as the test case for addressing the important methodological issues in 267 quantifying land surface impacts of afforestation because afforestation is a key national strategy 268 for sustainable development and climate mitigation (Bryan et al., 2018;Qi et al., 2013).  (Table 1). The datasets were 290 regridded to harmonize with spatial (1km) and temporal (annual) resolutions (Table 1).

292
The MOD11A2 product provides 8-day land surface temperature for 10:30 AM and 22:30 PM 293 from the Terra satellite, but here we focused on daytime surface temperature. Only valid LST 294 observations from the original data were used to compute the average daily values for a given 295 year. Years for which more than 40% of daily data are missing were excluded from the analysis.

296
Annual data were then aggregated to obtain the average annual temperature for periods t1 and  The MCD43B3 product provides white-sky and black-sky shortwave albedo at 16-day temporal 300 resolution (Table1). The observed white-sky albedo was used as the daytime albedo (Peng et  Two land-cover datasets were used in this study: the 'actual effect' approach was based on the 309 Global Forest Change (GFC) dataset, while the 'mixed potential effect' and 'full potential effect' 310 used the GlobeLand30 land-cover data (Table 1).

312
The SVD technique, used in the 'full potential effect' approach, requires a land-cover map with 313 a higher spatial resolution than the 1km spatial resolution of the LST data. The GlobeLand30 314 product, which is based on Landsat images, provides land-cover information for China at a 30m 315 resolution for the years 2000 and 2010 (Chen et al., 2015). Cultivated land and grassland in 316 GlobeLand30 were classified as openland. Discrete land-cover type information at 30m 317 resolution in 2010 was aggregated to obtain the area fractions of the different land-cover types 318 at 1km resolution, which were then used to construct matrix X in Eq. (5) (Fig. 2). Furthermore, 319 land-cover type information at the 1km scale was extracted, based on the vegetation type with 320 area fraction >50% for every 1km×1km window. This data was then applied in the 'space-for-321 time' method to identify forest and openland (Fig. 2). GlobeLand30 data is not suitable for detecting forest change (Zeng et al., 2021). The Global 324 Forest Change (GFC) data, however, provides forest gain and forest loss at a spatial resolution 325 of 30m between 2000 and 2012 and has been used for mapping global forest change (Hansen 326 et al., 2013). This product shows an overall accuracy of greater than 99% for areas of forest 327 gain at the global scale when compared with statistical data reported in Forest Resource 328 Assessment (FRA), LiDAR detection (Geoscience Laser Altimetry System), and MODIS 329 NDVI time series (Hansen et al., 2013), and thus has been recommended for use in forest and 330 forest-change estimates (Chen et al., 2020;Zeng et al., 2021). Using this dataset, forest loss 331 events were identified for each year between 2000 and 2012, but forest gain was only identified 332 for the whole period, simply because forest loss is an abrupt change which can be effectively 333 identified over short time periods, whereas forest gain is a gradual change which can only be 334 confidently identified over longer time spans. Here, forest losses and gains from GFC were 335 aggregated at a 1km resolution to obtain net forest change (defined as forest gain minus forest 336 loss) during this period (Fig. 2). A positive net change indicates afforestation and the area 337 percentage of afforestation for the 1km pixel area was defined as Faff. The land-cover type of 338 pixels with Faff = 0% was considered to be stable. This net forest-change information was then 339 used in the calculation of the actual afforestation-induced temperature effect (ΔTa) (Fig. 2). Firstly, it can be reasonably assumed that ΔSWin≈0 and ΔLWin≈0, given that all three 352 approaches consider only local effects on surface temperature by following a comparison of 353 target pixels with surrounding control pixels, thus excluding feedbacks from, e.g., cloud 354 formation (Duveiller et al., 2018). Changes in reflected shortwave radiation can be derived as:  The spatial distributions of original samples for the three methods are different because of the 379 different land-cover maps used ( Fig. 2 and Figure A1) and, therefore, the statistical analysis 380 was limited to those pixels shared by all three approaches: 96,058 sample pixels at 1km 381 resolution. The distribution of these shared sample pixels retained the characteristics of the 382 spatial distribution of the original samples ( Figure A2).

384
Differences in the afforestation effects on LST of the three approaches were tested by 385 performing paired-samples t-tests between pairs of approaches. The paired-samples t-test was 386 used, rather than a normal t-test, to avoid the bias due to strong spatial heterogeneity in the LST 387 effects of afforestation that could occur if the values of all pixels had been pooled together for 388 a normal t-test. The test was made using the 'ttest_rel' method from the 'scipy.stats' package 389 in Python. The Bonferroni correction was applied to adjust the significance level (p-value) to 390 mitigate the increasing Type I error when making multiple paired-samples t-test, which in our 391 case involves three pairs (Lee and Lee, 2018;UC Berkely, 2008). The Bonferroni correction 392 sets the significance cut-off at α/k (with α as the p-value before correction and k as number of 393 pairs). In this study, with 3 hypotheses tests (i.e., 3 pairs) and an original significance level α = 394 0.05, the adjusted p-value is 0.0167. In order to investigate ΔTa in relation to the afforestation 395 intensity, a linear regression was performed between ΔTa and Faff using the ordinary least 396 squares method.  In China, afforestation areas are mainly located in the northeast, southwest and south, where 405 sufficient precipitation is available (Fig. 3a) and largely driven by afforestation of former 406 cropland or abandoned cropland, with a relatively small contribution from forest regeneration 407 or replanting following natural disturbance or timber harvest. One prominent feature of 408 afforestation in China is its small afforestation patch, with most afforested pixels (1km 2 ) having 409 an afforestation fraction of less than 30% (Fig. 3b). Pixels with an afforestation intensity below 410 10% account for 93% of the total number of pixels (Fig. 3b), representing 0.14 Mha, more than 411 half (55.6%) of the total afforestation area (Fig. 3b). Although all three approaches result in similar spatial and latitudinal patterns regarding 419 afforestation effects on LST (Fig. 4), their magnitudes differ substantially. The actual effect has 420 the lowest temperature change, followed by the mixed potential effect, with the full potential 421 effect showing the greatest temperature change (Fig. 4a-c).  Even though the observed land surface temperature is assumed to be uniform for the 1km 444 afforested satellite pixel, the underlying afforestation intensity varies substantially (Fig. 3a).

445
This leads to our first hypothesis that for a 1km pixel, ΔTa should be influenced by the area 446 fraction that has been afforested within the pixel (i.e., afforestation intensity or Faff). Indeed, the 447 actual daytime surface cooling increases significantly with afforestation intensity (Fig. 5), with 448 a 0.079±0.017K (mean ± std) increase for each ten percent increase in Faff. The afforestation effects obtained from the three approaches were compared for each Faff 456 interval (Fig. 6). When afforestation intensity is less than 60%, significant differences exist in 457 the temperature change obtained by the three approaches, with ΔTa < ΔTp1 < ΔTp2. This result 458 confirms our second hypothesis that the actual effect is expected to be smaller than potential 459 effects. However, for pixels with relatively low Faff, the mixed potential effect is found to be 460 smaller than the full potential effect, which is reasonable, but to our knowledge, has not been 461 reported before. When the afforestation intensity is greater than 60%, the significant difference samples t-test, n= 96,058), once again confirming our second hypothesis (Fig. 7). Moreover, 474 extrapolation of the relationship shown in Fig. 5 suggests that ΔTa would reach -0.79±0.17K

475
(mean ± std) if a 1km pixel was 100% afforested, which is conceptually comparable to the 476 potential effects. ΔTa was indeed found to be higher than ΔTp1 but lower than ΔTp2. This result 477 confirms our third hypothesis and demonstrates that the potential cooling effect could indeed 478 be reached when a pixel is fully afforested. with Faff reaching 100%, which is conceptually comparable with ΔTp1 and ΔTp2, is also shown. The three approaches (Li et al., 2015;Alkama and Cescatti, 2016;Duveiller et al., 2018) used 516 to quantify local surface temperature change following forest-cover change and presented with 517 details in this study, have been cited over 919 times in research papers (Web of Science,

518
December 2021) and in high-level climate science synthesis reports. Despite the apparently 519 large differences in temperature effect among them, to our knowledge, no studies have 520 examined whether these differences can be reconciled. This study fills that gap by comparing 521 the three approaches for a single study case, i.e., large-scale afforestation in China. China is 522 highly suitable for the purpose of this study as the size of an afforestation patch is, in general, 523 smaller than the spatial resolution (1km) at which the temperature effects of afforestation were 524 conducted in the previous studies describing the three approaches (Li et al., 2015;Alkama and 525 Cescatti, 2016;Duveiller et al., 2018). Hence, the difference between the actual and potential 526 temperature effects is expected to be large.

528
Indeed, we found surface cooling following afforestation was much less when estimated as the 529 actual effect (ΔTa) compared to the potential effects (ΔTp1 and ΔTp2). This lower ΔTa has been 530 attributed to incomplete afforestation at a 1km resolution, at which potential effects are 531 quantified by assuming complete afforestation (i.e., a complete shift from openland to forest).

532
Consistent with our first hypothesis, the afforestation fraction at a 1km resolution explained 89% 533 of the variation in ΔTa, making it a key determinant of the surface cooling following 534 afforestation (Fig. 5). This result is consistent with the fact that the observed temperature for a 535 mixed surface is determined by the area fractions of its respective components, with each 536 component having a unique temperature. This fact also forms the theoretical foundation for the 537 SVD technique used to derive the full potential effect (Duveiller et al., 2018).

539
Modelling (Li et al., 2016b) and satellite-based (Alkama and Cescatti, 2016) studies have found 540 that temperature change after afforestation (or deforestation) is highly sensitive to the fraction 541 of the model grid cell or satellite pixel that is subjected to afforestation (or deforestation), 542 echoing our finding that ΔTa significantly changes with Faff. In addition, we provide strong 543 evidence in support of our third hypothesis that when Faff reaches 100%, the expected actual 544 effect is comparable to the potential effects (Fig. 7). This finding shows that the three 545 approaches compared here are consistent when the same boundary condition, i.e., full 546 afforestation, is applied, and demonstrates that all three methods are mutually compatible. It is, 547 therefore, the basis of the reconciliation of the three approaches. It also highlights the fact that 548 the actual afforestation area must be considered when evaluating the climate mitigation effects 549 of afforestation.

551
Our results also show that the mixed potential effect (ΔTp1) is smaller than the full potential 552 effect (ΔTp2) (Fig. 6, Fig. 7). We suspect that this phenomenon likely also relates to the 553 incomplete forest coverage for the identified forest pixels at the 1km scale used in the 'space-554 for-time' analysis, because a threshold value of 50% forest cover was used when upscaling the 555 30m land-cover map to 1km resolution. This threshold, however, is consistent with the 556 commonly applied value in land-cover classification based on medium resolution satellite 557 images, such as MCD12Q1, which uses a tree coverage value of 60% to identify forest pixels 558 (Sulla-Menashe and Friedl, 2018). For the purpose of comparison, we also calculated the mixed 559 potential effect based on the MCD12Q1 land-cover map but using the same LST data. The 560 27 result shows that potential effects derived using MCD12Q1 data versus those derived using 561 spatially upscaled GlobeLand30 data are almost identical ( Figure A5), lending credibility to our 562 estimated ΔTp1 in comparison to previous studies using MODIS land-cover data (Li et al., 2015).

563
Progressively increasing the forest-cover threshold from 50% to 90% steadily increases ΔTp1 564 from -0.62±0.02K to -0.75±0.02K ( Figure A6). Further increasing the thresholds used to 565 identify 1km-resolution openland pixels from 50% to 90% increases ΔTp1 from -0.63±0.00K to 566 -1.10±0.02K ( Figure A7), bringing ΔTp1 even closer to ΔTp2 (-1.16±0.01K). This is consistent 567 with the finding of a previous study on the dependence of the temperature effect on the forest 568 cover change thresholds that were used to define afforestation: the higher the threshold, the 569 stronger the impact on temperature (Li et al., 2016). Our results add further support to the 570 compatibility of the three approaches given the same boundary condition, i.e., the complete 571 transformation from full openland to full forest coverage.

573
Several factors may contribute to the remaining differences in the temperature effects produced 574 by different methods even after reconciliation. As described in the Method section, there are 575 discrepancies in the assumptions of the three approaches, which lead to differences in the 576 control pixels (i.e., adjacent comparison pixels). For instance, for the 'pure potential effect' it 577 is assumed that the LSTs for pixels with the same land cover type are uniform and forest pixels 578 are compared with openland pixels, whereas the in the 'mixed potential impact' approach the 579 central target forest pixel is compared with the mean value of non-forest pixels within the 580 searching window. Moreover, space-for-time is an assumption that cannot be verified (Chen et 581 al., 2016), and which will inevitably result in differences in the estimated actual and potential 582 effects, although the consistency between 'potential' and 'actual' effects after reconciliation in 583 our study does illustrate the broad validity of this assumption. Differences between the actual and potential temperature effects can also arise from influences 586 of both the timing of the afforestation and the time period elapsed following afforestation.

587
However, such influences are expected to be small in our study. We argue that such influences 588 should be more pronounced in the case of deforestation than afforestation. The temperature 589 effect caused by deforestation is considered to be instant (Liu et al., 2018). As a result, if after afforestation, the influence of the specific 'timing of afforestation' is expected to be small.

600
Furthermore, the GFC dataset used in this analysis defined forest gain using the condition of 601 successful detection of a stable closed forest canopy that is clearly different from a non-forest 602 state (Hansen et al., 2013), which enhances the chance of temperature change saturation 603 following afforestation. But, given a maximum stand age of 12 years inferred from the GFC 604 dataset, differences in surface temperatures may still exist between newly established forests 605 and the mature existing forests that were used in the 'potential effect' approaches. Thus, we 606 cannot exclude the possible contribution of time period elapsed following afforestation to the 607 difference between the actual and potential effects, which failed to be reconciled.

609
Previous analyses have documented latitudinal patterns of surface temperature change induced 610 by afforestation (Alkama and Cescatti, 2016;Li et al., 2015Li et al., , 2016aPeng et al., 2014). When 611 comparing the three approaches for a single case study, consistent latitudinal patterns of local 612 surface temperature effects following afforestation are observed (Fig. 4). Notably, all three 613 approaches show a warming effect in the northern high latitudes and an opposite cooling effect 614 in the southern low latitudes, with a largely neutral effect in the 40-48° N latitude band, 615 providing further evidence that the three approaches are compatible. In particular, although the 616 three approaches used different land-cover maps, they derived consistent LST impacts 617 following afforestation, which highlights that fact that the reconciliation provided in this study 618 is rather robust and unlikely to be dependent on the land cover datasets used. All datasets used in this study are summarized in Table 1