Predicting evapotranspiration from drone-based thermography – a method comparison in a tropical oil palm plantation

For the assessment of evapotranspiration, nearsurface airborne thermography offers new opportunities for studies with high numbers of spatial replicates and in a fine spatial resolution. We tested drone-based thermography and the subsequent application of the DATTUTDUT energy balance model using the widely accepted eddy covariance technique as a reference method. The study site was a mature oil palm plantation in lowland Sumatra, Indonesia. For the 61 flight missions, latent heat flux estimates of the DATTUTDUT (Deriving Atmosphere Turbulent Transport Useful To Dummies Using Temperature) model with measured net radiation agreed well with eddy covariance measurements (r2= 0.85; MAE= 47; RMSE= 60) across variable weather conditions and times of day. Confidence intervals for slope and intercept of a model II Deming regression suggest no difference between drone-based and eddy covariance methods, thus indicating interchangeability. The DATTUTDUT model is sensitive to the configuration of the net radiation assessment. Overall, we conclude that drone-based thermography with energy balance modeling is a reliable method complementing available methods for evapotranspiration studies. It offers promising, additional opportunities for fine grain and spatially explicit studies.

Abstract. For the assessment of evapotranspiration, nearsurface airborne thermography offers new opportunities for studies with high numbers of spatial replicates and in a fine spatial resolution. We tested drone-based thermography and the subsequent application of the DATTUTDUT energy balance model using the widely accepted eddy covariance technique as a reference method. The study site was a mature oil palm plantation in lowland Sumatra, Indonesia. For the 61 flight missions, latent heat flux estimates of the DAT-TUTDUT (Deriving Atmosphere Turbulent Transport Useful To Dummies Using Temperature) model with measured net radiation agreed well with eddy covariance measurements (r 2 = 0.85; MAE = 47; RMSE = 60) across variable weather conditions and times of day. Confidence intervals for slope and intercept of a model II Deming regression suggest no difference between drone-based and eddy covariance methods, thus indicating interchangeability. The DATTUTDUT model is sensitive to the configuration of the net radiation assessment. Overall, we conclude that drone-based thermography with energy balance modeling is a reliable method complementing available methods for evapotranspiration studies. It offers promising, additional opportunities for fine grain and spatially explicit studies.

Introduction
Evapotranspiration (ET) is a central flux in the hydrological cycle on a regional and on a global scale. Terrestrial ET consumes almost two-thirds of terrestrial precipitation (Oki and Kanae, 2006). There is an interest in better understanding ET and its drivers as climate change is expected to increase atmospheric evaporative demand, and droughts are predicted to become more severe and frequent in the future (Prudhomme et al., 2014). ET is also strongly affected by land-cover and land-use changes, which are currently very pronounced in tropical regions (Hansen et al., 2013).
The eddy covariance (EC) technique is a widely accepted and well-established method to quantify ET at the stand scale (Baldocchi et al., 2001;Fisher et al., 2017). It results in a single latent heat (LE) flux value integrated over the footprint of the EC tower at a given time that can be converted to an ET estimate. A spatial fine grain attribution of different surface patches to this overall ET value is generally not possible. The EC method is costly and labor intensive; therefore, a relatively low number of spatial replicates within a given region and among its different ecosystems are typically available. The EC method also has certain constrains regarding topography, atmospheric turbulence and landscape heterogeneity (Göckede et al., 2008).
A complementary approach for assessing LE at larger spatial scales is the use of remotely sensed land surface tem-F. Ellsäßer et al.: Predicting evapotranspiration from drone-based thermography peratures (LSTs) as boundary conditions for energy balance modeling and subsequent conversion to ET (Brenner et al., 2017;Guzinski et al., 2014;Hoffmann et al., 2016;Ortega-Farías et al., 2016;Xia et al., 2016). Transpiration from leaf surfaces leads to evaporative cooling of the canopy; LSTs, along with air temperature, can thus be used as a reliable indicator of plant water use, both in monocultures and in spatially highly heterogeneous systems such as natural forests (Lapidot et al., 2019). Compared to the EC method, this approach can potentially increase the number of spatial replicates within and among ecosystems, and it is also applicable in challenging terrain. Remotely sensed LSTs are regarded as good indicators for plant water use, stress and transpiration (Jones and Vaughan, 2010). One approach to obtain LST data is the use of satellite-based observations (Allen et al., 2007;Bastiaanssen et al., 1998;Ershadi et al., 2013). However, the spatial resolution of LST satellite data such as Landsat TM, ASTER, MODIS or AVHRR ranges from 90 m to 1 km, limiting the distinction of plant canopies and soil (Berni et al., 2009). A higher temporal resolution of satellite-based thermal infrared (TIR) observations is usually associated with a lower spatial resolution, and TIR data from satellites in both high spatial and high temporal resolutions are not yet available (Brenner et al., 2017). Additionally, clouds are barriers for thermal radiation and therefore have a strong effect on the quality and availability of satellite-based TIR observations (Guzinski et al., 2013). This is of particular importance in regions with frequent cloud cover such as in tropical environments.
An alternative recently emerging approach to measure LSTs is the use of drones. Radiometric TIR sensors for LST recording have become lightweight and affordable, and drones are now capable of carrying adequate payloads for reasonable time spans. Near-surface thermography-based studies allow for temporal resolutions in flexible, e.g., hourly, time steps and a spatial resolution in the decimeter scale or finer. Drone-based TIR recording and subsequent modeling of LE with energy balance models has previously shown promising results for short grass and crop vegetation in central Europe (Brenner et al., 2018;Hoffmann et al., 2016). However, remote sensing of LST from drones is challenging and involves careful planning. Recording LST close to the surface results in a high resolution but reduces the area covered in a certain time span compared to surveying from a higher altitude. Increasing flight altitude reduces spatial resolution of LST images and thus increases the averaging of surface temperatures from individual canopies, soil patches and branches from neighboring canopies into a single pixel (Still et al., 2019). Further, air humidity can have a major effect on measurement accuracy as water vapor not only attenuates the signals from the surface of interest to the sensor but also emits its own thermal radiation (Still et al., 2019).
Different energy balance models are available to compute LE from LST and subsequently calculate ET. In the one-source energy balance model DATTUTDUT (Deriving Atmosphere Turbulent Transport Useful To Dummies Using Temperature) (Timmermans et al., 2015), fluxes are estimated by relating single pixel temperatures to local temperature extremes. Crucial in applying such energy balance models is how the net radiation (R n ) is implemented. In the original formulation of the DATTUTDUT model, R n is fully modeled, assuming a range of prerequisites and environmental conditions (Timmermans et al., 2015). Using airplanes or drones to record LSTs, the DATTUTDUT model previously showed promising results for grass and vineyard surfaces in temperate and subtropical regions (Brenner et al., 2018;Xia et al., 2016). However, to our knowledge, a comprehensive method comparison considering potential errors in both a reference method (e.g., the EC technique) and novel dronebased approaches is not yet available. Since full method comparisons based on model II regression require a sample size of at least n = 60 data pairs (Legendre and Legendre, 2003), many previous studies with smaller sample sizes were constrained to using error terms and correlation coefficients.
The current study was conducted in the lowlands of Jambi province (Sumatra, Indonesia) where, over the last decades, large areas of rainforest have been converted to rubber and oil palm plantations Margono et al., 2012). This resulted in regional-scale changes in transpiration (Röll et al., 2019) and land surface warming (Sabajo et al., 2017). We assessed energy fluxes in a mature monoculture oil palm plantation and compared the LE estimates of drone-based methods with the established EC method as measured ground-based reference. The DATTUTDUT energy balance model was tested, with three different configurations for the determination of R n (fully modeling R n , R n estimates based on short-wave irradiance and measuring R n ). The objectives of our study were to compare LE estimates from the drone-based methods to the EC technique, with a special focus on the detection of proportional and continuous errors among the methods and an evaluation of the model's prediction performance. The present study focuses on the comparison of different drone-based methods as a baseline for future ecological studies, rather than applying the methods to different land-use types.

Study site
The study site is located in the lowlands of Jambi province (Sumatra, Indonesia) near the Equator (1.6929879 S, 103.3914411 E; 76 m a.s.l.). Average annual air temperature in the region is 26.5 • C, and average annual precipitation is 2235 mm yr −1 (Drescher et al., 2016). At the time of our measurement campaign in August 2017, the studied monoculture oil palm (Elaeis guineensis) plantation was 15 years old. Palm stem density was 140 palms ha −1 , with an average palm height of 14.3 m and an average canopy radius of 4.5 m. Leaf area index (LAI) was estimated at 3.64 m 2 m −2 (Fan et al., 2015), and canopy cover was estimated to be 90 %. Plantation management included the removal of older and non-vital leaves from the oil palms, herbicide application to remove most understory plants and fertilization (196 kg N ha −1 yr −1 ) . The average annual oil palm yield is 27.7 Mg ha −1 . An EC tower (22 m height) is situated in the center of the site with a fetch of up to 500 m in all directions   (Fig. 1).

Drone-based image acquisition
We used an octocopter drone (MK EASY Okto V3; HiSystems, Germany) equipped with a thermal and an RGB camera mounted in a stereo setup on a gimbal to ensure nadir perspective. The radiometric thermal camera was a FLIR Tau 2 640 (FLIR Systems, USA) attached to a TeAx ThermalCapture module (TeAx Technology, Germany). The sensor covers spectral bands ranging from 7.5 to 13.5 µm with a relative thermal accuracy of 0.04 K and an absolute thermal accuracy of ± 2 K (FLIR Systems, USA). The RGB camera was based on an Omnivision OV12890 CMOS sensor (Omnivision, USA) with a 170 • field-of-view (FOV) fisheye lens. Instead of the mosaicking approaches applied in most of the mentioned previous studies, we used a single image recording concept as faster image acquisition allows for a denser temporal resolution of LSTs. To capture an area of 100 m radius around the EC tower in a single shot of the thermal camera, images were taken from 260 m altitude. Image corners were removed due to vignetting effects. During a consecutive 5 d flight campaign in August 2017, 61 LST datasets and matching EC measurements were recorded. Flights were conducted between 09:00 and 16:00 LT (local time), in accordance with the 30 min intervals of the EC averaging cycles, resulting in 10 to 14 flights per day. All LSTs were measured using a fixed emissivity of one as the energy balance model would introduce an emissivity parameter in the process.

Energy balance models
LSTs are recorded as "snapshots" representing an instantaneous state of surface temperatures. Soil-Vegetation-Atmosphere Transfer (SVAT) models use these instantaneous observations of LST to solve the energy balance equation and estimate instantaneous fluxes. In our study the one-source energy balance model DATTUTDUT (Timmermans et al., 2015) was applied. Using drones, the proximity of the thermal camera to the surface is much closer compared to other typical carriers (such as satellites or planes); hence, atmospheric effects are supposed to have only a very minor effect. We used directional radiometric temperature recordings from the drone as input without applying further corrections. All model configurations in this study use instantaneous land surface temperatures (LSTs) to solve the energy balance equation: where R n is the net radiation, G is the ground heat flux, and the turbulent fluxes H and LE represent sensible and latent heat fluxes, respectively. R n is estimated by calculating the budget of incoming (↓) and outgoing (↑) long-wave (R l ) and short-wave (R s ) radiation: where the short-wave component is calculated by multiplying incoming short-wave radiation R s ↓ [W m −2 ] with its absorption ratio deducted from the combined soil and vegetation albedo P surf . This way, reflected outgoing short-wave radiation R s ↑ is subtracted from the energy balance. The longwave radiation budget is calculated from surface (soil and vegetation) emissivity ε surf , atmospheric emissivity ε atm , the Stefan-Boltzmann constant σ (5.6704 × 10 −8 W m −2 K −4 ), air temperature T air and radiometric land surface temperature T surf (both in kelvin). The incoming long-wave radiation component is added to the budget and the outgoing longwave radiation is subtracted.

DATTUTDUT
Key input for the DATTUTDUT model is a LST map from where the hottest (T max ) and the 0.005 quantile of coldest pixels (T min ) are extracted, assuming that hot pixels are a 864 F. Ellsäßer et al.: Predicting evapotranspiration from drone-based thermography result of very little to no evapotranspiration and cold pixels originate from a high evapotranspiration rate (Timmermans et al., 2015). Fully modeled R n is computed based on downwelling short-wave radiation estimates calculated using sunearth geometry to solve Eq. (2). Therefore, surface albedo P surf is calculated as in Timmermans et al. (2015) based on the assumption that dense vegetation appears colder than rocks or soil in the thermal imagery (Brutsaert, 1982;Garratt, 1992): Downwelling short-wave radiation R s ↓ is calculated from the dimensionless atmospheric transmissivity τ and the exoatmospheric short-wave radiation SW exo = 1360 W m −2 (Timmermans et al., 2015). Transmissivity τ is calculated as described in Burridge and Gadd (1977) using the solar elevation angle α that was determined from the geographic position of our site and the coordinated universal time (UTC) of the measurements: Timmermans et al. (2015) suggest using a constant value of 0.7 for τ and 0.8 atmospheric emissivity (ε atm ), but as our flight times range from 09:00 to 16:30 LT, we decided to include the solar elevation angle as in Eq. (4). Further, we used a constant surface emissivity (ε surf ) of 0.98 as recommended for vegetation-dominated areas (Jones and Vaughan, 2010) and not 1.0 as simplified in the original formulation of the DATTUTDUT model. Air temperature T air was calculated as the 0.005 quantile of the coldest pixels in the image.
As the original DATTUTDUT formulation does not account for cloud cover, Eq. (5) is replaced by measured shortwave irradiance as in Brenner et al. (2018) for model runs with Rn_sw. For model runs with Rn_mes, Eq. (2) was replaced by R n measurements recorded at the EC tower.
The sum of the turbulent fluxes is calculated by subtracting G from R n . The result is allocated into its components H and LE, using the evaporative fraction (EF) (Timmermans et al., 2015): To implement the DATTUTDUT model, we used the QGIS3 plugin QWaterModel  that is provided with an easy-to-use graphical user interface.
The actual amount of evapotranspiration (ET, in mm h −1 ) was calculated similar to Timmermans et al. (2015): where LE is the latent heat flux in W m −2 , t is the respective time span in seconds and T air is the air temperature in kelvin.

Eddy covariance measurements
The micrometeorological tower is located in the center of the study site (Fig. 1). The EC technique was used to measure LE and H fluxes from high-frequency (10 Hz) measurements of above-canopy water vapor concentration, sonic temperature and 3-D wind components. and three-cup anemometers (Thies Clima, Göttingen, Germany) (18.5, 15.4, 13 and 2.3 m heights) for wind speed measurements were installed on the tower. Ground heat flux was measured using heat flux plates (HFP01, Huxeflux, Delft, the Netherlands) at 10 cm depth. Additional soil moisture and temperature measurements (Trime-Pico 32, Imko, Ettlingen, Germany) above the heat flux plate at 5 cm depth were used to calculate heat flux at the soil surface. EC data recording, filtering and processing were carried out identical to the methodology described in Meijide et al. (2017) for the same study site. As the applied drone-based model assumes full energy balance closure, we used the Bowen ratio closure method (Pan et al., 2017;Twine et al., 2000) to compute full closure for the EC measurements. The Bowen ratio method was found to produce the most congruent results in conjunction with drone-based latent heat flux estimates (Brenner et al., 2017) and was therefore applied in this study. The energy balance closure (EBC) of the reference EC measurements was 0.77 (r 2 = 0.87), which is in line with EBC reported for other tall vegetation canopies (Stoy et al., 2013). Since the DATTUTDUT energy balance model assumes full EBC, we applied the so-called Bowen ratio closure method to the EC data (Pan et al., 2017). The method assumes that wind measurements miss some of the total covariance and dispersive fluxes. Therefore, underestimations of LE and H are carried over proportionally because of similarity among fluxes (Twine et al., 2000). The Bowen ratio closure method proportionally assigns the underestimated turbulent energy to LE and H fluxes to reach full EBC.
EC data processing and quality checks were performed following the methodology described in . Following Mauder and Foken (2006), flux estimates during low turbulence and thus stable atmospheric conditions were removed from the analysis; however, low turbulence mainly occurred during night hours and was not observed during the daytime drone flights. Generally, the EC method is associated with uncertainties of 5 %-20 % (Foken, 2008). Further limitations are the high costs and quite specific requirements regarding size and terrain of the study site.

Statistical analyses
Both methods, the reference EC technique and the dronebased estimates, are associated with a certain degree of uncertainty. A model II Deming regression (Deming, 1964) was applied for the analysis to consider uncertainties in both x and y variables (Cornbleet and Gochman, 1979;Glaister, 2001). We assumed that the error ratio (σ ε 2 /σ δ 2 ) of the variances (σ ) of errors on y (ε i ) and on x (δ i ) would not differ from 1, which is the standard procedure if both uncertainties are unknown (Legendre and Legendre, 2003). We used the interquartile range method with a factor k = 1.5 to remove outliers from the regression. A Durbin-Watson test was applied to test for correlation in error terms. We checked for heteroscedasticity visually and using a White test. Normal distribution of error terms was tested visually plotting standardized residuals vs. theoretical quantities and performing a Shapiro-Wilk test. Standard errors and confidence intervals for slope and intercept of the Deming regression were calculated using analytical methods (parametric) and the jackknife method (Armitage et al., 2001;Linnet, 1993). As further supporting indicators of model performance, we calculated the coefficients of determination (r 2 ), the mean absolute error (MAE), the root mean square error (RMSE), and slope and intercept from the Deming regression. Statistics such as r 2 have their limitations in method comparison since they are designed to indicate how well the resulting model of the regression describes the outcome and are not necessarily a good measure for systematic bias between methods. However, they are used as a statistic in this study since they represent an additional indicator for interpretation. Linearity was checked visually plotting residuals vs. fitted values.
All modeling procedures and parts of the statistical analyses were computed using Python version 3.7.1 (Python Software Foundation), involving the libraries NumPy 1.14.2, SciPy 1.1.0, pandas 0.23.1, scikit-learn 0.19.1, gdal 2.3.2, Astropy 3.2.2 and tkinter 8.6. The Deming regression was computed using the mcr v2.2.1 package (Manuilova et al., 2014) in R version 3.6.1 (R Development Core Team, 2019). Graphic representation was processed in Python using the Matplotlib 3.0.2 and Seaborn 0.9.0 libraries.

Dataset characteristics
The dataset offers a comparatively high number of replicates from 61 drone flights and the corresponding eddy covariance measurements, enabling a method comparison which requires at least n = 60 observations (Legendre and Legendre, 2003). The data were recorded at a 30 min frequency to facilitate the analysis of daily courses of evapotranspiration behavior, creating a trade-off situation of more flights per day with shorter flight times per flight. Because flight times were so short, only a smaller footprint with a radius of 100 m around the eddy covariance station was covered, while the footprint recorded with the eddy covariance system ranged up to a 500 m radius around the tower. Therefore, the reduced area of the drone-recorded LST maps is often smaller than the extent of the eddy covariance footprint. We have several reasons to assume that this does not cause major problems for the comparison though: the study area is very homogenous with an elevation difference of 5 m in the eddy covariance footprint, and the biosphere is strongly dominated by only one species (oil palm). The plantation is very well managed so that all oil palm canopies are alive; no oil palms have died and only dry leaves are removed.

Meteorology
During our 61 flight missions, cloudiness was variable from clear sky to full cloud cover; short-wave irradiance ranged from 204 to 1110 W m −2 . The prevailing wind direction was from northeast, at an average wind speed of 1.7 m s −1 . Canopy air temperature ranged from 22.5 to 32.3 • C, and relative humidity varied between 62 % and 99 %. Temperature differences between measured air temperature at 16.3 m (top of canopy) and mean land surface temperatures ranged from 0.005 K to a single peak of 8.689 K for the single flights, while the daily averaged differences ranged from 1.32 to 2.13 K. The energy balance closure of the reference EC measurements was 0.77 (r 2 = 0.87).

Drone-based modeling methods vs. eddy covariance method
At the time of the drone flights, LE from the EC method ranged between 87 and 596 W m −2 (mean: 337 W m −2 ) and eddy covariance-derived evapotranspiration was on average 0.43 ± 0.21 mm h −1 , with peak evapotranspiration of up to 0.87 mm h −1 during midday. Congruence of LE estimates with reference EC measurements differed among the three applied models and was further affected by the configuration of the R n assessment (Fig. 2). The assumptions for Rn_mod were not always met as cloud cover was present during several flights; consequently, the corresponding net radiation estimates were too high, leading to a substantial overestimation especially of smaller latent heat fluxes in the DATTUT-DUT model. The short-wave irradiance-based Rn_sw configuration resulted in R n estimates that were on average very comparable with the measured net radiation Rn_mes but also showed a rather high variation (Fig. 2). Generally, error metrics were reduced and agreement was increased the more measurement-controlled the R n determination process was. DATTUTDUT LE estimates closely agreed with EC measurements around noon, but were higher in the morning Figure 2. Measured net radiation (Rn_mes) plotted against fully modeled net radiation (Rn_mod) and net radiation estimates based on short-wave irradiance (Rn_sw). and afternoon hours, which is caused by overestimations of R n from the Rn_mod method (Fig. 3a). The DATTUTDUT model configuration with Rn_sw produced LE estimates that matched LE from EC more closely (Fig. 3b). DATTUTDUT computed similar or higher estimates of LE compared to the EC measurements during noon but mostly underestimated LE fluxes in the morning and afternoon. With Rn_mes configuration, DATTUTDUT computed closely matching LE estimates at all times of day across the 5 d measurement period (Fig. 3c).
Across all times of day and weather conditions (n = 61 flight missions), congruence among drone-based LE estimates and reference EC measurements was highest for the DATTUTDUT model with Rn_mes configuration (r 2 = 0.85); MAE and RMSE were 47 and 60 W m −2 , respectively (Fig. 4). To compare the model predictions and the eddy covariance measurements, we computed a Deming regression between both LE predictions from the models and LE estimates by the EC method. The methods are considered to be statistically interchangeable if the confidence intervals of the slope and intercept include one and zero, respectively. If the confidence intervals for the intercept of the Deming regression include zero, there is no constant or continuous error between the two methods. If the confidence intervals for the intercept do not include zero, both methods differ by a constant amount, i.e., the new method has a continuous error compared to the reference method. In contrast, the confidence intervals of the slope of the Deming regression indicate whether there is a proportional error between the methods, which increases proportionally with the magnitude of the predicted value. Deming regression of the LE estimates of the DATTUTDUT model with Rn_mes configuration showed no significant proportional or constant error compared to EC measurements as the values one and zero lay within the respective 99 % confidence interval ranges of slope and intercept (Fig. 5). It is thus indicated that there is no significant difference between LE estimates from DATTUT-DUT with Rn_mes configuration and the EC technique. In the Rn_sw configuration, the DATTUTDUT model showed no significant proportional and continuous error of LE estimates compared to EC measurements (Fig. 5). However, the confidence intervals for model runs with the Rn_sw configuration were rather wide, indicating a large level of uncertainty. The DATTUTDUT model in the Rn_mod configuration showed significant proportional and constant errors or large biases compared to EC measurements, as well as very large confidence intervals (Figs. 4 and 5).

Spatial distribution of LE
For 9 August 2017, 12:30 LT, the DATTUTDUT model in Rn_mes configuration predicted a mean of 526 W m −2 (minimum of 0 on the corrugated iron roof of the EC tower system, maximum of 637 W m −2 , coefficient of variation 7.53 %, for the analyzed 18 383 pixels) (Fig. 6), which translates to a mean ET of 0.778 mm m −2 h −1 . Locally, i.e., in the center of oil palm crowns, high LE of >400 W m −2 was observed, while LE from soil and ground vegetation areas between oil palm canopies was lower. The LE fluxes of all pixels were almost normally distributed for the one-source energy balance model DATTUTDUT in all configurations, mainly showing differences in the range and magnitude of LE estimates (Fig. 7).

Discussion
Our study indicates a high agreement between latent heat fluxes assessed by drone-based thermography and the eddy covariance technique. However, the performance of the applied energy balance model differed and among the different configurations of net radiation assessments (Figs. 3 and 4). Model II Deming regression analyses and associated quality assessments suggest interchangeability between the DAT-TUTDUT model in Rn_mes configuration and the EC technique (Figs. 4 and 5). Applying this configuration, a fine grain spatial analysis of latent heat fluxes suggests relatively low heterogeneity of LE in the studied tropical oil palm plantation (Fig. 6).

Drone-based LE modeling vs. eddy covariance measurements
The confidence intervals of slope and intercept of the Deming regression indicate that the one-source energy balance model DATTUTDUT with Rn_mes configuration is statistically interchangeable with the established EC method for estimating LE fluxes. There are advantages and limitations to both methods. For example, the DATTUTDUT model provides insights into the spatial distribution of LE fluxes within the  full extent of the available LST maps, whereas the EC technique averages the LE fluxes within its footprint to a single value. On the other hand, the DATTUTDUT model is temporally limited to the availability of LST maps, whereas the EC method can measure fluxes continuously over several years once the equipment is in place. The DATTUTDUT model with Rn_mes configuration further requires additional measurements of short-and long-wave radiation. In our study, Figure 5. Confidence intervals for intercept and slope of Deming regression for the different LE estimation approaches compared with EC measurements. x level for the bias is the mean of the established EC reference method. The intercept is displayed in W m −2 . these measurements were taken with the EC equipment, but future stand-alone drone approaches are possible by using onboard miniaturized radiation sensors (Castro Aguilar et al., 2015;Suomalainen et al., 2018). However, the accuracy of such onboard radiation sensors should first be tested against reference methods, e.g., visually by scatter or intercomparison plots (Castro Aguilar et al., 2015;Suomalainen et al., 2018) or with a model II regression procedure evaluating the interchangeability of methods and measurements (Passing and Bablok, 1983). The DATTUTDUT model with the Rn_sw configuration showed a significant proportional error compared to EC measurements, which was mainly rooted in the high variance of the Rn_sw estimates. The short-wave irradiance measurements used in this study were stored as 10 min averages that probably did not represent the high level of irradiance variations in the tropical study area adequately. Previous studies have pointed out that R n derivation based on short-wave irradiance measurements is challenging as long-wave radiation budgets are often completely independent from their shortwave counterparts (Hoffmann et al., 2016). Estimation errors in long-wave radiation budgets have, for example, been reported to be related to high relative air humidity, when some of the original model assumptions are no longer met  (Hoffmann et al., 2016). We observed a negative correlation (r 2 = 0.46) between incoming long-wave irradiance and relative humidity and assume that the high relative humidity in our tropical study area may have affected the determination of R n when using the Rn_sw configuration through inaccuracies in estimating long-wave radiation budgets and therefore causing the observed significant continuous errors. Since we recorded the data during very different times of day and weather situations, the short-wave irradiance-based approach might not be the most adequate means of R n derivation. However, this approach can be very useful for measurements without the presence of clouds or high levels of relative humidity. We thus also consider the Rn_sw configuration valuable for future research, particularly because measurements of incoming short-wave radiation are much easier to implement than assessing complete short-and long-wave radiation budgets as necessary for the Rn_mes configuration. The application of the Rn_sw configuration for a one-source energy balance model such as DATTUTDUT was also tested in two previous studies, with similar results to our study, i.e., a reduction of errors compared to its original formulation with fully modeled Rn_mod (Brenner et al., 2018;Xia et al., 2016). Lastly, the model configuration Rn_mod did not produce accurate LE estimates for the DATTUTDUT model, as many of the basic assumptions for fully modeled R n determination are not met in tropical environments such as our equatorial study area. As such, the sky is often cloudy, while haze frequently occurs during periods without rainfall. Even if no clouds are visible, relative humidity is often high, which interferes with the clear-sky assumptions of the Rn_mod configuration (Still et al., 2019).
The relatively simple DATTUTDUT model produced precise LE estimates compared to eddy covariance reference measurements. Similar conclusions were reached by Brenner et al. (2018), where DATTUTDUT marginally outperformed the more complex TSEB (two-source energy balance) model. On the other hand, contrasting observations were made by Xia et al. (2016) in vineyards with more extreme temperature divergences between soil and vegetation, where the TSEB model produced more precise estimates of LE than the DAT-TUTDUT model. This was explained by the better physical representation of energy and radiative exchange in the more complex TSEB model. The authors further point out that R n determination is not the only source of error in the DATTUT-DUT model (Xia et al., 2016). However, limitations of the presented methods compared with the reference EC method still exist. As such, the thermography-based recording process for land surface temperatures can be affected by water vapor, haze or dust, which increase atmospheric emissivity. Also, wind and turbulence cooling effects that compete with evaporative cooling are not captured in this approach.
We used the Bowen ratio method to close the energy balance for the reference EC measurements. As reported by Xia et al. (2016), agreement between measured EC and modeled LE estimates could potentially be increased by using the residual method from Twine et al. (2000) for energy balance closure. Further potential improvements include the aerial sampling alignment with the EC measurement logging cycles. We compared snapshot measurements of LST to 30 min averages of EC measurements for the corresponding times in an environment where key variables such as solar irradiance can change very quickly. Better matching the measurement cycle duration may further improve agreement between the methods and was already suggested in a previous study (Brenner et al., 2018). Further, in our study the aerial-derived LST images represented only the center area of the (at times quite variable and large) EC footprint. Covering the whole potential area of the footprint in all directions could also increase agreement between the measurements but would require an even higher flight altitude or longer flight times to cover the whole area; both options would reduce the number of temporal replicates and increase errors from measurements and processing but could nonetheless be viable approaches for other research questions.
Only few previous studies have demonstrated applicability and limitations of estimating LE with energy balance models from non-satellite data. In these studies, LSTs were, for example, recorded from drones for European grasslands and croplands (Brenner et al., 2018;Hoffmann et al., 2016) and from drones or airplanes for taller vegetation including olive orchards and vineyards (Ortega-Farías et al., 2016;Xia et al., 2016). Our study adds to this an application of these models in a tropical environment, for higher vegetation (i.e., oil palm) and across variable times of day and weather conditions. Generally, the equatorial study site was rather challenging due to high temperatures, high humidity and frequent occurrence of haze, as well as for logistical reasons. Additionally, many previous drone-based studies were conducted on grasslands (e.g., Brenner et al., 2017Brenner et al., , 2018 or on lowgrowing crops such as wheat fields (Hoffmann et al., 2016) but not on crops with a rather complex canopy structure such as oil palm. On the other hand, our study site showed large temperature differences between soil and canopy, which simplified the distinguishing of each fraction. We further ana-lyzed for the first time whether models and EC measurements based on drone data can be used interchangeably, as our large sample size of n = 61 flights allowed for a method comparison based on a model II Deming regression (Legendre and Legendre, 2003). We conclude that this is the case for some model configurations, above all for the DATTUTDUT model with Rn_mes configuration.

Spatial distribution of latent heat fluxes
A particular strength of drone-based thermal imagery is the high spatial resolution which allows for spatially explicit assessments of evapotranspiration, within and potentially also beyond the footprints of EC towers. The outlines of the single oil palm canopies are clearly visible in the LE flux map (Fig. 6), with the highest LE fluxes occurring in the center of the oil palm canopies. We assume that this spatial pattern is caused by an increased local LAI in the centers of the oil palm canopies, while leaf area density decreases towards the outer canopies. Further, the central areas of oil palm canopies are more exposed to sunlight and wind throughout most of the day, increasing their potential for (evapo)transpiration compared to canopy edges. Mixed pixel effects (among soil and canopy) likely also contribute to the observed lower LE fluxes towards the borders of oil palm canopies. Further contributing factors to higher LE fluxes in the centers of oil palm canopies could be leaf age (with younger leaves in the center) and additional ET from pockets in the axils of pruned leaves along the stem, which contain small water reservoirs and epiphytes Tarigan et al., 2018).
The LE fluxes histogram shows only few pixel values of zero and most pixels closely distributed around the mean. For all R n configurations, mean and median are very similar indicating close to zero skewness. Such a distribution tending towards unimodality is also considered typical for landscapes where ET is highly dominated by one species (Xia et al., 2016).
Drone-based methods have a large untapped potential for ecological applications, e.g., regarding ecohydrological optimization in land-use systems and designing the climatesmart urban landscapes of the future. We see great potential in the drone-based remote sensing applications presented in this study, especially when recent developments in drone-environment interaction, mobile edge computing (potentially aboard the drone) and communication technologies such as LoRaWAN (Long Range Wide Area Network) or 5G are combined (Becerra, 2019;Marchese et al., 2019). Autonomous acquisition of LSTs over EC stations and the surrounding areas can be supplemented by onboard and ground sensors. Energy balance models can then potentially be calculated using edge computing schemes aboard the drone to enable a dense temporal resolution of LST, flux and ET maps in almost real time. This concept can, for example, be used for the attribution of fluxes in mixed species plant communities and the study of edge effects in landscapes, and it can 870 F. Ellsäßer et al.: Predicting evapotranspiration from drone-based thermography be further adapted to detect water stress in agriculture and forests.

Conclusions
Drone-based thermography and subsequent energy balance modeling under certain configurations can be considered a highly reliable method for estimating latent heat flux and evapotranspiration; for some configurations, statistical interchangeability is suggested with the established eddy covariance technique. They thus complement the asset of available methods for evapotranspiration studies by fine grain and spatially explicit assessments.
Data availability. The final data used for the statistical tests were uploaded to the Göttingen Research Online database (https://doi.org/10.25625/JY9AFT, Ellsäßer, 2020). Raw thermal images, orthomosaics and terrain data, georeferenced rasters, and model configurations are available upon request to the corresponding author.
Author contributions. The study was conceptualized by DH in cooperation with H (drone measurements) and AK in cooperation with TJ (eddy covariance measurements). FE led the writing of the paper with help from AR, and DH supervised the work. FE collected and processed the drone data and CS the eddy covariance data. FE conducted data processing, model application, statistical analysis and production of plots in cooperation mainly with DH and AR. FE, DH and AR created a first version of the manuscript, which was further improved in cooperation with all authors.
Competing interests. The authors declare that they have no conflict of interest.