Climatic traits on daily clearness and cloudiness indices

Solar radiation has a crucial role in photosynthesis, evapotranspiration and other biogeochemical processes. The amount of solar radiation reaching the Earth’s surface is a function of astronomical geometry and atmospheric optics. While the first is deterministic, the latter has a random behaviour caused by highly variable atmospheric components such as water and aerosols. In this study, we use daily radiation data (1978–2014) from 37 FLUXNET sites distributed across the globe to inspect for climatic traits in the shape of the probability density function (PDF) of the clear-day (c) and the clearness (k) indices. The analysis was made for shortwave radiation (SW) at all sites and for photosynthetically active radiation (PAR) at 28 sites. We identified three types of PDF, unimodal with low dispersion (ULD), unimodal with high dispersion (UHD) and bimodal (B), with no difference in the PDF type between c and k at each site. Looking for regional patterns in the PDF type, we found that latitude, global climate zone and Köppen climate type have a weak relation and the Holdridge life a stronger relation with c and k PDF types. The existence and relevance of a second mode in the PDF can be explained by the frequency and meteorological mechanisms of rainy days. These results are a frame to develop solar radiation stochastic models for biogeochemical and ecohydrological modelling.


Introduction
Solar radiation drives most physical, chemical and biological processes at the Earth's surface. It is the primary energy source for photosynthesis, evapotranspiration and other biochemical processes (Wu et al., 2016;Mercado et al., 2009). The amount of solar irradiance reaching any place on the Earth's surface at a given time results from the Sun's emission spectrum, Sun-Earth distance, the angle of incidence of solar rays and the atmospheric attenuation of light. The geometry of Earth's orbit and rotation is well-known and can be calculated with high precision. However, atmospheric attenuation of light is strongly affected by atmospheric constituents such as molecular gases, aerosols, water vapour and clouds by reflecting, absorbing and scattering processes (Platt et al., 2012;Wallace and Hobbs, 2006). Aerosols, water vapour and clouds are highly variable in space and time. As a consequence, uncertainty is unavoidable when calculating surface solar radiation because of the high space and time variability of aerosols, water vapour and clouds (Li and Trishchenko, 2001;Chen et al., 2000).
Scientists have tackled the problem of atmospheric light attenuation with mechanistic and statistical approaches. While the former deal with the physical and chemical processes governing light attenuation, the latter use large numbers of observations to infer patterns of variability caused. Two indices are widely used to quantify the random nature of atmospheric light attenuation, the clear-sky index (c) (see Tran, 2013;Harrouni, 2008;Ianetz and Kudish, 2008;Allen et al., 2006;Hansen, 1999;Skartveit and Olseth, 1992;Bendt et al., 1981;Gordon and Hochman, 1984;Liu and Jordan, 1960), defined as the ratio of actual radiation to clean-dry atmosphere radiation, and the clearness index (k) (see Engerer and Mills, 2014;Hollands and Suehrcke, 2013;Ianetz and Kudish, 2008;Polo et al., 2008;Olseth and Skartveit, 1984), which is the ratio of actual radiation to top-of-theatmosphere radiation (i.e. with no atmospheric attenuation). Although c and k can be calculated for any spectral band and time aggregation scale, they are often studied at the hourly or daily time steps and for the shortwave band (e.g. Utrillas et al., 2018;Cañada et al., 2003;Martinez-Lozano et al., 1999). Notice that because of the different physical mechanisms involved in the magnitude, frequency and duration of clouds, water vapour and aerosols across the globe, c and k must show statistical properties related to regional and local climate.
In this paper, we analyse the statistical properties of c and k at 37 FLUXNET sites distributed worldwide (Sect. 2) looking for climate-related variability patterns. We process historical extraterrestrial spectral irradiance data from the SOLID project (Sect. 2) by using mechanistic models -solar geometry and the Beer-Lambert law (Sect. 3) -to remove the deterministic component of historical daily solar radiation observations. Then, we analyse the probability distribution function of c and k in relation to global climate regions, Köppen climate types and Holdridge life zones (Sects. 4 and 5).
Characterising the stochastic behaviour of surface solar radiation is of great importance in many research fields, e.g. photovoltaic electricity generation, photosynthesis, nutrient dynamics in ecosystems, water dynamics in soils, solar power forecasting and forest fire risks (e.g. Muñoz, 2019;Engerer and Mills, 2014). Of special interest are the ecohydrological and biochemical models of Rodríguez-Iturbe and Porporato (2004) and collaborators (e.g. Schaffer et al., 2015;Tamea et al., 2011;Laio et al., 2009;Ridolfi et al., 2008;Nordbotten et al., 2007;Daly et al., 2004a, b;Manzoni et al., 2004;Porporato et al., 2003;D'Odorico et al., 2000), which have made great progress since the beginnings of the 20th century. All these remarkable works, however, have been oriented to water-limited ecosystems, where uncertainty is introduced by the rainfall process only. To study energy-limited ecosystems, daily solar radiation is required as one more external variable driving evaporation and transpiration processes. Solar radiation is also a random variable, since it highly depends on atmospheric transmittance, especially that of clouds . This paper has the purpose of establishing a framework for daily solar radiation characterisation that serves as a base for developing the ecohydrology of energy-limited ecosystems.

Data
Our data set comprises daily observations of incoming solar radiation and rainfall from 37 sites around the world from the FLUXNET data set (Baldocchi et al., 2001;Olson et al., 2004) (Fig. 1). We analyse two spectral bands, the photosynthetically active radiation (PAR) and the shortwave radiation (SW). While SW observations are available at all sites, PAR observations are available at 28 sites only. Sites have different periods of record spanning from 1996 to 2014 and elevations from sea level to 1550 m (Table 1). These sites were selected from an initial set of more than 200 sites after filtering by several criteria such as record length, data quality and spatial coverage of the whole group. FLUXNET PAR data are given as photosynthetic photon flux density (PPFD). The wavelength domain for PPFD in the FLUXNET data set is 400-700 nm (Olson et al., 2004) and has units of µmol m −2 s −1 . We convert PPFD to PAR irradiance in W m −2 through the relationship 4570 nmol m −2 s −1 = 1 W m −2 (Sager and McFarlane, 1997).
We use the solar spectral irradiance (SSI) at the top of the atmosphere from the "First European Comprehensive Solar Irradiance Data Exploitation project" (SOLID) (Haberreiter et al., 2017;Schöll et al., 2016) as input data for an atmospheric radiation transfer model (Sect. 3). The SOLID spectral time series has a daily time resolution from In order to analyse the spatial climatic patterns of the random component of PAR and SW radiation, we use the Köppen climate classification from Peel et al. (2007), downloaded from the author's web page (https://people.eng.unimelb.edu.au/mpeel/koppen.html, last access: 14 September 2020), and the Holdridge life zones (Holdridge, 1947(Holdridge, , 1967 from Leemans (1992), downloaded from UNEP-WCMC (https://www.unep-wcmc. org/resources-and-data/holdridges-life-zones, last access: 14 September 2020).

Methods
Daily radiation amount at a site on the Earth's surface results from integrating instantaneous irradiance over the day length. Surface instantaneous irradiance estimation comprises solar irradiance at the top of the atmosphere (TOA) and the attenuation of light as it passes through the atmosphere for the site and time of interest. The clear-day and the clearness indices are two frequently used measures to assess atmospheric attenuation without dealing with the physics of the atmospheric attenuation process. The clear-day index (c) (also known as clear-sky index, relative clearness index and normalised clearness index) is defined as the ratio of observed radiation to theoretical radiation for a cloudless, clean 576 E. Muñoz and A. Ochoa: Climatic traits in statistical properties of daily photosynthetically active radiation and dry atmosphere (cda) at ground level (Eq. 1). The clearness index (k) is the ratio of observed radiation to theoretical radiation without atmosphere (which amounts to the same as radiation at TOA) (Eq. 2). While k assesses the whole atmospheric effect on radiation, c evaluates the effect of those atmospheric components that are highly variable in space and time, such as clouds, water vapour and aerosols.
The definitions of c and k in Eqs. (1) and (2) can be used for different time aggregation scales, spectral bands, radiation components (i.e. diffuse, beam, global) or surface orientation. We use them for daily global radiation on a horizontal surface for the SW and PAR spectral bands. While estimating k only involves geometric operations (Sect. 3.1), assessing c also requires calculating the cloudless, clean and dry atmosphere attenuation, for which we use the Beer-Lambert law (Sect. 3.2).

Daily radiation at the top of the atmosphere
Integration of the daily SSI over the spectral band of interest (400-700 nm for PAR and 285-280 nm for SW) gives the total spectral irradiance for each band (TSI PAR and TSI SW ) at TOA, as shown in Eq. (3). After some geometric transformations accounting for solar declination (δ) and latitude (φ) (Iqbal, 1983), the daily global radiation on a horizontal surface can be calculated as shown in Eq. (4).
Here E 0 is the eccentricity correction factor of the Earth's orbit, and ω sr is the sunrise hour angle for the day.

Daily surface radiation for a cloudless, clean dry atmosphere
Daily horizontal surface radiation for a cloudless, clean and dry atmosphere (H cda ) is the sum of the corresponding direct (H b ) and diffuse (H d ) components, which we calculate separately. To calculate daily H b and H d , we first model the direct and diffuse instantaneous spectral irradiances at the ground level and then integrate them along the day length and both PAR and SW spectral domains. Following Iqbal (1983), we assume the cloudless, clean and dry atmosphere to be composed by uniformly mixed gases (m) and ozone (o). Using the Beer-Lambert law and integrating, daily H b is calculated as in Eq. (5).
where SSI 0, n, λ is the extraterrestrial spectral irradiance normal to the rays from the sun (obtained from SOLID), γ is the solar altitude varying from sunrise (sr) to sunset (ss), and τ ma, λ is the transmittance due to the molecular absorbers of the cda atmosphere.
For the assumed atmosphere composition τ ma, λ = τ o · τ g , where τ o and τ g are the ozone and the mixed-gas transmittance, respectively (see details in Iqbal, 1983, Sect. 6.14). We arbitrarily assumed forward and backward scatterances of 0.5 and considered only the first pass of radiation through the atmosphere. Although higher reflectances could bring about some subestimation of H , especially during snow-cover periods, we think it is not a critical issue for the sake of this study. Uncertainty caused by these two assumptions will be included in the statistical properties of c. H d can then be calculated by Eq. (6).
Several atmospheric parameters are required by Eqs. (5) and (6). We assume the 1976 US standard atmosphere (NASA, 1976) (sea level pressure of 101.325 kPa, sea level temperature of 288 K and sea level density of 1.225 kg/m 3 ) and the Kasten and Young (1989 , Table II) optical air mass function of solar altitude, which has 336 values for solar altitudes between 0 • and 90 • . Transmittance for ozone and mixed gases is calculated as in Eqs. (7) to (9).
Here m r is the relative air mass at standard pressure, m a is the relative air mass at actual pressure, k o and k g are the absorption attenuation coefficient for oxygen and mixed gases, and l o is the amount of ozone in centimetres (at normal temperature and pressure, NTP). We calculate k o, λ for any λ value using the Leckner (1978) interpolation of the classic Vigroux (1953) data. For calculating l o we interpolate, for each latitude and day of the year (doy) of interest, the Table 5.3.2  from Iqbal (1983), which gives the monthly total amount of ozone in a vertical column of air for several latitudes (Iqbal's  table is a reproduction of Table 4.2 from Robinson, 1966, p.114). k o,λ is calculated by interpolating Table 6.13.1, which is a reproduction of Table 4 in Leckner (1978, p.146).

Statistical properties of k and c
After the process is described in Sect. 3.1 and 3.2, we calculate daily time series of c and k by using the expressions in Eqs. (1) and (2) for the SW and PAR spectral bands. Then, we estimate the mean annual cycle and empirical probability density functions (PDFs) of H , c and k for both bands. We separate the data samples of c and k by humid and dry days, using precipitation data as a proxy of cloudiness and water vapour in the atmosphere. Finally, we examine the seasonality of c and k by comparing the cumulative distribution function (CDF) of each month with the CDFs of the other months. Comparison is carried out visually and tested by using the Kolmogorov-Smirnov (KS) and the Anderson-Darling (AD) goodness-of-fit tests (Dodge, 2008;Pearson, 1900;Scholz and Stephens, 1987). KS is maybe the more frequently used test for comparing distributions. AD makes a good complement to the KS test because it is more reliable to detect differences at the extremes of the distribution (Engmann and Cousineau, 2011).

Results
Plots of the time series and the mean annual cycle of daily PAR, c and k at AT-Neu -a site with long and high-quality records -are shown in Fig. 2. Analogous figures for all studied sites are available in the Supplement. Although both PAR and SW radiation and their corresponding c and k exhibit annual seasonality in the maximum values, they take values spanning over their whole domain nearly all year round. Separating the data into the rainy days and the dry days allows estimation of the corresponding conditional PDFs and CDFs (Fig. 3 for AT-Neu, all other sites in the Supplement). Re-peating this process for each month reveals the existence (or not) of seasonality in the PDFs and CDFs (Fig. 4 for AT-Neu, all other sites in the Supplement). Inspection of the PDFs of all sites (see Supplement) led us to define three types of PDF according to the shape of the functions, namely unimodal with low dispersion (ULD), unimodal with high dispersion (UHD) and bimodal (B). The three types of PDFs are illustrated in Fig. 5a.
Looking for climate-related regionalisation patterns, the c and k PDF type was compared to the global climate region (Fig. 5b), the Köppen climate classification (Fig. 5c) and the Holdridge life zones (Fig. 6). An in-depth inspection of the plots of all sites allowed us to set out the following statements: (a) the same behaviour is observed for c and k at each site, (b) latitude is not enough to explain the shape of the PDFs, (c) Köppen climate types show a more clear pattern than global climate regions and (d) Holdridge life zones show the clear-cut pattern of variability of the PDF types. Ta-

Discussion
We analysed the stochastic behaviour of the daily clearness and clear-sky indices for the PAR and SW spectral domains. Both indices remove the astronomical seasonality, and c also removes the seasonality of the clean-and dry-air optical mass. Therefore, c is arguably more effective at classifying sites based on their solar radiation characteristics than k. A few values of c and k greater than 1 occurred occasionally. This could happen because of multiple reflections of light, as is probably the case during winter at high latitudes. This could introduce important errors if we were performing a forecast work, but it is not so problematic in this study. The analysis of rainy and dry days revealed that c and k have the same PDF shape for both indices at each site. This result is obvious because the whole data sample was separated by rainy and dry days and both indices are designed to consider the effect of water in the atmosphere. This is also a reason explaining that the PDF shows the same shape type for the SW and the PAR bands. Although radiation itself is a proxy of cloudiness (Nyamsi et al., 2019;Oliphant et al., 2006), we use rainfall to separate the data sample into rainy and dry days keeping in mind the connection of our stochastic description of attenuation with the family of stochastic models of Rodríguez-Iturbe and  and others. This way, bimodal distributions of  c and k vanish in most of the cases when data are divided into rainy and dry days, reinforcing the idea that the modes are strongly related to clear and overcast sky conditions. Exceptions to this pattern occur in DE-Geb and DE-Hai (30 km distant) and AT-Neu (Fig. 4), where the PDFs of dry days still have a bimodal distribution. Despite DE-Geb and DE-Hai being only 30 km distant, local conditions are quite different. While DE-Geb is in a cropland at 162 m a.s.l. with a mean annual rainfall (MAP) of about 500 mm (Anthoni et al., 2004b, a), DE-Hai is in a deciduous broadleaf forest at 430 m a.s.l. with a MAP of 750-800 mm. (Knohl et al., 2003). The high frequency of fog and low stratus occurring in central Germany (Rösner et al., 2020;Egli et al., 2017;Cermak et al., 2009) could be related to the physical mechanisms by which dry days still have bimodal distributions.
Results suggest the occurrence of the three PDF types described in Sect. 4. Our interpretation is that a general bimodal-shaped PDF could explain the three types. Following the nomenclature used in the ecohydrological models of Rodríguez-Iturbe and Porporato (2004) and collaborators (e.g. Tamea et al., 2011;Laio et al., 2009;Nordbotten et al., 2007;Manzoni et al., 2004;Porporato et al., 2003), let λ be the probability of occurrence a rainy day. Let also x represent any of our indices c and k. The PDF of x can then be written as in Eq. (10).
where f R (x) and f D (x) are the conditional PDF of x for rainy and dry days, respectively. The family of this PDF shown in Fig. 7 has members with one and two modes. The UHD-type PDF emerges in the marginal function f (x) when the two modes of f R (x) and f D (x) are close to each other. The general model of Eq. (10) is also useful to understand the climate influence on the marginal PDF of c and k. If λ → 0 (i.e. in very dry sites), the c and k PDFs are of the ULD type and result from solar geometry and the permanent constituents of the atmosphere. If λ is not so low, the PDF will be of the UHD type where convection is the main driver of cloud formation and of the B type where cloud dynamics is controlled by larger-scale phenomena such as atmospheric jets and meteorological fronts (Boucher et al., 2013). Obviously, as λ → 1, the c and k PDFs will approximate those of the humid days, no matter the physical mechanisms.

Regionalisation
As shown in Table 2 and Fig. 5, latitude and global climate regions have a weak relation with the PDF types, with ULD occurring in the tropical, warm tropical, subtropical and cool temperate regions; UHD in tropical and subtropical regions; and B-type PDFs in warm tropical, subtropical, cool temperate, and boreal regions. There is also one UHD site in the subpolar region and two ULD sites in the polar region. In summary, there is no clear pattern of PDF type variability when grouped by latitude or global climate regions. The Köppen classification system performs better than global climate regions to capture the PDF type variability (Fig. 5). C climates (Cfa, Cfb, Csa) and Df climates (Dfb and Dfc) have a clear predominance of the B-type PDFs, except for one site in Australia (AU-Tum). B climates (BSh and BSk) and Aw climates (with one exception) are all of the ULD type. Am climate is UHD, and a variety of PDF types occur in the Dwb and Dwc climates.
Plotting the sites on the Holdridge life zone scheme (see Fig. 6) reveals a clear-cut pattern of c and k PDF types. The Holdridge life zone triangle has the advantage of showing several climate variables independently. These variables are humidity, annual precipitation and potential evapotranspiration ratio. Boreal and cool temperate regions in the more humid provinces have PDFs of the B type, tropical regions in the humid provinces show PDFs of the UHD type, and the dry provinces show all ULD-type PDFs. Two isolated sites occur at the top of the triangle, two ULD sites in the polar desert of China and one UHD site in Alaska (see map in Fig. 1).

Conclusions
We inspected 37 sites worldwide for the influence of local climate on statistical properties of clearness and clear-day indices. We identified three types of statistical behaviour according to the PDF shape, namely ULD, UHD and B. The same PDF type was found to occur for c and k at each site for both PAR and SW radiation bands. It was evidenced that latitude is not enough to explain the shape of the PDFs, suggesting that climate could play an important role. Global climate region and the Köppen climate have a stronger relation than latitude with the PDF types. Holdridge life zone classification showed the most clear-cut pattern of variability of the PDF types. We proposed a general mathematical model for the PDF of c or k that has the ULD, UHD and B types as particular cases. This model constitutes an important basis for biogeochemical modelling in energy-limited ecosystems, especially in ecohydrology, but also in meteorology, glaciology, agroclimatology and other research fields.
Author contributions. EM and AO conceived the idea. AO calculated the clear-sky daily radiation. EM did all the statistics. Both EM and AO analysed the results and wrote the manuscript. AO supervised the work.
Competing interests. The authors declare that they have no conflict of interest.