Validation of demographic equilibrium theory against tree-size distributions and biomass density in Amazonia

Predicting the response of forests to climate and land-use change depends on models that can simulate the time-varying distribution of tree sizes within a forest, so-called forest demography models. A necessary condition for such models to be trustworthy is that they can reproduce the tree-size distributions observed within existing forests worldwide. In a previous study, we showed that Demographic Equilibrium Theory (DET) is able to fit tree-diameter distributions for forests across North America, using a single site-specific fitting parameter (μ) which represents the ratio of the rate of mortality to growth for a tree of a reference size. We use a form of DET that assumes tree-size profiles are in a steady state resulting from the balance between a size-independent rate of tree mortality and tree growth rates that vary as a power law of tree size (as measured by either trunk diameter or biomass). In this study, we test DET against ForestPlots data for 124 sites across Amazonia, using Maximum Likelihood Estimation to fit both directly measured trunk diameter data and biomass estimates derived from published allometric relationships. Again, we find that DET fits the observed tree-size distributions well, with best-fit values of the exponent relating growth rate to tree mass giving a mean of φ = 0.71 (0.31 for trunk diameter). This is broadly consistent with the exponents of φ = 0.75 (φ = 1/3 for trunk diameter) predicted by Metabolic Scaling Theory (MST) allometry. The fitted φ and μ parameters also show a clear relationship that is suggestive of life-history trade-offs. When we fix φ to the MST value of 0.75, we find that best-fit values of μ cluster around 0.25 for trunk diameter, which is similar to the best-fit value of 0.22 that we found for North America. This suggests an as yet unexplained preferred ratio of mortality to growth across forests of very different types and locations.

how much emissions need to be reduced to keep global warming within a certain level. These issues have led to the development of more advanced Dynamic Global Vegetation Models (DGVMs), used within ESMs, to more effectively represent vegetation processes (Sitch et al., 2015; Fisher et al., 2018). One of the key advances has been the inclusion of tree size-distributions, which allows better representation of land-use change and recovery from disturbance.
These recent DGVMs broadly follow two different approaches to representing tree size: individual-based models (Shugart et al., 2018) or cohort-based ecosystem demography models (Moorcroft et al., 2001; Longo et al., 2019).
DGVMs also need to balance additional complexity against practical considerations of usability, computer execution time and memory usage. Key issues in the usability of complex numerical models are understanding the effect of many model parameters and the dependence on initial conditions (Moore et al., 2018). To address these issues we have been exploring simplifications to the modelling of forest demography that are both parameter sparse and have steady-state solutions that can be solved for analytically (Moore et al., 2018; Argles et al., 2019).
We follow Demographic Equilibrium Theory (DET) (Muller-Landau et al., 2006b) in assuming that forests are in a steady state with size distributions completely determined by size-dependent functions of tree growth and mortality. Previously we showed that DET was able to fit the large-scale size-distributions of forests in North America (Moore et al., 2018), even though many of these forests are net carbon sinks (and therefore not in a precise steady state). The current study uses the simplest reasonable form of DET, which assumes growth is a power law of size and mortality is constant. This has been shown to be a useful model of underlying demographic processes, with the model parameters correlating with observations (Muller-Landau et al., 2006b; Lima et al., 2016), even though individual forest plots may deviate from the simplifying assumptions. While the growth and mortality functions of a forest are often unknown, DET can provide useful indications of the patterns of the ratio of mortality to growth based on observed tree-size distributions alone (Moore et al., 2018).
Amazonia is one of the largest pools of land carbon on the planet (Feldpausch et al., 2012) and may be vulnerable to climate change (Cox et al., 2000; Brienen et al., 2015). It is therefore vital that DGVMs are able to model this region well. We therefore extend the analysis of Moore et al. (2018) by fitting the DET model to tree trunk-diameter data for this key region, and also to tree mass data derived from allometry, which is even more relevant for ESMs. As a baseline comparison we also fit the Metabolic Scaling Theory of forest demography (MSTF), which assumes that trees of varying sizes fill space in such a way that the size-distribution scales with trunk diameter $D$ as $D^{-2}$ (West et al., 2009).
In Section 2 below we summarise the theoretical basis for DET and also MSTF, deriving analytical formulae for total forest biomass in each case. Section 3 describes the methods and data, and Section 4 describes the results. Finally, the discussion and conclusions are in Sections 5 and 6.
The distribution of tree sizes in a forest can be understood in terms of how the growth and mortality of the trees vary with tree size (Kohyama et al., 2003; Coomes et al., 2003; Muller-Landau et al., 2006b). The number of trees in a given size class (i.e. range of tree size) depends on the number of smaller trees growing into it and the number leaving it due to growing out or dying. The balance of growth and mortality will determine whether the abundance of a size class is increasing, decreasing or in demographic equilibrium (Van Sickle, 1977). At the scale of a whole forest, there is a further balance between the rate of seedling recruitment from seeds (lower boundary condition) and the whole forest mortality. Again this balance will determine if the forest as a whole is gaining or losing mass and/or abundance.
The governing equation for this process is variously known as the one-dimensional drift or continuity equation (Van Sickle, 1977), the Kolmogorov forward equation or the Fokker-Planck equation with the second-order term omitted (Kohyama, 1991):
$$\frac{\partial n(D,t)}{\partial t} + \frac{\partial}{\partial D}\left[n(D,t)\,g(D,t)\right] = -\gamma(D,t)\,n(D,t) \tag{1}$$
where $n$ is the size-distribution (tree density per size class) in trees cm$^{-1}$ ha$^{-1}$ in terms of tree trunk diameter $D$ in cm, $g$ is the trunk diameter growth rate in cm year$^{-1}$, $\gamma$ is the mortality rate per year and $t$ is time in years.

It was shown (Kohyama et al., 2003) that for an unchanging, equilibrium size-distribution, this equation can be integrated as
$$n(D) = n_L\,\frac{g(D_L)}{g(D)}\,\exp\left(-\int_{D_L}^{D}\frac{\gamma(D')}{g(D')}\,dD'\right) \tag{2}$$
where $n_L$ is the value of $n$ at the lower boundary $D_L$, which for forest inventory data is the minimum sampling size (in this study 10 cm).
This equation can be solved exactly if the simplifying assumptions of size-independent mortality $\gamma(D) = \gamma$ and power law growth rate $g(D)$ are used. The growth rate $g(D)$ in cm per year is then
$$g(D) = g_1 D^{\phi} \tag{3}$$
where $g_1$ is a constant with the same value as the growth rate for a tree with trunk diameter of 1 cm. The solution (Muller-Landau et al., 2006b; Lima et al., 2016; Moore et al., 2018) for the size distribution is then the Left-Truncated Weibull Distribution (LTWD)
$$n(D) = n_L\left(\frac{D}{D_L}\right)^{-\phi}\exp\left[\frac{\mu_1}{1-\phi}\left(D_L^{1-\phi} - D^{1-\phi}\right)\right] \tag{4}$$
where $\mu_1 = \gamma/g_1$ is the mortality to growth ratio at $D = 1$ cm. (Note: the units of $\mu_1$ are cm$^{\phi-1}$, but as it is defined for the point $D = D_1 = 1$ cm it can be treated as dimensionless if the size variable $D$ is implicitly a ratio $D/D_1$, which has exactly the same numerical value as $D$ but is dimensionless.)
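As a minimal numerical sketch of this steady-state solution, assume the LTWD form n(D) = n_L (D/D_L)^(-φ) exp[µ_1/(1-φ) (D_L^(1-φ) - D^(1-φ))] that follows from constant mortality and power law growth. The snippet evaluates the density and compares its integral with the closed form obtained by direct integration. The function names and the parameter values (n_L = 30, µ_1 = 0.25, φ = 1/3, D_max = 500 cm) are illustrative assumptions, not fitted results.

```python
import math


def ltwd_density(D, n_L, D_L, mu1, phi):
    """Tree density per size class at diameter D (hypothetical helper)."""
    c = 1.0 - phi
    return n_L * (D / D_L) ** (-phi) * math.exp(mu1 / c * (D_L ** c - D ** c))


def trapezoid(f, a, b, steps=20000):
    """Plain trapezoid rule, adequate for this smooth integrand."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


# Illustrative values only (assumptions, not fitted results).
n_L, D_L, mu1, phi, D_max = 30.0, 10.0, 0.25, 1.0 / 3.0, 500.0
c = 1.0 - phi

# Trees per hectare between D_L and D_max, numerically and in closed form.
numeric_trees = trapezoid(lambda D: ltwd_density(D, n_L, D_L, mu1, phi), D_L, D_max)
analytic_trees = n_L * D_L ** phi / mu1 * (
    1.0 - math.exp(mu1 / c * (D_L ** c - D_max ** c))
)
print(numeric_trees, analytic_trees)
```

The agreement of the two numbers is a quick check that the density has been coded consistently with the stated form.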

This solution is also applicable for other size variables such as tree dry mass $m$ in kg
$$n(m) = n_L\left(\frac{m}{m_L}\right)^{-\phi_m}\exp\left[\frac{\mu_{m1}}{1-\phi_m}\left(m_L^{1-\phi_m} - m^{1-\phi_m}\right)\right] \tag{5}$$
where $m_L$, $\mu_{m1}$ and $\phi_m$ are the mass equivalents of $D_L$, $\mu_1$ and $\phi$.
The LTWD has been shown to be a good description of tree trunk diameter distributions in a variety of tropical forests (Muller-Landau et al., 2006b; Lima et al., 2016) and in temperate forests in the US over larger scales (Moore et al., 2018). When these distributions are fitted to data, either both $\phi$ and $\mu_1$ are treated as fitting parameters, or only $\mu_1$ is fitted and $\phi$ is fixed to the values used in MST allometry (Niklas and Spatz, 2004; West et al., 2009) of $\phi = 1/3$ and $\phi_m = 3/4$.

Total Biomass Density for DET
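The total biomass density $M$ (kg of dry tree mass per hectare) of the LTWD tree mass distribution can be obtained by integrating $m\,n(m)$, with $n(m)$ from Eq. (5), between the lower boundary $m_L$ and infinity
$$M = \frac{n_L\,m_L^{\phi_m}}{\mu_{m1}}\left(\frac{1-\phi_m}{\mu_{m1}}\right)^{x}\exp\left(\frac{\mu_{m1}\,m_L^{1-\phi_m}}{1-\phi_m}\right)\Gamma\left(x+1,\ \frac{\mu_{m1}\,m_L^{1-\phi_m}}{1-\phi_m}\right) \tag{6}$$
where $\Gamma$ is the upper incomplete Gamma function and $x = 1/(1-\phi_m)$.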
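As real forests do not satisfy the assumption of an infinite maximum tree size, this can lead to errors in the calculated biomass density. A correction can be found in terms of $m_{max}$, the largest tree mass in the distribution
$$M = \frac{n_L\,m_L^{\phi_m}}{\mu_{m1}}\left(\frac{1-\phi_m}{\mu_{m1}}\right)^{x}\exp\left(\frac{\mu_{m1}\,m_L^{1-\phi_m}}{1-\phi_m}\right)\left[\Gamma\left(x+1,\ \frac{\mu_{m1}\,m_L^{1-\phi_m}}{1-\phi_m}\right) - \Gamma\left(x+1,\ \frac{\mu_{m1}\,m_{max}^{1-\phi_m}}{1-\phi_m}\right)\right] \tag{7}$$
In cases where $m_{max}$ is both large and much larger than $m_L$ there will be little difference between Eq. (6) and Eq. (7). Because large trees are statistically rare, $m_{max}$ is a somewhat arbitrary function of the sample size, so the infinite upper bound solution Eq. (6) is expected to be more accurate for larger sample sizes.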
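The effect of the finite upper bound can be explored with a short sketch. It assumes the mass LTWD n(m) = n_L (m/m_L)^(-φ_m) exp[µ_m1/(1-φ_m) (m_L^(1-φ_m) - m^(1-φ_m))] and fixes φ_m = 3/4, so that x + 1 = 5 is an integer and the upper incomplete Gamma function reduces to a finite sum. The closed form used here is obtained by integrating m n(m) directly, and the function names and parameter values are illustrative assumptions rather than values from the fits.

```python
import math


def upper_gamma_int(s, t):
    """Upper incomplete gamma function for a positive integer s."""
    series = sum(t ** k / math.factorial(k) for k in range(s))
    return math.factorial(s - 1) * math.exp(-t) * series


def det_biomass(n_L, m_L, mu_m1, m_max=None):
    """Biomass density (kg/ha) for phi_m = 3/4; m_max=None means no upper bound."""
    phi_m, c, x = 0.75, 0.25, 4  # c = 1 - phi_m, x = 1 / c
    t_L = mu_m1 / c * m_L ** c
    gamma_term = upper_gamma_int(x + 1, t_L)
    if m_max is not None:
        gamma_term -= upper_gamma_int(x + 1, mu_m1 / c * m_max ** c)
    prefactor = n_L * m_L ** phi_m / mu_m1 * (c / mu_m1) ** x * math.exp(t_L)
    return prefactor * gamma_term


def numeric_biomass(n_L, m_L, mu_m1, m_max, steps=20000):
    """Cross-check: trapezoid integration of m * n(m) in log-mass space."""
    phi_m, c = 0.75, 0.25

    def integrand(log_m):
        m = math.exp(log_m)
        n = n_L * (m / m_L) ** (-phi_m) * math.exp(mu_m1 / c * (m_L ** c - m ** c))
        return m * m * n  # m * n(m) dm equals m^2 * n(m) d(ln m)

    a, b = math.log(m_L), math.log(m_max)
    h = (b - a) / steps
    total = 0.5 * (integrand(a) + integrand(b))
    for i in range(1, steps):
        total += integrand(a + i * h)
    return total * h


# Illustrative values only: n_L in trees per kg per ha, masses in kg.
n_L, m_L, mu_m1, m_max = 2.0, 100.0, 0.2, 5.0e4
finite = det_biomass(n_L, m_L, mu_m1, m_max)
infinite = det_biomass(n_L, m_L, mu_m1)
check = numeric_biomass(n_L, m_L, mu_m1, m_max)
print(finite, infinite, check)
```

For these assumed values the finite and infinite bounds differ by only a small fraction, which illustrates why the two expressions agree when the largest tree is much heavier than the truncation mass.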

Metabolic Scaling Theory (MST)
Metabolic scaling theory is a theory of the scaling of organisms with size, based on theories of metabolism, physics and chemistry (West, 1997; Muller-Landau et al., 2006a). This theory uses the predicted scaling of individuals to predict the larger scale patterns and structure of populations and communities. For forests, the scaling of tree photosynthesis and of the vascular structures that transport water is used to predict individual scaling. This is then combined with assumptions from self-thinning about how trees fill space to describe the expected forest size-distribution (Coomes et al., 2003; West et al., 2009). This leads to a power law distribution for trunk diameter
$$n(D) = n_L\left(\frac{D}{D_L}\right)^{-2} \tag{8}$$
and for mass the distribution is almost identical
$$n(m) = n_L\left(\frac{m}{m_L}\right)^{-11/8} \tag{9}$$
where the exponent $-11/8$ follows from combining the $D^{-2}$ distribution with the MST allometry $m \propto D^{8/3}$.

Total Biomass Density for MST
The MST equations also enable the calculation of biomass density (kg of dry tree mass per hectare). In this case only the finite upper bound of m max can be used as the solution goes to infinity as the upper bound goes to infinity.

Forest inventory data
The tree census data used in this study are from the public access permanent sample plots of the RAINFOR network (Peacock et al., 2007). RAINFOR provides a systematic framework for long-term monitoring of the Amazon. The RAINFOR data are stored in the ForestPlots database (https://www.forestplots.net). This database stores measurements (stem diameter, species ID, recruitment, growth, and mortality) of individual trees from hundreds of locations, taken using standardised techniques to allow the behaviour of tropical forests to be measured, monitored and better understood (Lopez-Gonzalez et al., 2011).
We selected 124 open access forest plots (Fig. 1) classified as mixed forest (not monoculture) and old-growth, to most closely match the model assumptions of forests undisturbed by human interference and approximating to equilibrium demography.
The 124 selected plots all had a consistent lower cut-off in measurements at 10 cm trunk diameter. Two available upper montane plots with very few measurements above 10 cm were not included in the 124 plots used, as they did not have enough measurements to allow a reliable fit.

Calculating Dry Tree Mass from Trunk Diameter
The open access plots of the Amazon RAINFOR dataset consist only of trunk diameter values. To estimate the tree mass, the methodology developed by Feldpausch et al. (2012) was used. In that study two functional forms (with and without height) were tested against destructively sampled mass data (trees carefully measured then cut down and weighed) to find the ones which best estimated mass from trunk diameter. It was found that mass estimation accuracy doubled when including height, even if the height had in turn been estimated from trunk diameter. Out of three choices of height functional form (power law, Weibull-H and exponential), Feldpausch et al. (2012) found the Weibull-H form Eq. (11) to be the best at estimating mass across multiple size classes. The height $H$ in metres is then
$$H = a\left(1 - \exp\left(-b\,D^{c}\right)\right) \tag{11}$$
with the coefficients $a$, $b$ and $c$ varying geographically between defined allometric regions (see Table S1 in Supplementary Material and Figure 1).
Figure 1. White circles show the location of the forest plots used. The two western regions share common allometry but are split based on rainfall seasonality for analysis purposes.
The regions were defined by geography and substrate origin (Feldpausch et al., 2012). Western Amazonia (Colombia, Ecuador and Peru) being recently weathered Andean deposits, the geologically old Brazilian Shield to the south (Bolivia
The wood specific gravity $\rho_w$ was obtained from the Dryad Global Wood Density Database (https://doi.org/10.5061/dryad.234/1; Zanne et al., 2009). For each tree measurement the $\rho_w$ value used was for that species from the closest available region. Where the species data were unavailable or the species of the measurement had not been recorded, the $\rho_w$ value of the genus was used, based on an average of all trees in the Dryad database in that genus. Trees without genus data were estimated from family data, and any remaining measurements where $\rho_w$ was still unknown were set to the average $\rho_w$ of the trees in that same forest plot with known $\rho_w$ values.
3.3 Fitting methodology
As in our previous study (Moore et al., 2018), Maximum Likelihood Estimation (MLE) was used to find the parameters that give the best fit for both the Left-Truncated Weibull distribution derived from DET (DET-LTWD) and the Metabolic Scaling Theory (MST) distribution. MLE is an effective method for parameter fitting of forest size distributions (Taubert et al., 2013; White et al., 2008).
Maximising the log-likelihood $L$ results in a more numerically tractable summation of terms, rather than the product of terms obtained from using the likelihood directly. $L$ in terms of the probability density function (pdf) $f(D)$ is then
$$L = \sum_{i=1}^{N}\ln f(D_i)$$
where $D_i$ is the tree trunk diameter measurement of stem $i$ in the dataset and $N$ is the number of stems.
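The equivalence of the summed log-likelihood and the product form can be illustrated with a small sketch. It assumes the DET-LTWD pdf f(D) = µ_1 D^(-φ) exp[µ_1/(1-φ) (D_L^(1-φ) - D^(1-φ))], which is the LTWD normalised from D_L to infinity. The diameters below are made-up numbers for illustration, and the helper names are hypothetical.

```python
import math


def ltwd_pdf(D, D_L, mu1, phi):
    """Assumed DET-LTWD probability density, normalised on [D_L, infinity)."""
    c = 1.0 - phi
    return mu1 * D ** (-phi) * math.exp(mu1 / c * (D_L ** c - D ** c))


def log_likelihood(diameters, D_L, mu1, phi):
    """Sum of log densities; avoids underflow of the raw product for large N."""
    return sum(math.log(ltwd_pdf(D, D_L, mu1, phi)) for D in diameters)


# Made-up trunk diameters in cm (illustration only, not plot data).
diameters = [10.5, 12.0, 15.3, 22.8, 31.0, 47.5]
D_L, mu1, phi = 10.0, 0.25, 1.0 / 3.0

L = log_likelihood(diameters, D_L, mu1, phi)
product = math.prod(ltwd_pdf(D, D_L, mu1, phi) for D in diameters)
print(L, math.log(product))
```

For six stems the product is still representable, but for thousands of stems it would underflow to zero in floating point, which is the practical reason for working with the sum.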
The data were fitted by plot, by allometric region (an aggregated dataset of all plots in that region), by country (again an aggregation of plots) and for all the data from all 124 plots grouped together as one large dataset. This allows the study of both the individual plots and the larger scale patterns across South America.

Maximum Likelihood Estimation (MLE) for Demographic Equilibrium Theory (DET)
The probability density function (pdf) $f(D)$ for the DET-LTWD, in terms of tree trunk diameter $D$ and minimum tree size $D_L$, is related to the number density distribution $n(D)$ (Eq. 4) by
$$f(D) = \frac{A}{N}\,n(D) = \mu_1 D^{-\phi}\exp\left[\frac{\mu_1}{1-\phi}\left(D_L^{1-\phi} - D^{1-\phi}\right)\right]$$
where $N$ is the total number of trees in the dataset being fitted, $\phi$ is the growth scaling power from Eq. (3) and $A$ is the area of the plots containing the trees sampled in the dataset. This equation is equivalent to the standard form of the LTWD
$$f(D) = \frac{c}{\lambda}\left(\frac{D}{\lambda}\right)^{c-1}\exp\left[\left(\frac{D_L}{\lambda}\right)^{c} - \left(\frac{D}{\lambda}\right)^{c}\right]$$
where $c = 1-\phi$ is the shape parameter and $\lambda = (c/\mu_1)^{1/c}$ the scale parameter.
We fit DET-LTWD twice, first with both parameters $\phi$ and $\mu_1$ allowed to vary as fitting parameters and second with the growth scaling parameter $\phi$ fixed to the MST allometry values ($\phi = 1/3$ and $\phi_m = 3/4$; see Niklas and Spatz (2004) and West et al. (2009)). Fixing $\phi$ gives a DET-LTWD model that follows just one assumption of MST (the allometry), and so acts as a way of isolating the effect of the second MST assumption of space-filling when comparing DET-LTWD and MST fits.

For this situation, where we are only aiming to find the parameter $\mu_1$ and $\phi$ is assumed, the MLE can be solved analytically (Kizilersu et al., 2016)
$$\mu_1 = \frac{N\,c}{\sum_{i=1}^{N}\left(D_i^{c} - D_L^{c}\right)} \tag{16}$$
where $c = 1-\phi$. The equations are the same for tree mass, just with the symbols appropriately substituted ($m$ for $D$ etc.).
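A minimal sketch of the one-parameter fit is given below. It assumes the closed-form estimator µ_1 = N c / Σ(D_i^c - D_L^c) with c = 1 - φ, which is what setting the derivative of the DET-LTWD log-likelihood with respect to µ_1 to zero gives. The synthetic data are drawn by inverting the LTWD cumulative distribution; the sampler, the seed and the parameter values are assumptions made for illustration only.

```python
import math
import random


def sample_ltwd(n, D_L, mu1, phi, rng):
    """Draw n diameters from the assumed DET-LTWD by inverse-CDF sampling."""
    c = 1.0 - phi
    samples = []
    for _ in range(n):
        u = rng.random()  # uniform on [0, 1), so 1 - u is never zero
        samples.append((D_L ** c - c / mu1 * math.log(1.0 - u)) ** (1.0 / c))
    return samples


def fit_mu1(diameters, D_L, phi):
    """Analytic maximum-likelihood estimate of mu1 for a fixed phi."""
    c = 1.0 - phi
    return len(diameters) * c / sum(D ** c - D_L ** c for D in diameters)


rng = random.Random(42)
true_mu1, phi, D_L = 0.25, 1.0 / 3.0, 10.0
data = sample_ltwd(20000, D_L, true_mu1, phi, rng)
mu1_hat = fit_mu1(data, D_L, phi)
print(mu1_hat)
```

With a known generating distribution, the recovered value can be compared with the true one, which is the same style of check on synthetic data that is described for the two-parameter algorithm.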

Two Parameter Fit
For the two parameter case, where both $\phi$ and $\mu_1$ are fitted, we calculate the log-likelihood $L$ as follows
$$L = N\ln\mu_1 - \phi\sum_{i=1}^{N}\ln D_i + \frac{\mu_1}{c}\sum_{i=1}^{N}\left(D_L^{c} - D_i^{c}\right) \tag{17}$$
with $c = 1-\phi$. Substituting Eq. (16) into Eq. (17) creates a function only of $c$ and therefore of $\phi$. This allows minimisation of $-L$ in terms of $\phi$ using Brent's bounded algorithm (Brent, 1973). Once the optimum $\phi$ has been found, $\mu_1$ can be calculated from Eq. (16). As Eq. (16) is included in the minimisation of $-L$, we are in fact solving for both parameters at once and finding the maximum of $L$. This algorithm was tested both with real data and with computer-generated data from known LTWD distributions, by plotting the $L$ values against $\phi$ and $\mu_1$ to confirm that the maximum was found correctly.
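The profile-likelihood idea can be sketched as follows, under stated assumptions. For each trial φ, µ_1 is set to its analytic optimum N c / Σ(D_i^c - D_L^c), which reduces the DET-LTWD log-likelihood to a function of φ alone. The study used Brent's bounded algorithm; this sketch swaps in a plain golden-section search so that it needs only the standard library. The search bounds, seed, sample size and helper names are illustrative assumptions.

```python
import math
import random


def sample_ltwd(n, D_L, mu1, phi, rng):
    """Inverse-CDF sampling from the assumed DET-LTWD (synthetic test data)."""
    c = 1.0 - phi
    return [(D_L ** c - c / mu1 * math.log(1.0 - rng.random())) ** (1.0 / c)
            for _ in range(n)]


def profile(phi, diameters, D_L, sum_log_D):
    """Return (-L, mu1) with mu1 at its analytic optimum for this phi."""
    c = 1.0 - phi
    n = len(diameters)
    mu1 = n * c / sum(D ** c - D_L ** c for D in diameters)
    # At the optimal mu1 the exponential term of the log-likelihood sums to -n.
    log_l = n * math.log(mu1) - phi * sum_log_D - n
    return -log_l, mu1


def golden_section_min(f, lo, hi, tol=1e-5):
    """Bounded 1-D minimiser (stand-in for Brent's bounded method)."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    x1, x2 = b - ratio * (b - a), a + ratio * (b - a)
    f1, f2 = f(x1), f(x2)
    while b - a > tol:
        if f1 < f2:
            b, x2, f2 = x2, x1, f1
            x1 = b - ratio * (b - a)
            f1 = f(x1)
        else:
            a, x1, f1 = x1, x2, f2
            x2 = a + ratio * (b - a)
            f2 = f(x2)
    return 0.5 * (a + b)


def fit_two_parameters(diameters, D_L, bounds=(0.0, 0.9)):
    """Fit phi by minimising -L, then recover mu1 from the analytic formula."""
    sum_log_D = sum(math.log(D) for D in diameters)
    phi_hat = golden_section_min(
        lambda phi: profile(phi, diameters, D_L, sum_log_D)[0], *bounds)
    return phi_hat, profile(phi_hat, diameters, D_L, sum_log_D)[1]


rng = random.Random(7)
data = sample_ltwd(20000, 10.0, 0.25, 1.0 / 3.0, rng)
phi_hat, mu1_hat = fit_two_parameters(data, 10.0)
print(phi_hat, mu1_hat)
```

A golden-section search only guarantees a local minimum inside the bounds, so plotting the profile of $-L$ against φ, as described above, remains a sensible check.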
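Once the parameters $\mu_1$ and $\phi$ are estimated, $n_L$, the tree density per size class at $D_L$, can be obtained from these parameters and the known quantities of the total number of trees $N$ and the plot area $A$. This can be derived by integrating the equation for $n$ (Eq. 4) from $D_L$ to $D_{max}$, to give
$$\int_{D_L}^{D_{max}} n(D)\,dD = \frac{n_L\,D_L^{\phi}}{\mu_1}\left\{1 - \exp\left[\frac{\mu_1}{c}\left(D_L^{c} - D_{max}^{c}\right)\right]\right\}$$
and noting that the observed number of trees per unit area $N/A$ is identical to the integral, we get
$$n_L = \frac{N\,\mu_1}{A\,D_L^{\phi}\left\{1 - \exp\left[\frac{\mu_1}{c}\left(D_L^{c} - D_{max}^{c}\right)\right]\right\}}$$
where $c = 1-\phi$ and $D_{max}$ is the largest tree size in the dataset. For this study it was found that, as $D_{max} \gg D_L$ for most cases (and $c$ is never much larger than $\mu_1$), $n_L$ can be assumed to be
$$n_L \approx \frac{N\,\mu_1}{A\,D_L^{\phi}}$$
Again, the equations are the same for tree mass, just with the symbols appropriately substituted ($m$ for $D$ etc.).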
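The size of the approximation can be gauged with a short sketch. It assumes the LTWD form n(D) = n_L (D/D_L)^(-φ) exp[µ_1/c (D_L^c - D^c)] with c = 1 - φ, whose integral from D_L to D_max is n_L D_L^φ/µ_1 times one minus the exponential tail term. The plot values used (550 trees on 1 ha, largest tree 150 cm) are invented for illustration, and the function names are hypothetical.

```python
import math


def n_L_exact(N, A, D_L, D_max, mu1, phi):
    """n_L from matching the integral of n(D) over [D_L, D_max] to N / A."""
    c = 1.0 - phi
    tail = math.exp(mu1 / c * (D_L ** c - D_max ** c))
    return N * mu1 / (A * D_L ** phi * (1.0 - tail))


def n_L_approx(N, A, D_L, mu1, phi):
    """Same quantity when D_max is taken to be effectively infinite."""
    return N * mu1 / (A * D_L ** phi)


# Invented plot values (assumptions for illustration only).
N, A, D_L, D_max, mu1, phi = 550, 1.0, 10.0, 150.0, 0.25, 1.0 / 3.0

exact = n_L_exact(N, A, D_L, D_max, mu1, phi)
approx = n_L_approx(N, A, D_L, mu1, phi)
print(exact, approx, (exact - approx) / exact)
```

For these assumed values the two estimates differ by far less than one percent, in line with the statement that the simpler expression is adequate when the largest tree is much bigger than the sampling threshold.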

Maximum Likelihood Estimation (MLE) for Metabolic Scaling Theory
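From the equation for number density $n$ (Eq. 8) the pdf for MST is
$$f(D) = \frac{D^{-2}}{D_L^{-1} - D_{max}^{-1}}$$
where $D_{max}$ is the largest tree size in the dataset. As all the quantities are known, there are no free parameters to fit and all that needs to be done is to calculate $n_L$, the tree density per size class at $D_L$
$$n_L = \frac{N}{A\,D_L^{2}\left(D_L^{-1} - D_{max}^{-1}\right)}$$
Similarly the MST pdf for mass from Eq. (9) is
$$f(m) = \frac{3}{8}\,\frac{m^{-11/8}}{m_L^{-3/8} - m_{max}^{-3/8}}$$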
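Because the MST distribution has no free parameters, its likelihood can be evaluated directly. The sketch below assumes the D^(-2) power law truncated to the interval from D_L to D_max, so the pdf is D^(-2) divided by (1/D_L - 1/D_max). The diameters, tree count and area are made-up values and the function names are hypothetical.

```python
import math


def mst_pdf(D, D_L, D_max):
    """Assumed MST density: D^-2 normalised on [D_L, D_max]."""
    return D ** -2 / (1.0 / D_L - 1.0 / D_max)


def mst_n_L(N, A, D_L, D_max):
    """Tree density per size class at D_L implied by N trees on area A."""
    return N / (A * D_L ** 2 * (1.0 / D_L - 1.0 / D_max))


def mst_log_likelihood(diameters, D_L):
    """Log-likelihood with D_max taken as the largest observed diameter."""
    D_max = max(diameters)
    return sum(math.log(mst_pdf(D, D_L, D_max)) for D in diameters)


# Made-up diameters in cm and plot area in ha (illustration only).
diameters = [10.5, 12.0, 15.3, 22.8, 31.0, 47.5, 80.0]
D_L, A = 10.0, 0.05

n_L = mst_n_L(len(diameters), A, D_L, max(diameters))
log_l = mst_log_likelihood(diameters, D_L)
print(n_L, log_l)
```

The resulting log-likelihood can be compared with the DET-LTWD values through an information criterion, counting zero fitted parameters for MST.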

Estimating Plot and Regional Biomass Density
To test the biomass density equations, we used the results of the MLE fits to calculate the biomass density predicted by Eq. (7) and Eq. (10). The biomass density predicted by these equations is then compared to the allometric biomass density (i.e. the sum of the mass of all trees in a dataset divided by the area of the plots). This comparison then provides a goodness of fit measure that is relevant to climate.
We chose to measure the biomass density as a function of size in terms of the total mass per unit area from trees with masses equal to or greater than a given size. The main reason for this is that the forest plot data only sampled trees with a trunk diameter equal to or greater than 10 cm. Therefore it makes little sense to measure the biomass density below a given size, as would be the case with a traditional cumulative distribution function. This approach has the second benefit that the mass of a forest above a given size makes it much easier to see the contribution of the dominant larger trees to total biomass (Bastin et al., 2018).
A correction term is added to Eq. (7) and Eq. (10) to make sure the biomass density evaluates correctly at the upper boundary (the mass of the largest tree $m_{max}$). This is because these equations only evaluate the mass up to but not including the trees with a mass equal to the largest value in the dataset. Therefore, to comply with the definition above it is necessary to add the mass of the largest trees back into the total biomass.
As the large trees are so rare this correction will be equivalent to adding just one tree of the largest mass m max in the dataset divided by A, the total area of plots in the dataset.

Eq. (25) is used for all biomass density estimates where the upper bound of tree size is assumed finite (based on $m_{max}$), while Eq. (6) is used for the cases where the simplifying assumption of infinite tree size is made.

Mass Distribution
When the mass data were estimated from the trunk diameter measurements using the methodology of Feldpausch et al. (2012), it was noticed that the mass size-distribution (for all regions and plots) had a peak, which was not present in the trunk diameter distribution. We found this to be an artefact of the conversion from trunk diameter to mass in a distribution that was by definition already truncated in trunk diameter.
Fig. 2a shows the relationship between trunk diameter and tree mass for the whole dataset, illustrating that for any particular trunk diameter there is a range of tree masses. This variation in tree mass is caused by the differences in wood density between species and the variation in height allometry between regions (see Eq. (11) and supplementary Table S1). If instead the dataset shown in Fig. 2a is truncated in mass rather than trunk diameter, then the truncation would follow the horizontal dotted line and there would be data in the region between that line and the diagonal dotted line. So in effect there is "missing" data for low mass trees, which is a result of the trunk diameter observations having a minimum sampling size (truncation point) and there being a range of tree masses for trees with a given trunk diameter. This hypothesis is further supported by increasing the trunk diameter truncation point, as shown in Fig. 2b. As the truncation point is increased the "peak" moves to higher mass.

Eliminating the Mass Peak
When working with mass data the peak was eliminated from fitting by creating 40 bin edges (39 bins) in log-space (base e) from the smallest to the largest tree in the dataset. These edges define the range of each bin, and the value of each bin was selected as the midpoint in log-space. The data were then binned using these edges. Once the data were binned, the bin with the highest frequency was identified. The value of this bin was then used as the truncation point for the dataset when fitting to the dataset distribution. The binning was used purely to identify the peak and to plot the data, and was not used during the MLE fitting process.
Figure 2. The effect of truncating data measured in trunk diameter and then converting to mass using allometry. In a), the mass for each tree is shown in terms of its trunk diameter. If the data had been truncated in mass there would be data in the triangle marked by the intersection of the dotted lines. This truncation effectively leads to missing data in the mass distribution, as seen in b). The mass distribution should constantly decrease with increasing mass but instead rises to a peak and then decreases, due to incomplete data for the low mass end of the distribution. This peak can be seen to be an artefact of the trunk diameter truncation point. When the trunk diameter truncation point is increased the mass distribution peak moves with the truncation point.
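The peak-finding step can be sketched as follows. The sketch assumes that "highest frequency" means the raw count of trees per log-spaced bin, and that the truncation point is the log-space midpoint of that bin; the function name and the toy masses are hypothetical and chosen so that the peak bin is easy to verify by hand.

```python
import math


def mass_truncation_point(masses, n_edges=40):
    """Log-space midpoint of the most populated of (n_edges - 1) log bins."""
    lo, hi = math.log(min(masses)), math.log(max(masses))
    n_bins = n_edges - 1
    width = (hi - lo) / n_bins  # assumes the masses are not all identical
    counts = [0] * n_bins
    for m in masses:
        # The largest tree sits on the last edge, so clamp it into the last bin.
        k = min(int((math.log(m) - lo) / width), n_bins - 1)
        counts[k] += 1
    peak = max(range(n_bins), key=counts.__getitem__)
    return math.exp(lo + (peak + 0.5) * width)


# Toy masses in kg (illustration only): most trees fall in the bin around 10 kg.
masses = [1.0] + [10.4] * 50 + [30.0] * 20 + [100.0]
m_P = mass_truncation_point(masses)

# Trees lighter than the truncation point are excluded before fitting.
kept = [m for m in masses if m >= m_P]
print(m_P, len(kept))
```

As in the text, the binning here only locates the truncation point; any subsequent likelihood fit would use the individual retained masses rather than the binned counts.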

Trunk Diameter Results
Fitting the DET-LTWD and MST equations to the trunk diameter size distributions showed a consistent pattern for all the geographical aggregations of plot data. In all cases, except the Guyana Shield, the DET-LTWD solutions (both one and two parameter versions) captured the curvature of the observed size-distribution more closely than the MST solution (Fig. 3a; see also supplementary material Fig. S1 and S2). In particular the MST model deviated from the observed data at large trunk diameters.

The Guyana Shield region had only four small plots, totalling 819 trees, which may explain why it was hard to distinguish the best fitting model visually (Supplementary Fig. S2).
The two parameter DET-LTWD fits gave a fitted value of the growth scaling power $\phi$ between 0.137 and 0.546 (Table 1) and five of the twelve regions were within 0.05 of the theoretical value of 1/3 (i.e. $\phi$ in the range 0.28-0.38). In general the one and two parameter DET-LTWD solutions were quite similar in terms of the appearance of the fit on the distribution plots. This finding was confirmed using the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) (Table 2). Both the AIC and BIC are ways of determining which of several models has the best "goodness of fit", with a lower value indicating a better fit. Both criteria are calculated from the log-likelihood and the number of fitting parameters, with a difference of 10 being the threshold where the evidence is considered to be very strongly against the higher scoring model (Kass and Raftery, 1995). BIC penalises a higher number of fitting parameters more than AIC.
[Figure residue: y-axis "Tree Number Density (Trees cm^-1 ha^-1)"; legend entries "Data", "Parameter DET-LTWD Fit" (two entries) and "MST Fit".]
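The model comparison can be illustrated with a small sketch using the standard definitions AIC = 2k - 2L and BIC = k ln(N) - 2L, where k is the number of fitted parameters, N the sample size and L the maximised log-likelihood. The log-likelihood values below are invented to show the mechanics; the model labels and the helper names are hypothetical.

```python
import math


def aic(log_l, k):
    """Akaike Information Criterion."""
    return 2 * k - 2 * log_l


def bic(log_l, k, n):
    """Bayesian Information Criterion."""
    return k * math.log(n) - 2 * log_l


def compare(scores, threshold=10.0):
    """Best model and all models within `threshold` of it (not strongly rejected)."""
    best = min(scores, key=scores.get)
    kept = sorted(name for name, s in scores.items() if s - scores[best] < threshold)
    return best, kept


# Invented log-likelihoods for one plot of 500 trees: (log L, fitted parameters).
n_trees = 500
fits = {"MST": (-1020.0, 0), "DET-1": (-1000.0, 1), "DET-2": (-998.0, 2)}

aic_scores = {name: aic(l, k) for name, (l, k) in fits.items()}
bic_scores = {name: bic(l, k, n_trees) for name, (l, k) in fits.items()}

print(compare(aic_scores))
print(compare(bic_scores))
```

In this invented case AIC narrowly prefers the two parameter fit while BIC prefers the one parameter fit, and neither DET-LTWD model is strongly rejected under either criterion, which mirrors the kind of ambiguity reported for many of the regions.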
It was only possible to distinguish the quality of the fits for four of the twelve geographical aggregations of forest plots. In all four cases (All S.America, Bolivia, Brazilian Shield and N.Western) the two parameter DET-LTWD fit was favoured, and for the other eight it was not possible to say that the inclusion of the growth scaling power as a fitting parameter improved the fit.
in supplementary material Tables S2 and S3 and Fig. S5 to S13) again resulted in the DET-LTWD models generally fitting much more closely than MST. Table 3 shows the results of the BIC comparison of the models for the 124 forest plots. In every case, the best model is determined by the lowest BIC value. Inferior models are only considered strongly rejected if their BIC is greater than that of the best model by 10 or more. This is represented by the columns in the table, which show that the one parameter DET-LTWD was the model most commonly favoured by BIC score (81 plots). However, in none of those plots was it possible to strongly reject both the other models. The most common result (75 plots) was the one parameter DET-LTWD being the best model and MST being rejected, but the two parameter DET-LTWD fitting the data so closely that it cannot be rejected. The next most common result (17 plots) was the reverse, with MST again rejected and the two parameter DET-LTWD now narrowly better, but not by enough to strongly reject the one parameter DET-LTWD. The MST model was the best model for 15 plots, and for 5 of those (ELD_01, ELD_02, RIO_01, RIO_02, TIP_03) the two DET-LTWD models were both strongly rejected. Four of these plots, though, had a very low number of trees, and a model can be selected with less confidence from a distribution of only ∼100 trees. In fact the MST model seemed more likely to have a favourable AIC or BIC score, compared to the other models, for plots with smaller sample sizes, and an increasingly unfavourable score for larger sample sizes (see Supplementary Fig. S30).
Table 3. Best and acceptable models for the 124 individual forest plots for trunk diameter. Models are labelled as (M) for MST, (1) for one parameter DET-LTWD and (2) for two parameter DET-LTWD. Columns refer to the best fitting model (lowest BIC score). Rows refer to models that fit so well compared to the best that they cannot be rejected, as their BIC score is so close to that of the best model.
Plotting just the $\phi$ results as a histogram (Fig. 4a) reveals an approximately bell-shaped distribution with a peak close to the theoretical MST value. The median of the $\phi$ values for the plots is 0.34 (95% confidence interval 0.29-0.40) and the mean is 0.31 (95% confidence interval 0.26-0.36). These values are close to the theoretical value of 1/3, as suggested by the histogram. The histogram of $\mu_1$ (Fig. 4b) shows a skewed bell-shaped distribution with a peak around 0.3 for 2-parameter DET-LTWD and a more symmetric bell curve centred around 0.25 for 1-parameter DET-LTWD. For 1-parameter DET-LTWD the median of $\mu_1$ for the plots is 0.25 (95% confidence interval 0.24-0.26) and the mean is 0.25 (95% confidence interval 0.24-0.26).
If it is assumed that for any fixed value of $\phi$ there is a $\mu_1$ value that gives the best fit for that $\phi$ (as can be seen in Fig. 3b), then an equation can be derived (see supplementary material Section 2) in terms of the DET theory and the known global best fit values $\phi_t$ and $\mu_{t1}$ (i.e. the values fitted to all plots together)

2-parameter DET-LTWD
Equation 26 appears to fit the general trend of the fitted values well (Fig. 5) but, as can be seen in Supplementary plots S31 and S32, the curves for all plots together and for individual plots do not coincide, so it is unclear whether this equation explains the relationship or whether the agreement is coincidental. Whether the equation is the true description or not, the relationship between $\mu_1$ and $\phi$ suggests that there is a trade-off as a high $\mu_1$, high $\phi$ tree would have a superior growth:mortality ratio at smaller sizes but an inferior growth:mortality ratio at larger sizes compared to a low $\mu_1$, high $\phi$ tree. The results here hint at a trade-off, with the fitted values representing the dominant strategy in each forest plot depending on local conditions affecting growth and mortality.
To test this, the fitting parameters $\mu_1$ and $\phi$ were compared to forest plot properties such as sample size, geographical location, and mean plot height, trunk diameter, mass, wood density and basal area. The relationships were generally weak with little correlation, suggesting a poor "signal to noise" ratio or that the metrics used above had little or no correlation to the fitting parameters.

Mass Results
All fitting was performed on mass data after trees smaller than $m_P$ had been excluded. $m_P$ was chosen based on the methodology in section 4.1.1. When fitting the DET-LTWD and MST equations to the mass size distributions, there was again a consistent pattern for all the geographical aggregations of plot data. In all cases the DET-LTWD solutions (both one and two parameter versions) fitted much more closely than the MST solution (Fig. 6; see also supplementary material Fig. S3 and S4).
The two parameter fits gave a fitted value of the growth scaling power $\phi_m$ between 0.635 and 0.794 (Table 4), which showed that the growth allometry is close to the theoretical value of 0.75 (10 of 12 regions with $\phi_m$ in the range 0.7-0.8). The table also shows the truncation point $m_P$ used for each dataset; all trees with mass less than this value were excluded. The value of $m_P$ corresponds to the peak in the distribution created by the conversion from trunk diameter to mass data. The allometric biomass density agrees with the values found previously by Feldpausch et al. (2012), using the same biomass allometry. As this biomass density value is dry mass, it is a reasonable approximation (Chave et al., 2005; Martin and Thomas, 2011) to halve these values to obtain the carbon biomass density, giving a range of 10-15 kg C m$^{-2}$.
As with the trunk diameter, fits for the two DET-LTWD solutions were, in general, quite similar in terms of their appearance on the mass distribution plots. Again the AIC and BIC fitting metrics were barely able to distinguish which DET-LTWD model best fit the data (Table 5). For nine of the geographical aggregations (All S.America, Brazil, Bolivia, Colombia, Ecuador, Peru, N.Western, Guyana Shield and Eastern Central) it was not possible to distinguish between the DET-LTWD fits with either AIC or BIC. For Venezuela AIC indicated that the two parameter fit may be slightly better but BIC was not able to show any difference.
The S.Western allometric region was the only one showing the one parameter fit as being better but only for BIC.
The only region to have both AIC and BIC favouring one of the fits was the Brazilian Shield region, where both AIC and BIC favoured the two parameter fit.
again resulted in the DET-LTWD models often fitting much more closely than MST. All fitting was performed on mass data after trees smaller than $m_P$ had been excluded. $m_P$ was chosen, for each plot, based on the methodology in section 4.1.1. Table 6 shows the results of the BIC comparison of the models for the 124 forest plots. In every case, the best model is determined by the lowest BIC value. Inferior models are only considered strongly rejected if their BIC is greater than that of the best model by 10 or more. This is represented by the columns in the table, which show that the one parameter DET-LTWD was by far the most common best model (80 plots). However, in none of those plots was it possible to strongly reject both the other models. The most common result (74 plots) was the one parameter DET-LTWD being the best model (according to BIC) and MST being rejected, but the two parameter DET-LTWD fitting the data so closely that it cannot be rejected. The next most common result (14 plots) was the reverse, with MST again rejected and the two parameter DET-LTWD narrowly better, but not by enough to strongly reject the one parameter DET-LTWD. The MST model was the best model for 15 plots, and for 5 of those (ELD_01, ELD_02, RIO_01, SUC_03, TIP_03) the two DET-LTWD models were both strongly rejected. Three of these plots, though, had a very low number of trees, and a model can be selected with less confidence from a distribution of only ∼100 trees.
Table 6. Best and acceptable models for the 124 individual forest plots for mass. Models are labelled as (M) for MST, (1) for one parameter DET-LTWD and (2) for two parameter DET-LTWD. Columns refer to the best fitting model (lowest BIC score). Rows refer to models that fit so nearly as well as the best model that they cannot be rejected, because their BIC score is so close to that of the best model.
Plotting just the φ results as a histogram (Fig. 8a) reveals an approximately bell-shaped distribution with a peak close to the theoretical MST value. The median of φ_m for the plots is 0.72 (95% confidence interval 0.71-0.75) and the mean is 0.71 (95% confidence interval 0.69-0.73). These values are close to the theoretical value of 0.75, as suggested by the histogram.
The histogram of µ_m1 (Fig. 8b) shows a bell-shaped distribution with a peak around 0.19 for both 1-parameter and 2-parameter DET-LTWD. For 1-parameter DET-LTWD the median of µ_m1 for the plots is 0.199 (95% confidence interval 0.196-0.205)

Biomass Results
The biomass density equations, Eq. (6), Eq. (7) and Eq. (10), were tested against the allometric biomass density (summed tree mass data), as can be seen in Table 7. The biomass density equation parameters were obtained from the fits in Table 4. For the DET-LTWD solutions the biomass density was calculated for two cases: an infinite upper bound, and an upper bound equal to the maximum tree mass in the dataset. For each of those cases, both the one and two parameter DET-LTWD solutions were calculated.

The value of m_P was used as the lower bound when calculating the predicted biomass in Eq. (6), Eq. (7) and Eq. (10). The same values of m_P were used to truncate the data when finding the allometric biomass density. Comparisons between the theory and the mass obtained from a combination of observation and allometry therefore always used the same lower truncation point within each dataset, although this point varied between datasets. The values of m_P used are given in Table 4 and the methodology used to estimate m_P is in section 4.1.1.
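To illustrate how the lower bound m_P and the choice of finite or infinite upper bound enter such a calculation, the following Python sketch integrates tree mass against a size distribution numerically. It is a sketch under stated assumptions and does not reproduce Eq. (6), Eq. (7) or Eq. (10): it assumes the LTWD density f(x) = µ x^(−φ) exp{−[µ/(1−φ)](x^(1−φ) − 1)} for relative mass x = m/m_P, with µ referenced to the truncation mass m_P (so the µ used here may not be directly comparable to the fitted values in Table 4). The finite case simply stops the integral at m_max without renormalising. The function names, stem density, m_P, µ and m_max values are all hypothetical.

```python
import math


def mean_relative_mass(mu, phi, x_max=None, steps=20000):
    """Integrate x * f(x) from x = 1 to x_max for the assumed LTWD density.

    x = m / m_P is mass relative to the truncation mass; x_max=None
    means an infinite upper bound. steps must be even (Simpson's rule).
    """
    c = 1.0 - phi
    k = mu / c
    # Substituting u = x**c turns the integrand into
    # u**(1/c) * k * exp(-k * (u - 1)), integrated from u = 1.
    # For the infinite case, stop where exp(-k * (u - 1)) = exp(-60),
    # a heuristic cut-off beyond which the integrand is negligible.
    u_hi = 1.0 + 60.0 / k if x_max is None else x_max ** c
    h = (u_hi - 1.0) / steps

    def integrand(u):
        return u ** (1.0 / c) * k * math.exp(-k * (u - 1.0))

    total = integrand(1.0) + integrand(u_hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(1.0 + i * h)
    return total * h / 3.0


def biomass_density(stem_density, m_p, mu, phi, m_max=None):
    """Biomass per unit area of trees with mass between m_p and m_max."""
    x_max = None if m_max is None else m_max / m_p
    return stem_density * m_p * mean_relative_mass(mu, phi, x_max)


# Hypothetical inputs, chosen only for illustration.
stem_density = 0.05  # trees per m^2 with mass above m_P
m_p = 50.0           # truncation mass in kg
mu, phi = 0.5, 0.75  # mu is referenced to m_P in this sketch

b_inf = biomass_density(stem_density, m_p, mu, phi)
b_fin = biomass_density(stem_density, m_p, mu, phi, m_max=20000.0)
print(f"infinite upper bound: {b_inf:.2f} kg m^-2")
print(f"finite upper bound:   {b_fin:.2f} kg m^-2")
```

Because the integrand is positive everywhere, the finite upper bound always gives a smaller biomass density than the infinite one; the difference is the biomass that the infinite-bound calculation assigns to trees larger than any present in the dataset.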

It is apparent that the MST biomass density equation is inferior to the biomass density equations derived from DET-LTWD. For all aggregations the biomass density was overestimated by MST, and in many cases by a considerable margin.
The comparison of the different DET-LTWD biomass density equations was found to favour the two parameter fit using the finite upper bound (6 regions out of 12). Four areas had better estimates with the two parameter fit using the infinite upper bound (All S.America, Bolivia, Peru and Guyana Shield). Interestingly, two regions (S.Western and Ecuador) had a worse fit for the two parameter DET-LTWD. For the S.Western region, though, the biomass is fitted to within 2% regardless of the choice of upper bound or DET model, so the very slight difference in the biomass density prediction is almost certainly not significant for this region. When the reverse cumulative biomass density, defined as the biomass density of all trees above a given tree mass, is plotted for Ecuador (see supplementary material Figures S27 and S28), the error is seen to come from the shape of the tail of the distribution, which is much flatter than theory predicts. This could be due to it being a region with a smaller number of trees (4159) or to higher mortality for large trees in this region.

Biomass Results for Individual Plots
To look deeper at the relationship between model choice and predicted biomass density, the analysis was repeated for the individual forest plots. In Fig. 9, the biomass density predicted by the models is shown as a function of the actual allometric biomass density. It can be observed that correcting for the largest tree size in each plot is much better than assuming an infinite maximum tree size, and that the one parameter model performs less well in the finite maximum tree size case.

This finding is supported by the relative root mean squared error (root mean squared error divided by allometric biomass density) for each model, as shown in Table 8. For the small individual forest plots, finite maximum tree size has a larger effect on accuracy than using the two parameter DET-LTWD over the one parameter version.
Figure 9. Comparison of the biomass density prediction based on the size-distribution fits to the mass data, to the allometric biomass density in each of the 124 forest plots (x-axis: allometric plot dry biomass, kg m^−2; y-axis: predicted plot dry biomass, kg m^−2). Results are plotted for both the one and two parameter fits and for both the assumption of infinite and finite maximum tree size. The finite tree size case is limited to the largest tree mass m_max in each forest plot. The red dotted line illustrates a perfect one-to-one relationship (i.e. theory matching the data perfectly).
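The relative root mean squared error used for Table 8 can be sketched as follows. The text defines it as the root mean squared error divided by allometric biomass density; this sketch assumes that the divisor is the mean allometric biomass density over the plots, which is one reading of that definition. The function name is hypothetical.

```python
import math


def relative_rmse(predicted, observed):
    """RMSE of predicted against observed, divided by the mean observed value.

    predicted and observed are equal-length sequences of plot biomass
    densities (for example in kg m^-2). Normalising by the mean observed
    value is an assumption made for this sketch.
    """
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mse = sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n
    return math.sqrt(mse) / (sum(observed) / n)
```

A value of 0.10 from this function would correspond to a relative root mean squared error of 10%.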
In this paper we show that the Left-Truncated Weibull Distribution (LTWD), which is consistent with Demographic Equilibrium Theory (DET) when mortality is size independent and growth is a power law of tree size, fits the observed tree-size distributions for 124 forest plots across Amazonia. Our fitting was undertaken either with two free parameters or with one free parameter and the growth scaling power φ constrained to that specified in Metabolic Scaling Theory (1/3 for trunk diameter and 3/4 for mass, see West et al. 2009; Niklas and Spatz 2004). We also compared the performance of DET-LTWD to that of the Metabolic Scaling Theory for forest demography (MSTF, West et al. 2009). Our analyses were carried out for both trunk diameter measurements and for trunk diameter converted allometrically to mass (Feldpausch et al., 2012).
We found that this conversion of trunk diameter to mass introduces a peak in the mass distribution that is purely an artefact of the conversion. The peak arises because trees of a given trunk diameter vary in mass, owing to variation in height and wood density, so that some small-mass trees are in effect "missing" from the mass distribution. If the diameter to mass relationship were purely one-to-one, the artefact peak would not occur. This has implications for anyone using mass size-distributions converted from trunk diameter data. Our solution was to fit only to trees with mass greater than the mass distribution peak.
The model fitting shows that Amazon size-distributions are generally better fit by the DET-LTWD based models than by MSTF. The 2-parameter and 1-parameter DET-LTWD fits were often too similar for AIC or BIC (which balance the quality of the fit against the number of fitted parameters) to identify which is the better description of the size-distributions. The few plots and regions (including all plots combined) where one model was found to have a significantly better AIC or BIC score all favoured the 2-parameter model.
The best-fit growth-scaling exponent φ varied between plots and regions, but the mean value of φ across all 124 plots fell close to the values predicted by MST. For the 1-parameter DET-LTWD, best-fit values of µ_1 for trunk diameter cluster tightly around 0.25 (and around µ_m1 = 0.19 for mass). This is close to the mean value of µ_1 = 0.22 that we found for North American forests (Moore et al., 2018), hinting at a preferred value of the ratio of mortality to growth across different regions and forest types.
The clustering of φ results close to the value predicted by MST allometry (Niklas and Spatz, 2004; West et al., 2009) suggests two possibilities: either the clustering represents an underlying "basin of attraction" that is modified by local conditions (Price et al., 2007), or the plots do not meet the model assumptions of growth, mortality and equilibrium and this in turn somehow leads to the clustering. We cannot say for certain why the plots cluster close to the MST values, but it does lead to intriguing future avenues of study.
It has been suggested (Coomes and Allen, 2009; Coomes et al., 2011) that light competition should modify the MST scaling of growth with size. This would mean that for trunk diameter the growth scaling power would vary with size and be greater than the predicted MST value of 1/3. For our regional fits the fitted power was slightly larger than the MST value of 1/3 in most cases, but for the individual forest plots the value was very close to MST with no clear bias. Our results therefore cannot be taken as conclusive evidence of light competition modifying the growth scaling, but neither are they completely inconsistent with it.
We find that the fitted 2-parameter DET-LTWD φ values for both mass and trunk diameter also have a well-defined relationship to the fitted mortality-to-growth ratio µ_1. This relationship does not appear to be a fitting artefact: when artificial data are generated with known µ_1 and φ values lying off the observed curve, the fitting process correctly recovers the generated values rather than the curve seen in this study. The relationship suggests an interesting but as yet unexplained property of the Amazon forests, and may represent life-history trade-offs (Uriarte et al., 2012). Trees have different strategies, such as live-fast die-young pioneer species versus grow-slow live-long canopy species. This is one possible explanation of the relationship between µ_1 and φ: when both are high, early growth at small size will be slower but keep increasing, while when φ and µ_1 are both low, early growth will be higher but will level off more quickly. Interestingly, no plots had low φ with high µ_1, which would correspond to uncompetitive low growth at all sizes. As these results are at the plot level rather than on a per-tree basis, they would suggest that each site is dominated by one life-history strategy. As there is no correlation of µ_1 or φ with plot metrics such as height or wood density, this hypothesis remains unconfirmed.
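The artificial-data check described above can be illustrated with a minimal Python sketch. It is not the fitting code used in this study. It assumes the LTWD density f(x) = µ x^(−φ) exp{−[µ/(1−φ)](x^(1−φ) − 1)} for relative size x = m/m_P ≥ 1, with µ referenced to the truncation size, and it replaces a general optimiser with a simple profile-likelihood grid search over φ. The function names, the "true" parameter values, the sample size and the grid are all hypothetical.

```python
import math
import random


def sample_ltwd(n, mu, phi, rng):
    """Draw n relative sizes x = m / m_P >= 1 from the assumed LTWD."""
    c = 1.0 - phi
    k = mu / c
    # Under the assumed density, x**c - 1 is exponential with rate k,
    # so sample that variable and invert the transform.
    return [(1.0 + rng.expovariate(k)) ** (1.0 / c) for _ in range(n)]


def fit_ltwd(xs, phi_grid):
    """Maximum likelihood estimate of (mu, phi) by profiling over phi."""
    n = len(xs)
    sum_log = sum(math.log(x) for x in xs)
    best = None
    for phi in phi_grid:
        c = 1.0 - phi
        mean_u = sum(x ** c - 1.0 for x in xs) / n
        mu_hat = c / mean_u  # closed-form MLE of mu for this fixed phi
        loglik = n * math.log(mu_hat) - phi * sum_log - n
        if best is None or loglik > best[0]:
            best = (loglik, mu_hat, phi)
    return best[1], best[2]


rng = random.Random(1)
true_mu, true_phi = 0.40, 0.60  # hypothetical "off-curve" values
xs = sample_ltwd(10000, true_mu, true_phi, rng)
phi_grid = [0.30 + 0.005 * i for i in range(131)]  # 0.30 to 0.95
mu_hat, phi_hat = fit_ltwd(xs, phi_grid)
print(f"generated: mu={true_mu:.3f}, phi={true_phi:.3f}")
print(f"fitted:    mu={mu_hat:.3f}, phi={phi_hat:.3f}")
```

If the fitted pair lands close to the generated pair for a range of choices of `true_mu` and `true_phi`, the fitting procedure itself is not forcing the estimates onto a particular µ_1-φ curve.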
MSTF was rarely a good fit at the plot, regional or all-plots level for either trunk diameter or mass distributions, and it significantly overestimated total biomass density, so we reject MSTF as a good model of forest size-distributions. This rejection is consistent with the recent study by Zhou and Lin (2018), which showed that the MSTF model fails to account for the effect of size-dependent growth rate on how fast a tree transitions through a given size class. That result implies that the MSTF assumption of a size distribution scaling as D^−2 is inconsistent with the assumption of individual tree resource use scaling as D^2. Here, we have confirmed that the D^−2 (and m^−11/8) size-distribution model should be rejected for South American tropical forests. Furthermore, for most plots we can reject a general power law distribution, as the distributions observed are rarely linear when plotted in log-log space.
There was a strong correlation between sample size and how likely MSTF was to be considered either the best or an acceptable model, with small sample sizes favouring MSTF. This suggests that small sample sizes may make it difficult to identify the best model, or may even lead to the wrong model being chosen, most likely because rarer large trees are more likely to be absent from a small sample. Where practical, therefore, larger forest plots of at least 1000 stems are desirable when analysing size-distributions.
All three models of size distribution were used to predict total biomass density by the integration of the analytical form of their respective mass distributions. One interesting implication of the resulting equations for DET is that mortality and growth only ever appear in the form of the ratio µ_1 and never independently. The ratio of mortality to growth therefore determines the equilibrium state of a forest, while the absolute magnitudes of the individual mortality and growth terms determine the transient effects away from a steady state.
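The statement that only the ratio of mortality to growth appears can be seen from a short derivation. The following is a sketch under the assumptions stated in this paper (steady state, a size-independent mortality rate, and growth that is a power law of size with φ ≠ 1); it is not a reproduction of Eq. (6), Eq. (7) or Eq. (10). The symbols γ (mortality rate), g_1 (growth rate at a reference mass m_1) and n_P (number density at the truncation mass m_P) are introduced here for illustration and may differ from the notation used elsewhere in the paper.

```latex
\begin{align}
\frac{\mathrm{d}}{\mathrm{d}m}\bigl[n(m)\,g(m)\bigr] &= -\gamma\,n(m),
\qquad g(m) = g_1\left(\frac{m}{m_1}\right)^{\phi}, \\
n(m)\,g(m) &= n_P\,g(m_P)\exp\left\{-\int_{m_P}^{m}\frac{\gamma}{g(m')}\,\mathrm{d}m'\right\}, \\
n(m) &= n_P\left(\frac{m}{m_P}\right)^{-\phi}
\exp\left\{-\frac{\mu_1}{1-\phi}\left[\left(\frac{m}{m_1}\right)^{1-\phi}
-\left(\frac{m_P}{m_1}\right)^{1-\phi}\right]\right\},
\qquad \mu_1 = \frac{\gamma\,m_1}{g_1}.
\end{align}
```

In this sketch γ and g_1 enter the steady-state distribution only through µ_1, so any quantity obtained by integrating n(m), including the biomass density, depends on mortality and growth only through their ratio.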
When considering how well the models predicted total biomass density from the fitted size-distribution, the biggest source of error at the plot scale is the model assumption of infinite maximum tree size. However, this can be corrected for, which allows the 1-parameter DET-LTWD to estimate biomass density with a relative root mean square error of 10% over the 124 forest plots, and the 2-parameter DET-LTWD to do so within 6%. Conversely, the MST model consistently overestimated the biomass density, often by a considerable margin. The regional scale, which has a larger sample size, showed much better prediction of the biomass density, and the 2-parameter DET-LTWD with finite upper bound had the smallest error in biomass density. This suggests the DET-LTWD model is a useful model of biomass for large-scale applications, such as initialising a DGVM based on the continuity equation (Argles et al., 2019), or as a climate relevant measure of goodness of fit.
One of our priorities for further work is to investigate whether the commonality found in the values of µ_1 and the relationship between µ_1 and φ are indicative of some form of optimality operating at the forest scale.
This study demonstrates that demographic equilibrium theory (DET) is able to fit measured tree size-distributions in Amazonian forests. The fitted growth scaling parameter φ was clustered for both trunk diameter (0.31 ± 0.02) and mass (0.71 ± 0.01) distributions close to the values predicted by Metabolic Scaling Theory (MST). The small bias seen could be indicative of deviations from MST allometry due to light competition. The fitted mortality-to-growth ratio parameter µ_1 was clearly related to the fitted φ parameter, suggesting a possible life-history trade-off in the forest plots. If the DET φ is constrained to the MST value, then the fit is often as good as the 2-parameter fit and, with one less fitting parameter, is preferred by the Bayesian Information Criterion, and µ_1 clusters around a value (0.25 for trunk diameter) close to that of 0.22 previously reported for US forests. We therefore find evidence that the 1-parameter DET is useful in modelling forests on the global scale, particularly for applications where parameter sparsity is important (Argles et al., 2019). Further support for such applications comes from the model's ability to replicate forest biomass density over large scales, when compared to the data. The relationship between µ_1 and φ, and a common value of µ_1 between the US and the Amazon, may indicate that some optimality principle is in play.
Code availability. Code is available on reasonable request to the corresponding author.
Author contributions. J.R.M. and P.M.C. conceived the project. J.R.M. carried out the data analysis, wrote the paper and prepared the figures.
K.Z., A.A. and C.H. gave much invaluable advice on analysis, mathematics and the general direction of the project as well as commented on