Seed traits and phylogeny explain plants' geographic distributions

Understanding the mechanisms that shape the geographic distribution of plant species is a central theme of biogeography. Although seed mass, seed dispersal mode and phylogeny have long been suspected to affect species distribution, the link between the sources of variation in these attributes and their effects on the distribution of seed plants is poorly documented. This study aims to quantify the joint effects of key seed traits and phylogeny on species' distribution. We collected seed mass and seed dispersal mode data for 1,426 species of seed plants representing 501 genera in 122 families and used 4,138,851 specimens to model species distributional range size. Phylogenetic generalized least squares regression and variation partitioning were performed to estimate the effects of seed mass, seed dispersal mode and phylogeny on species distribution. We found that species distributional range size was significantly constrained by phylogeny. Seed mass and its intraspecific variation were also important in limiting species distribution, but their effects differed among species with different dispersal modes. Variation partitioning revealed that seed mass, seed mass variability, seed dispersal mode and phylogeny together explained 46.82% of the variance in species range size. Although seed traits are not typically used to model the geographic distributions of seed plants, our study provides direct evidence that seed mass, seed dispersal mode and phylogeny are important in explaining species geographic distribution. This finding underscores the need to include seed traits and the phylogenetic history of species in climate-based niche models for predicting the response of plant geographic distribution to climate change.

phylogenetic relatedness could invoke biogeographic limits to expansion (Martin and Husband, 2009) or promote the evolutionary divergence of species and the variation in seed traits (Donoghue et al., 2001; Moles et al., 2005). Although a species' geographic range could well depend on its evolutionary history (Felsenstein, 1985), few studies have included phylogeny when discerning the effect of seed traits on species distribution.
In this study, we attempted to quantify the effects of seed mass, intraspecific seed mass variation, dispersal mode and phylogeny on species geographic range size. We hypothesized that species possessing small seeds with high variability in seed mass, coupled with a strong dispersal capacity, would have larger distributional range sizes than species with contrasting seed traits, and furthermore, that species distributional range size would be phylogenetically conserved. We collected data on seed mass and seed dispersal mode from 1,426 plant species distributed mainly across China. We specifically aimed to answer two questions: (1) What are the joint effects of seed mass, seed dispersal and phylogeny on species geographic range size? and (2) Are there significant phylogenetic signals associated with species geographic range size?

from populations within the natural distribution range of the species, and dried for 1 to 6 months in a drying room where the relative humidity and temperature were maintained at 15% and 15°C, respectively. After drying, 50 seeds were randomly sampled from each population five times (sampling with replacement), and the sampled seeds were weighed to the nearest 0.1 mg each time, resulting in five weights for the population. The five weights were averaged and converted to the 1000-seed weight of the population. For each species, the 1000-seed weights across all populations were further averaged, and this "grand" average was used as the seed mass for the species. Seed mass variability (i.e., intraspecific variation in seed mass), ranging from zero to one, was calculated for each species as the difference between the maximum and the minimum 1000-seed weight across all populations of the species, divided by the maximum value. This is a common measure of plant trait variation (Valladares et al., 2000; Rozendaal et al., 2006). It is more suitable here than the coefficient of variation (CV), which is sensitive to small changes in the mean when the mean is close to zero, and some plants in this study, such as orchids, have very small seed mass.
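The seed mass and variability calculations described above can be sketched as follows. This is a minimal illustration, not the authors' script: the function names, the population labels and all weights are hypothetical, and the 1000-seed weight is kept in milligrams because the text does not state the final unit.

```python
from statistics import mean

def thousand_seed_weight(sample_weights_mg, seeds_per_sample=50):
    """Average repeated weighings of a seed sample and scale to 1000 seeds (mg)."""
    return mean(sample_weights_mg) / seeds_per_sample * 1000

def seed_mass_variability(population_weights):
    """(max - min) / max of the population-level 1000-seed weights; lies in [0, 1]."""
    highest, lowest = max(population_weights), min(population_weights)
    return (highest - lowest) / highest

# Hypothetical data: five weighings (mg) of 50 seeds for each of three populations.
populations = {
    "pop_A": [10.0, 10.2, 9.8, 10.1, 9.9],
    "pop_B": [12.4, 12.6, 12.5, 12.3, 12.7],
    "pop_C": [8.0, 8.1, 7.9, 8.0, 8.0],
}

population_tsw = [thousand_seed_weight(w) for w in populations.values()]
species_seed_mass = mean(population_tsw)            # "grand" average for the species
variability = seed_mass_variability(population_tsw)

print(round(species_seed_mass, 2), round(variability, 2))
```

Because the index divides by the maximum rather than the mean, it stays bounded between zero and one even for species with very light seeds.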

In this study, we estimated the distributional range size of each of the 1,426 species from its global distribution using ArcGIS 10.2. The range sizes therefore represent the global distribution range of each species. First, specimen distribution records for each species were obtained from the Global Biodiversity Information Facility (GBIF.org, https://doi.org/10.15468/dl.umswqd, accessed on 04 August 2019), the Chinese Virtual Herbarium (http://www.cvh.ac.cn/) and the Biodiversity of the Hengduan Mountains and Adjacent Areas of South-Central China database (BHMAASCC: http://hengduan.huh.harvard.edu/fieldnotes). Specimens that lacked GPS locations, were duplicated, contained incorrect coordinates, or were taken from gardens or small oceanic islands were filtered out of the analysis. In addition, species that were cultivated, introduced, invasive, or naturalized were excluded from our dataset. After these exclusions, 4,138,851 specimens of the 1,426 seed plant species remained. Second, a point shapefile was produced for each species from the coordinates of its specimens. The shapefile was converted to a raster using the World Sinusoidal Projection at a spatial resolution of 100 km in ArcGIS 10.2 (ESRI, Redlands, CA, USA). The distributional range size of each species was calculated by multiplying the number of occupied grid cells in the raster by 10,000 km² (100 × 100 km). To assess the impact of spatial resolution on the calculated range size, a raster with a spatial resolution of 50 km was also used. Because the range size calculated at this resolution was highly correlated with that calculated at 100 km (r = 0.993, P < 0.001; Fig. A1), we used only the range size calculated at 100 km in subsequent analyses.
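The grid-cell counting step can be illustrated with a short sketch. It is an approximation under stated assumptions, not a reproduction of the ArcGIS workflow: the sinusoidal projection is computed on a sphere of radius 6371 km, cells are assigned by flooring projected coordinates, and the coordinates and function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth, not the ArcGIS datum

def sinusoidal_xy(lon_deg, lat_deg):
    """Project longitude/latitude (degrees) to sinusoidal x, y in kilometres."""
    lon, lat = math.radians(lon_deg), math.radians(lat_deg)
    return EARTH_RADIUS_KM * lon * math.cos(lat), EARTH_RADIUS_KM * lat

def range_size_km2(points, cell_km=100.0):
    """Count distinct occupied grid cells and multiply by the area of one cell."""
    occupied = set()
    for lon_deg, lat_deg in points:
        x, y = sinusoidal_xy(lon_deg, lat_deg)
        occupied.add((math.floor(x / cell_km), math.floor(y / cell_km)))
    return len(occupied) * cell_km ** 2

# Hypothetical specimen coordinates (longitude, latitude); the first two
# records fall in the same 100 km cell and are counted once.
records = [(100.0, 25.0), (100.1, 25.1), (102.0, 27.0), (110.0, 30.0)]

print(range_size_km2(records, cell_km=100.0))
print(range_size_km2(records, cell_km=50.0))
```

Counting occupied cells rather than specimens makes the estimate insensitive to how densely a locality has been collected, which is one reason a coarse grid is a common choice for herbarium data.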

Dispersal modes
Based on the published literature and floras, dispersal modes were classified as autochory (self-dispersal, e.g., by explosive seed release from fruits or by gravity, n = 223 species), zoochory (dispersal by animals through ingestion or attachment to an animal body, n = 468 species), and anemochory (dispersal by wind, n = 735 species) according to the morphological features of the seeds or fruits (Pérez-Harguindeguy et al., 2013). For example, seeds or fruits with wings, hairs or pappus were considered wind dispersed (anemochory); seeds or fruits with an aril or flesh offering a succulent reward for consumers were classified as zoochory; and seeds or fruits lacking modifications pertaining to the other two categories were classed as autochory (unassisted dispersal) (Qi et al., 2014).

For all the species used in our analysis, the scientific names were checked and standardized according to the Plant List (http://www.theplantlist.org/). Different varieties and subspecies of a given species were considered to belong to the same species. The phylogenetic tree was extracted from a previously published supertree using the "phylo.maker" function in the R package V.PhyloMaker (Jin and Qian, 2019), which was based on the APG classification of flowering plants (Zanne et al., 2014). The "multi2di" function in the ape package was used to randomly resolve polytomies in the phylogenetic tree. To test for phylogenetic signal in species distribution, the "phylosig" function in the R package phytools was used to calculate Pagel's λ, which ranges between 0 and 1. λ = 0 means that the evolution of the trait is phylogenetically independent, and λ = 1 indicates that trait evolution follows Brownian motion. Any value of λ significantly greater than zero is regarded as indicating a phylogenetic signal.

Because closely related species tend to have similar traits, interspecific analyses can be compromised by phylogenetic relatedness (Felsenstein, 1985; Lynch, 1991).
In our case, species' range size is not phylogenetically independent. We thus used a phylogenetic generalized least squares (PGLS) regression to determine the effects of seed mass (SM), intraspecific variation in seed mass (ISM) and dispersal mode (DM) on the distributional range size (RS) of species (Swenson, 2014). The SM×DM and ISM×DM interaction terms were also included in the PGLS model to show how the effects of SM and ISM on distributional range size differ among dispersal modes. The regression model was $RS = \beta_0 + \beta_1 SM + \beta_2 ISM + \beta_3 DM + \beta_4 SM \times DM + \beta_5 ISM \times DM$. The PGLS was implemented using the "gls" function in the nlme package, and the possible phylogenetic dependence in species' range size was incorporated in the form of a phylogenetic variance-covariance matrix in gls.
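To make the roles of Pagel's λ and the phylogenetic variance-covariance matrix concrete, the sketch below builds the Brownian-motion covariance matrix for a hypothetical three-species tree and rescales it by λ. The tree, the branch lengths and the function name are assumptions for illustration only; the study itself relied on the R functions named above.

```python
def lambda_transform(vcv, lam):
    """Scale the off-diagonal (shared-history) entries of a covariance matrix by lambda."""
    n = len(vcv)
    return [
        [vcv[i][j] if i == j else lam * vcv[i][j] for j in range(n)]
        for i in range(n)
    ]

# Hypothetical tree ((A:1, B:1):1, C:2). Under Brownian motion the covariance
# between two species equals the branch length they share from the root, and
# each variance equals the root-to-tip distance.
species = ["A", "B", "C"]
brownian_vcv = [
    [2.0, 1.0, 0.0],
    [1.0, 2.0, 0.0],
    [0.0, 0.0, 2.0],
]

independent = lambda_transform(brownian_vcv, 0.0)  # lambda = 0: no phylogenetic covariance
partial = lambda_transform(brownian_vcv, 0.627)    # lambda reported for range size in this study
brownian = lambda_transform(brownian_vcv, 1.0)     # lambda = 1: the Brownian matrix itself

for row in partial:
    print(row)
```

In a PGLS, a matrix of this kind takes the place of the identity error covariance of ordinary least squares, so that closely related species contribute less independent information than distantly related ones.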

Construction of phylogenetic tree and statistical analyses
We further used the "varpart" function in the vegan package to partition the variance in range size explained by seed mass, seed mass variability, dispersal mode, and genus (used as a proxy for phylogeny).
Because our phylogenetic tree had some polytomies at the species level, genera were used as a surrogate for the phylogeny. Variation partitioning is based on linear models and places no restriction on the type of explanatory variables; it is hence suitable for our data structure (Borcard et al., 2018).
In all analyses, species range size and seed mass were natural-log-transformed to reduce skewness and downplay extreme values, and the log-transformed seed mass and the seed mass variability were standardized to make their coefficients (i.e., effect sizes) comparable. Seed mass and seed mass variability were each standardized by subtracting the smallest value across all 1,426 species and dividing by the difference between the largest and the smallest value. All statistical analyses were conducted using R 4.0.2 (R Core Team, 2020).
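The transformation and standardization described above amount to a natural log followed by min-max scaling. A minimal sketch, using hypothetical seed mass values and a hypothetical helper name:

```python
import math

def min_max_standardize(values):
    """Rescale values to [0, 1]: subtract the minimum, divide by the range."""
    lowest, highest = min(values), max(values)
    return [(v - lowest) / (highest - lowest) for v in values]

# Hypothetical 1000-seed weights for four species, spanning several orders of magnitude.
seed_mass = [0.5, 20.0, 350.0, 4200.0]

log_seed_mass = [math.log(v) for v in seed_mass]      # natural-log transform
scaled_seed_mass = min_max_standardize(log_seed_mass)

print([round(v, 3) for v in scaled_seed_mass])
```

After scaling, the smallest-seeded species sits at 0 and the largest at 1, so a regression coefficient for seed mass expresses the change in log range size across the full observed span of the predictor.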

Effects of phylogeny on species distributional range size
We detected a strong phylogenetic signal in species distributional range size for the study species (λ = 0.627, P < 0.001), with the signal being stronger in gymnosperms (λ = 0.975, P < 0.05) than in angiosperms (λ = 0.423, P < 0.001). Closely related species had more similar range sizes than distantly related species.

Effects of seed traits on species distributional range size
The phylogenetic generalized least squares regression showed that seed mass had a strong negative association with species distributional range size (effect size = -13.974, P < 0.001; Fig. 1, Table A1), while the effect of seed mass variability on range size was not significant (effect size = 0.459, P = 0.109). Dispersal mode was also significantly associated with species' range size. In the PGLS model, autochory (explosive/gravity dispersal) was treated as the baseline dispersal mode. Compared with zoochorous (dispersed by animal ingestion or attachment to an animal body) and anemochorous (dispersed by wind) species, autochorous species had significantly larger range sizes once the effects of seed mass and seed mass variability were accounted for through the interaction terms between seed traits and dispersal modes (Fig. 1, Table A1). The interaction terms between seed mass/seed mass variability and dispersal mode (i.e., seed mass × anemochory, seed mass × zoochory and seed mass variability × zoochory) were significantly positive (effect size = 7.527, P < 0.001; effect size = 12.637, P < 0.001; effect size = 1.824, P < 0.001, respectively), indicating that the distributional range sizes of anemochorous and zoochorous species depended strongly on seed mass and its intraspecific variation (Fig. 1, Table A1).
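One way to read these coefficients: with autochory as the baseline, the seed mass slope for any other dispersal mode is the main-effect coefficient plus that mode's interaction coefficient. The short calculation below applies this to the values reported above; it assumes standard treatment (dummy) coding, and the variable names are illustrative.

```python
# Coefficients reported in the text (PGLS with autochory as the baseline mode).
SEED_MASS_MAIN_EFFECT = -13.974
SEED_MASS_INTERACTION = {
    "autochory": 0.0,      # baseline: no interaction adjustment
    "anemochory": 7.527,
    "zoochory": 12.637,
}

def seed_mass_slope(mode):
    """Slope of range size on standardized seed mass for one dispersal mode."""
    return SEED_MASS_MAIN_EFFECT + SEED_MASS_INTERACTION[mode]

slopes = {mode: round(seed_mass_slope(mode), 3) for mode in SEED_MASS_INTERACTION}
print(slopes)
```

Under this reading the seed mass slope stays negative in all three modes but is steepest for autochorous species and weakest for zoochorous species, which is what the positive interaction terms express.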

Joint effects of seed traits and phylogeny on species' range size
Variation partitioning showed that seed mass, seed mass variability, dispersal mode and phylogeny together explained 46.82% of the variance in species' range size (Fig. 2). Of the explained variation, seed mass (including mass variability) contributed a pure fraction of 11.38%, phylogeny a pure fraction of 21.31%, and dispersal mode a small pure fraction of 0.72%. We also noted a considerable joint effect of seed traits and phylogeny (13.41%) on species' range size (Fig. 2).
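As a quick consistency check, the four fractions reported above can be summed and compared with the stated total. The sketch assumes these four are the only non-negligible components of the partition; the dictionary keys are illustrative labels.

```python
# Fractions of variance in range size (%) as reported in the text.
fractions = {
    "seed mass and variability (pure)": 11.38,
    "phylogeny (pure)": 21.31,
    "dispersal mode (pure)": 0.72,
    "seed traits and phylogeny (joint)": 13.41,
}

explained = round(sum(fractions.values()), 2)
unexplained = round(100.0 - explained, 2)

print(explained, unexplained)
```

The fractions add up to the reported 46.82%, leaving a little over half of the variance in range size unexplained by the four predictors.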

The relationship between phylogeny and species distributional range size
We found a significant phylogenetic signal associated with species distributional range size. This result suggests that closely related species are more similar in range size than distantly related species. It corroborates some studies (e.g., Hunt et al., 2005; Martin and Husband, 2009), but contrasts with Webb and Gaston (2003), who showed that the range sizes of closely related species were no more similar to each other than expected by chance. This discrepancy may be due to the different evolutionary histories of the studied taxa as well as the heritability of their life-history traits, which can play a critical role in the establishment and persistence of species and thus influence their distributional range sizes (Angert and Schemske, 2005; Umaña et al., 2018). It is worth noting that Webb and Gaston (2003) studied birds, which have much stronger dispersal ability than seed plants; this may contribute to the difference between the two studies. Seed traits associated with range size can also change over evolutionary time, which in turn could alter the range size of a species (Blomberg et al., 2003). Furthermore, the geographic range of a species can be influenced by its ecological tolerances associated with life-history traits (Geber and Griffen, 2003; Latimer and Zuckerberg, 2021). Our results imply that the geographic distributions of related plant species may respond similarly to patterns of climate change at a regional scale, due in part to phylogenetic constraints on the distributional range of species. It seems likely that closely related species have commonly evolved seed traits that result in shared adaptive strategies to climate change, although this causal mechanism requires further empirical study in the field.

We found a very strong negative relationship between seed mass and species range size, meaning that species with larger seeds have smaller ranges (Fig. 1, Table A1). This result is consistent with previous studies that also found a significant relationship between seed mass and range size (Morin and Chuine, 2006; Procheş et al., 2012). In contrast to seed mass, seed mass variability had either no association or a weak positive association with distributional range size.

The PGLS model showed that the range sizes of zoochorous (animal-dispersed) and anemochorous (wind-dispersed) species were significantly smaller than those of autochorous (explosive/gravity-dispersed) species (Fig. 1). This may appear counterintuitive at first glance, but it arises once the interactions between seed mass (and mass variability) and dispersal mode are taken into account. The strong positive interaction terms shown in Fig. 1 (all except the interaction between seed mass variability and wind dispersal) indicate that the range sizes of species with different dispersal modes depend strongly on seed mass (and also on mass variability). For example, zoochorous species with large seed mass and mass variability have significantly larger range sizes than species that have similar seed traits but are dispersed by explosive release or gravity. This dependence of range size on the interactions between seed mass and dispersal mode is further confirmed by a simpler PGLS model that excludes all interaction terms between seed mass (and mass variability) and dispersal mode. The results of this model (Appendix Table A2) show that zoochorous species had significantly larger range sizes than autochorous and anemochorous species (P < 0.001), while the latter two groups did not differ significantly (P = 0.257).
Although intraspecific seed mass variability did not seem to affect the distributional range size of autochorous and anemochorous species, it was strongly positively associated with the range size of zoochorous species. This may be because species with large variation in seed mass could have greater colonization ability across habitats, and the seeds of zoochorous species, with their long dispersal distances, have more chances to reach heterogeneous habitats than the seeds of autochorous and anemochorous species. Given that small- and large-seeded species are adapted to different habitats (Silvertown, 1989), it seems likely that zoochorous species experience trade-offs between competitive ability and dispersal ability through seed mass variation (Chen et al., 2018), resulting in a similar effect of seed mass on species distributional range size at the geographic scale.
It is interesting to note that Sides et al. (2014) found that species with greater intraspecific variation in specific leaf area (SLA) have wider ecological breadth. Because of its potential role in modulating the response of plant species to environmental change, greater intraspecific functional variability enables species to adjust to a wider range of competitive and abiotic conditions (Sides et al., 2014; Basnett and Devy, 2021). Plastic responses of seed mass to heterogeneous environments may be related to molecular signals at a single gene or across the entire genome (Nicotra et al., 2010) and may thus influence the distributional range size of species (Savolainen et al., 2007). Distributional patterns of plant species may reflect the fact that individuals within a species differ in the level of genetic variation associated with seed mass, which helps the species adapt to a broad spectrum of environments (Völler et al., 2012).

Effects of seed mass, seed dispersal and phylogeny on species' range size
Our results show that seed traits and phylogeny jointly affect species distributional range size, indicating that species distribution may be limited by both ecological and evolutionary processes (Fig. 2). There are two possible reasons for this relationship: (1) the evolution of both seed mass and dispersal mode is phylogenetically conserved (Gallagher and Leishman, 2012; Chen et al., 2018; Kang et al., 2021); and (2) seed mass and seed dispersal mode are not evolutionarily independent but are constrained by evolutionary history, e.g., phylogenetic divergences in dispersal syndrome are related to divergences in seed mass (Moles et al., 2005). However, we also need to recognize that more than 50% of the variance in species distribution in our study remains unexplained. This suggests that climatic tolerance, competition, colonization ability and other geographic factors could also be important in shaping species distribution (Morin and Chuine, 2006).

This study provides evidence that seed mass, intraspecific seed mass variation, seed dispersal mode and phylogeny contribute to explaining variation in species distribution at the geographic scale. We found that (1) species distributional range size was significantly constrained by phylogeny, seed mass and its intraspecific variability, and seed dispersal mode; (2) the effects of seed mass and seed mass variability on species distribution varied among dispersal modes; and (3) seed mass, dispersal mode and phylogeny together explained 46.82% of the variance in species distributional range size. Although more than half of the variation in species distribution is left unexplained, our study clearly shows the importance of including seed life-history traits in modeling and predicting the impact of climate change on the distribution of seed plants.

Data availability. The data are available from the freely accessible databases cited in the manuscript.
Author contributions. DZL, LMG and FH designed the study; KC and XYY collected data; KC conducted statistical analysis and generated the graphs; KC, KSB and LMG wrote the manuscript; DZL, FH and XYY revised the manuscript. All authors reviewed and approved the final manuscript.

Table A1. The phylogenetic generalized least squares regression for modeling the effects of seed mass, seed mass variability, dispersal mode, and the seed mass × dispersal mode and seed mass variability × dispersal mode interaction terms on species distributional range size. A graphic presentation of the results in this table is given in Figure 1 in the main text.