Geodiversity inﬂuences limnological conditions and freshwater ostracode species distributions across broad spatial scales in the northern Neotropics

. Geodiversity is recognized as one of the most important drivers of ecosystem characteristics and biodiversity globally. However, in the northern Neotropics, the contribution of highly diverse landscapes, environmental conditions, and geological history in structuring large-scale patterns of aquatic environments and aquatic species associations remains poorly understood. We evaluated the relationships among geodiversity, limnological conditions, and freshwater ostracodes from southern Mexico to Nicaragua. A cluster analysis (CA), based on geological, geochemical, mineralogical, and water-column physical and chemical characteristics of 76 aquatic ecosystems (karst, volcanic, tectonic) revealed two main limnological regions: (1) karst plateaus of the Yu-catán Peninsula and northern Guatemala, and (2) volcanic terrains of the Guatemalan highlands, mid-elevation sites in El Salvador and Honduras, and the Nicaraguan lowlands. In addition, seven subregions were recognized, demonstrating a high heterogeneity of aquatic environments. Principal component analysis (PCA) identiﬁed water chemistry (ionic composition) and mineralogy as most inﬂuential for aquatic ecosystem classiﬁcation. Multi-parametric analyses, based on biological data, revealed that ostracode species associations represent disjunct faunas. Five species associations, distributed according to limnological regions, were recognized. Structural equation modeling (SEM) revealed that geodiver-sity explains limnological patterns of the study area. Limnol-ogy further explained species composition, but not species richness. The inﬂuence of conductivity and elevation were individually evaluated in SEM and were statistically signiﬁcant for ostracode species composition, though not for species richness. We conclude that geodiversity has a central inﬂuence on the limnological conditions of aquatic systems


Introduction
Geodiversity is defined as the natural variety of geological (bedrock), geomorphological (elevation), soil features and assemblages (mineralogy, sedimentology), fossils, and hydrological features of a landscape (Gray, 2004(Gray, , 2019Schrodt et al., 2019). Geodiversity influences aspects of regional or local climate (Vartanyan, 2006b;Hu et al., 2020), and by interacting with the biosphere and the atmosphere, can also contribute, through sediment delivery, to the input of nutrients into ecosystems and determine the chemical composition of environments (Vartanyan, 2006a;Bravo-Cuevas et al., 2021). Geodiversity of a given region can be evaluated from different perspectives by using selected indicators such as elevation or a broad range of measured elements (Gray, 2018;Zarnetske et al., 2019;Bravo-Cuevas et al., 2021;de Paula et al., 2021).
Biodiversity, defined as the variety of life forms in a place on Earth (Huston, 1995), is strongly related to geodiversity, as species are and have been distributed in response to landscape features in the geological time (Mittelbach et al., 2007;Etienne and Apol, 2009;Smith et al., 2010;Bryson et al., 2013;Gillespie and Roderick, 2014;Steinbauer et al., 2016). Linkages between biodiversity and geodiversity are complex and usually difficult to track, since geodiversity patterns and ecological processes may be discernible only to a certain geographical extent, from landscapes to territories Alahuhta et al., 2020;Ren et al., 2021). Understanding causal relationships between geodiversity parameters and biodiversity patterns is currently a priority and a hot topic globally because of their relevance for conservation (bio-and geoconservation; Crofts, 2019), ecosystem management (Bravo-Cuevas et al., 2021), and prediction of ecosystem responses to future climate scenarios (Martiny et al., 2006;Hulsey et al., 2010;Jiménez-Alfaro et al., 2018).
In areas with highly dynamic and complex biological systems, such as tropical regions (Rull, 2011;Antonelli et al., 2018;Matzke-Karasz et al., 2019;Moguel et al., 2021), it is difficult to discern the spatial and temporal contribution of either individual or joint geodiversity-related factors that have shaped the regional species pool (Rossetti and Toledo, 2006).
The northern Neotropical region extends from centralsouthern Mexico to Central America and includes the Caribbean. It is characterized by a dynamic geological history, caused by the interplay of the North American, Cocos, and Caribbean tectonic plates (Molnar and Sykes, 1969;Marshall, 2007). The region is characterized by broad ranges of elevation and soil types, displays frequent volcanic and seismic activity, and has been subjected to repeated marine regressions and transgressions (Brezonik and Fox, 1974;Horn and Haberyan, 1993;Umaña et al., 1999;Haberyan et al., 2003;Obrist-Farner et al., 2021).
Numerous studies have attempted to elucidate the indirect relationship between geodiversity and biodiversity and identify the factors that account for the current biogeographic patterns in the northern Neotropics (Wallace, 1853;Patton et al., 1994;Gillespie and Roderick, 2014). Most evidence from terrestrial taxa suggests in situ diversification, resulting from repeated colonization events by North and South American taxa, before and after the closure of the Isthmus of Panama, estimated to have occurred between 15 and 4 Ma (Bacon et al., 2015;Montes et al., 2015). Molecular evidence suggests that extant Mesoamerican terrestrial taxa (i.e., angiosperms, ferns, birds, reptiles, and mammals) originated primarily in the Amazon Basin, with ancestors arriving by dispersal during the last 10 Myr (Antonelli et al., 2018).
These large-scale species movements between the American continents were mainly associated with large-amplitude Pleistocene climate fluctuations such as glacial and interglacial cycles and episodes of shorter, centennial to millennial fluctuations, including the Last Glacial Maximum, Heinrich, and Dansgaard-Oeschger stadials (Behling et al., 2000;Carnaval and Moritz, 2008;Bouimetarhan et al., 2018;Baker et al., 2001Baker et al., , 2020. In aquatic environments of the northern Neotropics, relationships between geodiversity and biodiversity are less well known than those that operate in terrestrial environments. The limnological conditions of a region, defined as the set of physical, chemical, and biological components of inland waters, and its interactions with terrestrial, atmospheric, anthropogenic, and geological elements (Last, 2002;Azim, 2009;McCullough et al., 2021), are key to understanding such relationships. Limnological conditions represent the interface between geodiversity and biodiversity in aquatic environments and are generally accepted as a fundamental driver of the diversification and distribution of aquatic species (Matamoros et al., 2015).
During the last 50 years, anthropogenic influences on limnological conditions of aquatic environments have altered biodiversity and species distributions because of the modification of natural conditions (Albert and Reis, 2011;Wehrtmann et al., 2016;Franco-Gaviria et al., 2018). Currently, most aquatic environments in the northern Neotropics are used as potable water sources and for agriculture; in large lakes, fishing and aquaculture have caused eutrophication and introduced invasive species (e.g., Oreochromis niloticus). In addition, some lakes located close to urban centers and farmlands are used as disposal sites for waste waters, agrochemicals, and mine residues (McKaye et al., 1995;Soto et al., 2020). The high heterogeneity of geodiversity in the northern Neotropics, and the poor knowledge of the limnological conditions of aquatic ecosystems in the region make it difficult to (1) understand the complex relations among geodiversity, water-column physicochemical variables, and biological characteristics (community composition and species richness), and (2) distinguish different limnological regions and determine their geographic coverage.
Freshwater ostracodes are a well-suited group to evaluate past and present drivers of species distribution in the northern Neotropics. Ostracodes are bivalved microcrustaceans that are abundant, diverse, and widely distributed in aquatic ecosystems (Pérez et al., 2011b(Pérez et al., , 2013Cohuo et al., 2016Cohuo et al., , 2020Macario-González et al., 2018;Echeverria-Galindo et al., 2019). This taxonomic group shows levels of endemism (restricted distribution) as high as 74 %, sometimes confined to a single lake, or found in surface waters throughout the region (Cohuo et al., 2016). In sediment sequences from lakes of the northern Neotropics, ostracode remains are abundant, particularly in late Pleistocene deposits (Pérez et al., 2011b(Pérez et al., , 2013Cohuo et al., 2020). The greatest limitation for using freshwater ostracodes to identify drivers of species distribution in the northern Neotropics is the scarcity of integrated and comparable regional studies, for which detailed spatial and temporal limnological and biological data were collected.
In this study using a set of selected geodiversity variables (geology, mineralogy), measured limnological conditions (physical and chemical variables of water) and biological data of freshwater ostracodes from southeast Mexico to Nicaragua, we aim to answer two main questions: (1) To what extent does geodiversity control limnological variables and thus define limnological regions? (2) How do geodiversity and limnological conditions influence biological richness and diversity of ostracode species? 2 Materials and methods

Study area
Our study area covers the northernmost Neotropics, ranging from southern Mexico (Yucatán Peninsula) to Nicaragua (Fig. 1). This region is considered a biodiversity hotspot (Mesoamerican hotspot; Myers et al., 2000), with more than 5000 endemic vascular plants (De Albuquerque et al., 2015), and about 1120 bird (∼ 200 endemic), 440 mammal (65 endemic), 690 reptile (∼ 240 endemic), 550 amphibians (∼ 350 endemic), and 500 fish species (∼ 350 endemic) (CEPF, 2021). Species from North and South America converge in this region (Myers et al., 2000;Ojeda et al., 2003;DeClerck et al., 2010;Rull, 2011). The orography is highly irregular, and elevations range from sea level to more than 4500 m a.s.l. (Molnar and Sykes, 1969;Marshall et al., 2003;Marshall, 2007). Tectonic plate interactions are responsible for active volcanism along the Central American Volcanic Arc and high seismic activity (Marshall et al., 2003(Marshall et al., , 2007. The climate is typically tropical (Köppen-type group A-climate "tropical/megathermal climate", Peel et al., 2007) and predominantly warm at low elevations (26 • C mean annual temperature) (Waliser et al., 1999). Because of the irregular orography, at least a dozen climate subzones are distinguished (Taylor and Alfaro, 2005). Mean annual precipitation in the study area ranges from < 500 to > 3000 mm and is highly seasonal, governed by the seasonal migration of the Intertropical Convergence Zone (ITCZ). The northern position of the ITCZ during summer results in the so-called "rainy season", during which precipitation increases from ∼ 240 mm in April to more than 1600 mm in September-October (Hastenrath, 1967;Magaña et al., 1999). The hurricane season extends from July to December and is an important contributor to the humidity budget because on average, 300 mm d −1 of rain falls during tropical storms and hurricanes (Jury, 2011). The region is rich in aquatic systems that are of different origins, shapes, and hydrological dynamics, as well as water chemistry and sediment composition. The karst Yucatán Peninsula, for example, with about 8000 cenotes (sinkholes), many lakes, and both surface and subterranean rivers, is considered a unique hydrological region (Schmitter-Soto et al., 2002a;Alcocer and Bernal-Brooks, 2010). In Central America, lakes and wetlands cover more than 8 % of the total land area (Ellison, 2004). The most important aquatic ecosystems in the northern Neotropics include coastal, tectonic, and volcanic lakes, such as crater lakes and maars, karst waterbodies including lakes, cenotes, and aguadas (water accumulated in topographic depressions under canopy cover), flooded caves, subterranean rivers, and both permanent and ephemeral ponds (Brezonik and Fox, 1974;Pérez et al., 2011a;Delgado-Martínez et al., 2018;Echeverría-Galindo et al., 2019;Obrist-Farner and Rice, 2019).

Sampling aquatic environments in the northern Neotropics
A total of 76 aquatic ecosystems located in 5 countries across the northern Neotropical region ( Fig. 1) were sampled during July-October 2013, coinciding with the rainy season in the region. These systems are situated on the Yucatán Peninsula, Mexico (n = 28), Guatemala (n = 26), El Salvador (n = 14), Honduras (n = 5), and Nicaragua (n = 3) (Fig. 1). For all water bodies, limnological conditions such as temperature, dissolved oxygen, pH, and conductivity were measured in situ with a WTW Multi Set 350i multi-parameter probe at a water depth of 0.5 m. The maximum water depth at each site was measured with an echosounder Fishfinder GPSMAP 178C. The location of each site, including elevation, latitude, and longitude, was determined with a navigator Garmin GPSmap 60c.
Water samples for analysis of major anions (Cl − , SO 4 2− , CO 3 2− , HCO 3 − ) and cations (Ca 2+ , K + , Mg 2+ , Na + ) were collected at water depths of 0.5 m below surface using a Ruttner-type water sampling bottle. All water samples were filtered in situ using a 0.45 µm pore-sized Whatman glass microfiber filter. For the cation analysis, filtered samples were acidified with HNO 3 to pH 2. Waters were stored under refrigeration until analysis.
Biological samples were collected from the littoral zone and the deepest area of the profundal zone. In most lakes, we collected five samples in the littoral areas, using a 250 µm mesh hand net and five surface sediment samples collected from the littoral to the deepest area with an Ekman grab. Collection sites were distributed at regularly spaced intervals across the systems and were mostly characterized by submersed vegetation. For sediment samples, to ensure we collect only extant specimens, we only analyzed the uppermost 3 cm of each grab. In six large lakes, approximate to or exceeding an area of 100 km 2 , such as Atitlán, Amatitlán (Guatemala), Coatepeque, Ilopango (El Salvador), Nicaragua, and Masaya (Nicaragua), seven water and seven sediment samples were taken.
2.3 Environmental variables: water chemistry, sediment geochemistry, mineralogy, and geology Water chemistry analysis: Ionic composition was analyzed following Armienta et al. (2008). Bicarbonate was measured by acid titration to pH 4.6, using a mixed indicator of methyl red and bromocresol green. Concentrations of calcium and magnesium were obtained by complexometric titration with EDTA, whereas sodium and potassium were measured by atomic emission spectroscopy. Chloride was potentiometri-cally determined using an ion-selective electrode, adding a 5 M solution of NaNO 3 as an ionic strength adjuster. Sulfate was determined by turbidimetry. Analytical quality was checked by ionic charge balance and most samples were balanced, with < 5 % error. Major ion concentrations were expressed in mg L −1 , but the data were transformed to meq L −1 and percentages to determine anion and cation dominance and water type and test the charge balance. Sodium and potassium were summed. Ternary plots were constructed using the PAST 4.03 software (Hammer et al., 2001). Geochemical analysis: Total carbon (TC) and total nitrogen (TN) contents in sediments were determined by combustion under oxygen saturation with a LECO TruSpec Macro CHN analyzer. Total inorganic carbon (TIC) was quantified with a Woesthoff Carmhograph C-16 after dissolution with phosphoric acid (45 % H 3 PO 4 ) and detection of the CO 2induced conductivity change in NaOH. Total organic carbon (TOC) was calculated by subtracting TIC from TC.
Mineralogical analysis: Qualitative and semi-quantitative mineralogical compounds in sediments were examined by x-ray diffraction with a RIGAKU Miniflex600. For the identification and semi-quantification of the minerogenic components, the software Philips Highscore was used. All sediment analyses used are described in detail in Vogel et al. (2016).
Geological analysis: The geological map of North and Central America generated by the Geological Society of America (GSA) (Reed et al., 2005) and adapted and converted to a geographic information system (GIS) by Garrity and Soller (2009), was used to identify geological regions in our study area. The ArcGIS software was used to identify geological attributes of sampling sites such as bedrock type and age of sediments. Three major types of bedrock were distinguished: sedimentary, volcanic, and plutonic rocks. Ten geologic periods and epochs, respectively, were defined: Jurassic, Cretaceous, Tertiary of undetermined age, Paleogene of undetermined age, Paleocene, Eocene, Oligocene, Neogene of undetermined age, Miocene, and Quaternary.
Prior to the statistical analysis, numerical data were log-transformed, except for pH, which is already a logtransformed value, to achieve an approximate normal distribution of variables. Normality was verified for all variables using the Shapiro-Wilk test. Missing data represented < 8 % of the data set and were substituted with the mean value of the respective variable (Jakobsen et al., 2017).
We performed a cluster analysis (CA) to define groups of lakes based on the similarity of their measured attributes. For this analysis, we included all numerical variables. We used the unweighted pair group method with arithmetic mean (UPGMA) for the CA, and Euclidean distance to investigate the grouping similarity of sampling points. Calculations were conducted in R software (R Core Team, 2017), using the Vegan package (Oksanen et al., 2017).
We then used a Principal component analysis (PCA) for each of the main groups discriminated by the cluster, to identify correlated and explanatory variables of the data sets. For each group, the first PCA run included all 23 variables measured (numerical and categorical), and those represented by superimposed arrows in the graphs were considered correlated and excluded from further statistical analysis. A second PCA run, using uncorrelated variables, was used to iden-tify explanatory variables of the data sets. The PCAmixdata package implemented in R software (Chavent et al., 2014) was used because of its ability to handle quantitative and categorical data simultaneously. The loading values for all parameters were obtained using normalized rotation.
To provide a graphical representation of the most meaningful variables of the data sets detected in the PCA, we created an environmental variable-specific map using kriging interpolation. Resulting maps represent measured data and estimates from unmeasured locations. The software Surfer ® from Golden Software, LLC (https://www.goldensoftware. com, last access: 2 April 2022) was used for calculations.

Biological analysis: ostracode abundances, identification, and statistics
Ostracode extraction and counting were carried out using 15 cm 3 of wet sediment. Specimens were picked using a Leica MZ75 stereomicroscope and sorted according to external morphology. Only adult specimens, represented by single valves, empty carapaces or complete organisms (carapace and soft body) were counted. For morphotypes identification to species level, at least three individual adult specimens with complete soft parts were dissected. Identification using more than a single organism is appropriate in ostracodes, given morphological plasticity and mutations that can modify morphology in single specimens. Individuals were dissected using distilled water and glycerin (3 %) under a stereomicroscope. Selected shells from identified species were photographed with a Zeiss Axio Imager 2 microscope. Shells were stored in micropaleontologic slides. Dissected soft parts were mounted on individual slides with Hydromatrix ® . Species-level identification was conducted using species keys provided by Karanovic (2012). Taxonomic classification follows Cohuo et al. (2016). Undissected material was preserved in Eppendorf plastic vials with 70 % ethanol and is currently available at the ostracode collection of the Instituto Tecnológico de Chetumal, Mexico. The Shannon diversity index, used for species diversity metrics, was calculated with PAST 4.03 software (Hammer et al., 2001). Ostracode species associations: Species associations were examined by means of non-metric multidimensional scaling (NMDS) (Legendre and Legendre, 1998). This procedure generates an ordination in a two-dimensional space, representing the pairwise dissimilarity between species according to their occurrences. We used the Bray-Curtis dissimilarity coefficient on a presence-absence database (Sørensen-Dice coefficient), since count data were highly heterogenous (with species presence in subsets of samples and absences in most of the database), to which NMDS is sensitive. Only species with at least two occurrences were included in this analysis. In the NMDS graph, species associations were determined with a hierarchical CA based on Ward distances. To test the significance between species groups discriminated in the NMDS, a permutational multivariate analysis of variance (PERMANOVA) was performed. We used a permutation with 9999 replicates and applied the Bonferroni correction. Calculations were done with R software, using the Vegan package (Oksanen et al., 2017).

Structural equation modeling (SEM)
To disentangle the relationships between the geodiversity and limnological conditions with species composition (as a function of distribution) and richness we used structural equation modeling (SEM), the R software, and the package Lavaan (Rosseel, 2012). This is a multivariate statistical technique that enables one to model pre-defined causal relationships between observed (measured parameters) and latent variables which are not observed directly but rather mathematical modeled from observed variables (e.g., geodiversity and limnological conditions) and tests their statistical significance (Fan et al., 2016;Sarstedt and Ringle, 2020). Our conceptual model for SEM was based on the assumption that geodiversity variables in the northern Neotropics are heterogenous. Consequently, limnological conditions of aquatic systems were partially or entirely influenced by underlying geodiversity. At the regional scale, geodiversity and limnological conditions were expected to exert a direct or indirect influence on ostracode species richness and composition. The individual influence of variables that display environmental gradients such as elevation, conductivity, and TOC, were also tested in the models evaluated. In such cases, the variables were excluded from the construction of latent variables in the respective model. For this conceptual framework, "geodiversity" (latent and exogenous variables) was constructed with all or a subset of geological (bedrock type and age and elevation) and mineralogical variables. "Limnological conditions" (latent and endogenous variables) were constructed using geochemistry, and physical and chemical variables of water (major anion and cation, temperature, pH, and conductivity). Species richness was treated as an observed variable, and the latent variable "species composition" was constructed using NMDS associations. Using a covariance matrix with a set of uncorrelated variables, we fitted five models using this conceptual framework. For all models, statistical significance was tested with root mean square error of approximation (RMSEA), comparative fit index (CFI) and standardized root mean squared residuals (SRMR). The predictive power of the model (R-square) was measured based on the amount of variation of the biological data. The most parsimonious model fitting our data set was selected as the explanatory model (Sect. S1 in the Supplement).

Limnological regionalization in the northern Neotropics
The cluster analysis (CA) identified two main groups that represent limnological regions (Fig. 2). The first group (YG: Yucatán and Guatemala) consists of lowland lakes from the Yucatán Peninsula (Mexico), the Petén district (northern Guatemala) and the Pacific lowlands of southern Guatemala. The second group (GSHN: Guatemala, Salvador, Honduras, Nicaragua) consists of Guatemalan highland lakes, El Salvador and Honduras mid-elevation lakes, and Nicaraguan lowland lakes. Table S1 shows a list of all the studied aquatic ecosystems located in the YG and GSHN limnological regions and subregions, as well as detailed results of water physicochemical, geochemical, mineralogical, and geological measurements for all studied water ecosystems.
For the YG region, the first PCA run identified 13 uncorrelated variables. The second PCA run clearly explained the variation of the data set (Fig. 3a). The first (PC1) and second (PC2) components explained 38 % of the total variance of the data set ( Fig. 3a and Table S2.1 in the Supplement). The PC1 accounted for 23.4 %, and the PC2 for 14.6 % of the total variance, respectively. The biplot based on component 1 and 2 indicated that conductivity (ranging from 175 to 3479 µS cm −1 ) and related ion sodium (Na + ), chloride (Cl), and magnesium (Mg 2+ ) were the variables that exhibit the highest correlations (< 0.64) with the first principal component ( Fig. 3a-c). Thus, they represented the most influential variables differentiating aquatic ecosystems in the YG region. The pH, ranging from 6.9 to 9.9, was highly correlated (> 0.73) with the second component (PC2), suggesting that it is the second most influential variable of the YG aquatic environments ( Fig. 3a and d and Table S2.1). Figure 3b-d show regional distribution of the most meaningful variables for the YG region.
For the GSHN region, the PCA based on 13 uncorrelated variables, explained 49.6 % of the total variance of the data set, within the first (PC1) and second (PC2) components ( Fig. 4a and Table S2.2 in the Supplement). The first component (PC1) accounted for 26.8 % and PC2 for 22.8 % of the total variance. The PCA biplot based on components 1 and 2, respectively, showed that water ionic composition, particularly content of bicarbonates (HCO − 3 , ranging from 4 to 373 mg L −1 ), sodium (Na + , ranging from 2 to 180 mg L −1 ), and chloride, were the most influential in discriminating aquatic systems in the GSHN region, as it correlated > 0.62 with the first component (Fig. 4a-c and  Table S2.2). Geochemical and geological variables such as TOC (ranging from 5 % to 22.9 %), age and bedrock were the second most influential variables as they are strongly correlated with the second component (> 0.75) (Fig. 4a and d). Cluster analysis dendrogram of the 76 studied aquatic ecosystems using 23 water physicochemical, sediment geochemistry, mineralogical, and geological variables. Blue line indicates cut-off criterion for cluster partition. Two major groups and seven subgroups were detected and named according to their position in the study area: Yucatán Peninsula/Northern Guatemala (YG) and Guatemala highlands, El Salvador, Honduras, Nicaragua (GSHN). In order to provide a graphic representation of cluster grouping, a color bar was assigned to each group and lakes within these groups were plotted on the (b) YG map and (c) GSHN map using the same color. Full lake name codes presented in cluster and maps are given in Table S1. Numbers of lake name codes correspond to that in Fig. 1. Figure 4b-d show the spatial distribution of meaningful variables for the GSHN region.
Water ionic dominance was graphically evaluated with ternary plots (Figs. S1 and S2 in the Supplement) and information on lake water types is shown in Table S1. Here, we highlight the most relevant characteristics for YG and GSHN limnological subregions.

The YG region composed by karst aquatic systems
In the YG limnological region, four subregions were identified in the cluster analysis. The first subregion (YG1; n = 9) within YG included systems located in central-southern Yucatán and in the northern Petén district in Guatemala (Fig. 2). Lakes are located in the lowlands (< 170 m a.s.l.) and most of them are relatively shallow (< 16 m depth), such as Bacalar, Encantada, Sabana Chetumal, Rosario, Salpetén, Chichancanab, except for Lake Petexbatún (40 m depth) and Lachuá (378 m depth). The latter constitutes the deepest lake in the study area. Waters of these systems are dominated primarily by sulfates, followed by calcium and magnesium. The YG limnological subregion YG2 (n = 15) contained lakes that are mostly restricted to the Guatemalan lowlands (Fig. 2). Most of these systems are large lakes, including Lake Petén Itzá, one of the largest (100 km 2 ), deepest (165 m), and oldest (∼ 400 ka) lakes of the northern Neotropics. Lake waters are dominated by carbonates, and therefore calcium prevails in these lakes. Aquatic systems located in the central and northern portion of the Yucatán Peninsula were grouped in YG3 (n = 11) (Fig. 2). Colac, Sabak ha, Mucuyche, Oxolá, Gruta Miguel Hidalgo, and Yumku are cenotes in northern Yucatán, and lakes Yalahau, Caobas, Kaná, Señor, and Muyil are located in the central-northern Peninsula. Carbonates dominate lake waters and chloride shows high values. Limnological subregion YG4 (n = 10) was constituted of lakes located in the central-southern Yucatán Peninsula (Fig. 2). Lakes are relatively large, shallow, and far from the Caribbean and Gulf of Mexico coasts, at least > 60 km, e.g., Silvituc and Chacanbacab. Waters are dominated by carbonates and calcium. The mineralogical analysis reveals that most lakes are dominated by carbonates in the YG limnological region. In subregions YG1 and YG3 (central and northern Yucatán Peninsula), most lake sediments have calcite as the dominant mineral (Table S1 and Fig. S2). Lakes Chichancanab and Salpetén both belong to YG but show carbonates with a codominance of phyllosilicates and gypsum. Lakes from YG2 are mainly dominated by phyllosilicates and feldspars. The mineralogical composition of lake sediments of the YG4 subgroup varies. Sediments of the lakes such as Vallehermoso, Emiliano Zapata, and Chacanbacab are dominated by phyllosilicates with or without feldspars. Sediments of lakes such as Miguel Hidalgo, on the other hand, are dominated by carbonates, with calcite as main mineral, whereas Lake Silvituc is characterized by exotic minerals such as silver and gold (Table S1).

The GSHN limnological region is composed by volcanic aquatic systems
In the GSHN region, three subregions were identified by CA. Lakes of GSHN1 (n = 16) are located in Central America at low and middle elevations, ranging from 17 to 689 m a.s.l.
Lake waters show a clear dominance of carbonates, followed by sodium, potassium, and magnesium (Table S1 and Fig. 4). The GSHN2 is composed of three of the largest lakes in Central America: Ilopango, Coatepeque, and Nicaragua. These three lakes originated from volcanic activity. Lakes Ilopango and Coatepeque are caldera lakes, whereas Lake Nicaragua surrounds Volcanoes Concepción and Maderas. There is no clear pattern for ions in the water column, but sodium and potassium dominated, followed by magnesium. The GSHN3 (n = 12) is formed mainly by crater lakes located in the highlands of Guatemala, Honduras, and El Salvador. Most lake waters are dominated by bicarbonates, followed by magnesium, whereas Chiligatoro, Chicabal, and Alegría are dominated by sulfates (Table S1). Sediment mineralogy of Central American lakes (GSHN region) shows that most subgroups are dominated by feldspars. Co-dominance with other minerals such as phyllosilicates and carbonates occurs. The GSHN1 combined lakes are dominated by phyllosilicates and feldspars, whereas quartz is the dominant mineral in lakes Yojoa and Ticamaya (Honduras). The GSHN2 region included large lakes dominated by feldspars. In Lake Nicaragua, this dominance is also shared with phyllosilicates. For the GSHN3, we detected two main mineral assemblages. The first is dominated by phyllosilicates and feldspars, with clay minerals and feldspars as main minerals, and the second is dominated by feldspars and phyllosilicates.

Ostracode species associations and their relationship with limnological regions
We found ostracode species in 74 of the 76 aquatic systems we studied in the northern Neotropics. In the volcanic Lake Alegría (El Salvador) and the karstic cave San Miguel (Yucatán, Mexico), ostracodes were not observed. Living adult specimens were encountered in samples from all systems, except those from lakes Chicabal, Tekoh, Yaxhá, Verde, and Cenote Mucuyche, where only empty shells or single valves were recovered. A taxonomic analysis of species enabled us to identify 70 species (Table S3 in the Supplement), out of which 31 were recorded at single sites, whereas the remaining 39 were observed in at least 2 systems. Species richness ranged between 1 to 9 with an average of 4 species per site, whereas the maximum value of the Shannon diversity index (H ) was 2.1, corresponding to lakes Bacalar and Petén Itzá and the Candelaria River. For all other lakes, the index aver-aged 1.1. The list of ostracode species found in our study is presented in Table S3. The NMDS ordination, based on species occurrence data, revealed five major species associations (OST 1-5) with a reliable stress value of 0.08 (Fig. 5a) (Clarke, 1993). The PERMANOVA test showed statistically significant differences between group centroids (F = 1.19, p = 0.0001), thus supporting NMDS ordinations. Ostracode groups 1 and 2 are located in the YG limnological region (karst terraces), and groups 3-5 in the limnological region GSHN (volcanic Guatemalan highlands and Central American mid-elevations and lowlands). The first species group (OST1) consisted of 12 ostracode species, recorded from lakes and cenotes from the eastern Yucatán Peninsula (Fig. 5b). Most of these species are tolerant of high conductivity (particularly related to Na + and Cl − ), such as Heterocypris punctata and Limnocythere floridensis (Fig. 5c), Cyprideis cf. salebrosa and Perissocytheridea cf. cribrosa. The second species group (OST 2) included 11 species, distributed in lakes and ponds of the southern Yucatán Peninsula and northern Guatemala (Fig. 5b). Some of these species (Fig. 5b), such as Cypria petenensis, Cypretta campechensis and Paracythereis opesta, are considered endemic (Cohuo et al., 2016). Some oth-ers, such as Alicenula serricaudata, Pseudocandona antilliana and Cytheridella ilosvayi, have a wide Neotropical distribution (Cohuo et al., 2016). The third species group (OST 3) was integrated of seven species distributed mainly in mid-elevation lakes from Guatemala, El Salvador, and Honduras (Fig. 5b). Several of these species have very restricted distributions and correspond to the Strandesia, Keysercypria and Cypridopsis genera (Fig. 5c). The fourth species group (OST 4) is composed of six species: Heterocypris nicaraguensis, Potamocypris islagrandensis, Physocypria granadae, Limnocytherina royi, Perissocythere marginata, and Cyprideis sp., distributed exclusively in lakes in Nicaragua ( Fig. 5a and b). The fifth group (OST 5) included only three species from highland lakes of Guatemala: Chlamydotheca cf. colombiensis, Cypria sp. 4 and Cypridopsis sp. 7 (Fig. 5b and c).

Structural equation modeling: geodiversity, limnological conditions, and freshwater ostracodes
Five models of the relationships of geodiversity, limnological conditions, and species composition and richness were tested with structural equation modeling (SEM). Four models used geodiversity only as exogenous variables (models 1-2, 4-5), and in one model (model 3) "limnological conditions" were also considered as an exogenous variable (two exogenous variables, explaining an endogenous variable). Descriptions of the rationale behind variable selection and relationships tested in each model are found in Sect. S1. Model 1 evaluated the direct influence of geodiversity on limnological conditions and the influence of limnological conditions on species composition and richness, resulting in the following metrics of global fit: CFI -0.63 (values close to 1.00 indicate better fit of the model), RMSEA -0.19 (values < 0.05 are considered reliable; Fabrigar et al., 1999), and SRMR -0.12 (values < 0.08 are generally considered a good fit; Hu and Bentler, 1999). Model 2 tested relationships similar to model 1, but additionally the relevance of TOC on species composition and richness, with the following statistics: CFI -0.79, RMSEA -0.13, and SRMR -0.14. Models 1 and 2 received relatively low values of global fit, all below the threshold of statistical significance. Values of a chi-square test were < 0.05 in these two models, and they were therefore rejected as explanatory models. Model 3 evaluated the influence of geodiversity and limnological conditions on species composition. These two variables were considered independent and without any influence on one another. Metrics of global fit of this model were better than in models 1 and 2, (CFI -0.94, RMSEA -0.06, and SRMR -0.14). However, the data were too small to calculate the statistical significance of both exogenous variables, and then, interpretation of the results is not reliable. This model was also rejected. Models 4 and 5 evaluated the same relationships between geodiversity, limnological conditions, and species composition and richness as in model 1, but additionally, the indi-vidual influence of elevation and conductivity (model 4) and TOC and latitude (model 5) were analyzed. Model 5 received the following metrics of global fit: CFI -0.62, RMSEA -0.20, and SRMR -0.17, all below the threshold of significance, and the model was therefore rejected. The optimal model was model 4 (Fig. 6). This is supported by the following metrics of global fit: CFI − −0.94, RMSEA − −0.01, and SRMR − −0.03. The optimal model suggests that geodiversity strongly influenced limnological conditions, which in turn explained species composition (distribution) but not species richness. Elevation and conductivity did not explain species richness but demonstrated significant influences on species composition. The direct influence of limnological conditions on species composition was statistically significant, whereas it did not exert an influence on species richness (p-value > 0.05). All paths in Fig. 6 are statistically significant (p-value < 0.05), except for the direct effect of limnology, conductivity, and elevation on species richness.

Geodiversity defines two main limnological regions in the northern Neotropics
Our cluster analysis based on 76 lakes and 23 lake attributes shows limnological regionalization in the northern Neotropics. Two main regions, corresponding to Yucatán Peninsulanorthern Guatemala (Group YG) and northern Central America (Group GSHN), were identified (Fig. 2). The group YG is located in karstic plateaus of sedimentary origin, dominated by limestone, dolomite, evaporites, and carbonate-rich impact breccia (Hildebrand et al., 1995;Schmitter-Soto et al., 2002a, b;Vázquez-Domínguez and Arita, 2010). The group GSHN is located in volcanic bedrock terrains of Guatemala, El Salvador, Honduras, and Nicaragua, where pyroclastic and volcanic epiclastic materials, usually reworked, are abundant, reflecting active or past volcanic activity (Dengo et al., 1970;Stoiber and Carr, 1973;Carr, 1984). The YG and GSHN groups were further subdivided into four (YG1-4) and three (GSHN1-3) subgroups, representing limnological subregions. This proposed regionalization therefore reveals high heterogeneity of aquatic systems in the northern Neotropics. Multivariate statistics (PCA) show that regions and subregions can be distinguished by the ionic composition of waters. Geochemical variables related to sediments, such as TOC and mineral composition, are recognized as the second most important characteristics (Figs. 3 and 4).
In the YG karst region, lakes are characterized by carbonate, calcium and calcite signatures, which is expected because waters interact with limestone and dolomite-rich bedrock on the Peninsula (Schmitter- Soto et al., 2002a, b;Perry et al., 2009). This is also responsible for the dominance of calcium, sodium, and magnesium ions in waters, which in turn are related to the generally alkaline surface wa-  ters in most aquatic systems of the region (Alcocer et al., 1998;Schmitter-Soto et al., 2002a, b) (Fig. 3a and d). In specific areas, such as YG3, dominance of chloride is also relevant. This can be explained by two main processes: (1) marine intrusion and (2) input of subterranean waters that have interacted with evaporites. The spatial distribution map of chloride contents in lake waters (Fig. 3c) shows a clear tendency to higher values on the northern Yucatán Peninsula where marine intrusion is probably the most important source of chloride (Sánchez-Sánchez et al., 2015;Saint-Loup et al., 2018). Marine intrusions in northern Yucatán have been mapped as far as 100 km inland (Steinich and Marín, 1996). Pérez-Ceballos et al. (2012) found that several water systems, mainly cenotes, in this same region are characterized by marine waters below freshwater lenses, with water intermixing.
Sulfate is an interesting component of some lakes of the YG1 systems (Socki et al., 2002;Pérez-Ceballos et al., 2012). The presence of sulfates in lake waters may be attributed to the K/T anhydrite/gypsum-bearing impact breccia and dissolution of CaSO 4 of evaporites (Rosencrantz, 1990). This suggests that lakes with high sulfate contents receive high ground water input, as evaporites are only present at depths greater than 170 m below surface, or that these lakes developed along sites with past tectonic activity such as faulting and uplift . Some lakes from the YG1 with high content of sulfates did, in fact, develop along fault zones. For example, Lake Chichancanab is associated with the Sierrita de Ticul fault (Hodell et al., 2005) and Lachuá to the Polochic fault (Erdlac and Anderson, 1982), whereas other lakes are related to high ground water input, such as the Bacalar hydrological system (Perry et al., 2009).
High TOC values in sediments of most lakes of YG2 probably reflect the trophic state of lake waters. Our data confirm results by Pérez et al. (2011a), who recorded a TOC increase from north to south on the Yucatán Peninsula. This may be attributed to the combined effect of soil, precipitation, and vegetation type, which changes from north to south. The northern part of the Yucatán Peninsula is characterized by leptosols, which are shallow soils with high amounts of exposed hard rock and calcareous material (Bautista et al., 2011;Estrada-Medina et al., 2013). There, precipitation of about 450 mm yr −1 (Pérez et al., 2011a), gives rise to lowstature deciduous forests with low biological productivity. In the south of the Peninsula, where precipitation increases to > 3200 mm yr −1 (Pérez et al., 2011a), Luvisols and Vertisols, clayey and fertile soils, support the growth of tropical evergreen forests that generate high amounts of organic matter runoff, particularly during the rainy season.
In Central American, volcanism is the most common mode of lake formation. These lakes are classified as caldera lakes, crater lakes in (partially) active or inactive volcanoes, maar lakes, or are located in volcanic bedrock basins (Golombek and Carr, 1978;Newhall and Dzurisin, 1988;Dull et al., 2001;Vallance and Calvert, 2003) (Table S1). The existence of at least three limnological subregions highlights that these lakes are additionally influenced by regional factors related to orography (elevation), climate and the level of volcanic activity, including magmatic heat and gas input. Ionic dominance of Central American lakes is highly variable, but anions Cl − , SO 4 2− , and cations K + , Mg 2+ , Na + were dominant (Fig. 4a). This ionic composition reveals that two main processes control water chemistry: (1) active volcanic activity with strong interaction with lakes, and (2) precipitation-evaporation rates, especially at high-elevation sites. Dominance of magnesium, chloride, and sulfates, such as in GSHN3, can be attributed to volcanic activity and hydrothermal systems. Chloride and sulfates are strongly influenced by volcanic gas input, by incorporation of HCl and SO 2 , and interaction with igneous rocks. Lakes with such ionic dominance also display high contents of phyllosilicates and feldspars, which may have formed through dissolution by hot and likely acid groundwater, heated by hydrothermal activity. Lakes located in the Trans-Mexican Volcanic Belt of central Mexico (TMVB; Armienta et al., 2008;Sigala et al., 2017) have similar ionic and sediment chemical composition as lakes in Central America. The ionic dominance of the currently extinct crater lakes of Volcano Popocatépetl, e.g., changed from sulfate to calcium-magnesium dominance and finally magnesium dominance, resulting from heating of andesite rocks after a period of increased volcanic activity (Armienta et al., 2000(Armienta et al., , 2008. This suggests that the influence of active volcanism on lake water chemistry may exert similar influences along the northern Neotropical region and American transition zone. High rates of evaporation are anticipated to represent an important driver for water chemistry in Central America as well, because of the dominance of carbonates, bicarbonates, and sodium (GSHN1 subregion). This ionic composition can be attributed to interaction and weathering of volcanic rocks, capture, and dissolution of CO 2 , and high evaporation rates. Although Central America can be considered a tropical and humid region, high temperatures and solar radiation, especially at high elevations, may produce high evaporation rates leading to ion-specific signatures such as dominance of carbonates-bicarbonates.
Given the origin of discriminating variables in PCA for both YG and GSHN, we found three main sources controlling limnological conditions in the northern Neotropics: (1) bedrock type, which determines specific mineral and ionic composition of lake sediments and host waters; (2) volcanic and marine influence, which determines the presence of dominant and conservative ions such as Mg 2+ and Cl − , and (3) precipitation-evaporation balance across altitudinal and latitudinal gradients, which determines the concentration of solutes and therefore conductivity. This is consistent with results from limnological studies in other regions of Central America and southern Mexico, which suggested that geology, through bedrock types and volcanic input to lakes, and marine interactions, are the most influential factors for aquatic environments (Löffler, 1972;Haberyan and Horn, 1999;Haberyan et al., 2003;Cervantes-Martínez et al., 2002;Perry et al., 2002;Schmitter-Soto et al., 2002a;Socki et al., 2002;Pérez et al., 2011a). Several authors, however, consider additional features relevant for lake classification, such as temperature, pH, and altitude (orography) (Brezonik and Fox, 1974;Horn and Haberyan, 1993;Umaña et al., 1999;Haberyan et al., 2003). In our study, water temperature, pH, and elevation scored relatively low on the PCA, suggesting that their influence is relevant only on a local scale. Considering the determinants of limnological variability of aquatic systems at both regional and local scales, the interaction with marine environments and the high variability of aquatic system morphology and origin, it is evident that the current geodiversity is the main factor driving aquatic ecosystem properties and defining limnological regions in the northern Neotropics.

Geodiversity as a determinant of distribution of freshwater ostracodes in the northern Neotropics
We found ostracode species in almost all the aquatic systems we studied. Seventy species were recognized, demonstrat-ing that this group is abundant and diverse in the northern Neotropics. The number of species per lake (species richness) was relatively low. In most sites, we found between two and six species, and in large lakes, such as Petén Itzá and Nicaragua, we found a maximum of nine species. Diversity metrics substantiated this tendency, with values of the Shannon diversity index always < 2, reflecting low diversity in aquatic environments (Margalef, 1957;Chao and Shen, 2003). Although we do not intend to deeply evaluate taxonomic and ecological aspects for the northern Neotropical ostracodes in this study, we detected low species richness in the study area compared to that in South America and Europe. In the flood plains of the Upper Parana River, South America, species richness can be as high as 44 (Higuti et al., 2017), whereas in Europe, a single lake may host 32 species (Sluys, 1981;Lorenschat et al., 2014). This pattern may be influenced by sampling effort, as sampling sites did not always cover the lake extent. In addition, sampling was conducted in a single climatic period. Ostracode phenology, such as differences in hatching time and functional traits, varied within species and are mostly influenced by photoperiod, conductivity, and water temperature (Rossi et al., 2013;Rosa et al., 2021). Therefore, the estimated values of biodiversity may partially represent the true diversity in the region. Low species in Neotropical lakes, however, was also observed in previous studies in the region. In Lake Petén Itzá, for example, a maximum of 11 species was reported by Pérez et al. (2010). In Lake Nicaragua, the largest lake in Central America, seven species were found (Hartmann, 1959). In Colombian aquatic systems, such as the La Fe reservoir (Saldarriaga and Martínez, 2010) and the Magdalena River basin (Roessler, 1990a, b), the number of ostracode species is similar to that in Central America (∼ 6 species per lake). Therefore, to clarify structural and diversity patterns of ostracodes in the Neotropical region, more intensive sampling in lakes and rivers is needed. Evidence from other tropical regions around the world will be valuable to understand patterns of ostracode diversity (tropical vs. temperate) in the Ostracoda group.
The NMDS analysis shows the existence of at least five species associations in the region (OST1-OST5), emphasizing that ostracodes do not conform to a faunal unit, but rather display disjunct faunas (Fig. 5a), similar to what is observed in other studies of ostracodes  and freshwater fishes of the region (Miller, 1966;Matamoros et al., 2015). Ostracode associations are geographically delimited, and no overlap was observed (Fig. 5b). Ostracode groups OST1 and OST2 belong to the YG limnological region, whereas the OST3, OST4, and OST5 associations belong to the GHSN limnological region ( Fig. 5a and b). Few species were present in more than three limnological subregions, and these can be considered of wide Neotropical distribution, e.g., C. ilosvayi and Carebara vidua.
Correspondence between species associations and limnological regions and subregions suggests a major influence of physical and chemical properties of lake environments on biological systems. The SEM analysis exposed the significant influence of geodiversity on limnological conditions, and of limnological conditions on species associations, identified in the NMDS. This illustrates that limnological conditions, particularly geochemistry, and water chemistry, is the primary factor responsible for species distributions in the study area. We were, however, unable to find statistical significance to explain the relationships between geodiversity and limnological conditions with ostracode species richness. This suggest that the number of species per lake may not be fully governed by our predictors. For instance, the individual influence of conductivity and elevation in model 4 and TOC and latitude in model 5, also failed to explain species richness. This revealed that other intrinsic or extrinsic factors such historical water level fluctuations and precipitation-evaporation balance, might instead control species richness. In our SEM models, we also tested the direct influence of geodiversity and its indirect influence through limnology on species distribution and richness. The optimal SEM model demonstrated the significance of paths describing the direct influence of geodiversity on species richness and composition, but significance of standardized coefficients was of minor importance (0.1 and 0.04, respectively). Conversely, strong ties were discovered when we analyzed the indirect effect of geodiversity (via limnology) on species composition (> 2.0). This implies that ostracode species composition is more predictable from limnological variables such as geochemistry and physical and chemical variables of water than from geodiversity variables such as bedrock and mineralogy. Our SEM model also revealed that elevation is an important predictor of species composition (Fig. 6). The negative correlation suggested that an increase in elevation causes a decrease in the number of species per site. Most highland lakes were characterized by up to three species and, more commonly, by a single species, except for large lakes Amatitlán and Atitlán in Guatemala.

Conclusions
The northern Neotropics is a region characterized by diverse environmental conditions, abundant aquatic systems, and high biodiversity. Our limnological survey of 76 aquatic environments identified 2 main limnological regions in the northern Neotropics. The YG region is associated with karst plateaus in southern Mexico and northern Guatemala, whereas the GSHN region is associated with landscapes formed by volcanic activity in southern Guatemala, El Salvador, Honduras, and Nicaragua. At least seven limnological subregions were identified, illustrating the high heterogeneity of aquatic systems in the northern Neotropics. Low ostracode species richness in the northern Neotropics seems to be strongly related to the geological history of the region. The low number of species per lake contrasts with the number of species per lake in temperate regions, which is at least 5 times higher. The SEM analysis highlights that geodiversity has a direct influence on limnological regions, and an indirect but relevant influence on freshwater ostracodes. This is the first study to integrate data on geodiversity including watershed geology, sediment mineralogy, limnological conditions such as physical and chemical characteristics of the water column, geochemistry, and biota in aquatic ecosystems of southern Mexico and Central America. Further studies should focus on the establishment of a more detailed regionalization, by including a greater number of lakes, more environmental variables, and samples collected at different times throughout the seasonal cycle.
Data availability. Water physicochemical, sediment geochemistry, mineralogical and geological data from the 76 aquatic ecosystems sampled in this study are available in Table S1 in the Supplement and in the Pangaea repository: https://doi.org/10.1594/PANGAEA.940538 . Ostracode relative abundances at each sampling site can be found at the Pangaea repository: https://doi.org/10.1594/PANGAEA.940254 (Macario-González, 2022).
Author contributions. LMG and SC conducted the fieldwork, performed data analysis, and wrote the manuscript. AS, LP, and MEG developed, managed, and coordinated the project and contributed to data interpretation and manuscript writing. PH contributed and interpreted mineralogical data and gave scientific input to the manuscript. MC developed water chemistry analyses and gave scientific input to the manuscript. AO, MP, and MRA provided support for sampling in their respective countries and organized sampling permits.
Competing interests. The contact author has declared that none of the authors has any competing interests.
Disclaimer. Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Acknowledgements. We thank all colleagues and institutions involved in this work, including the student team: Christian Vera, León E. Ibarra, Miguel A. Valadéz, and Cuauhtémoc Ruiz (Instituto Tecnológico de Chetumal, Mexico); Ramón Beltran (Centro Interdisciplinario de Ciencias Marinas, Mexico); and Lisa Heise (Universidad Autónoma de San Luis Potosí, Mexico) for their excellent contributions during field work. We also thank the following colleagues and institutions that made the analysis of field sampling and water chemistry possible: the team from the Asociación de Municipios del Lago de Yojoa y su área de influencia (AMUPROLAGO, Honduras); Margaret Dix, Eleonor de Tott, Roberto Moreno (Universidad del Valle de Guatemala, Guatemala); Consejo Nacional de Áreas Protegidas (CONAP, Guatemala); Néstor Herrera (Ministerio de Medio Ambiente, San Salvador), Teresa Álvarez (El Colegio de la Frontera sur, Chetumal Unit, Mexico); María Aurora Armienta (Laboratorio de Química Analítica, Instituto de Geofísica, Universidad Nacional Autónoma de México); and Adriana Zavala (El Colegio de la Frontera Sur, Mexico). We also gratefully acknowledge anonymous reviewers and Mark Brenner for their constructive feedback and language editing. This open-access publication was funded by Technische Universität Braunschweig.
Review statement. This paper was edited by Gabriel Singer and reviewed by two anonymous referees.