The interplay of riverscape features and exotic introgression on the genetic structure of the Mexican golden trout (Oncorhynchus chrysogaster)

Study system and data collection

Sampling was conducted in Río Fuerte, Río Sinaloa and Río Culiacán at altitudes ranging from 1965 to 2730 m (Fig. 1). The heterogeneous riverscape of this area offers a great diversity of terrestrial and aquatic habitats that support high endemism and biodiversity (Hendrickson et al., 2006), with O. chrysogaster as one of the principal predators inhabiting these riverine ecosystems. However, the introduction of rainbow trout for aquaculture purposes has been reported since the 1860s and was strongly supported by federal agencies during the 1980s and 1990s (Escalante et al., 2014).
Based on a previous study (Escalante et al., in revision), six variables were considered to characterize the riverscape: precipitation of the driest month, temperature of the warmest quarter, river length, slope, altitude and stream order. They were generated in that study from data available at the WorldClim database (http://www.worldclim.org/) and the Japan Space Systems website (http://www.jspacesystems.or.jp/ersdac/GDEM/E/4.html). The effects of latitude and longitude were also tested.
Wild trout were collected by electrofishing at 26 sample sites in Río Fuerte (10 sites), Río Sinaloa (11 sites) and Río Culiacán (5 sites) during the winter and spring seasons of 2013, 2014 and 2015 (Fig. 1). Additionally, rainbow trout samples obtained from two aquaculture farms at Río Sinaloa, as well as seven farmed O. chrysogaster and four laboratory hybrids between O. chrysogaster and O. mykiss donated by the Mexican Institute of Fisheries (INAPESCA), were included in the study for genotyping. Small pieces of tissue (either fin or muscle) were clipped from each fish and preserved in 95% ethanol for subsequent analyses.

Genotyping by sequencing

Genomic DNA from each individual was extracted following the Qiagen DNeasy Blood & Tissue Kit protocol (Qiagen, Hilden, Germany, http://www1.qiagen.com). DNA quality was checked by agarose gel electrophoresis, and DNA was quantified using a NanoDrop spectrophotometer (Thermo Scientific) and the Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen). DNA libraries were then generated by genotyping-by-sequencing (GBS) methods (Elshire et al., 2011) at Cornell University (Ithaca, New York) using the EcoT22I enzyme. Finally, single-read 100-bp sequencing was performed on an Illumina HiSeq 2500.
The quality of raw sequences was checked with FastQC (Andrews et al., 2010). Reads were processed with Cutadapt (Martin, 2011) to remove potential fragments of Illumina adapters, allowing only 10% mismatch in the adapter sequence. The Stacks 1.32 pipeline (Catchen, 2013; Catchen et al., 2011) was used to demultiplex reads, identify Restriction site Associated DNA (RAD) loci and call SNPs. Reads were filtered for overall quality, demultiplexed and trimmed to 85 bp using the process_radtags module, allowing one mismatch in the barcode sequence. We ran the ustacks module of Stacks with a minimum stack depth of 4x and a maximum distance allowed between stacks of 4 (6 for secondary reads, which could not be used to call SNPs). The catalog of loci was built with the cstacks module using n = 4, and samples were matched against this catalog with the sstacks module. Finally, individuals were genotyped using the populations module, requiring each locus to be genotyped in at least 70% of the individuals with a minimum read depth of 5x. Genotypes were exported in VCF format for further filtering. The SNP dataset was filtered using VCFtools (Danecek et al., 2011), retaining SNPs with an average read depth between 8x and 40x and a minor allele frequency of at least 1%. We also built a blacklist of loci deviating from Hardy-Weinberg equilibrium (HWE, p-value ≤ 0.05) in one or more of three populations that were exempt from stocking and had a relatively large number of individuals.
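The per-SNP filters described above can be illustrated with a minimal sketch. It assumes genotypes are coded as counts of the alternate allele (0, 1, 2) with `None` for missing calls. The function names are hypothetical and are not part of Stacks or VCFtools, and the HWE check uses a simple chi-square approximation with one degree of freedom, whereas dedicated tools may rely on an exact test.

```python
from typing import Optional, Sequence

# Assumed coding for this sketch: number of alternate alleles (0, 1 or 2)
# at a biallelic SNP, with None for a missing genotype call.
Genotype = Optional[int]

# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CHI2_CRITICAL_1DF_5PCT = 3.841


def passes_filters(genotypes: Sequence[Genotype],
                   min_call_rate: float = 0.70,
                   min_maf: float = 0.01) -> bool:
    """Return True if one SNP passes the call-rate and MAF thresholds."""
    called = [g for g in genotypes if g is not None]
    if not genotypes or not called:
        return False
    call_rate = len(called) / len(genotypes)
    # Alternate allele frequency among called diploid genotypes.
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1.0 - alt_freq)
    return call_rate >= min_call_rate and maf >= min_maf


def hwe_chi2(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square statistic comparing genotype counts with HWE expectations."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Classes with zero expectation (monomorphic loci) are skipped.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
```

Under these assumptions, a locus would be added to the blacklist when `hwe_chi2(...)` exceeds `CHI2_CRITICAL_1DF_5PCT` in at least one of the reference populations.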
After quality filtering, we built six datasets at four different spatial scales: Dataset A for population genetic analyses, including all the genotyped individuals; Dataset B for landscape genetic analyses across the whole study area, including native trout; Dataset C for landscape genetic analyses of central populations; Dataset D for landscape genetic analyses at Río Fuerte; Dataset E for landscape genetic analyses at Río Sinaloa; and Dataset F for gene-environment associations across the whole study area with native trout. Further information about the datasets is included in Appendix 5.

Population genomics analyses

Trout genetic diversity (Dataset A) was estimated from private polymorphisms (PP), expected heterozygosity (HE) and observed heterozygosity (HO) using adegenet (R package; Jombart, 2008). Effective population size (NE) was calculated by applying a molecular co-ancestry method (Nomura, 2008) implemented in the R package NeEstimator (Do et al., 2014).
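As an illustration of the two heterozygosity measures, the sketch below computes HO and HE for biallelic SNPs and averages them across loci. It assumes the 0/1/2 genotype coding with `None` for missing calls and uses the plain 2pq expectation without a small-sample correction, so values may differ slightly from those reported by adegenet. The function names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Genotype = Optional[int]  # 0, 1 or 2 alternate alleles; None = missing


def locus_heterozygosity(genotypes: Sequence[Genotype]) -> Tuple[float, float]:
    """Return (HO, HE) for one biallelic locus."""
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("locus has no called genotypes")
    n = len(called)
    # Observed heterozygosity: share of heterozygous individuals.
    ho = sum(1 for g in called if g == 1) / n
    # Expected heterozygosity under HWE: 2pq from the allele frequency.
    p = sum(called) / (2 * n)
    he = 2 * p * (1 - p)
    return ho, he


def mean_heterozygosity(loci: Sequence[Sequence[Genotype]]) -> Tuple[float, float]:
    """Average HO and HE across loci for one sample site."""
    pairs: List[Tuple[float, float]] = [locus_heterozygosity(l) for l in loci]
    ho = sum(pair[0] for pair in pairs) / len(pairs)
    he = sum(pair[1] for pair in pairs) / len(pairs)
    return ho, he
```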
Genetic distances among all sample sites (Dataset A) were assessed with two different approaches using the adegenet package. First, pairwise FST coefficients among sample sites were calculated. Second, a phenogram was built from all individuals using Nei's genetic distance (Tamura & Nei, 1993) with the neighbor-joining algorithm (Saitou & Nei, 1987). Confidence intervals based on bootstrap values were estimated from 10,000 permutations.
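To make the pairwise FST step concrete, here is a minimal sketch that computes a simple ratio-type FST, (HT - HS) / HT summed over loci, from the allele frequencies of two sample sites. This is an assumption made for illustration: the estimator actually used by the R packages may differ (for example by correcting for sample size), and the function name is hypothetical.

```python
from typing import Sequence


def pairwise_fst(freqs_a: Sequence[float], freqs_b: Sequence[float]) -> float:
    """Ratio-of-sums FST between two sites from per-locus allele frequencies."""
    if len(freqs_a) != len(freqs_b):
        raise ValueError("both sites need the same loci")
    numerator = 0.0
    denominator = 0.0
    for pa, pb in zip(freqs_a, freqs_b):
        # Mean within-site expected heterozygosity.
        hs = (2 * pa * (1 - pa) + 2 * pb * (1 - pb)) / 2
        # Expected heterozygosity of the pooled pair of sites.
        p_bar = (pa + pb) / 2
        ht = 2 * p_bar * (1 - p_bar)
        numerator += ht - hs
        denominator += ht
    # Loci fixed for the same allele in both sites carry no information.
    return numerator / denominator if denominator > 0 else 0.0
```

Identical allele frequencies give 0, and two sites fixed for alternative alleles give 1.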
We then applied fastStructure (Raj et al., 2014) to assess the genetic structure of O. chrysogaster and its genetic admixture with aquaculture trout. Based on a Bayesian framework, this approach infers population genetic structure from large SNP datasets without assuming predefined populations. We ran fastStructure on Dataset A, including all SNPs from all the genotyped individuals. Based on the number of sample sites, we considered K = 30 as the maximum value.

Riverscape genetics analyses

Riverscape drivers on genetic diversity

To investigate the influence of the riverscape on neutral genetic diversity, we fitted multiple linear regressions of expected heterozygosity on riverscape variables. A bidirectional stepwise selection procedure was performed (R package MASS; Ripley, 2002), and the environmental variables influencing expected heterozygosity were selected based on the Akaike information criterion (AIC). We ran four analyses with different datasets (Dataset B, Dataset C, Dataset D and Dataset E), each time testing seven predictors: latitude, longitude, precipitation of the driest month, temperature of the warmest quarter, river length, altitude and stream order.
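The selection logic can be sketched as follows. This is a minimal pure-Python illustration of bidirectional stepwise selection for ordinary least squares, scoring each model with AIC computed as n log(RSS/n) + 2p. It is not the MASS implementation; the function names and the dictionary-based input are assumptions made for the example.

```python
import math
from typing import Dict, List, Sequence, Set


def _solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination (fails on collinear predictors)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def aic(y: Sequence[float], predictors: Dict[str, Sequence[float]],
        chosen: Set[str]) -> float:
    """AIC of the OLS model with an intercept plus the chosen predictors."""
    names = sorted(chosen)
    rows = [[1.0] + [predictors[k][i] for k in names] for i in range(len(y))]
    p = len(names) + 1
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(p)]
    beta = _solve(xtx, xty)
    rss = sum((yi - sum(bj * xj for bj, xj in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    n = len(y)
    # Guard against log(0) when the fit is numerically perfect.
    return n * math.log(max(rss / n, 1e-300)) + 2 * p


def stepwise_aic(y: Sequence[float],
                 predictors: Dict[str, Sequence[float]]) -> List[str]:
    """Add or drop one predictor at a time while AIC keeps decreasing."""
    chosen: Set[str] = set()
    best = aic(y, predictors, chosen)
    while True:
        moves = [chosen | {k} for k in predictors if k not in chosen]
        moves += [chosen - {k} for k in chosen]
        scored = [(aic(y, predictors, m), sorted(m)) for m in moves]
        if not scored:
            break
        score, names = min(scored)
        if score >= best:
            break
        best, chosen = score, set(names)
    return sorted(chosen)
```

In this sketch `y` would hold the expected heterozygosity of each sample site and `predictors` would map each riverscape variable name to its per-site values.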

Riverscape drivers on genetic divergence

To investigate the effect of the riverscape on neutral genetic differentiation, we first derived a resistance surface from four riverscape features using ArcGIS v10.2 (ESRI, 2013). This surface was defined from the rasters of temperature of the warmest quarter, slope, stream order and altitude. We assigned values to the pixels of each raster representing the degree to which they obstruct movement, according to the survival and dispersal requirements of distinct trout species (Hendrickson et al., 2006; Meeuwig et al., 2010). Further information about parameterization is included in Appendix 6.
Then, the gdistance R package (van Etten, 2012) was applied to calculate riverine least-cost distance and riverscape resistance matrices among sample sites at different spatial scales. This method simulates potential movement of species in a spatially structured landscape, linking different dispersal functions and connectivity thresholds through Dijkstra's shortest path algorithm. Eight matrices were generated under different movement hypotheses, either Isolation by Riverine Distance (IBD) or Riverscape Resistance (RR): IBD for all populations (Matrix I), RR for all populations (Matrix II), IBD for central populations (Matrix III), RR for central populations (Matrix IV), IBD for Río Fuerte (Matrix V), RR for Río Fuerte (Matrix VI), IBD for Río Sinaloa (Matrix VII) and RR for Río Sinaloa (Matrix VIII). For further details, see Appendix 6.
To assess the influence of both riverine distance and riverscape resistance on genetic distances, we tested the aforementioned movement hypotheses using the ecodist R package (Goslee & Urban, 2007). Mantel tests were performed between linearized genetic distances, FST/(1-FST), and the corresponding riverine distance or riverscape resistance matrices under the eight movement hypotheses (Appendix 6). All tests were performed with 10,000 permutations under the null hypothesis of no correlation.
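The least-cost idea behind these matrices can be sketched with a plain implementation of Dijkstra's algorithm on a raster grid. The sketch assumes a four-neighbour grid in which `None` marks cells outside the river network and the cost of a step is the mean resistance of the two cells involved. These choices, and the function name, are illustrative and do not reproduce the gdistance transition model.

```python
import heapq
import math
from typing import List, Optional, Tuple

Cell = Tuple[int, int]
Grid = List[List[Optional[float]]]  # None = cell outside the river network


def least_cost_distance(resistance: Grid, start: Cell, goal: Cell) -> float:
    """Accumulated cost of the cheapest path between two river cells."""
    rows, cols = len(resistance), len(resistance[0])
    best = {start: 0.0}
    queue: List[Tuple[float, Cell]] = [(0.0, start)]
    while queue:
        cost, (r, c) = heapq.heappop(queue)
        if (r, c) == goal:
            return cost
        if cost > best.get((r, c), math.inf):
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if resistance[nr][nc] is None:
                continue  # movement is restricted to river cells
            step = (resistance[r][c] + resistance[nr][nc]) / 2
            new_cost = cost + step
            if new_cost < best.get((nr, nc), math.inf):
                best[(nr, nc)] = new_cost
                heapq.heappush(queue, (new_cost, (nr, nc)))
    return math.inf  # goal not reachable through the network
```

With a uniform resistance of 1 on every river cell the result reduces to riverine distance in cell units (the IBD hypothesis), while heterogeneous values correspond to the riverscape resistance hypothesis.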
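A simple Mantel test can be sketched as below. It assumes square symmetric matrices given as nested lists, correlates their upper triangles with Pearson's r, and obtains a one-tailed p-value by jointly permuting the rows and columns of the genetic matrix. The function names and the `seed` argument are assumptions for the example and do not mirror the ecodist interface.

```python
import math
import random
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def linearize_fst(fst: float) -> float:
    """Linearized genetic distance FST / (1 - FST)."""
    return fst / (1.0 - fst)


def _upper(m: Matrix, order: Sequence[int]) -> List[float]:
    """Upper triangle of m after relabelling sites according to order."""
    n = len(order)
    return [m[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x: Sequence[float], y: Sequence[float]) -> float:
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation undefined for a constant vector
    return sxy / math.sqrt(sxx * syy)


def mantel(genetic: Matrix, landscape: Matrix,
           permutations: int = 9999, seed: int = 0) -> Tuple[float, float]:
    """Return the Mantel r and a one-tailed permutation p-value."""
    identity = list(range(len(genetic)))
    land = _upper(landscape, identity)
    r_obs = _pearson(_upper(genetic, identity), land)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)
        if _pearson(_upper(genetic, order), land) >= r_obs:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return r_obs, (hits + 1) / (permutations + 1)
```

Here `genetic` would hold the linearized FST values between sample sites and `landscape` one of the eight distance or resistance matrices.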

Detection of SNPs under divergent selection

To detect O. chrysogaster SNPs potentially under selection, we analyzed Dataset F using three software packages and two approaches: (1) population-based (i.e. sample-site-based) outlier detection and (2) association tests between genotypes and continuous climatic variables (i.e. riverscape adaptive genomics).
First, we applied the pcadapt R package (Luu et al., 2016) to detect SNPs under selection following approach 1. Combining principal component analysis and Mahalanobis distances, this method assumes that molecular markers excessively associated with population structure are candidates for local adaptation. Based on the vector of z-scores, loci that do not follow the distribution of the main bulk of points are considered outliers. The analysis was run with a threshold of 10% and K = 6, based on the fastStructure results.
For approach 2, we used two gene-environment association programs, testing two environmental explanatory variables (temperature of the warmest quarter and precipitation of the driest month) with each method. These variables were previously suggested as important drivers of adaptation in salmonids (Hand et al., 2016; Hecht et al., 2015). Using mixed models, both methods detect outlier loci whose allele frequencies show strong statistical correlations with environmental variables. First, we applied BAYENV2 (Günther & Coop, 2013), averaging five independent runs (100,000 iterations). Second, the Latent Factor Mixed Models (LFMM) algorithm included in the R package LEA (Frichot & François, 2015; Frichot et al., 2013) was run with five repetitions, 10,000 cycles, 5,000 burn-in cycles and K = 6 (based on the fastStructure outputs). For LFMM and BAYENV2, we defined a threshold of 1% of the total SNPs to select the outlier SNPs with the highest posterior probabilities.
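The role of the Mahalanobis distance can be illustrated with a small sketch. It assumes each SNP is summarized by z-scores on only two principal components, so that the covariance matrix can be inverted in closed form, and it uses the ordinary sample covariance rather than a robust estimate. The function name is hypothetical, and how the 10% threshold is applied to the resulting statistics is not specified here, so the sketch stops at computing the distances.

```python
from typing import List, Sequence, Tuple


def mahalanobis_sq(z_scores: Sequence[Tuple[float, float]]) -> List[float]:
    """Squared Mahalanobis distance of each SNP from the bulk of SNPs."""
    n = len(z_scores)
    if n < 3:
        raise ValueError("need at least three SNPs")
    mx = sum(z[0] for z in z_scores) / n
    my = sum(z[1] for z in z_scores) / n
    # Sample covariance matrix of the two z-score columns.
    sxx = sum((z[0] - mx) ** 2 for z in z_scores) / (n - 1)
    syy = sum((z[1] - my) ** 2 for z in z_scores) / (n - 1)
    sxy = sum((z[0] - mx) * (z[1] - my) for z in z_scores) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det == 0:
        raise ValueError("z-score columns are perfectly collinear")
    distances = []
    for x, y in z_scores:
        dx, dy = x - mx, y - my
        # d^2 = v' S^-1 v with the 2 x 2 inverse written out explicitly.
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(d2)
    return distances
```

SNPs with the largest distances are those that depart most from the main bulk of points and would be the candidates for local adaptation.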
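The 1% cutoff can be sketched as a simple ranking step. The example assumes each SNP already has a single association score for which larger values mean stronger support (how BAYENV2 and LFMM outputs are converted into such a score is not covered here), and the function name is hypothetical.

```python
import math
from typing import List, Sequence


def top_fraction(scores: Sequence[float], fraction: float = 0.01) -> List[int]:
    """Indices of the SNPs whose scores fall in the top fraction."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Keep at least one SNP; the small offset absorbs rounding error.
    keep = max(1, math.floor(len(scores) * fraction + 1e-9))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep])
```

For a dataset of 200 SNPs, a 1% threshold keeps the two highest-scoring loci.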

Table of contents:

Chapter 1. Thesis summary in French
General introduction (an English version is included in Appendix I)
Objectives
Main results
General discussion
Chapter 2. Genetic introgression of cultured rainbow trout in the Mexican native trout complex.
Abstract
Introduction
Materials and methods
Results
Discussion
Chapter 3. The interplay of riverscape features and exotic introgression on the genetic structure of the Mexican golden trout (Oncorhynchus chrysogaster).
Abstract
Introduction
Materials and methods
Results
Discussion
Chapter 4. Riverscape genetics of the endemic Mexican golden trout, a conservation genomics approach.
Abstract
Introduction
Materials and methods
Results
Discussion
Chapter 5. General Discussion