Whole transcriptome analysis.
Total RNA was extracted from frozen grapevine tissues using the Spectrum™ Plant Total RNA Kit (Sigma) according to the manufacturer's instructions. mRNA was purified from the total RNA using the Dynabeads mRNA Direct kit (Invitrogen pn 610.12). The mRNA yield ranged from 0.4 to 1.6% of the total RNA. Samples for Ligation Sequencing were prepared according to the SOLiD Whole transcriptome library preparation protocol (pn 4452437 Rev.B). The samples were purified before RNase III digestion with Purelink RNA micro kit columns (Invitrogen, pn 12183-016), digested for 3 to 10 minutes according to the starting amount of mRNA, reverse-transcribed, size-selected using Agencourt AMPure XP beads (Beckman Coulter pn A63881) and barcoded during the final amplification. The resulting libraries were sequenced on an Applied Biosystems SOLiD™ 5500XL, which produced paired-end reads of 75 and 35 nucleotides for the forward and reverse sequences, respectively. Reads were aligned to the reference grape genome using the PASS aligner, a program that performs several alignments on a subset of sequences with different parameters for trimming low-quality bases and selects the values that maximize the number of aligned reads (Campagna et al., 2009). The percentage identity was set to 90% and one gap was allowed, whereas the quality filtering parameters were set automatically by PASS. Moreover, a minimum read length cutoff of 50 and 30 nt was set for the forward and reverse reads, respectively. Spliced reads were identified using the procedure described in the PASS manual (http://pass.cribi.unipd.it). The forward and reverse reads were aligned independently to the reference genome. We used the PASS-pair program in the PASS package to pair the forward and reverse reads, and we retained only those sequences that were uniquely aligned.
As gene reference we used the v1 grape gene prediction available at http://genomes.cribi.unipd.it/grape, and the htseq-count program was used to quantify transcript abundance.
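The read-length cutoffs stated above can be illustrated with a short Python sketch. It only mirrors the thresholds given in the text (50 nt for forward reads, 30 nt for reverse reads); the function name, the tuple layout and the example reads are hypothetical, the cutoff is assumed to be inclusive, and this is not how PASS itself implements trimming or filtering.

```python
# Minimum length (nt) after quality trimming, as stated in the text.
MIN_LENGTH_NT = {"forward": 50, "reverse": 30}


def passes_length_cutoff(trimmed_length, orientation):
    """Return True if a trimmed read is long enough for its orientation.

    Assumption: the cutoff is inclusive (a 50 nt forward read is kept).
    """
    return trimmed_length >= MIN_LENGTH_NT[orientation]


# Hypothetical reads: (name, orientation, length after trimming).
reads = [
    ("r1", "forward", 75),
    ("r2", "forward", 48),
    ("r3", "reverse", 35),
    ("r4", "reverse", 29),
]

kept = [name for name, orientation, length in reads
        if passes_length_cutoff(length, orientation)]
print(kept)  # ['r1', 'r3']
```

Because forward and reverse reads were aligned independently before pairing, the sketch filters each read on its own rather than filtering pairs.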
Multifactorial and pairwise statistical analysis.
The DESeq R package (http://www.r-project.org/) was used to identify differentially expressed genes (DEGs) with both multi-factor and pairwise-comparison approaches (Maza et al., 2013). In order to evaluate the individual effects on gene expression of the genotype (R: 101.14 and M4), of the type of stress imposed (Treatment, T: WW, WS and SS for control, water stress and salt stress, respectively) and of the time point considered (P: T1-T4 and T1-T3 for WS and SS, respectively), a multifactorial analysis was carried out using the multi-factor design method of the DESeq R package (Anders and Huber, 2012; http://bioconductor.org/packages/release/bioc/html/DESeq.html). This method makes it possible to evaluate the weight of each factor considered in the analysis (R, T and P) and its impact on gene expression and, consequently, on DEGs, according to a false discovery rate (FDR) < 0.05 for both experiments and tissues. The procedure consists of the following steps: create a count data set with multiple factors (the three components described above); estimate the size factors and dispersions of the data; fit generalized linear models (GLMs) according to two models (the full model regresses the genes’ expression on both the library type and the treatment condition, whereas the reduced model regresses it only on the library type); and compare the two fits to infer whether the additional specification of the treatment improves the fit and, hence, whether the treatment has a significant effect (Anders and Huber, 2010).
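To make the "estimate size factors" step more concrete, the following Python sketch reproduces the median-of-ratios idea described by Anders and Huber (2010): each sample's size factor is the median, over genes, of the ratio between the gene's count in that sample and the gene's geometric mean across samples. This is an illustrative re-implementation under stated assumptions, not the DESeq code; the function name and the toy count table are hypothetical, and genes with a zero count in any sample are skipped.

```python
import math
from statistics import median


def estimate_size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of rows, one row per gene, one count per sample.
    Genes with a zero in any sample are ignored (their geometric mean is zero).
    """
    n_samples = len(counts[0])
    usable_rows = [row for row in counts if all(c > 0 for c in row)]
    # Log of the geometric mean of each usable gene across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples
                     for row in usable_rows]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - lgm
                      for row, lgm in zip(usable_rows, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Hypothetical toy table: sample 2 was sequenced twice as deeply as sample 1.
toy_counts = [
    [10, 20],
    [100, 200],
    [30, 60],
    [0, 5],   # skipped: contains a zero
]
size_factors = estimate_size_factors(toy_counts)
print(size_factors)
```

In this toy table the second size factor is twice the first, so dividing each column by its size factor places both samples on a common scale before dispersions are estimated and the GLMs are fitted.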
Pairwise tests (Anders and Huber, 2010) between stressed (WS) and unstressed (WW) tissues were also performed. In this case, each genotype, tissue and time point was analysed separately.
Ontology and Differential Clustering Analyses (DCA).
In order to functionally classify the genes affected by the drought treatment, Gene Ontology (GO) terms were retrieved, imported into the Blast2GO software v2.5.0 (Götz et al., 2008) and augmented by about 16% by means of the Annex function (Myhre et al., 2006), as previously reported by Botton et al. (2008). DEGs resulting from the multifactorial analysis that were affected by all components (DEGs common to R, T and P), together with those common to the rootstock and treatment components, were associated with GO categories. Within the most represented GO categories, the DEGs associated with GO terms related to plant hormones, secondary metabolism, sugars, stresses, cell wall and transcription factors (TFs) were selected for the subsequent Differential Clustering Analysis (DCA).
The DCA was performed using an R script that is a slightly modified version of the original method of Ihmels et al. (2005), Lelandais et al. (2008) and Cohen et al. (2010). The analysis is carried out in three steps, described here. (i) The correlation values of all DEGs belonging to the selected GO categories were initially calculated for a reference rootstock (rr) and a target rootstock (tr) with the ‘cor’ R function. (ii) The correlation values of the rr were subsequently clustered by applying the ‘kmeans’ R function. The number of clusters for the rr was selected so that the average correlation value within each cluster was higher than 0.65, a heuristically chosen threshold. The gene order obtained for the rr was then used to arrange the tr DEGs of the GO-selected categories. Hence, the transcripts of each cluster were co-expressed in the rr (correlation > 0.65) but not necessarily in the tr. (iii) The DEGs of each cluster were subsequently grouped, in the tr, into two sub-clusters (a and b) using a hierarchical clustering method (the ‘hclust’ R function). Finally, the average of the correlation values within each sub-cluster (Ca and Cb) and the average of the correlation values between the two sub-clusters a and b (Cab) were calculated.
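The quantities Ca, Cb and Cab from step (iii) can be sketched in Python as follows. The study used R ('cor', 'kmeans', 'hclust'); this sketch only shows how the three averages could be derived once the two sub-clusters are known. The helper names, the gene identifiers and the expression profiles are hypothetical, Pearson correlation is assumed, self-correlations are excluded from the within-cluster averages, and each sub-cluster is assumed to contain at least two genes with non-constant profiles.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def mean_within(profiles, genes):
    """Average correlation over all distinct gene pairs of one sub-cluster."""
    pairs = list(combinations(genes, 2))
    return sum(pearson(profiles[g], profiles[h]) for g, h in pairs) / len(pairs)


def mean_between(profiles, genes_a, genes_b):
    """Average correlation over all pairs with one gene in each sub-cluster."""
    values = [pearson(profiles[g], profiles[h])
              for g in genes_a for h in genes_b]
    return sum(values) / len(values)


# Hypothetical target-rootstock profiles across four time points.
tr_profiles = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],
    "g3": [4, 3, 2, 1],
    "g4": [8, 6, 4, 2],
}
sub_a, sub_b = ["g1", "g2"], ["g3", "g4"]  # as if returned by the clustering

ca = mean_within(tr_profiles, sub_a)
cb = mean_within(tr_profiles, sub_b)
cab = mean_between(tr_profiles, sub_a, sub_b)
print(round(ca, 3), round(cb, 3), round(cab, 3))
```

In this toy case each sub-cluster is internally coherent while the two sub-clusters are anti-correlated with each other, which is the pattern in which a cluster that is co-expressed in the rr splits apart in the tr.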
Correlation values of the tr and rr matrices are graphically represented in white, yellow and red for strongly correlated, weakly correlated and anti-correlated genes, respectively. DCA results were finally presented as a single distance matrix between gene expression measurements, in which the rr and tr rootstocks were represented in rows and columns, respectively. Clusters of each rootstock were compared and assigned to the “full”, “partial”, “split” or “no” conservation categories by comparing the Ca, Cb and Cab values with the threshold T, which was set to 0.5 in this study. Specifically, if both Ca and Cb were < T the cluster was assigned to the “no conservation” category; if Ca or Cb was > T the cluster was assigned to the “partial conservation” category; if both Ca and Cb were > T and Cab was < T the cluster was assigned to the “split conservation” category; and if Ca, Cb and Cab were all > T the cluster was assigned to the “full conservation” category.
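The assignment rule above can be written as a small decision function. The Python sketch below is one reading of the rule, under two assumptions that the text does not state explicitly: "partial" is taken to mean that exactly one of Ca and Cb exceeds T, and values exactly equal to T are treated as not exceeding it. The function name is hypothetical.

```python
def classify_conservation(ca, cb, cab, t=0.5):
    """Assign a cluster to a conservation category from Ca, Cb and Cab.

    t defaults to 0.5, the threshold used in the study.
    Assumption: a value equal to t does not count as exceeding it.
    """
    a_conserved = ca > t
    b_conserved = cb > t
    if a_conserved and b_conserved:
        # Both sub-clusters are coherent; Cab decides full versus split.
        return "full" if cab > t else "split"
    if a_conserved or b_conserved:
        return "partial"
    return "no"


print(classify_conservation(0.8, 0.7, 0.6))  # full
print(classify_conservation(0.8, 0.7, 0.1))  # split
print(classify_conservation(0.8, 0.2, 0.1))  # partial
print(classify_conservation(0.2, 0.3, 0.9))  # no
```

Checking the "both sub-clusters conserved" case first keeps the four categories mutually exclusive, which the prose rule implies but does not spell out.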
mRNA-seq sequencing and mapping of reads to the grapevine genome
To gain a better understanding of the molecular mechanisms underlying the drought tolerance of the M4 genotype compared with the susceptible 101.14, an mRNA-seq analysis using Next Generation Sequencing (NGS) technologies was performed. Whole transcriptome sequencing was carried out on a SOLiD 5500XL platform. The analysis covered all 80 samples previously described, in order to obtain a detailed snapshot of the transcriptome changes of the two rootstocks over the entire course of the stress. The whole experiment produced approximately 4.8 billion paired-end reads (75 and 35 bp in length for forward and reverse reads, respectively), with the total number of reads produced for each time point ranging from 29 to 82 million paired-end reads and a median of 45 million reads. On average, 90% of the total reads passed the quality control test (filtering based on read length after trimming of the low-quality bases) and were mapped to the PN40024 12Xv1 reference genome (Jaillon et al., 2007), producing a number of uniquely mapping reads ranging from 10 to 37 million depending on the sample.
Multi-factor design and discovering of differentially expressed genes
For the evaluation of differentially expressed genes (DEGs) upon water stress in the two rootstocks under study, a main step in the statistical analysis was estimating the influence of different independent components (i.e. variables) on the transcriptome. Among these were the genotype, indicated as “R” (101.14 and M4), the type of treatment imposed, indicated as “T” (well watered, WW; water stress, WS), and the time point considered during the course of the stress, indicated as “P” (T1, T2, T3, T4).
Thus, a multi-factor analysis was carried out on the mRNA-seq data sets obtained from WS root tissues and WS leaf tissues, in order to evaluate both the individual (R, T, P) and combined (R:T, R:P, T:P, R:T:P) impacts of each component on DE genes at a p-value lower than 0.05. The Venn diagram shown in Figure 2A summarizes the impact of each component, indicating the number of genes specifically influenced by a single component and those influenced by more than one variable. In root tissues undergoing water stress treatment, the total number of DEGs influenced by each single component R, T and P was 74008, 7905 and 5839, respectively (Figure 2A), whereas in leaf tissues, DEGs were 3794 for R, 3476 for T and 2284 for P (Figure 2B). In other words, considering for example WS roots (Fig 1A), 2887 genes were differentially expressed only because of the effect of rootstock, i.e. of the different genotype, regardless of the treatment (WW or WS) and the time point considered (T1-T4). Conversely, 2077 genes were exclusively influenced by the treatment, independently of the genotype (101.14 or M4) and the time point considered (T1-T4). Finally, only 551 genes appeared to change their expression solely because of the time course of the experiment.
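The distinction between the total number of DEGs for a component and the number of DEGs influenced exclusively by that component corresponds to the regions of a three-set Venn diagram. The Python sketch below shows how such regions could be derived from three DEG lists; the function name and the gene identifiers are hypothetical and do not reproduce the study's data.

```python
from itertools import combinations


def venn_regions(deg_sets):
    """Map each combination of factors to the genes significant for
    exactly those factors and for no other factor."""
    names = sorted(deg_sets)
    regions = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            inside = set.intersection(*(deg_sets[n] for n in combo))
            outside = set().union(*(deg_sets[n] for n in names
                                    if n not in combo))
            regions[combo] = inside - outside
    return regions


# Hypothetical DEG lists for rootstock (R), treatment (T) and time point (P).
deg_sets = {
    "R": {"g1", "g2", "g3"},
    "T": {"g2", "g3", "g4"},
    "P": {"g3", "g5"},
}
regions = venn_regions(deg_sets)
print(sorted(regions[("R",)]))           # genes affected by R only
print(sorted(regions[("R", "T")]))       # genes affected by R and T but not P
print(sorted(regions[("P", "R", "T")]))  # genes affected by all three factors
```

Summing the sizes of all regions that include a given factor recovers that factor's total DEG count, which is how the per-component totals and the exclusive counts quoted above relate to each other.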
Plant Material and experimental design
Experiments were performed on Vitis vinifera L. cv Cabernet Sauvignon (CS) grafted onto 1103P and M4 rootstocks (Pasqua vigneti e cantine, Novaglie VR, Italy). Grapevines were grown in well-watered conditions.
Whole berries were collected from both CS/1103P and CS/M4 bunches at 45, 59 and 65 days after full bloom (DAFB). During this phase most berries reached véraison. Additional samples (skin and pulp separately) were collected at 72, 86 and 100 DAFB (Figure 1). All tissue samples were immediately frozen in liquid nitrogen and stored at -80 °C until use. Two biological replicates were collected, each made up of one hundred berries collected from fifty bunches (two berries per bunch) and chosen according to the CIRG index proposed by Carreño et al. (1995) at the same position within the cluster (median position). Sampling was performed in 2011 and 2012.
Physical (berry diameter and volume) and biochemical (Total Soluble Solids, °Brix) parameters were determined on 80 berries at each time point considered in the experiment. In order to better define the progression of berry ripening in the two CS/rootstock combinations, colorimetric analyses were performed on 100 berries at 45, 59, 65, 72 and 86 DAFB, using the CR-10 colorimeter (Konica-Minolta Holdings Inc., Tokyo, Japan). Color was measured according to the L*a*b* space, defining brightness (L*, from white to black) and the chromatic coordinates (a*, from red to green; b*, from yellow to blue).
RNA-seq and qPCR analyses.
Total RNA for both mRNA-seq and real-time PCR experiments was extracted from either whole berry samples or from skin and pulp separately, using the perchlorate method as reported by Ziliotto et al. (2012). Small RNAs for microRNA-seq analyses were extracted from all samples following the CTAB method (Chang et al., 1993), with a few modifications.
mRNA was purified from the total RNA using the Dynabeads mRNA Direct kit (Invitrogen pn 610.12). A variable quantity of mRNA ranging from 0.4 to 1.6% with respect to the total RNA was obtained.
Samples for Ligation Sequencing were prepared according to the SOLiD Whole transcriptome library preparation protocol (pn 4452437 Rev.B). Samples were purified before RNase III digestion with Purelink RNA micro kit columns (Invitrogen, pn 12183-016), digested for 3 to 10 minutes according to the starting amount of mRNA, reverse-transcribed, size-selected using Agencourt AMPure XP beads (Beckman Coulter pn A63881), and barcoded during the final amplification. The resulting libraries were sequenced on an Applied Biosystems SOLiD™ 5500XL, which produced paired-end reads of 75 and 35 nucleotides for the forward and reverse sequences, respectively. Reads were aligned to the reference grape genome using the PASS aligner, a program that performs several alignments on a subset of sequences, trying different parameters for trimming low-quality bases and selecting those that maximize the number of aligned reads (Campagna et al., 2009). The percentage identity was set to 90% with one gap allowed, whereas the quality filtering parameters were set automatically by PASS. Moreover, a minimum read length cut-off of 50 and 30 nt was set for the forward and reverse reads, respectively. Spliced reads were identified using the procedure described in the PASS manual (http://pass.cribi.unipd.it). Forward and reverse reads were aligned independently to the reference genome. The PASS-pair program from the PASS package was used to pair forward and reverse reads and to select only those sequences that were uniquely aligned. Version 1 of the grape gene prediction available at http://genomes.cribi.unipd.it/grape was used as the gene reference, whereas the htseq-count program (http://www-huber.embl.de/users/anders/HTSeq/doc/count.html) was used to quantify transcript abundance.
RNA-seq was carried out at three specific phases of berry development: pre-véraison, véraison and maturity. Specifically, pre-véraison (45 DAFB) and the traditional CS harvest date (100 DAFB) were the same for both combinations, while the full véraison stage was at 72 and 86 DAFB for CS/M4 and CS/1103P, respectively (Figure 1). mRNA and small RNA from pre-véraison (whole berries), véraison and harvest (skin and pulp) samples of both CS/rootstock combinations were used for the mRNA-seq and microRNA-seq analyses, performed at CRIBI, University of Padova.
For quantitative real-time PCR analysis (qPCR), cDNA was synthesized using 2 µg of total RNA, 2.5 µM (dT)18 primer, 200 Units of M-MLV Reverse Transcriptase (Promega) and 1 Unit of RNAguard (Amersham Biosciences), at 37°C for 90 minutes in a final volume of 20 µL. qPCR was carried out in triplicate on two biological replicates for each sample with a StepOne Plus Real-Time PCR System (Applied Biosystems), using the specific primers listed in Supplementary table 1. For microRNA quantification, cDNA synthesis and qPCR experiments were carried out using the TaqMan® MicroRNA Assays (Life Technologies) according to the manufacturer’s instructions.
Statistical and bioinformatics analyses on mRNA-seq and microRNA-seq data
The R package DESeq (Anders and Huber, 2010) was used for the statistical analyses of both mRNA-seq and microRNA-seq data. A false discovery rate (FDR) of 0.05 was set as the threshold for identifying differentially expressed genes (including those encoding miRNAs). A pairwise comparison between the M4 and 1103P genotypes was carried out for each pair of samples at each time point (pre-véraison, véraison, harvest).
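As an illustration of what an FDR threshold of 0.05 means in practice, the Python sketch below applies the Benjamini-Hochberg adjustment to a list of raw p-values and keeps the genes whose adjusted value falls below 0.05. That Benjamini-Hochberg is the adjustment applied here is an assumption (the text states only the FDR threshold); the function name and the p-values are hypothetical, and a strict "< 0.05" comparison is assumed.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five genes.
raw_p = [0.001, 0.008, 0.039, 0.041, 0.6]
adj_p = benjamini_hochberg(raw_p)
significant = [i for i, p in enumerate(adj_p) if p < 0.05]
print(adj_p)
print(significant)
```

In this toy example the third and fourth genes have raw p-values below 0.05 but adjusted values slightly above it, so only the first two genes would be reported as differentially expressed.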
In order to cluster together transcripts and microRNAs with complementary behaviors, a Time Course Cluster analysis using the Mfuzz R package (Kumar and Futschik, 2007) was performed.
Table of contents:
Chapter I – General introduction
1. Grapevine rootstocks
1.1 Influence of rootstocks on grapevine abiotic stresses tolerance
1.2 Rootstocks widely used in viticulture and characterization of new genotypes with OMICS techniques
2. Study of genes expression by using a transcriptomic approach
3. Grapevine and its genome
4. References (Chapter I)
Chapter II – Transcriptomes comparison of a susceptible and a putative tolerant grapevine rootstocks to drought give insight into the mechanisms of water stress tolerance
2. Materials and methods
2.1 Experimental design and plant material
2.2 Whole transcriptome analysis
2.3 Multifactorial and pairwise statistical analysis
2.4 Ontology and Differential Clustering Analyses (DCA)
3.1 Physiological traits analyses
3.2 mRNA-seq sequencing and mapping of reads to the grapevine genome
3.3 Multi-factor design and discovering of differentially expressed genes
3.4 Differential cluster analysis (DCA)
6. References (Chapter II)
Chapter III – Grapevine rootstocks differentially affected ripening inception and auxin-related genes of Cabernet sauvignon berries
2. Materials and methods
2.1 Plant Material and experimental design
2.2 RNA-seq and qPCR analyses
2.3 Statistical and bioinformatics analyses on mRNA-seq and microRNA-seq data
2.4 Phylogenetic analyses of ARF and AUX/IAA auxin-related gene families
3.1 Physical and biochemical analyses
3.2 Differentially expressed (DE) genes, DEmiRNA and time course clustering analyses
3.3 Characterization and phylogenetic analyses of grape AUX/IAA and ARF auxin-related families
3.4 CS/1103P and CS/M4 berries highlights differential regulation of auxin-related genes
3.5 Effect of rootstock on miRNA expression and their control of auxin metabolisms
6. References (Chapter III)
Chapter IV – Grape berry ripening delay induced by a prevéraison NAA treatment is paralleled by a shift in the expression pattern of auxin- and ethylene related genes
2. Materials and methods
2.1 Plant Material and treatment
2.2 RNA extraction, microarray analysis and quantitative real time PCR
2.3 Microarray annotation and enrichment analysis
2.4 HORMONOMETER analyses
3.1 Biochemical analyses
3.2 Differentially expressed genes and enrichment analysis
3.3 MapMan analysis
3.4 HORMONOMETER analysis
3.5 Expression of auxin-, ethylene-, and abscisic acid related genes
7. References (Chapter IV)
Chapter V – General conclusions
References (Chapter V)