Chapter 3 Expression analysis of ZFAS1 and ZNFX1
Transcription of lncRNAs that are antisense to protein-coding genes is common in human and other eukaryotic cells. More than 63% of protein-coding transcripts have antisense partners, most of which do not encode protein (47, 223, 224). The majority of lncRNAs originate from the vicinity of protein-coding genes (225). These antisense lncRNAs can regulate the expression of overlapping genes in cis, in which case they act on genes transcribed from the same DNA region, or in trans, in which case they act on distal genes. They can act as versatile positive or negative modulators of gene function (224) by interacting with DNA, RNA and protein to exert a number of regulatory functions (94, 121, 136, 226, 227).
ZFAS1 is orientated antisense to a protein-coding gene, ZNFX1, as described in the UCSC Genome Browser (228). The location of ZFAS1 suggests that it may regulate the expression or function of ZNFX1, in which case their expression levels would be expected to be correlated. Previous experiments in mouse samples have shown that Zfas1 and Znfx1 are independently regulated (107), but this has not been investigated in human samples. We explored the relationship between these genes by measuring their expression in a number of cell lines and by analysing The Cancer Genome Atlas (TCGA) dataset to determine whether they are co-regulated.
The dysregulation of lncRNAs has been linked to many diseases, including cancer (229). Additionally, some lncRNAs are expressed selectively in different breast cancer subtypes (230). Askarian-Amiri et al. (107) demonstrated that ZFAS1 is expressed at lower levels in a small cohort of invasive ductal carcinoma samples compared to matched normal tissue.
Gene expression was determined by quantitative PCR (qPCR). qPCR offers several advantages over other gene expression techniques: it is highly sensitive, detects the amplification of PCR products in real time, accurately quantifies weakly expressed genes, requires only a small amount of RNA as a template for cDNA synthesis, and provides quantitative expression levels for specific genes of interest. qPCR is based on the principle of conventional PCR, in which Taq polymerase amplifies a specific product from a cDNA or genomic DNA template (depending on the primers used) through a series of heating and cooling cycles. For gene expression quantification, qPCR uses fluorescent markers whose signal increases as amplified DNA accumulates (231). On this basis, qPCR was selected to quantify ZFAS1 and ZNFX1 expression. SYBR was used as the fluorescent marker as it binds directly to double-stranded DNA.
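As an illustration of how qPCR cycle threshold (Ct) values can be converted into expression relative to housekeeping genes, the following is a minimal sketch of the widely used 2^-ΔCt calculation. It is not taken from this study: the function name, the averaging of the reference Ct values and the assumption of 100% amplification efficiency are all assumptions made for the example, and the Ct values are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target, ct_references):
    """Expression of a target gene relative to reference genes (2^-dCt).

    Assumptions (not stated in the text): amplification efficiency is 100%
    (the product doubles every cycle), and the reference Ct is the arithmetic
    mean of the housekeeping-gene Ct values, which corresponds to the
    geometric mean of their quantities.
    """
    if not ct_references:
        raise ValueError("at least one reference Ct value is required")
    delta_ct = ct_target - mean(ct_references)
    return 2 ** (-delta_ct)


# Hypothetical Ct values for one sample (illustrative only).
zfas1_ct = 22.0
hprt_ct, gapdh_ct = 24.0, 18.0  # mean reference Ct = 21.0
print(relative_expression(zfas1_ct, [hprt_ct, gapdh_ct]))  # 0.5
```

A target that crosses the threshold one cycle later than the averaged reference is therefore reported as half as abundant; a real analysis would typically also correct for primer efficiency.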
Expression of ZFAS1 and ZNFX1 was further studied using the publicly available TCGA database. This database is a collaborative effort between the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI) that aims to generate a comprehensive collection of cancer genomic profiles for over 30 different tumour types. Data are derived from several cooperating centres that collect and process samples, after which genomic DNA and cDNA are prepared and subjected to high-throughput sequencing and bioinformatic analysis. Several platforms are used to provide global information on cancer genomics, of which we analysed RNA sequencing (RNA-seq) data generated on the Illumina system. RNA-seq is a high-throughput technology for transcriptome profiling that can rapidly identify and quantify rare and common transcripts, multiple isoforms, novel transcripts and ncRNA (232). This method is well suited to examining the expression of the lncRNA ZFAS1 as well as that of its protein-coding counterpart, ZNFX1. Breast cancer data sets were analysed to allow comparison with the cell line data and to identify possible differences in gene expression between breast cancer subtypes.
Previous experiments in mice have found that Zfas1 transcripts are present in both the cytoplasm and the nucleus, whereas Znfx1 is restricted to the nucleus (107). LncRNAs are often transcribed as multiple isoforms (55), and these isoforms may have distinct functions (233). Additionally, the distribution of lncRNAs in specific cellular compartments is related to their functions. For example, the nuclear-restricted lncRNA NEAT1 regulates paraspeckle formation (234), whereas UCHL1AS functions in the cytoplasm to regulate the translation of its protein-coding counterpart mRNA (136). To determine whether ZFAS1 functions in an isoform- or cellular compartment-dependent manner, expression of ZFAS1 isoforms was analysed by PCR in multiple cell lines and in different cellular compartments.
This chapter investigates the expression of ZFAS1 and ZNFX1 in both cell lines and human tumour derived data sets, with a focus on breast cancer. The expression of these genes was analysed to determine whether they are coregulated, and whether they are differentially expressed in breast cancer, breast cancer subtypes and normal tissue, to give further insight into potential function.
Protein coding potential of ZFAS1
Alternative splicing of precursor mRNA (pre-mRNA) is an important process that increases the repertoire of mRNA isoforms. Five different isoforms have been reported for ZFAS1 according to the Mar 2006 NCBI36.1/hg18 genome assembly. To confirm that the human variants of ZFAS1 are non-protein-coding, as demonstrated in mice (107), predicted open reading frames (ORFs) generated with ExPASy for each isoform were aligned against Ribo-seq data from GWIPS-viz (235) to determine whether the predicted peptides coincided with regions of ribosomal occupancy (Figure 3-1). The majority of the peaks corresponding to ribosomal occupancy overlapped with the genomic regions of intron-derived snoRNAs. Such peaks are a source of background signal in ribosome profiling experiments, as described by Ingolia et al. (236) for GAS5, another lncRNA that hosts several snoRNAs. A peak in ribosomal occupancy was observed in exon 2 of the ZFAS1 isoforms, which corresponded to an ORF predicted to encode a peptide of 25 amino acids (MDFGRGSHHWTSKEATCRHLQPSIS, followed by a stop codon). A query of PeptideAtlas, a database of peptide sequences deduced from proteomic analyses (237), showed that no peptides corresponding to this ORF have been identified. Together, these observations led us to conclude that the human isoforms of ZFAS1 are unlikely to encode a peptide.
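The ORF prediction step can be illustrated with a minimal sketch. This is not the ExPASy tool used in this study; it is a simplified stand-in that scans the three forward reading frames of a transcript for ATG-initiated ORFs that end in a stop codon, using the standard genetic code. The function name, the `min_aa` length threshold and the toy sequence are assumptions made for the example.

```python
# Standard genetic code, built in the conventional TCAG codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((x, y, z) for x in BASES for y in BASES for z in BASES), AMINO_ACIDS
    )
}


def find_orfs(seq, min_aa=10):
    """Return (start_index, peptide) for ATG...stop ORFs in the 3 forward frames.

    Simplifications (assumptions of this sketch): only the sense strand is
    scanned, ORFs without an in-frame stop codon are ignored, and internal
    ATG codons within a reported ORF are not reported separately.
    """
    seq = seq.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] != "ATG":
                i += 3
                continue
            peptide = []
            j = i
            found_stop = False
            while j + 3 <= len(seq):
                residue = CODON_TABLE.get(seq[j:j + 3], "X")  # X = ambiguous codon
                if residue == "*":
                    found_stop = True
                    break
                peptide.append(residue)
                j += 3
            if found_stop and len(peptide) >= min_aa:
                orfs.append((i, "".join(peptide)))
            i = j + 3  # resume scanning after the stop codon
    return orfs


# Toy sequence (not ZFAS1): one short ORF encoding Met-Asp-Phe in frame 2.
print(find_orfs("CCATGGATTTTTAAGG", min_aa=3))  # [(2, 'MDF')]
```

Predicted peptides from such a scan only indicate coding potential; as in the analysis above, they need to be compared against ribosome occupancy and proteomic evidence before any conclusion is drawn.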
Ribosome occupancy, derived from multiple ribosome profiling studies in the GWIPS-viz database, mapped to ZFAS1. Red peaks from the ribosome profile indicate the level of ribosome occupancy, whereas green peaks from mRNA-seq coverage indicate the level of transcription of a particular gene region. ZFAS1 is indicated in blue, with the nucleotide numbers for each exon shown above the gene layout. Potential ORFs, shown in pink, were predicted using ExPASy and mapped to the genomic layout of ZFAS1. Peaks corresponding to ribosome occupancy were then overlaid with the ORFs; most peaks map to snoRNAs in the intronic regions of ZFAS1 (238).
Expression of ZFAS1 in cell lines
ZFAS1 is expressed in mouse mammary gland tissues, and was previously found to be downregulated in human invasive ductal breast carcinoma compared to normal breast tissue (107). To further analyse ZFAS1 expression and to characterise its function, we used breast cancer as our model system. We performed qPCR on cDNA prepared from 17 breast cancer cell lines, a keratinocyte cell line (HaCat), a melanoma cell line (MDA-MB-435) and two breast epithelial cell lines, Bre80hTert and MCF10A. ZFAS1 was expressed at different levels across these cell lines, as shown in Figure 3-2. ZNFX1 was also expressed in these cell lines and, consistent with previous results (107), was approximately 25-fold less abundant than ZFAS1 (Figure 3-2). The expression levels of ZFAS1 and ZNFX1 were not significantly correlated (Pearson correlation coefficient, R = −0.021, p = 0.93), further suggesting that ZFAS1 and ZNFX1 are independently regulated (Figure 3-2B). ER+ and ER- cell lines did not show any significant difference in ZFAS1 or ZNFX1 expression (Figure 3-2C). Many lncRNAs regulate protein-coding genes in cis. If this were the case for ZFAS1 and ZNFX1, the abundance of their transcripts would be expected to be correlated. The lack of any correlation argues against cis regulation between this pair of genes, and provides a basis for seeking alternative ZFAS1 activities.
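The correlation test reported above can be sketched as follows. This is a minimal, dependency-free illustration of Pearson's correlation coefficient and the t statistic used to assess its significance; the function names are assumptions made for the example, and the statistical software actually used in this study is not stated.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# With the values reported in the text (R = -0.021, n = 21 cell lines),
# the t statistic is close to zero, in line with a non-significant result.
print(round(correlation_t_statistic(-0.021, 21), 3))  # -0.092
```

Converting the t statistic into a p value requires the t distribution with n − 2 degrees of freedom, which is left to a statistics package in this sketch.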
(A) Expression of ZFAS1 (left panel) and ZNFX1 (right panel) in cell lines relative to that of the housekeeping genes HPRT and GAPDH, as analysed by qPCR. Dark blue and dark red bars indicate normal breast epithelial cell lines; light blue and light red bars indicate breast cancer cell lines. Green and grey bars indicate non-breast cancer cell lines (MDA-MB-435 and HaCat are melanoma and keratinocyte cell lines respectively). Expression levels were analysed in 3 biological replicates; error bars represent SEM. (B) Correlation of ZFAS1 and ZNFX1 expression in cell lines. Axis values represent gene expression relative to the housekeeping genes HPRT and GAPDH as analysed by qPCR, n = 21. Correlation was calculated using Pearson’s correlation. (C) Comparison of ZFAS1 expression in ER+ (n = 9) and ER- (n = 5) cell lines. P values were calculated using Student’s t-test (238).
Expression analysis of ZFAS1 from The Cancer Genome Atlas
We analysed the TCGA breast invasive carcinoma RNA-seq gene expression data set (HiSeqV2-2015-02-24), which includes 1069 samples from primary breast cancers and 113 samples from normal breast tissue. The expression of ZFAS1 was not significantly different in breast cancer samples compared to normal breast tissue (Figure 3-3A(i)). Nevertheless, ZFAS1 expression was significantly reduced in basal and HER2 breast cancer subtypes compared to normal breast tissue (Figure 3-3A(ii)). ER+ breast tumours also displayed higher expression of ZFAS1 than ER- breast tumours (Figure 3-3A(iii)). A previous study of ZFAS1 in breast cancer (107), using a limited number of samples, suggested that ZFAS1 expression was downregulated in breast cancer cells relative to normal breast epithelial cells. The TCGA data sets allowed this to be investigated more thoroughly, and no differences between unselected neoplastic and normal breast samples were observed. The subtle differences in ZFAS1 expression between certain breast cancer subtypes and normal tissue (Figure 3-3A(ii-iii)) may reach statistical significance because of the large number of samples examined, and may therefore be of minimal clinical impact.
Whether ZFAS1 contributes to cancer progression was further investigated by comparing patient survival time with the corresponding level of ZFAS1 expression. Figure 3-3B displays a Kaplan-Meier plot of the TCGA breast cancer data set generated from http://www.oncolnc.org. High expressers are the 50% of patients with the highest ZFAS1 expression, and low expressers are the 50% of patients with the lowest ZFAS1 expression. These groups do not show significant differences in survival up to 6000 days. Further dividing these patients by their ER status also did not show significant differences in survival (Supplementary Figure 10-1).
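The survival comparison described above rests on two steps: splitting patients into high and low expressers at the median, and estimating a survival curve for each group. The following is a minimal sketch of both steps. It does not reproduce the OncoLnc implementation; the function names, the handling of an odd number of patients and the toy data are assumptions made for the example.

```python
def split_by_median(expression):
    """Label each patient 'low' (bottom half) or 'high' (top half) by expression.

    Assumption of this sketch: with an odd number of patients, the middle
    patient is placed in the 'high' group.
    """
    order = sorted(range(len(expression)), key=lambda i: expression[i])
    half = len(order) // 2
    labels = [None] * len(expression)
    for rank, idx in enumerate(order):
        labels[idx] = "low" if rank < half else "high"
    return labels


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival probability).

    times  : follow-up time for each patient (e.g. days)
    events : 1 if death was observed at that time, 0 if the patient was censored
    """
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Toy data (hypothetical, not TCGA): four patients, one censored at day 2.
print(split_by_median([5.0, 1.0, 3.0, 9.0]))        # ['high', 'low', 'low', 'high']
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))     # [(1, 0.75), (3, 0.375), (4, 0.0)]
```

In practice, a curve would be estimated separately for the high and low groups and the two compared with a log-rank test, which is not implemented here.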
(A) (i) Expression of ZFAS1 in normal breast (n = 113) and breast cancer (n = 1069) samples. (ii) Expression of ZFAS1 by tumour subtype based on PAM50 classification. ZFAS1 is more highly expressed in normal tissues than in basal and HER2 breast cancer subtypes. (iii) Expression of ZFAS1 in ER+ (n = 601) and ER- (n = 179) breast cancer samples. Unpaired Student’s t-test showed that ZFAS1 was differentially expressed according to estrogen receptor status. Box plots extend from the 25th to the 75th percentile. Whiskers represent minimum and maximum values for each plot. (B) Kaplan-Meier plot of the TCGA breast cancer data set, generated from http://www.oncolnc.org/, showing overall survival of breast cancer patients relative to tumour ZFAS1 expression. High expressers are the 50% of patients with the highest ZFAS1 expression, and low expressers are the 50% of patients with the lowest ZFAS1 expression. High expressers of ZFAS1 do not show altered survival up to 6000 days. Unpaired Student’s t-test relative to normal tissue samples was used to calculate P values (238).
Isoform detection by RT-PCR
Primers were designed to differentiate between the isoforms of ZFAS1 (Figure 3-4) in order to determine which isoforms are expressed. The predicted product sizes used to identify which isoforms are amplified are detailed in Table 3-1.
Breast epithelial, breast cancer and non-breast cancer cell lines were tested for the presence of multiple isoforms (Figure 3-5). Every cell line tested using primer set ZFAS1 E1F3-E5R1 contained at least 3 isoforms, showing that the presence of ZFAS1 isoforms is not cell line specific. The isoforms appear to be the same size across different cell lines except for 8701BC, which contains an isoform slightly larger than 388 bp.
Breast epithelial (MCF10A), breast cancer (BT20, KPL1, PMC42ET, MDA-MB-468, MCF7, MDA-MB-453, MDA-MB-436, HBL100, ZR751, P7731, SKBR3, HCC1143, MDA-MB-231, HCC1954, 8701BC and HCC70) and non-breast cancer cell lines (MDA-MB-435 and HaCat) were investigated for different isoforms of ZFAS1 using primer set ZFAS1 E1F3-E5R1. Arrowheads indicate the multiple isoforms and their sizes detected in cell lines examined (238).
Previous attempts to amplify the entire ZFAS1 transcript from exon 1 to exon 5 were unsuccessful. To determine why primers at the beginning of the first exon were unable to amplify ZFAS1, the GC content of exon 1 was analysed using BitGene 2010 (http://www.bitgene.com/cgi/gene_analysis.cgi) (Figure 3-6). The 5’ end of ZFAS1 has a GC content of approximately 80%. This high GC content, in addition to the length of the template to be amplified, may have caused amplification of the 5’ end of ZFAS1 to fail.
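The kind of GC content analysis described above can be sketched with a simple sliding-window calculation. This is not the BitGene tool used in this study; the function names, window size, step size and toy sequence are assumptions made for the example.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def sliding_gc(seq, window=50, step=10):
    """GC fraction in successive windows, as (window start, GC fraction) pairs."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence (not ZFAS1): a GC-rich 5' half followed by an AT-rich 3' half.
print(sliding_gc("GGGGCCCCATATATAT", window=8, step=8))  # [(0, 1.0), (8, 0.0)]
```

Plotting the window values along exon 1 would show where a candidate primer falls relative to GC-rich stretches, which is the comparison drawn between primers E1F1, E1F2 and E1F3.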
Primers that bind nearer to the 5’ end of exon 1, represented by E1F1 and E1F2, lie in GC-rich regions, whereas primer E1F3 lies in a region of approximately 50% GC content.
Previous experiments in mice showed that Zfas1 was located in both the cytoplasm and the nucleus, whereas Znfx1 mRNA was restricted to the nucleus (107). Since cellular location can provide an indication of lncRNA function, cellular fractionation was performed to identify the subcellular localisation of ZFAS1 in MDA-MB-468 and MDA-MB-231 breast cancer cells. Cell lysates were separated into cytoplasmic and nuclear fractions, total RNA was extracted and cDNA was analysed by RT-PCR. To validate the effectiveness of the fractionation procedure, we used NEAT1, a nuclear lncRNA, as a nuclear marker, and GAPDH as a positive control, as it is found in both the cytoplasm and the nucleus (Figure 3-7).
(A) Expression of GAPDH in cellular compartments in MDA-MB-231 and MDA-MB-468 cells. GAPDH is found in both the cytoplasm and the nucleus. (B) Expression of NEAT1 in cellular compartments in MDA-MB-231 and MDA-MB-468 cells. NEAT1 is present in the whole cell and nuclear fractions in both cell lines. The expression of these two genes was detected by RT-PCR and visualised on an agarose gel stained with ethidium bromide. Extracts are from whole cell (W), cytoplasm (C), and nucleus (N); n, biological replicate.
Primers E1F3-E5R1 amplify ZFAS1 from exon 1 to exon 5 and detect variants 2 to 5. At least 3 of these 4 isoforms are present in both the cytoplasm and the nucleus in both cell lines, as indicated by the number of distinct bands among the PCR products (Figure 3-8).
ZFAS1 was amplified using the primer set ZFAS1 E1F3-E5R1, amplifying at least 3 of the 4 isoforms. These isoforms are present in both cytoplasm and nucleus. Arrowheads point to the PCR products and indicate the expected sizes. n, biological replicate (238).
Variants 3 and 4 are 366 and 388 bp in length respectively, and appeared as a single band on previous gels. To differentiate between them, an internal PCR was performed using primers E1F3-E2R2 (Figure 3-9). This yielded 2 products, both present in the cytoplasm and the nucleus, confirming the presence of both variants.
ZFAS1 was amplified first with PCR using the primer set ZFAS1 E1F3-E5R1, then an internal PCR was performed using primer set E1F3-E2R2 to differentiate between variants 3 and 4. Both isoforms are present in both the cytoplasm and nucleus. n, biological replicate (238).
To identify the remaining isoform of ZFAS1, variant 1, KAPA HiFi polymerase was used as it is designed to amplify GC-rich regions. An initial PCR was performed using primers E1F1-E2R2, yielding a faint product of 295 bp (Figure 3-10). Because of the faint signal, a second internal PCR was performed to confirm that variant 1 was present. Using primers E1F2-E2R1, a 93 bp product was amplified (Figure 3-10). This isoform of ZFAS1 was present in both the cytoplasm and the nucleus, leading us to conclude that the isoforms cannot be functionally distinguished by their distribution between cellular compartments.