Labwork and Data-cleaning – Project topics materials


Chapter 3 Labwork and Data-cleaning

Introduction

Through the process of meiosis, offspring inherit genetic characteristics from their parents by inheriting fractions of their parents’ deoxyribonucleic acid (DNA). DNA carries genetic information in the form of sequences of chemicals known as nucleotides. There are four types of nucleotide: adenine, thymine, cytosine and guanine (abbreviated A, T, C and G respectively).
DNA consists of two chains of nucleotides held together by hydrogen bonds, and twisted in a double-helix formation. Nucleotides in one chain are paired with their complementary nucleotides in the other chain, so that adenine will always be paired with thymine, and cytosine will always be paired with guanine. It is therefore sufficient to know the sequence of one chain of nucleotides only, since the other chain will simply be the sequence of complementary nucleotides. The length of DNA is often expressed in terms of millions of base-pairs (Mbp), where the term “base-pair” refers to the complementary pair of nucleotides at a point along the DNA molecule.
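The pairing rule above means one chain fully determines the other. A minimal Python sketch of that idea follows; the function name complement and the example sequence are illustrative assumptions, not part of the original study, and strand direction is ignored for simplicity.

```python
# Pairing rule from the text: A pairs with T, and C pairs with G.
COMPLEMENT = str.maketrans("ATCG", "TAGC")


def complement(strand: str) -> str:
    """Return the complementary chain, position by position.

    Illustrative helper (hypothetical name); strand direction is ignored.
    """
    strand = strand.upper()
    unknown = set(strand) - set("ATCG")
    if unknown:
        raise ValueError(f"unexpected symbols: {sorted(unknown)}")
    return strand.translate(COMPLEMENT)


print(complement("ATCGGA"))  # TAGCCT
```

Applying the function twice returns the original chain, which is another way of seeing that knowing either chain is sufficient.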

Nuclear DNA

DNA within a cell’s nucleus is known as nuclear DNA. In diploid organisms, the chromosomes inside the nucleus are arranged as pairs. The DNA in one of a homologous pair of chromosomes is donated by one of an individual’s parents, and the DNA in the other of the pair is provided by the other parent. An individual’s genetic makeup is therefore some combination of the DNA possessed by each of their parents.
In most species, there is more than one pair of chromosomes within a nucleus. The choice of the DNA provided from a parent to one of its offspring during the process of meiosis is independent between each pair of chromosomes, since the chromosomes are separate physical units (“physically unlinked”). This means the probability of inheritance can be assessed for each chromosome individually, and combined multiplicatively to provide the probability of inheriting the offspring’s full genotype across all chromosomes.
The lengths of DNA sequences found in nuclei differ between species, but they are typically in the order of many millions of nucleotides.
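Because chromosome pairs are inherited independently, the per-chromosome probabilities described above can simply be multiplied together. The sketch below illustrates this under stated assumptions: the function name genotype_probability and the three probabilities are hypothetical values chosen for the example, not results from the study.

```python
from math import prod


def genotype_probability(per_chromosome_probs):
    """Combine independent per-chromosome probabilities by multiplication.

    Illustrative helper (hypothetical name); assumes the chromosomes are
    physically unlinked, as described in the text.
    """
    probs = list(per_chromosome_probs)
    if any(not 0.0 <= p <= 1.0 for p in probs):
        raise ValueError("each probability must lie between 0 and 1")
    return prod(probs)


# Hypothetical probabilities for three chromosome pairs.
example = [0.5, 0.25, 0.5]
print(genotype_probability(example))  # 0.0625
```

In practice the product is often accumulated as a sum of logarithms to avoid underflow when many probabilities are combined; that is a general numerical habit rather than something stated in the source.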

Mitochondrial DNA

Mitochondria are small, intra-cellular structures (organelles) that convert sugar into energy for use by the cell. It has been postulated that mitochondria were once independently existing organisms that, at some point in time, merged with animal cells, providing energy in exchange for protection and cellular regulatory services (Wallin, 1923; Martin and Muller, 1998). Mitochondria have their own DNA, which is typically passed intact from a mother to her offspring. That is, assuming no mutation occurs, an individual’s mitochondrial DNA is almost always identical to his or her mother’s. During fertilisation, the sperm will also contribute mitochondria from the father to the zygote, but either due to a biological process that destroys or rejects the paternal mitochondria, or simply due to the weight of numbers of maternal mitochondria in the egg relative to the number of paternal mitochondria in the sperm, embryos will generally develop with copies of only maternal mitochondria. Mitochondrial DNA is commonly abbreviated as mtDNA.
Most cells contain many mitochondria, and each mitochondrion typically carries between 5 and 20 copies of the mitochondrial chromosome. Thus, there can be many thousands of copies of the mitochondrial genome in a single cell, while there is only ever one copy of the nuclear DNA genome. Because of its abundance, mitochondrial DNA is often used when studying ancient remains, where a large proportion of DNA is degraded and unusable. Unlike nuclear DNA, mitochondrial DNA is typically arranged in an unpaired, circular chromosome. Mitochondrial DNA is also much shorter than the nuclear DNA of animals, in the range of 13-18 thousand nucleotides.


Characterising DNA

There are several features that can be used to characterise DNA. For instance, the entire sequence of nucleotides can be examined. The process of determining this information is known as sequencing. The typical length of a genome, the complete set of information available from DNA, is in the order of billions of base-pairs. It is therefore usual to sequence only a small section of the genome. Even then, this process is time-consuming and expensive.
An alternative to sequencing is to examine the length of microsatellites at particular regions (loci) of the genome. Microsatellites consist of small groups of nucleotides, typically between two and four nucleotides long, which are repeated many times. For example, the dinucleotide AC might be repeated 5 times at a particular region in the DNA, (AC)₅ = ACACACACAC. Microsatellites in the order of 5 to 20 repeated units are sometimes referred to as Short Tandem Repeats (STRs). The term “microsatellite” is generally used to refer to regions of DNA with at least 50 repeating nucleotide groups.
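The repeat notation above can be made concrete with a short Python sketch. It builds the (AC)₅ example from the text and counts the longest run of a motif in a sequence; the function names build_repeat and longest_repeat_run and the flanking nucleotides are illustrative assumptions, and this is not the allele-identification procedure used in the labwork.

```python
import re


def build_repeat(motif: str, count: int) -> str:
    """Write out a microsatellite such as (AC)5 in full."""
    return motif * count


def longest_repeat_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of motif.

    Illustrative helper (hypothetical name): scans for uninterrupted runs
    of the motif and reports the longest one, or 0 if the motif is absent.
    """
    if not motif:
        raise ValueError("motif must not be empty")
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


allele = build_repeat("AC", 5)
print(allele)  # ACACACACAC

# Hypothetical flanking nucleotides around the repeat region.
print(longest_repeat_run("GGT" + allele + "TTGAC", "AC"))  # 5
```

Two individuals carrying different repeat counts at the same locus would give different run lengths here, which is the property that makes microsatellite length a usable genetic marker.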
Microsatellites are drawn from regions of the DNA which do not code for proteins, i.e. from regions which aren’t associated with genes. These regions are known as “non-coding DNA”, and also commonly referred to as “junk DNA”. Because these regions are not associated with a particular characteristic, there is no reason to favour individuals that inherit a particular variant over any other. That is, there is no selection pressure on the inheritance of microsatellites. Microsatellites are therefore inherited at random, which makes them useful for statistical analysis.

1 Introduction
1.1 Motivation and Goals of the Project
1.2 Species of Rattus in New Zealand
1.3 The Effect of Rats on New Zealand Ecosystems
1.4 The Importance of Understanding the Invasibility of Islands to Rats
2 Fieldwork Methods
2.1 Study Sites
2.2 Fieldwork Methods
3 Labwork and Data-cleaning
3.1 Introduction
3.2 DNA Extraction, PCR and Allele Identification
3.3 Data-cleaning and Binning
3.4 Genotyping Errors
4 Pilot Study: Waiheke Island, May 2004
4.1 Introduction
4.2 Trap-site Details
4.3 Results
4.4 Discussion of Results
5 Literature Review for Exploratory Methods of Investigating Genetic Differentiation
5.1 F-statistics
5.2 Genetic Distances
5.3 Trees
5.4 Summary
6 Graphical User Interface
6.1 Map Creation in R
6.2 Genetic Data Format for the GUI
6.3 Specifying Population Locations for the GUI
6.4 Features of the GUI
7 Survivors or Reinvaders? Introduction to Population Identification
7.1 Pearl Island
7.2 Analysis
7.3 Methods of Interpreting the Output
7.4 Missing Data
7.5 Application
7.6 Discussion
8 Multivariate Posterior Assignment Probabilities
8.1 Assignment Probability
8.2 Calculating the Probabilities
8.3 Displaying the Probability Profiles
8.4 Statistical Test of Separation
8.5 Discussion
9 The Power of Exploration
9.1 Invasion of the Birdie Snatchers: Ship Rats on Stewart Island, Rakiura
9.2 Something in the Water: Norway Rats in the Bay of Islands, Ipipiri o Tokerau
9.3 Pi-rats of the South Pacific: Ship Rats in the Bay of Islands, Ipipiri o Tokerau
9.4 Barriers to Progress: Ship Rats on Great Barrier Island, Aotea
10 Literature Review for Modelling Rates of Inter-population Migration
10.1 Using Mark-Recapture Methods to Estimate Migration
10.2 Estimating Migration Through Genetics
11 Adapting the Program BayesAss
11.1 Estimating Migration Through Genetics
11.2 Application to Real Data
11.3 Great Barrier Island
12 Metropolis Hastings Proposal Intervals
12.1 Construction of Acceptance Probabilities
12.3 Simulation
12.4 Discussion
13 Summary Statistics
13.1 Head-Body Length versus Tail Length
13.2 Colour Morphs of Ship Rats
13.3 Variation of Mass
13.4 Discussion
14 Multiple Paternity
14.1 Introduction
14.2 Methods
14.3 Results
14.4 Discussion
15 Conclusions
15.1 Advancement of Knowledge
15.2 Communication of Findings
15.3 Future Directions
Stochastic modelling of rat invasions among islands in the New Zealand archipelago