![Page 1: Statistical Models and Numerical Methods in …csg.sph.umich.edu/abecasis/class/2003/Lecture01.pdfStatistical Models and Numerical Methods in Human Genetics How to find me… Gonçalo](https://reader030.vdocuments.us/reader030/viewer/2022040216/5f2f28ef3fe24c1a241da500/html5/thumbnails/1.jpg)
Biostatistics 666
Statistical Models and Numerical Methods in Human Genetics

How to find me…
- Gonçalo Abecasis
- Assistant Professor, Dept. of Biostatistics
- SPH II, Room M4132
- Phone: (734) 763-4901
- E-mail: [email protected]
- Office Hours: Monday 4:00pm - 5:00pm

Course Grading
- Written Exams 50%
  - In-class midterm: 20%
  - Final exam: 30%
- Problem Sets 20%
- Research Project or Review 30%
  - In-class presentation for 10% extra credit

Paper or Project
- Clearly written paper
  - No more than 2,000 words
- Original thinking
  - Critical evaluation of literature
  - Original data analysis
  - Interesting computer simulation
  - Investigate analytical model
- Extra credit for oral presentation to class

Paper or Project
- Review Paper
  - Critically evaluate a recently published research article
  - I will provide list of suggested articles
- Research Project
  - Analyse your own data
  - Carry out a simple simulation project

Class Schedule
- Wednesdays 3:00pm – 4:30pm
  - Room M4332
- Fridays 3:00pm – 4:30pm
  - Room M4332
  - Occasionally in Computing Lab, SPH II A
- Please bring your schedules this Friday

DNA – Information Store
- Encodes the information required for cells and organisms to produce new cells and organisms and to function.
- DNA variation is responsible for many individual differences, some of which are medically important.

DNA – A string of bases
- Each base is either a:
  - Purine
    - (A) – Adenine
    - (G) – Guanine
  - Pyrimidine
    - (C) – Cytosine
    - (T) – Thymine
- Has an orientation

DNA Double Helix
- Pair of DNA strands
  - Each strand is a sequence of A, C, T and G
- Complementary Strands
  - Facilitate replication
  - Bound by hydrogen bonds
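
A minimal sketch of what "complementary" means in practice. It assumes the standard Watson-Crick pairing (A with T, C with G), which the slide does not spell out, and the function name `reverse_complement` is only illustrative. Because each strand has an orientation, the partner strand is read in the opposite direction.

```python
# Watson-Crick pairing (assumed here; not listed on the slide): A-T and C-G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand: str) -> str:
    """Return the partner strand, written in its own reading direction."""
    # Reverse first (the strands run in opposite directions), then pair each base.
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


if __name__ == "__main__":
    print(reverse_complement("ATGCCG"))  # CGGCAT
```

Applying the function twice returns the original strand, which is one way to see why either strand can serve as a template during replication.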

Chromosomes
- Human DNA is protected by special proteins
  - DNA is coiled around histone proteins
  - Nucleosomes
  - Higher order structures
- Reference points in chromosomes
  - Telomeres
  - Centromeres

Human Genome
- Multiple chromosomes
  - Each one is a DNA double helix
- 22 autosomes
  - Present in 2 copies
  - One maternal, one paternal
- 1 pair of sex chromosomes
  - Females have two X chromosomes
  - Males have one X chromosome and one Y chromosome
- Total of ~3 × 10^9 bases

Inheritance
- Offspring inherit one copy of each chromosome from each parent
- Through meiosis, germ line cells produce haploid gametes
- These fuse to create a fertilized egg, and eventually a new human being

Meiosis
- DNA is replicated
- Chromosomes are paired
- DNA stretches are exchanged between chromosomes
- Successive cell divisions take place

Some Types of DNA Sequence
- Genes (<5% of all human DNA)
  - ~30,000-35,000 in humans
  - Exons, translated into protein
  - Introns, transcribed into RNA, but not protein
- Promoters
- Enhancers
- Repeat DNA
- Pseudogenes

Central Dogma
- Information in cells is stored in DNA.
- DNA can be transcribed into RNA.
- RNA can be translated into protein.
  - Proteins can catalyze chemical reactions.
  - Proteins receive and transmit signals.
  - Proteins constitute structural building blocks.

Genetic Code
- DNA → RNA → Protein
  - DNA: 4 bases (A,T,C,G)
  - RNA: 4 bases (A,U,C,G)
  - Proteins: 20 amino-acids
- Universal Genetic Code
  - Translation between DNA/RNA and protein
  - Three bases code for one amino-acid
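
The DNA → RNA → protein flow can be sketched in a few lines. This is an illustration, not part of the lecture: it assumes the standard genetic code written as a 64-letter string in T, C, A, G order, treats the input as the coding strand, and ignores splicing and reading-frame detection. The names `transcribe` and `translate` are hypothetical helpers.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, ordered TTT, TTC, TTA, TTG, TCT, ...
# "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """DNA coding strand -> mRNA: same sequence with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read the mRNA three bases at a time until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")  # the table is keyed by DNA letters
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    mrna = transcribe("ATGGCCTGA")
    print(mrna)             # AUGGCCUGA
    print(translate(mrna))  # MA
```

With 4 bases and 3 positions there are 64 codons for 20 amino acids plus stop signals, so several codons map to the same amino acid.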

Human Variation
- When two chromosomes are compared most of their sequence is identical
  - Consensus sequence
- About 1 per 1,000 bases differs between pairs of chromosomes in the population
  - In the same individual
  - In the same geographic location
  - Across the world
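
As a rough back-of-the-envelope check, the rate above can be combined with the genome size quoted on the Human Genome slide (about 3 billion bases). This assumes the rate applies uniformly along the genome, which is a simplification:

$$
3 \times 10^{9}\ \text{bases} \times \frac{1\ \text{difference}}{1{,}000\ \text{bases}} \approx 3 \times 10^{6}\ \text{differences}
$$

So two copies of the genome would be expected to differ at roughly three million positions.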

Types of Genetic Variation
- Sequence Polymorphisms
  - A single base or a few bases differ between individuals
- Length Polymorphisms
  - In some regions of DNA, the number of copies of a particular DNA repeat varies

Repeat Length Polymorphisms
- Variable Number Tandem Repeats
  - VNTRs
  - Typical repeat units of 10 – 100s bp
  - E.g.: ~110 bp repeat in IL1RN gene
- Microsatellites
  - Simple repeat sequences
  - Most popular are 2, 3 or 4 bp
  - E.g.: ACACACAC …
  - D naming scheme (e.g., D2S160)
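
To make the idea of a repeat length allele concrete, the sketch below counts back-to-back copies of a repeat unit in a sequence. The sequences and flanking bases are made up for illustration, and `longest_tandem_repeat` is a hypothetical helper; real genotyping measures fragment length in the lab rather than reading the sequence.

```python
import re


def longest_tandem_repeat(sequence: str, unit: str) -> int:
    """Return the largest number of consecutive copies of `unit` in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(match.group(0)) // len(unit) for match in pattern.finditer(sequence)]
    return max(runs, default=0)


if __name__ == "__main__":
    # Two made-up alleles of an AC microsatellite with identical flanking sequence.
    allele_1 = "GGT" + "AC" * 12 + "TTC"
    allele_2 = "GGT" + "AC" * 15 + "TTC"
    print(longest_tandem_repeat(allele_1, "AC"))  # 12
    print(longest_tandem_repeat(allele_2, "AC"))  # 15
```

The two alleles differ only in repeat count, so they differ in total length by 6 bp, which is the kind of difference that separates fragments on a gel.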

Example VNTR
- Picture of DNA on an agarose gel
- This is a repeat sequence near IL1 gene
- Small fragments move faster towards positive pole

[Figure: agarose gel image with the negative (–) and positive (+) poles marked]

Microsatellites
- Most popular markers for linkage analysis
  - Large number of alleles (10 is common)
  - Can distinguish and track individual chromosomes in families
- Relatively abundant
  - ~15,000 mapped loci
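
One way to see why many alleles help when tracking chromosomes is to compute the chance that a person carries two different alleles at a marker. The sketch below uses the usual expected heterozygosity, 1 minus the sum of squared allele frequencies. It assumes random mating and, purely for illustration, equally frequent alleles; neither assumption comes from the slide.

```python
def heterozygosity(allele_freqs):
    """Expected fraction of individuals carrying two different alleles."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


if __name__ == "__main__":
    microsatellite = [0.1] * 10   # 10 equally common alleles (illustrative)
    two_allele_marker = [0.5, 0.5]
    print(round(heterozygosity(microsatellite), 3))     # 0.9
    print(round(heterozygosity(two_allele_marker), 3))  # 0.5
```

Under these assumptions about 90% of individuals are heterozygous at the 10-allele marker, so the two chromosomes a parent carries can usually be told apart in the offspring.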

SNPs
- SNP is usually read as "snip"
- Single nucleotide polymorphisms
- Change one nucleotide
  - Replace
  - Insert
  - Delete
- Abundant, but traditionally hard to detect
- Typically have only two, or a few, alleles

Single Base Changes
- Transitions
  - A/G, C/T
  - Purine ↔ Purine
  - Pyrimidine ↔ Pyrimidine
- Transversions
  - Purine ↔ Pyrimidine
  - A/T, A/C, C/G, G/T
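
The classification above is mechanical enough to write down directly. A small sketch, with `classify_substitution` as an illustrative name:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Label a single base change as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or not {ref, alt} <= (PURINES | PYRIMIDINES):
        raise ValueError("expected two different bases from A, C, G, T")
    # Transition: both bases are in the same chemical class.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    for ref, alt in [("A", "G"), ("C", "T"), ("A", "T"), ("C", "G")]:
        print(f"{ref}/{alt}: {classify_substitution(ref, alt)}")
```

Of the six unordered base pairs, two are transitions (A/G, C/T) and four are transversions, matching the lists on the slide.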

A little more on SNPs
- Most SNPs have only two alleles
- Easy to automate their scoring
- Becoming extremely popular
- Typing Methods
  - Sequencing
  - Restriction Site
  - Hybridization

Phenotypes
- Can measure genetic variation indirectly
- E.g., Cystic Fibrosis
  - Patients must carry two mutations in CF gene
  - Parents of patients must carry one mutation
  - Normal individuals carry 0 or 1 mutations
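
The cystic fibrosis example can be turned into a small enumeration. The sketch assumes a single gene, a fully recessive disease, and that each parent passes on either copy with equal probability; the allele labels `N` and `m` and the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

NORMAL = "N"  # hypothetical label for a working copy of the CF gene
MUTANT = "m"  # hypothetical label for a disease-causing copy


def phenotype(n_mutations: int) -> str:
    """Recessive model from the slide: only two mutations cause disease."""
    return "affected" if n_mutations == 2 else "unaffected"


def offspring_distribution(mother, father):
    """Probability of 0, 1 or 2 mutations in a child, one allele from each parent."""
    counts = Counter(
        (maternal, paternal).count(MUTANT)
        for maternal, paternal in product(mother, father)
    )
    total = sum(counts.values())
    return {n: Fraction(c, total) for n, c in sorted(counts.items())}


if __name__ == "__main__":
    carrier = (NORMAL, MUTANT)
    for n, prob in offspring_distribution(carrier, carrier).items():
        print(f"{n} mutation(s): probability {prob}, {phenotype(n)}")
```

For two carrier parents this gives probabilities 1/4, 1/2 and 1/4 for 0, 1 and 2 mutations, so observing an affected child tells us indirectly that both parents carry a mutation.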

3 Stages of Genetic Mapping
- Are there genes influencing this trait?
  - Epidemiological studies
- Where are those genes?
  - Linkage analysis
- What are those genes?
  - Association analysis

Is a trait genetic?
- Examine distribution of trait in the population and among relatives
- E.g. Inflammatory Bowel Disease (Crohn’s)
  - General population
    - 1-3 cases per 1,000 individuals
  - Twins of affected individuals
    - 44% of monozygotic twins also have Crohn’s
    - 3.8% of dizygotic twins also have Crohn’s
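
A quick way to read these numbers is as a ratio of risk in relatives to risk in the general population. The sketch below does that arithmetic with the figures from the slide; the ratio itself and the helper name `relative_risk` are additions for illustration, not something the slide computes.

```python
def relative_risk(risk_in_relatives: float, population_prevalence: float) -> float:
    """Risk to relatives of affected individuals divided by population risk."""
    if population_prevalence <= 0:
        raise ValueError("prevalence must be positive")
    return risk_in_relatives / population_prevalence


if __name__ == "__main__":
    mz_risk, dz_risk = 0.44, 0.038   # twin figures from the slide
    for prevalence in (0.001, 0.003):  # 1 to 3 cases per 1,000
        mz = relative_risk(mz_risk, prevalence)
        dz = relative_risk(dz_risk, prevalence)
        print(f"prevalence {prevalence}: MZ ratio {mz:.0f}, DZ ratio {dz:.0f}")
```

Across the quoted prevalence range the monozygotic ratio is roughly 150 to 440 and the dizygotic ratio roughly 13 to 38. The much higher risk in twins who share all of their DNA, compared with twins who share about half, is what suggests a genetic contribution.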

Where are those genes?
- Find genetic markers that co-segregate with disease
- E.g. D16S3136 co-segregates with Crohn’s

What are those genes?
- Identify genetic variants that are associated with disease…
- E.g. Disruptive mutations in NOD2 are much more common in Crohn’s patients

| Variant | Crohn’s | Controls |
|---|---|---|
| Arg702Trp | 11% | 4% |
| Gly908Arg | 4% | 2% |
| Leu1007fs | 8% | 4% |
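
One common way to summarize a case-control comparison like this is an odds ratio. The sketch below applies it to the NOD2 frequencies from the slide. The slide does not say whether these are allele or carrier frequencies, and it does not report odds ratios, so treat the choice of summary and the name `odds_ratio` as assumptions.

```python
def odds_ratio(freq_cases: float, freq_controls: float) -> float:
    """Odds of carrying the variant in cases divided by the odds in controls."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# Frequencies copied from the slide: (Crohn's patients, controls).
NOD2_VARIANTS = {
    "Arg702Trp": (0.11, 0.04),
    "Gly908Arg": (0.04, 0.02),
    "Leu1007fs": (0.08, 0.04),
}

if __name__ == "__main__":
    for variant, (cases, controls) in NOD2_VARIANTS.items():
        print(f"{variant}: odds ratio {odds_ratio(cases, controls):.2f}")
```

All three variants come out with odds ratios between about 2 and 3. Whether such a difference is statistically convincing also depends on sample sizes, which the slide does not give.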

Checking Assumptions…


Take Home Reading!
- An introduction to important issues in genetics:
  - Lander and Schork (1994) Science 265:2037-48