Church sfaf13
![Page 1: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/1.jpg)
Keep Calm And Carry on Sequencing
Deanna M. Church, Staff Scientist, NCBI
@deannachurch
![Page 2: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/2.jpg)
http://genomereference.org
Valerie Schneider, NCBI
![Page 3: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/3.jpg)
Photograph: Paul Popper/Popperfoto/Getty Images
![Page 4: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/4.jpg)
![Page 5: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/5.jpg)
GRCh38 is coming (September, 2013)
![Page 6: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/6.jpg)
![Page 7: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/7.jpg)
http://www.ncbi.nlm.nih.gov/variation/tools/1000genomes
![Page 8: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/8.jpg)
![Page 9: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/9.jpg)
![Page 10: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/10.jpg)
![Page 11: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/11.jpg)
http://www.bioplanet.com/gcat
![Page 12: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/12.jpg)
![Page 13: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/13.jpg)
http://genomereference.org
![Page 14: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/14.jpg)
Dennis et al., 2012
1q32 1q21 1p21
1p21 patch alignment to chromosome 1
![Page 15: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/15.jpg)
http://www.ncbi.nlm.nih.gov/variation/tools/1000genomes
CDC27
1KG Phase 1 Strict accessibility mask
SNP (all)
SNP (not 1KG)
![Page 16: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/16.jpg)
Sudmant et al., 2010
![Page 17: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/17.jpg)
Kidd et al., 2007
APOBEC cluster
Part of chr22 assembly
Alternate locus for chr22
White: Insertion
Black: Deletion
![Page 18: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/18.jpg)
http://www.ncbi.nlm.nih.gov/variation/tools/1000genomes
![Page 19: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/19.jpg)
Mouse Ren1 chr1 (CM000994.2/NC_000067.6): 133350674-133360320
129S6/SvEvTac tiling path
Alignment to C57BL/6J chr1
B6 Genes
129S6/SvEvTac Genes
+ 32Kb in 129S6/SvEvTac
![Page 20: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/20.jpg)
Mouse Ren1 chr1 (CM000994.2/NC_000067.6): 133350674-133360320
NM_031192.3: transcript from C57BL/6J
NM_031193.2: transcript from FVB/N
129S6/SvEvTac Alt Locus Alignment (allelic)
FVB/N Transcript Alignment (paralog)
![Page 21: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/21.jpg)
129S6/SvEvTac Ren1
FVB Ren2 Tx
Paralogous diff
SNP + Paralogous diff
Mouse Ren1 chr1 (CM000994.2/NC_000067.6): 133350674-133360320
NM_031192.3: transcript from C57BL/6J
NM_031193.2: transcript from FVB/N
![Page 22: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/22.jpg)
An assembly is a MODEL of the genome
![Page 23: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/23.jpg)
Assembly Model
![Page 24: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/24.jpg)
![Page 25: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/25.jpg)
BAC insert
BAC vector
Shotgun sequence
Assemble
GAPS
Finishing
![Page 26: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/26.jpg)
http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/issue_detail.cgi?id=HG-21
NCBI36 (hg18)
GRCh37 (hg19)
![Page 27: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/27.jpg)
NCBI35 (hg17)
GRCh37 (hg19)
AL139246.20
AL139246.21
![Page 28: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/28.jpg)
Daly et al., 2013
![Page 29: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/29.jpg)
http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/issue_detail.cgi?id=HG-1012
![Page 30: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/30.jpg)
http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/issue_detail.cgi?id=HG-1321
![Page 31: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/31.jpg)
Fixing Rare/Incorrect Bases
![Page 32: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/32.jpg)
Fixing Rare/Incorrect Bases
![Page 33: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/33.jpg)
GRCh37B Sites for Update: n=1164
Sites with unique successful ctg 1148 (98.6%)
Avg Length 448 bp
Min/Max Success Length 51/791 bp
Avg Coverage 80x
Read Source (all contigs)
High coverage 32%
Low coverage 57%
Exome 10%
Fixing Rare/Incorrect Bases
![Page 34: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/34.jpg)
Build sequence contigs based on contigs defined in TPF (Tiling Path File).
Check for orientation consistencies
Select switch points
Instantiate sequence for further analysis
Switch point
Representative chromosome sequence
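The steps on this slide can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the GRC pipeline: the `Component` record, its fields, and the `build_contig` helper are hypothetical names, and switch points are assumed to be given as 0-based offsets into each component after it has been oriented.

```python
from dataclasses import dataclass

# Hypothetical record for one tiling-path component (for example, a BAC clone).
@dataclass
class Component:
    accession: str
    sequence: str     # clone sequence as submitted
    orientation: str  # "+" or "-" relative to the chromosome
    start: int        # 0-based offset where this component's contribution begins
    end: int          # 0-based exclusive offset (switch point to the next component)

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def build_contig(path):
    """Orient each component, then join the slices between switch points."""
    pieces = []
    for comp in path:
        # Orientation consistency check: only "+" and "-" are meaningful here.
        if comp.orientation not in ("+", "-"):
            raise ValueError(f"bad orientation for {comp.accession}")
        oriented = (comp.sequence if comp.orientation == "+"
                    else reverse_complement(comp.sequence))
        if not 0 <= comp.start <= comp.end <= len(oriented):
            raise ValueError(f"switch points out of range for {comp.accession}")
        pieces.append(oriented[comp.start:comp.end])
    return "".join(pieces)

# Toy two-clone path; the sequences are invented for illustration only.
path = [
    Component("cloneA", "AAAACCCC", "+", 0, 6),
    Component("cloneB", "TTGGGG", "-", 2, 6),
]
contig = build_contig(path)
print(contig)
```

In this toy path the second clone is on the minus strand, so it is reverse-complemented before its slice is appended at the switch point.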
![Page 35: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/35.jpg)
RP11-34P13 64E8 RP4-669L17 RP5-857K21 RP11-206L10 RP11-54O7
Gaps
![Page 36: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/36.jpg)
NCBI36
![Page 37: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/37.jpg)
nsv832911 (nstd68) Submitted on NCBI35 (hg17)
![Page 38: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/38.jpg)
NCBI35 (hg17) Tiling Path
GRCh37 (hg19) Tiling Path
Gap Inserted
Moved approximately 2 Mb distal on chr15
NC_000015.8 (chr15)
NC_000015.9 (chr15)
Removed from assembly
Added to assembly
http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/issue_detail.cgi?id=HG-24
![Page 39: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/39.jpg)
Sequences from haplotype 1
Sequences from haplotype 2
Old Assembly model: compress into a consensus
New Assembly model: represent both haplotypes
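One way to picture the newer model is as a primary assembly plus alternate loci that are placed on chromosome coordinates. The sketch below is only an illustration of that idea: the `AltLocus` and `Assembly` classes, the accession `ALT_EXAMPLE_1`, and the coordinates are all hypothetical and do not come from any GRC release.

```python
from dataclasses import dataclass, field

@dataclass
class AltLocus:
    accession: str
    chrom: str
    start: int  # 1-based placement start on the primary chromosome
    end: int    # 1-based inclusive placement end

@dataclass
class Assembly:
    name: str
    alt_loci: list = field(default_factory=list)

    def alts_overlapping(self, chrom: str, start: int, end: int):
        """Return accessions of alternate loci placed over chrom:start-end."""
        return [a.accession for a in self.alt_loci
                if a.chrom == chrom and a.start <= end and start <= a.end]

# Hypothetical assembly with a single alternate locus (made-up coordinates).
asm = Assembly("example_assembly")
asm.alt_loci.append(AltLocus("ALT_EXAMPLE_1", "chr4", 1000, 5000))

hits = asm.alts_overlapping("chr4", 4500, 6000)
misses = asm.alts_overlapping("chr4", 5001, 6000)
print(hits, misses)
```

A consensus-only model has no equivalent of `alt_loci`: a second haplotype would have to be collapsed into the single chromosome sequence.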
![Page 40: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/40.jpg)
AC074378.4 AC079749.5
AC134921.2 AC147055.2
AC140484.1 AC019173.4
AC093720.2 AC021146.7
NCBI36 NC_000004.10 (chr4) Tiling Path
Xue Y et al., 2008
TMPRSS11E TMPRSS11E2
GRCh37 NC_000004.11 (chr4) Tiling Path
AC074378.4 AC079749.5
AC134921.1 AC147055.2
AC093720.2 AC021146.7
TMPRSS11E
GRCh37: NT_167250.1 (UGT2B17 alternate locus)
AC074378.4 AC140484.1
AC019173.4 AC226496.2
AC021146.7
TMPRSS11E2
nsv532126 (nstd37)
![Page 41: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/41.jpg)
Adding Novel Sequence
1000G ph1 decoy sequence, viewed by:
• GenBank alignment
• Percent Repeat Masker
• Repeat Masker type
• Sequence Source (HTG, HuRef, ALLPATHS)
![Page 42: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/42.jpg)
Adding Novel Sequence
![Page 43: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/43.jpg)
Adding Novel Sequence
![Page 44: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/44.jpg)
Genovese et al., 2013
![Page 45: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/45.jpg)
Adding Novel Sequence
Karen Hayden and Jim Kent
![Page 46: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/46.jpg)
Human Resolved for GRCh38
http://genomereference.org
![Page 47: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/47.jpg)
Examples
![Page 48: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/48.jpg)
Preview of GRCh38 (scheduled Fall 2013)
TEX28 TKTL1
LOC101060233 (opsin related)
LOC101060234 (TEX28 related)
GRCh37 (current reference assembly) chrX
![Page 49: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/49.jpg)
Hydin: chr16 (16q22.2)
Hydin2: chr1 (1q21.1)
Missing in NCBI35/NCBI36
Unlocalized in GRCh37
Finished in GRCh38
Alignment to Hydin2 Genomic, 300 Kb, 99.4% ID
Alignment to Hydin1 CHM1_1.0, >99.9% ID
Alignment to Hydin2 Genomic, 300 Kb, 99.4% ID
Alignment to Hydin1 CHM1_1.0, >99.9% ID
Doggett et al., 2006
![Page 50: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/50.jpg)
FAM23_MRC1 Region, chr10
Segmental Duplications
1KG accessibility Mask
Novel Patch 250 kb of artificial duplication
![Page 51: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/51.jpg)
Adding Novel Sequence
![Page 52: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/52.jpg)
Richa Agarwala
MHC Alternate locus
Alignment to chr6
![Page 53: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/53.jpg)
![Page 54: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/54.jpg)
Making the assembly accessible to existing tools: masking
Query set: 439,109,084 NA12878 HiSeq reads
![Page 55: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/55.jpg)
Masking effectively blocks alignments in regions with high identity
Simulated reads from GRCh37.p9
• Unpaired reads
• 101 bp
• 1x coverage
• Default wgsim parameters
Masking parameters
• Percent Id: 100%
• Step size: 5 bp
• Minimum length: 101 bp
• Center SNPs in unmasked regions
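The identity and minimum-length parameters above can be illustrated with a small sketch. It rests on several assumptions that the slide does not state: that masking means replacing bases with `N`, that it is applied to the alternate sequence, and that the alternate and primary sequences are already aligned without gaps. The step size and the centering of SNPs in unmasked regions are not modeled, and `mask_identical_runs` is a hypothetical helper name.

```python
def mask_identical_runs(alt: str, primary: str, min_length: int = 101) -> str:
    """Replace runs where alt matches primary exactly (>= min_length bp) with N."""
    if len(alt) != len(primary):
        raise ValueError("this sketch assumes a gapless, equal-length alignment")
    masked = list(alt)
    run_start = None
    # Iterate one position past the end so a trailing run is also closed.
    for i in range(len(alt) + 1):
        same = i < len(alt) and alt[i].upper() == primary[i].upper()
        if same:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_length:
                masked[run_start:i] = "N" * (i - run_start)
            run_start = None
    return "".join(masked)

# Toy sequences with a single mismatch at position 4 (0-based).
primary = "ACGTACGTAC"
alt = "ACGTTCGTAC"
result = mask_identical_runs(alt, primary, min_length=4)
print(result)
```

With the toy `min_length=4`, both identical runs are long enough to be masked and only the mismatching base survives; raising the threshold to 5 leaves the first four bases unmasked.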
![Page 56: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/56.jpg)
Masking improves alignments in regions with alternate loci or patches
![Page 57: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/57.jpg)
NA12878 reads whose best alignment was on an alt/patch in the masked assembly were evaluated for their alignment location when aligned to the primary assembly alone
Masking effectively reduces the increase in NA12878 reads that have alignments with MAPQ=0 that occurs when the full assembly is used as an alignment substrate
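The MAPQ=0 comparison could be tallied from SAM output along the following lines. This is a hedged sketch rather than the method used for the slides: `count_mapq_zero` is a hypothetical helper, the example records are invented, and the choice to skip unmapped, secondary, and supplementary records is an assumption. It relies only on the SAM column layout, where FLAG is the second field and MAPQ the fifth.

```python
def count_mapq_zero(sam_lines):
    """Return (mapped_reads, mapq_zero_reads) for an iterable of SAM text lines."""
    mapped = 0
    mapq_zero = 0
    for line in sam_lines:
        # Skip blank lines and header lines (which start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x4:    # read is unmapped
            continue
        if flag & 0x900:  # secondary (0x100) or supplementary (0x800) record
            continue
        mapped += 1
        if int(fields[4]) == 0:
            mapq_zero += 1
    return mapped, mapq_zero

# Invented records: one unique hit, one MAPQ=0 hit, one unmapped, one secondary.
sam = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t10\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tchr1\t20\t0\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r4\t256\tchr1\t30\t0\t4M\t*\t0\t0\tACGT\tIIII",
]
mapped, mapq_zero = count_mapq_zero(sam)
print(mapped, mapq_zero)
```

Running the same tally on alignments to the primary assembly alone, the full assembly, and the masked full assembly would give the three numbers being compared.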
![Page 58: Church sfaf13](https://reader037.vdocuments.us/reader037/viewer/2022103018/5589cd31d8b42a302e8b45c1/html5/thumbnails/58.jpg)
GRCh38 is coming (September, 2013)