Finding new nirK genes in metagenomic data. What is nirK? One kind of nitrite reductase
TRANSCRIPT
![Page 1: Finding new nirK genes in metagenomic data. What is nirK? -one kind of nitrite reductase](https://reader036.vdocuments.us/reader036/viewer/2022062309/5697bfc81a28abf838ca8761/html5/thumbnails/1.jpg)
Finding new nirK genes in metagenomic data
![Page 2: Finding new nirK genes in metagenomic data. What is nirK? -one kind of nitrite reductase](https://reader036.vdocuments.us/reader036/viewer/2022062309/5697bfc81a28abf838ca8761/html5/thumbnails/2.jpg)
What is nirK? The gene for one kind of nitrite reductase (the copper-containing type)
![Page 3: Finding new nirK genes in metagenomic data. What is nirK? -one kind of nitrite reductase](https://reader036.vdocuments.us/reader036/viewer/2022062309/5697bfc81a28abf838ca8761/html5/thumbnails/3.jpg)
Nitrogen Cycling
![Page 4: Finding new nirK genes in metagenomic data. What is nirK? -one kind of nitrite reductase](https://reader036.vdocuments.us/reader036/viewer/2022062309/5697bfc81a28abf838ca8761/html5/thumbnails/4.jpg)
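[Figure: nitrogen oxidation states +5, +3, +2, +1, 0, which match the denitrification series nitrate (NO3-), nitrite (NO2-), nitric oxide (NO), nitrous oxide (N2O), dinitrogen (N2)]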
![Page 5: Finding new nirK genes in metagenomic data. What is nirK? -one kind of nitrite reductase](https://reader036.vdocuments.us/reader036/viewer/2022062309/5697bfc81a28abf838ca8761/html5/thumbnails/5.jpg)
Metagenomic Datasets
• 2 samples from agricultural soil, 2 sequencing runs per sample (Roche 454 pyrosequencing)
• 2 samples from forest soil, 2 sequencing runs per sample (Roche 454 pyrosequencing)
• Data are from the Tom Schmidt lab
![Page 6: Finding new nirK genes in metagenomic data. What is nirK? -one kind of nitrite reductase](https://reader036.vdocuments.us/reader036/viewer/2022062309/5697bfc81a28abf838ca8761/html5/thumbnails/6.jpg)
Methods
• Start with sequence similarity search software: HMMER
• HMMER: an implementation of profile hidden Markov models (profile HMMs) for biological sequence analysis
• Profile HMMs are built from a multiple sequence alignment of known members of a given protein family, produced by an alignment tool
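To make the idea of a profile concrete, here is a minimal sketch that turns a toy alignment into per-column log-odds scores. It is a deliberate simplification and not HMMER's algorithm: it has no insert or delete states, assumes a uniform background frequency, and uses a flat pseudocount. The function names and the toy sequences are hypothetical, not real nirK data.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(alignment, pseudocount=1.0):
    """Return one {residue: log-odds score} dict per alignment column."""
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    profile = []
    for column in zip(*alignment):
        # Gap characters and unknown symbols are ignored when counting.
        counts = Counter(res for res in column if res in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def score(profile, sequence):
    """Sum the column scores of an ungapped sequence of profile length."""
    if len(sequence) != len(profile):
        raise ValueError("sequence length must equal the number of columns")
    return sum(column[res] for column, res in zip(profile, sequence))


# Hypothetical toy alignment (4 sequences, 4 columns).
toy_alignment = ["HCHV", "HCHI", "HCNV", "H-HV"]
profile = build_profile(toy_alignment)
family_like = score(profile, "HCHV")
unrelated = score(profile, "WWWW")
print(f"family-like: {family_like:.2f}  unrelated: {unrelated:.2f}")
```

A sequence that matches the conserved columns scores higher than one that does not. A real profile HMM adds position-specific gap handling on top of this kind of column model.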
![Page 7: Finding new nirK genes in metagenomic data. What is nirK? -one kind of nitrite reductase](https://reader036.vdocuments.us/reader036/viewer/2022062309/5697bfc81a28abf838ca8761/html5/thumbnails/7.jpg)
Advantage over BLAST
• HMMs have a formal probabilistic basis: probability theory guides how all the scoring parameters are set
• HMMs have a consistent theory behind gap and insertion scores
• But HMMER is much slower than BLAST
![Page 8: Finding new nirK genes in metagenomic data. What is nirK? -one kind of nitrite reductase](https://reader036.vdocuments.us/reader036/viewer/2022062309/5697bfc81a28abf838ca8761/html5/thumbnails/8.jpg)
HMMER components
• HMMER has components:
• to build a profile HMM: hmmbuild
• to search a profile against a sequence database: hmmsearch
• to align sequences according to an existing profile: hmmalign
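The components above can be sketched as command lines. The snippet below only assembles and prints the commands, it does not run them. It assumes the positional argument order used by the HMMER command-line tools (HMM file first) and uses no option flags. All file names are hypothetical. `hmmcalibrate` is included because it is a separate calibration step in HMMER 2; to my knowledge it no longer exists in HMMER 3, so drop it there.

```python
import shlex


def hmmer_commands(alignment, hmm, database, reads):
    """Return the HMMER command lines as argument lists (nothing is executed)."""
    return {
        "build": ["hmmbuild", hmm, alignment],      # alignment -> profile HMM
        "calibrate": ["hmmcalibrate", hmm],         # HMMER 2 only (assumption)
        "search": ["hmmsearch", hmm, database],     # profile vs. sequence database
        "align": ["hmmalign", hmm, reads],          # align sequences to the profile
    }


# Hypothetical file names, for illustration only.
commands = hmmer_commands(
    alignment="nirk_seeds.aln",
    hmm="nirk.hmm",
    database="soil_reads.fasta",
    reads="candidate_hits.fasta",
)
for step, args in commands.items():
    print(step + ":", " ".join(shlex.quote(arg) for arg in args))
```

Keeping each command as an argument list means it can later be passed to `subprocess.run` unchanged once the tools are installed.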
![Page 9: Finding new nirK genes in metagenomic data. What is nirK? -one kind of nitrite reductase](https://reader036.vdocuments.us/reader036/viewer/2022062309/5697bfc81a28abf838ca8761/html5/thumbnails/9.jpg)
[Workflow diagram]
• 6 good known nirKs (downloaded from the FunGene pipeline)
• Multiple alignment format (clustalw)
• hmmbuild, then hmmcalibrate: profile HMM
• hmmsearch against soil data: potential nirKs
• BLAST nirK against soil data: BLAST nirK result
• Compare the two results
![Page 10: Finding new nirK genes in metagenomic data. What is nirK? -one kind of nitrite reductase](https://reader036.vdocuments.us/reader036/viewer/2022062309/5697bfc81a28abf838ca8761/html5/thumbnails/10.jpg)
BLAST and HMMER results
• input files: /u/gjr/nirk2/ma1w2_run1_dereplicated_blastp.txt <==========> /u/gjr/nirk2/ma1w2_run1_dereplicated_localhmm.txt
• blastOnly: 23, shared: 6, hmmOnly: 2
• input files: /u/gjr/nirk2/ma1w2_run2_dereplicated_blastp.txt <==========> /u/gjr/nirk2/ma1w2_run2_dereplicated_localhmm.txt
• blastOnly: 28, shared: 8, hmmOnly: 4
• input files: /u/gjr/nirk2/ma1w4_run1_dereplicated_blastp.txt <==========> /u/gjr/nirk2/ma1w4_run1_dereplicated_localhmm.txt
• blastOnly: 24, shared: 8, hmmOnly: 5
• input files: /u/gjr/nirk2/ma1w4_run2_dereplicated_blastp.txt <==========> /u/gjr/nirk2/ma1w4_run2_dereplicated_localhmm.txt
• blastOnly: 34, shared: 16, hmmOnly: 5
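Counts like blastOnly, shared, and hmmOnly can be produced with plain set operations on the read identifiers each tool reports. This is a sketch of one way to do the comparison, not the script that produced the numbers above. The helper names are hypothetical, and `read_ids` assumes that each non-comment line of a hit file starts with the read identifier, which may not match the actual file layout.

```python
def read_ids(path):
    """Collect read IDs from a hit file (assumed: ID is the first field)."""
    with open(path) as handle:
        return {
            line.split()[0]
            for line in handle
            if line.strip() and not line.startswith("#")
        }


def compare_hits(blast_ids, hmm_ids):
    """Count reads found only by BLAST, by both tools, and only by HMMER."""
    blast_ids, hmm_ids = set(blast_ids), set(hmm_ids)
    return {
        "blastOnly": len(blast_ids - hmm_ids),
        "shared": len(blast_ids & hmm_ids),
        "hmmOnly": len(hmm_ids - blast_ids),
    }


# Hypothetical read IDs, for illustration only.
example = compare_hits(["read1", "read2", "read3"], ["read3", "read4"])
print(example)
```

Using sets also collapses any identifier that appears more than once in a hit list, so each read is counted a single time.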
![Page 11: Finding new nirK genes in metagenomic data. What is nirK? -one kind of nitrite reductase](https://reader036.vdocuments.us/reader036/viewer/2022062309/5697bfc81a28abf838ca8761/html5/thumbnails/11.jpg)
Profile matters!
• Run hmmsearch with the 6-seed profile HMM against all 3055 FunGene nirKs (some may not be real nirKs…)
• See the E-value distribution
![Page 12: Finding new nirK genes in metagenomic data. What is nirK? -one kind of nitrite reductase](https://reader036.vdocuments.us/reader036/viewer/2022062309/5697bfc81a28abf838ca8761/html5/thumbnails/12.jpg)
6-seed profile E-value distribution
Make the 124 sequences on the left into a profile
![Page 13: Finding new nirK genes in metagenomic data. What is nirK? -one kind of nitrite reductase](https://reader036.vdocuments.us/reader036/viewer/2022062309/5697bfc81a28abf838ca8761/html5/thumbnails/13.jpg)
124-sequence profile E-value distribution
![Page 14: Finding new nirK genes in metagenomic data. What is nirK? -one kind of nitrite reductase](https://reader036.vdocuments.us/reader036/viewer/2022062309/5697bfc81a28abf838ca8761/html5/thumbnails/14.jpg)
Cumulative curve
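A cumulative curve of this kind can be computed by counting, for each threshold, how many hits have an E-value at or below it. The sketch below uses powers of ten as thresholds. The function name and the toy E-values are hypothetical and are not taken from the FunGene search.

```python
def cumulative_counts(evalues, exponents):
    """For each exponent k, count the hits with E-value <= 10**k."""
    return [
        (k, sum(1 for e in evalues if e <= 10.0 ** k))
        for k in exponents
    ]


# Hypothetical E-values, for illustration only.
toy_evalues = [1e-130, 3e-95, 2e-40, 5e-12, 1e-3, 0.5, 8.0]
curve = cumulative_counts(toy_evalues, range(-120, 1, 20))
for exponent, count in curve:
    print(f"E <= 1e{exponent}: {count} hits")
```

A sharp jump in the curve at some threshold suggests a natural E-value cutoff. A smooth rise suggests that no single cutoff separates the family cleanly.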
![Page 15: Finding new nirK genes in metagenomic data. What is nirK? -one kind of nitrite reductase](https://reader036.vdocuments.us/reader036/viewer/2022062309/5697bfc81a28abf838ca8761/html5/thumbnails/15.jpg)
124-sequence profile: HMMER and BLAST results
• input files: /u/gjr/nirk3/ma1w2_run1_dereplicated.blastp.txt <==========> /u/gjr/nirk3/ma1w2_run1_dereplicated.localhmm.txt
• blastOnly: 112, shared: 7, hmmOnly: 0
• input files: /u/gjr/nirk3/ma1w2_run2_dereplicated.blastp.txt <==========> /u/gjr/nirk3/ma1w2_run2_dereplicated.localhmm.txt
• blastOnly: 129, shared: 8, hmmOnly: 0
• input files: /u/gjr/nirk3/ma1w4_run1_dereplicated.blastp.txt <==========> /u/gjr/nirk3/ma1w4_run1_dereplicated.localhmm.txt
• blastOnly: 109, shared: 10, hmmOnly: 0
• input files: /u/gjr/nirk3/ma1w4_run2_dereplicated.blastp.txt <==========> /u/gjr/nirk3/ma1w4_run2_dereplicated.localhmm.txt
• blastOnly: 120, shared: 18, hmmOnly: 0
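One way to read the 124-sequence counts is as the fraction of BLAST hits that HMMER also reports. The sketch below does that arithmetic on the four runs listed above. The dictionary keys and the function name are hypothetical labels. A low fraction does not by itself show that the BLAST-only hits are false positives; it only quantifies the disagreement.

```python
# (blastOnly, shared, hmmOnly) per run, copied from the counts above.
results_124 = {
    "ma1w2_run1": (112, 7, 0),
    "ma1w2_run2": (129, 8, 0),
    "ma1w4_run1": (109, 10, 0),
    "ma1w4_run2": (120, 18, 0),
}


def blast_confirmed_fraction(blast_only, shared, hmm_only):
    """Fraction of all BLAST hits that HMMER also found."""
    return shared / (blast_only + shared)


fractions = {
    run: blast_confirmed_fraction(*counts)
    for run, counts in results_124.items()
}
for run, fraction in fractions.items():
    print(f"{run}: {fraction:.1%} of BLAST hits shared with HMMER")
```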
![Page 16: Finding new nirK genes in metagenomic data. What is nirK? -one kind of nitrite reductase](https://reader036.vdocuments.us/reader036/viewer/2022062309/5697bfc81a28abf838ca8761/html5/thumbnails/16.jpg)
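Then tree method
[Figure: two schematic trees, each with leaves nirK1 and nirK2 plus one candidate sequence: Seq1 (good) in the first tree, Seq2 (bad) in the second]
Just to show an idea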
![Page 17: Finding new nirK genes in metagenomic data. What is nirK? -one kind of nitrite reductase](https://reader036.vdocuments.us/reader036/viewer/2022062309/5697bfc81a28abf838ca8761/html5/thumbnails/17.jpg)
[Workflow diagram]
• Inputs: NCBI nirK (cultured), soil BLAST result, soil HMMER result
• hmmalign with the 6-sequence profile
• quicktree
• Tree
![Page 18: Finding new nirK genes in metagenomic data. What is nirK? -one kind of nitrite reductase](https://reader036.vdocuments.us/reader036/viewer/2022062309/5697bfc81a28abf838ca8761/html5/thumbnails/18.jpg)
![Page 19: Finding new nirK genes in metagenomic data. What is nirK? -one kind of nitrite reductase](https://reader036.vdocuments.us/reader036/viewer/2022062309/5697bfc81a28abf838ca8761/html5/thumbnails/19.jpg)
Questions to answer
• Best definition of nirK according to the current information
• Criteria for choosing seeds for the profile HMM
• BLAST false positive problem
![Page 20: Finding new nirK genes in metagenomic data. What is nirK? -one kind of nitrite reductase](https://reader036.vdocuments.us/reader036/viewer/2022062309/5697bfc81a28abf838ca8761/html5/thumbnails/20.jpg)
Thanks