Post on 20-Dec-2015
Genome
• The chromosomes are the volumes of an encyclopedia called the Genome
• The chromosomes contain the set of instructions for living beings
[Figure labels: Tissue, Cell, Nucleus]
Chromosome>human chromosome TACGTATACTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGACTGCTACGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGCGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTACGTACGTATCCTACGTACGATCGTGCAGCATCGATGCTACGTACGACGATCGATATTAATGCAATCATGCAGCTGCATGCTAGCGATGCTACGACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGATGCTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTACGTACGTATCCTACGTACGATCGTGCAGCATCGATGCTACGTACGACGATCGATATTAATGCAATCATGCCGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTACGTACGTATCCTACGTACGATCGTGCAGCATCGATGCTACGTACGACGATCGATATTAATGCAATCATGCAGCTGCATGCTAGCGATGCTACGACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGATGCTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTACGTACGTATCCTACGTACGATCGTGCAGCATCGATGCTACGTACGACGATCGATATTAATGCAATCATGCAGCTGCATGCTAGCGATGCTACGATCGATGCTATACGACGATCGTAGCTAGCTGCATGCTAGCGATGCTACGATCGATGCTATACGACGATCGTAGCTTACGACGTACGTTACGTACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTACGTACGTATCCTACGTACGATCGTCGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTACGTACGTATCCTACGTACGATCGTGCAGCATCGATGCTACGTACGACGATCGAT
ATTAATGCAATCATGCAGCTGCATGCTAGCGATGCTACGACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGATGCTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTACGTACGTATCCTACGTACGATCGTGCAGCATCGATGCTACGTACGACGATCGATATTAATGCAATCATGCAGCTGCATGCTAGCGATGCTACGATCGATGCTATACGACGATCGTAGCTGCAGCATCGATGCTACGTACGACGATCGATATTAATGCAATCATGCAGCTGCATGCTAGCGATGCTACGACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGATGCTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTACGTACGTATCCTACGTACGATCGTGCAGCATCGATGCTACGTACGACGATCGATATTAATGCAATCATGCAGCTGCATGCTAGCGATGCTACGATCGATGCTATACGACGATCGTAGCTGCTACGCATGCCTACGTACGTATCCTACGTACGATCGTGCAGCATCGATGCTACGTACGACGATCGATATTAATGCAATCATGCAGCTGCATGCTAGCGATGCTACGGTACGATCGTCGATCGTCAGCTCGATACGTTACGATCTACGATTACGATCATCTATACTATACTATACGATATATCTAGATATCGATCTA.ACTCCATTCTTTAAACCGTACTACACACACTACTGATCGACGATTACGACGACGAAAGGGCCATATCGGCTAACTACATCATAGACAACATCACGGATCGTCTAAGGCCGAGTTAGGTACGATTAACGTACGACTACCTATCGTATATACATCACGGATATAACCTATCTACTACGATTAACACGATCTATCGTACGGCATATGCATCGTATAGCATCGATTAGAATACGTATACGTACGATCGTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTACGTACGTATCCTACGTACGATCGTGCAGCATCGATGCTACGTTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTACGTACGTATCCTACGTACGATCGTGCAGCATCGATGCTAC
GTACGACGATCGATATTAATGCAATCATGCAGCTGCATGCTAGCGATGCTACGACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGATGCTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTACGTACGTATCCTACGTACGATCGTGCAGCATCGATGCTACGTACGACGATCGATATTAATGCAATCATGCAGCTGCATGCTAGCGATGCTACGATCGCGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTACGTACGTATCCTACGTACGATCGTGCAGCATCGATGCTACGTACGACGATCGATATTAATGCAATCATGCAGCTGCATGCTAGCGATGCTACGACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGATGCTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTACGTACGTATCCTACGTACGATCGTGCAGCATCGATGCTACGTACGACGATCGATATTAATGCAATCATGCAGCTGCATGCTAGCGATGCTACGATCGATGCTATACGACGATCGTAGCTATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTACGATCGTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTACTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTACGTACGTATCCTACGTACGATCGTGCAGCATCGATGCTACGTACGACGATCGATATTAATGCAATCATGCAGCTGCATGCTAGCGATGCTACGGTACGTATCCTACGTACGATCGTGCAGCATCGATGCTACGTACGACGATCGATATTAATGCAATCATGCAGCTGCATGCTAGCGATGCTACGTACGGTACACCG
CGCACGATCACACGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTACGTACGTATCCTACGTACGATCGTGCAGCATCGATGCTACGTACGACGATCGATATTAATGCAATCATGCAGCTGCATGCTAGCGATGCTACGCTGCTAGCTACGCATGCCTACGTACGTATCCTACGTACGATCGTGCAGCATCGATGCTACGTACGATGCATGCTAGCGATGCTACGACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGATGCTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTACGTACGTATCCTACGTACGATCGTGCAGCATCGATGCTACGTACGACGATCGATATTAATGCAATCATGCAGCTGCATGCTAGCGATGCTACGATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTACGATCGTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTACTGCATCGATGCTATACGACGATCGTAGCTACGTACGATCGTACGACGTACGTTACGTACGATCGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTACGTACGTATCCTACGTACGATCGTGCAGCATCGATGCTACGTACGACGATCGATATTAATGCAATCATGCAGCTGCATGCTAGCGATGCTGTCACGTAGCATGCTGACGTACGATCGATTCGATCGATCGTACGATCGTAGCTAGCTAGTCGTAGCGACGTAGGATTCACGTAGCGATGCGTAGCGTAGCATGCTGACGATGCATCGATCGATGCATCATGCTAGCGTAGCTAGCTAGCATGACTGATCGATTAACGGTACGTATCCTACGTACGATCGTGCAGCATCGATGCTACGTACGACGATCGATATTAATGCAATCATGCAGCTGCATGCTAGCGATGCTACGTACGGTACACCGCGCACGATCACACGATGCGACGATGCGACGATCGTACGACTGCTAGCTACGCATGCCTACGTACGTATCCTACGTACGATCGTGCAGCATCGATGCTACGTACGACGATCGATATTAATGCAATCATGCAGCTGCATGCTAGCGATGCTACGCTGCTAGCTACGCATGCCTACGTACGTATCCTACGTACGATCGTGCAGCGATCGATATTAATGCAATCATGCAGCTGCATGCTAGCGATGCTACGTACGTACGTATCCTACGTACGATCGTGCAGCATCGATGCTACGTACGACGATCGATATTAATGCAATCATGCAGCTGCATGCTAGCGATGCTACGACGATCGTACGACTGCTAGCTACGCATGCCTACGTACGTATCCTACGTACGATCGTGCAGCATCGATGCTACGTACGACGATCGATATTAATGCAATCATGCAGCTGCA
TGCTAGCGATGCTACGACGACGATCGATATTAATGCAATCATGCAGCTGCATGCTAGCGATGCTACGTACGATCGTATGCTAGCTAGCATGCATGCATGCATGCAT ………..
Information retrieval
• Bioinformatics: Sequence and Genome Analysis. David W. Mount
• Flexible Pattern Matching in Strings (2002). Gonzalo Navarro and Mathieu Raffinot
• Algorithms on Strings (2001). M. Crochemore, C. Hancart and T. Lecroq
• http://www-igm.univ-mlv.fr/~lecroq/string/index.html
String Matching
String matching: definition of the problem (text, pattern)
• Exact matching: depends on what we have, the text or the patterns
  • The patterns ---> Data structures for the patterns
    • 1 pattern ---> The algorithm depends on |p| and |Σ|
    • k patterns ---> The algorithm depends on k, |p| and |Σ|
    • Extensions
      • Regular expressions
  • The text ----> Data structure for the text (suffix tree, ...)
• Approximate matching:
  • Dynamic programming
  • Sequence alignment (pairwise and multiple)
  • Sequence assembly: hash algorithm
• Probabilistic search: Hidden Markov Models
Exact string matching: one pattern
For instance, given the sequence
CTACTACTACGTCTATACTGATCGTAGCTACTACATGC
search for the pattern ACTGA
and for the pattern TACTACGGTATGACTAA.
How do string matching algorithms perform the search?
Exact string matching: Brute force algorithm
Example: given the pattern ATGTA, the search is
G T A C T A G A G G A C G T A T G T A C T G ...
A T G T A
  A T G T A
    A T G T A
      A T G T A
        A T G T A
          A T G T A
Exact string matching: Brute force algorithm
• How is the comparison made? From left to right: prefix search.
[Figure: the pattern aligned under the text window]
• What is the next position of the window? The window is shifted by only one cell.
[Figure: the pattern moved one cell to the right under the text]
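The brute-force search can be sketched in Python as below. This is only an illustrative sketch: the function name `brute_force_search`, the comparison counter and the 0-based positions are assumptions of this example, not part of the slides.

```python
def brute_force_search(text, pattern):
    """Return (hits, comparisons) for a left-to-right brute-force search.

    hits holds the 0-based start position of every occurrence;
    comparisons counts the character comparisons that were made.
    """
    n, m = len(text), len(pattern)
    hits = []
    comparisons = 0
    # The window visits every start position: it moves one cell at a time.
    for pos in range(n - m + 1):
        j = 0
        # Prefix search: compare the pattern from left to right.
        while j < m:
            comparisons += 1
            if text[pos + j] != pattern[j]:
                break
            j += 1
        if j == m:
            hits.append(pos)
    return hits, comparisons


if __name__ == "__main__":
    text = "GTACTAGAGGACGTATGTACTG"
    hits, comparisons = brute_force_search(text, "ATGTA")
    print(hits, comparisons)
```

On the example text GTACTAGAGGACGTATGTACTG the pattern ATGTA is reported once, at 0-based position 14.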
Exact string matching: one pattern
How do the matching algorithms perform the search?
There is a sliding window along the text against which the pattern is compared:
[Figure: the pattern aligned under a window of the text]
At each step the comparison is made and the window is shifted to the right.
What distinguishes the algorithms?
1. How the comparison is made.
2. The length of the shift.
Exact string matching: one pattern (text on-line)
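Experimental efficiency (Navarro & Raffinot)
[Figure: map of the most efficient algorithm (Horspool, BNDM, BOM) as a function of the alphabet size |Σ| (axis from 2 to 64) and the pattern length (axis from 2 to 256, with w marked)]
BNDM: Backward Nondeterministic Dawg Matching
BOM: Backward Oracle Matching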
Horspool algorithm
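• How is the comparison made? Suffix search: the pattern is compared with the window from right to left.
[Figure: the pattern aligned under the text window]
• What is the next position of the window?
Let a be the text character under the last cell of the window. The window is shifted until the next occurrence of "a" in the pattern is aligned with it:
[Figure: the pattern shifted so that its next "a" lies under the text character a]
We need a preprocessing phase to construct the shift table.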
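The preprocessing phase can be sketched in Python as follows. The function name `horspool_shift_table` and the explicit `alphabet` argument are assumptions of this sketch. Characters that do not occur in the pattern (ignoring its last cell) get the full pattern length as their shift.

```python
def horspool_shift_table(pattern, alphabet):
    """Build the Horspool shift table for `pattern` over `alphabet`.

    For each character, the shift is the distance from its last
    occurrence (excluding the final cell of the pattern) to the end
    of the pattern, or the pattern length if it does not occur there.
    """
    m = len(pattern)
    table = {ch: m for ch in alphabet}
    # Later occurrences overwrite earlier ones, so the rightmost wins.
    for i, ch in enumerate(pattern[:-1]):
        table[ch] = m - 1 - i
    return table


if __name__ == "__main__":
    print(horspool_shift_table("ATGTA", "ACGT"))
```

For the pattern ATGTA over the alphabet A, C, G, T this yields the shifts A 4, C 5, G 2, T 1.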
Horspool algorithm: example
Given the pattern ATGTA
• The shift table is filled in one entry at a time:
A 4 (the last A before the final cell lies 4 cells from the end of the pattern)
C 5 (C does not occur in the pattern, so the window moves the full pattern length)
G 2
T 1
Horspool algorithm: example
Given the pattern ATGTA
• The shift table is:
A 4
C 5
G 2
T 1
• The searching phase:
G T A C T A G A G G A C G T A T G T A C T G ...
A T G T A
  A T G T A
          A T G T A
              A T G T A
                        A T G T A
                            A T G T A
Horspool algorithm: example
Given the pattern ATGTA
• The shift table is:
A 4
C 5
G 2
T 1
• The searching phase:
G T A C T A G A G G A C G T A T G T A C T G ...
A T G T A
  A T G T A
          A T G T A
              A T G T A
                        A T G T A
                            A T G T A
                                    A T G T A
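The searching phase can be sketched in Python as follows. The function name `horspool_search` and the extra list of visited window positions are assumptions added for illustration. The window is compared from right to left and then shifted according to the text character under its last cell.

```python
def horspool_search(text, pattern):
    """Return (hits, windows) for a Horspool search.

    hits: 0-based start positions of the occurrences.
    windows: 0-based start positions of every window that was examined.
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return [], []
    # Preprocessing: shift table built from all cells but the last one.
    shift = {}
    for i, ch in enumerate(pattern[:-1]):
        shift[ch] = m - 1 - i
    hits, windows = [], []
    pos = 0
    while pos <= n - m:
        windows.append(pos)
        # Suffix search: compare from the last cell backwards.
        j = m - 1
        while j >= 0 and text[pos + j] == pattern[j]:
            j -= 1
        if j < 0:
            hits.append(pos)
        # Shift by the entry of the character under the last window cell;
        # characters absent from the table shift by the pattern length.
        pos += shift.get(text[pos + m - 1], m)
    return hits, windows


if __name__ == "__main__":
    hits, windows = horspool_search("GTACTAGAGGACGTATGTACTG", "ATGTA")
    print(hits, windows)
```

On the 22 characters shown in the example, the windows start at 0-based positions 0, 1, 5, 7, 12 and 14, and the occurrence is found at position 14.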
Questions on the Horspool algorithm
Given the pattern ATGTA, the shift table is
A 4
C 5
G 2
T 1
Given a random text over an equally likely probability distribution (EPD):
1. Determine the expected shift of the window. What happens if the probability distribution is not equally likely?
2. Determine the expected number of shifts assuming a text of length n.
3. Determine the expected number of comparisons in the suffix search phase.
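As a hint for question 1, the expected shift can be computed as a weighted average of the table entries. This assumes that the shift depends only on the text character under the last window cell. The function name `expected_shift` and the non-uniform distribution below are hypothetical and serve only as an illustration.

```python
from fractions import Fraction


def expected_shift(shift_table, probabilities=None):
    """Expected window shift: sum over characters of P(ch) * shift(ch).

    If no distribution is given, every character is equally likely.
    """
    if probabilities is None:
        p = Fraction(1, len(shift_table))
        probabilities = {ch: p for ch in shift_table}
    return sum(probabilities[ch] * shift_table[ch] for ch in shift_table)


if __name__ == "__main__":
    table = {"A": 4, "C": 5, "G": 2, "T": 1}
    # Equally likely characters: (4 + 5 + 2 + 1) / 4
    print(expected_shift(table))
    # Hypothetical skewed distribution, chosen only for illustration.
    skewed = {"A": Fraction(1, 2), "C": Fraction(1, 6),
              "G": Fraction(1, 6), "T": Fraction(1, 6)}
    print(expected_shift(table, skewed))
```

Under the equally likely distribution the expected shift is (4 + 5 + 2 + 1) / 4 = 3 cells, which suggests roughly n / 3 shifts for a text of length n.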
BNDM algorithm
[Figure: the pattern aligned under the text window, with x the next text character to be read]
• How is the comparison made?
Search for suffixes of the text window T that are factors of the pattern P.
The set of factors found so far is denoted as a bit vector, for instance
D2 = ( 1 0 0 0 1 0 0 )
Once the next character x is read: D3 = (D2 << 1) & B(x)
B(x): mask of x in the pattern P. For instance, if B(x) = ( 0 0 1 1 0 0 0 ), then
D3 = ( 0 0 0 1 0 0 0 ) & ( 0 0 1 1 0 0 0 ) = ( 0 0 0 1 0 0 0 )
• What is the next position of the window?
It depends on the value of the leftmost bit of D.
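The update D3 = (D2 << 1) & B(x) can be tried out with Python integers used as bit vectors. This is a sketch under assumptions: the helper names `char_masks`, `step` and `bits` are hypothetical, and the leftmost displayed bit is taken to correspond to the first cell of the pattern, as in the masks B(x). Python integers are unbounded, so the shifted vector is explicitly truncated to m bits.

```python
def char_masks(pattern):
    """B(x) for every character x of the pattern.

    The leftmost of the m bits corresponds to the first pattern cell.
    """
    m = len(pattern)
    masks = {}
    for j, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << (m - 1 - j))
    return masks


def step(d, mask, m):
    """One update: shift D left by one cell, keep m bits, AND with B(x)."""
    return ((d << 1) & ((1 << m) - 1)) & mask


def bits(value, m):
    """Render an m-bit vector as a string of 0s and 1s."""
    return format(value, "0{}b".format(m))


if __name__ == "__main__":
    # The 7-bit example: D2 = 1000100 and B(x) = 0011000.
    d3 = step(0b1000100, 0b0011000, 7)
    print(bits(d3, 7))
    # Masks of the pattern ATGTA.
    for ch, mask in sorted(char_masks("ATGTA").items()):
        print(ch, bits(mask, 5))
```

With D2 = 1000100 and B(x) = 0011000 the step returns 0001000, in agreement with the example above.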
BNDM algorithm: example
Given the pattern ATGTA
• The mask of characters is:
B(A) = ( 1 0 0 0 1 )
B(C) = ( 0 0 0 0 0 )
B(G) = ( 0 0 1 0 0 )
B(T) = ( 0 1 0 1 0 )
• The searching phase:
G T A C T A G A G G A C G T A T G T A C T G ...
A T G T A
          A T G T A
                    A T G T A
                            A T G T A
First window:
D1 = ( 0 1 0 1 0 )
D2 = ( 1 0 1 0 0 ) & ( 0 0 0 0 0 ) = ( 0 0 0 0 0 )
Second window:
D1 = ( 0 0 1 0 0 )
D2 = ( 0 1 0 0 0 ) & ( 0 0 1 0 0 ) = ( 0 0 0 0 0 )
Third window:
D1 = ( 1 0 0 0 1 )
D2 = ( 0 0 0 1 0 ) & ( 0 1 0 1 0 ) = ( 0 0 0 1 0 )
D3 = ( 0 0 1 0 0 ) & ( 0 0 1 0 0 ) = ( 0 0 1 0 0 )
D4 = ( 0 1 0 0 0 ) & ( 0 0 0 0 0 ) = ( 0 0 0 0 0 )
BNDM algorithm: example
• Given the pattern ATGTA
• The mask of characters is:
B(A) = ( 1 0 0 0 1 )
B(C) = ( 0 0 0 0 0 )
B(G) = ( 0 0 1 0 0 )
B(T) = ( 0 1 0 1 0 )
• The searching phase:
G T A C T A G A G G A C G T A T G T A C T G ...
                            A T G T A
D1 = ( 1 0 0 0 1 )
D2 = ( 0 0 0 1 0 ) & ( 0 1 0 1 0 ) = ( 0 0 0 1 0 )
D3 = ( 0 0 1 0 0 ) & ( 0 0 1 0 0 ) = ( 0 0 1 0 0 )
D4 = ( 0 1 0 0 0 ) & ( 0 1 0 1 0 ) = ( 0 1 0 0 0 )
D5 = ( 1 0 0 0 0 ) & ( 1 0 0 0 1 ) = ( 1 0 0 0 0 )
D6 = ( 0 0 0 0 0 ) & ( * * * * * ) = ( 0 0 0 0 0 )
Found!
BNDM algorithm: example
Given the pattern ATGTA
• The mask of characters is:
B(A) = ( 1 0 0 0 1 )
B(C) = ( 0 0 0 0 0 )
B(G) = ( 0 0 1 0 0 )
B(T) = ( 0 1 0 1 0 )
• The searching phase:
G T A C T A G A A T A C G T A T G T A C T G ...
A T G T A
          A T G T A
                A T G T A
First window:
D1 = ( 0 1 0 1 0 )
D2 = ( 1 0 1 0 0 ) & ( 0 0 0 0 0 ) = ( 0 0 0 0 0 )
Second window:
D1 = ( 0 1 0 1 0 )
D2 = ( 1 0 1 0 0 ) & ( 1 0 0 0 1 ) = ( 1 0 0 0 0 )
D3 = ( 0 0 0 0 0 ) & ( 1 0 0 0 1 ) = ( 0 0 0 0 0 )
How is the shift determined?
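A complete BNDM search can be sketched in Python as below, following the usual textbook formulation of the algorithm. The function name `bndm_search` and the variable `last` are assumptions of this sketch. Whenever the leftmost bit of D is set after reading a character, the suffix read so far is a prefix of the pattern, and `last` records how far the window can safely move. In this formulation that value is the shift.

```python
def bndm_search(text, pattern):
    """Return the 0-based start positions of `pattern` in `text` (BNDM)."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    full = (1 << m) - 1    # m ones: keeps D within m bits
    high = 1 << (m - 1)    # the leftmost bit of D
    # Character masks: the leftmost bit stands for the first pattern cell.
    masks = {}
    for j, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << (m - 1 - j))
    hits = []
    pos = 0
    while pos <= n - m:
        j = m          # number of window cells not yet read
        last = m       # shift to apply if no prefix is recognised
        d = full
        while d != 0:
            # Read the window from right to left.
            d &= masks.get(text[pos + j - 1], 0)
            j -= 1
            if d & high:
                if j > 0:
                    # A prefix of the pattern ends the window: remember it.
                    last = j
                else:
                    # The whole window was read: occurrence found.
                    hits.append(pos)
            d = (d << 1) & full
        pos += last
    return hits


if __name__ == "__main__":
    print(bndm_search("GTACTAGAGGACGTATGTACTG", "ATGTA"))
    print(bndm_search("GTACTAGAATACGTATGTACTG", "ATGTA"))
```

In the last example, the second window reads T and then A, which sets the leftmost bit of D with 3 cells still unread, so the window moves 3 cells to the right.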