String Matching: Chapter 32 Highlights

Charles Tappert, Seidenberg School of CSIS, Pace University

Page 1: String Matching Chapter 32 Highlights

String Matching

Chapter 32 Highlights

Charles Tappert

Seidenberg School of CSIS, Pace University

Page 2: String Matching Chapter 32 Highlights

String Matching Problem

in this chapter

Problem: Find all valid shifts s with which a given pattern P occurs in a given text T

This problem occurs in text editing, DNA sequence searches, and Internet search engines

Example:

Page 3: String Matching Chapter 32 Highlights

String Matching Algorithms

Preprocessing & Matching Times

Page 4: String Matching Chapter 32 Highlights

Notation and Terminology

Σ* (sigma-star) = set of all finite-length strings over alphabet Σ (ε, epsilon, is the empty string)

String w is a prefix of string x, denoted w ⊏ x, if x = wy for some string y

String w is a suffix of string x, denoted w ⊐ x, if x = yw for some string y

Example: ab ⊏ abcca and cca ⊐ abcca
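
The prefix and suffix relations can be checked mechanically. The following is a minimal Python sketch, assuming the built-in str.startswith and str.endswith are acceptable stand-ins for the two relations; the helper names is_prefix and is_suffix are illustrative and do not come from the slides.

```python
def is_prefix(w: str, x: str) -> bool:
    """Return True if x = w + y for some string y (w is a prefix of x)."""
    return x.startswith(w)


def is_suffix(w: str, x: str) -> bool:
    """Return True if x = y + w for some string y (w is a suffix of x)."""
    return x.endswith(w)


# The slide's example: "ab" is a prefix of "abcca" and "cca" is a suffix of it.
print(is_prefix("ab", "abcca"), is_suffix("cca", "abcca"))
```

Note that the empty string is both a prefix and a suffix of every string, which both helpers report correctly.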

Page 5: String Matching Chapter 32 Highlights

Problem Re-statement

in notation/terminology

Denote the k-character prefix P[1..k] of pattern P by P_k

Similarly, denote the k-character prefix of text T by T_k

Matching problem: Given n = T.length and m = P.length, find all shifts s in the range 0 ≤ s ≤ n − m such that P ⊐ T_{s+m} (that is, P is a suffix of T_{s+m})
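
The restated problem translates almost word for word into code. The sketch below is one possible reading, assuming 0-based Python strings so that the prefix of length s + m is text[:s + m]; the function name valid_shifts and the sample text and pattern are illustrative choices, not taken from the slides.

```python
def valid_shifts(text: str, pattern: str) -> list[int]:
    """Return every shift s with 0 <= s <= n - m such that
    pattern is a suffix of the (s + m)-character prefix of text."""
    n, m = len(text), len(pattern)
    shifts = []
    for s in range(n - m + 1):
        prefix = text[: s + m]          # the first s + m characters of the text
        if prefix.endswith(pattern):    # the pattern is a suffix of that prefix
            shifts.append(s)
    return shifts


# Illustrative input: the pattern occurs once, at shift 3.
print(valid_shifts("abcabaabcabac", "abaa"))  # -> [3]
```

This form is meant to mirror the definition rather than to be fast; it does the same work as the naive matcher.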

Page 6: String Matching Chapter 32 Highlights

Naïve String Match Algorithm

sliding “template” pattern match
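
The sliding-template idea can be sketched as follows. This is a minimal Python version under the assumption of 0-based shifts; the function name naive_string_matcher and the sample strings are illustrative, and the slide's own pseudocode may differ in detail.

```python
def naive_string_matcher(text: str, pattern: str) -> list[int]:
    """Try every shift and compare the pattern against the text
    character by character, stopping at the first mismatch."""
    n, m = len(text), len(pattern)
    shifts = []
    for s in range(n - m + 1):            # slide the template over every candidate shift
        j = 0
        while j < m and text[s + j] == pattern[j]:
            j += 1                         # extend the match one character at a time
        if j == m:                         # all m characters matched
            shifts.append(s)
    return shifts


print(naive_string_matcher("acaabc", "aab"))  # -> [2]
```

There is no preprocessing, and the worst-case matching time is O((n − m + 1)m), since each of the n − m + 1 shifts can cost up to m character comparisons.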

Page 7: String Matching Chapter 32 Highlights

Naïve String Match Algorithm

sliding “template” pattern match

Problem 1-1: How many template comparisons are made? How many were matches and how many non-matches? How many computation units are used?

Problem 1-2: How many computation units are used?

Page 8: String Matching Chapter 32 Highlights

Finite Automata Algorithm

Efficient: examines each text character exactly once

Page 9: String Matching Chapter 32 Highlights

Finite Automata Algorithm

Example: simple two-state finite automaton:

Transition function (delta)

State transition diagram
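
A two-state automaton can be simulated in a few lines. The sketch below assumes a particular machine, which is not necessarily the one on the slide: states 0 and 1, start state 0, accepting state 1, alphabet {a, b}, and a hypothetical transition table chosen so that the machine accepts exactly the strings ending in an odd number of a's. The names DELTA, START, ACCEPTING and accepts are illustrative.

```python
# Assumed transition table: (state, input character) -> next state.
DELTA = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 0, (1, "b"): 0,
}
START = 0
ACCEPTING = {1}


def accepts(w: str) -> bool:
    """Run the automaton over w and report whether it ends in an accepting state."""
    state = START
    for ch in w:
        state = DELTA[(state, ch)]   # follow one edge of the transition diagram
    return state in ACCEPTING


print(accepts("abaaa"), accepts("abbaa"))  # -> True False
```

The dictionary plays the role of the transition table, and each lookup corresponds to following one arrow in the state transition diagram.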

Page 10: String Matching Chapter 32 Highlights

Finite Automata Algorithm

Final-state function (phi)
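
The final-state function is usually defined recursively. The block below gives the standard textbook form, written here with q_0 for the start state and delta for the transition function; the notation is assumed rather than copied from the slide.

```latex
\phi(\varepsilon) = q_0, \qquad
\phi(wa) = \delta(\phi(w), a) \quad \text{for } w \in \Sigma^*,\ a \in \Sigma
```

In words, phi(w) is the state the automaton is in after reading all of w, and the automaton accepts w exactly when phi(w) is an accepting state.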

Page 11: String Matching Chapter 32 Highlights

Finite Automata Algorithm

Construct the automaton

Suffix function (small sigma)
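
The suffix function maps a string x to the length of the longest prefix of the pattern P that is also a suffix of x. A minimal Python sketch follows, assuming a direct search from the longest candidate downward; the name suffix_function and the sample pattern "ab" are illustrative.

```python
def suffix_function(pattern: str, x: str) -> int:
    """Return the largest k such that the k-character prefix of pattern
    is a suffix of x."""
    for k in range(min(len(pattern), len(x)), -1, -1):
        if x.endswith(pattern[:k]):
            return k
    return 0  # not reached: k = 0 (the empty prefix) always matches


# With pattern "ab": sigma("") = 0, sigma("ccaca") = 1, sigma("ccab") = 2.
print(suffix_function("ab", ""), suffix_function("ab", "ccaca"), suffix_function("ab", "ccab"))
```

Because the empty string is a suffix of every string, the function is defined for every input and returns a value between 0 and m.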

Page 12: String Matching Chapter 32 Highlights

Finite Automata Algorithm

Construct the automaton

Example:

State m

P = a b a b a c a

Page 13: String Matching Chapter 32 Highlights

Finite Automata Algorithm

Critical transition function (delta)

Transition function (delta) obtained from Suffix function (small sigma)
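
In symbols, the relationship between the two functions is usually written as below. This is the standard textbook form, assuming states numbered 0 through m and P_q for the q-character prefix of the pattern.

```latex
\delta(q, a) = \sigma(P_q a), \qquad q \in \{0, 1, \dots, m\},\ a \in \Sigma
```

With this choice the automaton keeps the invariant phi(T_i) = sigma(T_i): after reading the first i text characters, its state equals the length of the longest prefix of P that is a suffix of T_i, so reaching state m signals a complete match.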

Page 14: String Matching Chapter 32 Highlights

Finite Automata Algorithm

Matching operation

Transition function (delta)
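
Once the transition function is available, matching is a single left-to-right scan. The sketch below assumes the table is given as a Python dictionary keyed by (state, character) and that shifts are 0-based; the function name, the table DELTA_AB, and the pattern "ab" over the alphabet {a, b} are hypothetical examples written out by hand.

```python
def finite_automaton_matcher(text: str, delta: dict, m: int) -> list[int]:
    """Scan text once and return every shift at which the automaton reaches state m."""
    shifts = []
    q = 0                                  # start state
    for i, ch in enumerate(text):
        q = delta[(q, ch)]                 # one table lookup per text character
        if q == m:                         # accepting state: the pattern ends at index i
            shifts.append(i - m + 1)       # convert the end position to a 0-based shift
    return shifts


# Hand-written table for the hypothetical pattern "ab" over the alphabet {a, b}.
DELTA_AB = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 0,
}
print(finite_automaton_matcher("abab", DELTA_AB, m=2))  # -> [0, 2]
```

Each text character triggers exactly one lookup, so the matching phase takes time proportional to n regardless of the pattern.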

Page 15: String Matching Chapter 32 Highlights

Finite Automata Algorithm

Compute transition function

Transition function (delta)
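
The transition table can be built directly from the definition: for each state q and character a, find the longest prefix of P that is a suffix of P_q followed by a. The sketch below is a straightforward, unoptimized version under that assumption; the function name and the alphabet {a, b, c} are illustrative, and the pattern is the deck's example P = ababaca.

```python
def compute_transition_function(pattern: str, alphabet: str) -> dict:
    """Build the table delta[(q, a)] for states q = 0..m and characters a in alphabet."""
    m = len(pattern)
    delta = {}
    for q in range(m + 1):
        for a in alphabet:
            candidate = pattern[:q] + a        # the q-character prefix followed by a
            k = min(m, q + 1)                  # longest prefix length worth trying
            while not candidate.endswith(pattern[:k]):
                k -= 1                         # shrink until the prefix is a suffix of candidate
            delta[(q, a)] = k                  # k = 0 always succeeds, so the loop terminates
    return delta


delta = compute_transition_function("ababaca", "abc")
print(delta[(5, "b")], delta[(5, "c")], delta[(7, "a")])  # -> 4 6 1
```

This direct construction takes O(m^3 |Σ|) time in the worst case; faster constructions exist, but this one follows the definition most closely.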

Page 16: String Matching Chapter 32 Highlights

Finite Automata Algorithm

Problem 3-1

Problem 3-2