An Improved Algorithm to Accelerate Regular Expression Evaluation
Authors: Michela Becchi, Patrick Crowley
Publisher: ANCS’07
Presenter: Wen-Tse Liang
Date: 2010/11/17



Outline

Introduction
D2FA
Improved Algorithm
Experiment Evaluation


Kumar et al. [9] observe that many states in DFAs have similar sets of outgoing transitions. Substantial space savings in excess of 90% are achievable in current rule-sets when this redundancy is exploited.

The proposed automaton, called a Delayed Input DFA (D2FA), replaces redundant transitions common to a pair of states with a single default transition.


Introduction


In this paper, we propose an improved yet simplified algorithm for building default transitions that addresses these problems.

On practical data sets, the level of compression achieved is similar to that of the original D2FA scheme, while providing a superior worst-case memory bandwidth bound.


Consider two states u and v, where both u and v have a transition labeled by the symbol a to a common third state w, and no default transition.

If we introduce a default transition from u to v, we can eliminate the a-transition from u without affecting the destination state function δ(x).
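The effect of a default transition can be illustrated with a small Python sketch. The three-state automaton, the extra symbol b, and the names `labeled`, `default`, and `next_state` are hypothetical and chosen only to mirror the u, v, w example above; they are not taken from the paper.

```python
# Labeled transitions that are actually stored (hypothetical toy automaton).
# Originally both u and v had an a-transition to w; u's copy has been removed.
labeled = {
    "u": {"b": "u"},
    "v": {"a": "w", "b": "v"},
    "w": {"a": "w", "b": "u"},
}

# Default transition u -> v, taken when u has no labeled edge for the symbol.
default = {"u": "v"}


def next_state(state, symbol):
    """Return the destination state, following default transitions as needed."""
    while symbol not in labeled[state]:
        state = default[state]  # a default transition consumes no input
    return labeled[state][symbol]


print(next_state("u", "a"))  # resolved through the default transition to v
```

The lookup for (u, a) still ends in w, so the destination state function is unchanged even though u stores one transition fewer.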


D2FA

[Figure: the two automata processing the input string aabdbc.]


Note that, by the same reasoning, if there are multiple symbols a for which u has a labeled outgoing edge and for which δ(a,u)=δ(a,v), introducing a default edge from u to v allows us to eliminate all of these edges.


The edge joining a pair of vertices (states) u and v is assigned a weight w(u,v) that is one less than the number of symbols a for which δ(a,u)=δ(a,v).
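As a rough illustration of this weight definition, the sketch below computes w(u,v) for every pair of states of a small DFA. The function name `space_reduction_weights`, the dictionary layout `delta[state][symbol]`, and the three-state example are assumptions made for this sketch only.

```python
from itertools import combinations


def space_reduction_weights(delta, alphabet):
    """Weight of each state pair: shared transitions minus one.

    delta[s][a] is the destination of state s on symbol a (hypothetical layout).
    """
    weights = {}
    for u, v in combinations(sorted(delta), 2):
        shared = sum(1 for a in alphabet if delta[u][a] == delta[v][a])
        weights[(u, v)] = shared - 1
    return weights


# Hypothetical three-state DFA over the alphabet {a, b}.
delta = {
    0: {"a": 1, "b": 0},
    1: {"a": 1, "b": 2},
    2: {"a": 1, "b": 0},
}
weights = space_reduction_weights(delta, "ab")
print(weights)
```

Here states 0 and 2 agree on both symbols, so their edge has weight 1, while the other two pairs agree on a single symbol and get weight 0.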


The natural way to avoid long default paths is to construct a maximum weight spanning tree with a specified bounded diameter.
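One possible way to approximate such a tree is a greedy, Kruskal-style pass that takes edges in decreasing weight order and rejects any edge that would push a tree's diameter over the bound. The sketch below is only a heuristic under that assumption and is not necessarily the construction used in the paper; `bounded_diameter_forest`, `eccentricity`, and the four-node example are hypothetical.

```python
from collections import deque


def eccentricity(adj, source):
    """Longest hop distance from source inside its current tree."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return max(dist.values())


def bounded_diameter_forest(nodes, weights, max_diameter):
    """Greedily pick heavy edges while every tree keeps diameter <= max_diameter."""
    adj = {n: set() for n in nodes}
    comp = {n: n for n in nodes}  # component label of each node
    chosen = []
    for (u, v), w in sorted(weights.items(), key=lambda kv: -kv[1]):
        if comp[u] == comp[v]:
            continue  # the edge would close a cycle
        # Joining two trees by (u, v) creates a longest new path of this length.
        if eccentricity(adj, u) + 1 + eccentricity(adj, v) > max_diameter:
            continue
        adj[u].add(v)
        adj[v].add(u)
        old, new = comp[v], comp[u]
        for n in nodes:
            if comp[n] == old:
                comp[n] = new
        chosen.append((u, v, w))
    return chosen


# Hypothetical weighted graph on four states.
nodes = [0, 1, 2, 3]
weights = {(0, 1): 5, (1, 2): 4, (2, 3): 3, (0, 2): 2, (1, 3): 2, (0, 3): 1}
loose = bounded_diameter_forest(nodes, weights, 3)
tight = bounded_diameter_forest(nodes, weights, 2)
print(loose, tight)
```

With a bound of 3 the greedy pass keeps the heavy path 0-1-2-3; with a bound of 2 it has to give up edge (2, 3) and attach state 3 to state 1 instead, trading total weight for a shorter tree.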

[Figure: diameter bound of 4]

[Figure: diameter bound of 2]

The goal is to propose a more general compression algorithm that leads to a traversal time bound independent of the maximum default transition path length.

For each state s, we define its depth as the minimum number of states visited when moving from s0 to s in the DFA. In other words, the initial state s0 will have depth 0, the set of states S1 directly reachable from s0 will have depth 1, the set of states S2 directly reachable from any of the S1 (but not from s0) will have depth 2, and so on.
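Depth as defined here is the breadth-first distance from s0, so it can be computed with a standard BFS. The following is a minimal sketch; the function name `state_depths`, the `delta[state][symbol]` layout, and the example DFA are assumptions of this illustration.

```python
from collections import deque


def state_depths(delta, start):
    """Breadth-first depth of every state reachable from start."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        for t in delta[s].values():
            if t not in depth:  # first visit gives the minimum depth
                depth[t] = depth[s] + 1
                queue.append(t)
    return depth


# Hypothetical three-state DFA over the alphabet {a, b}; state 0 plays the role of s0.
delta = {
    0: {"a": 1, "b": 0},
    1: {"a": 1, "b": 2},
    2: {"a": 1, "b": 0},
}
depths = state_depths(delta, 0)
print(depths)
```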


Improved Algorithm


Lemma: If none of the default transitions in a D2FA lead from a state with depth di to a state of depth dj with dj ≥ di, then any string of length N will require at most 2N state traversals to be processed.

In other words, a 2N time bound is guaranteed on every D2FA that has only “backwards” default transitions. In a sense, this can be thought of as a generalization of the Aho-Corasick algorithm to regular expressions.
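The lemma can be checked empirically on a small example. The sketch below uses a simplified greedy rule as a stand-in for the paper's construction: each state takes as default target the shallower state that shares the most transitions with it, and only if at least two transitions are shared. That rule, the function names, and the four-state DFA are assumptions of this illustration.

```python
from collections import deque


def state_depths(delta, start):
    """Breadth-first depth of every state reachable from start."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        for t in delta[s].values():
            if t not in depth:
                depth[t] = depth[s] + 1
                queue.append(t)
    return depth


def build_backward_d2fa(delta, start):
    """Add default transitions that only point to strictly smaller depth."""
    depth = state_depths(delta, start)
    labeled = {s: dict(delta[s]) for s in depth}
    default = {}
    for u in depth:
        best, best_shared = None, 1  # require at least two shared transitions
        for v in depth:
            if depth[v] < depth[u]:
                shared = sum(1 for a in delta[u] if delta[v][a] == delta[u][a])
                if shared > best_shared:
                    best, best_shared = v, shared
        if best is not None:
            default[u] = best
            # Keep only the transitions that differ from the default target's.
            labeled[u] = {a: t for a, t in delta[u].items() if delta[best][a] != t}
    return labeled, default


def run(labeled, default, start, text):
    """Process text and count every state traversal, default or labeled."""
    state, traversals = start, 0
    for ch in text:
        while ch not in labeled[state]:
            state = default[state]
            traversals += 1
        state = labeled[state][ch]
        traversals += 1
    return state, traversals


# Hypothetical DFA over {a, b, c} that tracks progress toward the substring "abc".
dfa = {
    0: {"a": 1, "b": 0, "c": 0},
    1: {"a": 1, "b": 2, "c": 0},
    2: {"a": 1, "b": 0, "c": 3},
    3: {"a": 1, "b": 0, "c": 0},
}
labeled, default = build_backward_d2fa(dfa, 0)
end_state, steps = run(labeled, default, 0, "aabcabca")
print(end_state, steps, "bound:", 2 * len("aabcabca"))
```

The bound holds because each consumed character raises the current depth by at most one, while each default transition lowers it by at least one, so the number of default transitions taken can never exceed the number of characters read. In this example the compressed automaton stores 5 labeled transitions instead of 12.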


Experiment Evaluation