Regular Expressions for
Processing Data on the Web
Wim Martens
What will we be doing?
It's very simple:
• You may have heard of regular expressions
• Processing Data on the Web is a hot topic
So, let's
• put both of them in a pot, stir around a bit
• aim for teaching you something new
• and see what happens
Why Regular Expressions?
They are available in tools
Your father was on holiday. He took 37,000,000 photos, with several cameras.
Some store the photos as .jpg, some as .jpeg, and some as .JPG.
He doesn't like that. In his opinion, .jpg is the only true JPEG file extension.
He is prepared to pay you one week's salary to fix this.
How do you do it? (Without developing repetitive strain injury)
Answer: rename 's/\.(JPG|jpeg)$/.jpg/' *.*
(these are Perl regular expressions)
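The same renaming idea can be sketched in Python. This is only an illustration under assumptions: the function name `normalise` and the sample file names are made up here, and the sketch works on strings rather than touching real files.

```python
import re

# Matches a trailing ".JPG" or ".jpeg"; "$" anchors the match at the end of the name.
EXTENSION = re.compile(r"\.(JPG|jpeg)$")


def normalise(filename: str) -> str:
    """Return filename with a trailing .JPG or .jpeg replaced by .jpg."""
    return EXTENSION.sub(".jpg", filename)


# Hypothetical file names, for illustration only.
for name in ["beach.JPG", "dog.jpeg", "cat.jpg", "notes.JPG.txt"]:
    print(name, "->", normalise(name))
```

Note that the replacement text is a plain string, so it contains neither an escaped dot nor a "$" anchor; only the pattern side is a regular expression.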
Regular Expression Recap
Basics
• An alphabet is a non-empty, finite set
• It contains letters, which we denote by a, b, c, ...
We usually denote an alphabet by the letter Σ
Regular Expressions
The set of regular expressions over alphabet Σ is inductively defined as:
• the symbol ∅ is a regular expression
• the symbol 𝜀 is a regular expression
• every symbol from Σ is a regular expression
• if A and B are regular expressions, then so are
  • (A.B) (concatenation)
  • (A + B) (disjunction)
  • (A*) (Kleene star)
We abbreviate (A.B) by (AB)
Regular Expressions: Semantics
The language L(r) of a regular expression r is a set of words (sequences) over alphabet Σ and is inductively defined as:
• L(∅) = ∅
• L(𝜀) = {𝜀}
• L(a) = {a}, for every a in Σ
• If r = (r1 + r2) then L(r) = L(r1) ∪ L(r2)
• If r = (r1 . r2) then L(r) = L(r1) . L(r2) (= {w1 w2 | w1 ∈ L(r1), w2 ∈ L(r2)})
• If r = (r1*) then L(r) = L(r1)* (= {w1 ... wk | k ∈ 𝐍, every wi ∈ L(r1)})
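The inductive definition can be turned into a small program almost line by line. The following is a minimal sketch, not part of the slides: the tuple encoding of expressions and the function name `language` are assumptions made here, and because L(r) can be infinite the sketch only returns the words up to a given length.

```python
# Expressions as nested tuples (a hypothetical encoding chosen for this sketch):
#   ("empty",)  ("eps",)  ("sym", "a")  ("cat", r1, r2)  ("alt", r1, r2)  ("star", r1)


def language(r, max_len):
    """Return all words of L(r) that have length <= max_len."""
    kind = r[0]
    if kind == "empty":                      # L(∅) = ∅
        return set()
    if kind == "eps":                        # L(ε) = {ε}
        return {""}
    if kind == "sym":                        # L(a) = {a}
        return {r[1]} if max_len >= 1 else set()
    if kind == "alt":                        # union of both languages
        return language(r[1], max_len) | language(r[2], max_len)
    if kind == "cat":                        # all concatenations w1 w2
        left, right = language(r[1], max_len), language(r[2], max_len)
        return {u + v for u in left for v in right if len(u + v) <= max_len}
    if kind == "star":                       # w1 ... wk for k = 0, 1, 2, ...
        base = language(r[1], max_len) - {""}
        words, frontier = {""}, {""}
        while frontier:
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= max_len} - words
            words |= frontier
        return words
    raise ValueError(f"unknown expression: {r!r}")


a, b = ("sym", "a"), ("sym", "b")
print(sorted(language(("cat", ("star", ("alt", a, b)), a), 2)))
```

The star case reads k = 0 as the empty word, which is why 𝜀 is always in L(r1*).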
Regular Expressions: Remarks
Expressions are not very readable when strictly adhering to the definition:
(((((ab)*)c)+(de))*)
So we use
• associativity of concatenation and disjunction
• priorities between operators
to make them more readable
Priorities: first (), then *, then ., then +
The above expression becomes ((ab)*c+de)*
Regular Expressions: Examples
(aa)*
(a + b)* a
(a + b)* a (a + b)
(a+b)* abb (a+b)*
{(ab)^n a (ba)^n | n ∈ 𝐍}
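These examples can be tried out with Python's `re` module, where the disjunction written "+" on the slides is spelled "|". The sketch below is an illustration only; the helper names are made up here. For the last example, which is given as a set rather than as an expression, one observation (assuming 0 ∈ 𝐍) is that (ab)^n a equals a (ba)^n, so the set coincides with the language of a(baba)*. The function `agree_up_to` checks this on all short words instead of taking it on faith.

```python
import re
from itertools import product

# Slide notation on the left, Python re syntax on the right.
EXAMPLES = {
    "(aa)*": r"(aa)*",
    "(a + b)* a": r"(a|b)*a",
    "(a + b)* a (a + b)": r"(a|b)*a(a|b)",
    "(a+b)* abb (a+b)*": r"(a|b)*abb(a|b)*",
}


def matches(pattern: str, word: str) -> bool:
    """True if the whole word belongs to the language of the pattern."""
    return re.fullmatch(pattern, word) is not None


def in_last_example(word: str) -> bool:
    """Membership in {(ab)^n a (ba)^n | n in N}, straight from the set definition."""
    return any(word == "ab" * n + "a" + "ba" * n for n in range(len(word) + 1))


def agree_up_to(max_len: int) -> bool:
    """Compare the set definition with a(baba)* on every word over {a, b} up to max_len."""
    return all(
        in_last_example(w) == matches(r"a(baba)*", w)
        for k in range(max_len + 1)
        for w in ("".join(p) for p in product("ab", repeat=k))
    )


print(agree_up_to(9))
```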
Syntactic sugar
r? abbreviates (r + 𝜀)
r+ abbreviates r . r*
Expressions vs Automata
Automata:
[Figure: a finite automaton whose edges are labelled a, a, a, b, b, b, shown with the expression aa(bb)*ab]
L(A): the language of automaton A
deterministic versus non-deterministic automata
Notation
Automaton A = (Q, Σ, 𝛅, I, F) with
• Q: finite set of states
• Σ: alphabet
• 𝛅: transition function
• I ⊆ Q: initial states
• F ⊆ Q: accepting states
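To make the notation concrete, here is a minimal sketch of how such an automaton can be simulated on a word. Everything specific in it is an assumption: the state numbers and the transition table are one possible automaton for aa(bb)*ab, not necessarily the one drawn on the slide, and Q and Σ are left implicit in the keys of the transition table.

```python
def accepts(delta, initial, accepting, word):
    """Simulate a non-deterministic automaton on word.

    delta maps (state, letter) to a set of successor states; the simulation
    tracks the set of all states reachable after each letter.
    """
    current = set(initial)
    for letter in word:
        current = {q2 for q in current for q2 in delta.get((q, letter), set())}
    return bool(current & set(accepting))


# A hypothetical automaton for aa(bb)*ab (state names are illustrative only).
DELTA = {
    (0, "a"): {1},
    (1, "a"): {2},
    (2, "b"): {3},   # first b of a "bb" block
    (3, "b"): {2},   # second b, back to the loop state
    (2, "a"): {4},
    (4, "b"): {5},
}
INITIAL = {0}
ACCEPTING = {5}

for w in ["aaab", "aabbab", "aabab"]:
    print(w, accepts(DELTA, INITIAL, ACCEPTING, w))
```

Tracking a set of current states lets the same function handle deterministic automata (the set never has more than one element) and non-deterministic ones.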
Theorem
Regular expressions and finite automata define the same languages: the regular languages
More precisely, for each regular expression r, there is a finite automaton A such that L(A) = L(r) and vice versa
But what about the blow-up?
• Expression to non-deterministic automaton: O(n)
• Automaton to expression: exponential
So, these are the basics
What do we do now?
Well... This could surprise you, but
• Regular expressions are still used a lot in research
  We are still discovering new fundamental properties
• Research also uses many variants of regular expressions
This is common in research:
Often you're solving a problem and the standard tools you have are not really what you want / need
So you have to tweak them a little bit
We will be looking at both kinds of cases
Expressions vs the Web
Who uses expressions on the Web?
You can play golf with them ("regex golf")
Schema languages for XML
Query languages for XML
Query languages for graph data
XML
eXtensible Markup Language
Developed by World Wide Web Consortium (W3C)
A widely used standard for exchanging data on the Web
"Stores data in a tree"
XML by Example
<store>
  <general>
    <name>The excellent guitar shop</name>
    <url>www.theexcellentguitarshop.com</url>
  </general>
  <catalog>
    <guitar>
      <maker>Gibson</maker>
      <type>Les Paul</type>
      <year>1959</year>
      <price>2000</price>
    </guitar>
    <guitar>
      ...
    </guitar>
  </catalog>
</store>

The corresponding tree:
store
  general
    name: The excellent...
    url: www...
  catalog
    guitar
      maker: Gibson
      type: Les Paul
      year: 1959
      price: 2000
    guitar
      ...
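The step from the XML text to the tree can be reproduced with Python's standard `xml.etree.ElementTree` module. This is a small sketch under assumptions: the helper name `outline` and the indented output format are choices made here, and the second guitar, which is elided on the slide, is simply left out.

```python
import xml.etree.ElementTree as ET

# The example document, without the elided second guitar.
DOCUMENT = """
<store>
  <general>
    <name>The excellent guitar shop</name>
    <url>www.theexcellentguitarshop.com</url>
  </general>
  <catalog>
    <guitar>
      <maker>Gibson</maker>
      <type>Les Paul</type>
      <year>1959</year>
      <price>2000</price>
    </guitar>
  </catalog>
</store>
"""


def outline(element, depth=0):
    """Return the tree as indented lines: one line per element, leaf text after a colon."""
    text = (element.text or "").strip()
    lines = ["  " * depth + element.tag + (f": {text}" if text else "")]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(DOCUMENT.strip())
print("\n".join(outline(root)))
```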
![Page 67: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/67.jpg)
Exchanging XMLWeb
![Page 68: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/68.jpg)
Exchanging XMLWeb
A B
![Page 69: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/69.jpg)
Exchanging XMLWeb
A Bベルギービールは、世界で最⾼高です
![Page 70: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/70.jpg)
Exchanging XMLWeb
A Bベルギービールは、世界で最⾼高です ???
![Page 71: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/71.jpg)
Exchanging XMLWeb
A BXML
![Page 72: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/72.jpg)
Exchanging XMLWeb
A BAha!XML
![Page 73: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/73.jpg)
Exchanging XML
Web
A (XS) → B (XS): XML
XS is a schema
![Page 76: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/76.jpg)
XML Schemas by Example
Example tree:
store
  general
    name: The excellent...
    url: www...
  catalog
    guitar
      maker: Gibson
      type: Les Paul
      year: 1959
      price: 2000
    guitar
      maker: Fender
      type: Stratocaster
      year: 1954
      price: 2500
      discount: 400
Schemas describe trees:
store -> general, catalog
general -> name, url
catalog -> guitar*
guitar -> maker, type, year, price, discount?
Aha! Schemas are based on regular expressions!
![Page 80: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/80.jpg)
XML Schemas by Example
Definition: Schema for XML
A schema for XML is a tuple D = (Σ, S, R), where
• Σ is the alphabet
• S ⊆ Σ is a set of start symbols
• R is a set of rules of the form a → r, with a ∈ Σ and r a regular expression over Σ
Remarks and conventions
• We assume that, for every a ∈ Σ, there is at most one rule with left-hand side a
• In examples, S is always the singleton containing the left-hand symbol of the first rule (unless stated otherwise)
• When we don't write a rule for a, we implicitly assume a → ε
![Page 81: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/81.jpg)
XML Schemas by Example
Definition: Schema for XML (continued)
An (XML) tree t is in the language L(D) of a schema D = (Σ, S, R) if
• its root is labeled by an element of S
• for every node u, labeled a, the word formed by the labels of its children is in L(r), where a → r is the rule for a in R
![Page 82: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/82.jpg)
XML Schemas by Example
Example tree (labels only):
store
  general
    name, url
  catalog
    guitar
      maker, type, year, price
    guitar
      maker, type, year, price, discount
Schemas describe trees:
store -> general, catalog
general -> name, url
catalog -> guitar*
guitar -> maker, type, year, price, discount?
With these definitions, the tree is in the language of the schema
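The membership claim above can be checked mechanically. The following is a minimal sketch, not a DTD validator: it assumes trees are represented as (label, children) pairs and that each rule a → r is written as a Python regular expression in which every label is followed by a single space. The names `node`, `RULES`, `START` and `in_language` are hypothetical helpers introduced only for this illustration.

```python
import re

def node(label, *children):
    """A tree is a pair (label, list of child trees)."""
    return (label, list(children))

# Rules a -> r as Python regexes over labels. Each label is followed by one
# space so that label boundaries stay unambiguous (an encoding assumption).
RULES = {
    "store": r"general catalog ",
    "general": r"name url ",
    "catalog": r"(guitar )*",
    "guitar": r"maker type year price (discount )?",
}
START = {"store"}

def children_ok(tree, rules):
    """Check every node: the word of child labels must match the node's rule."""
    label, children = tree
    word = "".join(child[0] + " " for child in children)
    # Convention from the slides: a label without a rule gets a -> epsilon.
    if re.fullmatch(rules.get(label, ""), word) is None:
        return False
    return all(children_ok(child, rules) for child in children)

def in_language(tree, rules=RULES, start=START):
    """True if the root label is a start symbol and every node obeys its rule."""
    return tree[0] in start and children_ok(tree, rules)

def guitar(with_discount=False):
    kids = [node(x) for x in ("maker", "type", "year", "price")]
    if with_discount:
        kids.append(node("discount"))
    return node("guitar", *kids)

store = node(
    "store",
    node("general", node("name"), node("url")),
    node("catalog", guitar(), guitar(with_discount=True)),
)

print(in_language(store))  # True
```

Text values such as "Gibson" are left out on purpose, since the definition only constrains the labels of the nodes.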
![Page 85: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/85.jpg)
XML Schemas by Example
Theory vs Practice:
I just showed you a formal definition of a Document Type Definition (DTD)
DTDs are part of the specification for XML
But their description in the spec is much longer, which means that I'm not saying some things
The biggest thing that I'm not saying: In DTDs, regular expressions must be deterministic
![Page 88: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/88.jpg)
Deterministic regular expressions
What does this mean?
Let's have a look
![Page 92: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/92.jpg)
Deterministic regular expressions
aaa(b+c) is deterministic
aa(ab + ac) is not deterministic
Definition: A regular expression is deterministic if its Glushkov automaton is deterministic
![Page 94: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/94.jpg)
Deterministic regular expressions
Glushkov automaton by example: a a (a b + a c)
Its states are positions in the regular expression
[Figure: Glushkov automaton with one state per position: a, a, a, b, a, c]
![Page 97: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/97.jpg)
Deterministic regular expressions
Glushkov automaton by example: a a (a b + a c)*
Its states are positions in the regular expression
[Figure: Glushkov automaton with one state per position (a, a, a, b, a, c), including the additional a-labeled transitions introduced by the star]
![Page 105: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/105.jpg)
Deterministic regular expressions
aaa(b+c) is deterministic
aa(ab + ac) is not deterministic
(a+b)*a is not deterministic
b*a(b*a)* is deterministic
(a + b)* a (a + b) is not deterministic
An equivalent deterministic expression seems not so easy to find?
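The classifications above can be reproduced with a short sketch of the Glushkov (position) automaton. This is an illustration under assumptions, not code from the talk: expressions are built with the hypothetical helpers `sym`, `cat`, `alt` and `star`, and `is_deterministic` simply checks that no state has two successor positions carrying the same symbol.

```python
from itertools import count

def sym(a): return ("sym", a)
def alt(e1, e2): return ("alt", e1, e2)
def star(e): return ("star", e)
def cat(*es):
    """Left-nested concatenation of one or more expressions."""
    out = es[0]
    for e in es[1:]:
        out = ("cat", out, e)
    return out

def glushkov(expr):
    """Return (symbol_of, first, follow) for the position automaton of expr."""
    symbol_of, follow = {}, {}
    fresh = count(1)

    def walk(e):
        # Returns (nullable, first, last) and fills symbol_of and follow.
        kind = e[0]
        if kind == "sym":
            p = next(fresh)
            symbol_of[p] = e[1]
            follow[p] = set()
            return False, {p}, {p}
        if kind == "star":
            _, first, last = walk(e[1])
            for p in last:
                follow[p] |= first
            return True, first, last
        n1, f1, l1 = walk(e[1])
        n2, f2, l2 = walk(e[2])
        if kind == "alt":
            return n1 or n2, f1 | f2, l1 | l2
        # kind == "cat": the last positions of e1 are followed by first(e2)
        for p in l1:
            follow[p] |= f2
        return n1 and n2, (f1 | f2 if n1 else f1), (l1 | l2 if n2 else l2)

    _, first, _ = walk(expr)
    return symbol_of, first, follow

def is_deterministic(expr):
    """True if no state has two successor positions with the same symbol."""
    symbol_of, first, follow = glushkov(expr)
    # One successor set for the initial state, one per position.
    for successors in [first, *follow.values()]:
        symbols = [symbol_of[p] for p in successors]
        if len(symbols) != len(set(symbols)):
            return False
    return True

a, b, c = sym("a"), sym("b"), sym("c")
print(is_deterministic(cat(a, a, a, alt(b, c))))            # aaa(b+c)
print(is_deterministic(cat(a, a, alt(cat(a, b), cat(a, c)))))  # aa(ab+ac)
```

In aa(ab + ac), the second a is followed by two different a-positions, which is exactly the clash the check detects. In (a+b)*a the clash is already present in the initial state.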
![Page 114: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/114.jpg)
Deterministic regular expressions
A few words about deterministic expressions
• Testing if an expression is deterministic is easy (build the Glushkov automaton and check it)
• Not every regular expression can be determinized [Brüggemann-Klein, Wood, 1998]; (a+b)*a(a+b) is an example
• It can be tested if a given expression can be determinized [Brüggemann-Klein, Wood, 1998] (it's complicated)
• The problem is PSPACE-complete [Czerwinski et al, 2013]
![Page 116: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/116.jpg)
Back to Schemas
What do people do with schemas?
• Validate trees
• Construct new schemas by combining others
• Redesign or optimize them
![Page 120: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/120.jpg)
XML Validation
The XML validation problem
Input: A tree t and a schema D
Question: Is t ∈ L(D)?
Solution: Test, for every node u of the tree, whether the word formed by its children is in the language of the relevant expression
This was easy!
![Page 121: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/121.jpg)
XML Validation
The incremental XML validation problem
Tree (labels only):
store
  general
    name, url
  catalog
    guitar
      maker, type, year, price
    guitar
      maker, type, year, price, discount
Schema:
store -> general, catalog
general -> name, url
catalog -> guitar*
guitar -> maker, type, year, price, discount?
![Page 123: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/123.jpg)
XML Validation
The incremental XML validation problem
Tree after the update (labels only):
store
  general
    name, url
  catalog
    guitar
      maker, type, year, discount
    guitar
      maker, type, year, price, discount
Schema:
store -> general, catalog
general -> name, url
catalog -> guitar*
guitar -> maker, type, year, price, discount?
Oh no, it's broken!
![Page 131: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/131.jpg)
XML Validation
The incremental XML validation problem
So, the setting is as follows:
• We have a schema D and a (possibly huge) XML tree t
• t ∈ L(D)
• But t is updated to u(t)
Question: is u(t) in L(D)?
This boils down to the same problem for regular expressions
Just take the above scenario and replace tree t and schema D by word w and expression r
![Page 133: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/133.jpg)
Incremental Evaluation
Say that we have word w = a_1 ... a_n and expression r
We can do the updates:
• replace(i,b): replace a_i by b
• insert(i,b): insert a new symbol b after position i
• delete(i): delete position i
We want to deal with an update quickly
Say, time logarithmic in n and polynomial in |r|
To achieve this, we're allowed to store some auxiliary data
Can we do it?
![Page 140: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/140.jpg)
Incremental EvaluationIncremental Evaluation: How to do it
w =
Take A = (Q,Σ,𝛅,I,F), a non-deterministic automaton for r
(q1,q2) such that q2 ∈ 𝛅(q1, )
⨝
![Page 141: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/141.jpg)
Incremental Evaluation: How to do it
Take A = (Q,Σ,𝛅,I,F), a non-deterministic automaton for r
w = a_1 ... a_n
[Figure: a tree built over the symbols of w; every node stores a set of state pairs]
• Leaf for a symbol a: the pairs (q1,q2) such that q2 ∈ 𝛅(q1, a)
• Inner node: the join ⨝ of the relations R1 and R2 of its two children, that is, R = {(q1,q3) | ∃ q2 with (q1,q2) in R1 and (q2,q3) in R2}
We have w ∈ L(r) iff the relation at the root contains a pair (qi, qf) with qi ∈ I and qf ∈ F
This shows that we can do incremental evaluation in time O((log n) · |r|^3) per update, while maintaining an auxiliary structure of size O(n · |r|^2)
[Patnaik and Immerman, 1997]
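The tree of relations can be sketched in a few lines of Python. This is only an illustration under several assumptions: the class name `IncrementalMatcher` and the dictionary format for 𝛅 are made up for the example, positions are 0-based, the automaton has no ε-transitions, and only replace(i,b) is supported. Handling insert and delete would need a self-balancing tree, which is not shown here.

```python
class IncrementalMatcher:
    """Binary tree over w; each node stores the set of state pairs (p, q)
    such that the NFA can go from p to q while reading that node's subword."""

    def __init__(self, delta, initial, final, word):
        # delta: dict mapping (state, symbol) -> set of successor states
        # (an assumed input format, chosen for this sketch)
        self.delta = delta
        self.initial, self.final = set(initial), set(final)
        self.states = ({p for (p, _) in delta}
                       | {q for qs in delta.values() for q in qs}
                       | self.initial | self.final)
        self.size = 1
        while self.size < len(word):
            self.size *= 2
        # Padding leaves read the empty word, so they hold the identity relation.
        identity = frozenset((q, q) for q in self.states)
        self.tree = [identity] * (2 * self.size)
        for i, a in enumerate(word):
            self.tree[self.size + i] = self._leaf(a)
        for node in range(self.size - 1, 0, -1):
            self.tree[node] = self._join(self.tree[2 * node], self.tree[2 * node + 1])

    def _leaf(self, a):
        # Pairs (q1, q2) with q2 in delta(q1, a)
        return frozenset((p, q) for p in self.states
                         for q in self.delta.get((p, a), ()))

    @staticmethod
    def _join(r1, r2):
        # R = {(q1, q3) | exists q2 with (q1, q2) in R1 and (q2, q3) in R2}
        successors = {}
        for (q2, q3) in r2:
            successors.setdefault(q2, set()).add(q3)
        return frozenset((q1, q3) for (q1, q2) in r1
                         for q3 in successors.get(q2, ()))

    def replace(self, i, b):
        # Recompute only the nodes on the path from leaf i to the root.
        node = self.size + i
        self.tree[node] = self._leaf(b)
        node //= 2
        while node >= 1:
            self.tree[node] = self._join(self.tree[2 * node], self.tree[2 * node + 1])
            node //= 2

    def accepts(self):
        # w is accepted iff the root relation links an initial to a final state.
        return any(p in self.initial and q in self.final
                   for (p, q) in self.tree[1])


# Example: an NFA for (ab)* with states 0 and 1
m = IncrementalMatcher({(0, "a"): {1}, (1, "b"): {0}}, {0}, {0}, "abab")
```

A replace touches one leaf and its ancestors, so it performs a logarithmic number of joins, each cubic in the number of states when the relations are indexed as above.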
Back to Schemas
What do people do with schemas?
• Validate trees ✓
• Construct new schemas by combining others
• Redesign or optimize them
Combine Schemas: Why?
[Figure: on the Web, A and B exchange XML, each with schema XS]
[Figure: A1 (schema XS1) and A2 (schema XS2) both exchange XML with B, whose schema is marked "?"]
B needs the union of XS1 and XS2
Combine Schemas: Why?
[Figure: A, whose schema is marked "?", exchanges XML with B1 (schema XS1) and B2 (schema XS2)]
A needs the intersection of XS1 and XS2
Combine Schemas: Why?
• You have a schema S
• You update it to S'
Can you make a schema for the trees that are now invalid?
This requires taking the difference of schemas...
...which requires taking the difference of regular expressions
Actually, many fundamental questions regarding regular expressions regained interest through schema-related problems
Some Regular Expression Questions
Union
Given regular expressions r1 and r2, what is the worst-case blow-up for their union?
Theorem [AMW School 2015]
Dude, this is trivial: linear. You just write (r1 + r2)
Some Regular Expression Questions
Intersection
Given regular expressions r1 and r2, how large, in general, is the smallest expression for their intersection?
Quick Thoughts:
• We can convert them to NFAs N1 and N2 (linear)
• We construct an NFA for their intersection (O(|r1| × |r2|))
• Convert this NFA back to an expression (exponential)
What? Can't we do better?
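The second step, the NFA for the intersection, is the usual product construction. A minimal Python sketch follows, assuming NFAs without ε-transitions given as triples (delta, initial, final), where delta maps (state, symbol) to a set of states. The function names and this input format are choices made for the example only.

```python
def intersect_nfas(n1, n2):
    """Product automaton: runs N1 and N2 in lockstep on the same symbol."""
    d1, i1, f1 = n1
    d2, i2, f2 = n2
    delta = {}
    for (p1, a), succ1 in d1.items():
        for (p2, b), succ2 in d2.items():
            if a == b:
                # Both components must be able to read the same symbol.
                delta[((p1, p2), a)] = {(q1, q2) for q1 in succ1 for q2 in succ2}
    initial = {(p, q) for p in i1 for q in i2}
    final = {(p, q) for p in f1 for q in f2}
    return delta, initial, final


def nfa_accepts(nfa, word):
    """Simulate an NFA by tracking the set of currently reachable states."""
    delta, initial, final = nfa
    current = set(initial)
    for a in word:
        current = {q for p in current for q in delta.get((p, a), ())}
    return bool(current & set(final))


# N1: words over {a, b} with an even number of a's
n1 = ({("e", "a"): {"o"}, ("o", "a"): {"e"},
       ("e", "b"): {"e"}, ("o", "b"): {"o"}}, {"e"}, {"e"})
# N2: words over {a, b} that end in b
n2 = ({(0, "a"): {0}, (0, "b"): {0, 1}}, {0}, {1})
both = intersect_nfas(n1, n2)
```

The product has at most |Q1| × |Q2| states, which matches the quadratic bound in the list above. The expensive part is the last step, going back from the automaton to an expression.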
Theorem [Gelade, Neven, 2012 (STACS 2008)]
For every n ∈ 𝗡, there are deterministic regular expressions r1 and r2 such that
• r1 and r2 have size O(n) and
• every regular expression for L(r1) ∩ L(r2) has size at least 2^n
Some Regular Expression Questions
Complementation
Given a regular expression r, how large, in general, is the smallest expression for its complement, that is, for Σ* - L(r)?
Quick Thoughts:
• We can convert r to an NFA N_r (linear)
• Determinize N_r and obtain a DFA D_r (exponential)
• Complement D_r and obtain a DFA D_¬r (linear)
• Convert D_¬r to a regular expression ¬r (exponential)
That's double exponential!
Can we do better?
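The two middle steps (determinize, then complement) can be sketched in Python as follows. The sketch assumes an NFA without ε-transitions whose transition function is a dictionary from (state, symbol) to a set of states. The function names are illustrative, and the final conversion back to an expression is not shown.

```python
def determinize(delta, initial, final, alphabet):
    """Subset construction; DFA states are frozensets of NFA states."""
    start = frozenset(initial)
    dfa_delta, states, todo = {}, {start}, [start]
    while todo:
        subset = todo.pop()
        for a in alphabet:
            target = frozenset(q for p in subset for q in delta.get((p, a), ()))
            dfa_delta[(subset, a)] = target
            if target not in states:
                states.add(target)
                todo.append(target)
    dfa_final = {s for s in states if s & set(final)}
    return dfa_delta, start, dfa_final, states


def complement(dfa):
    """Swap final and non-final states. This is only correct because the DFA
    is complete: the empty subset acts as a sink state."""
    dfa_delta, start, dfa_final, states = dfa
    return dfa_delta, start, states - dfa_final, states


def dfa_accepts(dfa, word):
    dfa_delta, state, dfa_final, _ = dfa
    for a in word:
        state = dfa_delta[(state, a)]
    return state in dfa_final


# NFA for the words over {a, b} whose second-to-last symbol is a
nfa_delta = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "a"): {2}, (1, "b"): {2}}
d = determinize(nfa_delta, {0}, {2}, "ab")
not_d = complement(d)
```

In this small example the DFA has 4 states for a 3-state NFA. In general the subset construction can produce exponentially many states, which is the first of the two exponential steps in the list above.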
Theorem [Gelade, Neven, 2012 (STACS 2008)]
There exist regular expressions (r_n), n ∈ 𝗡, such that
• each r_n has size O(n) and
• every regular expression for Σ* - L(r_n) has size at least 2^(2^n)
Open Problem
Complementation
Given a regular expression r, what is the smallest expression for its complement, that is, for Σ* - L(r)?
• We know that the smallest expression for the complement can be double exponential
• But what if r comes from a schema? (i.e., r is deterministic)
• What if we want to find a deterministic expression for the complement?
Nobody knows!
[Losemann et al., 2012]
Back to Schemas
What do people do with schemas?
• Validate trees ✓
• Construct new schemas by combining others ✓
• Redesign or optimize them
The problems:
Say that we redesign, rewrite, or optimize a schema. Does it still accept all the data it accepted before?
This is called containment: is L(S1) ⊆ L(S2)?
Say that we want to keep the language the same. Can we find a good algorithm for optimizing the schema?
This is called minimization: find the smallest equivalent schema
Fact
Both the containment and minimization problems boil down to the same problems for regular expressions
For minimization, this is easy to see:
If a schema has some non-minimal regular expression, it is obviously not minimal
For containment, it is easy to see that L(D1) ⊆ L(D2) if and only if, for every a ∈ Σ, we have L(r1) ⊆ L(r2), where a → ri is the rule for a in Di (i = 1,2)
Containment and Minimization
We will show: Containment of regular expressions is PSPACE-complete
Containment: Given regular expressions r1 and r2, is L(r1) ⊆ L(r2)?
Step 1: Containment is in PSPACE
This is not so difficult:
• Construct the NFAs N1 and N2
• Guess a word w symbol by symbol
• Test if w is in L(N1) but not in L(N2)
• While guessing, we only need to remember the current sets of states of N1 and N2, so polynomial space suffices
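The search for a counterexample word can be sketched in Python. Note the difference with the argument above: the sketch replaces the nondeterministic guess by a deterministic breadth-first search that stores every visited pair of state sets, so it may use exponential space and only illustrates what is being tested. It assumes NFAs without ε-transitions given as (delta, initial, final) triples and single-character symbols. The function name `contained` is made up for the example.

```python
from collections import deque


def contained(n1, n2, alphabet):
    """Return (True, None) if L(N1) is a subset of L(N2),
    otherwise (False, w) for a shortest word w in L(N1) but not in L(N2)."""
    d1, i1, f1 = n1
    d2, i2, f2 = n2
    f1, f2 = set(f1), set(f2)
    start = (frozenset(i1), frozenset(i2))
    seen, queue = {start}, deque([(start, "")])
    while queue:
        (s1, s2), w = queue.popleft()
        # w is a counterexample if N1 can accept it and N2 cannot.
        if (s1 & f1) and not (s2 & f2):
            return False, w
        for a in alphabet:
            t1 = frozenset(q for p in s1 for q in d1.get((p, a), ()))
            t2 = frozenset(q for p in s2 for q in d2.get((p, a), ()))
            # If N1 is stuck, no extension of w + a can be in L(N1).
            if t1 and (t1, t2) not in seen:
                seen.add((t1, t2))
                queue.append(((t1, t2), w + a))
    return True, None


# (ab)* versus (a+b)*
n_ab = ({(0, "a"): {1}, (1, "b"): {0}}, {0}, {0})
n_all = ({(0, "a"): {0}, (0, "b"): {0}}, {0}, {0})
```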
Step 2: Containment is PSPACE-hard
This is the interesting direction
We will reduce from corridor tiling
Corridor Tiling
A tiling system S consists of:
• a finite set T of tile types [tile pictures not reproduced]
• the top row of tiles [picture not reproduced]
• the bottom row of tiles [picture not reproduced]
Question: Can we make a correct corridor tiling?
[Figure: a corridor of tiles with the bottom row and the top row marked]
Corridor Tiling
Correct Tilings: A corridor tiling is correct iff:
• the bottom row is correct
• the top row is correct
• in between are only tile types from T
• adjacent sides of tiles have the same color
Theorem
Given a tiling system, deciding if there is a correct corridor tiling is PSPACE-complete
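To make the four conditions concrete, here is a small Python checker. It is a sketch under assumptions that are not on the slides: a tile is represented as a tuple of colors (left, top, right, bottom), a tiling is a list of rows listed from bottom to top, and all rows must have the width of the given bottom row. The name `is_correct_tiling` is hypothetical.

```python
LEFT, TOP, RIGHT, BOTTOM = 0, 1, 2, 3  # positions in a tile tuple (assumed layout)


def is_correct_tiling(rows, tiles, bottom, top):
    """rows: list of rows from bottom to top; each row is a list of tiles."""
    # The bottom row and the top row must be the prescribed ones.
    if not rows or rows[0] != bottom or rows[-1] != top:
        return False
    width = len(bottom)
    if any(len(row) != width for row in rows):
        return False
    # In between, only tile types from T may be used.
    if any(t not in tiles for row in rows[1:-1] for t in row):
        return False
    # Horizontally adjacent sides must have the same color.
    for row in rows:
        if any(row[i][RIGHT] != row[i + 1][LEFT] for i in range(width - 1)):
            return False
    # Vertically adjacent sides must have the same color.
    for lower, upper in zip(rows, rows[1:]):
        if any(lo[TOP] != up[BOTTOM] for lo, up in zip(lower, upper)):
            return False
    return True


# Two tile types whose top and bottom colors alternate between r and g
t1 = ("x", "r", "x", "g")
t2 = ("x", "g", "x", "r")
bottom_row, top_row = [t1, t1], [t2, t2]
```

Checking a given tiling is easy. The hard part of the decision problem is that the number of rows is not bounded in advance.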
Given a tiling system S, we construct regular expressions r1, r2 such that L(r1) ⊆ L(r2) if and only if there is no correct corridor tiling for S
![Page 198: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/198.jpg)
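Encoding Tilings
To this end, we must encode tilings as words:
[Figure: a tiling] is encoded as the word
# # # # # # [figure: the tiles between the # delimiters are images and are not recoverable here]
over alphabet T ∪ {#}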
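As a small illustration of such an encoding, here is a sketch under stated assumptions: each tile type is named by a single character, rows are listed bottom row first, and every row is delimited by `#` (this layout matches the shape of the expression r1 given later in the talk). The function names are hypothetical.

```python
from typing import List


def encode_tiling(rows: List[str]) -> str:
    """Encode a tiling as a word over T ∪ {#}.

    Each element of `rows` is one row of single-character tile names,
    bottom row first (an assumption of this sketch).
    """
    return "#" + "#".join(rows) + "#"


def decode_tiling(word: str) -> List[str]:
    """Inverse of encode_tiling for well-formed, non-empty encodings."""
    if len(word) < 2 or word[0] != "#" or word[-1] != "#":
        raise ValueError("an encoding must start and end with '#'")
    return word[1:-1].split("#")
```

For instance, a tiling with bottom row `ab` and top row `cd` becomes the word `#ab#cd#`.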
![Page 206: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/206.jpg)
Corridor Tiling: Notation
T: set of tiles
Matching Relations H and V
• H ⊆ T × T: horizontal matching relation
  H := {(x1,x2) | right of x1 has same color as left of x2}
• V ⊆ T × T: vertical matching relation
  V := {(x1,x2) | top of x1 has same color as bottom of x2}
Reminder: Goal
Expressions r1 and r2 such that
L(r1) ⊆ L(r2)
iff S has no correct tiling
iff all tilings violate H or V
iff {encodings of tilings} ⊆ {words that violate H or V}
(r1 defines the left-hand set, r2 the right-hand set)
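The two matching relations can be computed directly from the tile colors. This is a minimal sketch; the `Tile` type and the dictionary from tile names to tiles are assumptions introduced for illustration and do not appear in the talk.

```python
from typing import Dict, NamedTuple, Set, Tuple


class Tile(NamedTuple):
    """Hypothetical tile type: one color name per side."""
    left: str
    right: str
    top: str
    bottom: str


Relation = Set[Tuple[str, str]]


def matching_relations(tiles: Dict[str, Tile]) -> Tuple[Relation, Relation]:
    """Return (H, V) over tile names, following the definitions above."""
    # (x1, x2) is in H iff the right of x1 has the same color as the left of x2.
    H = {
        (x1, x2)
        for x1, t1 in tiles.items()
        for x2, t2 in tiles.items()
        if t1.right == t2.left
    }
    # (x1, x2) is in V iff the top of x1 has the same color as the bottom of x2.
    V = {
        (x1, x2)
        for x1, t1 in tiles.items()
        for x2, t2 in tiles.items()
        if t1.top == t2.bottom
    }
    return H, V
```

Both relations have at most |T|² pairs, so their complements, which the construction enumerates, are also of polynomial size.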
![Page 215: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/215.jpg)
Construction
Encoding: # # # # # # [figure: the tiles between the # delimiters are images and are not recoverable here]
r1:
• Tiling: (# T^n)* #
• Bot. row: # b1 ... bn # (T+#)*
• Top row: (T+#)* # t1 ... tn #
r2:
• For all (x1,x2) not in H: (T+#)* x1 x2 (T+#)*
• For all (x1,x2) not in V: (T+#)* x1 (T+#)^n x2 (T+#)*
Notation
T abbreviates x1 + ... + xk, where T = {x1,...,xk}
T^n abbreviates T ... T (n times)
![Page 219: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/219.jpg)
Construction
r1:
• Tiling: (# T^n)* #
• Bot. row: # b1 ... bn # (T+#)*
• Top row: (T+#)* # t1 ... tn #
r2:
• For all (x1,x2) not in H: (T+#)* x1 x2 (T+#)*
• For all (x1,x2) not in V: (T+#)* x1 (T+#)^n x2 (T+#)*
r1 = # b1 ... bn (# T^n)* # t1 ... tn #
r2 = ⨁_{(x1,x2) ∉ H} [(T+#)* x1 x2 (T+#)*] + ⨁_{(x1,x2) ∉ V} [(T+#)* x1 (T+#)^n x2 (T+#)*]
So, the tiling system has no correct tiling iff L(r1) ⊆ L(r2)
...which means that regular expression containment is PSPACE-complete
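To make the construction concrete, the sketch below builds r1 and r2 in Python `re` syntax (which writes `|` where the slides write `+`) and searches for a word in L(r1) that is not in L(r2), that is, for the encoding of a correct tiling. Several things here are assumptions for illustration: tile types are single characters, the function names are hypothetical, and the search is bounded by `max_middle_rows`, so it is only a brute-force demonstration and not a decision procedure for containment.

```python
import re
from itertools import product
from typing import Optional, Set, Tuple

Relation = Set[Tuple[str, str]]


def build_r1(tiles: str, n: int, bottom: str, top: str) -> str:
    """r1 = # b1...bn (# T^n)* # t1...tn #"""
    tile_class = "[" + re.escape(tiles) + "]"
    return f"#{re.escape(bottom)}(?:#{tile_class}{{{n}}})*#{re.escape(top)}#"


def build_r2(tiles: str, n: int, H: Relation, V: Relation) -> str:
    """Union over all forbidden pairs of the 'violation' expressions."""
    any_sym = "[" + re.escape(tiles) + "#]"  # the expression (T+#)
    alternatives = []
    for x1, x2 in product(tiles, repeat=2):
        a, b = re.escape(x1), re.escape(x2)
        if (x1, x2) not in H:
            # x1 immediately followed by x2 in the same row.
            alternatives.append(f"{any_sym}*{a}{b}{any_sym}*")
        if (x1, x2) not in V:
            # Exactly n symbols between x1 and x2: x2 sits directly above x1.
            alternatives.append(f"{any_sym}*{a}{any_sym}{{{n}}}{b}{any_sym}*")
    if not alternatives:
        return "(?!)"  # no forbidden pair: r2 defines the empty language
    return "|".join(f"(?:{alt})" for alt in alternatives)


def find_tiling_witness(
    tiles: str, n: int, bottom: str, top: str,
    H: Relation, V: Relation, max_middle_rows: int = 3,
) -> Optional[str]:
    """Return a word in L(r1) but not in L(r2), if one exists within the bound."""
    assert len(bottom) == n and len(top) == n
    r1 = re.compile(build_r1(tiles, n, bottom, top))
    r2 = re.compile(build_r2(tiles, n, H, V))
    for k in range(max_middle_rows + 1):
        for middle in product(tiles, repeat=n * k):
            rows = [bottom]
            rows += ["".join(middle[i * n:(i + 1) * n]) for i in range(k)]
            rows.append(top)
            word = "#" + "#".join(rows) + "#"
            assert r1.fullmatch(word)  # every enumerated word is in L(r1)
            if not r2.fullmatch(word):
                return word  # containment fails: this encodes a correct tiling
    return None
```

With tiles `"ab"`, width 2, every pair allowed horizontally and V = {(a,b), (b,a)}, the bottom row `aa` and top row `bb` yield the witness `#aa#bb#`; with both bottom and top row `aa`, one middle row is needed and the witness is `#aa#bb#aa#`. A result of `None` only says that no witness exists within the bound.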
![Page 223: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/223.jpg)
Actually, this whole construction can be changed such that r1 = Σ*
So, it is also hard to decide if a given expression defines Σ*
This, in turn, can be used to show that regular expression minimization is also PSPACE-complete!
![Page 225: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/225.jpg)
Now you've seen some basics of regular expressions
...and how new applications lead to the discovery of fundamental results about them
![Page 226: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/226.jpg)
References
• [Brüggemann-Klein, Wood, 1998] Anne Brüggemann-Klein, Derick Wood: One-Unambiguous Regular Languages. Inf. Comput. 140(2): 229-253 (1998)
• [Czerwinski et al., 2013] Wojciech Czerwinski, Claire David, Katja Losemann, Wim Martens: Deciding Definability by Deterministic Regular Expressions. FoSSaCS 2013: 289-304
• [Gelade and Neven, 2012] Wouter Gelade, Frank Neven: Succinctness of the Complement and Intersection of Regular Expressions. ACM Trans. Comput. Log. 13(1): 4 (2012)
• [Losemann et al., 2012] Katja Losemann, Wim Martens, Matthias Niewerth: Descriptional Complexity of Deterministic Regular Expressions. MFCS 2012: 643-654
• [Patnaik and Immerman, 1997] Sushant Patnaik, Neil Immerman: Dyn-FO: A Parallel, Dynamic Complexity Class. J. Comput. Syst. Sci. 55(2): 199-209 (1997)
![Page 227: Regular Expressions for Processing Data on the Web · • You may have heard of regular expressions • Processing Data on the Web is a hot topic So, let's • put both of them in](https://reader036.vdocuments.us/reader036/viewer/2022071019/5fd33339a09a73534310f0cb/html5/thumbnails/227.jpg)
End of Part 1