introduction to regex(2)
TRANSCRIPT
-
8/4/2019 Introduction to Regex(2)
1/15
REGULAR EXPRESSIONS & LEX
Arwa Basabrain
-
What Are Regular Expressions?
Regular expressions are patterns of characters that match, or fail to match, sequences of characters in text. To allow
developers to create regular expression patterns, certain characters and combinations of characters have special
meanings and uses.
-
What Can Regular Expressions Be Used For?
Finding Doubled Words
Checking Input from Web Forms
Changing Date Formats
Finding Incorrect Case
Search and Replace in Word Processors
Directory Listings
Online Searching
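Two of the uses listed above, finding doubled words and changing date formats, can be sketched in a few lines. This is only an illustration in Python's `re` module (not something the slides prescribe); the sample text, the variable names and the `reformat_date` helper are assumptions made up for the example.

```python
import re

# Finding doubled words: capture a word, then require the same word again
# (back-reference \1) after some whitespace.
doubled = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)

text = "This is is a test of the the pattern."
found = [m.group(1) for m in doubled.finditer(text)]
print(found)  # ['is', 'the']


# Changing date formats: capture year, month and day, then reorder them
# in the replacement string. reformat_date is a hypothetical helper name.
def reformat_date(s):
    return re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", s)


print(reformat_date("Due 2019-08-04."))  # Due 04/08/2019.
```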
-
Regular Expression Basics
. : matches any single character except \n
* : matches 0 or more instances of the preceding regular expression
+ : matches 1 or more instances of the preceding regular expression
? : matches 0 or 1 of the preceding regular expression
| : matches the preceding or following regular expression
[ ] : defines a character class
( ) : groups the enclosed regular expression into a new regular expression
" " : matches everything within the quotation marks literally
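The operators above can be tried out one at a time. The sketch below uses Python's `re` module as a stand-in, since it shares these basic operators; the `matches` helper and the sample strings are assumptions for illustration. Quoting with `" "` is Lex-specific and is not shown here.

```python
import re


def matches(pattern, text):
    """Return True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None


# (pattern, input, expected result)
examples = [
    (r"c.t", "cat", True),         # . matches any single character
    (r"c.t", "c\nt", False),       # ... except a newline
    (r"ab*", "a", True),           # * allows zero occurrences of b
    (r"ab+", "a", False),          # + needs at least one b
    (r"ab+", "abbb", True),
    (r"colou?r", "color", True),   # ? makes the u optional
    (r"colou?r", "colouur", False),
    (r"cat|dog", "dog", True),     # | chooses between alternatives
    (r"[a-c]x", "bx", True),       # [ ] defines a character class
    (r"[a-c]x", "dx", False),
    (r"(ab)+", "abab", True),      # ( ) groups, so + applies to "ab"
    (r"(ab)+", "aba", False),
]

for pattern, text, expected in examples:
    assert matches(pattern, text) == expected
```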
-
Regular Expression Basics
.   Any character (may or may not match line terminators)
\d  A digit: [0-9]
\D  A non-digit: [^0-9]
\s  A whitespace character: [ \t\n\x0B\f\r]
\S  A non-whitespace character: [^\s]
\w  A word character: [a-zA-Z_0-9]
\W  A non-word character: [^\w]
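A short sketch of the shorthand classes in use, again in Python as an assumed host language. In Python 3 these classes are Unicode-aware by default, so the `re.ASCII` flag is passed to keep them close to the ASCII definitions listed above; the sample `line` is made up.

```python
import re

line = "id_42\tscore: 97"

digits = re.findall(r"\d+", line, flags=re.ASCII)    # runs of digits
words = re.findall(r"\w+", line, flags=re.ASCII)     # runs of word characters
non_word = re.findall(r"\W", line, flags=re.ASCII)   # single non-word characters
parts = re.split(r"\s+", line)                       # split on whitespace runs

print(digits)    # ['42', '97']
print(words)     # ['id_42', 'score', '97']
print(non_word)  # ['\t', ':', ' ']
print(parts)     # ['id_42', 'score:', '97']
```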
-
Meta-characters
meta-characters (do not match themselves, because they are used as operators in the preceding regular expressions):
( ) [ ] { } < > + / , ^ * | . \ " $ ? - %
to match a meta-character, prefix with "\"
to match a backslash, tab or newline, use \\, \t, or \n
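Escaping can be illustrated as follows. The list above is Lex's set of meta-characters; Python's `re`, used here only as an assumed stand-in, treats a smaller set as special (for example `"`, `%` and `/` are ordinary there), but the backslash rule is the same. The sample strings are made up.

```python
import re

# \$ and \. match a literal dollar sign and a literal dot.
price = re.compile(r"\$\d+\.\d\d")
prices = price.findall("cost: $3.50, not $3x50")
print(prices)  # ['$3.50']

# A literal backslash is written \\ in the pattern.
backslashes = re.findall(r"\\", r"C:\temp\new")
print(len(backslashes))  # 2

# re.escape adds the backslashes for you when the text to match is data.
literal = "a+b*(c)"
escaped = re.escape(literal)
assert re.fullmatch(escaped, literal) is not None
# Unescaped, the same text is read as operators and does not match itself.
assert re.fullmatch(literal, literal) is None
```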
-
Lex Regular Expressions
Lex uses an extended form of regular expression:
(c: character; x, y: regular expressions; s: string; m, n: integers; i: identifier)
c      any character except meta-characters (see Meta-characters above)
[...]  the list of enclosed chars (may be a range)
[^...] the list of chars not enclosed
.      any ASCII char except newline
xy     concatenation of x and y
x*     zero or more occurrences of x
x+     one or more occurrences of x (i.e. x* but not the empty string)
x?     an optional x (i.e. x or the empty string)
-
Lex Reg Exp (cont)
x|y x or y
{i} definition of i
x/y x, only if followed by y (y not removed from input)
x{m,n} m to n occurrences of x
^x x, but only at beginning of line
x$ x, but only at end of line
"s" exactly what is in the quotes (except for "\" and the following character)
A regular expression finishes with a space, tab or newline
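Several of the Lex operators above have close counterparts in Python's `re`, which makes them easy to experiment with. The sketch below is an approximation under that assumption, not Lex itself: trailing context `x/y` is imitated with a lookahead `x(?=y)`, and `^`/`$` need the `re.MULTILINE` flag to work per line. Lex definitions `{i}` and quoted strings `"s"` have no direct equivalent and are left out. All sample inputs are made up.

```python
import re

# x/y : x only if followed by y; the lookahead leaves y unconsumed.
m = re.search(r"\d+(?=px)", "width 120px height 80")
trailing = m.group()
print(trailing)  # 120

# x{m,n} : between m and n occurrences of x.
assert re.fullmatch(r"a{2,3}", "aaa")
assert not re.fullmatch(r"a{2,3}", "a")
assert not re.fullmatch(r"a{2,3}", "aaaa")

# ^x : x only at the beginning of a line.
at_start = re.findall(r"^#\w+", "#one\n two #no\n#three", flags=re.MULTILINE)
print(at_start)  # ['#one', '#three']

# x$ : x only at the end of a line.
at_end = re.findall(r"\w+;$", "a = 1;\nb = 2\nc;", flags=re.MULTILINE)
print(at_end)  # ['1;', 'c;']

# [^...] : any character not in the list.
not_listed = re.findall(r"[^aeiou\s]", "lex rules")
print(not_listed)  # ['l', 'x', 'r', 'l', 's']
```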
-
Regular Expression Examples
Matching Floating Point Numbers
[-+]?[0-9]*\.?[0-9]+
Match numbers with exponents
[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?
If you want to validate whether a particular string holds a floating point number, rather than finding a floating point number within longer text:
^[-+]?[0-9]*\.?[0-9]+$
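The difference between finding and validating can be seen by running the patterns. This sketch assumes Python's `re`; `is_float` is a hypothetical helper, and it uses `re.fullmatch` rather than `^...$` because Python's `$` also matches just before a trailing newline.

```python
import re

FLOAT = r"[-+]?[0-9]*\.?[0-9]+"
FLOAT_EXP = FLOAT + r"([eE][-+]?[0-9]+)?"

# Finding: pull every number out of a longer text.
numbers = re.findall(FLOAT, "x = -3.14, y = .5, z = 42")
print(numbers)  # ['-3.14', '.5', '42']


# Validating: the whole string must be a number.
def is_float(s):
    return re.fullmatch(FLOAT_EXP, s) is not None


print(is_float("6.02e23"))     # True
print(is_float("-.5e-3"))      # True
print(is_float("1.5 apples"))  # False
print(is_float("1."))          # False: the pattern needs a digit after the dot
```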
-
Regular Expression Examples (cont)
Matching a Valid Date
(19|20)\d\d[- /.](0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])
Match Email Address
^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$
Match a complete line of text that contains any of the words "one", "two" or "three"
^.*\b(one|two|three)\b.*$
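The three patterns above can be exercised as follows. This is a sketch in Python's `re` with made-up inputs; it also shows some limits of the patterns as written: the date pattern checks only the shape of the date, the email pattern uses upper-case classes (so it needs a case-insensitive flag for lower-case input), and `{2,4}` rejects longer top-level domains.

```python
import re

DATE = r"(19|20)\d\d[- /.](0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])"
EMAIL = r"^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$"
LINE = r"^.*\b(one|two|three)\b.*$"

# Dates: month 13 is rejected, but day 31 is accepted for any month.
assert re.fullmatch(DATE, "2019-08-04")
assert not re.fullmatch(DATE, "2019-13-04")
assert re.fullmatch(DATE, "2019-02-31")

# Email: lower-case input only matches with re.IGNORECASE.
assert not re.match(EMAIL, "user@example.com")
assert re.match(EMAIL, "user@example.com", flags=re.IGNORECASE)
assert not re.match(EMAIL, "a@b.museum", flags=re.IGNORECASE)

# Whole lines containing one of the words; "stone" does not count
# because \b requires a word boundary before "one".
text = "one fish\nstone wall\ntwo fish"
lines = [m.group(0) for m in re.finditer(LINE, text, flags=re.MULTILINE)]
print(lines)  # ['one fish', 'two fish']
```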
-
Regular Expressions Designer Program
-
Language Element section
-
Input, Regular Expression & Result sections
-
Lex program Example
-
definitions
%%
rules
%%
subroutines
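To give a feel for how the three sections fit together, here is a rough Python analogue; it is not Lex code and not the example from the slide. The names `DIGIT`, `LETTER`, `RULES` and `tokenize` are invented for the sketch. It mirrors Lex's matching rule (longest match, earliest rule on ties), but unlike Lex's default it raises an error on unmatched input instead of echoing it.

```python
import re

# "definitions": named sub-patterns, like a Lex line  DIGIT [0-9]
DIGIT = r"[0-9]"
LETTER = r"[A-Za-z_]"

# "rules": (pattern, action) pairs; an action returning None drops the text
RULES = [
    (re.compile(rf"{DIGIT}+"), lambda s: ("NUMBER", int(s))),
    (re.compile(rf"{LETTER}({LETTER}|{DIGIT})*"), lambda s: ("IDENT", s)),
    (re.compile(r"[-+*/=]"), lambda s: ("OP", s)),
    (re.compile(r"[ \t\n]+"), lambda s: None),  # skip whitespace
]


# "subroutines": supporting code, here the driver loop
def tokenize(text):
    pos, tokens = 0, []
    while pos < len(text):
        best = None
        for pattern, action in RULES:
            m = pattern.match(text, pos)
            # Keep the longest match; on ties the earlier rule stays.
            if m and (best is None or m.end() > best[0].end()):
                best = (m, action)
        if best is None:
            raise ValueError(f"no rule matches at position {pos}")
        m, action = best
        token = action(m.group())
        if token is not None:
            tokens.append(token)
        pos = m.end()
    return tokens


print(tokenize("x1 = 42 + y"))
# [('IDENT', 'x1'), ('OP', '='), ('NUMBER', 42), ('OP', '+'), ('IDENT', 'y')]
```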