Regular Expressions: The Proper Care and Feeding. Zain Naboulsi, MSDN Developer Evangelist, Microsoft


Upload: leonard-lucas

Post on 03-Jan-2016


Page 1: Regular Expressions: The Proper Care and Feeding Zain Naboulsi MSDN Developer Evangelist Microsoft

Regular Expressions: The Proper Care and Feeding

Zain Naboulsi
MSDN Developer Evangelist
Microsoft

Page 2

Introduction to Regular Expressions

What Are Regular Expressions?

Why Would I Want To Use Them?

Common Misconceptions

Anatomy of a Regular Expression

Page 3

Disclaimer

All opinions in this session are provided "AS IS" with no warranties, and confer no rights.

All opinions are mine and don't necessarily reflect the opinion of Microsoft.

Page 4

What Are Regular Expressions?

Page 5

Regular Expressions

“Regular expressions provide a powerful, flexible, and efficient method for processing text.

[They allow] you to quickly parse large amounts of text to find specific character patterns; to extract, edit, replace, or delete text substrings; or to add the extracted strings to a collection in order to generate a report.”

http://msdn2.microsoft.com/en-us/library/hs600312.aspx

Page 6

Do What?

Simply put, regular expressions help you find text patterns and do pretty much whatever you want with them.

It sounds simple, but regular expressions are one of the most difficult and least understood constructs in programming.

Page 7

Warning

Regular expressions are part art and part science. There is a steep learning curve but the rewards are significant.

Page 8

The Possibilities

Page 9

Okay, So What Is A Pattern?

“a regular or repetitive form, order, or arrangement”

http://encarta.msn.com/dictionary_1861724272/pattern.html

Page 10

PATTERNS ARE EVERYWHERE

Page 11

Checker Board

Page 12

Fibonacci Sequence

Page 13

Text

The IP Address for the server is 192.169.1.3 but it should be 192.168.1.5, and I am not sure how we managed to get into the 192.169.1 subnet but we need to remove ourselves from it immediately unless we are moving to it then I want the new IP to be 192.169.1.3 I suppose.
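The pattern hiding in that paragraph can be pulled out mechanically. The following is a minimal sketch in Python (the deck itself does not name a language for this slide); the pattern is an assumption chosen for illustration and is deliberately loose, since it does not check that each number is 255 or less.

```python
import re

text = (
    "The IP Address for the server is 192.169.1.3 but it should be "
    "192.168.1.5, and I am not sure how we managed to get into the "
    "192.169.1 subnet but we need to remove ourselves from it immediately "
    "unless we are moving to it then I want the new IP to be 192.169.1.3 "
    "I suppose."
)

# Four groups of one to three digits, separated by literal dots.
# Illustrative only: it would also accept something like 999.999.999.999.
IPV4_LIKE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")

addresses = IPV4_LIKE.findall(text)
print(addresses)  # ['192.169.1.3', '192.168.1.5', '192.169.1.3']
```

The three-part "192.169.1" subnet is skipped because the pattern insists on four groups.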

Page 14

YOU HAVE USED PATTERNS BEFORE

Page 15

Wildcard Searches For Files

Wildcards are VERY simple pattern-matching constructs; they are NOT regular expressions.

Examples:

*.txt

b*b*

?un.txt
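To make the difference between wildcards and regular expressions concrete, here is a small sketch using Python's standard `fnmatch` and `re` modules. The file names are made up, and the regular expressions paired with each wildcard are one possible translation, not the only one.

```python
import fnmatch
import re

# Hypothetical file names, invented for this illustration.
names = ["notes.txt", "bun.txt", "run.txt", "spun.txt", "bob.doc"]

# Each wildcard pattern, paired with a regular expression that
# accepts the same names in this list.
pairs = [
    ("*.txt", r".*\.txt"),     # * becomes .*  and the dot must be escaped
    ("b*b*", r"b.*b.*"),
    ("?un.txt", r".un\.txt"),  # ? becomes .  (exactly one character)
]

def wildcard_filter(pattern):
    """Return the names matched by a file-style wildcard."""
    return fnmatch.filter(names, pattern)

def regex_filter(pattern):
    """Return the names matched in full by a regular expression."""
    return [n for n in names if re.fullmatch(pattern, n)]

for wildcard, regex in pairs:
    print(wildcard, wildcard_filter(wildcard), "|", regex, regex_filter(regex))
```

Note that `*` and `?` mean different things in the two notations: in a wildcard they stand for characters, while in a regular expression they are quantifiers applied to whatever precedes them.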

Page 16

Why Use Regular Expressions?

Page 17

Major Uses of Regular Expressions

Matching = find any text anywhere regardless of complexity

Substitution = once found, you can replace text
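The two uses can be sketched in a few lines. This example is in Python and the sample sentence and date pattern are assumptions made up for illustration; other regex-capable languages offer equivalent operations under different names.

```python
import re

# Made-up sample text for this illustration.
line = "Order 1042 shipped on 2016-01-03; order 1043 ships on 2016-01-05."

DATE = r"(\d{4})-(\d{2})-(\d{2})"  # year-month-day, captured in three groups

# Matching: find the first date anywhere in the text.
first = re.search(DATE, line)
print(first.group())  # 2016-01-03

# Substitution: rewrite every date as day/month/year using the captured groups.
rewritten = re.sub(DATE, r"\3/\2/\1", line)
print(rewritten)
```

The order numbers are left alone because four digits not followed by a hyphen do not fit the pattern.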

Page 18

Features

Can literally turn 10 lines of code into 1

Extremely efficient pattern matching mechanism

Once learned, becomes one of the most indispensable techniques you can have

Page 19

Languages That Support Regular Expressions

All .NET languages

JScript

XML: XPath & XQuery

T-SQL

PERL

Java

[insert language here]

Page 20

ASP.NET Control

Page 21

Common Misconceptions

Page 22

Misconceptions

Regular Expressions can do complex programming logic

Regular Expressions can do math

Regular Expressions will give me winning lottery numbers

Page 23

Anatomy of a Regular Expression

Page 24

A Sample Expression

^\w+@[a-zA-Z_]+?\.[a-zA-Z]{2,3}$
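A quick way to see what the sample expression accepts is to run it against a few inputs. This is a sketch in Python with made-up addresses; it reads the expression as a simple e-mail check, which is an assumption based on its shape, and it shows that the pattern is much stricter than real e-mail addresses are.

```python
import re

SAMPLE = re.compile(r"^\w+@[a-zA-Z_]+?\.[a-zA-Z]{2,3}$")

# Hypothetical addresses chosen to exercise each part of the pattern.
candidates = [
    "zain@microsoft.com",     # word characters, @, letters, dot, 3 letters
    "john.doe@example.com",   # the dot before @ is not matched by \w
    "zain@mail.example.com",  # only one dot is allowed after @
    "zain@example.info",      # the ending must be 2 or 3 letters
    "not an email",
]

results = {c: bool(SAMPLE.match(c)) for c in candidates}
for candidate, ok in results.items():
    print(f"{candidate!r}: {'match' if ok else 'no match'}")
```

Only the first candidate matches, so the expression works as a teaching example rather than as a production e-mail validator.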

Page 25

Anatomy

Characters

Metacharacters

Subexpressions

Page 26

Characters

A literal character stands for itself and can be any valid value in the current encoding method.

For example, the “@” literal character is represented as the decimal value 64 in the ASCII encoding system.

^\w+@[a-zA-Z_]+?\.[a-zA-Z]{2,3}$
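As a small sketch of literal matching (in Python, with a made-up input string): a literal such as "@" matches only itself, while a character that normally has special meaning, such as the dot, has to be escaped to be treated as a literal, which is why the sample expression writes it as "\.".

```python
import re

# The numeric value of the "@" character in ASCII.
at_code = ord("@")
print(at_code)  # 64

# A literal matches only itself; start() gives its position in the text.
at_position = re.search("@", "user@host").start()
print(at_position)  # 4

# An unescaped dot matches any character; an escaped dot is a literal dot.
any_char = re.findall(r".", "a.b")
literal_dot = re.findall(r"\.", "a.b")
print(any_char)     # ['a', '.', 'b']
print(literal_dot)  # ['.']
```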

Page 27

Metacharacters

Unlike literal characters, metacharacters are used as “placeholders” for characters.

For example, the metacharacter “\t” in regular expressions represents the tab character, whereas “\d” matches any digit 0 through 9.

^\w+@[a-zA-Z_]+?\.[a-zA-Z]{2,3}$
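The two metacharacters named on this slide can be tried out directly. This is a sketch in Python with a made-up tab-separated row. One caveat that applies to Python specifically: on text strings "\d" also accepts digits from other scripts unless the `re.ASCII` flag is set, so the flag is used here to get the 0 through 9 behaviour the slide describes.

```python
import re

# A made-up tab-separated row.
row = "id\t42\tqty\t7"

# \t stands for the tab character, so it can be used to split the row.
fields = re.split(r"\t", row)
print(fields)  # ['id', '42', 'qty', '7']

# \d stands for any single digit; re.ASCII limits it to 0 through 9.
digits = re.findall(r"\d", row, flags=re.ASCII)
print(digits)  # ['4', '2', '7']

# Adding the + quantifier groups consecutive digits into whole numbers.
numbers = re.findall(r"\d+", row, flags=re.ASCII)
print(numbers)  # ['42', '7']
```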

Page 28

Subexpressions

These are simply smaller expressions nested inside larger ones.

For example, the following expression has a subexpression inside it:

(john|jane)doe
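A short sketch of how the subexpression behaves, written in Python with made-up inputs. The parentheses both limit the alternation to "john" or "jane" and capture whichever one matched; the second pattern, without parentheses, is added here only for contrast.

```python
import re

WITH_GROUP = re.compile(r"(john|jane)doe")
WITHOUT_GROUP = re.compile(r"john|janedoe")  # means "john" OR "janedoe"

# Map each input to the text captured by the subexpression (None if no match).
captured = {}
for name in ["johndoe", "janedoe", "jimdoe"]:
    m = WITH_GROUP.fullmatch(name)
    captured[name] = m.group(1) if m else None

print(captured)  # {'johndoe': 'john', 'janedoe': 'jane', 'jimdoe': None}

# Without the parentheses the alternation covers the whole expression,
# so "johndoe" is no longer accepted as a complete match.
no_group_match = WITHOUT_GROUP.fullmatch("johndoe")
print(no_group_match)  # None
```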

Page 29

Must Have Resources

Page 30

Tools

http://www.RegExLib.com

http://www.ultrapico.com/Expresso.htm

Page 31

Book

Page 32

Tools

Page 33

Summary

Page 34

Summary

Regular expressions can be used to manipulate and change text

While there is a steep learning curve, regular expressions are invaluable as a programming tool

Regular expressions are supported by virtually all major programming languages

Page 35

Next Steps

Check out some of the patterns on the RegExLib site

Do a live search on regular expressions and see what others have to say about them

Prepare yourself mentally for a rewarding journey into the world of regular expressions

Have Fun!!!
