Regular Expressions
Agenda
What are regular expressions
Need for regular expressions
Basic rules
Practical examples
Regular expression groups
search()
match()
replace()
What are Regular expressions
A regular expression is an object that describes a pattern of
text characters.
There are two ways of defining a regular expression :
var regex=new RegExp(pattern,modifiers); or
var regex=/pattern/modifiers;
Example of pattern : ^[a-zA-Z0-9]+$
Modifiers :
g - global, i - ignore case, m - multiline
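As a small illustration of the two syntaxes and the three modifiers, the sketch below builds the same pattern both ways and tries each modifier once. The variable names and sample strings are made up for this example.

```javascript
// Two equivalent ways to define the same pattern with the i modifier.
const literal = /^[a-z0-9]+$/i;                      // literal syntax
const constructed = new RegExp("^[a-z0-9]+$", "i");  // constructor syntax

console.log(literal.test("Hello123"));      // true: i ignores case
console.log(constructed.test("Hello123"));  // true
console.log(literal.test("Hello 123"));     // false: the space is not in the class

// g: match() returns every occurrence instead of only the first one.
const allDigits = "a1b2c3".match(/\d/g);
console.log(allDigits);                     // ["1", "2", "3"]

// m: ^ and $ also match at line breaks, not only at the ends of the string.
const withM = /^two$/m.test("one\ntwo");
const withoutM = /^two$/.test("one\ntwo");
console.log(withM, withoutM);               // true false
```

The constructor form is mainly useful when the pattern is only known at run time, since it takes the pattern as a string.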
Without regular expressions ...
The following JavaScript function tests whether a string contains only
alphanumeric characters :
function bTestOnlyAlphaNum(strToTest) {
    if (strToTest.length == 0) return false;
    for (var i = 0; i < strToTest.length; i++) {
        var testChar = strToTest.charCodeAt(i);
        if ((testChar < 48 || testChar > 57) &&
            (testChar < 65 || testChar > 90) &&
            (testChar < 97 || testChar > 122)) return false;
    }
    return true;
}
Magic of regular expressions
By using regular expressions, the same functionality as shown
in the previous slide can be achieved as :
function bTestOnlyAlphaNum(strToTest) {
    return (strToTest.match(/^[a-zA-Z0-9]+$/) != null);
}
We will get into details of this regular expression later. Let us
first walk through the basic rules of regular expressions.
Basic Rules
.    Matches any one character, except for line breaks.
*    Matches 0 or more of the preceding character.
+    Matches 1 or more of the preceding character.
?    Preceding character is optional. Matches 0 or 1 occurrence.
\d   Matches any single digit (opposite: \D).
\w   Matches any alphanumeric character or underscore (opposite: \W).
\s   Matches a whitespace character (opposite: \S).
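The rules above can be tried out one at a time. The sketch below is only an illustration; the sample strings and the checks object are made up for this example.

```javascript
// Each entry pairs one rule from the list with a sample check.
const checks = {
  dotAnyChar:       /a.c/.test("abc"),        // true:  . stands in for "b"
  dotLineBreak:     /a.c/.test("a\nc"),       // false: . does not match a line break
  starZeroOrMore:   /ab*c/.test("ac"),        // true:  * accepts zero "b"s
  plusOneOrMore:    /ab+c/.test("ac"),        // false: + needs at least one "b"
  questionOptional: /colou?r/.test("color"),  // true:  the "u" is optional
  digit:            /^\d+$/.test("2024"),     // true:  every character is a digit
  nonDigit:         /\D/.test("2024"),        // false: there is no non-digit
  wordChar:         /^\w+$/.test("user_1"),   // true:  letters, digits, underscore
  whitespace:       /\s/.test("a b"),         // true:  the space matches
};
console.log(checks);
```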
Basic Rules
[XYZ]    Matches any single character from the character class.
[XYZ]+   Matches one or more of any of the characters in the set.
$        Matches the end of the string.
^        Matches the beginning of a string.
[^a-z]   When inside of a character class, the ^ means NOT; in this case, it will match anything that is NOT a lowercase letter.
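A short sketch of character classes, anchors and negation, using made-up sample strings:

```javascript
const sample = "xxYZZYaa";

// [XYZ] picks a single character from the class; [XYZ]+ takes the whole run.
const oneFromClass = sample.match(/[XYZ]/)[0];
const runFromClass = sample.match(/[XYZ]+/)[0];
console.log(oneFromClass, runFromClass);   // Y YZZY

// ^ anchors the pattern to the beginning, $ to the end of the string.
const startsWithAbc = /^abc/.test("abcdef");
const endsWithAbc = /abc$/.test("abcdef");
console.log(startsWithAbc, endsWithAbc);   // true false

// Inside a class, ^ negates it: [^a-z] matches anything except a lowercase letter.
const plainWord = /[^a-z]/.test("hello");
const withBang = /[^a-z]/.test("hello!");
console.log(plainWord, withBang);          // false true
```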
Practical Examples
1. In several cases, we want the user to enter only alphanumeric
characters. We can achieve that functionality by using the
following function.
function bTestOnlyAlphaNum(strToTest) {
    return (strToTest.match(/^[a-zA-Z0-9]+$/) != null);
}
Here we are calling JavaScript's match() function on strToTest,
which is a string.
Practical Examples
The match() function returns a non-null value if the string matches
the regular expression pattern, otherwise it returns null. If it
returns a non-null value, our function returns true, meaning that
the input string contained only alphanumeric characters.
/^[a-zA-Z0-9]+$/
is the regular expression pattern.
^ specifies – from the beginning of the input string.
[a-zA-Z0-9] specifies – any one character which may be
any lowercase letter, uppercase letter or digit.
Practical Examples
+ specifies – one or more occurrences of the preceding item, here the character class [a-zA-Z0-9].
$ specifies – the end of the input string.
The entire pattern collectively specifies a string that, from
beginning to end, consists of one or more characters, each of which
must be a lowercase letter, an uppercase letter or a digit.
If the user enters a string that satisfies this regular expression
pattern, the match function returns non-null. Thus our function
returns true.
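The walk-through above can be checked against a few inputs. This sketch uses a hypothetical helper, isAlphaNumeric, which swaps match() for RegExp's test() method; test() returns true or false directly, so no comparison with null is needed.

```javascript
// Hypothetical variant of bTestOnlyAlphaNum that uses test() instead of match().
function isAlphaNumeric(strToTest) {
  return /^[a-zA-Z0-9]+$/.test(strToTest);
}

console.log(isAlphaNumeric("abc123"));   // true
console.log(isAlphaNumeric("ABCxyz"));   // true
console.log(isAlphaNumeric("abc 123"));  // false: contains a space
console.log(isAlphaNumeric("abc_123"));  // false: underscore is not in the class
console.log(isAlphaNumeric(""));         // false: + requires at least one character
```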
Practical Examples
2. The following function matches a postal code which
contains only digits and may be in format xxxxx or
xxxxx-xxxx.
function bTestPostalCode(strToTest) {
    if (strToTest == null || strToTest.length == 0) return false;
    return (strToTest.match(/^\d{5}(-\d{4})?$/) != null);
}
Matches :
12345
12345-6789
Practical Examples
\d{5} specifies five digits.
(-\d{4})? means "-" followed by four digits. Parentheses are
used for grouping. The "?" at the end means that this entire group
is optional.
Thus the entire regular expression specifies, from the beginning
of the string and till the end, there must be 5 digits followed by
an optional group of - character with another 4 digits.
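To see which inputs the postal code pattern accepts and rejects, the function can be run against a handful of values. The sample inputs below are made up for this sketch.

```javascript
function bTestPostalCode(strToTest) {
  if (strToTest == null || strToTest.length == 0) return false;
  return (strToTest.match(/^\d{5}(-\d{4})?$/) != null);
}

console.log(bTestPostalCode("12345"));        // true:  five digits
console.log(bTestPostalCode("12345-6789"));   // true:  five digits plus the optional group
console.log(bTestPostalCode("1234"));         // false: too few digits
console.log(bTestPostalCode("12345-678"));    // false: the group needs exactly four digits
console.log(bTestPostalCode("123456789"));    // false: the hyphen is part of the group
console.log(bTestPostalCode("a12345"));       // false: ^ forbids leading text
console.log(bTestPostalCode(null));           // false: caught by the explicit null check
```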
Groups
To match a pattern like 1234-567, we can write the regex as : /\d{4}-\d{3}/
In order to extract the individual portions, we can group them as follows :
/(\d{4})-(\d{3})/
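Now the first four digits are captured as group 1 and the last three digits as group 2. Inside the pattern itself they can be referred to as \1 and \2; in a replace() replacement string they are written $1 and $2, and in the array returned by a non-global match() they sit at index 1 and 2.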
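A minimal sketch of reading the grouped portions back out in JavaScript, assuming a made-up input string:

```javascript
const grouped = /(\d{4})-(\d{3})/;
const result = "Part 1234-567 shipped".match(grouped);

console.log(result[0]);     // "1234-567": the full match
console.log(result[1]);     // "1234": first group
console.log(result[2]);     // "567": second group
console.log(result.index);  // 5: position where the match starts

// In a replacement string the groups are referenced as $1 and $2.
const swapped = "1234-567".replace(grouped, "$2-$1");
console.log(swapped);       // "567-1234"
```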
Groups
Example : let's say we want to match "howdy123", including its quotes.
RegEx : /["'][^"']*["']/
But this regex does not require the opening quote to be the same as the closing quote. It will also match patterns like "howdy123', which is not permitted.
To prevent this we can write : /(["'])[^"']*\1/
Now it will match only if the opening and closing quotes are the same.
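The difference between the two quote patterns can be seen side by side. The names loose and strict are made up for this sketch.

```javascript
const loose = /["'][^"']*["']/;     // any quote may close the string
const strict = /(["'])[^"']*\1/;    // \1 must repeat the quote captured by group 1

const mismatched = "\"howdy123'";   // opens with " but closes with '
const doubleQuoted = "\"howdy123\"";
const singleQuoted = "'howdy123'";

console.log(loose.test(mismatched));     // true:  the mismatch slips through
console.log(strict.test(mismatched));    // false: closing quote differs from opening quote
console.log(strict.test(doubleQuoted));  // true
console.log(strict.test(singleQuoted));  // true
```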
search()
Finds the position of the first occurrence of the pattern in the string.
Does not support global search; the g modifier is ignored.
Returns the character position of the matched pattern, or -1 if no match is found.
Example : "abc 123 def 345 ghi".search(/\d{3}/)
Output : 4
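A short sketch of the three cases described above: a hit, the ignored g modifier, and a miss.

```javascript
const text = "abc 123 def 345 ghi";

const firstHit = text.search(/\d{3}/);
const withGlobal = text.search(/\d{3}/g);  // g makes no difference for search()
const noHit = text.search(/xyz/);

console.log(firstHit);    // 4
console.log(withGlobal);  // 4
console.log(noHit);       // -1
```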
match()
String.match(RegExp) can perform a global search and returns an array of results, or null if nothing matches.
For global search, the returned array contains all the matching parts of the source string.
For non-global search, the returned array contains the full match along with any parenthesized sub-patterns.
"abc 123 def 345 ghi".match(/\d{3}/g)
It will return an array ["123", "345"]
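The sketch below contrasts a global and a non-global call on the same string. The grouped pattern in the second call is made up for this example, to show where sub-patterns land in the result.

```javascript
const text = "abc 123 def 345 ghi";

// Global search: every matching part, no group details.
const all = text.match(/\d{3}/g);
console.log(all);          // ["123", "345"]

// Non-global search: full match first, then each parenthesized sub-pattern.
const first = text.match(/(\d)(\d)(\d)/);
console.log(first[0]);     // "123"
console.log(first[1], first[2], first[3]);  // 1 2 3
console.log(first.index);  // 4

// No match at all gives null rather than an empty array.
const none = text.match(/xyz/g);
console.log(none);         // null
```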
replace()
String.replace(RegExp, replacement)
RegExp is the expression which defines the pattern to be searched for.
Replacement is the text to replace the match found, or a function that generates the replacement text.
If we use the global modifier with replace(), the function is called for every match found; without it, only the first match is replaced. For every match, the matched value is passed to the function as the first argument. If there are groups inside the pattern, their captured values are passed as the following arguments. Each matched value is replaced with the value returned by the function for that match.
replace()
function fnReplaceWithFunction() {
    var srcText = "Number one is 011-33233334, number two is 032-83993333 and finally number three is 033-37443343. Site is http://www.abc.com/index.html";
    var result = srcText.replace(/(\d{3})-(\d{8})/g, function (found, a, b) { return b; });
    alert(result);
}
Here, for every string matching the pattern, the function receives
three arguments. For example, for the first matched string
011-33233334: found = 011-33233334, a = 011 and b = 33233334.
replace()
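The function returns b, thus the matched pattern 011-33233334
is replaced with 33233334. The same happens for all the matched
values. The final output is :
Number one is 33233334, number two is 83993333 and finally number three is 37443343. Site is http://www.abc.com/index.html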
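As a possible variation on the example above, the same result can be produced with a replacement string, where $2 stands for the second group. This sketch uses a shortened, made-up source text and console.log instead of alert().

```javascript
const srcText = "Call 011-33233334 or 032-83993333.";
const pattern = /(\d{3})-(\d{8})/g;

// Function form: receives the full match and then each group.
const viaFunction = srcText.replace(pattern, function (found, a, b) {
  return b;
});

// String form: $2 inserts the second group.
const viaString = srcText.replace(pattern, "$2");

// Without the g modifier only the first match is replaced.
const firstOnly = srcText.replace(/(\d{3})-(\d{8})/, "$2");

console.log(viaFunction);  // "Call 33233334 or 83993333."
console.log(viaString);    // "Call 33233334 or 83993333."
console.log(firstOnly);    // "Call 33233334 or 032-83993333."
```

The function form is the more flexible of the two, since the replacement can be computed from the matched values rather than only rearranged.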
Thank You