
Post on 22-Dec-2015


Regular Expressions

A regular expression (or pattern) in Perl is a template that either matches or doesn’t match a given string.

if ($str =~ /hello/) {
}

while (<STDIN>) {
    if (/hello/) {
    }
}

Regular Expressions in Perl:

@words = split /\s+/, $str;
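The two matching forms above and the split call can be tried with a short script. This is only a sketch: the sample string and the @lines array (a stand-in for lines read from STDIN) are assumptions added for illustration.

```perl
use strict;
use warnings;

# Explicit binding: =~ applies the pattern to $str.
my $str   = "say hello to Perl";
my $bound = ($str =~ /hello/) ? 1 : 0;

# Implicit form: a bare /hello/ is matched against $_.
# @lines is a hypothetical stand-in for lines read from <STDIN>.
my @lines = ("hello world\n", "goodbye\n", "oh, hello\n");
my @hits;
for (@lines) {
    push @hits, $_ if /hello/;
}

# split with a pattern breaks the string on runs of whitespace.
my @words = split /\s+/, $str;

print "bound match: $bound\n";
print "matching lines: ", scalar(@hits), "\n";
print "words: @words\n";
```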


Regular Expressions (2)

Regular Expressions in Unix:

grep "include .*h" *.h

Here "include .*h" is a regular expression, while *.h is a shell glob.


Regular Expressions (3)

/to.*ols/ matches ‘to’, followed by any string (possibly empty), followed by ‘ols’.

/to?ols/ the character before ‘?’ is optional. Thus the pattern itself can match only two strings: ‘tools’ and ‘tols’.

/hello.you/ matches any string that has ‘hello’, followed by exactly one character, followed by ‘you’.

/to*ols/ the character before ‘*’ may be repeated zero or more times. Matches ‘tools’, ‘tooooools’ and ‘tols’ (but not ‘toxols’).

/to+ols/ the character before ‘+’ may be repeated one or more times. Matches ‘tools’ and ‘tooooools’ (but not ‘tols’).

“.” matches any character except a newline (\n).
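A quick way to compare these patterns is to run them over the same test strings. In the sketch below the anchors ^ and $ are an addition, so that the whole string has to match and the differences become visible; the test strings are the ones used above plus one extra.

```perl
use strict;
use warnings;

# For each string, record which of the four patterns accept it.
my %result;
for my $s (qw(tols tools tooools toxols)) {
    $result{$s} = join ",",
        ($s =~ /^to?ols$/  ? "?"  : ()),   # optional 'o'
        ($s =~ /^to*ols$/  ? "*"  : ()),   # zero or more 'o'
        ($s =~ /^to+ols$/  ? "+"  : ()),   # one or more 'o'
        ($s =~ /^to.*ols$/ ? ".*" : ());   # 'to', anything, 'ols'
    print "$s matches: $result{$s}\n";
}
```

Only /to.*ols/ accepts ‘toxols’, because “.” is the only element here that can match the ‘x’.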


Regular Expressions (4)

Grouping – parentheses ‘( )’ are used for grouping one or more characters.

/(tools)+/ matches “toolstoolstoolstools”.

Alternatives:

/hello (world|Perl)/ - matches “hello world”, “hello Perl”.
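The effect of grouping is easiest to see next to the ungrouped version. A small sketch, with test strings chosen for illustration and anchors added so the whole string must match:

```perl
use strict;
use warnings;

# Parentheses group "tools" so that + repeats the whole word.
my $repeated = "toolstoolstools" =~ /^(tools)+$/ ? 1 : 0;

# Without parentheses, + applies only to the final "s".
my $only_s = "toolstools" =~ /^tools+$/ ? 1 : 0;

# Alternation inside a group: either "world" or "Perl" after "hello ".
my @greetings = grep { /^hello (world|Perl)$/ }
    ("hello world", "hello Perl", "hello there");

print "repeated: $repeated, only_s: $only_s\n";
print "greetings: ", join(", ", @greetings), "\n";
```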


Regular Expressions (5)

Character Class

/Hello [abcde]/ matches “Hello a” or “Hello b” …

/Hello [a-e]/ the same as above

Negating:

[^abc] any char except a,b,c


Regular Expressions (6)

Shortcuts

• \d digit

• \w word character [A-Za-z0-9_]

• \s white space

Negation with ^: [^\d] matches any non-digit character (equivalent to \D)
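Character classes and shortcuts can be checked against a sample string. The string below is made up for illustration; the /g modifier in list context, which returns every match, is used here only to make the results easy to print.

```perl
use strict;
use warnings;

my $text = "Hello c, room 42_b!";

my $class   = $text =~ /Hello [a-e]/  ? 1 : 0;   # 'c' is in the range a-e
my $negated = $text =~ /Hello [^abc]/ ? 1 : 0;   # 'c' is excluded, so no match

my @digits = $text =~ /\d/g;       # every digit
my @words  = $text =~ /\w+/g;      # runs of word characters
my @nondig = "a1b2" =~ /[^\d]/g;   # same result as /\D/g

print "class: $class, negated: $negated\n";
print "digits: @digits\n";
print "words: @words\n";
print "non-digits: @nondig\n";
```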


Regular Expressions (7)

Quantifiers:

/a{3,6}/ - matches “a” repeated 3,4,5,6 times

/(abc){3,}/ - matches three or more repetitions of “abc”.

/a{3}/ - matches exactly three repetitions of “a”.

* = {0,}

+ = {1,}

? = {0,1}
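A short sketch of counted quantifiers, using made-up strings. It also shows that quantifiers are greedy: given a choice, {3,6} takes as many repetitions as it is allowed.

```perl
use strict;
use warnings;

my $s = "aaaaaaaa";   # eight "a" characters

# {3,6} is greedy and takes six of the eight characters.
my ($range) = $s =~ /(a{3,6})/;

# {3} takes exactly three.
my ($exact) = $s =~ /(a{3})/;

# {3,} on a group: three or more repetitions of "abc".
my $three_plus = "abcabcabcabc" =~ /^(abc){3,}$/ ? 1 : 0;
my $too_few    = "abcabc"       =~ /^(abc){3,}$/ ? 1 : 0;

print length($range), " ", length($exact), " $three_plus $too_few\n";
```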


Regular Expressions (8)

Anchors

^ - marks the beginning of the string

$ - marks the end of the string

/^Hello Perl/ - matches “Hello Perl, goodbye Perl”, but not “Perl Hello Perl”

/^\s*$/ - matches all blank lines

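The three uses of “^”:

/^abc/ - at the start of a pattern, “^” anchors the match to the beginning of the string

/a\^bc/ - “\^” is an escaped, literal “^”; the pattern matches “a^bc”

/[^abc]/ - at the start of a character class, “^” negates the class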
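The anchor examples can be verified with a few lines of Perl. The @lines array is a hypothetical stand-in for lines read from a file.

```perl
use strict;
use warnings;

my $starts = "Hello Perl, goodbye Perl" =~ /^Hello Perl/ ? 1 : 0;
my $inside = "Perl Hello Perl"          =~ /^Hello Perl/ ? 1 : 0;

# Count blank lines (empty or whitespace only).
my @lines = ("first\n", "\n", "   \n", "last\n");
my $blank = grep { /^\s*$/ } @lines;

# The escaped form matches a literal caret; the class form negates.
my $literal   = "a^bc" =~ /a\^bc/ ? 1 : 0;
my ($not_abc) = "abcd" =~ /([^abc])/;

print "$starts $inside $blank $literal $not_abc\n";
```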


Regular Expressions (9)

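\b - matches at either end of a word (the start or the end of a run of \w characters)

/\bPerl\b/ - matches “Hello Perl” and “Perl”, but not “Perls” or “XPerl”. It also matches in “Perl++”, because “+” is not a \w character, so a word boundary follows “Perl”.

\B - the opposite of \b: matches wherever \b does not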
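Word boundaries are easy to misjudge, so it helps to test them. In this sketch the test strings are chosen for illustration. Note that “+” is not a \w character, so there is a word boundary between “Perl” and “++”.

```perl
use strict;
use warnings;

my %matched;
for my $s ("Hello Perl", "Perl", "Perl++", "Perls", "XPerl") {
    $matched{$s} = $s =~ /\bPerl\b/ ? 1 : 0;
    print "$s: $matched{$s}\n";
}

# \B requires that there is NO word boundary at that position.
my $inner = "XPerls" =~ /\BPerl\B/ ? 1 : 0;
print "inner: $inner\n";
```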


Regular Expressions (10)

Backreferences:

/(World|Perl) \1/ - matches “World World”, “Perl Perl”.

/((hello|hi) (world|Perl))/

•\1 refers to (hello|hi) (world|Perl)

•\2 refers to (hello|hi)

•\3 refers to (world|Perl)
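Groups are numbered by the position of their opening parenthesis, which the following sketch makes visible. The test strings are assumptions; after a successful match the same groups are also available outside the pattern as $1, $2 and $3.

```perl
use strict;
use warnings;

# \1 inside the pattern must repeat whatever group 1 matched.
my $same = "Perl Perl"  =~ /(World|Perl) \1/ ? 1 : 0;
my $diff = "World Perl" =~ /(World|Perl) \1/ ? 1 : 0;

# Nested groups: the outer group is 1, the inner ones are 2 and 3.
my @groups;
if ("hi Perl" =~ /((hello|hi) (world|Perl))/) {
    @groups = ($1, $2, $3);
}

print "$same $diff\n";
print join("|", @groups), "\n";
```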


Examples

1. What is it?

/^0x[0-9a-fA-F]+$/

2. Date format: Month-Day-Year -> Year:Day:Month

$date = "12-31-1901";

$date =~ s/(\d+)-(\d+)-(\d+)/$3:$2:$1/;
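Both examples can be run as they stand. In this sketch the candidate strings for the first pattern are made up; the date is the one used above.

```perl
use strict;
use warnings;

# Example 1: "0x" followed by one or more hexadecimal digits, nothing else.
my @ok = grep { /^0x[0-9a-fA-F]+$/ } qw(0x1F 0xdeadBEEF 0x 1F 0xZZ);

# Example 2: reorder the captured fields; $1 = month, $2 = day, $3 = year.
my $date = "12-31-1901";
$date =~ s/(\d+)-(\d+)-(\d+)/$3:$2:$1/;

print "accepted: @ok\n";
print "date: $date\n";
```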


Examples

3. Make a pattern that matches any line of input that has the same word repeated two or more times in a row. Whitespace between words may differ.


Example 3

1. /\w+/ #matches a word

2. /(\w+)/ #to remember later

3. /(\w+)\1+/ #two or more times

4. /(\w+)(\s+\1)+/ #whitespace between words

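5. “This is a test” is wrongly matched by pattern 4 (the “is” ending “This”, followed by “is”), so anchor the start at a word boundary -> /\b(\w+)(\s+\1)+/

6. “This is the theory” is still wrongly matched (“the”, followed by the “the” that begins “theory”), so anchor the end as well -> /\b(\w+)(\s+\1)+\b/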
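The difference between steps 4 and 6 shows up when both patterns are run against the two problem sentences. A sketch, with one genuinely repeated-word line added as an assumption; qr// is used only to store the patterns in variables.

```perl
use strict;
use warnings;

my $loose  = qr/(\w+)(\s+\1)+/;       # step 4
my $strict = qr/\b(\w+)(\s+\1)+\b/;   # step 6

# For each line store [loose result, strict result].
my %r;
for my $line ("Paris in the   the spring",
              "This is a test",
              "This is the theory") {
    $r{$line} = [ ($line =~ $loose ? 1 : 0), ($line =~ $strict ? 1 : 0) ];
    print "$line: loose=$r{$line}[0] strict=$r{$line}[1]\n";
}
```

Only the first line contains a real repeated word; the loose pattern reports all three.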


Regular Expressions (11)

$& - what really was matched

$` - what was before

$' - the rest of the string after the matched pattern

$` . $& . $' - original string
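A minimal sketch of the three match variables, using a made-up string. They are copied into ordinary variables right after the match, since the next successful match overwrites them.

```perl
use strict;
use warnings;

my $str = "say hello to Perl";
my ($before, $match, $after);
if ($str =~ /hel+o/) {
    ($before, $match, $after) = ($`, $&, $');
}

print "before=<$before> match=<$match> after=<$after>\n";
print "rebuilt ok\n" if $before . $match . $after eq $str;
```

On Perl versions before 5.20, using these variables anywhere in a program slows down every regular expression match, so they are best kept to small scripts there.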


Regular Expressions (12)

Substitutions:

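s/T/U/; #replaces the first T with U (only once)

s/T/U/g; #global substitution: replaces every T

s/\s+/ /g; #collapses each run of whitespace into a single space

s/(\w+) (\w+)/$2 $1/g; #swaps each pair of adjacent words

s/T/U/; #applied to the $_ variable

$str =~ s/T/U/; #applied to $str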
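The substitutions above can be tried on small inputs. In this sketch all sample strings and variable names are assumptions; the idiom (my $copy = $orig) =~ s/.../.../ is used so the original string stays unchanged.

```perl
use strict;
use warnings;

my $dna = "ATTGT";
(my $once = $dna) =~ s/T/U/;      # first T only
(my $all  = $dna) =~ s/T/U/g;     # every T

my $spaced = "a   b \t c";
$spaced =~ s/\s+/ /g;             # each whitespace run becomes one space

my $pairs = "one two three four";
$pairs =~ s/(\w+) (\w+)/$2 $1/g;  # swap each pair of words

# Without =~ the substitution works on $_, here each element in turn.
my @seqs = ("TT", "GT");
s/T/U/ for @seqs;

print "$once $all\n$spaced\n$pairs\n@seqs\n";
```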


Split and Join

$str = "aaa bbb ccc dddd";

@words = split /\s+/, $str;

$str = join ':', @words; #result is "aaa:bbb:ccc:dddd"

Leading whitespace, with $_ = " aaa b":

@words = split /\s+/, $_; #result is "", "aaa", "b"

@words = split; #result is "aaa", "b"

@words = split ' ', $_; #result is "aaa", "b"
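The leading-whitespace difference between the split forms is worth checking by counting the fields returned. A sketch with made-up input strings:

```perl
use strict;
use warnings;

my $str    = "aaa bbb  ccc dddd";
my @words  = split /\s+/, $str;
my $joined = join ':', @words;

# Leading whitespace: /\s+/ yields an empty first field, ' ' does not.
local $_ = "  aaa b";
my @with_pattern = split /\s+/, $_;
my @with_space   = split ' ', $_;
my @default      = split;           # same as split ' ', $_

print "$joined\n";
print scalar(@with_pattern), " ", scalar(@with_space), " ", scalar(@default), "\n";
```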


Grep

grep EXPR, LIST;

@results = grep /^>/, @array;

@results = grep /^>/, <FILE>;
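A small sketch of grep on an in-memory list. The FASTA-like sample lines, where header lines start with “>”, are an assumption about the kind of data meant here.

```perl
use strict;
use warnings;

# Hypothetical input lines; header lines start with '>'.
my @array   = (">seq1", "ACGT", ">seq2", "TTGA");
my @results = grep /^>/, @array;

# In scalar context grep returns the number of matching elements.
my $count = grep /^>/, @array;

# The block form accepts any expression, with $_ set to each element in turn.
my @long = grep { length($_) > 4 } @array;

print "headers: @results\n";
print "count: $count\n";
print "long: @long\n";
```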


CGI - Common Gateway Interface

CGI is a standard that defines the protocol between a web server and an application (script).

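[Diagram: Web Browser <-> Web Server <-> Application <-> DB. The browser and the server communicate over http/ ssl …; the server and the application communicate through CGI. Label: search example.]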


Sending information to CGI

Two ways to submit information:

• HTML form

<form action="/cgi-bin/scilib.pl" method="POST">
<input type="text" name="searchj" value="">
<input type="submit" value="search">
</form>

• With URL

http://www.tau.ac.il/cgi-bin/scilib.pl?searchj=protein


CGI - Simple script

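#!/usr/bin/perl
# Note: CGI.pm is not bundled with Perl 5.22 and later; install it from CPAN.
use CGI qw(:standard);

print header;                   # HTTP header (Content-Type: text/html)
print start_html('Hello CGI');  # opening tags that end_html closes (title is illustrative)
$param = param('formtext');     # value of the form field named "formtext"
print "<hr><p align=left>Hello CGI: $param";
print end_html;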


HomeWork

Write a CGI Perl script that prints the IP address of a submitted server name. The input comes from an HTML text box. You need to create two pages: (1) an HTML page with the text box and (2) a CGI script that receives the server name and prints its IP address.

See: http://www.cs.tau.ac.il/faq/home.html


HomeWork (2)

Input/Output Examples:

[maxshats@nova ~]$ ping -c 1 -w 3 tau.ac.il
ping: unknown host tau.ac.il

[maxshats@nova ~]$ ping -c 1 -w 3 www.cnn.com
PING cnn.com (207.25.71.25) from 132.67.128.249 : 56(84) bytes of data.

--- cnn.com ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss

Use a regular expression to extract the IP address from the output.
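One possible starting point, as a sketch only: the pattern below is written against the sample PING line shown above, and ping output differs between systems, so the exact format is an assumption. The (?: ) form is a group that does not capture, which keeps the whole address in $1.

```perl
use strict;
use warnings;

# Sample line copied from the output above; a real script would take it
# from the ping command instead.
my $output = "PING cnn.com (207.25.71.25) from 132.67.128.249 : 56(84) bytes of data.";

# Four groups of 1-3 digits separated by dots, inside literal parentheses.
my $ip;
if ($output =~ /^PING \S+ \((\d{1,3}(?:\.\d{1,3}){3})\)/) {
    $ip = $1;
}

print defined $ip ? "IP: $ip\n" : "no address found\n";
```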


HomeWork (3)

Run Unix commands:

$str = `ping -c 1 -w 3 www.cnn.com`; #backticks capture the command's output
print $str;


Debugger

On Unix: "perldoc perldebug"

Invoke Perl with the -d switch:

perl -d your_code.pl arg1 arg2 …


Debugger (2)

• The debugger always displays the line it is about to execute.

• Any command not recognized by the debugger is directly executed (eval'd) as Perl code (for example, you can print out some variables).

p expr - same as “print expr”

x expr - evaluates expr and dumps the result; nested data structures are printed out recursively, unlike with the real print function in Perl


Debugger (3)

s [expr] Single step. Executes until the beginning of another statement, descending into subroutine calls. If an expression is supplied that includes function calls, it too will be single-stepped.

n [expr] Next. Executes over subroutine calls, until the beginning of the next statement. If an expression is supplied that includes function calls, those functions will be executed with stops before each statement.

<CR> Repeat last n or s command.


Debugger (4)

r Continue until the return from the current subroutine.

c [line|sub] Continue, optionally inserting a one-time-only breakpoint at the specified line or subroutine.

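w [line] List a window (a few lines) around the current line, or around [line] if given. (In Perl 5.8 and later this command is v [line]; w sets a watch-expression instead.)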


Debugger (5)

b subname [condition]

b [line] [condition]

Set a breakpoint before the given line. If line is omitted, set a breakpoint on the line about to be executed. If a condition is specified, it's evaluated each time the statement is reached: a breakpoint is taken only if the condition is true. Breakpoints may only be set on lines that begin an executable statement.

b 237 $x > 30

b 237 ++$count237 < 11

b 33 /pattern/i


Debugger (6)

W expr Add a global watch-expression. (In Perl 5.8 and later this is w expr; W deletes a watch-expression.)