Practical Extraction & Report Language (PERL)
Joseph Beltran
What is PERL?
An interpreted language that is optimized for I/O, system tasks and string manipulation
Larry Wall originally created PERL because he saw the need for a language that combines the best features of other scripting languages
Uses of PERL
Text Processing: can manipulate textual data, email, news articles, log files, or just about any kind of text, with great ease
System Administration: particularly useful for tying together lots of smaller scripts, working with file systems, networking, and so on
CGI and Web Programming: can be used to process and generate HTML
Other Uses:
    DNA sequencing for The Human Genome Project
    NASA's Satellite Systems Control
    Perl Data Language for number-crunching
    Perl Object Environment for event-driven machines
Variables
PERL provides three kinds of variables: Scalars, Arrays, and Associative Arrays
The initial character of the name identifies the particular type of variable and, hence, its functionality.
Scalar Variables
$name
Strings and numbers, whether integers or decimals, are treated in the same way
    $aVar = 4;                    # an integer
    $bVar = 4.5;                  # a decimal number
    $cVar = 3.14e10;              # a floating point number
    $dVar = "a string of words";
    $eVar = $aVar . $bVar;        # note use of . to concatenate strings
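As a minimal sketch of how one scalar can behave as a number or as a string depending on the operator applied to it (the variable names below are illustrative and not from the original slides):

```perl
use strict;
use warnings;

my $count = 4;       # an integer
my $price = 4.5;     # a decimal number
my $label = "10";    # a string that happens to look like a number

my $sum    = $count + $price;   # + forces numeric context: 8.5
my $joined = $count . $price;   # . forces string context: "44.5"
my $total  = $label + 5;        # "10" is converted to the number 10: 15

print "sum=$sum joined=$joined total=$total\n";
```

The operator, not the variable, decides whether a value is used as a number or as text.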
Arrays
@name
Single-dimension list of scalars
    @aList = (2, 4, 6, 8);           # explicit values
    @aList = (1..4);                 # range of values
    @aList = (1, "two", 3, "four");  # mixed values
    @aList = ();                     # empty list
    $#aList;                         # index of last item
    $aList[0];                       # first item in @aList
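The array notation above can be exercised in a short script. This is only a sketch: push, shift, and negative indexes are standard Perl features that the slides do not mention, and the variable names are made up for illustration.

```perl
use strict;
use warnings;

my @aList = (2, 4, 6, 8);

my $last_index = $#aList;        # index of the last item: 3
my $length     = scalar @aList;  # number of items: 4
my $first      = $aList[0];      # first item: 2
my $last       = $aList[-1];     # negative indexes count from the end: 8

push @aList, 10;                 # append a value: (2, 4, 6, 8, 10)
my $removed = shift @aList;      # remove and return the first item: 2

print "last index $last_index, length $length, first $first, last $last\n";
print "after push and shift: @aList\n";
```

Note that a single element is written with $ (as in $aList[0]) because the element itself is a scalar.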
Associative Arrays
%name
A collection of key/value pairs (also called a hash), for use with attribute/value pairs. It can be pictured as a two-column table: the first element in each row is a key and the second element is a value associated with that key.
    $aAA{"A"} = 1;            # creates first row of associative array
    $aAA{"B"} = 2;            # creates second row of associative array
    %aAA = ("A", 1, "B", 2);  # same as first two statements
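A brief sketch of working with an associative array follows. The keys, exists, and delete functions are standard Perl built-ins not shown on the slides, and the key "C" is added purely for illustration.

```perl
use strict;
use warnings;

my %aAA = ("A" => 1, "B" => 2);   # => is a readable alternative to the comma
$aAA{"C"} = 3;                    # add a third key/value pair

# Hashes are unordered, so sort the keys for predictable output.
my @pairs;
for my $key (sort keys %aAA) {
    push @pairs, "$key=$aAA{$key}";
}
print join(", ", @pairs), "\n";

my $has_b = exists $aAA{"B"} ? 1 : 0;   # 1: the key "B" is present
delete $aAA{"A"};                       # remove the pair stored under "A"
my $size = scalar keys %aAA;            # 2 pairs remain
print "has B: $has_b, size after delete: $size\n";
```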
Operators
If variables are the nouns, PERL provides operators, which are the verbs.
Operators access and change the values of variables.
Some operators apply to all three kinds of variables. However, most are specialized with respect to their types.
Numeric Operators
    +     plus
    -     minus
    *     multiply
    /     divide
    **    exponentiation
    %     modulus
    ==    equal
    !=    not equal
    <     less than
    >     greater than
    <=    less than or equal to
    >=    greater than or equal to
    +=    binary assignment (add and assign)
    -=    same, subtraction
    *=    same, multiplication
    ++    auto increment
    --    auto decrement
Literal (String) Operators
    .     concatenate
    x n   repetition   # e.g., "A" x 3 => "AAA"
    eq    equal
    ne    not equal
    lt    less than
    gt    greater than
    le    less than or equal to
    ge    greater than or equal to
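The two operator families give different answers for the same operands, which is a common source of bugs. The following sketch compares them using values chosen only for illustration:

```perl
use strict;
use warnings;

# Numeric operators compare values as numbers.
my $num_equal = (10 == 10.0) ? "yes" : "no";       # yes: same number
my $num_less  = (9 < 10)     ? "yes" : "no";       # yes

# String operators compare character by character.
my $str_equal = ("10" eq "10.0") ? "yes" : "no";   # no: different text
my $str_less  = ("9" lt "10")    ? "yes" : "no";   # no: "9" sorts after "1"

my $power  = 2 ** 10;   # exponentiation: 1024
my $rem    = 17 % 5;    # modulus: 2
my $banner = "A" x 3;   # repetition: "AAA"

my $n = 5;
$n += 2;                # binary assignment: 7
$n++;                   # auto increment: 8

print "$num_equal $num_less $str_equal $str_less $power $rem $banner $n\n";
```

As a rule of thumb, use ==, <, and > for numbers, and eq, lt, and gt for text.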
Control Structures
PERL is an iterative language in which control flows from the first statement in the program to the last statement unless something interrupts.
Some of the things that can interrupt this linear flow are conditional branches and loop structures.
If Conditional Statement
    if (expression_A) {
        A_true_stmt_1;
        A_true_stmt_2;
    } elsif (expression_B) {    # note the spelling: elsif, not elseif
        B_true_stmt_1;
        B_true_stmt_2;
    } else {                    # braces are required, even for one statement
        false_stmt_1;
    }
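A concrete instance of the conditional form can look like the sketch below. The classify subroutine is a hypothetical helper written for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: describe the sign of a number.
sub classify {
    my ($n) = @_;
    if ($n < 0) {
        return "negative";
    } elsif ($n == 0) {
        return "zero";
    } else {
        return "positive";
    }
}

print classify(-3), " ", classify(0), " ", classify(7), "\n";
```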
While Loop Statement
    LABEL: while (expression) {
        stmt_1;
        stmt_2;
    }
Until Loop Statement
    LABEL: until (expression) {
        stmt_1;
        stmt_2;
    }
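The LABEL in these loop forms is optional. It becomes useful with the next and last statements, which are standard Perl but not covered on the slides. A small sketch, with COUNT as an arbitrary label name:

```perl
use strict;
use warnings;

my @seen;
my $i = 0;

# while repeats as long as the expression is true.
COUNT: while ($i < 10) {
    $i++;
    next COUNT if $i % 2;    # skip odd numbers
    last COUNT if $i > 6;    # leave the loop early
    push @seen, $i;          # collects 2, 4, 6
}

# until repeats as long as the expression is false.
my $j = 5;
until ($j <= 0) {
    $j -= 2;                 # 5 -> 3 -> 1 -> -1
}

print "@seen / $j\n";
```

An until loop is simply a while loop with the test negated.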
For Loop Statement
    LABEL: for (initial exp; test exp; increment exp) {
        stmt_1;
        stmt_2;
    }
For Each Loop Statement
    LABEL: foreach $i (@aList) {
        stmt_1;
        stmt_2;
    }
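Both loop forms can walk the same array. The sketch below, using made-up values, also shows that the foreach variable is an alias for each element, so assigning to it changes the array itself.

```perl
use strict;
use warnings;

my @aList = (2, 4, 6, 8);

# for: explicit index, test, and increment.
my $sum = 0;
for (my $i = 0; $i <= $#aList; $i++) {
    $sum += $aList[$i];          # 2 + 4 + 6 + 8 = 20
}

# foreach: visits each element in turn.
my @doubled;
foreach my $item (@aList) {
    push @doubled, $item * 2;    # 4, 8, 12, 16
}

# The loop variable aliases the element, so this modifies @aList.
foreach my $item (@aList) {
    $item += 1;                  # @aList becomes (3, 5, 7, 9)
}

print "sum=$sum doubled=@doubled aList=@aList\n";
```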
Input / Output
PERL uses filehandles to control input and output
The predefined filehandles are STDIN for accessing input, STDOUT for printing output, and STDERR for writing error messages
Additional filehandles are created by the open command
Opening Files
Syntax: open (FILEHANDLE, "filename");
Examples:
    open (INPUT, "index.html");       # for reading
    open (OUTPUT, "> index.html");    # for writing (overwrites the file)
    open (OUTPUT, ">> index.html");   # for appending
Closing Files
Syntax: close (FILEHANDLE);
Example:
    close (INPUT);
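The open calls above do not check for failure. A more defensive sketch is shown below. It assumes the three-argument form of open with lexical filehandles, which is a widely used alternative to the bareword style on the slides, and it writes to a temporary file created with the core File::Temp module so that no real file is touched.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file that is deleted when the script exits.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close($tmp_fh);

# '>' opens for writing and truncates any existing content.
open(my $out, '>', $path) or die "Cannot write $path: $!";
print $out "first line\n";
close($out);

# '>>' opens for appending.
open($out, '>>', $path) or die "Cannot append to $path: $!";
print $out "second line\n";
close($out);

# '<' opens for reading; in list context <$in> returns every line.
open(my $in, '<', $path) or die "Cannot read $path: $!";
my @lines = <$in>;
close($in);

chomp @lines;    # strip the trailing newlines
print scalar(@lines), " lines: @lines\n";
```

Checking the return value of open with "or die" reports the reason in $! when a file cannot be opened, instead of failing silently later.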
Regular Expressions
Regular expressions give us great power to do pattern matching on text documents.
Patterns
Literal String Pattern
    if ($a =~ /cat/) { print "cat found in $a"; }   # =~ applies the pattern to $a
Single-Character Pattern
    /.at/            # any one character followed by "at": matches "cat" and "bat"
    /[0-9]/          # matches any single digit, 0 to 9
    /[0123456789]/   # matches any single digit, 0 to 9
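To see what these patterns accept and reject, the sketch below filters a made-up word list with grep, a standard Perl built-in that is not used on the slides:

```perl
use strict;
use warnings;

my @words = ("cat", "bat", "at", "concatenate", "dog");

# Literal pattern: the text "cat" anywhere in the word.
my @literal = grep { /cat/ } @words;    # cat, concatenate

# . needs one character before "at", so the bare word "at" fails.
my @any_at = grep { /.at/ } @words;     # cat, bat, concatenate

# Character classes match exactly one character from the set.
my $has_digit = ("room 42" =~ /[0-9]/)       ? 1 : 0;   # 1
my $no_digit  = ("room" =~ /[0123456789]/)   ? 1 : 0;   # 0

print "literal: @literal\n";
print "any_at: @any_at\n";
print "digits: $has_digit $no_digit\n";
```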
Operators:
Substitution
    s/cat/dog/      # replaces the first "cat" with "dog"
    s/cat/dog/gi    # replaces every "cat" with "dog" (g) and ignores case (i)
Splitting
    @a = split(/cat/, $a);   # splits $a at each "cat"; the pieces, without "cat", go into @a
Joining
    $a = join("cat", "dog", "bird");   # returns "dogcatbird"; the first argument is the separator
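A short sketch of the three operators on sample strings chosen for illustration. The (my $copy = $text) =~ s/.../.../ idiom copies the string first so the original stays unchanged.

```perl
use strict;
use warnings;

my $text = "Cat and cat and CAT";

# Without modifiers, only the first case-sensitive match is replaced.
(my $first_only = $text) =~ s/cat/dog/;    # "Cat and dog and CAT"

# g replaces every match, i ignores case.
(my $all = $text) =~ s/cat/dog/gi;         # "dog and dog and dog"

# split cuts a string at each separator match.
my @parts = split(/,/, "red,green,blue");  # ("red", "green", "blue")

# join takes the separator first, then the list to glue together.
my $joined    = join("-", @parts);             # "red-green-blue"
my $sep_first = join("cat", "dog", "bird");    # "dogcatbird"

print "$first_only\n$all\n$joined\n$sep_first\n";
```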
Examples
Example 1: STDIN and STDOUT, Looping and Conditions
Example 2: SEARCH and REPLACE strings
Example 3: FILE READING and WRITING
Sample scripts are run using Active Perl 5.6 from www.activestate.com