
Post on 11-Jan-2016


Practical Extraction & Report Language

PERL

Joseph Beltran

What is PERL?

An interpreted language that is optimized for I/O, system tasks, and string manipulation.

Larry Wall originally created PERL because he saw the need for a language that combines the best features of other scripting languages.

Uses of PERL

Text Processing
    Can manipulate textual data, email, news articles, log files, or just about any kind of text, with great ease.

System Administration
    Particularly useful for tying together lots of smaller scripts, working with file systems, networking, and so on.

CGI and Web Programming
    Can be used to process and generate HTML.

Other Uses
    DNA sequencing for the Human Genome Project
    NASA's satellite systems control
    Perl Data Language for number-crunching
    Perl Object Environment for event-driven machines

Variables

PERL provides three kinds of variables: Scalars, Arrays, and Associative Arrays.

The initial character of the name identifies the particular type of variable and, hence, its functionality.

Variables

Scalar Variables

$name

Strings and numbers, whether integers or decimals, are treated in the same way.

    $aVar = 4;
    $bVar = 4.5;                  # a decimal number
    $cVar = 3.14e10;              # a floating point number
    $dVar = "a string of words";
    $eVar = $aVar . $bVar;        # note use of . to concatenate strings
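To make "treated in the same way" concrete, here is a minimal sketch of one scalar being used both as a number and as a string. The variable names ($count, $price, $sum, $joined) are made up for illustration and do not come from the slides.

```perl
use strict;
use warnings;

my $count = "4";     # a string that happens to look like a number
my $price = 4.5;     # a decimal number

# The operator, not the variable, decides how the value is used.
my $sum    = $count + $price;   # + treats both operands as numbers
my $joined = $count . $price;   # . treats both operands as strings

print "sum    = $sum\n";        # sum    = 8.5
print "joined = $joined\n";     # joined = 44.5
```

The slide code omits "my"; it is added here only because "use strict" requires variables to be declared.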

Variables

Arrays

@name

A single-dimension list of scalars.

    @aList = (2, 4, 6, 8);            # explicit values
    @aList = (1..4);                  # range of values
    @aList = (1, "two", 3, "four");   # mixed values
    @aList = ();                      # empty list
    $#aList;                          # index of last item
    $aList[0];                        # first item in @aList
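The array operations listed above can be tried together in a short script. This is only a sketch; the helper variables ($last_index, $length, $first, $last) are illustrative names, and push is a standard built-in that the slides do not mention.

```perl
use strict;
use warnings;

my @aList = (2, 4, 6, 8);

my $last_index = $#aList;         # index of the last item: 3
my $length     = scalar @aList;   # number of items: 4
my $first      = $aList[0];       # first item: 2
my $last       = $aList[-1];      # a negative index counts from the end: 8

push @aList, 10;                  # append one more scalar to the list

print "items: @aList\n";          # items: 2 4 6 8 10
```

Note that a single element is written with $ rather than @, because one element of an array is itself a scalar.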

Variables

Associative Arrays

%name

A two-column table, for use with attribute/value pairs. The first element in each row is a key and the second element is a value associated with that key.

    $aAA{"A"} = 1;              # creates first row of associative array
    $aAA{"B"} = 2;              # creates second row of associative array
    %aAA = ("A", 1, "B", 2);    # same as first two statements
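As a sketch of how the key/value rows are read back, the following walks an associative array in sorted key order. The names @lines and $has_d are assumptions made for the example; keys, sort, and exists are standard built-ins not shown on the slides.

```perl
use strict;
use warnings;

my %aAA = ("A", 1, "B", 2);
$aAA{"C"} = 3;                      # add a third row

# keys returns the keys in no particular order, so sort them first.
my @lines;
foreach my $key (sort keys %aAA) {
    push @lines, "$key => $aAA{$key}";
}
print "$_\n" for @lines;

# exists tests for a key without creating it.
my $has_d = exists $aAA{"D"} ? "yes" : "no";
print "has D: $has_d\n";
```

Sorting the keys is a design choice for repeatable output; without it the row order may differ from run to run.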

Operators

If variables are the nouns, PERL provides operators, which are the verbs.

Operators access and change the values of variables.

Some assignments apply to all three kinds of variables. However, most are specialized with respect to their types.

Operators

Numeric Operators

    +     plus
    -     minus
    *     multiply
    /     divide
    **    exponentiation
    %     modulus
    ==    equal
    !=    not equal
    <     less than
    >     greater than
    <=    less than or equal to
    >=    greater than or equal to
    +=    binary assignment (addition)
    -=    same, subtraction
    *=    same, multiplication
    ++    auto increment
    --    auto decrement

Literal Operators

    .     concatenate
    x n   repetition            # e.g., "A" x 3 => "AAA"
    eq    equal
    ne    not equal
    lt    less than
    gt    greater than
    le    less than or equal to
    ge    greater than or equal to
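Because numeric and literal (string) comparisons use different operators, the same operands can compare differently. The sketch below illustrates this with made-up variable names; each comparison is stored as 1 or 0 so it prints clearly.

```perl
use strict;
use warnings;

# Numeric operators compare values; literal operators compare text.
my $numeric_equal = (10 == 10.0)       ? 1 : 0;   # 1: the same number
my $string_equal  = ("10" eq "10.0")   ? 1 : 0;   # 0: different strings

my $numeric_less  = (9 < 10)           ? 1 : 0;   # 1: 9 is smaller than 10
my $string_less   = ("9" lt "10")      ? 1 : 0;   # 0: "9" sorts after "1"

my $power  = 2 ** 10;    # exponentiation: 1024
my $rem    = 17 % 5;     # modulus: 2
my $banner = "A" x 3;    # repetition: "AAA"

print "== $numeric_equal, eq $string_equal\n";
print "<  $numeric_less, lt $string_less\n";
print "$power $rem $banner\n";
```

Choosing == for text or eq for numbers is a common source of bugs, which is why the two families are kept separate.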

Control Structures

PERL is an iterative language in which control flows from the first statement in the program to the last statement unless something interrupts.

Some of the things that can interrupt this linear flow are conditional branches and loop structures.

Control Structures

If Conditional Statement

    if (expression_A) {
        A_true_stmt_1;
        A_true_stmt_2;
    } elsif (expression_B) {
        B_true_stmt_1;
        B_true_stmt_2;
    } else {
        false_stmt_1;
    }
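A concrete instance of the conditional template may help. This sketch assumes a hypothetical subroutine named classify; note that Perl spells the middle branch elsif and requires braces around every branch, even a single statement.

```perl
use strict;
use warnings;

# classify is a made-up helper that labels a number by its sign.
sub classify {
    my ($n) = @_;
    my $label;
    if ($n < 0) {
        $label = "negative";
    } elsif ($n == 0) {
        $label = "zero";
    } else {
        $label = "positive";
    }
    return $label;
}

print "-3 is ", classify(-3), "\n";
print " 0 is ", classify(0),  "\n";
print " 7 is ", classify(7),  "\n";
```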

Control Structures

While Loop Statement

    LABEL: while (expression) {
        stmt_1;
        stmt_2;
    }

Until Loop Statement

    LABEL: until (expression) {
        stmt_1;
        stmt_2;
    }
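The two loops are mirror images: while repeats as long as the expression is true, and until repeats as long as it is false. A minimal sketch, with counter and array names chosen only for illustration:

```perl
use strict;
use warnings;

# while: keep going while the test is true.
my $i = 0;
my @up;
while ($i < 3) {
    push @up, $i;
    $i++;
}

# until: keep going while the test is false.
my $j = 3;
my @down;
until ($j == 0) {
    push @down, $j;
    $j--;
}

print "up:   @up\n";     # up:   0 1 2
print "down: @down\n";   # down: 3 2 1
```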

Control Structures

For Loop Statement

    LABEL: for (initial exp; test exp; increment exp) {
        stmt_1;
        stmt_2;
    }

For Each Loop Statement

    LABEL: foreach $i (@aList) {
        stmt_1;
        stmt_2;
    }
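The following sketch shows both loop forms and one possible use of the optional LABEL: naming an outer loop so that an inner loop can skip to its next iteration. The label ROW and the variables are assumptions for the example; next is a standard loop-control statement not covered on the slides.

```perl
use strict;
use warnings;

my @aList = (2, 4, 6, 8);

# for: index-based loop over the array.
my $total = 0;
for (my $i = 0; $i <= $#aList; $i++) {
    $total += $aList[$i];
}
print "total: $total\n";            # total: 20

# foreach with a label: "next ROW" abandons the inner loop
# and continues with the next value of $row.
my @pairs;
ROW: foreach my $row (1 .. 3) {
    foreach my $col (1 .. 3) {
        next ROW if $col > $row;
        push @pairs, "$row$col";
    }
}
print "pairs: @pairs\n";            # pairs: 11 21 22 31 32 33
```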

Input / Output

PERL uses filehandles to control input and output.

Three filehandles are predefined: STDIN for accessing input, STDOUT for printing output, and STDERR for writing error messages.

Additional filehandles are created by the open command.

Input / Output

Opening Files

Syntax: open (FILEHANDLE, "filename");

Examples:

    open (INPUT, "index.html");        # for reading
    open (OUTPUT, "> index.html");     # for writing
    open (OUTPUT, ">> index.html");    # for appending

Closing Files

Syntax: close (FILEHANDLE);

Example:

    close (INPUT);
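A round trip through all three modes might look like the sketch below. It deliberately differs from the slide syntax in two ways, both of which are assumptions of this example rather than the author's style: it uses the three-argument form of open with lexical filehandles, and it checks every open with "or die". The temporary file comes from the core File::Temp module so that no real file is overwritten.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file that is deleted when the script exits.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close($tmp_fh);

# ">" opens for writing and truncates the file.
open(my $out, '>', $path) or die "Cannot write $path: $!";
print $out "first line\n";
close($out);

# ">>" opens for appending.
open($out, '>>', $path) or die "Cannot append to $path: $!";
print $out "second line\n";
close($out);

# "<" opens for reading; in list context <$in> returns every line.
open(my $in, '<', $path) or die "Cannot read $path: $!";
my @lines = <$in>;
close($in);

chomp @lines;                       # strip the trailing newlines
print "read: $_\n" for @lines;
```

open returns false on failure and sets $!, so testing the result avoids silently reading from or writing to an unopened handle.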

Regular Expressions

Regular expressions give us extreme power to do pattern matching on text documents.

Patterns

Literal String Pattern

    if ($a =~ /cat/) { print "cat found in $a"; }

Single-Character Pattern

    /.at/             # matches "cat" and "bat"
    /[0-9]/           # matches 0 to 9
    /[0123456789]/    # matches 0 to 9
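To see which strings the single-character patterns accept, the sketch below runs them over a small word list. The list and the array names are invented for the example.

```perl
use strict;
use warnings;

my @words = ("cat", "bat", "at", "room 7", "Cat");

my (@at_words, @with_digit);
foreach my $word (@words) {
    # . stands for exactly one character, so plain "at" does not match.
    push @at_words,   $word if $word =~ /.at/;
    # [0-9] matches any single digit anywhere in the string.
    push @with_digit, $word if $word =~ /[0-9]/;
}

print "/.at/   matched: @at_words\n";     # cat bat Cat
print "/[0-9]/ matched: @with_digit\n";   # room 7
```

"Cat" matches /.at/ because the dot accepts any character, including an uppercase C.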

Regular Expressions

Operators

Substitution

    s/cat/dog/      # replaces the first "cat" with "dog"
    s/cat/dog/gi    # replaces every "cat" with "dog", ignoring case

Splitting

    @a = split(/cat/, $a);    # splits $a into a list of pieces wherever "cat" occurs

Joining

    $a = join("", "cat", "dog", "bird");    # returns "catdogbird"
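The three operators can be exercised on sample data as follows. This is a sketch with invented strings and variable names; the point to notice is that the first argument of join is the separator placed between the items, not one of the items.

```perl
use strict;
use warnings;

my $text = "Cat chases cat";

# Copy first, then substitute, so $text itself stays unchanged.
(my $first = $text) =~ s/cat/dog/;     # case-sensitive, first match only
(my $all   = $text) =~ s/cat/dog/gi;   # g: every match, i: ignore case

print "$first\n";                      # Cat chases dog
print "$all\n";                        # dog chases dog

# split breaks a string at each match of the pattern.
my @fields = split(/,/, "red,green,blue");

# join glues a list together with the separator given first.
my $joined = join("-", @fields);
print "$joined\n";                     # red-green-blue
```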

Examples

Example 1
    STDIN and STDOUT, Looping and Conditions

Example 2
    SEARCH and REPLACE strings

Example 3
    FILE READING and WRITING

Sample scripts are run using ActivePerl 5.6 from www.activestate.com