Managing complexity (Advanced Perl)
Using Perl for specific tasks with help from Bioperl and others

Login
- Username: bioinfouser
- Password: loginbioinfo

Funny?

Goals
- I assume you already know Perl basics; this covers some more advanced features
- Learn how to write OO code
  - More flexible modules
  - Understand other modules
- Some APIs that you may need
  - Bioperl
  - Perl DBI

What I assume you already know
- Scalars
- Arrays
- Hashes
- Control structures (if-then, for, foreach, while, etc.)
- File IO

Managing complexity
- By managing complexity
  - Make hard tasks easy(er)
  - Perl itself does this
    - Regular expressions, text manipulations
  - Extensions (modules) do this
- May come at the expense of execution speed
  - You may not care
  - Consider the big picture
    - Development time
    - Errors
- Extremely custom software
  - Some things need speed

How complex is it now?
- Perl is a very compact language in terms of human languages
- Perl is large compared with other languages
  - TMTOWTDI (There's More Than One Way To Do It)
  - Perl has approximately 233 reserved words
  - Java has approximately 47 reserved words
- Both are easy to learn, harder to use effectively

General practices
- Always use #!/usr/bin/perl -w or use warnings;
- Consider use strict; for scripts longer than 10 lines
- You can't have too many comments
  - # for ordinary comments
  - =head1 ... =cut for POD documentation blocks
  - perldoc to read the POD
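
The practices above can be sketched in one small script. This is only an illustration: the subroutine name gc_content and its behaviour are made up for the example and are not part of the original slides.

```perl
#!/usr/bin/perl
use strict;      # catches undeclared variables and other slips
use warnings;    # same effect as the -w switch, scoped to this file

=head1 NAME

gc_content - hypothetical helper used only to illustrate POD comments

=head1 SYNOPSIS

    my $fraction = gc_content("ATGC");

=cut

# Return the fraction of G and C characters in a sequence string.
sub gc_content {
    my $seq = shift;
    return 0 unless length $seq;
    my $gc = ($seq =~ tr/GCgc//);   # tr with an empty replacement just counts
    return $gc / length $seq;
}

printf "%.2f\n", gc_content("ATGC");
```

Running perldoc on the file shows the POD sections; the # comment stays visible only to people reading the source.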

Getting values into the program or subroutine
- Perl passes arguments as aliases in @_; copying them into my variables (as below) gives pass-by-value behavior
- A scalar can have as a value a "pointer" (a reference) to an array, hash, function, etc.
- The args to a subroutine arrive in a special variable called @_ (command-line args to the program arrive in @ARGV)
- Any of these gets the first argument:
    my $first_value = shift @_;
    my $first_value = $_[0];
    my $first_value = shift;
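
A minimal sketch of the ways to read the first argument, plus the difference between working on a copy and working on @_ directly. The subroutine names are invented for the illustration.

```perl
use strict;
use warnings;

# Three equivalent ways to read the first argument.
sub first_shift_explicit { my $first_value = shift @_; return $first_value; }
sub first_by_index       { my $first_value = $_[0];    return $first_value; }
sub first_shift_implicit { my $first_value = shift;    return $first_value; }

# Works on a copy: the caller's variable is untouched.
sub upcase_copy { my $text = shift; $text = uc $text; return $text; }

# Works on @_ itself: the elements of @_ alias the caller's variables.
sub upcase_in_place { $_[0] = uc $_[0]; return; }

my $gene = "atp7b";
print "explicit shift: " . first_shift_explicit($gene, "extra") . "\n";
print "by index:       " . first_by_index($gene, "extra") . "\n";
print "implicit shift: " . first_shift_implicit($gene, "extra") . "\n";

my $copy = upcase_copy($gene);
print "after upcase_copy:     gene=$gene copy=$copy\n";

upcase_in_place($gene);
print "after upcase_in_place: gene=$gene\n";
```

Copying arguments into my variables at the top of a subroutine is the usual habit because it avoids changing the caller's data by accident.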

References
    my @array = ("one", "two", "three", "four");
    function_call(@array);
    function_call(\@array);
    function_call(["one", "two", "three"]);

    sub function_call {
        my $passed = shift @_;
        print $passed;
    }
Output
    oneARRAY(0x80601a0)ARRAY(0x804c9a0)
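
Printing a reference only shows its type and address. To get at the contents the reference has to be dereferenced. A small sketch, with a made-up subroutine name (describe), that handles both a plain scalar and an array reference:

```perl
use strict;
use warnings;

# Report what was passed: a plain scalar or a reference to an array.
sub describe {
    my $passed = shift;
    if (ref($passed) eq 'ARRAY') {
        # @$passed dereferences the array; $passed->[0] is its first element.
        return "array of " . scalar(@$passed) . ": " . join(",", @$passed);
    }
    return "scalar: $passed";
}

my @array = ("one", "two", "three", "four");

my $flat      = describe(@array);                   # list is flattened into @_
my $by_ref    = describe(\@array);                  # reference to the named array
my $anonymous = describe(["one", "two", "three"]);  # reference to an anonymous array

print "$flat\n$by_ref\n$anonymous\n";
```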

Debugging complex data structures
- Print the reference
  - It will tell you a little bit of information
- Use the Data::Dumper module
  - This will give you a snapshot of the whole data structure
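
A short sketch of both approaches, using Data::Dumper from the standard distribution. The %gene structure is invented for the example; its values are taken from the ATP7B record shown later in these slides.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;   # stable key order makes dumps easier to compare

my %gene = (
    name    => "ATP7B",
    aliases => [ "Wilson disease-associated protein",
                 "Copper-transporting ATPase 2" ],
    refs    => { RefSeq => "NM_000053", LocusLink => 540 },
);

# Printing the reference shows only its type and memory address.
print "Reference: ", \%gene, "\n";

# Dumper walks the whole structure, nested arrays and hashes included.
my $snapshot = Dumper(\%gene);
print $snapshot;
```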

Some more advanced features

Regular expressions
- Not Perl specific
- Very useful
- What they do:
  - String comparisons
  - String substitutions
  - Substring selection

Regex
    $string =~ /find/
    $string =~ /find$/
    $string =~ /^find/
    $string =~ /^find$/
Could put 'm' before the first slash, as in m/find/
    .   Match any character (except newline)
    \w  Match "word" character (alphanumeric plus "_")
    \W  Match non-word character
    \s  Match whitespace character
    \S  Match non-whitespace character
    \d  Match digit character
    \D  Match non-digit character
    \t  Match tab
    \n  Match newline
    \r  Match return

Repetition
    $string =~ /(ti){2}/
    $string =~ /A*T+G?C{3}A{3,}T{4,6}/
Character Classes
    $string =~ /[ATGCN]/
    $string =~ /[^ATGCNatgcn]/i
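
A sketch that exercises anchors, repetition counts and character classes on a short test string. The sequence and the helper name is_clean_dna are made up for the illustration.

```perl
use strict;
use warnings;

# True when the sequence holds nothing but A, T, G, C or N (either case).
sub is_clean_dna {
    my $seq = shift;
    return $seq !~ /[^ATGCN]/i;   # a negated class finds any other character
}

my $seq = "ATGCCCAAATTTT";

print "starts with ATG\n"             if $seq =~ /^ATG/;
print "ends with TTTT\n"              if $seq =~ /T{4}$/;
print "fits the repetition pattern\n" if $seq =~ /A*T+G?C{3}A{3,}T{4,6}/;
print "has a doubled 'ti'\n"          if "repetition" =~ /(ti){2}/;
print "three digits after a tab\n"    if "id\t540" =~ /\t\d{3}$/;
print "clean sequence\n"              if is_clean_dna($seq);
print "unexpected character found\n"  unless is_clean_dna("ATGXCC");
```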

Selection/Replacement
    $string =~ /(A{3,8})/;
    print $1;
    $string =~ s/a/A/
    $string =~ tr/atgc/ATGC/
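
A small sketch of selection, substitution and transliteration on one test string. The string is invented, and the pattern uses lowercase a to suit it.

```perl
use strict;
use warnings;

my $string = "ggaaaaatcc";

# Selection: parentheses capture the matched run of a's into $1.
my $run = "";
if ($string =~ /(a{3,8})/) {
    $run = $1;
}

# Replacement: s/// changes the first match only; add /g for every match.
(my $first_only = $string) =~ s/a/A/;
(my $every_a    = $string) =~ s/a/A/g;

# Transliteration: tr maps characters one for one. Square brackets are not
# needed; inside tr they would be treated as literal characters.
(my $upper = $string) =~ tr/atgc/ATGC/;

print "$run\n$first_only\n$every_a\n$upper\n";
```

Checking that the match succeeded before reading $1 matters, because $1 keeps its old value when a match fails.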

Additional syntax
    $string =~ /AT*?AT/
    $string =~ m#/var/log/messages#
    $_ = "ATATATAGTGTGCGTGATATGGG";
    ($one, $two, $three) = /AT..AT/g;
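
A sketch of the three ideas on this slide: non-greedy quantifiers, alternative delimiters, and a /g match in list context. The greedy and non-greedy patterns (A.*G and A.*?G) are a variation chosen to make the difference visible; they are not the slide's own pattern.

```perl
use strict;
use warnings;

$_ = "ATATATAGTGTGCGTGATATGGG";

# Greedy .* runs to the last possible G; non-greedy .*? stops at the first.
my ($greedy) = /(A.*G)/;
my ($lazy)   = /(A.*?G)/;

# With m and another delimiter the slashes in a path need no escaping.
my $path   = "/var/log/messages.1";
my $is_log = $path =~ m#^/var/log/messages# ? 1 : 0;

# In list context a /g match returns every (non-overlapping) match.
my @six_mers = /AT..AT/g;
my @at_pairs = /AT/g;

print "greedy:   $greedy\n";
print "lazy:     $lazy\n";
print "is_log:   $is_log\n";
print "six-mers: @six_mers\n";
print "AT count: ", scalar(@at_pairs), "\n";
```

Collecting the matches in an array avoids undefined variables when there are fewer matches than names on the left-hand side.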

What is a module
- Two types
  - Object-oriented type
    - Provides something similar to a class definition
  - Remote function call
    - Provides a method to import subroutines or variables for the main program to use

Howto: Making a module
Create a file called workSaver.pm
    ###########
    package workSaver;

    sub doStuff {
        print "Stuff done\n";
    }

    1; #statement that evaluates to true
    ###########
Now you can use with "use workSaver;"*
*Some restrictions apply
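
A sketch of calling the module's subroutine. To keep the example runnable as a single file, the package is written inline here; with a real workSaver.pm on disk (in a directory listed in @INC) the block would be replaced by "use workSaver;". Nothing is exported in this simple form, so the call needs the package name.

```perl
use strict;
use warnings;

# Stand-in for the contents of workSaver.pm, inlined so this runs as one file.
{
    package workSaver;

    sub doStuff {
        print "Stuff done\n";
    }
}

# No Exporter is involved, so the fully qualified name is required.
workSaver::doStuff();
```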

Howto: Making a module cont.
- This method would work very well for subroutines that are used in several programs.
- Reduces the "clutter" in your program
- Provides one maintenance point instead of an unknown number.
  - Eases bug fixes
  - Careful of boundaries

More Complete method:
- Allows you to "pollute" the namespace of the original program selectively.
- Makes the use of functions and variables easier
- Still used about the same way as the simple method but things are clearer

More Complete
    package functional;
    use strict;
    use Exporter;
    our @ISA = ("Exporter");
    our @EXPORT = qw ();
    our @EXPORT_OK = qw ($variable1 $variable2 printout);
    our $VERSION = 2.0;

    our $variable1 = "var1";
    our $variable2 = "var2";
    my $variable3 = "var3";

    sub printout {
        my $passed_variable = shift;
        print "Your variable is $passed_variable mine are $variable1 , $variable2, $variable3 \n";
    }

    1;
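
A sketch of how a caller picks what to import from this module. So that it runs as one file, the package is inlined in a BEGIN block and marked as already loaded through %INC; with a real functional.pm on disk only the "use functional qw(...)" line would be needed. The %INC trick is an assumption of this example, not something from the slides.

```perl
use strict;
use warnings;

BEGIN {
    # Stand-in for functional.pm, inlined so the example is self-contained.
    package functional;
    use Exporter;
    our @ISA       = ("Exporter");
    our @EXPORT    = qw();
    our @EXPORT_OK = qw($variable1 $variable2 printout);

    our $variable1 = "var1";
    our $variable2 = "var2";
    my  $variable3 = "var3";    # lexical: can never be exported or reached

    sub printout {
        my $passed_variable = shift;
        print "Your variable is $passed_variable mine are $variable1 , $variable2, $variable3 \n";
    }

    $INC{'functional.pm'} = __FILE__;   # tell "use" the module is loaded
}

# Import only what is asked for; everything in @EXPORT_OK is optional.
use functional qw($variable1 printout);

print "Imported: $variable1\n";
printout("mine");

# $variable2 was not imported, so it needs its package name.
print "Not imported: $functional::variable2\n";
```

Listing names in @EXPORT_OK rather than @EXPORT is what makes the namespace "pollution" selective: the caller decides which names arrive.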

CPAN
- Wouldn't it be nice to have a place where:
  - You could find a bunch of Perl modules
  - It would be browsable
  - Searchable
  - Big pipe for people to download stuff
  - Other people would be encouraged to submit fixes and updates
  - And it was all free

Sources of modules/Information
- www.CPAN.org
- www.bioperl.org
- www.perl.com
- www.cetus-links.org/oo_infos.html

Bioperl
- Set of modules that are extremely useful for working with biological data. Actively maintained.
- www.bioperl.org is a very good place to get the basics of Bioperl
- We will go through an example to see a typical use

Bioperl has several basic types of objects:
- Seq: a sequence, the most common type (Bio::Seq)
- Location objects: where it is, how long it is, etc.
- Interface objects: Bio::xyzI. No implementation, mostly documentation

Bioperl documentation
- Several different ways to find out about a module
  - perldoc Bio::Seq
  - bioperl.org
  - /usr/lib/perl5/site_perl/5.8.0/bptutorial.pl 100 Bio::Seq
  - Data::Dumper to print the data structure
  - Print the variable

Bioperl demo

Why use a database
- Transaction control - only one user can modify the data at any one time.
- Access control - some people can modify data, some can read data, others can create data structures.
- Fast handling of lots of data
- Precise definition of data (mostly).
- Easy to share data resources with others

Many choices
- There are many types: MS Access, Excel (sort of), Sybase, Oracle, Postgres, mSQL, MySQL ...
- They each have their niche and function best in certain cases; there is also considerable overlap.
- SQL (Structured Query Language) is a common thread

MySQL is better than YourSQL
- Free on Unix
- Good developer support
- Constant bug fixes and feature addition
- Good scalability to medium size and load, OK performance.
- Easy to install.
- Used at the Ensembl and UCSC genome browsers, so a lot of information is readily available in that format.

Table Structure - Schema
- Gene table
  - Gene_ID
  - Name
- Alias table
  - Alias_ID
  - Gene_ID
  - Alias
- Reference table
  - Reference_ID
  - Gene_ID
  - Reference
  - DataSource
Example record
- Gene: ATP7B
- Aliases:
  - Wilson disease-associated protein
  - Copper-transporting ATPase 2
- References:
  - Enzyme Commission: 3.6.3.4
  - UniGene: Hs.84999
  - AffyProbeU133: 204624_at
  - AffyProbeU95: 37930_at
  - RefSeq: NM_000053
  - GenBank: AF034838
  - GenBank: U11700
  - LocusLink: 540

SQL (MySQL dialect)
- SELECT col_name FROM table WHERE col_name = value;
- SELECT COUNT(*) FROM table WHERE col_name LIKE '%value%';
- SELECT COUNT(DISTINCT(col_name)) FROM table WHERE col_name IS NOT NULL;
- CREATE, UPDATE, DELETE, INSERT have similar forms

SQL cont.
- USE database_name
  - Also can be specified on the command line with -D
- SHOW TABLES - lists all the tables in that database (also SHOW DATABASES).
- DESCRIBE table_name - lists the columns and datatypes for each column
  - or SHOW COLUMNS FROM table_name

More advanced SELECTs
- SELECT (column_list) FROM (table_list) WHERE (constraints) GROUP BY (grouping columns) ORDER BY (sorting columns) LIMIT (limit number);
- SELECT col_name FROM table1, table2 WHERE table1_val = table2_val AND table1_val2 > value;
  - Example of an equi-join

Getting the names right
- If you only have one table you only need to use the column name
- When you are using joins this may not be adequate.
  - If two tables have the column primary you would need to call the column table1.primary or table2.primary

Data Types
- INT
  - Tinyint: -128 to 127
  - Smallint: -32768 to 32767
  - Mediumint: -8388608 to 8388607
  - Int: -2147483648 to 2147483647
  - Bigint: -9223372036854775808 to 9223372036854775807
- FLOAT
  - Float: 4 bytes
  - Double: 8 bytes
- CHAR
  - Char(n): character string of length n, n bytes
  - Varchar(n): character string up to n long, L+1 bytes (L is the length of the stored string)
  - Text: up to 2^16 bytes
- BLOBs: Binary Large OBjects

Perl DBI
- Method for Perl to connect to a database (virtually any database) and read or modify data.
- The statements are constructed very much like the SQL statements that would be entered on the command line, so learning SQL is still necessary

Statements in DBI
- Connect
  - Used to establish initial connection
- Prepare
  - Prepare a statement to execute
- Execute
  - Execute the statement
- Do
  - Prepare a statement that does not return results and execute it
- Fetch
  - Several types used to get returned data
- Disconnect
  - Disconnect from the server
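
A minimal sketch of the whole cycle: connect, do, prepare, execute, fetch, disconnect. To stay self-contained it assumes that DBI and the DBD::SQLite driver are installed and uses an in-memory SQLite database. The Gene table follows the schema slide, but the row inserted here is only sample data.

```perl
use strict;
use warnings;
use DBI;

# Connect: data source name, user, password, options.
# Assumption: DBD::SQLite is available; ":memory:" keeps everything in RAM.
my $dbh = DBI->connect("dbi:SQLite:dbname=:memory:", "", "",
                       { RaiseError => 1, AutoCommit => 1 });

# Do: prepare and execute in one step for statements that return no rows.
$dbh->do("CREATE TABLE Gene (Gene_ID INTEGER PRIMARY KEY, Name TEXT)");
$dbh->do("INSERT INTO Gene (Gene_ID, Name) VALUES (1, 'ATP7B')");

# Prepare, then execute, for statements that return rows.
my $sth = $dbh->prepare("SELECT Gene_ID, Name FROM Gene");
$sth->execute();

# Fetch: one row per call until the result set is exhausted.
my @rows;
while (my @row = $sth->fetchrow_array) {
    push @rows, [@row];
    print join("\t", @row), "\n";
}
$sth->finish;

# Disconnect from the server (here: close the in-memory database).
$dbh->disconnect;
```

For a MySQL server only the connect line should need to change: with the DBD::mysql driver the data source name takes a form like "DBI:mysql:database=NAME;host=HOST", plus a real user name and password.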

Types of fetch
- "fetchrow_array"
  - Used to fetch an array of scalars each time
  - Can also use "fetchrow_arrayref"
- "fetchrow_hashref"
  - Used to fetch a reference to a hash indexed by column name.
  - Slower but cleaner code.
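
A sketch comparing the fetch calls on the same one-row result. As an assumption for the example it uses an in-memory SQLite database (DBI plus DBD::SQLite) with sample data. The 'NAME_lc' argument asks DBI for lowercase hash keys, so the keys do not depend on how the driver reports column names.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is installed; the table and row are sample data.
my $dbh = DBI->connect("dbi:SQLite:dbname=:memory:", "", "", { RaiseError => 1 });
$dbh->do("CREATE TABLE Gene (Gene_ID INTEGER PRIMARY KEY, Name TEXT)");
$dbh->do("INSERT INTO Gene (Gene_ID, Name) VALUES (1, 'ATP7B')");

my $sth = $dbh->prepare("SELECT Gene_ID, Name FROM Gene");

# fetchrow_array: the row arrives as a plain list of scalars.
$sth->execute();
my @as_list = $sth->fetchrow_array;
$sth->finish;

# fetchrow_arrayref: the row arrives as a reference to an array.
$sth->execute();
my $as_arrayref   = $sth->fetchrow_arrayref;
my $name_from_ref = $as_arrayref->[1];
$sth->finish;

# fetchrow_hashref: the row arrives as a hash reference keyed by column name.
$sth->execute();
my $as_hashref     = $sth->fetchrow_hashref('NAME_lc');
my $name_from_hash = $as_hashref->{name};
$sth->finish;

print "list: $as_list[1]  arrayref: $name_from_ref  hashref: $name_from_hash\n";
$dbh->disconnect;
```

The hash form reads better because columns are named instead of numbered, at the cost of building a hash for every row.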

More advanced statements
- Quote
  - Used to properly quote data for use with a prepare statement
  - "$value = $dbh->quote($blast_result);"
- Placeholders
  - Speeds up execution, optional
    my $prep = $dbh->prepare("select x from y where z = ?");
    foreach my $z (@z_values) {    # @z_values: whatever list of values you are looking up
        $prep->bind_param(1, $z);
        $prep->execute();
    }
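
A runnable sketch of placeholders and quote. It assumes DBI with the DBD::SQLite driver and an in-memory database; the Alias table follows the schema slide and the two aliases are the ATP7B examples from it.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is installed; ":memory:" avoids needing a server.
my $dbh = DBI->connect("dbi:SQLite:dbname=:memory:", "", "", { RaiseError => 1 });
$dbh->do("CREATE TABLE Alias (Alias_ID INTEGER PRIMARY KEY, Gene_ID INTEGER, Alias TEXT)");

# Placeholders: prepare once, then execute with different values.
# Values passed to execute() are bound to the ? marks in order.
my $insert = $dbh->prepare("INSERT INTO Alias (Gene_ID, Alias) VALUES (?, ?)");
for my $alias ("Wilson disease-associated protein", "Copper-transporting ATPase 2") {
    $insert->execute(1, $alias);
}

# bind_param does the same binding as a separate step (positions start at 1).
my $count_sth = $dbh->prepare("SELECT COUNT(*) FROM Alias WHERE Gene_ID = ?");
$count_sth->bind_param(1, 1);
$count_sth->execute();
my ($alias_count) = $count_sth->fetchrow_array;
$count_sth->finish;

# quote: escapes a value so it can be pasted into the SQL text safely.
my $quoted = $dbh->quote("Copper-transporting ATPase 2");
my ($quoted_hits) = $dbh->selectrow_array(
    "SELECT COUNT(*) FROM Alias WHERE Alias = $quoted");

print "aliases for gene 1: $alias_count, matches via quote: $quoted_hits\n";
$dbh->disconnect;
```

Placeholders are generally the safer habit, because the values never become part of the SQL text; quote is for the cases where a value has to be interpolated into the statement.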