lecture 6 some useful bits & pieces that every perl programmer should know

92
1 Lecture 6 Some useful bits&pieces that every Perl programmer should know

Upload: cathleen-guerrero

Post on 02-Jan-2016

11 views

Category:

Documents


0 download

DESCRIPTION

Lecture 6 Some useful bits & pieces that every Perl programmer should know. Strictness. Use strict; - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Lecture  6 Some useful bits & pieces that every Perl programmer should know

1

Lecture 6

Some useful bits&pieces that every Perl programmer should

know

Page 2: Lecture  6 Some useful bits & pieces that every Perl programmer should know

2

Strictness

Use strict; This is a directive that, among other things, tells perl to insist on all veriables being declared before they are used, as well as requiring that all subrutines be declared (or defined) before they are invoked.

# bestrict - demonstrating the effect of strictness.use strict;$message = "This is the message.\n";print $message;

# bestrict - demonstrating the effect of strictness.use strict;my $message = "This is the message.\n";print $message;

Page 3: Lecture  6 Some useful bits & pieces that every Perl programmer should know

3

Results from bestrict

Global symbol "$message" requires explicit package name at bestrict line 7.

Global symbol "$message" requires explicit package name at bestrict line 9.

Execution of bestrict aborted due to compilation errors.

Use “my” To Fix bestrict :

# bestrict - demonstrating the effect of strictness.

use strict;my $message = "This is the message.\n";print $message;

Page 4: Lecture  6 Some useful bits & pieces that every Perl programmer should know

4

use subs

• Perl provides the use subs directive that can be used in combination with use strict to declare a list of subroutines at the top of the program

use strict;

use sub qw( drawline biod2mysql);

Page 5: Lecture  6 Some useful bits & pieces that every Perl programmer should know

5

Unless you have a really good reason not to, always switch on strictness at the top of

your program

Page 6: Lecture  6 Some useful bits & pieces that every Perl programmer should know

6

Perl One-Liners

Perl usually starts with the following line:#! /usr/bin/perl –w w: warning

-e switch checks whether a module installed correctly or not:

$ perl -e 'use ExampleModule‘ e: execute

$ perl -e 'print "Hello from a Perl one-liner.\n";'

$ perl -e 'printf "%0.2f\n", 30000 * .12;'

The ability to use the –e switch on the command-line in this way creates what is known in the perl world as a one-liner.

Page 7: Lecture  6 Some useful bits & pieces that every Perl programmer should know

7

Perl One-Liners: Equivalents

Another useful switch is –n, which, when used with –e, treats the one-liner as if it is enclosed with a loop.

$ perl -ne 'print if /ctgaatagcc/;' embl.data

Which is equivalent to the following program statement:

while ( <> ){ print if /ctgaatagcc/;}

Same function can also be imlemented using grep:

$ grep 'ctgaatagcc' embl.data

Page 8: Lecture  6 Some useful bits & pieces that every Perl programmer should know

8

Perl One-Liners: More Options

When –n is combined with –p, loop has a print statement added to the end. Here is a one-liner that prints only those lines from the embl.data disk-file that do not end in four digits:

$ perl -npe 'last if /\d{4}$/;' embl.data

equivalent perl program:

while ( <> ) { last if /\d{4}$/;}continue { print $_;}

$ grep -v '[0123456789][0123456789][0123456789][0123456789]$' embl.data

Page 9: Lecture  6 Some useful bits & pieces that every Perl programmer should know

9

Running Other Programs From perl

#! /usr/bin/perl -w

# pinvoke - demonstrating the invocation of other programs# from Perl.

use strict;

my $result = system( “dir *.*" );

print "The result of the system call was as follows:\n$result\n";

$result = `dir *.*`;

print "The result of the backticks call was as follows:\n$result\n";

$result = qx/dir *.*/;

Page 10: Lecture  6 Some useful bits & pieces that every Perl programmer should know

10

Results from pinvoke ...

Page 11: Lecture  6 Some useful bits & pieces that every Perl programmer should know

11

Recovering From Errors

Consider the following code:my $first_filename = "itdoesnotexist.txt";

open FIRSTFILE, "$first_filename" or die "Could not open $first_filename. Aborting.\n";

Exeption handling by using eval:

eval { my $first_filename = "itdoesnotexist.txt";

open FIRSTFILE, "$first_filename" or die "Could not open $first_filename. Aborting.\n";};if ( $@ ){ print "Calling eval produced this message: $@";}

Page 12: Lecture  6 Some useful bits & pieces that every Perl programmer should know

12

Use eval to protect potentially erroneous code

Page 13: Lecture  6 Some useful bits & pieces that every Perl programmer should know

13

Sorting#! /usr/bin/perl -w

# sortexamples - how Perl's in-built sort subroutine works.

use strict;

my @sequences = qw( gctacataat attgttttta aattatattc cgatgcttgg );

print "Before sorting:\n\t-> @sequences\n";

my @sorted = sort @sequences;my @reversed = sort { $b cmp $a } @sequences;my @also_reversed = reverse sort @sequences;

print "Sorted order (default):\n\t-> @sorted\n";print "Reversed order (using sort { \$b cmp \$a }):\n\t->

@reversed\n";print "Reversed order (using reverse sort):\n\t-> @also_reversed\

n";

Page 14: Lecture  6 Some useful bits & pieces that every Perl programmer should know

14

Results from sortexamples ...

Before sorting: -> gctacataat attgttttta aattatattc cgatgcttgg

Sorted order (default): -> aattatattc attgttttta cgatgcttgg gctacataat

Reversed order (using sort { $b cmp $a }): -> gctacataat cgatgcttgg attgttttta aattatattc

Reversed order (using reverse sort): -> gctacataat cgatgcttgg attgttttta aattatattc

Page 15: Lecture  6 Some useful bits & pieces that every Perl programmer should know

15

Another Sorting Example

my @chromosomes = qw( 17 5 13 21 1 2 22 15 );

print "Before sorting:\n\t-> @chromosomes\n";

@sorted = sort { $a <=> $b } @chromosomes;@reversed = sort { $b <=> $a } @chromosomes;

print "Sorted order (using sort { \$a <=> \$b }):\n\t-> @sorted\n";

print "Reversed order (using sort { \$b <=> \$a }):\n\t-> @reversed\n";

Page 16: Lecture  6 Some useful bits & pieces that every Perl programmer should know

16

And its results ...

Before sorting: -> 17 5 13 21 1 2 22 15Sorted order (using sort { $a <=> $b }):

-> 1 2 5 13 15 17 21 22Reversed order (using sort { $b <=> $a }):

-> 22 21 17 15 13 5 2 1

Page 17: Lecture  6 Some useful bits & pieces that every Perl programmer should know

17

The sortfile Program

#! /usr/bin/perl -w

# sortfile - sort the lines in any file.

use strict;

my @the_file;

while ( <> ){ chomp; push @the_file, $_;}

my @sorted_file = sort @the_file;

foreach my $line ( @sorted_file ){ print "$line\n";}

Page 18: Lecture  6 Some useful bits & pieces that every Perl programmer should know

18

Results from sortfile ...

Zap! Zoom! Bang! Bam!Batman, look out!Robin, behind you!Aaaaah, it's the Riddler!

$ perl sortfile sort.data

Aaaaah, it's the Riddler!Batman, look out!Robin, behind you!Zap! Zoom! Bang! Bam!

$ sort sort.data

Page 19: Lecture  6 Some useful bits & pieces that every Perl programmer should know

19

Learning More About Sorting

$ perldoc -f sort

$ man sort

Page 20: Lecture  6 Some useful bits & pieces that every Perl programmer should know

20

Take the time to become familiar with the utilities included in the operating system

Page 21: Lecture  6 Some useful bits & pieces that every Perl programmer should know

21

HERE Documents

Shotgun Sequencing

This is a relatively simple method of readinga genome sequence. It is ''simple'' becauseit does away with the need to locate individual DNA fragments on a map beforethey are sequenced.

The Shotgun Sequencing method relies on powerful computers to assemble the finishedsequence.

Page 22: Lecture  6 Some useful bits & pieces that every Perl programmer should know

22

Without HERE Documents

print "Shotgun Sequencing\n\n";print "This is a relatively simple method of reading\

n";print "a genome sequence. It is ''simple'' because\n";print "it does away with the need to locate\n";print "individual DNA fragments on a map before\n";print "they are sequenced.\n\n";print "The Shotgun Sequencing method relies on\n"; print "powerful computers to assemble the finished\n";print "sequence.\n";

Page 23: Lecture  6 Some useful bits & pieces that every Perl programmer should know

23

With HERE Documents

my $shotgun_message = <<ENDSHOTMSG;Shotgun Sequencing

This is a relatively simple method of readinga genome sequence. It is ''simple'' becauseit does away with the need to locate individual DNA fragments on a map beforethey are sequenced.

The Shotgun Sequencing method relies on powerful computers to assemble the finishedsequence. ENDSHOTMSG

print $shotgun_message;

Page 24: Lecture  6 Some useful bits & pieces that every Perl programmer should know

24

Even Better HERE Documents

print <<ENDSHOTMSG;Shotgun Sequencing

This is a relatively simple method of readinga genome sequence. It is ''simple'' becauseit does away with the need to locate individual DNA fragments on a map beforethey are sequenced.

The Shotgun Sequencing method relies on powerful computers to assemble the finishedsequence. ENDSHOTMSG

Page 25: Lecture  6 Some useful bits & pieces that every Perl programmer should know

25

Where To From Here

Page 26: Lecture  6 Some useful bits & pieces that every Perl programmer should know

26

Databases

Learning to talk database

Page 27: Lecture  6 Some useful bits & pieces that every Perl programmer should know

27

Introducing Databases

• A database is a collection of one or more related tables

• A table is a collection of one or more rows of data

• A row is a collection of one or more data items, arranged in columns

Page 28: Lecture  6 Some useful bits & pieces that every Perl programmer should know

28

1960-12-21 P. Barry1954-6-14 M. Moorhouse

-------------- ---------Discovery_Date Scientist-------------- ---------1960-12-21 P. Barry1954-6-14 M. Moorhouse1970-3-4 J. Blow2001-12-27 J. Doe

----------- ----------------Column name Type restriction----------- ----------------Discovery_Date a valid DateScientist a String no longer than 64 characters

Structured Data

Page 29: Lecture  6 Some useful bits & pieces that every Perl programmer should know

29

-------------- --------- ---------Discovery_Date Scientist Discovery-------------- --------- ---------1960-12-21 P. Barry Flying car1954-6-14 M. Moorhouse Telepathic sunglasses1970-3-4 J. Blow Self cleaning

child2001-12-27 J. Doe Time travel

----------- ----------------Column name Type restriction----------- ----------------Discovery_Date a valid DateScientist a String no longer than 64 charactersDiscovery a String no longer than 128 characters

Relating tables

Page 30: Lecture  6 Some useful bits & pieces that every Perl programmer should know

30

----------- ----------------Column name Type restriction----------- ----------------Discovery_Date a valid DateScientist a String no longer than 64 charactersDiscovery a String no longer than 128 charactersDate_of_birth a valid DateTelephone_number a String no longer than 16 characters

-------------- --------- --------- ------------- ----------------Discovery_Date Scientist Discovery Date_of_birth Telephone_number-------------- --------- --------- ------------- ----------------1960-12-21 P. Barry Flying car 1966-11-18 353-503-555-919101954-6-14 M. Moorhouse Telepathic sunglasses 1970-3-24 00-44-81-555-32321970-3-4 J. Blow Self cleaning child 1955-8-17 555-28372001-12-27 J. Doe Time travel 1962-12-1 - 1974-3-17 M. Moorhouse Memory swapping toupee 1970-3-24 00-44-81-555-32321999-12-31 M. Moorhouse Twenty six hour clock 1958-7-12 416-555-2000

Relating tables, cont.

Page 31: Lecture  6 Some useful bits & pieces that every Perl programmer should know

31

----------- ----------------Column name Type restriction----------- ----------------Discovery_Date a valid DateScientist_ID a String no longer than 8 charactersDiscovery a String no longer than 128 characters

----------- ----------------Column name Type restriction----------- ----------------Scientist_ID a String no longer than 8 charactersScientist a String no longer than 64 charactersDate_of_birth a valid DateAddress a String no longer than 256 charactersTelephone_number a String no longer than 16 characters

Solving the one table problem

Page 32: Lecture  6 Some useful bits & pieces that every Perl programmer should know

32

-------------- ------------ ---------Discovery_Date Scientist_ID Discovery-------------- ------------ ---------1954-6-14 MM Telepathic sunglasses1960-12-21 PB Flying car1969-8-1 PB A cure for bad jokes1970-3-4 JB Self cleaning child1974-3-17 MM Memory swapping toupee1999-12-31 MM2 Twenty six hour clock2001-12-27 JD Time travel

------------ --------- ------------- ------- ----------------Scientist_ID Scientist Date_of_birth Address Telephone_number------------ --------- ------------- ------- ----------------JB J. Blow 1955-8-17 Belfast, NI 555-2837JD J. Doe 1962-12-1 Syndey, AUS - MM M. Moorhouse 1970-3-24 England, UK 00-44-81-555-3232MM2 M. Moorhouse 1958-7-12 Toronto, CA 416-555-2000PB P. Barry 1966-11-18 Carlow, IRL 353-503-555-91910

Solving the one table problem, cont.

Page 33: Lecture  6 Some useful bits & pieces that every Perl programmer should know

33

A little database design goes a long way.

Page 34: Lecture  6 Some useful bits & pieces that every Perl programmer should know

34

Database system: a definition

• A database system is a computer program (or group of programs) that provides a mechanism to define and manipulate one or more databases

Page 35: Lecture  6 Some useful bits & pieces that every Perl programmer should know

35

Available Database Systems

• Personal database systems: Designed to run on PCsAccess, Paradox, FileMaker, dBase

• Enterprise database systems: Designed to support efficient storage and retrieval of vast amount of dataInterbase, Ingres, SQL Server, Informix, DB2, Oracle

• Open source database systems: Free!!! (Linux of course!!!)PostgreSQL, MySQL

Page 36: Lecture  6 Some useful bits & pieces that every Perl programmer should know

36

SQL: The Language of Databases

• Defining data with SQL (structured query language)

• SQL provides two facilities:1. A database definition Language (DDL)

provides a mechanism whereby databases can be created

2. A Data Manipulation Language (DML) provides a mechanism to work with data in tables

Page 37: Lecture  6 Some useful bits & pieces that every Perl programmer should know

37

The Swiss Institute of Bioinformatics maintains SWISS-PROT, an annotated protein sequence database. SWISS-PROT data format is described in detail in the manual, which is awailable at:

http://www.expasy.org/sprot/userman.html

EMBL Nucleotide Sequence Database, maintained by the EMBL Outstation at the EUROPİAN Bioinformatics Institute:

http://www.ebi.ac.uk/embl/Documentation/User_manual/home.html

A Database Case Study: MER

Page 38: Lecture  6 Some useful bits & pieces that every Perl programmer should know

38

ID MERT_ACICA STANDARD; PRT; 116 AA.AC Q52106;DT 01-NOV-1997 (Rel. 35, Created)DT 01-NOV-1997 (Rel. 35, Last sequence update)DT 15-JUN-2002 (Rel. 41, Last annotation update)DE Mercuric transport protein (Mercury ion transport protein).GN MERT.OS Acinetobacter calcoaceticus.OG Plasmid pKLH2.OC Bacteria; Proteobacteria; Gammaproteobacteria; Pseudomonadales;OC Moraxellaceae; Acinetobacter.

. . .

Extracted Sample Data

Page 39: Lecture  6 Some useful bits & pieces that every Perl programmer should know

39

http://www.mysql.com

$ chkconfig --add mysqld$ chkconfig mysqld on

$ mysqladmin -u root password 'passwordhere'

$ mysql -u root -p

Installing a database system

Page 40: Lecture  6 Some useful bits & pieces that every Perl programmer should know

40

mysql> create database MER;Query OK, 1 row affected (0.36 sec)

mysql> show databases;+------------+| Databases |+------------+| MER || test || mysql |+------------+3 rows in set (0.00 sec)

mysql> use mysql;Database changed

mysql> grant all on MER.* to bbp identified by 'passwordhere';Query OK. 0 rows affected (0.00 sec)

mysql> quitBye

Creating the MER database

Page 41: Lecture  6 Some useful bits & pieces that every Perl programmer should know

41

create table proteins(

accession_number varchar (6) not null,code varchar (4) not null,species varchar (5) not null,last_date date not null,description text not null, sequence_header varchar (75) not null,sequence_length int not null,sequence_data text not null

)

$ mysql -u bbp -p MER < create_proteins.sql

Adding tables to the MER database

Page 42: Lecture  6 Some useful bits & pieces that every Perl programmer should know

42

Understand the data before designing the tables

Page 43: Lecture  6 Some useful bits & pieces that every Perl programmer should know

43

acica_ADPT.swp.txt

serma_abdprt.swp.txt

shilf_seq_ACDP.swp.txt

Example SWISS-PROT data-files

Page 44: Lecture  6 Some useful bits & pieces that every Perl programmer should know

44

$ ./get_proteins *swp* > proteins.input

Preparing SWISS-PROT data for importation

Page 45: Lecture  6 Some useful bits & pieces that every Perl programmer should know

45

$ mysql -u bbp -p MER

mysql> load data local infile "proteins.input" into table proteins;

Query OK, 14 rows affected (0.07sec)Records: 14 Deleted: 0, Skipped: 0, Warnings: 0

Importing tab-delimited data into proteins

Page 46: Lecture  6 Some useful bits & pieces that every Perl programmer should know

46

mysql> select accession_number, sequence_length -> from proteins;

mysql> select accession_number, sequence_length -> from proteins -> order by accession_number;

mysql> select accession_number, sequence_length-> from proteins-> where sequence_length > 200-> order by sequence_length;

Working with the data in proteins

Page 47: Lecture  6 Some useful bits & pieces that every Perl programmer should know

47

create table dnas(

accession_number varchar (8) not null,entry_name varchar (9) not null,sequence_version varchar (16) not null,last_date date not null,description text not null,sequence_header varchar (75) not null,sequence_length int not null,sequence_data text not null

)

$ mysql -u bbp -p MER < create_dnas.sql

Adding another table to the MER database

Page 48: Lecture  6 Some useful bits & pieces that every Perl programmer should know

48

Preparing EMBL data for importation

Page 49: Lecture  6 Some useful bits & pieces that every Perl programmer should know

49

AF213017.EMBL.txt

J01730.embl.txt

M15049.embl.txt

M24940.embl.txt

$ ./get_dnas *EMBL* *embl* > dnas.input

Example EMBL data-files

Page 50: Lecture  6 Some useful bits & pieces that every Perl programmer should know

50

mysql> load data local infile "dnas.input" into table dnas;

Query OK, 4 rows affected (0.01sec)Records: 4 Deleted: 0, Skipped: 0, Warnings: 0

Importing tab-delimited data into dnas

Page 51: Lecture  6 Some useful bits & pieces that every Perl programmer should know

51

mysql> select accession_number, sequence_length-> from dnas-> where sequence_length > 4000-> order by sequence_length;

mysql> select accession_number, sequence_length-> from dnas-> where sequence_length > 4000-> order by sequence_length desc;

mysql> select accession_number, entry_name, sequence_length-> from dnas-> order by sequence_length desc-> limit 1;

Working with the data in dnas

Page 52: Lecture  6 Some useful bits & pieces that every Perl programmer should know

52

create table crossrefs (ac_protein varchar (6) not null,ac_dna varchar (8) not null

)

$ mysql -u bbp -p MER < create_crossrefs.sql

Adding the crossrefs table to the MER database

Page 53: Lecture  6 Some useful bits & pieces that every Perl programmer should know

53

Preparing cross-references for importation

$ ./get_protein_crossrefs *swp* > protein_crossrefs

$ ./get_dna_crossrefs *embl* *EMBL* > dna_crossrefs

$ ./unique_crossrefs protein_crossrefs dna_crossrefs > unique.input

Page 54: Lecture  6 Some useful bits & pieces that every Perl programmer should know

54

mysql> load data local infile "unique.input" into table crossrefs;

Query OK, 22 rows affected (0.04 sec)Records: 22 Deleted: 0 Skipped: 0 Warnings: 0

Importing tab-delimited data into crossrefs

Page 55: Lecture  6 Some useful bits & pieces that every Perl programmer should know

55

mysql> select * from crossrefs;

mysql> select proteins.sequence_header, dnas.sequence_header-> from proteins, dnas, crossrefs-> where proteins.accession_number = crossrefs.ac_protein-> and dnas.accession_number = crossrefs.ac_dna-> order by proteins.sequence_header;

mysql> select proteins.code, proteins.species, dnas.entry_name-> from proteins, dnas, crossrefs-> where proteins.accession_number = crossrefs.ac_protein-> and dnas.accession_number = crossrefs.ac_dna;

Working with the data in crossrefs

Page 56: Lecture  6 Some useful bits & pieces that every Perl programmer should know

56

mysql> select-> proteins.code as 'Protein Code',-> proteins.species as 'Protein Species',-> dnas.entry_name as 'DNA Entry Name'-> from proteins, dnas, crossrefs-> where proteins.accession_number = crossrefs.ac_protein-> and dnas.accession_number = crossrefs.ac_dna-> order by proteins.code;

Working with the data in crossrefs, cont.

Page 57: Lecture  6 Some useful bits & pieces that every Perl programmer should know

57

create table citations (accession_number varchar (8) not null,number int not null,author text not null,title text not null,location text not null,annotation text

)

$ mysql -u bbp -p MER < create_citations.sql

Adding the citations table to the MER database

Page 58: Lecture  6 Some useful bits & pieces that every Perl programmer should know

58

Preparing citation information for importation

$ ./get_citations * > citations.input

Page 59: Lecture  6 Some useful bits & pieces that every Perl programmer should know

59

mysql> load data local infile "citations.input" into table citations;

Query OK, 34 rows affected (0.08 sec)Records: 34 Deleted: 0 Skipped: 0 Warnings: 0

Importing tab-delimited data into citations

Page 60: Lecture  6 Some useful bits & pieces that every Perl programmer should know

60

mysql> select-> proteins.code as 'Protein Code',-> proteins.species as 'Protein Species',-> dnas.entry_name as 'DNA Entry Name',-> citations.location as 'Citation Location'-> from proteins, dnas, crossrefs, citations-> where proteins.accession_number =

crossrefs.ac_protein-> and dnas.accession_number = crossrefs.ac_dna-> and dnas.accession_number =

citations.accession_number-> order by proteins.code;

Working with the data in citations

Page 61: Lecture  6 Some useful bits & pieces that every Perl programmer should know

61

The SELECT query can do no harm.

Page 62: Lecture  6 Some useful bits & pieces that every Perl programmer should know

62

Where To From Here

Page 63: Lecture  6 Some useful bits & pieces that every Perl programmer should know

63

Databases and Perl

Using Perl to talk to databases

Page 64: Lecture  6 Some useful bits & pieces that every Perl programmer should know

64

Why Program Databases?

• Customised output handling

• Customized input handling

• Extending SQL

• Integrating MySQL into custom applications

Page 65: Lecture  6 Some useful bits & pieces that every Perl programmer should know

65

If at all possible, avoid the use of database driver ``enhancements''

Page 66: Lecture  6 Some useful bits & pieces that every Perl programmer should know

66

DBI and DBD::mysql modules need to be installed$ man DBI$ man DBD::mysql

$ find `perl -Te 'print "@INC"' ` -name '*.pm' -print | grep 'DBI.pm'$ find `perl -Te 'print "@INC"' ` -name '*.pm' -print | grep

'mysql.pm'

$ locate DBI.pm$ locate mysql.pm

Preparing Perl

Page 67: Lecture  6 Some useful bits & pieces that every Perl programmer should know

67

#! /usr/bin/perl -w

# check_drivers - check which drivers are installed with DBI.

use strict;

use DBI;

my @drivers = DBI->available_drivers;

foreach my $driver ( @drivers ){ print "Driver: $driver installed.\n";}

Checking the DBI installation

Page 68: Lecture  6 Some useful bits & pieces that every Perl programmer should know

68

#! /usr/bin/perl -w

# show_tables - list the tables within the MER database.# Uses "DBI::dump_results" to display results.

use strict;use DBI qw( :utils );

use constant DATABASE => "DBI:mysql:MER";use constant DB_USER => "bbp";use constant DB_PASS => "passwordhere";

my $dbh = DBI->connect( DATABASE, DB_USER, DB_PASS ) or die "Connect failed: ", $DBI::errstr, ".\n";

my $sql = "show tables";my $sth = $dbh->prepare( $sql );$sth->execute;print dump_results( $sth ), "\n";$sth->finish;$dbh->disconnect;

Programming Databases With DBI

Page 69: Lecture  6 Some useful bits & pieces that every Perl programmer should know

69

• DATABASE – Identifies the data source to use

• DB_USER – Identifies the username to use when connecting to the data source

• DB_PASS – Identifies the password to use when authenticating to the data source

Page 70: Lecture  6 Some useful bits & pieces that every Perl programmer should know

70

Be sure to adhere to any established naming conventions within a programming

community

Page 71: Lecture  6 Some useful bits & pieces that every Perl programmer should know

71

$ chmod +x show_tables

$ ./show_tables

'citations''crossrefs''dnas''proteins'4 rows4

Results from show_tables ...

Page 72: Lecture  6 Some useful bits & pieces that every Perl programmer should know

72

Avoid littering programs with username/password combinations

Page 73: Lecture  6 Some useful bits & pieces that every Perl programmer should know

73

package DbUtilsMER;

# DbUtilsMER.pm - the database utilities module from "Bioinformatics,# Biocomputing and Perl".## Version 0.01: module created to hold MERconnectDB.

require Exporter;

our @ISA = qw( Exporter );

our @EXPORT = qw( MERconnectDB );our @EXPORT_OK = qw();our %EXPORT_TAGS = ();

our $VERSION = 0.01;

Developing A Database Utility Module

Page 74: Lecture  6 Some useful bits & pieces that every Perl programmer should know

74

use constant DATABASE => "DBI:mysql:MER";use constant DB_USER => "bbp";use constant DB_PASS => "passwordhere";

sub MERconnectDB { # # Given: nothing. # Return: a "connected" database handle to the MER database or # if no connection is possible, return "undef". #

return DBI->connect( DATABASE, DB_USER, DB_PASS );}

1;

Developing A Database Utility Module, cont.

Page 75: Lecture  6 Some useful bits & pieces that every Perl programmer should know

75

use lib "$ENV{'HOME'}/bbp/";use DbUtilsMER;

my $dbh = MERconnectDB or die "Connect failed: ", $DBI::errstr, ".\n";

use constant DATABASE => "DBI:mysql:MER";use constant DB_USER => "bbp";use constant DB_PASS => "passwordhere";

my $dbh = DBI->connect( DATABASE, DB_USER, DB_PASS ) or die "Connect failed: ", $DBI::errstr, ".\n";

Using the Database Utility Module

Page 76: Lecture  6 Some useful bits & pieces that every Perl programmer should know

76

#! /usr/bin/perl -w

# show_tables2 - list the tables within the MER database.# Uses "fetchrow_array" to display results.

use strict;

use DBI;

use lib "$ENV{'HOME'}/bbp/";use DbUtilsMER;

my $dbh = MERconnectDB or die "Connect failed: ", $DBI::errstr, ".\n";

my $sql = "show tables";

my $sth = $dbh->prepare( $sql );

Improving upon dump_results

Page 77: Lecture  6 Some useful bits & pieces that every Perl programmer should know

77

$sth->execute;

print "The MER database contains the following tables:\n\n";

while ( my @row = $sth->fetchrow_array ){ foreach my $column_value ( @row ) { print "\t$column_value\n"; }}

$sth->finish;

$dbh->disconnect;

Improving upon dump_results, cont.

Page 78: Lecture  6 Some useful bits & pieces that every Perl programmer should know

78

The MER database contains the following tables:

citationscrossrefsdnasproteins

Results from show_tables2 ...

Page 79: Lecture  6 Some useful bits & pieces that every Perl programmer should know

79

There are 22 cross references in the database.

The protein P04336 is cross referenced with J01730.The protein P08332 is cross referenced with J01730.The protein P08654 is cross referenced with M15049.The protein P20102 is cross referenced with X03405.The protein Q52107 is cross referenced with AF213017. . .The protein P03830 is cross referenced with J01730.The protein Q52109 is cross referenced with AF213017.The protein P20102 is cross referenced with J01730.The protein Q52106 is cross referenced with AF213017.The protein P04337 is cross referenced with K03089.The protein P08664 is cross referenced with M15049.

Customizing Output

Page 80: Lecture  6 Some useful bits & pieces that every Perl programmer should know

80

#! /usr/bin/perl -w

# which_crossrefs - nicely displayed list of protein->dna# cross references.

use strict;

use DBI;

use lib "$ENV{'HOME'}/bbp/";use DbUtilsMER;

my $dbh = MERconnectDB or die "Connect failed: ", $DBI::errstr, ".\n";

my $sql = "select * from crossrefs";

my $sth = $dbh->prepare( $sql );

Customizing Output, cont.

Page 81: Lecture  6 Some useful bits & pieces that every Perl programmer should know

81

$sth->execute;

print "There are ", $sth->rows, " cross references in the database.\n\n";

while ( my @row = $sth->fetchrow_array ){ print "The protein $row[0] is cross referenced with $row[1].\n";}

$sth->finish;

$dbh->disconnect;

Customizing Output, cont.

Page 82: Lecture  6 Some useful bits & pieces that every Perl programmer should know

82

while ( my ( $protein, $dna ) = $sth->fetchrow_array ){ print "The protein $protein is cross referenced with $dna.\n";}

while ( my $row = $sth->fetchrow_hashref ){ print "The protein $row->{ ac_protein } is cross referenced "; print "with $row->{ ac_dna }.\n";}

Alternatives to fetchrow_array

Page 83: Lecture  6 Some useful bits & pieces that every Perl programmer should know

83

Use fetchrow_hashref to guard against changes to the structure of a database

table

Page 84: Lecture  6 Some useful bits & pieces that every Perl programmer should know

84

Provide a protein accession number to cross reference ('quit' to end): p03377Not found: there is no cross reference for that protein in the database.

Provide a protein accession number to cross reference ('quit' to end): p04337Found: P04337 is cross referenced with J01730.

Provide a protein accession number to cross reference ('quit' to end): q52109Found: Q52109 is cross referenced with AF213017.

Provide a protein accession number to cross reference ('quit' to end): x6587Not found: there is no cross reference for that protein in the database.

Provide a protein accession number to cross reference ('quit' to end): quit

Customizing Input

Page 85: Lecture  6 Some useful bits & pieces that every Perl programmer should know

85

#! /usr/bin/perl -w

# specific_crossref - allow for the "interactive" checking# of crossrefs from the command-line.# Keep going until the user enters "quit".

use strict;use DBI;

use lib "$ENV{'HOME'}/bbp/";use DbUtilsMER;

use constant TRUE => 1;

my $dbh = MERconnectDB or die "Connect failed: ", $DBI::errstr, ".\n";

my $sql = qq/ select ac_dna from crossrefs where ac_protein = ? /;

my $sth = $dbh->prepare( $sql );

Customizing Input

Page 86: Lecture  6 Some useful bits & pieces that every Perl programmer should know

86

while ( TRUE ){ print "\nProvide a protein accession number to cross "; print "reference ('quit' to end): ";

my $protein2find = <>;

chomp $protein2find;

$protein2find = uc $protein2find;

if ( $protein2find eq 'QUIT' ) { last; }

Customizing Input, cont.

Page 87: Lecture  6 Some useful bits & pieces that every Perl programmer should know

87

$sth->execute( $protein2find );

my $dna = $sth->fetchrow_array;

$sth->finish;

if ( !$dna ) { print "Not found: there is no cross reference for that protein "; print "in the database.\n"; } else { print "Found: $protein2find is cross referenced with $dna.\n"; } }

$dbh->disconnect;

Customizing Input, cont.

Page 88: Lecture  6 Some useful bits & pieces that every Perl programmer should know

88

#! /usr/bin/perl -w# The 'db_match_embl' program - check a sequence against each EMBL# database entry stored in the dnas# table within the MER database.

use strict;use DBI;use lib "$ENV{'HOME'}/bbp/";use DbUtilsMER;use constant TRUE => 1;use constant FALSE => 0;

my $dbh = MERconnectDB or die "Connect failed: ", $DBI::errstr, ".\n";

my $sql = qq/ select accession_number, sequence_data, sequence_length from dnas /;

my $sth = $dbh->prepare( $sql );

Extending SQL

Page 89: Lecture  6 Some useful bits & pieces that every Perl programmer should know

89

while ( TRUE ){ my $sequence_found = FALSE; print "Please enter a sequence to check ('quit' to end): ";

my $to_check = <>;

chomp( $to_check ); $to_check = lc $to_check;

if ( $to_check =~ /^quit$/ ) { last; } $sth->execute; while ( my ( $ac, $sequence, $sequence_length ) = $sth->fetchrow_array ) { $sequence =~ s/\s*//g;

Extending SQL, cont.

Page 90: Lecture  6 Some useful bits & pieces that every Perl programmer should know

90

if ( $sequence =~ /$to_check/ ) { $sequence_found = TRUE;

print "The EMBL entry in the database: ", $ac, " contains: $to_check.\n"; print "[Lengths: ", length $sequence, "/$sequence_length]\n\n"; } }

if ( !$sequence_found ) { print "No match found in database for: $to_check.\n\n"; }

$sth->finish;}$dbh->disconnect;

Extending SQL, cont.

Page 91: Lecture  6 Some useful bits & pieces that every Perl programmer should know

91

Please enter a sequence to check ('quit' to end): aattgcThe EMBL entry in the database: AF213017 contains: aattgc.[Lengths: 6838/6838]

Please enter a sequence to check ('quit' to end): aatttcThe EMBL entry in the database: AF213017 contains: aatttc.[Lengths: 6838/6838]

The EMBL entry in the database: J01730 contains: aatttc.[Lengths: 5747/5747]

Please enter a sequence to check ('quit' to end): accttaaatttgtacgtg

No match found in database for: accttaaatttgtacgtg.

Please enter a sequence to check ('quit' to end): quit

Results from db_match_embl ...

Page 92: Lecture  6 Some useful bits & pieces that every Perl programmer should know

92

Where To From Here