
Computer Programming for Biologists

Class 10

Dec 5th, 2014

Karsten Hokamp

http://bioinf.gen.tcd.ie/GE30M25/programming

Overview

• system calls

• one-liners

System calls

- integrate other programs into a Perl script

- Perl acts as a wrapper

- three possible ways:

system "command";

exec "command";

$output = `command`;    # backticks

- only the backticks option captures the command's output

- exec replaces the running Perl script with the command, so no code after it runs

System calls - examples

# shut down the computer:
exec "shutdown -h now";

# run an alignment program
system "clustalw multi.fa";

# check the load of the computer
$load = `uptime`;
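The difference between system and backticks can be sketched as follows. This is a minimal illustration, not code from the class: the helper names run_command and capture_output are hypothetical, and the child commands are tiny Perl programs started through $^X (the path of the running perl), an assumption made only so that the sketch runs without clustalw or uptime being installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Run a command with system and return its exit code.
# system returns the raw wait status; the exit code sits in the high byte.
sub run_command {
    my (@cmd) = @_;
    my $status = system(@cmd);
    return -1 if $status == -1;    # the command could not be started
    return $status >> 8;
}

# Run a command with backticks and return what it printed to STDOUT.
sub capture_output {
    my ($cmd) = @_;
    my $output = `$cmd`;
    chomp $output if defined $output;
    return $output;
}

# system: the child's output is not captured, only its status comes back.
my $exit_code = run_command($^X, '-e', 'exit 3');
print "exit code: $exit_code\n";

# backticks: the child's output comes back as a string.
my $text = capture_output(qq{"$^X" -e "print 42"});
print "captured: $text\n";
```

Checking the value returned by system is worthwhile in wrapper scripts, because a failed external program does not stop the Perl script by itself.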

Practical

http://bioinf.gen.tcd.ie/GE3M25/programming/class10

One-liners

- quick way of programming

- no need for editor

- handy for testing pieces of code

- bioinformatics examples online

One-liners

A very simple example:

$ perl -w -e 'print "Hello world!\n";'

-w                          use warnings
-e                          execute code
'print "Hello world!\n";'   Perl code

One-liners

Squash short Perl scripts into one line:

$ perl -w -e 'while (<>) { print ++$count." $_";}' input.txt

#!/usr/bin/perl
use warnings;
# add numbers to each input line
while (<>) {
    print ++$count." $_";
}

One-liners

$ perl -e 'while (<>) {print ++$i . " $_";}' X62493_embl.txt
1 ID X62493; SV 1; linear; genomic RNA; STD; VRL; 4789 BP.
2 XX
3 AC X62493;
4 XX
5 DT 29-MAY-1992 (Rel. 32, Created)
6 DT 18-APR-2005 (Rel. 83, Last updated, Version 7)
…

-e                  Perl flag
'while (<>) {…}'    Perl code
X62493_embl.txt     argument to script

Alternatively, redirect output into a new file:

$ perl -e 'while (<>) {print ++$i . " $_";}' X62493_embl.txt > out.txt

One-liners

Useful switches:

-a              turns on autosplit mode (@F = split ' ', $_;), used with -n or -p
-d              starts perl in debugging mode
-e              specifies perl code to be executed
-i[extension]   edits file(s) in-place (creates backup copy with extension)
-l              strips newlines on input, and adds them on output
-n              loops through each line of input file
-p              loops through each line of input file and prints it
-F[pattern]     specifies the pattern to split on if -a is also in effect
-M[module]      uses module with program
-v              prints version of Perl executable
-w              prints warnings about dubious constructs

More detailed info through: man perlrun

One-liners

-l strips newlines on input, and adds them on output

No more 'chomp' and "\n"!

-n loops through each line of input file

Does the following internally:

while (<>) {
    … # the code after the -e flag goes here
}
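To make the effect of -l and -n concrete, here is a rough script equivalent of the one-liner perl -lne 'print length' input.txt. Both that one-liner and the sub name line_lengths are hypothetical illustrations rather than material from the class, and the input is an in-memory string with made-up sequences so that the sketch runs without an input file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Rough script equivalent of the hypothetical one-liner
#   perl -lne 'print length' input.txt
# Returns the length of every input line, newline excluded.
sub line_lengths {
    my ($fh) = @_;
    my @lengths;
    while (my $line = <$fh>) {    # -n: loop over the input lines
        chomp $line;              # -l: strip the newline on input
        push @lengths, length $line;
    }
    return @lengths;
}

# In-memory "file" with made-up sequences.
my $data = "ATGC\nGGATCCA\n\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";
print "$_\n" for line_lengths($fh);    # -l: add the newline on output
close $fh;
```

The one-liner gets the loop, the chomp and the trailing newline from its switches, which is why the code after -e can stay so short.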

One-liners

Useful switch combinations:

-p -i -e 's/pattern1/pattern2/g' file(s)
loops through each line and prints it, replaces some text, changes files in-place

perl -p -i.bak -e 's/^/sprintf("%-5s", ++$i)/e if (/\S/)' seq.fa

-lane 'do something with @F' file(s)
processes line endings, splits each line on whitespace into @F, cycles through each line and lets you do something with @F

-F"\t" -lane 'do something with @F' file(s)
same as above but on tab-delimited files

perl -F"\t" -lane 'print(join "\t", @F[0,2,1,3..$#F])' in > out

One-liners

Useful switch combinations, first example in detail:

perl -p -i.bak -e 's/^/sprintf("%-5s", ++$i)/e if (/\S/)' seq.fa

s/^/                      replace start of line
sprintf("%-5s", ++$i)     increment counter variable and pad it with whitespace
/e                        treat replacement string as an expression
if (/\S/)                 only work on lines that are not empty
.bak                      create backup with .bak ending
-i                        modify file in-place
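The numbering substitution can also be tried on a few lines held in memory, without touching any file. This is only a sketch: the sub name number_lines and the sample lines are made up for illustration, and the in-place editing done by -i.bak is deliberately left out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Prefix every non-blank line with a left-aligned counter, 5 characters wide.
# Works on a copy of the lines, so the caller's data is left unchanged.
sub number_lines {
    my (@lines) = @_;
    my $i = 0;
    for my $line (@lines) {
        # /e evaluates the replacement as Perl code instead of a plain string
        $line =~ s/^/sprintf("%-5s", ++$i)/e if $line =~ /\S/;
    }
    return @lines;
}

# Made-up sample input: the blank line keeps no number.
print number_lines(">seq1\n", "ATGC\n", "\n", "GGCC\n");
```

Because the counter is only incremented inside the substitution, blank lines are skipped without leaving gaps in the numbering.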

One-liners

Useful switch combinations, second example in detail:

perl -F"\t" -lane 'print(join "\t", @F[0,2,1,3..$#F])' in > out

-F"\t"                 split at tabs
-lane                  process each line, elements in @F
join "\t"              print in tab-delimited format
@F[0,2,1,3..$#F]       change order of elements
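The column reordering can be written as a small sub to see what the array slice does. This is a sketch under assumptions: the name swap_columns and the sample record are invented for illustration, and every line is assumed to have at least three tab-separated fields.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Swap the second and third column of one tab-delimited line.
# Assumes the line has at least three fields.
sub swap_columns {
    my ($line) = @_;
    chomp $line;                                 # -l: strip the newline
    my @F = split /\t/, $line;                   # -a with -F"\t": split at tabs
    return join "\t", @F[0, 2, 1, 3 .. $#F];     # slice picks fields in the new order
}

# Made-up record: name, chromosome, start, end, strand.
print swap_columns("gene1\tchr1\t100\t200\t+\n"), "\n";
```

The range 3 .. $#F is empty when a line has exactly three fields, so the slice also works for the shortest allowed input.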

One-liners

One-liners for bioinformatics:

The Scriptome - Protocols for Manipulating Biological Data

http://sysbio.harvard.edu/csb/resources/computational/scriptome

In-house collection (includes Perl primer):

http://bioinf.gen.tcd.ie/pol

Practical

http://bioinf.gen.tcd.ie/GE3M25/programming/class10

Where to go from here?

loads of great recipes:

loads of useful extensions:

BioPerl

http://www.bioperl.org

Exam

Programming Exam: Thu, Dec 11th, 11 am - 1 pm

Bioinformatics Exam: Tue, Dec 9th, 11 am - 1 pm