productivity tips - introduction to linux for bioinformatics
DESCRIPTION
Part 6 of the training "Introduction to linux for bioinformatics". Some useful tips to improve your bioinformatics scripts.
TRANSCRIPT
Productivity
Joachim Jacob
8 and 15 November 2013
Multiple commands
In bash, several commands can be put on one line when they are separated by “;”. They run one after the other, whether or not the previous one succeeded.
$ wget http://homepage.tudelft.nl/19j49/t-SNE_files/tSNE_linux.tar.gz ; tar xvfz tSNE_linux.tar.gz
Multiple commands
Commands on one line can also be separated by && or ||
&& Only execute the command if the preceding one finished correctly.
$ curl corz.org/ip && echo
|| (not a pipe!) - Inverse of the above. Only execute the command if the preceding one did not finish successfully.
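The behaviour of && and || can be checked with the shell built-ins true (always succeeds) and false (always fails). This is a minimal sketch; the variable name log is only there for illustration.

```shell
# 'true' always succeeds, 'false' always fails.
log=""
true  && log="${log}A"   # runs: the preceding command succeeded
false && log="${log}B"   # skipped: the preceding command failed
false || log="${log}C"   # runs: the preceding command failed
true  || log="${log}D"   # skipped: the preceding command succeeded
echo "$log"
```

Only the letters whose command actually ran end up in log.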
Piping a list of files with xargs
A pipe passes the output of one command to the input of the next:
$ ls | less
Some commands require file names to be passed as arguments, instead of reading content from their input. E.g. this doesn't work:
$ ls | file
Usage: file [-bchikLlNnprsvz0] [--apple] [--mime-encoding] [--mime-type]
            [-e testname] [-F separator] [-f namefile] [-m magicfiles] file ...
       file -C [-m magicfiles]
       file [--help]
xargs passes the output of a command as a list of arguments to another program.
$ ls | xargs file
bin: directory
buddy.sh: Bourne-Again shell script, ASCII text executable
Compression_exercise: directory
Desktop: directory
Documents: directory
Downloads: directory
FastQValidator.0.1.1.tgz: gzip compressed data, from Unix, last modified: Fri Oct 19 16:44:23 2012
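The difference between piping content and passing arguments can be tried out in a throw-away directory. The sketch below uses wc -l instead of file, and the file names are made up for the example.

```shell
# Work in a temporary directory so nothing in your home is touched.
workdir=$(mktemp -d)
cd "$workdir"
printf 'a\nb\nc\n' > three_lines.txt
printf 'a\n' > one_line.txt

# Without xargs, wc counts the lines of the listing itself (2 file names).
listing_lines=$(ls | wc -l)

# With xargs, the file names become arguments: wc -l opens each file.
per_file=$(ls | xargs wc -l)
echo "$listing_lines"
echo "$per_file"
```

File names containing spaces are split into several arguments by plain xargs. With GNU tools, find . -print0 | xargs -0 avoids that.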
.bashrc
~/.bashrc is a hidden configuration file for bash in your home.
It configures the prompt in your terminal.
It contains aliases to commands.
alias example
When you enter a command line, bash checks whether the first word is an alias. If it is, bash replaces the word with the text of the alias before running the command.
You can specify aliases in .bashrc. An example:
Alias example
Some interesting aliases
alias ll='ls -lh'
alias dirsize="du -sh */"
alias uncom='grep -v -E "^#|^$"'
alias hosts="cat /etc/hosts"
alias dedup="awk '! x[\$0]++'"
Aliases are perfectly suited for storing one-liners: find some at https://wikis.utexas.edu/display/bioiteam/Scott%27s+list+of+linux+one-liners
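Before storing a one-liner as an alias, it helps to run the underlying command on a small sample. The sketch below does this for the commands behind uncom and dedup; the sample lines are invented for the example.

```shell
# Sample input: a comment line, an empty line and a repeated line.
sample=$(printf '# header\nchr1\n\nchr2\nchr1\n')

# The command behind uncom: drop comment lines and empty lines.
no_comments=$(printf '%s\n' "$sample" | grep -v -E '^#|^$')

# The command behind dedup: keep only the first occurrence of each line.
unique=$(printf '%s\n' "$sample" | awk '!x[$0]++')

echo "$no_comments"
echo "$unique"
```

Note that inside double quotes the shell expands $0 itself, so an awk program is best kept in single quotes or written with the dollar sign escaped.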
Alias exercise
→ exercise link
Finding stuff: locate
Extremely quick and convenient: locate
However, it won't find the newest files you created. First you need to update the database by running: updatedb
It accepts wildcards. Quote the pattern so the shell does not expand it first. Example:
$ locate '*.sam'
Bonus: How to filter on a certain location?
Finding stuff: find
More elaborate tool to find stuff:
$ find -name alignment.sam
Without options, find simply lists everything below the starting directory. Options narrow the search:
-name : to search on the name of the file
-type : to search for the type: (f)ile, (d)irectory, (l)ink
-perm : to search for the permissions (octal such as 111, or symbolic such as u=rwx)
…
This is the power tool to find stuff.
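The options can be tried safely in a temporary directory. This is a small sketch with invented file and directory names, showing -name, -type and a combination of both.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/run1/logs"
touch "$workdir/run1/alignment.sam" "$workdir/run1/reads.fastq"

# -name matches the file name; quote the pattern so the shell leaves '*' alone.
sam_files=$(find "$workdir" -name '*.sam')

# -type d restricts the search to directories (the start directory included).
dir_count=$(find "$workdir" -type d | wc -l)

# Tests can be combined: regular files whose name ends in .fastq.
fastq_files=$(find "$workdir" -type f -name '*.fastq')

echo "$sam_files"
echo "$dir_count"
echo "$fastq_files"
```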
Finding stuff: find
The most powerful option of find:
-exec : execute a command on the found entities.
$ find -name \*.gz
./DRR000542_2.fastq.subset.gz
./DRR000542_1.fastq.subset.gz
./DRR000545_2.fastq.subset.gz
./DRR000545_1.fastq.subset.gz
$ find -name \*.gz -exec gunzip {} \;
$ ls
DRR000542_1.fastq.subset  DRR000545_1.fastq.subset
DRR000542_2.fastq.subset  DRR000545_2.fastq.subset
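The same pattern can be rehearsed on disposable files first. The sketch below, with made-up file names, also shows the variant that ends in + instead of \; which hands many paths to one command call.

```shell
workdir=$(mktemp -d)
printf 'read1\n' > "$workdir/sample_1.fastq"
printf 'read2\n' > "$workdir/sample_2.fastq"
gzip "$workdir"/*.fastq   # now sample_1.fastq.gz and sample_2.fastq.gz

# '{}' is replaced by each found path; '\;' ends the command, run once per file.
find "$workdir" -name '*.gz' -exec gunzip {} \;

# Ending with '+' instead passes many paths to a single command invocation.
line_total=$(find "$workdir" -name '*.fastq' -exec cat {} + | wc -l)

# No compressed files should be left after gunzip.
remaining_gz=$(find "$workdir" -name '*.gz' | wc -l)
echo "$line_total $remaining_gz"
```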
Command substitution in bash
In bash, the output of commands can be directly stored in a variable. Put the command between back-ticks.
$ test=`ls -l`
$ echo $test
total 7929624 -rw-rw-r-- 1 joachim joachim 15326 May 10 2013 0538c2b.jpg -rw-rw-r-- 1 joachim joachim 4914797 Nov 8 16:15 18d7alY
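The output above appears on a single line because the variable is echoed without quotes. The sketch below shows the difference, and uses $(...), which bash accepts as an equivalent of back-ticks. The file names are invented for the example.

```shell
workdir=$(mktemp -d)
touch "$workdir/a.sam" "$workdir/b.sam"

# $(...) is equivalent to back-ticks and is easier to nest.
listing=$(ls "$workdir")

# Unquoted: the shell splits the value into words, so newlines collapse to spaces.
flat=$(echo $listing)

# Quoted: the value is passed unchanged, one file name per line.
kept=$(echo "$listing")

echo "$flat"
echo "$kept"
```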
Command substitution in bash
A variable can also contain a list. A list contains several entities (e.g. files).
Extracting the first 1,000,000 lines from compressed text files:
for file in `ls DRR00054*tar.gz`; \
do zcat $file | head -n 1000000 \
> ${file%.gz}.subset; done
The output of ls is put in a list. 'for' assigns the file names one after the other to the variable file. This variable is used in the one-liner zcat | head, and ${file%.gz} strips the .gz ending to build the name of the output file.
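A variant of this loop can be tested on small files before running it on real data. The sketch below makes a few assumptions: the file names are hypothetical, the loop uses a wildcard directly instead of ls, and gzip -dc stands in for zcat.

```shell
workdir=$(mktemp -d)
cd "$workdir"
# Two small gzip-compressed example files, 5 lines each.
for n in 1 2; do
    printf 'l1\nl2\nl3\nl4\nl5\n' | gzip > "sample_${n}.txt.gz"
done

# Loop over the wildcard directly; quoting "$file" keeps names with spaces intact.
for file in sample_*.txt.gz; do
    # gzip -dc writes the decompressed text to stdout, like zcat.
    gzip -dc "$file" | head -n 3 > "${file%.gz}.subset"
done

ls
```

Each .subset file then holds the first 3 lines of its source file.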
Keywords
.bashrc
;
alias
prompt
locate
find
Command substitution
Write in your own words what the terms mean
Break