MCB3895-004 Lecture #12, Oct 2/14
The many meanings of assembly quality; De novo genome assembly options
Discussion
• What types of "good" assemblies are there?
Assembly options
• Different sequencing technologies & libraries
• Trimmomatic: adapter removal, trimming low quality bases or reads
• Read error correction: SGA: $ sga
• De novo assemblers:
  • ABySS2: $ ABYSS
  • Celera Assembler: $ runCA
  • MaSuRCA: $ masurca
  • MIRA: $ mira
  • SPAdes: $ spades.py
  • SOAPdenovo: $ SOAPdenovo-127mer
  • Velvet: $ velveth/velvetg
• Software parameters, e.g., kmer size
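Because k-mer size is usually tuned by trial, it can help to script the sweep rather than type each run by hand. The following is a minimal sketch for Velvet, not a recommended setting: `kmer_sweep` is a hypothetical helper name, the read file name and the k range are placeholders, and the reads are assumed to be single-end FASTQ. By default it only prints the commands so they can be checked first.

```shell
#!/bin/bash
# Sketch: print (or run) Velvet commands for a range of k-mer sizes.
# kmer_sweep, the read file name, and the k range are illustrative assumptions.

kmer_sweep() {
    local reads=$1 kmin=$2 kmax=$3 run=${4:-echo}
    local k
    # Step by 2 so that an odd kmin keeps every k odd.
    for (( k = kmin; k <= kmax; k += 2 )); do
        # velveth builds the k-mer hash in its own output directory per k
        $run velveth "velvet_k${k}" "$k" -fastq -short "$reads"
        # velvetg builds the graph and writes the contigs into the same directory
        $run velvetg "velvet_k${k}" -exp_cov auto -cov_cutoff auto
    done
}

# Dry run: prints the six commands for k = 21, 23, 25 without executing them.
kmer_sweep reads.trimmed.fastq 21 25
```

Passing `command` as a fourth argument (for example `kmer_sweep reads.trimmed.fastq 21 25 command`) would execute the commands instead of printing them, assuming velveth and velvetg are on the PATH.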
A useful script
• http://korflab.ucdavis.edu/datasets/Assemblathon/Assemblathon2/Basic_metrics/assemblathon_stats.pl
• Requires FAlite.pm to be in the same directory: http://korflab.ucdavis.edu/Unix_and_Perl/FAlite.pm
• May need to change this (Perl 5.18 and later reject a bare qw list used as the foreach parentheses):
  Old: foreach my $base qw (A C G T N){
  New: my @bases_array = qw(A C G T N);
       foreach my $base (@bases_array){
  Also present later: foreach my $size qw ( … ){
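One of the metrics that assemblathon_stats.pl reports is N50. To make clear what that number means, here is a rough sketch that computes it with standard tools only. `n50` is a hypothetical function name, and the result may differ from the Perl script in details such as how runs of N or scaffolds versus contigs are treated.

```shell
#!/bin/bash
# Sketch: N50 of a FASTA file = length of the contig at which the running
# total of lengths (longest first) reaches half of the assembly size.

n50() {
    awk '
        /^>/ { if (len) print len; len = 0; next }           # header: emit the finished record
             { gsub(/[ \t\r]/, ""); len += length($0) }      # sequence line: add its length
        END  { if (len) print len }                          # emit the last record
    ' "$1" |
    sort -rn |
    awk '
        { lens[NR] = $1; total += $1 }
        END {
            for (i = 1; i <= NR; i++) {
                run += lens[i]
                if (run * 2 >= total) { print lens[i]; exit }
            }
        }
    '
}

# Toy example: contigs of length 5, 4, 3 and 2 (14 bases in total).
tmp=$(mktemp)
printf '>c1\nACG\nTA\n>c2\nACGT\n>c3\nACG\n>c4\nAC\n' > "$tmp"
n50 "$tmp"    # prints 4: 5 + 4 = 9 is the first running total that reaches 7
rm -f "$tmp"
```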
A simpler bash script for qsub
#!/bin/bash
#$ -S /bin/bash
# change to a preexisting working directory
# .sh script does not need to be in this directory
cd $HOME/temp
# execute the job from the current working directory
#$ -cwd
# assumes that "something.pl" is in the
# directory "~/temp/"
perl something.pl > out
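When several assemblies need to be queued, the job script above can be generated instead of copied by hand. This is only a sketch: `make_job` is a hypothetical helper, the job name and the velvetg command are placeholders, and `-N` (job name) is used in addition to the `-S` and `-cwd` options shown above. The function prints the job script to standard output and never calls qsub itself.

```shell
#!/bin/bash
# Sketch: print an SGE job script that runs one command and saves its output.
# make_job is an illustrative helper; arguments containing spaces are not handled.

make_job() {
    local name=$1; shift
    printf '%s\n' '#!/bin/bash'
    printf '%s\n' '#$ -S /bin/bash'
    # run the job from the directory where qsub is invoked
    printf '%s\n' '#$ -cwd'
    # job name shown in the queue
    printf '%s\n' "#\$ -N $name"
    # the command itself, with standard output captured in <name>.out
    printf '%s\n' "$* > $name.out"
}

# Print a job script for one hypothetical Velvet run.
make_job velvet_k31 velvetg velvet_k31 -exp_cov auto
```

Redirecting the output to a file (`make_job velvet_k31 velvetg velvet_k31 -exp_cov auto > velvet_k31.sh`) and then running `qsub velvet_k31.sh` would submit it, assuming the cluster uses the same scheduler as the script above.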
Today - assignment #4 continued
1. Design ON PAPER a strategy that you think will create an effective de novo assembly
2. Discuss this strategy with me BEFORE touching the computer
3. Execute your strategy
4. Based on your results, revise your strategy and try again