Adapter and quality trimming Mick Watson Director of ARK-Genomics The Roslin Institute

Post on 19-Jan-2018

Page 1: Adapter and quality trimming Mick Watson Director of ARK-Genomics The Roslin Institute

Adapter and quality trimming

Mick Watson
Director of ARK-Genomics

The Roslin Institute

Page 2:

ADAPTER TRIMMING

Page 3:

Illumina technology
• Watch a video?

http://www.youtube.com/embed/45vNetkGspo

Page 4:

Illumina technology

Page 5:

Bridge Amplification

Page 6:

Key point:
• Sequence reads from Illumina instruments may contain adapter sequence
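
To make the key point concrete, here is a minimal Python sketch of what removing an adapter from the 3' end of a read involves. It is not how cutadapt or any other tool works internally: it uses exact matching only, whereas real trimmers tolerate mismatches. The function name `trim_adapter`, the `min_overlap` parameter, and the example sequences are hypothetical, and the adapter string is only an illustration; the right sequence depends on the library preparation kit.

```python
# Illustrative adapter prefix only; use the sequence for your own library kit.
ADAPTER = "AGATCGGAAGAGC"


def trim_adapter(read, adapter=ADAPTER, min_overlap=3):
    """Cut the read at a full adapter match, or at a partial match at its 3' end.

    Exact matching only: a sketch, not a substitute for a real trimmer.
    """
    # Case 1: the whole adapter appears somewhere inside the read.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: the read ends partway through the adapter.
    longest = min(len(adapter), len(read)) - 1
    for overlap in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[:-overlap]
    # No adapter found: return the read unchanged.
    return read


# A fragment shorter than the read length, so the read runs into the adapter.
fragment = "ACGTACGTTTGACCA"
read = fragment + ADAPTER[:10]
print(trim_adapter(read))
```

The partial-match case matters because a read often ends before the whole adapter has been sequenced.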

Page 7:

QUALITY TRIMMING

Page 8:

Quality trimming
• Take every read
• Remove bases at the 3’ end (usually) or the 5’ end (sometimes) that are below a quality threshold
• Either remove everything after the first bad base
• Or remove everything after the point where the average within a sliding window falls below the threshold
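
The two strategies can be sketched in a few lines of Python. This is a rough illustration, not sickle's actual algorithm. It assumes Phred+33 quality encoding, trims only in the "remove after" direction (towards the 3' end), and uses a threshold of 20 and a window of 4 purely as example values. All function names are hypothetical.

```python
def phred_scores(quality, offset=33):
    """Convert a FASTQ quality string to integer scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def trim_at_first_bad_base(scores, threshold=20):
    """Return how many bases to keep: everything before the first bad base."""
    for i, q in enumerate(scores):
        if q < threshold:
            return i
    return len(scores)


def trim_sliding_window(scores, threshold=20, window=4):
    """Return how many bases to keep: everything before the first window
    whose mean quality falls below the threshold."""
    n = len(scores)
    if n == 0:
        return 0
    window = min(window, n)
    for start in range(n - window + 1):
        mean = sum(scores[start:start + window]) / window
        if mean < threshold:
            return start
    return n


sequence = "ACGTACGTACGTA"
quality = "IIII+IIII####"  # one isolated poor base, then a poor 3' tail
scores = phred_scores(quality)

keep_first = trim_at_first_bad_base(scores)
keep_window = trim_sliding_window(scores)
print(sequence[:keep_first])
print(sequence[:keep_window])
```

In this example the first-bad-base rule cuts at the isolated low-quality base, while the sliding window tolerates it and cuts only where quality drops for good.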

Page 9:

Paired-end and mate-pair

[Figure. A, Paired-end: 700bp fragment, 2 x 100bp reads approx. 500bp apart. B, Mate-pair: 3000bp fragment, 2 x 50bp reads approx. 3000bp apart.]

Page 10:

Paired reads
• Paired reads are represented by TWO fastq files
• Often named the same, ending in _1.fastq and _2.fastq
• Or R1.fastq and R2.fastq
• Order of reads matters
• Read 1 in file 1 is paired with read 1 in file 2
• Etc.
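
Because pairing is purely positional, code that handles paired reads has to walk both files in lockstep. A minimal sketch follows, assuming plain four-line FASTQ records; the in-memory example data and the names `read_fastq` and `read_pairs` are hypothetical.

```python
import io
from itertools import zip_longest


def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            return
        sequence = handle.readline().rstrip()
        handle.readline()  # the '+' separator line
        quality = handle.readline().rstrip()
        yield header[1:], sequence, quality


def read_pairs(handle1, handle2):
    """Yield (record1, record2), pairing the nth record of each file."""
    for rec1, rec2 in zip_longest(read_fastq(handle1), read_fastq(handle2)):
        if rec1 is None or rec2 is None:
            raise ValueError("the two files contain different numbers of reads")
        yield rec1, rec2


# In-memory stand-ins for sample_1.fastq and sample_2.fastq (made-up data).
fq1 = "@r1/1\nACGT\n+\nIIII\n@r2/1\nTTGA\n+\nIIII\n"
fq2 = "@r1/2\nGGCA\n+\nIIII\n@r2/2\nCCAT\n+\nIIII\n"

pairs = list(read_pairs(io.StringIO(fq1), io.StringIO(fq2)))
for (name1, seq1, _), (name2, seq2, _) in pairs:
    print(name1, seq1, name2, seq2)
```

With real files, pass the handles returned by `open()` in place of the `io.StringIO` objects.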

Page 11:

• What happens if your quality trimmer removes a read from one file but not the other?
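
The short answer is that every read after the missing one ends up paired with the wrong partner. The sketch below shows the problem and one common way round it: decide on both mates together, and move a surviving mate to a separate "singles" collection. The read names, sequences, minimum length, and function names are all made up for illustration.

```python
def naive_filter(reads, keep):
    """Filter one file on its own, ignoring the partner file."""
    return [read for read in reads if keep(read)]


def paired_filter(reads1, reads2, keep):
    """Filter both files together so surviving pairs stay in step."""
    kept1, kept2, singles = [], [], []
    for read1, read2 in zip(reads1, reads2):
        ok1, ok2 = keep(read1), keep(read2)
        if ok1 and ok2:
            kept1.append(read1)
            kept2.append(read2)
        elif ok1:
            singles.append(read1)  # its mate was discarded
        elif ok2:
            singles.append(read2)  # its mate was discarded
    return kept1, kept2, singles


def long_enough(read, min_len=4):
    """Keep a (name, sequence) read only if it is still long enough after trimming."""
    return len(read[1]) >= min_len


file1 = [("a/1", "ACGTAC"), ("b/1", "AC"), ("c/1", "GGTCA")]
file2 = [("a/2", "TTGACA"), ("b/2", "CCATG"), ("c/2", "ATGCA")]

# Naive: b/1 is dropped from file 1 only, so c/1 now lines up with b/2.
naive1 = naive_filter(file1, long_enough)
naive2 = naive_filter(file2, long_enough)
mispaired = list(zip([r[0] for r in naive1], [r[0] for r in naive2]))
print(mispaired)

# Pair-aware: b/2 moves to singles and the remaining pairs still match.
kept1, kept2, singles = paired_filter(file1, file2, long_enough)
print([r[0] for r in kept1], [r[0] for r in kept2], [r[0] for r in singles])
```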

Page 12:

Paired-end aware software?
• We will use sickle to trim on quality
– It is paired-end aware
• We will use cutadapt to remove adapters
– It is not paired-end aware
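
One possible way to combine the two tools is sketched below in Python. The script only builds and prints the command lines; it does not run them. The file names, the adapter sequence, and the thresholds are hypothetical placeholders. It assumes cutadapt is run on each file separately without any read-discarding options, so both files keep the same reads in the same order, and that sickle's paired-end mode then does all the discarding. Check the flags against the versions you have installed before relying on them.

```python
import shlex

# Hypothetical inputs: replace with your own files and your kit's adapter.
adapter = "AGATCGGAAGAGC"
raw = ["sample_1.fastq", "sample_2.fastq"]
clipped = ["sample_1.clipped.fastq", "sample_2.clipped.fastq"]

# Step 1: adapter removal, one file at a time (no reads are dropped here).
cutadapt_cmds = [
    ["cutadapt", "-a", adapter, "-o", out_file, in_file]
    for in_file, out_file in zip(raw, clipped)
]

# Step 2: paired-end quality trimming; orphaned mates go to the singles file.
sickle_cmd = [
    "sickle", "pe",
    "-f", clipped[0],
    "-r", clipped[1],
    "-t", "sanger",              # quality encoding of the input
    "-o", "sample_1.trimmed.fastq",
    "-p", "sample_2.trimmed.fastq",
    "-s", "sample_singles.fastq",
    "-q", "20",                  # example quality threshold
    "-l", "20",                  # example minimum read length
]

for cmd in cutadapt_cmds + [sickle_cmd]:
    print(shlex.join(cmd))
    # To execute for real: subprocess.run(cmd, check=True)
```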