Sequence Analysis
Sequence Analysis
• DNA and protein sequences are biological information that is well suited to computer analysis
• Fundamental axiom: homologous sequences share an evolutionary ancestor and almost always perform the same or a similar function
Sequence Analysis topics for today
• Restriction enzyme sites for diagnostics and cloning
• Open reading frame analysis
• Conceptual translation
• Oligo primer design
• Sequence alignments
Sequence Analysis
• Alignments document homologous relationships
• DNA sequence alignments are best for showing identity
• Protein sequence alignments are best for showing similarity
Types of Alignments
In Class Tutorial
• Introduction to File Formats
– Examples of file formats
– Utilities to change formats
• Restriction Analysis
– Web tools for restriction analysis
– Local programs
In Class Tutorial
• Open reading frame analysis
• Reverse complement
• Capturing output to an MS Word doc
• Oligo Primer Design for PCR and sequencing
• Alignments
– global and local
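Two of the tutorial items above, reverse complement and open reading frame analysis, can be illustrated with a short sketch. It assumes plain A/C/G/T sequences, treats an ORF as an ATG followed in frame by the first TAA, TAG or TGA, and scans only the strand it is given. The function names and the `min_codons` cutoff are made up for this example and are not taken from the tutorial's tools.

```python
# Map each base to its complement, keeping upper/lower case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=2):
    """Return (start, end, frame) for each ATG...stop ORF on this strand.

    start is 0-based and end is exclusive; min_codons (an illustrative
    cutoff) counts the start and stop codons.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Step through complete codons in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs
```

To cover all six reading frames, the same search can be run a second time on `reverse_complement(seq)`.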
Sequence File Formats
• FASTA
– Simplest format
– Easy to create by hand on a word processor
FASTA
• First line must start with > followed by the sequence name
• Second line to the end = sequence
• No numbers or spaces
• Sequence can be UPPER or lower case
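The FASTA rules above are simple enough to check in a few lines. The sketch below is a minimal reader, not a replacement for a tool such as ReadSeq. The name `parse_fasta` is hypothetical. Accepting several records in one text and converting the sequence to upper case are choices made for this example only.

```python
def parse_fasta(text):
    """Return a list of (name, sequence) tuples from FASTA-formatted text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is None:
            raise ValueError("FASTA text must start with a '>' header line")
        else:
            # Sequence lines may not contain numbers or spaces.
            if any(ch.isdigit() or ch.isspace() for ch in line):
                raise ValueError("sequence lines must not contain numbers or spaces")
            chunks.append(line.upper())  # case is not significant
    if name is not None:
        records.append((name, "".join(chunks)))
    return records
```

For example, `parse_fasta(">seq1\nATGC\natgc\n")` gives one record named `seq1` with the two sequence lines joined.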
File Formats
• Some sequence analysis programs accept input sequences in FASTA format ONLY
• ReadSeq is a web-based utility that converts many file formats to FASTA
• More and more programs will accept multiple file formats as input
Mono-Space Fonts
• Every character uses the same space = mono space
• A, T, G and C each use the same space on a line
• W and . use the same space on a line
• Critical for sequence alignments to stay aligned
Mono-Space Fonts
NOT a Monospace font
Primer Design
• Primers are chemically synthesized oligonucleotides
• Used for sequencing and PCR
• Bad primer design can result in reaction failures
Primer Design Matters
Primer Design
• Tm 55-60 °C; PCR primer pairs need to have similar Tm values
• GC content 40-60% (biased toward the 5’ end)
• Length = 17-25 nt
• Low self-complementarity (avoid palindromes)
• Fewer than 3 of the last 5 bases at the 3’ end are G/C (no GC clamp at the 3’ end)
• Low complementarity between primers (avoid primer dimers)
• BLAST-search primers to avoid repetitive DNA
• Small amplicon size increases PCR efficiency
• Avoid runs of one base
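Several of the rules above are simple counts, so they can be screened automatically. The sketch below checks length, GC content, Tm, the 3’ end and single-base runs. Two things in it are assumptions and not from the slides: Tm is estimated with the Wallace rule (2 °C per A/T, 4 °C per G/C), which is only a rough guide for short oligos, and the `max_run=4` cutoff is an arbitrary illustrative value. Self-complementarity, primer dimers and the BLAST check are not covered, and all function names are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough Tm estimate in deg C (Wallace rule; an assumption, see lead-in)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def longest_run(seq):
    """Length of the longest stretch of one repeated base."""
    best, run, prev = 0, 0, None
    for base in seq.upper():
        run = run + 1 if base == prev else 1
        best = max(best, run)
        prev = base
    return best


def check_primer(seq, max_run=4):
    """Return a list of rule violations; an empty list means no rule failed."""
    seq = seq.upper()
    problems = []
    if not 17 <= len(seq) <= 25:
        problems.append("length outside 17-25 nt")
    if not 0.40 <= gc_fraction(seq) <= 0.60:
        problems.append("GC content outside 40-60%")
    if not 55 <= wallace_tm(seq) <= 60:
        problems.append("estimated Tm outside 55-60 C")
    # 3 or more G/C among the last five bases counts as a GC clamp.
    if sum(base in "GC" for base in seq[-5:]) >= 3:
        problems.append("3 or more G/C in the last five 3' bases")
    if longest_run(seq) > max_run:  # max_run is an illustrative cutoff
        problems.append("run of one base longer than max_run")
    return problems
```

A primer pair can be screened by calling `check_primer` on each primer and then comparing the two `wallace_tm` values, since the pair should have similar Tm.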
Primer Design: GC Clamps cause false priming