
Version 21a_1012

1

The UCSC Genome Browser
Introduction

Osvaldo Graña
CNIO Bioinformatics Unit

Materials prepared by

Mary Mangan, Ph.D.

www.openhelix.com

2

UCSC Genome Browser Agenda

UCSC Genome Browser: http://genome.ucsc.edu

• Introduction
• Basic Searches
• Understanding Displays
• Get Details or Sequences
• Sequence Searches (BLAT)
• Exercises

3

Organization of Genomic Data

Reference genome: base position number, sequence

Annotation

Tracks

chromosome band

predicted genes

phenotype and disease

evolutionary conservation

SNPs and structural variation

gap locations

known genes

repeated regions

microarray/expression data

more…

enhancer/promoter data

Links out to more data

4

A Sample of the UCSC Genome Browser

Annotation Tracks

reference sequence

comparisons

gene details

SNPs

6

The UCSC Homepage: http://genome.ucsc.edu

navigate

navigate

General information

Specific information—new features, current status, etc.

Gateway: Start Page for a Basic Search

7

text/ID searches

Helpful search examples

format provided

Use this Gateway to search:
• Gene names, symbols, IDs
• Chromosome number: chr7, or region: chr11:1038475-1075482
• Keywords: kinase, receptor

See lower part of page for help with format
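The chromosome and region formats shown above can be told apart from free-text searches with a small parser. This is only a sketch: the `Position` type, the `parse_position` function, and the regular expression are hypothetical names made up for illustration, and the pattern only covers numbered chromosomes plus chrX, chrY, and chrM.

```python
import re
from typing import NamedTuple, Optional


class Position(NamedTuple):
    chrom: str
    start: Optional[int]  # None when only a chromosome was given
    end: Optional[int]


# Hypothetical pattern: "chr7" or "chr11:1038475-1075482" (commas allowed).
_POSITION_RE = re.compile(r"^(chr(?:\d+|[XYM]))(?::([\d,]+)-([\d,]+))?$")


def parse_position(text: str) -> Optional[Position]:
    """Return a Position for chromosome/region strings, or None for free text."""
    match = _POSITION_RE.match(text.strip())
    if match is None:
        return None  # e.g. a gene symbol or a keyword such as "kinase"
    chrom, start, end = match.groups()
    if start is None:
        return Position(chrom, None, None)
    start_i = int(start.replace(",", ""))
    end_i = int(end.replace(",", ""))
    if start_i > end_i:
        raise ValueError(f"start {start_i} is after end {end_i}")
    return Position(chrom, start_i, end_i)
```

Anything that returns `None` here would be treated as a text/ID search rather than a coordinate.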

UCSC Genome Browser Gateway

8

Make your Gateway choices:

1. Select clade + genome = species: search 1 species at a time

2. Assembly: the official reference DNA sequence

3. Position: location in the genome to examine, or text search

4. Track search to find data types of interest (annotation tracks)

5. Configure: make fonts bigger + other display choices

9

Sample Search for Human TP53

Sample search: human, February 2009 assembly, tp53

select

Select from the results list; if the match is unique, the search goes straight to a viewer page

uc002gij.2
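The same sample search can be written as a link instead of typed into the Gateway. The sketch below assumes the browser's `hgTracks` page accepts `db` and `position` query parameters, and that `hg19` is the database name for the human February 2009 assembly; the `browser_link` helper is a hypothetical name.

```python
from urllib.parse import urlencode

BROWSER_URL = "https://genome.ucsc.edu/cgi-bin/hgTracks"


def browser_link(db: str, position: str) -> str:
    """Build a Genome Browser link for an assembly and a position or search term."""
    return f"{BROWSER_URL}?{urlencode({'db': db, 'position': position})}"


# Assumption: hg19 names the human February 2009 assembly.
print(browser_link("hg19", "tp53"))
```

A region such as chr11:1038475-1075482 can be passed as the position in the same way; `urlencode` takes care of escaping the colon.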

11

Overview of the Whole Genome Browser Page

Genome viewer (2009 Human Assembly)

Groups of data (Tracks)

• Mapping and Sequencing Tracks
• Phenotype and Disease Tracks
• Genes and Gene Prediction Tracks (including sno/miRNA data)
• mRNA and EST Tracks
• Expression (such as microarray)
• Regulation (including TFBS)
• Comparative Genomics: as a group, or individual species
• Variation and Repeats (including SNPs, copy number variation)

12

Different Assemblies, Species, Tracks

• Different assemblies and species may have different data tracks
• The layout, software, and functions are the same

13

Sample Genome Viewer Image, TP53 Region

base position

UCSC genes

RefSeq

mRNAs & ESTs

many species compared

SNPs

single species compared

ENCODE

repeats

scale

14

Visual Cues on the Genome Browser

Track colors may have meaning. For example, in the UCSC Genes track:
• Black = there is a corresponding PDB entry
• Dark blue = there is a corresponding reviewed/validated sequence
• Lightest blue = there is a non-RefSeq sequence

The height of a blue bar indicates an increased likelihood of conservation; red indicates a likelihood of faster-evolving regions

Mammal cons.

Intron and direction of transcription: <<< or >>>

[Figure: gene model with exons joined by intron lines marked <<<; 5' UTR and 3' UTR labeled]

Alignment indications (Conservation pairs: “chain” or “net” style)
• Alignments = boxes, Gaps = lines

Tick marks: a single location (STS, SNP)

15

Options for Changing Images: Upper Section

• Change your view or location with the controls at the top
• Use “base” to get right down to the nucleotides
• Drag tracks up and down the viewer to rearrange them
• Various select and focus options by clicking/dragging the mouse

zoom

walk

Tweak position or do new search

Right-click items

Hold/drag mouse to view section

Drag (like Google Maps)

16

Annotation Track Display Options

• Some data is ON or OFF by default
• Menu links lead to info about the tracks: content, methods
• You change the view with pulldown menus
• After making changes, REFRESH to enforce the change

Change track view

Links to info and/or filters and color key

Enforce menu changes

17

Basic Annotation Track Menus Defined

Hide: removes a track from view

Dense: all items collapsed into a single line

Squish: each item = separate line, but 50% height + packed

Pack: each item separate, but efficiently stacked (full height)

Full: each item on separate line (may need to zoom to fit)
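The difference between the five menu choices can be sketched as a toy layout calculation. This is not the browser's own drawing code: `pack_rows` and `layout` are hypothetical helpers, items are plain (start, end) pairs with an exclusive end, and the greedy first-fit stacking is an assumption about what "efficiently stacked" means.

```python
from typing import List, Tuple

Item = Tuple[int, int]  # (start, end) of one annotation item, end exclusive


def pack_rows(items: List[Item]) -> List[List[Item]]:
    """Greedily place each item on the first row where it does not overlap."""
    rows: List[List[Item]] = []
    for item in sorted(items):
        for row in rows:
            if row[-1][1] <= item[0]:  # last item on the row ends before this one starts
                row.append(item)
                break
        else:
            rows.append([item])  # no existing row had room
    return rows


def layout(items: List[Item], mode: str) -> Tuple[int, float]:
    """Return (number of rows, relative row height) for a display mode."""
    if mode == "hide":
        return 0, 0.0
    if mode == "dense":
        return 1, 1.0
    if mode == "full":
        return len(items), 1.0
    if mode == "pack":
        return len(pack_rows(items)), 1.0
    if mode == "squish":
        return len(pack_rows(items)), 0.5
    raise ValueError(f"unknown display mode: {mode}")
```

With four items of which two overlap, full needs four rows while pack and squish need only two, which is why full may require zooming to fit.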

18

Tracks with Additional Options: Filters, more…

• Some tracks have filters (ESTs shown; SNPs are another good example)
• Some tracks may have undisplayed data (Yale TFBS; 2006)
• Super-tracks may have multiple components and various settings

off

on

Super-track

19

Mid-page Options to Change Settings

• Search for data types
• Reset to defaults
• Configure options page
• You control the views with numerous features

Resets, back to defaults

Start from scratch

Flip display to Genomic 5’ to 3’

Search for data types

Fit to browser window size

20

Cookies and Sessions

To clear your “cart” or parameters, click default tracks or reset

Save your setup as “Session” and store/share them

OR

Your browser remembers where you were (cookies)

Requires login

Lifespan: 4 months

22

Click Any Viewer Object for More Details

Example: click your mouse anywhere on the TP53 line

Click the item

Many details and links to more data about TP53

New description web page opens

23

Click Annotation Track Item for Description Pages

Not all genes have this much detail. Different annotation tracks carry different data.

informative description

other resource links

microarray data

mRNA secondary structure

links to sequences

protein domains/structure

orthologs in other species

Gene Ontology™ descriptions

mRNA descriptions

pathways

genetic association studies

comparative toxicology

gene model

synonyms

24

Get DNA, with Extended Case/Color Options

• Use the View DNA link at the top
• Plain or Extended options
• Change colors, fonts, underline, etc.
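The case option can be pictured with a few lines of code: lowercase everything, then uppercase the bases that fall inside chosen features. This is a sketch under stated assumptions, not the browser's code; `toggle_case` and `wrap` are hypothetical helpers, and the feature intervals are 0-based with an exclusive end.

```python
from typing import Iterable, Tuple


def toggle_case(sequence: str, features: Iterable[Tuple[int, int]]) -> str:
    """Lowercase the sequence, then uppercase each 0-based, end-exclusive feature."""
    chars = list(sequence.lower())
    for start, end in features:
        # Clamp to the sequence so out-of-range features are ignored safely.
        for i in range(max(start, 0), min(end, len(chars))):
            chars[i] = chars[i].upper()
    return "".join(chars)


def wrap(sequence: str, width: int = 50) -> str:
    """Break a sequence into fixed-width lines for display."""
    return "\n".join(sequence[i:i + width] for i in range(0, len(sequence), width))
```

For example, `toggle_case("ACGTACGTAC", [(2, 5)])` marks the three bases at positions 2 to 4 in upper case and leaves the rest in lower case.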

Get DNA

25

Get Sequence from Description Pages

Click the item

Click an item, go to Sequence section of description page

sequence section on detail page

Copy whole mRNA for next segment

27

Accessing the BLAT Tool

• Rapid searches by INDEXING the entire genome
• Works best with high similarity matches
• See documentation and publication for details

Kent, WJ. Genome Res. 2002. 12:656 and “Help”

BLAT = BLAST-like Alignment Tool
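Why indexing makes searches rapid can be shown with a toy k-mer index. The sketch assumes the index stores non-overlapping k-mers of the genome and that every overlapping k-mer of the query is looked up in it, which is my reading of the cited publication; the function names, the tiny genome, and k = 4 are made up for illustration, and the real tool's extension and scoring steps are left out.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def build_index(genome: str, k: int) -> Dict[str, List[int]]:
    """Map each non-overlapping k-mer of the genome to its start positions."""
    index: Dict[str, List[int]] = defaultdict(list)
    for pos in range(0, len(genome) - k + 1, k):
        index[genome[pos:pos + k]].append(pos)
    return index


def find_hits(index: Dict[str, List[int]], query: str, k: int) -> List[Tuple[int, int]]:
    """Return (query offset, genome position) for each query k-mer found in the index."""
    hits: List[Tuple[int, int]] = []
    for offset in range(len(query) - k + 1):
        for pos in index.get(query[offset:offset + k], []):
            hits.append((offset, pos))
    return hits


genome = "ACGTTGCAAGGCTTAA"  # toy sequence, not real data
index = build_index(genome, k=4)
hits = find_hits(index, "GTTGCAAG", k=4)
```

The index is built once per genome, so each query costs only dictionary lookups. Exact k-mer hits are also the reason the approach works best for highly similar sequences.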

28

BLAT Tool Interface

Make choices

DNA limit 25000 bases

Protein limit 10000 aa

25 total sequences

Paste one or more sequences

FASTA for more than one

Or upload

submit
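The input limits on this slide can be checked before pasting sequences into the form. The sketch below is hypothetical throughout: `parse_fasta` and `check_submission` are illustrative names, and it assumes the base and amino-acid limits apply to the combined length of all sequences, which the slide does not spell out.

```python
from typing import List, Tuple

DNA_LIMIT = 25000      # bases, from the slide
PROTEIN_LIMIT = 10000  # amino acids, from the slide
MAX_SEQUENCES = 25     # total sequences, from the slide


def parse_fasta(text: str) -> List[Tuple[str, str]]:
    """Split FASTA text into (name, sequence) pairs; a bare sequence is named 'query'."""
    records: List[Tuple[str, List[str]]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append((line[1:].strip(), []))
        else:
            if not records:
                records.append(("query", []))  # sequence pasted without a header
            records[-1][1].append(line)
    return [(name, "".join(parts)) for name, parts in records]


def check_submission(text: str, is_protein: bool = False) -> List[str]:
    """Return a list of problems; an empty list means the input fits the limits."""
    records = parse_fasta(text)
    problems: List[str] = []
    if len(records) > MAX_SEQUENCES:
        problems.append(f"{len(records)} sequences exceeds {MAX_SEQUENCES}")
    total = sum(len(seq) for _, seq in records)  # assumption: limit is on the total
    limit = PROTEIN_LIMIT if is_protein else DNA_LIMIT
    if total > limit:
        problems.append(f"{total} letters exceeds {limit}")
    return problems
```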

29

BLAT Results with Hyperlinks

• Results with demo sequences, default settings; sort = Query, Score
• Score is a count of matches: the higher the number, the better the match
• Click browser to go to the Genome Browser image location (next slide)
• Click details to see the alignment to genomic sequence (2nd slide)

sorting

go to browser/viewer

go to alignment detail
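The "Query, Score" ordering of the results table can be mimicked by sorting on the query name and then on descending score. This is a minimal sketch: the `Hit` record, its fields, and the example values are invented for illustration and are not output from a real search.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    query: str
    score: int  # treated here as a count of matching bases
    chrom: str


def sort_hits(hits: List[Hit]) -> List[Hit]:
    """Order hits by query name, then best (highest) score first."""
    return sorted(hits, key=lambda hit: (hit.query, -hit.score))


# Made-up example values, not real results.
example = [
    Hit("seq2", 120, "chr3"),
    Hit("seq1", 40, "chr9"),
    Hit("seq1", 980, "chr17"),
]
ordered = sort_hits(example)
```

With this ordering the top row for each query is its best match, which is usually the one to open in the browser.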

30

BLAT Results: Browser Link

query

• Reached by clicking browser in the BLAT results
• A new track line, Your Sequence from BLAT Search, appears
• A new menu to adjust that track also appears

31

BLAT Results, Alignment Details

Your query

Genomic match, with color cues

Side by Side Alignment

yours

genomic

34

Notice:

The materials and slides offered are for non-commercial use only. Reproduction, distribution and/or use for commercial purposes is strictly prohibited.

Copyright 2012, OpenHelix, LLC

http://www.openhelix.com/ucsc