BITS training - UCSC Genome Browser - part 2

Post on 11-May-2015

DESCRIPTION

This is the second part of the lecture slides from the BITS bioinformatics training session on the UCSC Genome Browser. See http://www.bits.vib.be/index.php?option=com_content&view=article&id=17203990:orange-genome-browsers-ucsc-training&catid=81:training-pages&Itemid=190

TRANSCRIPT

Paco Hulpiau

UCSC genome browsing

http://www.bits.vib.be

TABLE BROWSER

GET DNA

CLICK LINE

CURRENT BROWSER GRAPHIC IN PDF

TO GET OTHER DATA

CLICK LINE

TO GET OTHER DATA

Databases & accession numbers

GenBank exchanges data daily with its two partners in the International Nucleotide Sequence Database Collaboration (INSDC):

» European Bioinformatics Institute (EBI, part of EMBL)

» DNA Data Bank of Japan (DDBJ)

Characteristics of GenBank and RefSeq @ NCBI :

The Ensembl automatic gene annotation system (Curwen et al, 2004) :

The gene-building system enables fast automated annotation of eukaryotic genomes. It annotates genes based on evidence derived from known protein, cDNA, and EST sequences, incl. GenBank sequences shared by INSDC, UniProtKB and NCBI RefSeq.

CLICK LINE

TO GET OTHER DATA

zoom in on exon 1 + upstream

Exercises (II)

1) Are there any diseases related to your gene of interest? (OMIM)

2) Which interaction partners are known? (Entrez Gene)

3) Any important SNPs changing the amino acid sequence?

4) Get the multiple sequence alignment (MSA, multiz46way) showing the nucleotide sequences of the human, mouse, chicken, Xenopus and zebrafish genes (CDS FASTA alignment, exons not separate).

5) Save your results (e.g. exercises2_1.doc).

TO GET OTHER DATA

GET DNA

http://www.visibone.com/colorlab/

Exercises (II)

1) Get the DNA sequence for your gene of interest, including 2000 base pairs upstream, and use the following extended case/color options:

» RefSeq and Ensembl genes in bold

» SNPs (132) underlined

» Regulatory information, e.g. from ORegAnno and miRNA sites, in different colors

» Save your results (e.g. exercises2_2a.doc).
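To make "including 2000 base pairs upstream" concrete, here is a minimal sketch of which genomic interval that request corresponds to when upstream is taken in the biological sense (5' of the gene on its own strand). The function name and the coordinates are made up for illustration; how the Get DNA form itself interprets upstream for minus-strand genes should be checked in the form's own notes.

```python
def region_with_upstream(tx_start, tx_end, strand, upstream=2000):
    """Return (start, end) of a gene plus `upstream` bases 5' of it.

    Coordinates are 1-based and inclusive; `strand` is "+" or "-".
    The result is not clamped at the chromosome end.
    """
    if strand == "+":
        # The 5' end is the smaller coordinate; never go below position 1.
        return max(1, tx_start - upstream), tx_end
    if strand == "-":
        # The 5' end is the larger coordinate, so extend to the right.
        return tx_start, tx_end + upstream
    raise ValueError(f"strand must be '+' or '-', got {strand!r}")


# Made-up coordinates, for illustration only.
print(region_with_upstream(10_000, 15_000, "+"))  # (8000, 15000)
print(region_with_upstream(10_000, 15_000, "-"))  # (10000, 17000)
```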

Exercises (II)

1) Try to get the DNA sequence for your gene of interest in chicken or zebrafish and use the following extended case/color options:

» UCSC, RefSeq and Ensembl genes in bold

» Other RefSeq genes underlined

» Human proteins in a specific color

» Save your results (e.g. exercises2_2b.doc).

TABLE BROWSER

TO GET OTHER DATA

COPY (Ctrl+C)

= Accession Number (RefSeq) e.g. NM_001229

= Gene Name (Entrez) e.g. CASP1
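When pasting identifiers into the search box it helps to know what kind of record an accession points to. The sketch below classifies an identifier by its prefix, using the common RefSeq prefixes and the human Ensembl prefixes. The function name is made up for illustration, and the prefix tables are deliberately incomplete (other species use longer Ensembl prefixes, and RefSeq has further prefixes).

```python
# Common RefSeq prefixes (not an exhaustive list).
REFSEQ_PREFIXES = {
    "NM_": "RefSeq mRNA",
    "NR_": "RefSeq non-coding RNA",
    "NP_": "RefSeq protein",
    "XM_": "RefSeq predicted mRNA",
    "XR_": "RefSeq predicted non-coding RNA",
    "XP_": "RefSeq predicted protein",
}

# Human Ensembl stable-ID prefixes only.
ENSEMBL_PREFIXES = {
    "ENSG": "Ensembl gene",
    "ENST": "Ensembl transcript",
    "ENSP": "Ensembl protein",
}


def classify_accession(acc):
    """Return a label for a known accession prefix, or "unknown"."""
    acc = acc.strip().upper()
    for table in (REFSEQ_PREFIXES, ENSEMBL_PREFIXES):
        for prefix, label in table.items():
            if acc.startswith(prefix):
                return label
    return "unknown"


print(classify_accession("NM_001229"))  # RefSeq mRNA
print(classify_accession("CASP1"))      # unknown (a gene name, not an accession)
```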

Exercises (II)

1) Get a list of the RefSeq and Ensembl transcripts using the table browser with the following selected fields:

» name, chromosome, exon count, name2

» Save the results (exercises2_3a.xls)

2) Also get the sequences and save as genename_transcripts.fasta

3) Search the mouse genome using the filter in the table browser to get all family members of a protein family (research interest) and save the results in a list (exercises2_3b.xls) containing name, chromosome, cds start and end, exon count and name2
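Once the selected fields are saved, the list can be processed outside a spreadsheet as well. The following is a minimal sketch that groups transcripts by gene name, assuming the export is tab-separated with a header line starting with "#" and columns called name, chrom, exonCount and name2. Those column names and all sample rows are assumptions for illustration and should be checked against the actual file.

```python
import csv
from collections import defaultdict

# Made-up rows in the assumed shape of a tab-separated export.
SAMPLE = (
    "#name\tchrom\texonCount\tname2\n"
    "TX_A1\tchr1\t5\tGENE_A\n"
    "TX_A2\tchr1\t4\tGENE_A\n"
    "TX_B1\tchr2\t9\tGENE_B\n"
)


def transcripts_by_gene(text):
    """Map each gene name (name2) to its (name, chrom, exonCount) tuples."""
    lines = text.splitlines()
    if not lines:
        return {}
    # Drop the leading "#" so the first line works as a CSV header.
    lines[0] = lines[0].lstrip("#")
    genes = defaultdict(list)
    for row in csv.DictReader(lines, delimiter="\t"):
        genes[row["name2"]].append(
            (row["name"], row["chrom"], int(row["exonCount"]))
        )
    return dict(genes)


for gene, transcripts in transcripts_by_gene(SAMPLE).items():
    print(gene, len(transcripts), "transcript(s)")
```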

TO GET OTHER DATA

BLAT = BLAST-Like Alignment Tool

» search for high similarity matches by indexing entire genome

» DNA limit = 25000 bases, for multiple seqs 50000 bases

» protein limit = 10000 aa, for multiple seqs 25000 aa

» total sequences = 25
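The submission limits above are easy to check before pasting sequences into the form. This is a small sketch that takes the numbers from the slide as given; the function name and message wording are made up for illustration, and the limits on the live site may have changed since.

```python
# (per-sequence limit, total limit for multiple sequences), from the slide.
LIMITS = {"dna": (25_000, 50_000), "protein": (10_000, 25_000)}
MAX_SEQUENCES = 25


def check_blat_submission(sequences, kind="dna"):
    """Return a list of problems; an empty list means the limits are met."""
    per_seq, total = LIMITS[kind]
    problems = []
    if len(sequences) > MAX_SEQUENCES:
        problems.append(f"{len(sequences)} sequences, limit is {MAX_SEQUENCES}")
    for i, seq in enumerate(sequences, start=1):
        if len(seq) > per_seq:
            problems.append(f"sequence {i}: {len(seq)} > {per_seq}")
    combined = sum(len(seq) for seq in sequences)
    if len(sequences) > 1 and combined > total:
        problems.append(f"combined length {combined} > {total}")
    return problems


print(check_blat_submission(["TTTAGCCAACGAACAGTCGCT", "TTCTCTTTGCATCTGTCCCAG"]))  # []
```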

PASTE (Ctrl+V)

TTTAGCCAACGAACAGTCGCT TTCTCTTTGCATCTGTCCCAG

The Utilities page contains links to some tools created by the UCSC Genome Bioinformatics Group.

DNA Duster & Protein Duster remove non-sequence related characters from an input sequence.
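The idea behind DNA Duster can be sketched in a few lines: keep the nucleotide letters and drop everything else (position numbers, spaces, line breaks). This is an illustration of the concept only, not the tool's actual implementation; the function name and the exact set of letters kept (A, C, G, T, U, N) are assumptions.

```python
import re


def dust_dna(raw):
    """Strip every character that is not a nucleotide letter."""
    return re.sub(r"[^ACGTUNacgtun]", "", raw)


# Sequence text copied with position numbers and spacing still attached.
messy = "  1 tttagccaac gaacagtcgc t\n 22 ttctc"
print(dust_dna(messy))  # tttagccaacgaacagtcgctttctc
```

A protein version would work the same way with the amino acid alphabet in place of the nucleotide letters.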

Exercises (II)

1) Use BLAT to find orthologs of your gene in chicken, zebrafish and fruit fly. What is the genomic location? Are the flanking genes the same?

2) Perform an in silico PCR to see what happens when more than one PCR product may arise, and determine product size and Tm:

» species: human

» forward primer: TTC AAG GAG GCC TTC TCC CT

» reverse primer: CTG GGG GAG AAG CTG A (+ click flip reverse)
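The primers from this exercise can be prepared and sanity-checked by hand before running the in silico PCR. The sketch below removes the spacing, reverse-complements the reverse primer (on the assumption that this is what the "flip reverse" option does), and estimates Tm with the simple Wallace rule, 2 °C per A/T and 4 °C per G/C. The Wallace rule is a rough approximation chosen here for illustration; the Tm reported by the browser is calculated differently and will not match exactly.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(primer):
    """Remove whitespace and upper-case the primer."""
    return "".join(primer.split()).upper()


def reverse_complement(seq):
    """Complement each base, then reverse the sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def wallace_tm(primer):
    """Rough Tm estimate in degrees C: 2 per A/T plus 4 per G/C."""
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


forward = clean("TTC AAG GAG GCC TTC TCC CT")
reverse = clean("CTG GGG GAG AAG CTG A")

print(len(forward), wallace_tm(forward))  # 20 62
print(len(reverse), wallace_tm(reverse))  # 16 52
print(reverse_complement(reverse))        # TCAGCTTCTCCCCCAG
```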
