Searching for TFBSs with TRANSFAC - Hot topics in Bioinformatics

Searching for transcription factor binding sites with TRANSFAC

George Bell, Ph.D.
Bioinformatics and Research Computing

Hot Topics – October 2009

Outline

• What is known about your favorite TFs?

• In what regulatory DNA should we search?

• How can we search for an inexact sequence motif like a TFBS?

• What related resources are available?

Transcription control is complex

Lodish et al., Molecular Cell Biology: model for cooperative assembly of an activated transcription-initiation complex at the TTR promoter in hepatocytes

Kettenberger et al., 2004 (PDB 1Y1W): complete RNA polymerase II elongation complex (12 subunits)

TRANSFAC at Biobase

Connect from Whitehead network

TRANSFAC introduction

• Created in 1988

• Contains information about transcription factors that have been experimentally determined to bind DNA

• Includes eukaryotic cis-acting regulatory DNA elements and trans-acting factors, in organisms ranging from yeast to humans.

• The majority of information has been manually curated from the primary literature.

Browsing transcription factors

Select species

Detailed info

Types of TRANSFAC data

• Gene – curated info

• Promoter – TSS coordinates from Ensembl, FANTOM, etc.

• Functional Region – describes published regulatory regions

• Composite Element – two or more nearby binding sites

• Site – describes published TFBSs

• ChIP-chip – shows data by target

• Matrix – contains published aligned binding sites and positional probabilities
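To make the Matrix entry concrete, here is a minimal Python sketch of how a set of aligned binding sites turns into per-position counts and positional probabilities. The four sites are made-up E-box-like sequences used only for illustration, and the function names are hypothetical, not part of TRANSFAC.

```python
from collections import Counter

BASES = "ACGT"

def count_matrix(sites):
    """Count A, C, G, T at each column of equal-length aligned sites."""
    if len({len(site) for site in sites}) != 1:
        raise ValueError("sites must be non-empty and all the same length")
    rows = []
    for column in zip(*sites):
        tally = Counter(base.upper() for base in column)
        rows.append(tuple(tally.get(base, 0) for base in BASES))
    return rows

def positional_probabilities(counts):
    """Convert each row of counts into relative frequencies."""
    return [tuple(n / sum(row) for n in row) for row in counts]

# Hypothetical aligned sites (illustrative only, not TRANSFAC data).
sites = ["CAGGTG", "CAGCTG", "CACGTG", "CAGGTG"]
counts = count_matrix(sites)
probs = positional_probabilities(counts)

for position, (row, prob) in enumerate(zip(counts, probs), start=1):
    print(position, row, prob)
```

Each printed row corresponds to one position of the site, in A, C, G, T order.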

Transcription factor matrix

Position  A  C  G  T  Consensus
1         1  2  2  0  S
2         2  1  2  0  R
3         3  0  1  1  A
4         0  5  0  0  C
5         5  0  0  0  A
6         0  0  4  1  G
7         0  1  4  0  G
8         0  0  0  5  T
9         0  0  5  0  G
10        0  1  2  2  K
11        0  2  0  3  Y
12        1  0  3  1  G

Example: V$MYOD_01 vertebrate MyoD matrix 1
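The consensus column can be derived from the counts. The Python sketch below assumes a commonly used convention that is not stated on the slide: a single base is reported when it makes up more than 50% of the counts and is at least twice as frequent as the runner-up; otherwise a two-base IUPAC code is used when the top two bases together exceed 75%; otherwise N. Under that assumption the rule reproduces the consensus shown for V$MYOD_01.

```python
# Counts (A, C, G, T) per position, copied from the V$MYOD_01 matrix above.
MYOD_COUNTS = [
    (1, 2, 2, 0), (2, 1, 2, 0), (3, 0, 1, 1), (0, 5, 0, 0),
    (5, 0, 0, 0), (0, 0, 4, 1), (0, 1, 4, 0), (0, 0, 0, 5),
    (0, 0, 5, 0), (0, 1, 2, 2), (0, 2, 0, 3), (1, 0, 3, 1),
]

# IUPAC codes for two-base ambiguity.
IUPAC_PAIRS = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}

def consensus_letter(counts):
    """Return the consensus symbol for one row of (A, C, G, T) counts."""
    total = sum(counts)
    ranked = sorted(zip(counts, "ACGT"), key=lambda pair: -pair[0])
    (top, top_base), (second, second_base) = ranked[0], ranked[1]
    # Assumed rule 1: a clear single-base majority.
    if top > 0.5 * total and top >= 2 * second:
        return top_base
    # Assumed rule 2: two bases that together dominate the position.
    if top + second > 0.75 * total:
        return IUPAC_PAIRS[frozenset((top_base, second_base))]
    return "N"

def consensus(matrix):
    return "".join(consensus_letter(row) for row in matrix)

print(consensus(MYOD_COUNTS))  # SRACAGGTGKYG
```

Position 11 (0 2 0 3) shows why the second condition matters: T has 60% of the counts but is not twice as frequent as C, so the position is reported as Y rather than T.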

Matrix identifiers

• Examples: V$MYOD_01, V$AP1_Q4_01

V$ = vertebrates; I$ = insects; P$ = plants; F$ = fungi; N$ = nematodes; B$ = bacteria

MYOD = factor or family name

01 = matrix number 1 for MYOD

Q* = matrix reliability/quality (1 – 6):

1 Functionally confirmed transcription factor binding site

2 Binding of pure protein (purified or recombinant)

3 Immunologically characterized binding activity of a cellular extract

4 Binding activity characterized via a known binding sequence

5 Binding of uncharacterized extract protein to a bona fide element

6 No quality assigned
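As an illustration of the naming scheme, the following Python sketch splits an identifier into taxon, factor name, quality, and matrix number. The regular expression is an assumption that covers only the identifier shapes shown on this slide (for example V$MYOD_01, V$MYOD_Q6, V$AP1_Q4_01); real TRANSFAC identifiers may include forms it does not handle, and parse_matrix_id is a hypothetical helper.

```python
import re

TAXA = {
    "V": "vertebrates", "I": "insects", "P": "plants",
    "F": "fungi", "N": "nematodes", "B": "bacteria",
}

# Assumed layout: <taxon>$<FACTOR>[_Q<quality>][_<number>]
_MATRIX_ID = re.compile(
    r"^(?P<taxon>[VIPFNB])\$(?P<factor>[A-Z0-9]+)"
    r"(?:_Q(?P<quality>[1-6]))?(?:_(?P<number>\d+))?$"
)

def parse_matrix_id(identifier):
    """Split a matrix identifier into its named parts."""
    match = _MATRIX_ID.match(identifier)
    if match is None:
        raise ValueError(f"unrecognized matrix identifier: {identifier!r}")
    quality = match.group("quality")
    return {
        "taxon": TAXA[match.group("taxon")],
        "factor": match.group("factor"),
        "quality": int(quality) if quality else None,
        "number": match.group("number"),
    }

for name in ("V$MYOD_01", "V$MYOD_Q6", "V$AP1_Q4_01"):
    print(name, parse_matrix_id(name))
```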

Matrices are redundant

V$MYOD_01

V$MYOD_Q6

V$MYOD_Q6_01

Extracting regulatory regions

• One, many or all genes?

• Promoters or all potential regions (introns, intergenic)?

• Sources of genomic sequence:
– UCSC genome browser (click on “DNA”)
– Ensembl BioMart (“Sequences” for output)
– Published datasets
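If the sequence has already been downloaded, a promoter window around a TSS can be cut out locally. This Python sketch assumes the chromosome sequence is held in memory as a plain string, that the TSS is given as a 1-based coordinate, and that the window sizes are arbitrary defaults; promoter_region is a hypothetical helper, not a function of any of the tools listed above.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def promoter_region(chrom_seq, tss, strand, upstream=1000, downstream=100):
    """Return `upstream` bases before the TSS plus `downstream` bases
    starting at the TSS, read 5'->3' on the gene's own strand.

    `tss` is 1-based; windows running past either end are truncated.
    """
    if strand == "+":
        start = max(tss - 1 - upstream, 0)
        end = min(tss - 1 + downstream, len(chrom_seq))
        return chrom_seq[start:end]
    if strand == "-":
        # On the minus strand, "upstream" means higher coordinates.
        start = max(tss - downstream, 0)
        end = min(tss + upstream, len(chrom_seq))
        return reverse_complement(chrom_seq[start:end])
    raise ValueError("strand must be '+' or '-'")

toy = "AAAACCCCGGGGTTTT"  # toy sequence for illustration
print(promoter_region(toy, tss=9, strand="+", upstream=4, downstream=2))
print(promoter_region(toy, tss=8, strand="-", upstream=4, downstream=2))
```

Returning minus-strand promoters as their reverse complement keeps every extracted region in the same orientation relative to its gene, which simplifies comparing site positions across genes.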

Starting MATCH

MATCH profiles (sets of matrices)

Taxon:
• all
• bacteria
• fungi
• insects
• invertebrates
• nematodes
• plants
• vertebrate_non_redundant
• vertebrate_non_redundant_minFN
• vertebrate_non_redundant_minFP
• vertebrate_non_redundant_minSUM
• vertebrates

Tissue:
• adipocyte_specific
• immune_cell_specific
• liver_specific
• lung_specific
• muscle_specific
• nerve_system_specific
• pancreatic_beta_cell_specific
• pituitary_specific
• redox_specific

Biological process:
• cell_cycle_specific

User defined:
• Muscle_george

MATCH output

Core = the first five most conserved consecutive positions of the matrix
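The core definition can be illustrated with the V$MYOD_01 counts. The Python sketch below assumes an information-weighted scheme of the kind usually described for MATCH: each position gets an information value I(i) = Σ f(i,B)·ln(4·f(i,B)), the core is the leftmost window of five consecutive positions with the highest summed information, and a candidate site is scored as (current − min) / (max − min). The formula, the tie-breaking, and all function names are assumptions for illustration and may differ from what the tool actually does.

```python
import math

# Counts (A, C, G, T) per position for V$MYOD_01.
MYOD_COUNTS = [
    (1, 2, 2, 0), (2, 1, 2, 0), (3, 0, 1, 1), (0, 5, 0, 0),
    (5, 0, 0, 0), (0, 0, 4, 1), (0, 1, 4, 0), (0, 0, 0, 5),
    (0, 0, 5, 0), (0, 1, 2, 2), (0, 2, 0, 3), (1, 0, 3, 1),
]
BASE_INDEX = {"A": 0, "C": 1, "G": 2, "T": 3}

def frequencies(matrix):
    return [tuple(n / sum(row) for n in row) for row in matrix]

def information(freq_row):
    """Assumed I(i): zero-frequency bases contribute nothing."""
    return math.fsum(f * math.log(4 * f) for f in freq_row if f > 0)

def core_start(info, width=5):
    """0-based start of the leftmost most-informative window."""
    starts = range(len(info) - width + 1)
    return max(starts, key=lambda s: math.fsum(info[s:s + width]))

def matrix_score(seq, freqs, info):
    """Assumed similarity score, scaled to the range 0..1."""
    if len(seq) != len(freqs):
        raise ValueError("sequence and matrix must have the same length")
    current = math.fsum(
        i * row[BASE_INDEX[base]]
        for i, row, base in zip(info, freqs, seq.upper())
    )
    lowest = math.fsum(i * min(row) for i, row in zip(info, freqs))
    highest = math.fsum(i * max(row) for i, row in zip(info, freqs))
    return (current - lowest) / (highest - lowest)

freqs = frequencies(MYOD_COUNTS)
info = [information(row) for row in freqs]
start = core_start(info)
print("core positions:", start + 1, "to", start + 5)
print("score:", matrix_score("CAACAGGTGGTG", freqs, info))
```

For this matrix two windows (positions 4 to 8 and 5 to 9) have the same summed information, so taking the leftmost one is what "first" is assumed to mean here.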

Creating a custom matrix: input

Creating a custom matrix: output

MATCH Profiler - input

MATCH Profiler - output

MATCH with our custom profile

Related resources

• UCSC Genome Browser (hg18):
– “TFBS Conserved” track (human/mouse/rat)

• JASPAR (public database of transcription factor binding profiles):
– http://jaspar.genereg.net/

• Create a sequence logo: http://weblogo.berkeley.edu

• Command-line tools:
– TRANSFAC; tffind; HMMER1; MAST (MEME Suite)

• Search for “patterns” (ex: CAxxTGx[TC]):
– EMBOSS: fuzznuc; dreg
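A pattern such as CAxxTGx[TC] can also be searched with an ordinary regular expression. This Python sketch assumes that x stands for any one of A, C, G, or T and that only the given strand is scanned; find_pattern is a hypothetical helper and does not reproduce the options of fuzznuc or dreg.

```python
import re

def pattern_to_regex(pattern):
    """Translate the slide's notation into a regex; 'x' = any base.

    The lookahead lets overlapping matches be reported as well.
    """
    body = pattern.replace("x", "[ACGT]")
    return re.compile("(?=(" + body + "))", re.IGNORECASE)

def find_pattern(sequence, pattern):
    """Return (0-based start, matched text) for every occurrence."""
    regex = pattern_to_regex(pattern)
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]

hits = find_pattern("GGCAGCTGATCC", "CAxxTGx[TC]")
print(hits)  # [(2, 'CAGCTGAT')]
```

To cover both strands, the same search can be run on the reverse complement of the sequence, with the reported positions converted back to forward-strand coordinates.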