

Page 1: Identification of a Novel cis-Regulatory Element Involved in the Heat Shock Response in Caenorhabditis elegans Using Microarray Gene Expression and Computational

Identification of a Novel cis-Regulatory Element Involved in the Heat Shock Response in Caenorhabditis elegans Using Microarray Gene Expression and Computational Methods

Debraj Guha Thakurta, Lisanne Palomar, Gary D. Stormo, Pat Tedesco, Thomas E. Johnson, David W. Walker, Gordon Lithgow, Stuart Kim, and Christopher D. Link

Presented by

Abel G. Gezahegne

ECS 289A

February 24, 2003

Page 2

Overview

Monitor ~12,000 C. elegans genes to determine which are up-regulated on heat shock (HS).

Analyze the upstream regions of these genes using computational DNA pattern recognition methods to identify any cis-regulatory motifs.

Determine the significance of these motifs using statistical methods.

Perform comparative sequence analysis to determine whether these motifs are conserved across species.

Page 3

Microarray Experiment

Determine gene expression patterns before and after HS using a DNA microarray covering 11,917 known and predicted C. elegans genes.

Animals were harvested as young adults and then split into two halves: an HS population and a control population.

Five independent HS experiments were run at 35 °C. In two experiments, animals were harvested after 1 hr of HS. In three experiments, animals were heat shocked for 2 hrs, allowed to recover at 20 °C for 2 hrs, and then harvested.

Page 4

Software Tools

Consensus – a greedy algorithm that searches for a matrix with a low probability of occurring by chance.

ANN-Spec – an algorithm based on an artificial neural network and Gibbs sampling that discovers un-gapped patterns in DNA sequences.

GLASS – GLobal Alignment SyStem: a global sequence alignment algorithm for long genomic sequences.

Patser – given a weight matrix, identifies high-scoring subsequences and calculates their p-values.
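To make the weight-matrix idea concrete, here is a minimal Python sketch of how a matrix can score every window of a sequence. It is not Patser's implementation and it does not compute p-values; the function names, the pseudocount, the uniform background, and the four example sites are all assumptions made for illustration.

```python
import math

def log_odds_matrix(sites, background=None, pseudocount=1.0):
    """Build a natural-log log-odds weight matrix from aligned, equal-length sites."""
    background = background or {b: 0.25 for b in "ACGT"}  # assumed uniform background
    width = len(sites[0])
    matrix = []
    for i in range(width):
        column = [site[i] for site in sites]
        total = len(column) + 4 * pseudocount
        matrix.append({
            b: math.log(((column.count(b) + pseudocount) / total) / background[b])
            for b in "ACGT"
        })
    return matrix

def scan(sequence, matrix):
    """Return (position, score) for every window of the sequence (forward strand only)."""
    width = len(matrix)
    return [
        (i, sum(matrix[j][sequence[i + j]] for j in range(width)))
        for i in range(len(sequence) - width + 1)
    ]

# Hypothetical aligned sites resembling the HSE consensus (not taken from the study).
example_sites = ["TTCTAGAA", "TTCTAGAA", "TTCTGGAA", "TTCCAGAA"]
example_matrix = log_odds_matrix(example_sites)
best_position, best_score = max(scan("ACGTTTCTAGAACG", example_matrix), key=lambda h: h[1])
```

In this toy input the highest-scoring window is the one that spells out the consensus, which is the behavior a weight-matrix scanner is expected to show.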

Page 5

Gene Identification

Identified 28 genes induced in at least four of the five experiments and over-expressed by a factor of two or more.

Because of noise in the DNA microarray data, only genes up-regulated by an average factor of four or more were considered.
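The two selection criteria above can be sketched as a simple filter. This is an illustrative sketch only: it assumes the "average factor" is an arithmetic mean of per-experiment fold changes, and the function name, gene names, and numbers are hypothetical.

```python
def induced_genes(fold_changes, min_fold=2.0, min_experiments=4, min_mean_fold=4.0):
    """Select genes induced in enough experiments and with a high enough mean fold change.

    fold_changes maps a gene name to its HS/control ratio in each experiment.
    """
    selected = []
    for gene, folds in fold_changes.items():
        n_induced = sum(1 for f in folds if f >= min_fold)
        mean_fold = sum(folds) / len(folds)  # assumption: arithmetic mean
        if n_induced >= min_experiments and mean_fold >= min_mean_fold:
            selected.append(gene)
    return sorted(selected)

# Hypothetical fold changes over five experiments.
example = {
    "geneA": [5.0, 6.0, 4.5, 1.5, 7.0],   # induced in 4 of 5, mean 4.8
    "geneB": [2.5, 2.2, 2.1, 2.4, 2.0],   # induced in 5 of 5, mean below 4
    "geneC": [20.0, 1.0, 1.2, 0.9, 1.1],  # high mean, induced in only 1 of 5
}
print(induced_genes(example))  # prints ['geneA']
```

Requiring both conditions guards against a single noisy experiment driving the average, which is the concern the slide raises about microarray noise.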

Page 6

Gene Identification (cont.)

Used the 500 bp upstream of the transcription start site of each gene to search for candidate promoter elements.

Two DNA motifs were identified by Consensus and ANN-Spec:

HSE – TTCTAGAA, a well-known DNA binding site for HS transcription factors (HSF).

HSAS – GGGTGTC, a previously unknown motif that does not correspond to any known TF binding site.
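As a rough illustration of locating these two motifs in an upstream region, the sketch below finds exact consensus matches on both strands. The study used weight matrices rather than exact string matching, so this is a simplification; the function names and the toy upstream sequence are assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]

def find_motif(upstream, motif):
    """Return sorted (position, strand) pairs for exact matches on either strand."""
    hits = []
    rc = reverse_complement(motif)
    for strand, pattern in (("+", motif), ("-", rc)):
        if strand == "-" and rc == motif:
            continue  # palindromic motif: do not report the same site twice
        start = upstream.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = upstream.find(pattern, start + 1)
    return sorted(hits)

# Toy upstream sequence (hypothetical): a reverse-strand HSAS, an HSE, a forward HSAS.
upstream = "AA" + "GACACCC" + "TT" + "TTCTAGAA" + "TT" + "GGGTGTC" + "AA"
hse_hits = find_motif(upstream, "TTCTAGAA")   # [(11, '+')]
hsas_hits = find_motif(upstream, "GGGTGTC")   # [(2, '-'), (21, '+')]
```

TTCTAGAA is its own reverse complement, so it is reported once per site, while GGGTGTC has to be searched on both strands.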

Page 7

Mathematical Model

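Probability of a protein binding to a site with score $s$:

$$P(\text{bound} \mid s) \propto e^{s}$$

When multiple binding sites exist in a sequence, the probability of binding:

$$P_m^{seq} = \sum_{sites} e^{s}$$

Geometric mean of the pp-values over $N$ sequences:

$$\langle P_m^{seq} \rangle = \Big[ \prod_{seq} \sum_{sites} e^{s} \Big]^{1/N}$$

Difference of the log geometric means of the pp-values:

$$\mathrm{DLGM} = \log \langle P_m^{seq} \rangle_{HS} - \log \langle P_m^{seq} \rangle_{Rand}$$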

Page 8

Statistical Significance

Use the DLGM to determine the cutoff score, comparing the 13 up-regulated genes with 3000 random genes from the C. elegans genome.

$$\mathrm{DLGM} = \log \langle P_m^{seq} \rangle_{HS} - \log \langle P_m^{seq} \rangle_{Rand}$$

At a low cutoff, a substantial number of low-scoring sequences are included, so DLGM is low.

At a high cutoff, even the high-scoring sequences are excluded, so DLGM drops.

The cutoff score that maximizes DLGM is chosen as the appropriate cutoff value.
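The cutoff selection can be sketched as a scan over candidate cutoffs. This is a sketch under stated assumptions, not the authors' procedure: the function names are hypothetical, the site scores are invented, and the small `floor` value used when a sequence has no site above the cutoff is an assumption added to avoid taking the log of zero.

```python
import math

def log_geometric_mean(seq_scores, cutoff, floor=1e-6):
    """Mean over sequences of log(sum of e^s over sites with s >= cutoff)."""
    logs = []
    for scores in seq_scores:
        pm = sum(math.exp(s) for s in scores if s >= cutoff)
        logs.append(math.log(max(pm, floor)))  # assumption: floor for empty sequences
    return sum(logs) / len(logs)

def dlgm(hs_scores, rand_scores, cutoff):
    """Difference of log geometric means between the HS set and the random set."""
    return (log_geometric_mean(hs_scores, cutoff)
            - log_geometric_mean(rand_scores, cutoff))

def best_cutoff(hs_scores, rand_scores, cutoffs):
    """Return the candidate cutoff that maximizes DLGM."""
    return max(cutoffs, key=lambda c: dlgm(hs_scores, rand_scores, c))

# Invented site scores: one inner list per upstream sequence.
hs = [[8.0, 2.0], [7.5, 1.0], [9.0, 3.0]]
rand = [[2.0, 1.5], [3.0, 0.5], [1.0, 2.5], [6.5, 1.0]]
chosen = best_cutoff(hs, rand, [0, 2, 4, 6, 7, 8, 10])
```

With these toy numbers DLGM is small at a cutoff of 0, peaks at an intermediate cutoff, and falls to zero once the cutoff exceeds every site score, which mirrors the low-cutoff and high-cutoff behavior described above. The absolute DLGM values depend heavily on the chosen floor, so only the shape of the curve is meaningful here.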

Page 9

Cross-Species Conservation

To study conservation of regulatory sites across related species, two orthologous gene pairs from C. elegans and C. briggsae were examined.

The pattern of HSE and HSAS sites on the promoters indicates conservation across closely related species.

[Figure: output from VISTA (VISualization Tools for Alignment).]

Page 10

Cross-Species Conservation (cont.)

The gene structure and distances between the genes are similar in both organisms.

The two genes share 450 nt in the upstream DNA sequence.

[Figure: output from the GLASS alignment algorithm.]

Page 11

Mutant Promoter Construct

A single mutation of an HSE or the HSAS still results in a significant expression level of GFP (green fluorescent protein).

Mutation of all three sites, of two HSE sites, or of one HSE and the HSAS results in a dramatic reduction in expression level.

Page 12

Remarks and Conclusion

Since the microarray covered only ~2/3 of the C. elegans genes, other HS-induced genes may exist.

Through experiments and statistical methods, the novel cis-regulatory element has been shown to play a significant role in the heat shock response.

The study also shows that computational methods can be a valuable tool in the discovery of novel regulatory elements.