CLUSTER ANALYSIS OF DNA AND ANTIGEN CHIP DATA

EYTAN DOMANY

DEAD SEA, OCT 2002



THE DATA: EXPRESSION LEVEL OF GENE i IN SAMPLE j

[Figure: the data, an expression matrix with genes 1 to 358 along one axis and samples 1 to 36 along the other.]

THE METHOD: CLUSTER ANALYSIS.

N OBJECTS: BREAK THEM INTO GROUPS ON THE BASIS OF SIMILARITY.

[Figure: DENDROGRAM; axis: T (RESOLUTION); labels: YOUNG, OLD.]


1. OBJECTS = GENES:

GENES WITH SIMILAR EXPRESSION PROFILES MAY BE CO-REGULATED

PROVIDE GUESS FOR ROLE OF PROTEINS

2. OBJECTS = SAMPLES:

CLASSIFY TUMORS, DIAGNOSIS, PROGNOSIS, THERAPY
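A minimal sketch of clustering objects by similarity, not the method used in the talk: average-linkage agglomerative clustering with correlation distance, stopped at a distance threshold that plays the role of a resolution cut through a dendrogram. The toy profiles, function names and threshold values are illustrative assumptions.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def distance(x, y):
    """Correlation distance: 0 for identical shapes, 2 for opposite shapes."""
    return 1.0 - pearson(x, y)

def average_linkage(profiles, threshold):
    """Merge the two closest clusters until the closest pair exceeds threshold."""
    clusters = [[i] for i in range(len(profiles))]

    def link(a, b):
        # Average distance over all cross-cluster pairs.
        total = sum(distance(profiles[i], profiles[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        pairs = [(link(a, b), ia, ib)
                 for ia, a in enumerate(clusters)
                 for ib, b in enumerate(clusters) if ia < ib]
        d, ia, ib = min(pairs)
        if d > threshold:
            break
        clusters[ia] = clusters[ia] + clusters[ib]
        del clusters[ib]
    return sorted(sorted(c) for c in clusters)

# Hypothetical expression profiles: 4 genes measured in 5 samples.
profiles = [
    [1, 2, 3, 4, 5],    # gene 0: rising
    [2, 4, 6, 8, 11],   # gene 1: rising
    [5, 4, 3, 2, 1],    # gene 2: falling
    [9, 7, 6, 4, 2],    # gene 3: falling
]

print(average_linkage(profiles, 0.5))   # [[0, 1], [2, 3]]
print(average_linkage(profiles, 2.5))   # [[0, 1, 2, 3]]
```

Lowering the threshold splits the objects into more and smaller groups; raising it merges them, which is the behaviour a dendrogram displays along its resolution axis.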

THE PROBLEM RAISED BY ESHEL BEN-JACOB:

CLASSIFYING THE PATIENTS ON THE BASIS OF EXPRESSION OF THOUSANDS OF GENES DOES NOT WORK, SINCE MOST GENES ARE NOT RELEVANT TO THE QUESTION OF INTEREST AND INTRODUCE ONLY NOISE.
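A small worked calculation can illustrate why irrelevant genes hurt. It is a sketch under assumptions that are not in the slides: every gene carries independent noise of variance sigma squared, and only `n_relevant` genes differ between the two patient classes, each by `delta`. Under that model the expected squared Euclidean distance between two samples is 2 N sigma^2 within a class and 2 N sigma^2 + n_relevant delta^2 between classes, where N is the total number of genes.

```python
def contrast(n_relevant, n_noise, delta, sigma):
    """Ratio of expected between-class to within-class squared distance.

    Assumed model (illustrative, not from the talk): independent noise of
    variance sigma**2 in every gene; relevant genes shift by delta between classes.
    """
    n_total = n_relevant + n_noise
    within = 2.0 * n_total * sigma ** 2
    between = within + n_relevant * delta ** 2
    return between / within

print(contrast(10, 0, 2.0, 1.0))       # 3.0
print(contrast(10, 1990, 2.0, 1.0))    # 1.01
```

With only the 10 relevant genes, samples from different classes are three times farther apart than samples from the same class; once 1990 irrelevant genes are added, the two distances become almost indistinguishable.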


THE SOLUTION: WORK WITH SMALL SUBSETS OF GENES AND SAMPLES.

COUPLED TWO-WAY CLUSTERING

Getz et al, PNAS (2000)

Califano et al, Proc. Int. Conf. Intell. Syst. Mol. Biol. (2000)

Y. Cheng and G. M. Church, Proc. Int. Conf. Intell. Syst. Mol. Biol. (2000)

IDENTIFY (CORRELATED) GROUPS OF GENES AND USE THEIR EXPRESSION LEVELS TO STUDY (CLUSTER) THE SAMPLES.
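The two-step idea can be sketched as follows. This is a simplified, single-pass illustration and not the published CTWC algorithm: a plain distance-threshold grouping stands in for super-paramagnetic clustering, there is no stability test, and the procedure is not iterated. The toy matrix, thresholds and function names are assumptions made for the example.

```python
from math import sqrt

def correlation_distance(x, y):
    """1 minus the Pearson correlation of two non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    norm = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return 1.0 - cov / norm

def euclidean(x, y):
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def threshold_clusters(vectors, dist, threshold):
    """Stand-in clustering: connected components of the graph that links
    every pair of items closer than threshold."""
    label = list(range(len(vectors)))
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if dist(vectors[i], vectors[j]) < threshold:
                old, new = label[j], label[i]
                label = [new if l == old else l for l in label]
    groups = {}
    for idx, l in enumerate(label):
        groups.setdefault(l, []).append(idx)
    return sorted(groups.values())

def coupled_two_way(matrix, gene_t, sample_t, min_genes=2):
    """matrix[g][s] is the expression of gene g in sample s.
    Step 1: group correlated genes. Step 2: cluster the samples once per
    gene group, using only that group's expression levels."""
    n_samples = len(matrix[0])
    result = {}
    for genes in threshold_clusters(matrix, correlation_distance, gene_t):
        if len(genes) < min_genes:
            continue
        sub = [[matrix[g][s] for g in genes] for s in range(n_samples)]
        result[tuple(genes)] = threshold_clusters(sub, euclidean, sample_t)
    return result

# Hypothetical data: 5 genes x 4 samples.
matrix = [
    [1.0, 1.2, 5.0, 5.1],   # genes 0 and 1 separate samples {0,1} from {2,3}
    [2.0, 2.1, 6.0, 6.2],
    [0.0, 4.0, 0.2, 4.1],   # genes 2 and 3 separate samples {0,2} from {1,3}
    [1.0, 5.0, 1.1, 5.2],
    [3.0, 1.0, 1.0, 3.0],   # gene 4 follows neither pattern
]

print(coupled_two_way(matrix, gene_t=0.2, sample_t=1.0))
# {(0, 1): [[0, 1], [2, 3]], (2, 3): [[0, 2], [1, 3]]}

# Using all genes at once, no two samples fall in the same group.
all_genes = [[row[s] for row in matrix] for s in range(4)]
print(threshold_clusters(all_genes, euclidean, 1.0))   # [[0], [1], [2], [3]]
```

In this toy case each gene group reveals a different partition of the same samples, while clustering on all genes together reveals none.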

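GLIOBLASTOMA: M. Hegi et al, CHUV; G. Getz

CLONTECH ARRAYS

Coupled Two-Way Clustering (CTWC) of 358 Genes and 36 Samples

[Figure: two-way clustered expression matrix; axes: GENES, T. Sample classes: Astrocytoma (II), Secondary GBM, Primary GlioBlastoMa, Cell Lines. Sample clusters marked: S1(G1), S2, S3. Gene clusters marked: G1(S1), G5, G12.]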

AB004904 STAT-induced STAT inhibitor 3

M32977 VEGF

M35410 IGFBP2

X51602 VEGFR1

M96322 gravin

AB004903 STAT-induced STAT inhibitor 2

X52946 PTN

J04111 c-jun

X79067 TIS11B

Super-Paramagnetic Clustering of All Samples Using Stable Gene Cluster G5 (Fig. 2B)

[Figure: S1(G5) with sample clusters S10, S11, S12, S13, S14 marked. Sample classes: LGA, AIII, ScGBM, PrGBM, RecPrGBM, new sample (validation).]


THE GENES OF G5:

VEGF AND ITS RECEPTORS: INSTRUMENTAL IN ANGIOGENESIS; INDUCED GROWTH OF BLOOD VESSELS, ESSENTIAL FOR GROWTH BEYOND A CRITICAL SIZE. THE COEXPRESSION OF IGFBP2 WAS INDEPENDENTLY VERIFIED; 1ST EVIDENCE FOR POSSIBLE ROLE IN ANGIOGENESIS.

COLON CANCER: 18 PAIRED CARCINOMA/NORMAL, 4 PAIRED ADENOMA/NORMAL

Notterman et al, Cancer Res. (2001); Getz et al, Bioinformatics (in print)

[Figure: gene clustering G1(S1), with gene clusters G3 and G8 marked.]

S1(G8): tumor/normal

S1(G3): protocol A / protocol B

[Figure: distance matrix for S1(G8), tumor/normal.]

BREAST CANCER DATA (BOTSTEIN/BROWN LAB; PEROU ET AL, NATURE 2000). I. Kela, G. Getz

Predicting response to doxorubicin treatment; successful for 3/20 patients

20 patients before/after chemotherapy. 10 of the “before” samples are in cluster b; the samples of all 3 successful treatments are in this group.

Intermediate expression level of the G46 genes may serve as a marker for a relatively high success rate of the doxorubicin treatment.
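As a rough check on the counts quoted for the doxorubicin data (3 successes among 20 patients, all 3 inside a cluster of 10), one can ask how likely that would be if cluster membership were unrelated to outcome. The calculation below is an illustration added here, not a result from the slides; it assumes a simple hypergeometric model in which the 10 cluster members are a random subset of the 20 samples.

```python
from math import comb

def prob_all_successes_in_cluster(n_total, n_cluster, n_success):
    """Hypergeometric chance that all n_success responders fall inside a
    randomly chosen cluster of n_cluster out of n_total samples."""
    return comb(n_cluster, n_success) / comb(n_total, n_success)

p = prob_all_successes_in_cluster(20, 10, 3)
print(round(p, 3))        # 0.105

# Success rate inside cluster b versus the whole cohort.
print(3 / 10, 3 / 20)     # 0.3 0.15
```

A chance probability of roughly 0.1 is consistent with the cautious wording "may serve as a marker": the success rate doubles inside the cluster, but with three responders the evidence is limited.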

[Figure: sample clustering S1(G46).]

BREAST CANCER DATA (BOTSTEIN/BROWN LAB)

Sorlie et al, PNAS (2001); Getz et al, Bioinformatics (in print)

Cluster (a): high expression levels of the genes of G33, low survival, mutant p53.

Predictor of survival.

[Figure: S1(G33), annotated with survival and p53 status.]

Gene cluster G36 induces a clear partition into two classes of no known clinical interpretation.


NORMAL HUMAN EPIDERMAL KERATINOCYTES (NHEK)

SQUAMOUS CARCINOMA CELLS (SCC)

IRRADIATE (2m) BY UVB; MEASURE EXPRESSION VS TIME

NHEK: UV t = 0.5, 3, 6, 12, 24 hours; NO UV t = 0, 0.5, 12, 24

SCC: UV t = 0, 6, 12

UV INDUCES DNA DAMAGE, WHICH ELICITS APOPTOTIC RESPONSE. NHEK RESIST APOPTOSIS BY SECRETION OF SURVIVAL FACTORS. THIS RESISTANCE TO APOPTOSIS MAY PROMOTE EMERGENCE OF MALIGNANCY.

Givol, Rechavi, Dazard,... Hilah Gal

Squamous Carcinoma Cells + UVB (SU)

Squamous Carcinoma Cells control (SC)

Normal Keratinocytes control (KC)

Normal Keratinocytes + UVB (KU)

[Figure: Cluster S1(G28) (22 genes); axes: Reordered Samples, Reordered Genes. Sample order: SC 0h, KC 12h, KC 0h, KC 0.5h, KUV 0.5h, KC 24h, SUV 6h, SUV 12h, KUV 3h, KUV 6h, KUV 12h, KUV 24h.]

UV/NON UV SEPARATION INDUCED BY G28: DNA REPAIR (GADD45A, B); ANTIOXIDANT (MT1G); GROWTH FACTORS, INFLAMMATORY MEDIATORS

[Figure: S1(G18), TUMOR/NORMAL (5 genes); axes: Reordered Samples, Reordered Genes. Sample order: KC 12h, KC 0h, SC 0h, KC 0.5h, KUV 0.5h, KC 24h, SUV 6h, SUV 12h, KUV 3h, KUV 6h, KUV 12h, KUV 24h.]

[Figure: S1(G24), TUMOR/NORMAL (31 genes); axes: Reordered Samples, Reordered Genes. Samples: KC 12h, KC 0h, SC 0h, KC 0.5h, KUV 0.5h, KC 24h, SUV 6h, SUV 12h, KUV 3h, KUV 6h, KUV 12h, KUV 24h.]


PRO-APOPTOTIC GENES (PARP, CAS): HIGH IN NHEK, ELEVATED WITH UV

DIABETES, ARTHRITIS: ANTIGEN CHIPS

IRUN COHEN, FRANCISCO QUINTANA, Guy Head, Gaddy Getz, Hila Shtark, Dafna Tsafrir, Gadi Elitzur

SAMPLES: N = 40 SERA FROM 20 HEALTHY SUBJECTS + 20 DIABETES TYPE 1

78 TESTED ANTIGENS (1 BLANK), EACH WITH TWO “MARKERS”: IgG + IgM AND IgM

MEASURE Aij: REACTIVITY OF SERUM j TO ANTIGEN i

M = 176 MEASUREMENTS PER SAMPLE

THE DATA FORM A 176 X 40 ARRAY

EACH ONE OF THE 40 SUBJECTS IS REPRESENTED BY 176 NUMBERS: THE REACTIVITY OF HIS SERUM WITH THE 176 ANTIGENS. THESE 176 NUMBERS A1,j , A2,j , ... A176,j CONSTITUTE THE “REACTIVITY PROFILE” OF SUBJECT j
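In array terms, the reactivity profile of subject j is simply column j of the measurement array. The snippet below is a toy illustration with made-up numbers and a 3 x 2 array in place of the 176 x 40 one; the function names are hypothetical, and indices are 0-based in the code where the slide counts from 1.

```python
from math import sqrt

def reactivity_profile(A, j):
    """Column j of A: the reactivities of serum j to every antigen."""
    return [row[j] for row in A]

def profile_distance(A, j, k):
    """Euclidean distance between the profiles of subjects j and k."""
    pj, pk = reactivity_profile(A, j), reactivity_profile(A, k)
    return sqrt(sum((a - b) ** 2 for a, b in zip(pj, pk)))

# Made-up reactivities: rows are antigen measurements, columns are subjects.
A = [
    [0.1, 0.9],
    [0.4, 0.3],
    [0.8, 0.2],
]

print(reactivity_profile(A, 1))   # [0.9, 0.3, 0.2]
```

Clustering the subjects amounts to grouping columns whose profiles are close; clustering the antigens does the same for rows.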

QUESTION: CAN ONE IDENTIFY PATTERNS OF SIMILARITY BETWEEN THE REACTIVITY PROFILES OF SUBJECTS WITH DIABETES? DO THEY FORM A DISTINCT GROUP FROM HEALTHY SUBJECTS?

ANSWER: NO SEPARATION INTO DIABETES VS HEALTHY IS FOUND WHEN WE USE ALL ANTIGENS TO CHARACTERIZE THE SUBJECTS.

CLUSTERING 176 ANTIGENS, USING THEIR REACTIVITIES WITH 40 SERA:

[Figure: ANTIGEN CLUSTERS, shown alongside the result for RANDOM DATA.]

THE ANTIGENS FORM DISTINCT GROUPS. THERE IS STRUCTURE IN THEIR REACTIVITY PATTERNS.

USE THE STABLE (SIGNIFICANT) ANTIGEN CLUSTERS, ONE AT A TIME, TO CLUSTER THE SUBJECTS.

USING ANTIGEN CLUSTER 1 (Insulin G+M; Collagen1, both) WE GET A GOOD CLASSIFIER: A “DIABETES CLUSTER” CATCHES 17/20 OF THE DIABETES, 4/21 MISTAKES

USING A “MAJORITY VOTE” OF 5 CLASSIFIERS, WE GET 90%
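A majority vote over several classifiers can be sketched in a few lines. The example assumes each of the five classifiers (for instance, one per antigen cluster) has already labeled a subject as "diabetes" or "healthy"; the subject names and votes below are invented for illustration and are not data from the talk.

```python
from collections import Counter

def majority_vote(labels):
    """Return the label chosen by most classifiers.
    With two classes and an odd number of voters there are no ties."""
    return Counter(labels).most_common(1)[0][0]

# Hypothetical votes from 5 classifiers for two subjects.
votes = {
    "subject_a": ["diabetes", "diabetes", "healthy", "diabetes", "healthy"],
    "subject_b": ["healthy", "healthy", "healthy", "diabetes", "healthy"],
}

for subject, labels in votes.items():
    print(subject, majority_vote(labels))
# subject_a diabetes
# subject_b healthy
```

Combining classifiers this way lets a mistake made by one antigen cluster be outvoted by the others.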

PROJECTS: COLLABORATORS; STUDENTS/POSTDOCS

Cancer:
Colon*: D. Notterman; G. Getz*, M. Mashiah*, H. Gal*
Breast*: D. Botstein; I. Kela*, G. Getz*
Glioblastoma*: M. Hegi, S. Goddard; G. Getz*
Skin*: D. Givol*, G. Rechavi, J. Dazard*; H. Gal*
Leukemia: E. Canaani*; O. Ravid*, G. Getz*, H. Agrawal*
P53 primary targets*: D. Givol*, K. Kannan, G. Rechavi; G. Getz*, I. Kela*
P73 primary targets*: D. Givol*, G. Rechavi; I. Kela*
MutP53 as oncogene: V. Rotter*; O. Ravid*
Bone development: D. Gazit; O. Ravid*

Antigen Chips:
Diabetes*: I. Cohen*, F. Quintana*; G. Getz*, G. Hed, D. Tsafrir*
Arthritis: I. Cohen, F. Quintana*; I. Tsafrir*, H. Shtern*, G. Elitzur*

Neurotransmitters: M. Levite*; D. Tsafrir*, I. Tsafrir*
Tissue dependence: D. Lancet*; H. Shtern*
Apoptosis, IL6: L. Sachs*, Y. Lotem*, D. Givol*; H. Gal*
Meiosis in yeast: M. Primig; M. Katzenelenbogen*
Yeast cell cycle*: M. Zhang; G. Getz*, E. Levine
Protein Struct. Classif.*: M. Vendruscolo, G. Getz*
Low-T phase, SpinGlass*: D. Stauffer, P. Young, G. Hed; A. Hartmann, M. Palassini
SPC*: M. Blatt, S. Wiseman; H. Agrawal, N. Shental*
CTWC*: G. Getz*, E. Levine, O. Barad*

* re/preprint available; * Weizmann Inst.; * currently postdoc/student

SUMMARY

DNA CHIPS PROVIDE A POWERFUL TOOL TO STUDY GENE EXPRESSION; ADVANCED METHODS OF ANALYSIS MAY HAVE FAR REACHING CLINICAL & SCIENTIFIC IMPLICATIONS.

THE COUPLED TWO WAY CLUSTERING METHOD SUCCESSFULLY IDENTIFIED RELEVANT STRUCTURE AND MEANING IN CANCER RELATED GENE MICROARRAY DATA.

CTWC SERVER: http://ctwc.weizmann.ac.il

www.weizmann.ac.il/home/fedomany/

www.weizmann.ac.il/physics/complex/compphys

FUNDING: ISF, GIF, Ridgefield Found., Levine Found, NIH, EC