Post on 22-Dec-2015
1
An Array of FDA Efforts in Pharmacogenomics
Weida Tong, Director, Center for Toxicoinformatics, NCTR/FDA
Weida.tong@fda.hhs.gov
CAMDA 08, Boku University, Vienna, Austria, Dec 4-6, 2008
2
Pipeline Problem: Spending More, Getting Less
While research spending (pharma and NIH) has increased, fewer NMEs and BLAs have been submitted to the FDA
[Figure: research spending (R&D spending, NIH budget) and NDAs and BLAs received by FDA (NMEs, BLAs)]
3
The FDA Critical Path to New Medical Products
• Pharmacogenomics and toxicogenomics have been identified as crucial in advancing
– Medical product development
– Personalized medicine
4
Guidance for Industry: Pharmacogenomic Data Submissions
www.fda.gov/cder/genomics
www.fda.gov/cder/genomics/regulatory.htm
5
A Novel Data Submission Path - Voluntary Genomics Data Submission (VGDS)
• Defined in the Guidance for Industry on Pharmacogenomics (PGx) Data Submission (draft document released in 2003; final publication, 2005)
– To encourage sponsors to interact with the FDA by submitting PGx data on a voluntary basis
– To provide a forum for scientific discussions with the FDA outside of the application review process
– To establish a regulatory environment (both the tools and the expertise) within the FDA for receiving, analyzing and interpreting PGx data
6
VGDS Status
• More than 40 submissions have been received
• The submissions contain PGx data from
– DNA microarrays
– Proteomics
– Metabolomics
– Genotyping, including genome-wide association studies (GWAS)
– Others
• Bioinformatics has played an essential role in accomplishing:
– Objective 1: Data repository
– Objective 2: Reproduce the sponsor’s results
– Objective 3: Conduct alternative analysis
7
FDA Genomic Tool: ArrayTrack – Support FDA regulatory research and review
• Developed by NCTR/FDA
– Develop 1: An integrated solution for microarray data management, analysis and interpretation
– Develop 2: Support meta data analysis across various omics platforms and study data
– Develop 3: SNPTrack, a sister product in collaboration with Rosetta
• FDA agency-wide application
– Review tool for the FDA VGDS data submission
– >100 FDA reviewers and scientists have participated in the training
– Integrating with Janus for e-Submission
8
ArrayTrack: An Integrated Solution for omics research
[Figure: microarray data, proteomics data, metabolomics data, chemical data, clinical and non-clinical data, and public data integrated in ArrayTrack]
9
[Figure: Protein, Gene, Metabolite]
10
Specific Functionality Related to VGDS
• Phenotypic anchoring
• Systems Approach
[Figure: clinical pathology data (ClinChem names hidden) shown against genes (gene names hidden)]
11
ArrayTrack - Freely Available to Public
[Figure: number of unique users, calculated quarterly. Panels: Web-access, Local installation.]
• To be consistent with the common practice in the research community
• Over 10 training courses have been offered, including two in Europe
• Education: part of bioinformatics courses at UCLA, UMDNJ and UALR
• Eli Lilly chose ArrayTrack to support its clinical gene-expression studies after rigorously assessing its architectural structure, functionality, security and custom support
12
ArrayTrack Website
http://www.fda.gov/nctr/science/centers/toxicoinformatics/ArrayTrack/
13
MicroArray Quality Control (MAQC) - An FDA-Led Community Wide Effort to Address the Challenges and Issues Identified in VGDS
• QC issue – How good is good enough?
– Assessing the best achievable technical performance of microarray platforms (QC metrics and thresholds)
• Analysis issue – Can we reach a consensus on analysis methods?
– Assessing the advantages and disadvantages of various data analysis methods
• Cross-platform issue – Do different platforms generate different results?
– Assessing cross-platform consistency
14
MAQC Way of Working
Participants: Everyone was welcome; however, cutoff dates had to be imposed.
Cost-sharing: Every participant contributed, e.g., arrays, RNA samples, reagents, time and resources in generating and analyzing the MAQC data
Decision-making:
– Face-to-face meetings (1st, 2nd, 3rd, and 4th)
– Biweekly, regular MAQC teleconferences (>20 times)
– Smaller-scale teleconferences on specific issues (many)
Outcome: Peer-reviewed publication
– Followed the normal journal-defined publication process
– 9 papers submitted to Nature Biotechnology
– 6 accepted and 3 rejected
Transparency:
– MAQC data is freely available at GEO, ArrayExpress, and ArrayTrack
– RNA samples are available from commercial vendors
15
MicroArray Quality Control (MAQC) project – Phase I
• MAQC-I: Technical Performance
– Reliability of microarray technology
– Cross-platform consistency
– Reproducibility of microarray results
• MAQC-II: Practical Application
– Molecular signatures (or classifiers) for risk assessment and clinical application
– Reliability, cross-platform consistency and reproducibility
– Develop guidance and recommendations
[Timeline: Feb 2005, Sept 2006, Dec 2008. MAQC-I: 137 scientists from 51 organizations. MAQC-II: >400 scientists from >150 organizations.]
16
Results from the MAQC-I Study Published in Nature Biotechnology in Sept/Oct 2006
Nat. Biotechnol. 24(9) and 24(10s), 2006
Six research papers:
• MAQC Main Paper
• Validation of Microarray Results
• RNA Sample Titrations
• One-color vs. Two-color Microarrays
• External RNA Controls
• Rat Toxicogenomics Validation
Plus:
• Editorial: Nature Biotechnology
• Foreword: Casciano DA and Woodcock J
• Stanford Commentary: Ji H and Davis RW
• FDA Commentary: Frueh FW
• EPA Commentary: Dix DJ et al.
17
Key Findings from the MAQC-I Study
When standard operating procedures (SOPs) are followed and the data are analyzed properly, the following is demonstrated:
• High within-lab and cross-lab reproducibility
• High cross-platform comparability, including one- vs two-color platforms
• High correlation between quantitative gene expression (e.g. TaqMan) and microarray platforms
– The few discordant measurements were mainly due to probe sequence and thus target location
18
How to determine DEGs - Do we really know what we know
• A circular path for differentially expressed genes (DEGs)
– Fold change (FC) – biologist initiated (frugal approach)
  • Magnitude difference
  • Biological significance
– P-value – statistician joined in (expensive approach)
  • Specificity and sensitivity
  • Statistical significance
– FC (P) – an MAQC finding (statistics got to know its limitations)
  • FC ranking with a nonstringent P-value cutoff, FC (P), should be considered for class comparison studies
  • Reproducibility
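The two combined selection rules named on this slide can be sketched in a few lines. This is a minimal illustration, not the MAQC implementation: the `GeneResult` record, the function names, the gene identifiers and the numbers are all hypothetical, and the p-values are assumed to come from any two-class test computed elsewhere.

```python
import math
from typing import NamedTuple


class GeneResult(NamedTuple):
    gene: str       # hypothetical gene identifier
    log2_fc: float  # log2 fold change between the two classes
    p_value: float  # p-value from a two-class test, computed elsewhere


def rank_fc_with_p_cutoff(results, p_cutoff=0.05, top_n=None):
    """FC(P): keep genes with p < p_cutoff, then rank by |log2 FC|."""
    passed = [r for r in results if r.p_value < p_cutoff]
    ranked = sorted(passed, key=lambda r: abs(r.log2_fc), reverse=True)
    return ranked if top_n is None else ranked[:top_n]


def rank_p_with_fc_cutoff(results, fc_cutoff=2.0, top_n=None):
    """P(FC): keep genes with fold change > fc_cutoff, then rank by p-value."""
    log2_cutoff = math.log2(fc_cutoff)
    passed = [r for r in results if abs(r.log2_fc) > log2_cutoff]
    ranked = sorted(passed, key=lambda r: r.p_value)
    return ranked if top_n is None else ranked[:top_n]


# Made-up example values, for illustration only.
results = [
    GeneResult("g1", 3.0, 0.04),
    GeneResult("g2", -2.5, 0.20),   # large fold change, fails the P cutoff
    GeneResult("g3", 0.4, 0.0001),  # highly significant, tiny fold change
    GeneResult("g4", -1.8, 0.01),
    GeneResult("g5", 1.2, 0.03),
]

fc_p = [r.gene for r in rank_fc_with_p_cutoff(results, p_cutoff=0.05, top_n=3)]
p_fc = [r.gene for r in rank_p_with_fc_cutoff(results, fc_cutoff=2.0)]
```

With FC(P) the nonstringent P cutoff acts only as a filter, so the ordering of the final list is driven by effect size; with P(FC) the roles are swapped.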
19
[Figure: Nature, Science, Nature Methods, Cell, Analytical Chemistry]
20
Post-MAQC-I Study on Reproducibility of DEGs - A Statistical Simulation Study
[Figure: P vs FC, Lab 1 and Lab 2. Panels: sensitivity vs 1-specificity; POG vs log10 list size under FC sorting (reproducibility).]
[Figure: POG (%) vs number of selected DEGs (1 to 10000, log scale), biological replicate (30% noise). Ranking rules compared: P, FC, FC(P<0.01), FC(P<0.05), P(FC>2), P(FC>1.4). Annotated values: 76.2% and 25.0%.]
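The POG axis in these plots can be illustrated with a short sketch. It assumes POG means the percentage of overlapping genes between two DEG lists of equal length, as in the MAQC publications; the function names, the gene identifiers and the list sizes below are hypothetical.

```python
def pog(list_a, list_b):
    """Percentage of overlapping genes between two equal-length DEG lists.

    Assumes each list contains unique gene identifiers.
    """
    if len(list_a) != len(list_b):
        raise ValueError("DEG lists must have the same length")
    if not list_a:
        return 0.0
    overlap = len(set(list_a) & set(list_b))
    return 100.0 * overlap / len(list_a)


def pog_curve(ranked_a, ranked_b, sizes):
    """POG of the top-n genes from two ranked lists, for each n in sizes."""
    return {n: pog(ranked_a[:n], ranked_b[:n]) for n in sizes}


# Made-up rankings from two labs, best-ranked gene first.
lab1 = ["g1", "g2", "g3", "g4", "g5"]
lab2 = ["g2", "g1", "g6", "g3", "g7"]

curve = pog_curve(lab1, lab2, sizes=[1, 2, 5])
```

Plotting such a curve for each ranking rule (P, FC, FC(P), P(FC)) against the number of selected DEGs gives the kind of comparison shown on the slide.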
21
How to determine DEGs - Do we really know what we don’t know
• A struggle between reproducibility and specificity/sensitivity
– A monotonic relationship between specificity and sensitivity
– A “???” relationship between reproducibility and specificity/sensitivity
22
More on Reproducibility
• General impressions (conclusions):
– Reproducibility is a complicated phenomenon
– There is no straightforward way to assess the reproducibility of DEGs
• Reproducibility and statistical power
– More samples, higher reproducibility
• Reproducibility and statistical significance
– Inverse relationship, but not a simple trade-off
• Reproducibility and gene list length
– A complex relationship with the length of the DEG list
• Irreproducible does not mean biologically irrelevant
– If the DEGs from two replicate studies are not reproducible, both could still be true discoveries
23
MicroArray Quality Control (MAQC) project – Phase II
• MAQC-I: Technical Performance
– Reliability of microarray technology
– Cross-platform consistency
– Reproducibility of microarray results
• MAQC-II: Practical Application
– Molecular signatures (or classifiers) for risk assessment and clinical application
– Reliability, cross-platform consistency and reproducibility
– Develop guidance and recommendations
[Timeline: Feb 2005, Sept 2006, Dec 2008. MAQC-I: 137 scientists from 51 organizations. MAQC-II: >400 scientists from >150 organizations.]
24
Application of Predictive Signature
[Figure: predictive signatures in clinical application (pharmacogenomics) and safety assessment (toxicogenomics). Labels: diagnosis, prognosis, treatment, treatment outcome, short term exposure, long term effect, phenotypic anchoring, prediction.]
25
Challenge 1
[Figure: workflow stages Data Set, QC, Preprocessing, Feature Selection, Classifier, Validation]
• Batch effect
• Which QC methods
• Normalization, e.g.: raw data, MAS5, RMA, dChip, PLIER
• How to generate an initial gene pool for modeling: P, FC, P(FC), FC(P) …
• Which methods: KNN, NC, SVM, DT, PLS …
• How to assess the success
– Chemical based prediction
– Animal based prediction
26
Challenge 2: Assessing the Performance of a Classifier
1. Prediction Accuracy: sensitivity, specificity
2. Mechanistic Relevance: biological understanding
3. Robustness: reproducibility of signatures
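For the prediction accuracy criterion, sensitivity and specificity follow directly from the confusion counts of a two-class classifier. The sketch below is illustrative only; the function name and the label vectors are hypothetical, with 1 marking the positive class.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for a two-class prediction."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # positive sample correctly called positive
            else:
                fn += 1  # positive sample missed
        else:
            if pred == positive:
                fp += 1  # negative sample wrongly called positive
            else:
                tn += 1  # negative sample correctly called negative
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up labels: 4 positive and 4 negative samples.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
```

Reporting both numbers matters when the classes are unbalanced, since overall accuracy alone can hide a classifier that rarely detects the minority class.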
27
[Figure: workflow stages Data Set, QC, Preprocessing, Normalization, Feature Selection, Classifier, Validation]
• Freedom of choice (35 analysis teams)
• A consensus approach (12 teams)
• Validation, validation and validation!
28
What We Are Looking For
• Which factors (or parameters) are critical to the performance of a classifier
• A standard procedure to determine these factors
• The procedure should be dataset independent
• A best practice that could be used as guidance for developing microarray-based classifiers
[Figure: workflow stages Data Set, QC, Preprocessing, Normalization, Feature Selection, Classifier, Validation]
29
Three-Step Approach
Step 1: Training set
1. Classifiers
2. Sig. genes
3. DAPs
(Frozen)
Step 2: Blind test set
– Prediction
– Assessment
– Best Practice
Step 3: Future sets
– Validate the Best Practice
– New experiments for selected endpoints
MAQC-II Data Sets
Provider           Dataset                                Size, Step 1 (Training)   Size, Step 2 (Test)
Clinical data
MDACC              Breast cancer                          130                       100
UAMS               Multiple myeloma                       350                       209
Univ. of Cologne   Neuroblastoma                          251                       300
Toxicogenomics data
Hamner             Lung tumor                             70 (18 cmpds)             40 (5 cmpds)
Iconix             Non-genotoxic hepatocarcinogenicity    216                       201
NIEHS              Liver injury (necrosis)                214                       204
31
Where We Are
Step 1: Training set
1. Classifiers
2. Sig. genes
3. DAPs
(Frozen)
Step 2: Blind test set
– Prediction
– Assessment
– Best Practice
Step 3: Future sets
– Validate the Best Practice
– New experiments for selected endpoints
32
18 Proposed Manuscripts
• Main manuscript - Study design and main findings
• Assessing Modeling Factors (4 proposals)
• Prediction Confidence (5 proposals)
• Robustness (3 proposals)
• Mechanistic Relevance (2 proposals)
• Consensus Document (3 proposals)
[Figure: workflow stages Data Set, QC, Preprocessing, Normalization, Feature Selection, Classifier, Validation; assessment criteria Prediction Accuracy, Mechanistic Relevance, Robustness]
33
Consensus Document (3 proposals)
1. Principles of classifier development: Standard Operating Procedures (SOPs)
2. Good Clinical Practice (GCP) in using microarray gene expression data
3. MAQC, VXDS and FDA guidance on genomics
[Figure: Modeling, Assessing, Consensus, Guidance]
34
Best Practice Document
• One of the VGDS and MAQC objectives is to communicate with private industry and the research community to reach consensus on
– How to exchange genomic data (data submission)
– How to analyze genomic data
– How to interpret genomic data
• Lessons learned from VGDS and MAQC have led to the development of a Best Practice Document (led by Federico Goodsaid)
– Companion to the Guidance for Industry on Pharmacogenomic Data Submission (Docket No. 2007D-0310) (http://www.fda.gov/cder/genomics/conceptpaper_20061107.pdf)
– More than 10 pharmaceutical companies have provided comments
35
An Array of FDA Endeavors - Integrated Nature of VGDS, ArrayTrack, MAQC and Best Practice Document
[Figure: ArrayTrack, MAQC, VGDS]
36
Members of the Center for Toxicoinformatics