Hot Topics in Chemoinformatics in the Pharmaceutical Industry David J. Wild, Ph.D. Scientific Computing Consultant, and Adjunct Professor of Pharmaceutical Engineering at the University of Michigan [email protected] www.WildIdeasConsulting.com

Post on 19-Dec-2015


Page 1

Hot Topics in Chemoinformatics in the Pharmaceutical Industry

David J. Wild, Ph.D.

Scientific Computing Consultant, and Adjunct Professor of Pharmaceutical Engineering at the University of Michigan

[email protected]

Page 2

About me

B.Sc. Computer Science

Ph.D. Chemoinformatics (Willett Lab)

Worked for 5 years in Scientific Computing leadership at Pfizer, responsible for the development of computational tools for scientists

Now run a consulting firm based in Ann Arbor, Mich., and am also an Adjunct Professor at the University of Michigan, doing some research.

Wild Ideas Consulting: www.WildIdeasConsulting.com

University of Michigan: www-personal.engin.umich.edu/~wildd

Page 3

What we’ll cover today

• Overview of early-stage drug discovery and the big industry concerns

• Using information and technology together to improve the chances of finding a new drug

• Example – High Throughput Screening

• Some other examples of “hot” areas
  – Genomics & Proteomics Information Handling
  – Virtual Screening
  – Combinatorial Chemistry
  – Design of scientific software

Page 4

Characteristics of the pharmaceutical industry

• Very segmented market – largest company (Pfizer) only has an 11% market share

• High risk, long term – takes 10-20 years to develop a drug, and most drugs fail to get to market

• Highly regulated (by FDA)

• High profit margins for drugs which do make it

• Investors traditionally expect high return on investment

• Four main phases: discovery, development, clinical trials and marketing

Page 5

R&D spending up, new drugs down

Taken from http://www.newscientistjobs.com/biotech/ernstyoung/blues.jsp

Page 6

Drug Discovery & Development

Identify disease

Isolate protein involved in disease (2-5 years)

Find a drug effective against disease protein (2-5 years)

Preclinical testing (1-3 years)

Formulation & Scale-up

Human clinical trials (2-10 years)

FDA approval (2-3 years)

Milestones: File IND (before human clinical trials), File NDA (before FDA approval)
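
As a rough check, the stage durations on this slide can be added up. The sketch below includes only the stages that carry a duration, and it assumes the stages run strictly one after another, which the slide does not state.

```python
# Stage durations in years (min, max), as listed on the slide.
# Assumption: stages are strictly sequential with no overlap.
STAGES = {
    "Isolate protein involved in disease": (2, 5),
    "Find a drug effective against disease protein": (2, 5),
    "Preclinical testing": (1, 3),
    "Human clinical trials": (2, 10),
    "FDA approval": (2, 3),
}


def total_range(stages):
    """Return (shortest, longest) total duration in years."""
    shortest = sum(lo for lo, _ in stages.values())
    longest = sum(hi for _, hi in stages.values())
    return shortest, longest


print(total_range(STAGES))  # (9, 26)
```

The 9 to 26 year span brackets the 10-20 years quoted on the earlier slide about industry characteristics.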

Page 7

Impact of new technology on drug discovery

• The last few years have seen a number of “revolutionary” new technologies:
  – Gene chips, genomics and HGP
  – Bioinformatics & Molecular biology
  – More protein structures
  – High-throughput screening & assays
  – Virtual screening and library design
  – Docking
  – Combinatorial chemistry
  – In-vitro ADME testing
  – Other computational methods

• How do we make it all work for us?

Page 8

Identify disease → Isolate protein → Find drug → Preclinical testing

GENOMICS, PROTEOMICS & BIOPHARM.: Potentially producing many more targets and “personalized” targets

HIGH THROUGHPUT SCREENING: Screening up to 100,000 compounds a day for activity against a target protein

MOLECULAR MODELING: Computer graphics & models help improve activity

VIRTUAL SCREENING: Using a computer to predict activity

COMBINATORIAL CHEMISTRY: Rapidly producing vast numbers of compounds

IN VITRO & IN SILICO ADME MODELS: Tissue and computer models begin to replace animal testing

Page 9

There is little “hard data” on using the new technologies

• In a sense, the drug design process is becoming a big experiment

• Do we continue as before, and carefully introduce new technologies as we deem appropriate, or do we radically change the way things are done?

• Lots of pressure for the new technologies to yield results quickly

• How do we measure the results?

Page 10

Some questions being asked

• Is our increasing spending on R&D and new technologies really going to pay off? Or was it a red herring?

• Is the paucity of drugs in the pipeline because we’re not doing things right, or are we just hitting limits on the number of major diseases with potential treatments still to be found? (“all the low-hanging fruit has gone”)

• Should we be looking in new areas (e.g. “life enhancement” rather than “life saving” or “quality of life”)?

Page 11

What’s being done

• Trying to get attrition right (attrition = drugs dropping out of the pipeline): aim to increase early-stage attrition and reduce late-stage attrition

• Risk analysis – look ideally for low-risk, high-payoff drugs

• Using metrics to monitor successes and failures

Page 12

Analyzing risk

High risk, low payoff     High risk, high payoff

Low risk, low payoff      Low risk, high payoff
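
One way to read the four quadrants is as a simple classification of projects. The sketch below is illustrative only: the function name, the inputs (an estimated failure probability and a payoff in arbitrary units) and both cutoffs are assumptions, not values from the talk.

```python
def risk_quadrant(p_fail, payoff, risk_cutoff=0.5, payoff_cutoff=1.0):
    """Place a project in one of the four risk/payoff quadrants.

    p_fail: estimated probability of failure (0..1), hypothetical input.
    payoff: estimated payoff in arbitrary units, hypothetical input.
    Both cutoffs are illustrative assumptions, not industry values.
    """
    risk = "High risk" if p_fail >= risk_cutoff else "Low risk"
    reward = "high payoff" if payoff >= payoff_cutoff else "low payoff"
    return f"{risk}, {reward}"


# Hypothetical projects: (probability of failure, payoff).
projects = {"A": (0.9, 0.2), "B": (0.3, 2.5)}
for name, (p_fail, payoff) in projects.items():
    print(name, "->", risk_quadrant(p_fail, payoff))
```

Project B lands in the "Low risk, high payoff" quadrant, the one the previous slide says to look for ideally.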

Page 13

Using metrics to monitor improvement

• Split the discovery process into discrete units, with key decisions at the end of each unit.

• Come up with measurable properties that can be used to gauge success

• Look for good and bad decisions and why they were made

Stage               Decision Point
Target exploration  Go with this target?
HTS                 Was the screen successful?
HTS Analysis        Follow up these 5-10 series
Series Followup     Produce 2-3 lead compounds
ADME study          Are compounds safe?
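
A measurable property of this kind could be the attrition at each stage. The following is a minimal sketch; all counts are hypothetical and chosen only to show the calculation.

```python
# Hypothetical numbers of items entering each stage; not real data.
entering = [
    ("HTS", 1000),
    ("HTS Analysis", 200),
    ("Series Followup", 40),
    ("ADME study", 10),
]
survivors_of_last_stage = 4  # hypothetical


def attrition_by_stage(entering, final_survivors):
    """Return {stage: fraction dropped during that stage}."""
    counts = [n for _, n in entering] + [final_survivors]
    return {
        stage: 1 - counts[i + 1] / counts[i]
        for i, (stage, _) in enumerate(entering)
    }


rates = attrition_by_stage(entering, survivors_of_last_stage)
for stage, rate in rates.items():
    print(f"{stage:16s} attrition {rate:.0%}")
```

Tracking these fractions over several projects would show whether attrition is moving toward the early stages, as the earlier slide on attrition aims for.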

Page 14

Summary

• The pharmaceutical industry is a high-risk industry with very long development times and short product lifespans

• There has been a lot of investment in new technologies for early stage drug discovery, but so far these are not resulting in more drug candidates (or profits)

• Companies are looking at ways to address this problem including managing attrition, risk analysis and metrics.

Page 15

How Chemoinformatics can help out

• Producing and managing information for metrics

• In-silico analysis to reduce risk, e.g.
  – Virtual screening
  – Library design
  – Docking
  – Cost/benefit analyses

• Making information available at the right time and the right place

• Needs to be integrated into processes

Page 16

An example: High-Throughput Screening

Screening perhaps millions of compounds in a corporate collection to see if any show activity against a certain disease protein

Page 17

High-Throughput Screening

• Drug companies now have millions of samples of chemical compounds

• High-throughput screening can test 100,000 compounds a day for activity against a protein target

• Maybe tens of thousands of these compounds will show some activity for the protein

• The chemist needs to intelligently select the 2-3 classes of compounds that show the most promise as drugs to follow up
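
The throughput figure gives a quick feel for how long a full-collection screen takes. In this sketch the 2-million-compound collection size is an assumption; the slide only says "millions".

```python
import math


def days_to_screen(collection_size, per_day=100_000):
    """Whole days needed to screen a collection at a given daily throughput."""
    return math.ceil(collection_size / per_day)


# Assumption: a hypothetical corporate collection of 2 million compounds.
print(days_to_screen(2_000_000))  # 20
```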

Page 18

Informatics Implications

• Need to be able to store chemical structure and biological data for millions of datapoints
  – Computational representation of 2D structure

• Need to be able to organize thousands of active compounds into meaningful groups
  – Use cluster analysis or machine learning methods to group similar structures together and relate to activity

• Need to learn as much information as possible from the data (data mining)
  – Apply statistical methods to the structures and related information
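
Grouping "similar structures" needs a similarity measure. A common choice, used here as an illustration rather than as the method behind any tool in this talk, is the Tanimoto coefficient on structural fingerprints. The fingerprints below are hand-written, hypothetical sets of "on" bits; in practice they would come from a cheminformatics toolkit.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints stored as sets of 'on' bits."""
    if not fp_a and not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


# Hypothetical fingerprints: each integer stands for a structural fragment
# present in the compound.
fingerprints = {
    "cmpd_1": {1, 4, 7, 9},
    "cmpd_2": {1, 4, 7, 12},
    "cmpd_3": {2, 3, 15},
}

print(tanimoto(fingerprints["cmpd_1"], fingerprints["cmpd_2"]))  # 0.6
print(tanimoto(fingerprints["cmpd_1"], fingerprints["cmpd_3"]))  # 0.0
```

Storing the fingerprint as a set keeps the example short; a bit vector would be the usual choice for millions of compounds.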

Page 19

HTS Tools – Tripos SAR Navigator

SAR Navigator is © Tripos, inc., www.tripos.com

Page 20

BioReason ClassPharmer

• Clusters actives into groups representing series

• Attempts to find a scaffold using MCS algorithm

• Recovers inactives back into series

• Presents series as rows in a “spreadsheet” view

• Gives other statistics on series, such as activity distribution

• http://www.bioreason.com

Page 21

BioReason ClassPharmer

www.bioreason.com

Page 22

BioReason ClassPharmer

www.bioreason.com

Page 23

Strategy for “HTS Triage”

• Run HTS

• Decide which compounds are “active” and which are “inactive”

• Cluster the actives to put them into series

• Visualize clusters of actives (showing 2D structures) and pick series of interest

• Identify “scaffold” for each series

• Use similarity or substructure search on inactives to find inactives related to these series

• Use SAR techniques to discover differences between actives and inactives in a series
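
The computational steps of this strategy can be sketched in a few lines. This is not how SAR Navigator or ClassPharmer work internally. It assumes set-based fingerprints, a percent-inhibition activity cutoff, and simple leader clustering, with each series leader standing in for the "scaffold". All names, cutoffs and data are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints stored as sets."""
    return len(a & b) / len(a | b) if (a or b) else 0.0


def triage(results, active_cutoff=50.0, sim_cutoff=0.5):
    """results: {compound_id: (fingerprint_set, percent_inhibition)}."""
    actives = {c: fp for c, (fp, act) in results.items() if act >= active_cutoff}
    inactives = {c: fp for c, (fp, act) in results.items() if act < active_cutoff}

    # Leader clustering: each active joins the first series whose leader is
    # similar enough, otherwise it starts a new series. Most active first.
    series = []
    for cid in sorted(actives, key=lambda c: -results[c][1]):
        fp = actives[cid]
        for s in series:
            if tanimoto(fp, s["leader_fp"]) >= sim_cutoff:
                s["actives"].append(cid)
                break
        else:
            series.append({"leader_fp": fp, "actives": [cid], "inactives": []})

    # Recover inactives that resemble a series leader, for SAR comparison.
    for cid, fp in inactives.items():
        for s in series:
            if tanimoto(fp, s["leader_fp"]) >= sim_cutoff:
                s["inactives"].append(cid)
                break
    return series


# Hypothetical screen results.
results = {
    "a1": ({1, 2, 3, 4}, 90.0),
    "a2": ({1, 2, 3, 5}, 75.0),
    "a3": ({10, 11, 12}, 60.0),
    "i1": ({1, 2, 3, 6}, 5.0),
    "i2": ({20, 21}, 2.0),
}
for s in triage(results):
    print(s["actives"], s["inactives"])
```

The visualization and SAR steps are left out; the output of `triage` is the grouping those steps would start from.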

Page 24

Information generated at different points in the Drug Design process

• Gene chip experiments
• Project selection decisions
• Assay protocols
• HTS results
• Series selection decisions
• SAR studies
• Protein structures
• Combinatorial Expts.
• Pharmacophores
• ADME studies
• Toxicology studies
• Scaleup reactions
• Lead cmpd decisions
• Clinical Trials data
• Doctor/patient studies
• Marketing, surveys, etc

Milestones shown on the timeline: File IND, File NDA

Page 25

Information generated at different sites

Page 26

Distributed goals model

Page 27

Shared goals model

Page 28

Information storage breakdowns

• Large amounts of information generated:
  – Some is not kept at all
  – Some is kept but loses its meaning

• Often data is kept, but not semantics or decisions
  – e.g. keep “the HVX2 assay result for this compound was 3.2”, but not what the assay protocol was, whether the compound was considered ‘active’, nor whether it was followed up on.

• “Bigger picture” or derivative information is usually not stored
  – E.g. “all the compounds with a tri-methyl group seemed to have much lower activity for this project”
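
To make the point concrete, here is a sketch of a record that keeps the semantics and decisions alongside the number. The class and field names, the identifier, the units and the protocol reference are all hypothetical; the slide gives only the assay name and the value 3.2.

```python
from dataclasses import dataclass, field


@dataclass
class AssayResult:
    compound_id: str
    assay: str
    value: float
    units: str                # what the number means
    protocol_ref: str         # pointer to the assay protocol that was used
    considered_active: bool   # the decision, not just the data
    followed_up: bool
    notes: list = field(default_factory=list)  # "bigger picture" observations


record = AssayResult(
    compound_id="CMPD-0001",           # hypothetical identifier
    assay="HVX2",
    value=3.2,
    units="uM IC50",                   # assumption: the slide gives no units
    protocol_ref="protocols/hvx2-v1",  # hypothetical reference
    considered_active=True,
    followed_up=False,
    notes=["tri-methyl analogues seemed to have much lower activity"],
)
print(record)
```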

Page 29

Information access breakdowns

• Some information is only available in one physical location

• Some information is only available within one part of the discovery process

• Often information is not “contextualized” for use outside a particular domain

• Someone knows exactly what piece of information they need, and it exists, but they don’t know how to access it.
  – E.g. what system to use, what Oracle table it’s in, or even whether that piece of information exists at all!

Page 30

“Missed opportunities”

• Not a specific breakdown, but if the right piece of information had been available at the right time, better decisions could have been made

• E.g.
  – A group of compounds is being followed up as potential drugs, but a rival company just applied for a patent on the compounds
  – A large amount of money is being spent developing an HTS assay for a target, but marketing research shows any drug is unlikely to be a success
  – A group of compounds is selected from an HTS as good candidates for follow up, but 20 years ago they were followed up for a similar project and had severe solubility problems

Page 31

Information use breakdowns

• The meaning of data is incorrectly interpreted

• A single piece of information is used, whilst using a wider range of information would lead to different conclusions

• Lessons learned from one project are incorrectly applied to another

• “Fuzzy” information is taken as concrete information

Page 32

What do we do?

• No large company has really solved the problem

• But ongoing attempts include:
  – Defining information produced and needed at each stage of the discovery process
  – Improving processes to be more consistent, especially across different sites
  – Improving information flow between departments and sites
  – Harmonizing terminology across disciplines and sites
  – Defining needed “management information” as well as raw data
  – Looking for “quick win” opportunities

• This will presumably impact what is stored in databases and what software is used
  – Oracle Chemistry Cartridges help

Page 33

Some Other Examples

Genomics & Proteomics Information Handling

Virtual Screening

Combinatorial Chemistry

Design of scientific software

Page 34

Genomics & Proteomics Information Handling

Understanding the link between diseases, genetic makeup and expression of proteins

Page 35

Genomics

• Genomics is fast-forwarding our understanding of how DNA, genes, proteins and protein function are related, in both normal and disease conditions

• The Human Genome Project has mapped the genes in human DNA

• Hope is that this understanding will provide many more potential protein targets

• Allows potential “personalization” of therapies

[Diagram: DNA sequence ATACGGATTATGCCTA → functions]

Page 36

Gene Chips

• “Gene chips” allow us to look for changes in protein expression for different people with a variety of conditions, and to see if the presence of drugs changes that expression

• Makes possible the design of drugs to target different phenotypes

[Diagram: compounds administered × people / conditions (e.g. obese, cancer, caucasian) → expression profile (screen for 35,000 genes)]

Page 37

“Chemogenomics” from Vertex

Video: http://www.vrtx.com/Chemogenonone.html

Page 38

Virtual Screening

• Build a computational model of activity for a particular target

• Use model to score compounds from “virtual” or real libraries

• Use scores to decide which to make, or pass through a real screen
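
The three steps above can be sketched with a deliberately simple stand-in for the model: a compound's score is its best Tanimoto similarity to any known active. This similarity-based scoring is an assumption made for illustration; it is not one of the specific model types discussed in this talk. The fingerprints and compound names are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints stored as sets."""
    return len(a & b) / len(a | b) if (a or b) else 0.0


def build_model(known_actives):
    """Stand-in 'model': score = best similarity to any known active."""
    def score(fp):
        return max(tanimoto(fp, ref) for ref in known_actives)
    return score


def virtual_screen(library, score, top_n):
    """Score every compound and return the top_n ids, best first."""
    ranked = sorted(library, key=lambda cid: score(library[cid]), reverse=True)
    return ranked[:top_n]


# Hypothetical data: fingerprints as sets of fragment ids.
known_actives = [{1, 2, 3, 4}, {7, 8, 9}]
library = {
    "v1": {1, 2, 3, 5},
    "v2": {7, 8, 9},
    "v3": {20, 21, 22},
    "v4": {1, 2, 30, 31},
}
score = build_model(known_actives)
print(virtual_screen(library, score, top_n=2))  # ['v2', 'v1']
```

The `top_n` cut is where the "decide which to make, or pass through a real screen" choice happens; any scoring function with the same signature could be swapped in.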

Page 39

Computational Models of Activity

• Machine Learning Methods
  – E.g. Neural nets, Bayesian nets, SVMs, Kohonen nets
  – Train with compounds of known activity
  – Predict activity of “unknown” compounds

• Scoring methods
  – Profile compounds based on properties related to target

• Fast Docking
  – Rapidly “dock” 3D representations of molecules into 3D representations of proteins, and score according to how well they bind

Page 40

Present molecules to model

• We may want to virtually screen:
  – All of a company’s in-house compounds, to see which to screen first
  – A compound collection that could be purchased
  – A potential combinatorial chemistry library, to see if it is worth making, and if so which to make

• The model will come out with either a prediction of how well each molecule will bind, or a score for each molecule

Page 41

Combinatorial Chemistry

• By combining molecular “building blocks”, we can create very large numbers of different molecules very quickly.

• Usually involves a “scaffold” molecule, and sets of compounds which can be reacted with the scaffold to place different structures on “attachment points”.

Page 42

Example Combinatorial Library

[Figure: scaffold containing an NH group, with attachment points R1, R2 and R3]

“R”-groups

R1 = OH, OCH3, NH2, Cl, COOH

R2 = phenyl, OH, NH2, Br, F, CN

R3 = CF3, NO2, OCH3, OH, phenoxy

[Figure: example structures from the library]

For this small library, the number of possible compounds is

5 x 6 x 5 = 150
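
The count can be confirmed by enumerating every combination of the R-groups listed on this slide. The sketch treats each compound as a tuple of substituent labels; it does not build actual chemical structures.

```python
from itertools import product

# R-group lists as given on the slide.
R1 = ["OH", "OCH3", "NH2", "Cl", "COOH"]
R2 = ["phenyl", "OH", "NH2", "Br", "F", "CN"]
R3 = ["CF3", "NO2", "OCH3", "OH", "phenoxy"]

# One tuple (r1, r2, r3) per virtual compound.
library = list(product(R1, R2, R3))

print(len(library))  # 150
print(library[0])    # ('OH', 'phenyl', 'CF3')
```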

Page 43

Combinatorial Chemistry Issues

• Which R-groups to choose

• Which libraries to make
  – “Fill out” existing compound collection?
  – Targeted to a particular protein?
  – As many compounds as possible?

• Computational profiling of libraries can help
  – “Virtual libraries” can be assessed on computer

Page 44

Design of Scientific Software

Problems with scientific software tend to occur because of deficiencies in one of three areas:

Software Relevance

Software Usability

Software Management

Page 45

Software Relevance

• To be able to make software relevant requires the software designer to understand:
  – the science, i.e. the domain
  – the scientific computing techniques that are used in the domain
  – the possibilities and limitations of software development.

• Even with this, it’s hard to match the things we can do with the things that people want or need to do

• Techniques like personas and contextual inquiry simply help us understand the people who use the software, their goals, and tasks they want to do

Page 46

Software relevance: Bridge between computation & science

chemoinformatics
  tools: clustering, sim. searching, activity models, scaffold detection, docking, logP calculation
  tasks: “doing a cluster analysis”, “identifying activity-related fragments”

?

science
  tasks: work out a chemical synthesis, choose good reagents, try and document some reactions
  goals: e.g. produce compounds that have high biological activity

Page 47

Software Usability

• We tend to focus on the method and the science, but not on how easy it is for people to get their job done using the software

• Programmers tend to make software intuitive for themselves, but not necessarily for the people it is designed for

• A usability lab and other techniques can make a HUGE difference to the satisfaction of users and programmers alike!

Page 48

Software Management

• Disparate set of tools & platforms

• Disparate programming styles, languages

• A variety of people tend to be writing software
  – Trained software developers
  – Enthusiastic scientists
  – Scientific computing specialists

• Focus on the science tends to mean software management is neglected

• Everyone hates traditional software management “rules”

• But there are ways of making everything work better and having more fun doing it!

• Have a recommended basic setup that should help a lot

Page 49

Foundation reading

• “The Inmates Are Running the Asylum” by Alan Cooper

• “Contextual Design” by Hugh Beyer and Karen Holtzblatt

• “Usability Engineering” by Jakob Nielsen

• “The Visual Display of Quantitative Information” by Edward Tufte

• “Don’t Make Me Think!” by Steve Krug

• See also www.WildIdeasConsulting.com

Page 50

Summary

• R&D in the pharmaceutical industry is undergoing a lot of technological changes, and there is pressure to make the investment pay off

• There is a big need to sensibly use the large amounts of chemical and biological-related information produced in the process

• Thoughtful use of chemoinformatics methods and software is becoming crucial to the success of drug discovery