4.matt hall -gsk

The current state of bioinformatics in a big pharma company

Eagle Genomics Symposium, April 5th 2011

Matt Hall, Team Leader, Computational Biology, GSK

Outline of talk

Challenges in large pharma

Brief history of Bioinformatics at GSK

The organisation today

Examples of current work

Major challenges and opportunities

Staff recruitment, training and motivation

Software licensing

Outsourcing

New models for engagement

Quality vs. quantity

Pressures for changes in Pharma

Patent cliff

Industrialisation of R&D

Novelty vs. Me-too drugs

Greater emphasis on safety

Reimbursement for real added benefit

RSC Chemistry World Jan 2009

Pressures for change – re-personalisation of R&D

Pipeline failing to replenish lost revenues

Re-personalise the discovery process through the creation of small focussed teams usually based around a specific disease or therapeutic area (DPUs)

External partnerships

De-risking R&D – remove blockbuster dependency

Rare and Neglected Diseases

Pool resources (ViiV)

For more info see:

Andrew Witty featured in The World in 2011, The Economist (November 2010)

http://www.gsk.com/media/downloads/economist-world-in-2011-gsk-witty.pdf

Or “An Audience With: Patrick Vallance”

Nature Reviews Drug Discovery 9, 834 (November 2010)

GSK Computational Biology – at the heart of MDD

http://www.gsk.com/investors/presentations/2011/david-redfern-economist-pharma-summit-10Feb2011a.pdf

Small isolated bioinformatics groups in Oncology, Biopharm, Biologicals and China

Increasing externalisation in early R&D

http://www.gsk.com/investors/presentations/2011/JPMorgan-Jan11.pdf

How things have changed for Bioinformatics

10-15 years ago the “Genome Gold Rush”

Incyte, HGS, rush for patenting, mining of “drug targets”

Large central bioinformatics group (SB)

Built large in-house bioinformatics infrastructure

Smaller, more agile group in GW

Both had major gene discovery and annotation efforts

Large genetics and genomics groups (data producers)

Large central bioinformatics group built up around managing and analysing platform data

All bioinformatics analysis done in-house (core support group)

Integration of data types around genes

Many databases, tools and websites produced

Group developing cutting edge methods and tools for literature, pathway and network analysis

Restructuring and downsizing

Even in its heyday of ~200 staff, GSK Bioinformatics did not cover all areas of bioinformatics

Other groups in other parts of GSK R&D involved in mathematical modelling (Systems Biology) and Knowledge Management

Software engineering and infrastructure parts were moved to IT

Over time Platform ‘Omics groups were closed down and outsourced

Analysis functions suffered a series of reorganisations and headcount reductions.

Left a small but highly skilled workforce with years of combined experience covering many of the areas listed above.

Legacy systems and databases retired and data archived (by IT)

Therapeutic-area (TA) aligned sub-teams, later to become a fully matrixed team

Computational Biology Remit (GSK)

Our mission: Support the Drug Discovery and Development objectives by integrative analysis of internal and external biomedical data

How? By combining our computational expertise with our biological insight to perform analyses of impact aligned with projects in all parts of the pipeline

GSK Computational Biology Skills

Biology, disease and drug discovery knowledge

Computational skills:

Perl, C, Python, R (Bioconductor), SQL, Unix/Linux shell, Java etc.

Analysis skills:

Quantitative skills

Platform analysis:

Transcriptomics etc. (esp. interpreting data in a biological context)

Sequence analysis (with some protein structure):

Promoters, druggability, SNP, splice variants, protein domain analysis

Phylogenetic analyses: orthology, paralogy, selection

Model organism functional prediction

Pathways/disease networks analysis:

New targets, MOA hypotheses

Molecular Genetic analysis:

Large scale integrative analyses to support Genetics

Literature and textual analysis:

KOLs, Biotech portfolios, externalisation (BD), input into TA rebalancing

Integration of diverse analyses/data sources

Biomarkers

Animal model selection/Comparative Genomics

Environmental risk assessment

Safety

Vaccine Development

Computational Biology & the Drug Discovery Pipeline

Pipeline stages: Target, Preclinical, Clinical, Launch

CB is not only supporting early drug discovery (target identification, target validation etc.)

Target id, validation and project initiation

Repositioning

Disease indications

Due diligences

Mechanism of action

Diagnostics

Off target effects

Biomarkers

Assay design

Screening

Dosing

Drug Repositioning

Drug repositioning (also known as Drug repurposing, Drug re-profiling, Therapeutic Switching and Drug re-tasking) is the application of known drugs and compounds to new indications (i.e., new diseases).

Wikipedia

Infectious Disease at GSK

Antibacterial: Collegeville, PA, USA

Gram positive: e.g. Staphylococcus aureus

Gram negative: e.g. Klebsiella pneumoniae

Diseases of the Developing World: Tres Cantos, Spain

TB: Mycobacterium tuberculosis

Malaria: Plasmodium falciparum

Antiviral: Research Triangle Park, USA

HCV

HIV

ChIP-Seq to study genome wide histone methylation

Development of computational analysis capability

Support drug discovery efforts in Epigenetics

Disease network using pathways

Disease Networks - subset

Major challenges and opportunities (some)

Struggling with data volumes (NGS); getting the right level of support from IT (we are very niche)

Many internal groups require bioinformatics skills

Knowledge management – what is the ROI?

Solid data standards and ontologies

Dealing with clinical data – EHRs and observational data

The Cloud – will it be a viable solution?

“In vogue” areas and their application to drug discovery e.g. Systems Biology. When is the right time to invest?

Move towards greater externalisation and partnerships with academia and biotechs (PPPs). Where do we fit into this model? Deals are struck by our business partners assuming that we can support them.

Lack of internal resource for multiple evaluations of new tools/approaches unless there is a clear need. E.g. ChIP-Seq analysis – clear strategic need from business partner, so pursued aggressively (GSK HPC grid, EMBL training etc.)

Staff – recruitment, training, motivation

There has been a push to raise the profile of the group through publications (in line with GSK-wide strategy)

Staff turnover is low. Important to keep staff motivated, trained and up to date with the latest methodologies. Seek opportunities to share knowledge and skills across the group – staff development planning is a very important part.

Strategies – training workshops (e.g. EBI, Cold Spring Harbor), MSc student projects, in-house training, attendance at scientific conferences, part-time distance learning courses (e.g. MSc modules, OU modules), mentoring etc.

Reward and recognition and personal development – traditional old style career progression path unlikely in current climate.

Mix of computational expertise in the group with common critical skills

Culture of collaboration and sharing of expertise

Importance of communication and interpersonal skills – need to be scientifically multilingual

Good problem solvers who can spot where current technologies and tools can be applied to answer questions

Need to be agile – no time to spend long periods coding up new solutions

Internal training

With cuts in bioinformatics head count there are pressures for a greater level of bioinformatics skills from R&D scientists.

“De-skilling” for many molecular biologists within the company

Core service

Internalise and integrate all public data within the company firewall – unsustainable

Issue of training at the appropriate level

Unix skills are an issue for most R&D scientists

Limited Unix support

Ideally Windows-based or web-based tools

Significant shift required to empower scientists to do more for themselves

Many public resources are aimed at the bioinformatics specialist, but this is changing (EBI user experience, EBI Search etc.)

Security is a critical issue for some areas – Chemistry and Biopharms

Software licensing

Mix of commercial and open-source software

Expression analysis (commercial and public)

NGS analysis (commercial and public)

Pathway/Network analysis (commercial, in-house and public)

Text mining (commercial, in-house, public)

Analysis pipeline tools (commercial)

Nirvana – a work bench of tools we can plug and play but only pay for on a “per use basis”

Focus on reuse and reanalysis of data. Academic data is aimed at producing the “paper”

Vendors – we need great flexibility – prefer a “PAYG” model for tools and applications. Budgets are too tight and the types of work we do are too varied. Concurrent licences are a “must have”, otherwise we get tied to a small handful of staff

We are not responsible for provision of tools for R&D staff; users pay themselves (or IT)

Data content – ideally seek to buy data outright

Opportunities for outsourcing

Key – we need to get good value for money from any such activity and require the minimum of internal resource to manage the activity. Need to factor into this the cost of outsourcing vs cost of hiring a contractor to work

Tools development

Software, algorithms, pipelines etc.

Target curation

Gather every piece of biological information available for Target X and check the quality. E.g.: splice variants, classification, function, sequences and isoforms, motif analysis, SNPs, interactions, therapeutic portfolio (internal and external), expression profile, disease association, pathways etc.

Was done by central group, now up to each project leader

Who would pay for this service?

In silico candidate Biomarker discovery

Using pathway, expression data, literature information etc

Platform data analysis

Any relevant Omics dataset (customer driven)

Data analysis pipelines – for internal data producers

New models for engagement

Data production has become a commodity and tends to be commissioned by project teams (to CROs and/or academic partners). A level of primary analysis (QC and analysis summaries)

We’d like to do more pre-competitive collaborations with academics, but the system tends to favour 1-3 year post-doc/PhD type relationships

Innovative Medicines Initiative projects

Shared risk – innovation deals are being struck by GSK with academics and companies where there is some shared risk and a pay-off for all if successful. For a vendor this may be the opportunity to commercialise the “product” at the end

DPAc – Discovery Partnerships with Academia

Stevenage Bioscience Park

http://imi.europa.eu/

Collaborations

Mix of fee-for-service and collaborations

BGI

Brunel University

Drexel University, Philadelphia

European Bioinformatics Institute

Imperial College

Newcastle University

OrphaNet

Quality vs. quantity

Bespoke re-analysis of data (much of it public)

Leading initiatives rather than service mode

Licensing / paying for access to quantity – focus on ROI

GSK Computational Biology – lean, agile and flexible
