The current state of bioinformatics in a big pharma company
Eagle Genomics Symposium, April 5th 2011
Matt Hall, Team Leader, Computational Biology, GSK
Outline of talk
Challenges in large pharma
Brief history of Bioinformatics at GSK
The organisation today
Examples of current work
Major challenges and opportunities
Staff recruitment, training and motivation
Software licensing
Outsourcing
New models for engagement
Quality vs. quantity
Pressures for changes in Pharma
Patent cliff
Industrialisation of R&D
Novelty vs. Me-too drugs
Greater emphasis on safety
Reimbursement for real added benefit
RSC Chemistry World Jan 2009
Pressures for change – re-personalisation of R&D
Pipeline failing to replenish lost revenues
Re-personalise the discovery process through the creation of small focussed teams usually based around a specific disease or therapeutic area (DPUs)
External partnerships
De-risking R&D – remove blockbuster dependency
Rare and Neglected Diseases
Pool resources (ViiV)
For more info see:
Andrew Witty featured in The World in 2011, The Economist (November 2010)
http://www.gsk.com/media/downloads/economist-world-in-2011-gsk-witty.pdf
Or “An Audience With: Patrick Vallance”
Nature Reviews Drug Discovery 9, 834 (November 2010)
GSK Computational Biology – at the heart of MDD
http://www.gsk.com/investors/presentations/2011/david-redfern-economist-pharma-summit-10Feb2011a.pdf
Small isolated bioinformatics groups in Oncology, Biopharm, Biologicals and China
Increasing externalisation in early R&D
http://www.gsk.com/investors/presentations/2011/JPMorgan-Jan11.pdf
How things have changed for Bioinformatics
10-15 years ago: the “Genome Gold Rush”
Incyte, HGS, rush for patenting, mining of “drug targets”
Large central bioinformatics group (SB)
Built large in-house bioinformatics infrastructure
Smaller, more agile group in GW
Both had major gene discovery and annotation efforts
Large genetics and genomics groups (data producers)
Large central bioinformatics group built up around managing and analysing platform data
All bioinformatics analysis done in-house (core support group)
Integration of data types around genes
Many databases, tools and websites produced
Group developing cutting edge methods and tools for literature, pathway and network analysis
Restructuring and downsizing
Even in its heyday of ~200 staff, GSK Bioinformatics did not cover all areas of bioinformatics
Other groups in other parts of GSK R&D involved in mathematical modelling (Systems Biology) and Knowledge Management
Software engineering and infrastructure parts were moved to IT
Over time Platform ‘Omics groups were closed down and outsourced
Analysis functions suffered a series of reorganisations and headcount reductions.
Left a small but highly skilled workforce with years of combined experience covering many of the areas listed above.
Legacy systems and databases retired and data archived (by IT)
Therapeutic area (TA) aligned sub-teams, later to become a fully matrixed team
Computational Biology Remit (GSK)
Our mission: Support the Drug Discovery and Development objectives by integrative analysis of internal and external biomedical data
How? By combining our computational expertise with our biological insight to perform analyses of impact aligned with projects in all parts of the pipeline
GSK Computational Biology Skills
Biology, disease and drug discovery knowledge
Computational skills:
Perl, C, Python, R (Bioconductor), SQL, Unix/Linux shell, Java etc.
Analysis skills:
Quantitative skills
Platform analysis:
Transcriptomics etc. (esp. interpreting data in a biological context)
Sequence analysis (with some protein structure):
Promoters, druggability, SNP, splice variants, protein domain analysis
Phylogenetic analyses: orthology, paralogy, selection
Model organism functional prediction
Pathways/disease networks analysis:
New targets, MOA hypotheses
Molecular Genetic analysis:
Large scale integrative analyses to support Genetics
Literature and textual analysis:
KOLs, Biotech portfolios, externalisation (BD), input into TA rebalancing
Integration of diverse analyses/data sources
Biomarkers
Animal model selection/Comparative Genomics
Environmental risk assessment
Safety
Vaccine Development
Computational Biology & the Drug Discovery Pipeline
Target, Preclinical, Clinical, Launch
CB is not only supporting early drug discovery (target identification, target validation etc.)
Target id, validation and project initiation
Repositioning
Disease indications
Due diligences
Mechanism of action
Diagnostics
Off target effects
Biomarkers
Assay design
Screening
Dosing
Drug Repositioning
Drug repositioning (also known as Drug repurposing, Drug re-profiling, Therapeutic Switching and Drug re-tasking) is the application of known drugs and compounds to new indications (i.e., new diseases).
Wikipedia
Infectious Disease at GSK
Antibacterial (Collegeville, PA, USA)
Gram positive: e.g. Staphylococcus aureus
Gram negative: e.g. Klebsiella pneumoniae
Diseases of the Developing World (Tres Cantos, Spain)
TB: Mycobacterium tuberculosis
Malaria: Plasmodium falciparum
Antiviral (Research Triangle Park, USA)
HCV
HIV
ChIP-Seq to study genome wide histone methylation
Development of computational analysis capability
Support drug discovery efforts in Epigenetics
Disease network using pathways
Disease Networks - subset
Major challenges and opportunities (some)
Struggling with data volumes (NGS) and getting the right level of support from IT (we are very niche)
Many internal groups require bioinformatics skills
Knowledge management – what is the ROI?
Solid data standards and ontologies
Dealing with clinical data – EHRs and observational data
The Cloud – will it be a viable solution?
“In vogue” areas and their application to drug discovery e.g. Systems Biology. When is the right time to invest?
Move towards greater externalisation and partnerships with academics and biotechs (PPPs). Where do we fit into this model? Deals are struck by our business partners assuming that we can support them.
Lack of internal resource for multiple evaluations of new tools/approaches unless there is a clear need. E.g. ChIP-Seq analysis – clear strategic need from business partner, so pursued aggressively (GSK HPC grid, EMBL training etc.)
Staff – recruitment, training, motivation
There has been a push to raise the profile of the group through publications (in line with GSK-wide strategy)
Staff turnover is low. Important to keep staff motivated, trained and up to date with the latest methodologies. Seek opportunities to share knowledge and skills across the group – staff development planning is a very important part.
Strategies – training workshops (e.g. EBI, Cold Spring Harbor), MSc student projects, in-house training, attendance at scientific conferences, part-time distance learning courses (e.g. MSc modules, OU modules), mentoring etc.
Reward and recognition and personal development – traditional old style career progression path unlikely in current climate.
Mix of computational expertise in the group with common critical skills; culture of collaboration and sharing of expertise
Importance of communication and interpersonal skills – need to be scientifically multilingual
Good problem solvers who can spot where current technologies and tools can be applied to answer questions
Need to be agile – no time to spend long periods coding up new solutions
Internal training
With cuts in bioinformatics head count there are pressures for a greater level of bioinformatics skills from R&D scientists.
“De-skilling” for many molecular biologists within the company
Core service
Internalise and integrate all public data within the company firewall – unsustainable
Issue of training at the appropriate level
Unix skills are an issue for most R&D scientists
Limited Unix support
Ideally Windows-based or web-based tools
Significant shift required to empower scientists to do more for themselves
Many public resources are aimed at the bioinformatics specialist, but this is changing (EBI user experience, EBI Search etc.)
Security is a critical issue for some areas – Chemistry and Biopharms
Software licensing
Mix of commercial and open-source software
Expression analysis (commercial and public)
NGS analysis (commercial and public)
Pathway/Network analysis (commercial, in-house and public)
Text mining (commercial, in-house, public)
Analysis pipeline tools (commercial)
Nirvana – a work bench of tools we can plug and play but only pay for on a “per use basis”
Focus on reuse and reanalysis of data. Academic data is aimed at producing the “paper”
Vendors – we need great flexibility – prefer a “PAYG” model for tools and applications. Budgets are too tight and the types of work we do are too varied. Concurrent licences are a “must have”, otherwise we get tied to a small handful of staff
We are not responsible for provision of tools for R&D staff; users pay themselves (or IT)
Data content – ideally seek to buy data outright
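The trade-off between a “PAYG” model and a flat licence can be sketched as simple break-even arithmetic. This is a minimal illustration only: the function names and every price in it are hypothetical placeholders, not GSK or vendor figures.

```python
import math


def payg_cost(uses_per_year: int, price_per_use: float) -> float:
    """Total yearly cost under a pay-per-use model."""
    return uses_per_year * price_per_use


def break_even_uses(annual_licence_fee: float, price_per_use: float) -> int:
    """Smallest number of uses per year at which a flat annual licence
    costs no more than paying per use."""
    if price_per_use <= 0:
        raise ValueError("price_per_use must be positive")
    return math.ceil(annual_licence_fee / price_per_use)


def cheaper_option(uses_per_year: int, annual_licence_fee: float,
                   price_per_use: float) -> str:
    """Return 'PAYG' if paying per use is strictly cheaper, else 'licence'."""
    if payg_cost(uses_per_year, price_per_use) < annual_licence_fee:
        return "PAYG"
    return "licence"


# Hypothetical numbers for illustration only.
EXAMPLE_FEE = 10_000.0      # assumed flat annual licence fee
EXAMPLE_PER_USE = 250.0     # assumed price per analysis run
EXAMPLE_BREAK_EVEN = break_even_uses(EXAMPLE_FEE, EXAMPLE_PER_USE)
```

With varied, occasional work the number of uses per tool tends to sit below the break-even point, which is the situation in which a per-use model is attractive.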
Opportunities for outsourcing
Key – we need to get good value for money from any such activity and require the minimum of internal resource to manage the activity. Need to factor into this the cost of outsourcing vs. the cost of hiring a contractor to do the work
Tools development
Software, algorithms, pipelines etc.
Target curation
Gather every piece of biological information available for Target X and check the quality. E.g.: splice variants, classification, function, sequences and isoforms, motif analysis, SNPs, interactions, therapeutic portfolio (internal and external), expression profile, disease association, pathways etc.
Was done by central group, now up to each project leader
Who would pay for this service?
In silico candidate Biomarker discovery
Using pathway, expression data, literature information etc.
Platform data analysis
Any relevant Omics dataset (customer driven)
Data analysis pipelines – for internal data producers
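The target curation checklist listed above (“gather every piece of biological information available for Target X and check the quality”) could be tracked with a simple completeness check. The sketch below is an assumption about how such a record might be represented; the field names are taken from the list in the talk, while the record layout and function names are hypothetical.

```python
from typing import Dict, List

# Evidence categories named in the target curation slide.
CURATION_FIELDS = (
    "splice_variants",
    "classification",
    "function",
    "sequences_and_isoforms",
    "motif_analysis",
    "snps",
    "interactions",
    "therapeutic_portfolio",
    "expression_profile",
    "disease_association",
    "pathways",
)


def missing_fields(record: Dict[str, object]) -> List[str]:
    """Return the curation categories that are absent or empty in a record."""
    return [field for field in CURATION_FIELDS if not record.get(field)]


def completeness(record: Dict[str, object]) -> float:
    """Fraction of curation categories that have some content."""
    filled = len(CURATION_FIELDS) - len(missing_fields(record))
    return filled / len(CURATION_FIELDS)


# Hypothetical, partially curated record for "Target X".
target_x = {
    "classification": "placeholder classification",
    "pathways": ["placeholder pathway"],
    "snps": [],  # an empty entry counts as not yet curated
}
```

A check like this only reports whether each category has content; judging the quality of that content still needs a curator.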
New models for engagement
Data production has become a commodity and tends to be commissioned by project teams (to CROs and/or academic partners). A level of primary analysis (QC and analysis summaries)
We’d like to do more pre-competitive collaborations with academics but the system tends to favour 1-3 year post-doc/PhD type relationships
Innovative Medicines Initiative projects
Shared risk – innovation deals are being struck by GSK with academics and companies where there is some shared risk, with a pay-off for all if successful. For a vendor this may be the opportunity to commercialise the “product” at the end
DPAc – Discovery Partnerships with Academia
Stevenage Bioscience Park
Collaborations
Mix of fee-for-service and collaborations
BGI
Brunel University
Drexel University, Philadelphia
European Bioinformatics Institute
Imperial College
Newcastle University
OrphaNet
Quality vs. quantity
Bespoke re-analysis of data (much of it public)
Leading initiatives rather than service mode
Licensing / paying for access to quantity – focus on ROI
GSK Computational Biology – lean, agile and flexible