
Laia Pujol Priego, Jonathan Wareham October – 2018

EN

Open Targets

Open Science Monitor Case Study


European Commission Directorate-General for Research and Innovation Directorate A — Policy Development and Coordination Unit A.2 — Open Data Policy and Science Cloud E-mail [email protected] [email protected] European Commission B-1049 Brussels

Manuscript completed in July 2018.

This document has been prepared for the European Commission; however, it reflects the views only of the authors, and the Commission cannot be held responsible for any use which may be made of the information contained therein.

More information on the European Union is available on the internet (http://europa.eu).

Luxembourg: Publications Office of the European Union, 2018

PDF ISBN 978-92-79-96834-1 doi: 10.2777/2832 KI-05-18-020-EN-N

© European Union, 2018. Reuse is authorised provided the source is acknowledged. The reuse policy of European Commission documents is regulated by Decision 2011/833/EU (OJ L 330, 14.12.2011, p. 39).

For any use or reproduction of photos or other material that is not under the EU copyright, permission must be sought directly from the copyright holders.


EUROPEAN COMMISSION

OPEN TARGETS Open Science Monitor Case Study

2018 Directorate-General for Research and Innovation EN


Table of contents

Acknowledgements
1 Introduction
2 Background
3 Drivers
4 Barriers
5 Impact
5.1 For Science
5.2 For Industry
5.3 For Society
6 Lessons Learnt
7 Policy conclusions
References


STUDY ON OPEN SCIENCE: MONITORING TRENDS AND DRIVERS (Reference: PP-05622-2017)


Acknowledgements

Disclaimer: The information and views set out in this study report are those of the author(s) and do not necessarily reflect the official opinion of the Commission. The Commission does not guarantee the accuracy of the data included in this case study. Neither the Commission nor any person acting on the Commission’s behalf may be held responsible for the use which may be made of the information contained therein.

The case study is part of Open Science Monitor led by the Lisbon Council together with CWTS, ESADE and Elsevier.

Authors

Laia Pujol Priego – Ramon Llull University, ESADE

Jonathan Wareham – Ramon Llull University, ESADE


1 Introduction

Open Targets is an innovative, large-scale, public-private collaboration on pre-competitive research that provides comprehensive and up-to-date data for drug target identification and prioritization [1]. Open Targets integrates publicly available information and data relevant to targets and diseases in the Open Targets Platform [2], and performs high-throughput experimental projects that generate target-centred data in physiologically relevant systems to understand causal relationships between targets and diseases in three therapeutic areas: Oncology, Immunology, and Neurodegeneration.

Open Targets is located on the Wellcome Genome Campus in the United Kingdom, which hosts some of the most advanced institutes worldwide working at the interface of genomics and computational biology. It was founded in 2014 by three globally leading organizations in bioinformatics, genomics, and pharmaceuticals: EMBL-EBI, a global leader in the management, integration, and analysis of public domain life science data; the Wellcome Sanger Institute, a world-leading genomics institution with expertise in human genetics, cancer, and infectious disease; and GSK, a leading global pharmaceutical company. Since its creation, the partnership has progressively brought in new stakeholders: Biogen, a biotechnology company, in 2016; Takeda, a large pharmaceutical company based in Asia, in 2017; and Celgene, a global biopharmaceutical company, in 2018.

A cornerstone of this public-private collaboration from the beginning has been an agreement among the organizations that all data and resources generated within Open Targets should be made available rapidly in the public domain to the entire scientific community. The release of all precompetitive information (i.e., informatics tools, experimental methods, and all sequence data generated) seeks to maximize the public benefit to be gained from research. Furthermore, as Open Targets states in its guiding principles, the partners are committed to non-exclusive partnerships that foster the free exchange of ideas and expertise. With this aim, the partnership has progressively incorporated new partners since its creation.

The present case provides a unique opportunity to explore how EMBL-EBI, Europe’s flagship laboratory for life sciences, Wellcome Trust, and four big pharmaceutical and biotech companies have built a collaboration framework that allows them to openly share, on a single user-driven platform, all the data and tools they develop, with the goal of accelerating both scientific research and drug discovery.

2 Background

Today, the pharmaceutical industry is facing a major challenge: the costs of bringing a new drug to market have never been higher, and they continue to rise. According to the Tufts Center for the Study of Drug Development, the cost of developing a medicine, including the cost of failures, stands at about US$2.6 billion, more than double the estimate of just a decade ago. Additionally, the increasing scientific complexity of disease areas, such as rare cancers or neurological conditions, requires investing resources in new approaches to investigate such diseases at a molecular and genetic level. The data deluge in biomedical research that came with the advent of genomics and associated biological sciences such as proteomics and structural biology has led to an increasing granularity of information that requires new tools, techniques, and approaches to accelerate drug discovery. Simultaneously, there is considerable duplication of effort in research devoted to providing the basic biological knowledge required for successful drug design (Altshuler et al., 2010).

[1] Open Targets website: https://www.opentargets.org/
[2] Open Targets Platform: https://www.targetvalidation.org/


To address these challenges, Open Targets was created to bring together the skills, expertise, technologies and diverse data types that scientists need to establish the genetic links between targets and disease development. Target validation is an exercise that defines the role that a biological process plays in a disease and is the first step in drug discovery. “Currently, an estimated 90 percent of compounds entering clinical trials fail to demonstrate the necessary efficacy and safety requirements, never reaching patients as medicines. This is often because the biological target chosen is not well understood”, explains the Biogen team (EMBL-EBI, 2014).

Open Targets uses cutting-edge genetic methods to support researchers in the first step of exploring new drugs: helping them to identify “where to start”. Open Targets generates data through the partnership’s collaborative projects in Oncology, Immunology, Neurodegeneration, and Cross-disease, and integrates data through the Open Targets Platform.

Figure 1. Open Targets platform (data integration) and Open Targets consortium (data generation) (Source: Open Targets)

In order to generate valuable data and insights, Open Targets combines the expertise of its members in emerging and established technologies, which include:

1. Gene editing (CRISPR)

2. Induced pluripotent stem cells

3. Single cell genomics

4. Organoid and tissue culture

5. Large-scale genomics and epigenomics (e.g. ENCODE)

6. Genome-wide association studies

7. Next-generation sequencing

8. Bioinformatics

9. High-performance computing

The methods used by Open Targets include a combination of large-scale genomic experiments with objective statistical and computational techniques to identify and validate causality between targets, pathways, and diseases (Open Targets outreach, 2016).


Open Targets Platform

The Open Targets Platform enables investigation of the evidence associating targets and diseases in a user-friendly, intuitive and accessible way while providing tools that prioritize target-disease hypotheses for further research. Figure 2 below shows a schematic representation of the Open Targets evidence object associating a Target (T) with a disease (D).

Figure 2. Schematic representation of the Open Targets evidence object associating a Target (T) with a disease (D) (Koscielny, G., et al., 2016)

The first public version of the platform (version 1.0) was launched in December 2015. Open Targets integrates comprehensive datasets from a myriad of public databases to calculate, rank and score gene-disease associations. Concretely, the platform integrates the following public data sources (Koscielny, G., et al., 2016):

• Genetic associations: GWAS Catalog, UniProt, European Variation Archive, Gene2Phenotype;

• Somatic mutations: Cancer Gene Census, European Variation Archive somatic, IntOGen;

• RNA expression: Expression Atlas;

• Drugs: ChEMBL;

• Affected pathways: Reactome;

• Text mining: Europe PMC;

• Animal models: PhenoDigm.
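The case study does not spell out how evidence from these sources is turned into a ranked score. Purely as an illustration, the sketch below groups evidence scores by the data types listed above and combines them with a harmonic sum, which damps the effect of many weak pieces of evidence. The function names, the input format, the cap at 1.0 and the choice of a harmonic sum are assumptions made for this example, not a description of the platform's documented pipeline.

```python
from collections import defaultdict

# Data types and public sources as listed above (Koscielny, G., et al., 2016).
DATA_SOURCES = {
    "genetic_associations": ["GWAS Catalog", "UniProt",
                             "European Variation Archive", "Gene2Phenotype"],
    "somatic_mutations": ["Cancer Gene Census",
                          "European Variation Archive somatic", "IntOGen"],
    "rna_expression": ["Expression Atlas"],
    "drugs": ["ChEMBL"],
    "affected_pathways": ["Reactome"],
    "text_mining": ["Europe PMC"],
    "animal_models": ["PhenoDigm"],
}


def harmonic_sum(scores):
    """Sort scores in descending order and divide the i-th one by i squared."""
    ordered = sorted(scores, reverse=True)
    return sum(s / (i ** 2) for i, s in enumerate(ordered, start=1))


def association_score(evidence):
    """Combine (data_type, score) pairs into an overall score.

    Assumed input: scores between 0 and 1, one pair per piece of evidence.
    Returns (overall score, per-data-type scores), each capped at 1.0.
    """
    by_type = defaultdict(list)
    for data_type, score in evidence:
        if data_type not in DATA_SOURCES:
            raise ValueError(f"unknown data type: {data_type}")
        by_type[data_type].append(score)
    per_type = {t: min(1.0, harmonic_sum(s)) for t, s in by_type.items()}
    overall = min(1.0, harmonic_sum(per_type.values()))
    return overall, per_type
```

With this scheme, a single strong piece of genetic evidence outweighs a long tail of weak text-mining hits, which is one reasonable way to keep a ranking from being dominated by sheer volume of evidence.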

New data is dynamically integrated into the Open Targets Platform, and updates are communicated to the community through release blog posts and social media. To access the data, scientists can use the Open Targets Platform GUI [3], the REST API [4] (with or without the Python or R clients) or the download page. Open Targets offers several tutorials and webinars on its YouTube channel [5].

[3] Open Targets Platform access: https://www.targetvalidation.org
[4] API Open Targets: http://api.opentargets.io/v3/platform/docs
[5] Open Targets YouTube channel: https://www.youtube.com/channel/UCLMrondxbT0DIGx5nGOSYOQ/playlists
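As a rough illustration of programmatic access, the sketch below builds a query URL for the REST API and extracts associations from a JSON response. Only the base address is taken from the documentation URL cited above. The endpoint path, the parameter names (`target`, `disease`, `size`), the response field names and the example disease identifier are assumptions for this example and may not match the live service, which may also have changed since the case study was written.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address taken from the API documentation URL cited in the text.
API_ROOT = "http://api.opentargets.io/v3/platform"


def association_url(target=None, disease=None, size=10):
    """Build a query URL; the path and parameter names are assumptions."""
    params = {"size": size}
    if target:
        params["target"] = target
    if disease:
        params["disease"] = disease
    return f"{API_ROOT}/public/association/filter?{urlencode(params)}"


def parse_associations(payload):
    """Extract (target id, disease id, overall score) tuples.

    The nested field names reflect an assumed response shape.
    """
    return [
        (row["target"]["id"], row["disease"]["id"],
         row["association_score"]["overall"])
        for row in payload.get("data", [])
    ]


def fetch_associations(**query):
    """Run the HTTP request; needs network access and a live service."""
    with urlopen(association_url(**query), timeout=30) as response:
        return parse_associations(json.load(response))
```

Keeping URL construction and response parsing separate from the network call means the first two can be checked offline, for example against a saved response, before relying on the remote service.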


The Open Targets Platform is an open-access, “Google”-type search engine that extensively searches, assesses, and integrates the huge quantity of genetic and biological data available. The platform supports two main workflows: the first is target-centric; the second is disease-centric. In the first workflow, the user searches for a target and is presented with visualizations of the evidence available for its associations with specific diseases, clustered by broad therapeutic area; the platform additionally allows in-depth analysis of the evidence and user-defined prioritization of the lists of associations. In the second workflow, the user enters the name of a disease, asks which targets can be associated with it, and obtains a visualized summary of those targets and the underlying evidence available.

These two workflows were designed and developed by Open Targets to support practicing biological scientists in the pharmaceutical industry and research labs to select and prioritize the targets that are most likely to succeed based on data-driven associations with diseases. Users of the platform do not require an in-depth understanding of bioinformatics or the integrated data to use the platform.

These two workflows are the result of a range of user experience methods applied to develop the Open Targets Platform. When the project started, the Open Targets team interviewed scientists and managers working in R&D in pharmaceutical and biotech companies, as well as academic researchers interested in drug discovery. This process helped the team to identify the fundamental questions that those researchers ask in order to identify and prioritize targets. In addition, the exercise helped the team to understand the ecosystem of data that drug discovery practitioners use to build confidence in a target. The team concluded that two questions are fundamental: i) starting from a particular target (e.g. PDE4D), which diseases are associated with the target? ii) Starting from a particular disease (e.g. asthma), which targets are associated with this disease? (Koscielny, G., et al., 2016). The architecture of the platform is structured around these two workflows and leads the user down one path or the other depending on which question is asked.
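The two questions can be pictured as two lookups over the same set of target-disease associations. The sketch below is a minimal in-memory illustration, not the platform's actual architecture: the class name, method names and all scores are hypothetical, and only PDE4D and asthma come from the examples in the text.

```python
from collections import defaultdict


class AssociationIndex:
    """Toy index answering both workflow questions from one set of associations."""

    def __init__(self):
        self._by_target = defaultdict(dict)   # target -> {disease: score}
        self._by_disease = defaultdict(dict)  # disease -> {target: score}

    def add(self, target, disease, score):
        """Register one association under both keys."""
        self._by_target[target][disease] = score
        self._by_disease[disease][target] = score

    def diseases_for(self, target):
        """Target-centric workflow: diseases for a target, best score first."""
        items = self._by_target.get(target, {}).items()
        return sorted(items, key=lambda kv: -kv[1])

    def targets_for(self, disease):
        """Disease-centric workflow: targets for a disease, best score first."""
        items = self._by_disease.get(disease, {}).items()
        return sorted(items, key=lambda kv: -kv[1])


# Hypothetical scores, for illustration only.
index = AssociationIndex()
index.add("PDE4D", "asthma", 0.6)
index.add("PDE4D", "disease_y", 0.8)
index.add("TARGET_X", "asthma", 0.9)
```

Here `index.diseases_for("PDE4D")` answers the first question and `index.targets_for("asthma")` the second; storing each association under both keys trades a little memory for equally cheap lookups in either direction.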


Figure 3. Workflows of the application (Koscielny et al., 2017).

According to the last update available (June 2018), Open Targets contains more than 21,060 targets, 2,407,593 associations, 10,086 diseases and 18 data sources.

Open Targets chose a federated approach. The platform holds summary information about the data in the form of evidence objects, which are supplied by the source database or generated by the Open Targets team through an analytical pipeline or by parsing other databases. Open Targets did not aim to store all the data contributing to the evidence, first for efficiency, but also because the source databases are already uniquely tuned to deal with many of the specialized data, and because the team anticipated that these data sources would evolve quickly as new techniques emerge.
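One way to picture the federated approach is an evidence record that carries only a summary plus a pointer back to the full record held by the source database. The sketch below is a hypothetical data structure for illustration; the field names and the example URLs are assumptions and do not reproduce the platform's actual evidence schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceSummary:
    """Summary of one piece of evidence linking a target to a disease."""
    target: str
    disease: str
    data_type: str      # e.g. "genetic_associations"
    source: str         # database that owns the full record
    score: float        # summary strength of the evidence
    source_record: str  # link back to the source instead of a local copy


def group_by_source(evidence):
    """Group evidence summaries by the database that supplied them."""
    grouped = {}
    for item in evidence:
        grouped.setdefault(item.source, []).append(item)
    return grouped


# Hypothetical records; the URLs are placeholders, not real endpoints.
records = [
    EvidenceSummary("PDE4D", "asthma", "genetic_associations",
                    "GWAS Catalog", 0.7, "https://example.org/gwas/1"),
    EvidenceSummary("PDE4D", "asthma", "text_mining",
                    "Europe PMC", 0.3, "https://example.org/epmc/2"),
]
```

Because each record stores a link rather than the underlying data, the source database remains the place where the detailed evidence lives and evolves, which is the efficiency argument made above.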

Regarding the visualization of data, the Open Targets team has integrated third-party visualizations, which include a visualization of biological pathways developed by Reactome; a graphical display of RNA baseline expression developed by Expression Atlas; a visualization of protein features developed by UniProt; and a three-dimensional protein structure display for targets [6]. In addition, the Open Targets Platform has been designed to integrate other third-party widgets to visualize target or disease evidence in any local or user-deployed instance.

[6] WebGL-based viewer for proteins and other macromolecular structures: http://dx.doi.org/10.5281/zenodo.20980


Open Targets collaborative experimental projects

Collaboration among the members of the Open Targets consortium is implemented through a portfolio of experimental projects designed to provide new evidence that enables target identification and prioritization in key therapy areas: Oncology, Immunology, and Neurodegeneration. The focus on these therapeutic areas reflects substantial unmet patient needs as well as complementary expertise among the partners in pre-competitive collaboration.

Figure 4. The role of Open Targets experimental projects in Open Targets overall activity (source: Open Targets webinar series)

Regarding oncology, Open Targets projects are connected to the resources of the Sanger Institute’s cancer program, which has researched the genetic basis of cancer. A shared interest is the use of accessible cancer resources to analyse clinical genomic datasets and identify driver genes across different sub-types of cancer. A key resource for that purpose is Sanger’s collection of more than 1,000 human cancer cell lines and their drug sensitivities. To identify target opportunities, the consortium uses genomic information, including RNA-seq and synthetic lethality data from genome-editing model systems that reflect the biology of tumours.

Immunology is the second focus area of pre-competitive collaboration in the framework of Open Targets. The Consortium, following the shared interest amongst partners, has developed a state-of-the-art meta-analysis for the existing inflammatory bowel disease (IBD) cohorts and seeks to qualify potential targets for validation. As the collaboration in Open Targets progresses, additional projects have emerged probing the role of targets in well-defined immune cells or in diseases such as asthma using single-cell genomics.

The third focus area of the Open Targets consortium is neurodegeneration, where members, in collaboration with the Gurdon Institute, have used their collective expertise to implement projects on Alzheimer’s and Parkinson’s disease that identify and test potential targets.

3 Drivers

An open science model to tackle the length, cost, low success rates, high attrition and complexity of drug discovery

“We believe that harnessing the potential of ‘big data’ and genome sequencing through this collaboration could help us dramatically improve our success rate for discovering new medicines”, declares the GSK team (GSK, 2017).


The traditional drug development process is long and convoluted, based on massive R&D investments protected by IP. The high costs of drug development and its high frequency of failure can be offset by a patent system that allows monopolistic or oligopolistic pricing for a period of time. After patents expire, generic versions of the drug emerge to take over the market, quickly decreasing profits. As a result, pharmaceutical firms have had little incentive to develop treatments for diseases or conditions where anticipated profits were not high. This approach encourages pharmaceutical companies to focus on “blockbuster” drugs and provides negligible incentives to target small potential markets such as “orphan” diseases. However, this “blockbuster” model is subject to decreasing productivity and increasing cost (Lee, 2015). Drug developers are pursuing more complex disease areas while trying to comply with a growing proliferation of regulatory requirements.

In response, biopharmaceutical companies are increasingly opening up their firm boundaries and partnering with their competitors and academia in consortia that seek to create greater efficiencies in R&D and accelerate discovery. As Sally John, vice president of Computational Biology & Genomics at Biogen, explains: "We are committed to advancing evidence-based target discovery and opening up the field for researchers to create innovative methods and tools to accelerate the development of new medicines. Being part of Open Targets helps us realize this vision and provides a practical, harmonized way to share data with the scientific community." (Biogen, 2016)

New models for research collaboration are emerging that go beyond mergers, takeovers, and in-licensing and feature the sharing of information, resources and capabilities across traditional organizational boundaries. As an example of the growing importance of more open, collaborative approaches to R&D innovation in the biopharmaceutical sector, 334 new R&D consortia were created from 2005 to 2014, 9 times the number formed during the prior decade (Deloitte, 2017). Partnerships formed in the earlier stages of the R&D process (i.e. before a potential new therapy enters clinical trials) show the same trend: the average number of new early-stage (discovery, basic research, and pre-clinical) partnerships more than doubled between 2005 (256) and 2014 (578).

Pre-competitive openness and collaboration amongst academia and competitors are growing, and they will underpin future competition between industry players because all of them need well-validated targets. The basic idea behind this trend in biopharmaceuticals is also well known in other sectors: in the early hypothesis-generation stages, it makes sense for different players to join efforts to generate new open research tools that can be used by everybody (Edwards et al., 2009). This is in contrast to subsequent, de-risked stages, where different stakeholders start developing their own proprietary products (Weigelt, 2009). In competitive stages, biopharmaceutical companies compete in other ways: over how good their chemists are, how fast they can generate new effective drugs, or how efficiently they can bring them to market (Deloitte, 2017). As Patrick Vallance, head of pharma R&D at GSK, an Open Targets member, told Reuters: “If you can double the base knowledge then you’ve de-risked things enormously, though you’ve still got to make your judgment in your invention. It is not going to give you all the answers but it is going to increase the chance of getting it right” (Gastfriend & Lee, 2015).

It is worth noting that industry’s perception of what falls within the domain of precompetitive research has been expanding over the last decade, not without tensions and frequent differences in boundary definitions among companies and academic researchers (Institute of Medicine, 2011). Challenges for collaboration in consortia increase when we move towards the heart of what makes those companies competitively different. Years ago, it would have been hard to anticipate that industry would be willing to share resources, tools, and information about drug targets as openly as it does today.

Open Targets is the result of a common awareness amongst companies and scientists that they are essentially working on the same targets, and that all of them need to achieve a better understanding of what underlies them.

As Dr. Maya Ghoussaini, Team Leader of the Genetics Core Team at Open Targets, explains: “We have to keep reminding ourselves that for every 20 or 30 new drugs that get approval for use each year, tens of thousands of candidates fail and these failures come at a huge expense both financially and time-wise. It also takes 12–15 years to get a drug from bench to market, with an associated cost of GB£1.5 billion. When a drug fails, which tends to happen at the latest stages of drug development, hundreds of millions of pounds are wasted. What is even more challenging is that drugs that are licensed for use have to account not only for their cost of manufacture but also for the development costs associated with many failed drugs. By reducing the likelihood of drug failure through robust drug target validation at earlier stages, more drugs are likely to succeed and drug development will become more efficient” (Medchemnet, 2018). Figure 5 is a graphic representation of how Open Targets seeks to fight the attrition rates of industry pipelines.

Figure 5. The foreseen virtuous cycle of Open Targets (source: Carvalho-Silva, 2017 [7])

The increasing value of aggregating data in biomedical research: Drug repositioning

The explosion of data availability has increasingly turned biomedical research into a data-driven science. As a result, drug discovery efforts today accumulate far more information than was available for earlier drugs, and pharmaceutical firms hold extensive volumes of data associated with drug discovery.

Drug repositioning is a drug development strategy that seeks to expand the indication space of a successful drug or to find a new indication for a drug that was not successful in clinical trials. Open Targets facilitates the identification of potential repositioning opportunities. Through the Open Targets Platform, scientists can systematically identify and assess all the available evidence that associates a drug with diverse diseases. The aggregation and integration of the heterogeneous and complementary evidence available in the platform helps to prioritize and analyse potential drug repositioning. As reported by Khaladkar et al. (2017), Open Targets has been able to uncover 2,540 potential new indications for 791 existing drug targets. Among these 2,540 new indications, 1,366 are for Orphanet rare diseases where the target is associated with more common diseases.
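The repositioning logic described above can be sketched as a simple filter: take the diseases associated with a drug's targets and keep those that are not already indications for the drug. The function below is a minimal illustration under that assumption; its name, the input format, the score threshold and the example data are hypothetical and do not reproduce the method of the cited study. The last lines only restate the reported figures as proportions.

```python
def repositioning_candidates(drug_targets, approved_indications,
                             associations, min_score=0.5):
    """List diseases linked to a drug's targets that are not yet indications.

    associations: {target: {disease: score}} (assumed format).
    Returns (disease, best score) pairs, highest score first.
    """
    candidates = {}
    for target in drug_targets:
        for disease, score in associations.get(target, {}).items():
            if disease in approved_indications or score < min_score:
                continue
            candidates[disease] = max(score, candidates.get(disease, 0.0))
    return sorted(candidates.items(), key=lambda kv: -kv[1])


# Figures reported in the text above.
new_indications, existing_targets, rare_disease_indications = 2540, 791, 1366
rare_share = rare_disease_indications / new_indications  # roughly 0.54
per_target = new_indications / existing_targets          # roughly 3.2
```

Read this way, the reported numbers mean that a little over half of the proposed new indications concern rare diseases, and that each existing drug target yields about three potential new indications on average.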

[7] Presentation titled “Open Targets: Mining gene and disease associations for improved drug target identification” by Denise Carvalho-Silva, PhD, 18 October 2017, part of the EMBL-EBI webinar series


Leveraging the capabilities of many organizations

Open Targets gives the stakeholders involved access to specialized expertise, capabilities, and capacity from other members. It brings together experts who jointly explore and interpret large volumes of data from genomics, proteomics, chemistry, and disease biology. The collaboration draws on the diverse and specialized knowledge base of scientific institutes and pharmaceutical companies. The motivation of the consortium is to accelerate basic research, technology and drug discovery by pooling the specialized scientific expertise required.

As Dr. Maya Ghoussaini, Team Leader of the Genetics Core Team at Open Targets, highlights, it is critical for drug discovery both to leverage the varied capabilities of the stakeholders in biomedical research and to foster an open science strategy that facilitates scientific data sharing: “The future of drug discovery lies in building strong collaborations between industry and academia to bring together multi-disciplinary expertise, from geneticists, molecular biologists, medicinal chemists, chemical biologists, and clinicians, to provide both depth and breadth into diseases and the underlying biology. It also involves embracing data sharing and open sources strategies, such as publishing data in open access journals and making data publicly available through data repositories, as well as sharing novel biologically active chemical probes and data from clinical trials.” (Medchemnet, 2018)

4 Barriers

Technical barriers

The usability of bioinformatics resources for scientists and non-scientists working on drug discovery has always been a barrier to exploiting existing data and resources in their work (Karamanis et al., 2018). As Philip Ma, Vice President, Digital Health Technology & Data Sciences at Biogen, expressed: "The importance of accessing and managing searchable, structured data is critical to sharing knowledge on target validation." (Biogen, 2016)

Aware of this common barrier, the Open Targets team set out to build a platform that helps bench scientists working on early drug discovery in both academia and industry to identify and prioritize drug targets faster and with more confidence, without requiring in-depth knowledge of data integration methods or bioinformatics.

As a result, the team behind the Open Targets Platform applied lean user experience (UX) design methods to integrate and appropriately address its target users’ needs in the architecture of the platform (Gothelf & Seiden, 2013). In the early stages of the platform design, before any development, the team worked intensively with users, collaborating with them in a series of iterative processes to develop the platform. Figure 6 shows how the Open Targets team sketched together with users to indicate genetics data supporting a particular target-disease association.


Figure 6. A graphic representation showing potential visualizations for Open Targets platform resulting from joint collaboration with users (Karamanis et al., 2018)

The first version of the Open Targets Platform, launched in December 2015, was the result of 27 interviews with scientists and managers working on drug discovery, observation of scientists working in the early stages of target identification, workshops in which users contributed to design activities, user testing, and regular feedback sessions within an iterative development process. Figure 7 shows a diagram that visualizes an activity observed in a real setting, created during an interpretation session with the rest of the development team (Karamanis et al., 2018).

Figure 7. Diagram visualizing an activity observed in a real setting. It was created during an interpretation session with the rest of the development team (Karamanis et al., 2018)

Institutional barriers

There are inherent challenges in aligning incentives and sharing both risks and rewards across multiple stakeholders in pre-competitive scientific discovery (Deloitte, 2017). The different organizations are motivated to cooperate by their shared goals, while the inherent competition among them needs to be managed within the ecosystem.

Data and knowledge sharing in Open Targets follow a set of guiding principles that the different stakeholders agreed on in order to openly share the data and results of their collaboration:

- “We are focused on pre-competitive research that will enable the systematic identification and prioritization of targets.

- We are committed to rapid publication and making data, methods, and results publicly available as soon as possible.

- We believe in non-exclusive partnerships that foster the free exchange of ideas and expertise.”8

As the consortium publicly states, they place in the public domain all their new informatics tools, experimental methods, platforms and the data generated by their projects as soon as is practical. Although the consortium does not seek patent protection for IP arising from Open Targets, they recognize that instances may arise where IP protection is appropriate to support their mission. For those cases, Open Targets has created a Joint Patent Committee with scientific and legal experts from all six organizations in the partnership, which reviews all potential publications (but not raw data) prior to submission. In practical terms, this policy translates into the following actions:

8 OpenTargets website: www.opentargets.org

First, for the development and implementation of all Open Targets projects, each member in the partnership has agreed to license its background IP to other members. Second, all organizations in the Open Targets consortium have agreed to license IP arising from Open Targets to each other for use in Open Targets projects and for the members’ research and development activities. Third, any IP that is directly and solely related to an industry partner's compound will belong to that industry partner. Finally, any other IP arising from a research project in Open Targets will belong to the members that invented it.

As Jeffrey Barrett, director of Open Targets until March 2018, elaborates: “The precompetitive nature of CTTV (Open Targets) is critical: the collaboration of our members allows us to make the most of commercial R&D practice while making the data and information available to everyone. It is truly exciting to apply so many different areas of expertise, from cell biology to large-scale genome analysis, to the challenge of creating better medicines.” (Biogen, 2016)

5 Impact

5.1 For Science

According to the latest impact study, published in 2016 (see footnote 9), 900 unique IP addresses access the Open Targets Platform every week. Metrics on Open Targets Platform usage from April 2016 to March 2017 show that the platform is used substantially (Figure 8).

Figure 8. Key metrics about Open Targets Platform usage (April 2016 to March 2017). (Source: Kafkas et al., 2017)

The metrics are also aligned with the qualitative feedback from platform users reported by the Open Targets team. For instance, as one drug discovery scientist said: “Powerful resource, clear links and easy to use without training, especially for a non-bioinformatician!” (Kafkas et al., 2017)

Accelerating science

The scientific impact of the evidence available in the platform is substantial. Associations between drug target and disease are the main focus for both new drug development and drug repositioning. Scientists seek evidence supporting target-disease associations that can be stored in structured databases and integrated to obtain a comprehensive assessment in target validation studies (Kafkas et al., 2017).

9 The Value and Impact of the European Bioinformatics Institute: https://www.ebi.ac.uk/about/news/press-releases/value-and-impact-of-the-european-bioinformatics-institute

Figure 9. Evidence type, source and target-disease association objects (Source: Koscielny et al., 2016)

The results of computational approaches that have mined and assessed the target-disease associations uncovered in Open Targets are detailed in section 5.2 below.

5.2 For Industry

Catalysing drug discovery

Several publications have reported computational approaches that mine the full set of target-disease associations in Open Targets, uncovering many potential opportunities for drug discovery. Khaladkar et al. (2017) describe a computational workflow that systematically uses data from Open Targets to identify potential repositioning opportunities. Concretely, it uncovers 2540 potential new indications for 791 existing drug targets, and it further categorizes them based on diverse and multiple types of evidence. Of these new opportunities, 1366 are for Orphanet rare diseases where the target is a known drug target for a common disease.

Rare diseases are defined as diseases affecting fewer than 1 in 2,000 people in the EU or fewer than 200,000 patients in the USA (Tambuyzer, 2010). There are severe unmet needs among people suffering from rare diseases, so repositioning an existing drug or clinical compound for a rare disease is an attractive approach to orphan drug development.

It is worth noting that, of the 34 rare diseases considered promising opportunities for drug development, Open Targets has been able to suggest potential drug-repositioning opportunities for 14. In addition, given the recently recognized importance of genetic support for higher success rates of drugs against a target, Khaladkar et al. (2017) highlighted that 628 (24%) of the 2540 new indications are supported by genetic evidence. These opportunities can be classified according to whether the repositioning potential lies within the same therapy area or across therapy areas.
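The logic of such a repositioning screen can be illustrated with a minimal Python sketch. This is not the workflow of Khaladkar et al. (2017): the records, the target and disease names, and the evidence labels (`known_drug`, `genetic_association`, `literature`) are all hypothetical placeholders chosen for illustration. The sketch assumes that a "new indication" is any disease associated with an existing drug target for which no known-drug evidence exists yet.

```python
from collections import defaultdict

# Hypothetical toy records: (target, disease, evidence_type).
# None of these names or labels come from the Open Targets data.
associations = [
    ("TARGET_A", "common disease X", "known_drug"),
    ("TARGET_A", "rare disease Y", "genetic_association"),
    ("TARGET_A", "rare disease Z", "literature"),
    ("TARGET_B", "rare disease Y", "literature"),
    ("TARGET_C", "common disease X", "known_drug"),
]


def repositioning_candidates(records):
    """Map (target, disease) to its evidence types for candidate new indications."""
    evidence = defaultdict(set)
    for target, disease, evidence_type in records:
        evidence[(target, disease)].add(evidence_type)

    # Existing drug targets: targets with known-drug evidence for any disease.
    drug_targets = {
        target for (target, _), types in evidence.items() if "known_drug" in types
    }

    # Candidates: pairs involving an existing drug target but no known drug yet.
    return {
        pair: types
        for pair, types in evidence.items()
        if pair[0] in drug_targets and "known_drug" not in types
    }


candidates = repositioning_candidates(associations)

# Subset of candidates supported by genetic evidence.
genetically_supported = {
    pair for pair, types in candidates.items() if "genetic_association" in types
}

print(len(candidates), "candidate new indications")
print(len(genetically_supported), "supported by genetic evidence")
```

In this toy data, TARGET_B is excluded because it is not an existing drug target, and TARGET_C contributes nothing because its only association is already a known indication.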

According to the commercial source Pharmaprojects10, which gathers worldwide drug development pipeline data, 6% of all new target-disease pairs uncovered in Open Targets are in drug development. This is a conservative estimate because only indications that match exactly were considered; ongoing clinical trials for closely related indications could exist outside of this estimate. The computational workflows show substantial potential opportunities that will need to be evaluated in detail, combining the evidence from the platform with endogenous evidence about the disease. The next steps would be to develop hypotheses and evaluate a potential validation of the molecule(s) for a repositioning indication and then move to the clinical development phase. Many of the opportunities already identified by Open Targets could eventually lead to effective drugs for patients.

Harnessing strengths of all stakeholders

The results of the Open Targets collaboration are shared in a list of published scientific papers11. Other outcomes of the collaboration include three open source research tools and resources, which can support the work of all stakeholders and make it more productive:

- CELLector (Genomics Guided Selection of Cancer in vitro Models), which is a computational tool implemented in an open source R Shiny application and R package. By combining methods from graph theory and market basket analysis, the tool enables scientists to select the most relevant cancer cell lines in a genomic-guided fashion and leverages tumour genomics data to assess optimal cell line models in a user-friendly way.

- Link (Literature concept knowledgebase), which is a tool that allows the exploration of half a billion relations between genes, diseases, drugs and key concepts extracted from PubMed abstracts using Natural Language Processing (NLP). This open source tool was developed when Medline relaxed their license for accessing publication data, with the goal of helping scientists generate new hypotheses for the development of new drugs by exploiting the biomedical knowledge in the literature. It offers a pipeline that quickly runs a large-scale NLP analysis; an API that serves the resulting data; and a user interface to assess the data.

- Dorothea (Discriminant Regulon Expression Analysis), which is a research resource developed to search for candidate TF-drug interactions in cancer.

5.3 For Society

Accelerating scientific knowledge in life sciences, reducing duplicated effort by bringing together stakeholders that share common interests in R&D, catalysing drug discovery and uncovering new opportunities for drug development for rare diseases are just some of the benefits that the partnership and its platform offer to society. The initiative started in 2015 and it is still early to assess the magnitude of its potential impact in the long run. Considering the lengthy and challenging process of drug development, it is also too early to quantify the collaboration’s impact in terms of reducing the likelihood of drug failures. While we wait for such evidence to accumulate, the value of the educational outreach and scientific support provided to the biomedical community through data, technology and resource sharing is evident.

10 Pharmaprojects website: http://www.pharmaprojects.com

11 Open Targets related publications available at: https://www.opentargets.org/resources/

6 Lessons Learnt

David Hulcoop, Operations Director at Open Targets, highlights: “I’ve also been impressed with the strength of the relationships across Open Targets. Both GSK/Biogen (industry) and Wellcome Trust Sanger Institute/EMBL-EBI scientists (academia) are involved in the research projects as well as in the development of the Platform. Some of our GSK and Biogen employees actually sit in the same office as EMBL-EBI staff. We also have regular attendees on-site and online for weekly meetings as well as strong participation in the tri-yearly integration days, all with the aim to create a collaborative environment that focuses on science with minimal bureaucracy” (Open Targets, 2017).

Open Targets is an example of how the biotech and pharmaceutical industries have managed to work together with scientists from public research infrastructures in a pre-competitive model for drug discovery, in order to fight high attrition rates and reduce the time and effort lost to redundant research agendas. This model contrasts with the traditional practice of undisclosed, secretive research carried out separately by different corporations (Arshad et al., 2016), which often translates into duplicated effort, wasted resources, and an increased likelihood of failure.

Within the Open Targets consortium, scientists from both industry and academia work together in pre-competitive research phases and can share the data, methods, and tools of their projects in a user-driven platform, while company scientists remain able to keep the market intentions behind such developments confidential.

7 Policy conclusions

The fundamental conclusion from the Open Targets case study is that the inherent tension between the goals of scientific openness and commercial exploitation does not necessarily imply incompatibility, but rather a need to identify sophisticated solutions that adequately balance the divergent interests at different phases of scientific processes. In other words, companies are reluctant to share knowledge and data when we move towards the heart of what makes those companies truly competitive, but they can accommodate such openness if we are in a pre-competitive research phase and there is an agreement on a collaborative framework that defines boundary conditions.

Promote smart openness to accelerate innovation

It appears important to move away from the ideological debate that pits pro-Open against pro-IP advocates and to find appropriate frameworks that foster rapid disclosure of scientific data for scientific progress and social value, without threatening innovation and industrial collaboration between research infrastructures and commercial entities.

Data publication is the beginning, not the end of public-private collaboration

Sharing scientific data in the public domain is a necessary but insufficient requirement for scientific re-use, drug development, or other forms of commercialization. Close interaction between the scientists who generate data and those who seek to re-use it is still needed to understand what the data show, how they were generated, and how to interpret them. The Open Targets case highlights the importance of working with the users of the data to understand how to design the appropriate workflows, queries, and visualizations that effectively support their work and increase scientific productivity.

The value of a federated approach with metadata and complementary services

In the case of Open Targets, a federated approach that facilitates access to the increasingly available repositories of scientific data has proven effective in facilitating data re-use. Open Targets did not aim to store all the data contributing to evidence; instead, it opted for a metadata approach in which the platform provides summary information about more granular sources. Because extant databases are uniquely tuned to deal with many of the specialized data sources, developing summary information about the available data has been an effective solution to data heterogeneity.

Finally, the very user-friendly search engines, APIs, analytical and visualization tools, and integration with important statistical applications further support user utility.

Increased monitoring is necessary

Empirical studies that assess the impact of initiatives like Open Targets would not only increase awareness of the potential of such approaches for related fields of science and innovation but also help in understanding the effectiveness of their governance mechanisms and how well these generalize to unrelated scientific and commercial domains.

References

Altshuler, J.S., Balogh, E., Barker, A.D., Eck, S.L., Friend, S.H., Ginsburg, G.S., Herbst, R.S., Nass, S.J., Streeter, C.M. and Wagner, J.A., 2010. Opening up to precompetitive collaboration. Science Translational Medicine, 2(52), pp.52cm26-52cm26.

Arshad, Z., Smith, J., Roberts, M., Lee, W.H., Davies, B., Bure, K., Hollander, G.A., Dopson, S., Bountra, C. and Brindley, D. (2016) Open access could transform drug discovery: a case study of JQ1. Expert opinion on drug discovery, 11(3), pp.321-332.

Baker, M. (2016). 1,500 scientists lift the lid on reproducibility. Nature, 533(7604): 452–454. DOI: https://doi.org/10.1038/533452a

Biogen Press Release, 8th Feb 2016. Retrieved 10th July from: https://www.ebi.ac.uk/about/news/press-releases/biogen-joins-pioneering-target-validation-collaboration

Chokshi, D. A., Parker, M. and Kwiatkowski, D. P. (2006). Data sharing and intellectual property in a genomic epidemiology network: policies for large-scale research collaboration. Bulletin of the World Health Organization, 84(5), 382-387.

Edwards AM, Bountra C, Kerr DJ, Willson TM. Open access chemical and clinical probes to support drug discovery. Nat Chem Biol. 2009 Jul;5(7):436–40. pmid:19536100

EMBL-EBI Press release 25th March 2014. Retrieved July 18, 2018, from: http://www.ebii.ac.uk/about/news/press-releases/CTTV-launch

Gothelf, J. and Seiden, J., eds (2013) Lean UX: Applying Lean Principles to Improve User Experience, O’Reilly Media

GSK, 6th December 2017: How genomics is driving a new era of drug discovery. Retrieved 15th July 2018 from: https://www.gsk.com/en-gb/behind-the-science/innovation/how-genomics-is-driving-a-new-era-of-drug-discovery/

Hey, A. J. and Trefethen, A. E. (2003). The data deluge: An e-science perspective

Holtzman (2006). Data Withholding in Genetics and the Other Life Sciences: Prevalences and Predictors. Academic Medicine, 81(2), 137-145.

Kafkas, Ş., Dunham, I., & McEntyre, J. (2017). Literature evidence in open targets-a target validation platform. Journal of biomedical semantics, 8(1), 20.

Karamanis, N., Pignatelli, M., Carvalho-Silva, D., Rowland, F., Cham, J. A., & Dunham, I. (2018). Designing an intuitive web application for drug discovery scientists. Drug discovery today, 23(6), 1169-1174.

Khaladkar, M., Koscielny, G., Hasan, S., Agarwal, P., Dunham, I., Rajpal, D. and Sanseau, P., 2017. Uncovering novel repositioning opportunities using the open targets platform. Drug discovery today.

Koscielny, G., An, P., Carvalho-Silva, D., Cham, J.A., Fumis, L., Gasparyan, R., Hasan, S., Karamanis, N., Maguire, M., Papa, E. and Pierleoni, A., 2016. Open Targets: a platform for therapeutic target identification and validation. Nucleic acids research, 45(D1), pp.D985-D994.

Kowalczyk, S. and Shankar, K. (2011) Data sharing in the sciences. Annual review of information science and technology, 45(1), 247-294.

Lee WH (2015) Open Access Target Validation Is a More Efficient Way to Accelerate Drug Discovery. PLoS Biol 13(6): e1002164. https://doi.org/10.1371/journal.pbio.1002164

Medchemnet, Feb 07, 2018. ‘The ultimate validation of a target is a successful clinical trial’: an interview with Maya Ghoussaini. Retrieved 30th June 2018 from: https://www.medchemnet.com/users/6223-medchemnet/posts/29635-the-ultimate-validation-of-a-target-is-a-successful-clinical-trial-an-interview-with-maya-ghoussaini-

Open Targets n.d. Retrieved 15 July 2018 from https://www.opentargets.org/science/

Open Targets Outreach Blog post, retrieved July 15 2018 from https://www.biostars.org/p/196813/

Open Targets blog post 27th February 2017. Retrieved 20th June 2018 from: http://blog.opentargets.org/author/david-hulcoop/

Tambuyzer, E. (2010) Rare diseases, orphan drugs and their regulation: questions and misconceptions. Nat. Rev. Drug Discov. 9, 921–929

Tufts Center for the Study of Drug Development “Personalized Medicine Gains Traction but Still Faces Multiple Challenges” Impact Report, May/ June 2015

Weigelt J. The case for open-access chemical biology. A strategy for pre-competitive medicinal chemistry to promote drug discovery. EMBO Rep. 2009 Sep;10(9):941–5. pmid:19721463

Open Targets is an innovative, large-scale, public-private collaboration on pre-competitive research that provides comprehensive and up-to-date data for drug target identification and prioritisation. Open Targets integrates publicly available information and data relevant to targets and diseases in the Open Targets Platform, and performs high throughput experimental projects that generate target-centred data in physiologically relevant systems to understand causal relationships between targets and diseases. A cornerstone of this public-private collaboration is an agreement among the organisations that all data and resources generated within Open Targets should be made available rapidly in the public domain to the entire scientific community, including informatics tools, experimental methods, and all sequence data generated. The present case provides a unique opportunity to explore how EMBL-EBI, Europe’s flagship laboratory for life sciences, and the pharmaceutical and biotech companies involved in the partnership managed to generate a framework of collaboration that allows them to openly share, in a unique user-driven platform, all data and tools developed with the goal of accelerating both scientific research and drug discovery.
