indo us 2012

Post on 10-May-2015

OSDD Presentation at Washington DC

Abhik Seal (PhD Student, Indiana University; Researcher, OSDD, CSIR)

Anshu Bhardwaj Scientist, OSDD Unit

Council of Scientific & Industrial Research Delhi, India

23rd March 2012, Washington DC http://www.osdd.net

Open Source Drug Discovery CSIR-led Team India Consortium with Global Partnership

Affordable Healthcare for All

Cheminformatics and Open Source Drug Discovery: a case study in academic collaboration between the U.S. and India

First Disease Target: Tuberculosis

Tuberculosis (TB) is one of the leading causes of death, ranking second only to HIV as the killer infectious disease of adults worldwide.

Source: http://www.globalhealthfacts.org/data/topic/map.aspx?ind=12

OSDD Focus : Tropical Neglected Diseases

At least one person in the world is newly infected with TB bacilli every second

Over 1000 deaths a day or 3 deaths every 2 mins

New TB cases 2010

Countries that had reported at least one XDR-TB case by end March 2011

Argentina, Armenia, Australia, Austria, Azerbaijan, Bangladesh, Belgium, Botswana, Brazil, Burkina Faso, Bhutan, Cambodia, Canada, Chile, China, Colombia, Czech Republic, Ecuador, Egypt, Estonia, France, Georgia, Germany, Greece, India, Indonesia, Iran (Islamic Rep. of), Ireland, Israel, Italy, Japan, Kazakhstan, Kenya, Kyrgyzstan, Latvia, Lesotho, Lithuania, Mexico, Mozambique, Myanmar, Namibia, Nepal, Netherlands, Norway, Pakistan, Peru, Philippines, Poland, Portugal, Qatar, Republic of Korea, Republic of Moldova, Romania, Russian Federation, Slovenia, South Africa, Spain, Swaziland, Sweden, Tajikistan, Thailand, Togo, Tunisia, Ukraine, United Arab Emirates, United Kingdom, United States of America, Uzbekistan, Viet Nam

TB Drug Discovery

World TB Day is 24th March 2012

It commemorates the discovery of the TB bacillus (Mycobacterium tuberculosis) through sputum microscopy, which is still the diagnostic used to detect TB! No progress whatsoever, and we are discussing 'network communications'

Challenges with Drug Discovery of Neglected Diseases

• Lack of market incentives
• TB is a complex disease – latency, relapse, resistance
• Clinical trials take a long time & study of relapse needs long follow up (up to 18 months)
• Patient access is not direct, is through government agencies

Conventional vs Open Innovation Approach to Drug Discovery

Corporate HQ R&D

Cancer R&D

Neurological Disorder

Packaging

Sales

Clinical Trial

… R&D

Diabetics

Production

Pre-Clinical Trial Formulation

Conventional vs Open Innovation Approach to Drug Discovery

Research groups Industry collaboration Individual participation Open Data Sharing

OSDD Process Flow

Clinical trials

Public Funding of Clinical Trials

Government of India commitment - $46 million

Drug Target Identification

Virtual Screening

Chemical Synthesis/library

Screening/Hit identification

Hit to Lead

Clinical Trials

Candidate

45

19

9

6

2

Status: OSDD Projects

Other projects aim to develop tools, databases and repositories for the OSDD community

1

September 2008 to March 2012

OSDD Platform

System Architecture

“Collaborative tools to accelerate neglected diseases research” in the book “Collaborative Computational Technologies for Biomedical Research”. Wiley and Sons. 2011

Gene/operon predictions

Gene Expression

Regulatory Elements

Variation and repeats

Orthologs

Drug targets

Pathway/ Networks

More than a Million Data Points are now “Linked”

Deeksha Bhartiya Nitin Kumar

Mtb Data

* This is a representative set of post-genomics data available on TB. Collaborator: Dr. Vinod Scaria

Post-genomics data on Mtb is ‘Linked’ from disparate resources

S. No. | Source | Tracks
1 | UCSC Genome Browser on Mycobacterium tuberculosis H37Rv 06/20/1998 Assembly | 6
2 | WebTb | Operon Map
3 | Argo Genome Browser | not web based
4 | PGBrowser: Pathogen Genome Browser | 3
5 | BioHealthBase | 16
6 | Ensembl | ~15
7 | Tbrowse | 100

Comparison of Browsers

Deeksha Bhartiya

OpenLabNoteBook on SysBorgTB http://sysborgtb.osdd.net/bin/view/OpenLabNotebook/TBMapDataset

Deeksha Bhartiya Nitin Kumar

From a mathematical point of view, creating an accurate model of a single mammalian cell may require generating and then solving somewhere between 100,000 and one million equations

Biology is complex !!

http://news.vanderbilt.edu/2011/10/robot-biologist/

The human brain can only process seven pieces of data at a time!!! Need automation & new technology to address the complexity

Literature

Annotation Tools

Genomic Databases

Curated Annotations

Raw Annotations

OSDD C2D Community

800+ Student Researchers

Collaborative Curation

Pathway/Interactome | Gene Ontology | Protein Structure/Fold | Glycomics| Immunome

The “Connect to Decode” Programme

Community Curation!!

Wrong (mark in red)

Right (mark in green)

Online discussion

Working on the cloud..

Many eyeballs make the ‘bug’ shallow!!!

Mtb Metabolome Map on Payao

Sub-map of the metabolic network on Payao

SBI developed customized plug-ins for OSDD for generating the metabolic map

C2D April 2010 – Onsite Activity

iOSDD890 From Social Network to Biological Network

OSDD Community Effort to Understand Mtb Biology

Within weeks, 830 volunteered to re-annotate the entire M. tuberculosis genome. The work started in December 2009 and was completed by April 2010, packing nearly 300 man-years into 4 months!

Source: Munos B. Can Open-Source Drug R&D Repower Pharmaceutical Innovation? Clin Pharmacol Ther 2010;87:534–536
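As a rough check on the "nearly 300 man-years" figure, the sketch below multiplies the number of volunteers by the project duration. It assumes every volunteer contributed continuously for the four months quoted, which the slide does not state; the function name is illustrative.

```python
def person_years(volunteers: int, months: float) -> float:
    """Total effort in person-years, assuming each volunteer works the whole period."""
    return volunteers * months / 12.0


# Figures taken from the slide: 830 volunteers, 4 months.
effort = person_years(volunteers=830, months=4)
print(f"{effort:.1f} person-years")  # prints "276.7 person-years"
```

Under that assumption the total comes to about 277 person-years, in line with the "nearly 300" quoted.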

Source: Hiroaki Kitano. Social engineering for virtual 'big science' in systems biology. Nature Chemical Biology 7, 323–326 (2011)

Connect to Decode Phase II - Themes

A large student community from colleges and universities is cloning, expressing and purifying selected Mtb genes

To clone and express select genes of Mycobacterium tuberculosis

Open Access Repository of Mtb clones

More than 120 sequence confirmed clones are ready for distribution

http://sysborg2.osdd.net/group/sysborgtb/project-details/-/projects/show/3212

OSDDChem: Open Chemistry Initiative

A large number of molecules are being submitted for screening

Bhardwaj et al. Tuberculosis (Edinb). 2009 Sep;89(5):386-7

http://tbrowse.osdd.net

Computational Resources developed with Community participation

Bhardwaj et al. 2011 John Wiley & Sons, Inc.

Mtb essential genes database

TrapTB: Mtb drug targets database

Chembio Toolkit: Workflow engine with federated resources

AmPhyDB: Antimycobacterial Phytomolecule Database

http://sysborg2.osdd.net

A Comprehensive database of Mtb transporters

Mtb-Human Interaction Database

Q. Find novel genes and mutations & map known drug resistance mutations on genome of an MDR-TB strain

Enabling Complex Computational Analysis For Experimental Biologists/Chemists

Galaxy provides:
• Simplified GUI design
• Ease of integrating modules
• Fewer components for creating workflows
• Sharable workflows for better collaboration

Get data customized for extracting files from open lab note book

Custom APIs for importing input files from OSDD’s open lab note books

• Workflows and the result of the workflows are stored as separate lab note books
• Lab note book has details of the experiments performed
• Results of one experiment may be invoked for analysis in another experiment
• All versions of the workflow and the results are stored
• Flexibility to execute nested workflows

Custom APIs for exporting results to OSDD’s Open lab note book

Our Approach : Data & Tool integration

In addition to accessing heterogeneous sources of data like BioMart Central/UCSC Table Browser (http://genome.ucsc.edu/), the Open lab note book of http://sysborg2.osdd.net is interfaced with Galaxy

Standalone databases and tools

Tools as web services:
• Web services can be added as tools in Galaxy
• Extends the potential of Galaxy workflows

The process

Identify the module

Search for the WSDL

Code for client

Write XML for Galaxy

Configure & Integrate to Galaxy
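The "Write XML for Galaxy" step in the process above can be sketched as follows. This is a minimal illustration, not the wrappers OSDD actually wrote: it assumes a client script has already been coded against the service's WSDL, and the tool id, script name and parameter names are hypothetical. The element names (tool, command, inputs, param, outputs, data, help) follow the Galaxy tool XML layout as generally documented; details vary between Galaxy versions.

```python
import xml.etree.ElementTree as ET


def build_tool_xml(tool_id, name, client_script, param_name, param_label):
    """Return a minimal Galaxy tool wrapper (XML text) for a web-service client."""
    tool = ET.Element("tool", id=tool_id, name=name, version="0.1.0")
    ET.SubElement(tool, "description").text = "web service client (illustrative)"

    # Command line Galaxy would run; '$<name>' placeholders are filled in by Galaxy.
    command = ET.SubElement(tool, "command")
    command.text = f"python {client_script} '${param_name}' '$output'"

    # One free-text input parameter shown in the Galaxy form.
    inputs = ET.SubElement(tool, "inputs")
    ET.SubElement(inputs, "param", name=param_name, type="text", label=param_label)

    # One output dataset written by the client script.
    outputs = ET.SubElement(tool, "outputs")
    ET.SubElement(outputs, "data", name="output", format="txt")

    ET.SubElement(tool, "help").text = "Calls a remote web service and saves the reply."
    return ET.tostring(tool, encoding="unicode")


# Hypothetical example: wrap a client script written against a service's WSDL.
xml_text = build_tool_xml(
    tool_id="ws_get_entry",
    name="Get entry by accession",
    client_script="get_entry_client.py",
    param_name="accession",
    param_label="Accession ID",
)
print(xml_text)
```

The final "Configure & Integrate" step would then register the generated file in the Galaxy instance's tool configuration; the exact procedure depends on the Galaxy version in use.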

ChemBio toolkit : >300 Modules integrated by OSDD Community

S. No | Resources | Clients
1 | KEGG: Kyoto Encyclopedia of Genes and Genomes | 60
2 | GetEntry: DDBJ sequence search by accession ID | 43
3 | GPSR: tools | 33
4 | PDB: Protein Data Bank | 30
5 | BioModel: mathematical models of biological DB | 25
6 | Gtps: Gene Trek in Prokaryote Space | 8
7 | WSDbfetch: retrieve entries from biological dbs using entry identifiers or accession no. | 7
8 | Gibv: Genome Information Broker for Viruses | 7
9 | DDBJ: DNA Data Bank of Japan | 7
10 | Mafft: a multiple sequence alignment program | 4
11 | Fasta: DDBJ database | 4
12 | Ensembl: maintains automatic annotation | 4
13 | VecScreen: vector contamination | 4
14 | OMIM: Online Mendelian Inheritance in Man | 4
15 | Gtop: Gene-product Informatics | 3
16 | GO: Gene Ontology | 3
17 | SPS: Splicing Profile based Score | 2
18 | GIBIS: Genome Information Broker for Insertion Sequence | 1
19 | RefSeq: database of sequence | 1
20 | GIB: Genome Information Broker | 1
21 | GIBEnv: DDBJ database | 1
22 | TxSearch: Database indexing & searching | 1

OSDD Community suggests tools for integration in Galaxy

PubChem Bioassay data (approx. 100,000 molecules/dataset)

6000 descriptors/molecule

Successful Models

Screen PubChem (30 million)

Data amplification: Cheminformatics

Potential Hits
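To give a feel for the scale of the numbers on this slide, the sketch below estimates the size of a descriptor matrix. It assumes every descriptor is stored as an 8-byte floating point value and that all descriptors are kept for every molecule; neither assumption comes from the slides, and the function name is illustrative.

```python
def descriptor_matrix_size(molecules, descriptors, bytes_per_value=8):
    """Return (number of values, size in gigabytes) for a dense descriptor matrix."""
    values = molecules * descriptors
    return values, values * bytes_per_value / 1e9


# One PubChem Bioassay dataset as described on the slide.
values, gigabytes = descriptor_matrix_size(100_000, 6_000)
print(f"{values:,} values, about {gigabytes:.1f} GB")
# prints "600,000,000 values, about 4.8 GB"
```

Under the same assumptions, the 30 million molecules to be screened would need roughly 300 times as much storage as one dataset, which suggests why the work was moved to grid resources.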

o Downsizing and random validation require multiple calculations for validation of results
o Cross validation is repeated up to 50+ times for each experiment
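The repeated validation described above can be sketched in plain Python. This is an illustrative outline only: it assumes "downsizing" means randomly sampling the larger class down to the size of the smaller one, and it uses a toy one-descriptor threshold model in place of the Weka models used by the community. All function names are hypothetical.

```python
import random


def downsize(actives, inactives, rng):
    """Randomly sample both classes down to the size of the smaller one."""
    n = min(len(actives), len(inactives))
    return rng.sample(actives, n), rng.sample(inactives, n)


def k_fold_indices(n_items, k, rng):
    """Shuffle item indices and deal them into k folds of near-equal size."""
    order = list(range(n_items))
    rng.shuffle(order)
    return [order[i::k] for i in range(k)]


def repeated_cv(actives, inactives, evaluate, k=5, repeats=50, seed=0):
    """Run k-fold cross validation `repeats` times on freshly downsized data."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        pos, neg = downsize(actives, inactives, rng)
        data = [(x, 1) for x in pos] + [(x, 0) for x in neg]
        for held_out in k_fold_indices(len(data), k, rng):
            held = set(held_out)
            test = [data[i] for i in held_out]
            train = [d for i, d in enumerate(data) if i not in held]
            scores.append(evaluate(train, test))
    return scores


def threshold_accuracy(train, test):
    """Toy model: cut at the midpoint between the two class means."""
    def mean(vals):
        return sum(vals) / len(vals)
    cut = (mean([x for x, y in train if y == 1]) +
           mean([x for x, y in train if y == 0])) / 2
    return sum((x > cut) == bool(y) for x, y in test) / len(test)


# Made-up single-descriptor data: 20 actives, 200 inactives.
actives = [1.0 + 0.05 * i for i in range(20)]
inactives = [0.01 * i for i in range(200)]
scores = repeated_cv(actives, inactives, threshold_accuracy, k=5, repeats=50)
print(len(scores), "fold scores, mean accuracy", sum(scores) / len(scores))
```

With 5 folds and 50 repeats a single experiment already means 250 model fits, which is the kind of load the grid facilities described next are meant to absorb.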

C-DAC’s Garuda Grid – Indian Grid Computing Initiative

C-DAC is an R&D organization under the Ministry of Communication & Information Technology, India

C-DAC’s Garuda Grid is targeted at providing a facility for the scientific community, which would enable them to seamlessly access the distributed resources.

Compute power of GARUDA: ~70 TFLOPS (6000 CPUs)

Currently there are 55 Garuda Partners

Has NKN (National Knowledge Network) connectivity at 10Gbps

Features:

Customized Galaxy on GARUDA

• Integrated with Grid Authentication mechanism - Indian Grid Certificate Authority (IGCA)

• Integrated with Gridway Metascheduler - Job scheduling and management

• Integrated OSDD tools - Weka (for data mining) and Autodock (Virtual screening).

• Provided support to upload multiple input files as tar file

• Data libraries of OSDD community are uploaded and are shared by all users

• Integrated with PostgreSQL

Garuda- Galaxy Job Submission - Flow

Garuda-OSDD Server

Galaxy GUI

1. User selects tool and Input parameters

Galaxy Job Manager

Gridway Job runner

2. Based on the tool, the job manager sends the job to the correct runner.

3. Gridway job runner uses user’s Garuda proxy file for job submission
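The submission flow above can be sketched as a small dispatcher. This is not Galaxy's or GridWay's actual code or API: the class names, the tool-to-runner mapping and the proxy path are hypothetical, and the runners only return a string where a real runner would submit a job.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    tool_id: str
    params: dict = field(default_factory=dict)
    user_proxy: str = ""  # path to the user's grid proxy file (hypothetical)


class LocalRunner:
    def submit(self, job):
        return f"local:{job.tool_id}"


class GridRunner:
    def submit(self, job):
        # Step 3: the grid runner needs the user's proxy file to submit.
        if not job.user_proxy:
            raise ValueError("grid jobs need the user's proxy file")
        return f"grid:{job.tool_id}:{job.user_proxy}"


class JobManager:
    def __init__(self, runners, tool_to_runner, default="local"):
        self.runners = runners
        self.tool_to_runner = tool_to_runner
        self.default = default

    def dispatch(self, job):
        # Step 2: choose the runner based on the tool the user selected.
        runner_name = self.tool_to_runner.get(job.tool_id, self.default)
        return self.runners[runner_name].submit(job)


manager = JobManager(
    runners={"local": LocalRunner(), "grid": GridRunner()},
    tool_to_runner={"autodock": "grid", "weka": "grid"},
)
# Step 1: the user selects a tool and input parameters.
print(manager.dispatch(Job("autodock", {"ligand": "mol1"}, "/tmp/user_proxy")))
print(manager.dispatch(Job("upload")))
```

Keeping the tool-to-runner mapping in one table means a new grid-enabled tool can be added without touching the runners themselves.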

Internet

Weka in Galaxy

Garuda Usage by OSDD: Job Accounting

High Performance Grid Computing for OSDD members

Anshu Bhardwaj, Council of Scientific & Industrial Research (CSIR), India

Chintalapati Janaki, Center for Development of Advanced Computing (C-DAC), India

www.osdd.net 25-26 May 2011

Customized Galaxy with applications as Web Services and on the Grid for Open Source Drug Discovery (OSDD)

A CSIR led team India consortium with global partnership for affordable healthcare

“In the long history of humankind those who have learned to collaborate and improvise most effectively have prevailed.” -- Charles Darwin

Cheminformatics: a strong case for community collaborative science

There is now an incredibly rich resource of public information relating compounds, targets, genes, pathways, and diseases. Just for starters there is in the public domain information on:

• ~30 million compounds and ~500,000 bioassays (PubChem, ChemSpider)
• ~60 million compound bioactivities (PubChem Bioassay)
• ~5,000 drugs (DrugBank)
• ~9 million protein sequences (SwissProt) and ~60,000 3D structures (PDB)
• ~14 million human nucleotide sequences (EMBL)
• ~20 million life science publications (PubMed)
• Multitude of other sets (drugs, toxicogenomics, chemogenomics, metagenomics …)

I have thus chosen ‘Cheminformatics’ to study the vast pool of chemical compounds much more in details and analyze so as to narrow down to potential drug candidate. With the unique combination of IT and Chemistry, I am confident that one can actually derive much more meaningful information of a chemical entity on this earth. --- Rajdeep (BioIT)

I am organic chemist. I prepared several organic molecules. We go for biological activity, maximum times it gives negative result. But with help of informatics in chemistry we can predict molecular properties. We can replace many ligands or substituents or functional group easily. And we can design our desirable molecule. --- Chirupulo

I am doing my M.Pharm in pharmaceutical chemistry, and I like cheminformatics that I need accurate results but soon... and I am really interested in molecular modelling... so I am here. --- Haffy manaf

Cheminformatics deals with information about chems. It combines tools and techniques of IT for information about chemical entities at the finger tip on click of a mouse. Databases are available for properties of descriptors. Softwares help to calculate molecular properties. Cheminformatics thus come handy tool for learning chemistry. --- Dr Keshav Mohan

Community Speaks: What excites them about Cheminformatics

• Access to Journals for Chemical Structures
• Lack of proper communication systems other than Skype
• Lack of software tools for accelerated drug discovery
• Need of high speed internet
• Need more experts to teach/train community members
• Proper time schedule of IU cheminformatics classes

Challenges in implementation of Cheminformatics projects

Indiana University Initiatives (Prof David J Wild)

Cheminformatics Awareness

http://icep.wikispaces.com

Association Search – visualize literature supported associations between any two entities (compound, drug, gene, pathway, disease, side effect). PLoS One, in press.

Semantic Link Association Prediction (SLAP) – find most highly associated entities (compound, drug, gene, pathway, disease, side effect) to any other entity, based on probabilistic weightings of graph edges based on public experimental datasets. Paper in preparation

BioLDA – find most highly associated entities to any other entity based on a complex topic model analysis of the literature (PubMed). PLoS One, 2011, 6 (3), e17243

See also: WENDI (J. Cheminf., 2010,2,6); Chemogenomic Explorer (BMC Bio. 2011,12,256), ChemLDA, ChemBioGrid (J. Chem. Inf. Model., 2007; 47(4) pp 1303-1307)

Tools Developed for Large Scale Bio-Chemical Data Mining

OSDD virtual resources

Cheminformatics

Curated molecule datasets

Cheminformatics Models

Data Mining and Analysis

HT Virtual screening

PubChem

ChEMBL

DrugBank

Experimental Assays

Community of About 400

Other Active Communities:
• OSDD Women Scientists Forum
• OSDD Junior Scientists Forum

Ideal Case US-India Cheminformatics Collaboration

IU CCRG

Research

Education Industry partnerships OSDD

Wet lab research

Open cheminfo. group

Many interested students

Funding for research in U.S.

$1.3m NIH

$360,000 Eli Lilly

$120,000 Pfizer

Funding for research in OSDD

$46m Govt

$0

But in order to sustain…?

Most of the biologists and chemists do not use computational workflows for their analysis

Awareness about the advantages of using such workflow engines

The Community needs to be trained for using the workflows

The Community needs to be trained for integrating applications

Web services vs standalone applications – each have their own set of advantages and limitations

Developers of algorithms should be encouraged to report results in globally accepted standard formats with standard ontologies

What should be our approach to reach out and integrate?

Assembly line for drug discovery

I Biological Repository
i. Open access clinical strains repository
ii. Open access clone repository
iii. Open access protein repository

II Chemical Repository
i. Open access small molecule repository

III Open Screening Facility
i. Submit your compounds for anti-tuberculosis screening

OSDD Open Access Resources

Inhibition of FAAL and FACL enzymes by acyl-sulfamoyl analogues

[Figure: chemical structures of the analogues; compound labels on the slide: s12, s14, s15]

Preclinical development of thiophene containing trisubstituted methanes

• Five synthetic ‘thiophene containing trisubstituted methanes’, which showed an MIC of <1.56 µg/ml and no cytotoxicity in mammalian cells, are being synthesised in PPP mode

Public Private Partnerships as Open Collaborative Endeavors to solve Scientific Challenges

Collaboration with TB Alliance on Human Clinical Trials

PA-824 in combination with other drugs

Affordable Healthcare for All

Systems Biology

Target based approach

Ligand based approach

Hit to Lead

Human Clinical Trials

An Innovative Approach to Drug Discovery: A New Paradigm

Value

Biology/ Genomics

Target Identification

Target Validation

Hit(s)

Validated/ Quality Lead

Optimised Candidate Drug

Clinical Trials

Registered Drug

Risk

High Risk, Innovation Driven Sphere Strategy-> Open Innovation with best minds from academia/ industry

Process Oriented – Strategy-> Industry CRO’s Participation

Strategy-> OSDD to support clinical trials in collaboration with pharma

Innovation Funnel

Drugs to be available without IP encumbrances

Major International Collaborations

Cheminformatics and e-learning

Structural Interactome to predict Off-Site Interactions of Drug Candidates

Metabolic Map Network Generation

Author, Angela Saini

Geek Nation: How Indian Science Is Taking Over The World

http://www.sunday-guardian.com/bookbeat/tour-of-indian-science-that-fails-to-see-full-picture

Science 24 February 2012: Vol. 335 no. 6071 p. 909

NEWS FOCUS

OSDD Portfolio

March 2012

OSDD Community &

The Team Leaders

Not all are shown

Target Validation

PPI Validation

Cloning of potential drug targets

Galaxy Integration with Grid

Some of the OSDD PIs

Mtb Systems Biology

Mtb Genome Analysis

OSDDChem

Email: anshub@osdd.net Skype: anshu.bhardwaj

Cheminformatics Community + E-learning

OSDD : A Global Community - More than 5500 members from over 130 countries

Statistics as of March 2012

Open Source Drug Discovery (OSDD) Model “Team India Consortium with International Participation”

Council of Scientific and Industrial Research (CSIR), India

Current Partners

Mycobacterium tuberculosis

Wiki Portal

Exchange of Ideas/Results Community Participation

Lead Molecules Drug

Contract Research Organisations

Academia & Hospitals

Open Synthesis and Exchange of Knowledge

PRECLINICAL & CLINICAL TRIAL

Candidate Targets

in silico SCREENING

in vivo VALIDATION

Lead Organization

Together we can … and we should!

Matt Smadley | Flickr.com

http://www.osdd.net http://c2d.osdd.net

http://sysborg2.osdd.net

Email: info@osdd.net anshub@osdd.net abhik1368@gmail.com Skype: anshu.bhardwaj

http://scienceopenscience.blogspot.com/2011/12/osdd-song.html
