UNM Division of Biocomputing public web applications


Upload: jeremy-yang

Post on 04-Jun-2015

DESCRIPTION

Computational tools for cheminformatics and molecular discovery. Poster presented at the ChemAxon 2010 US UGM.

TRANSCRIPT

Page 1: UNM Division of Biocomputing public web applications

Mesa OpenEye OpenBabel CDK

UNM Division of Biocomputing public web applications: Computational tools for cheminformatics and molecular discovery

Jeremy Yang, Division of Biocomputing, University of New Mexico, Albuquerque, New Mexico, USA
ChemAxon US User Group Meeting, Boston, September 13-15, 2010

● Public web applications
● To benefit and engage the scientific community
● Cheminformatics and biomolecular discovery
● http://pasilla.health.unm.edu
● Employs diverse set of commercial, open-source and community-based components
● Limited bandwidth; etiquette essential

WWW reigns! Now superpowered.

Over the roughly 15-year history of the World Wide Web (WWW), the prevalence and usefulness of web applications have increased continuously. The “Web OS” paradigm is increasingly a reality, given tools such as Google Docs and Microsoft Office Web Apps, web services and cloud computing. Although web apps are not new per se, they now offer greatly enhanced capabilities thanks to continual and dramatic improvements in (1) network bandwidth, (2) processor power, (3) web software development methods and (4) online data resources. In short, each year we can in practice do things with web apps that we could not do the year before. What has not changed is the primary motivation for adopting web apps: deploying functionality via web apps is efficient and reliable for users, developers and managers. In addition, and importantly, web apps are generally easy to use by virtue of their common features, although this has changed somewhat as web UIs have become more complex.

Web apps in cheminformatics

In cheminformatics, the emergence of high-quality programming toolkits, both commercial and community-based, has facilitated web app development with highly diverse aims and methods. Major software providers such as Daylight, Accelrys, ChemAxon and Chemical Computing Group have embraced web technologies, understanding their advantages and broad appeal. Web apps range from large-scale enterprise tools (e.g. database access) to special-purpose rapid prototypes for researching new algorithms. Diverse areas of research have been addressed with web apps, from toxicology prediction to 3D macromolecular visualization to quantum chemistry.

Science enabled via WWW

Web apps can also amount to a sort of scientific publishing. A journal article on computational methodology can leave many questions unanswered; just as a picture can be “worth 1000 words”, a functioning program might arguably be worth 1000 pictures. In addition, agencies funding public research (such as NIH), where outcomes include computational methodology, quite reasonably and wisely require that such methodology and software be disseminated in effective, extensible, sustainable ways. Web apps developed to modern software engineering standards can achieve these goals well. Recently our app smartsfilter was used for a tuberculosis study (1).

ChemTattoo: colorized commonalities (4)

UNM Division of Biocomputing

The UNM Division of Biocomputing is a multi-disciplinary research group within the Dept. of Biochemistry and Molecular Biology, in the UNM School of Medicine. Research areas include drug discovery, drug informatics, cheminformatics, bioinformatics, machine learning, QSAR, and lead and probe identification. A major effort of the group is providing biomolecular screening informatics support for the UNM Center for Molecular Discovery (UNMCMD), an NIH Roadmap Molecular Libraries Program screening center. As with UNMCMD, a long-term collaboration with Givaudan Flavors S&T has involved web apps and contributed to their development. These activities, and various other projects with geographically diverse collaborators, have motivated extensive use of web apps.

Division of Biocomputing Personnel:

● Tudor I. Oprea (chief)
● Cristian Bologa
● Stephen Mathias
● Jerome Abear
● Oleg Ursu
● Gergely Zahoransky-Kohalmi
● Jeremy Yang

Rapid prototyping enables research

For scientific research, web apps provide powerful rapid prototyping capabilities. Effective research often depends on testing and iteratively modifying hypotheses and algorithms, and on communication among collaborators. When developers can rapidly deploy a web app implementing a new algorithm, a team, perhaps geographically dispersed, can experiment, collaborate, and accelerate progress.

Bio-Activity Datamining Associative Promiscuity Pattern Learning Engine

iPHACE: Integrated Pharmacology Space Exploration (5)

Starting lineup of cool tools

The new public web app server offers a diverse set of tools. Some are quite simple but useful routine functions, such as depicting molecules or converting among file formats. clustermols.cgi provides a front-end for a successful command-line product (Mesa's Grouping Module). Others implement complex and experimental methodology, such as iPHACE (integrative navigation of pharmacological space) and “model-free” drug-likeness (2). Several scientific software packages have been employed, including ChemAxon, Mesa, OpenEye, CDK, OpenBabel, SciTouch and others. Generic development tools include Java, Python, R, and Gnuplot. We anticipate that the menu of web apps will expand and evolve with new research and projects.
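The pattern of putting a thin web front-end on an existing computational routine can be illustrated with a small sketch. It is not the code behind any of the apps named above, which rely on the toolkits listed; it uses only the Python standard library, and the Tanimoto coefficient on plain feature sets stands in for a real cheminformatics calculation. The names `tanimoto`, `app` and `call_app`, and the query parameters `a` and `b`, are hypothetical and chosen for illustration only.

```python
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults


def tanimoto(features_a, features_b):
    """Tanimoto (Jaccard) coefficient of two feature collections."""
    set_a, set_b = set(features_a), set(features_b)
    union = set_a | set_b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(set_a & set_b) / len(union)


def app(environ, start_response):
    """WSGI entry point (hypothetical): ?a=1,2,3&b=2,3,4 returns plain text."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    try:
        features_a = params["a"][0].split(",")
        features_b = params["b"][0].split(",")
    except KeyError:
        # A required query parameter is missing.
        start_response("400 Bad Request", [("Content-Type", "text/plain")])
        return [b"expected query parameters 'a' and 'b'\n"]
    body = f"{tanimoto(features_a, features_b):.3f}\n".encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]


def call_app(query):
    """Invoke the app in-process, without a network, for quick testing."""
    environ = {}
    setup_testing_defaults(environ)
    environ["QUERY_STRING"] = query
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body.decode("utf-8")
```

To try it locally, the same `app` object could be served with `wsgiref.simple_server.make_server("", 8000, app).serve_forever()`. Keeping the calculation in an ordinary function, separate from the request handling, means the routine can be tested and reused without the web layer.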

WWW a great machine. WWW are We.

Our current capability to deliver great computational power via the WWW owes a debt of thanks to the communities of developers and users who have pushed all the component technologies forward. In addition to the outstanding scientific software vendors and projects, the generic components that have enabled progress are many and spectacularly significant. A few of particular importance to the web are Mozilla/Firefox, Apache HTTP Server, Apache Tomcat, Java, Perl, Python, Ruby, PHP, JavaScript, CSS and the many contributed packages and libraries written for these major projects. Web standards have progressed in the now typical semi-regulated, semi-democratic, two-steps-forward-one-step-back fashion. Viewed in toto and in retrospect, it is startling to consider the resulting global organization and interoperability of these many and disparate efforts. Despite its flaws and ongoing evolution, the WWW can be viewed as one of the most complex, most powerful, most inherently interactive machines ever built. As we pursue goals in computational and data-intensive (“Fourth Paradigm”) research, engaging the WWW effectively will be essential.

References:

(1) S. Ekins, “Analysis and hit filtering of a very large library of compounds screened against Mycobacterium tuberculosis”, Molecular BioSystems, 2010 (in press).

(2) O. Ursu, T. I. Oprea, “Model-Free Drug-Likeness from Fragments”, J. Chem. Inf. Model., published on the web July 22, 2010.

(3) John D. MacCuish, Norah E. MacCuish, “Clustering in Bioinformatics and Drug Discovery”, Chapman & Hall, 2010.

(4) N. E. Shemetulskis, D. Weininger, C. J. Blankley, J. J. Yang, C. Humblet, “Stigmata: An Algorithm to Determine Structural Commonalities in Diverse Datasets”, JCICS, 1996, Vol 36, No 4, pp 862-871.

(5) R. Garcia-Serna, O. Ursu, T. I. Oprea, J. Mestres, “iPHACE: integrative navigation in pharmacological space”, Bioinformatics, 2010, 26: 985-986.

“WWW dead” definitional delusion

Recent pronouncements that “The Web is dead” (Wired, Sept. 2010), supplanted by managed apps in closed environments (e.g. iPad), seem dubious and dependent on a narrow definition of the web (HTTP). Tim Berners-Lee's functional concept of the Semantic Web seems more enlightening: a connected global network of information resources well suited to humans and machines, using standardized communication protocols and semantics. Insofar as the Internet platform enables this vision, the Internet is the Web.

Powered by: SciTouch

Tools: SMARTS filters, clustering (3), drug-likeness (2), property profiling, depiction, ROC curves, format conversion, similarity