Transcript
Page 1: Chemicalize org: Adding structures to web pages and data and Web links to structures: ACS Anaheim 2011

Adding structures to Web pages and data to structures

Alex Allardyce

ChemAxon

Presented at ACS Spring Meeting, Anaheim, 2011

Page 2: Chemicalize org: Adding structures to web pages and data and Web links to structures: ACS Anaheim 2011

Demo – index page

• Lay out input box
• Recently chemicalized, recent queries…
• Drag and drop structure images
• Help, about

Example: http://www.chemicalize.org/

Page 3: Chemicalize org: Adding structures to web pages and data and Web links to structures: ACS Anaheim 2011

Demo – chemicalizing a Web page

• URL paste
• Structure images
• TOC and links
• Properties link from mouse over image
• Download
• Links work

Example: http://www.chemicalize.org/?url=http%3A%2F%2Fen.wikipedia.org%2Fwiki%2FPenicillin
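The example link above passes the target page as a percent-encoded `url` query parameter. A minimal sketch of building such a link, assuming only the URL pattern visible in the example (the helper name `chemicalize_url` is hypothetical and not part of any chemicalize.org API):

```python
from urllib.parse import quote

# Base address taken from the example link on this slide.
CHEMICALIZE_BASE = "http://www.chemicalize.org/"


def chemicalize_url(page_url: str) -> str:
    """Return a link of the form shown in the slide's example.

    safe="" makes quote() encode ':' and '/' as well, which matches
    the %3A and %2F sequences in the example.
    """
    return f"{CHEMICALIZE_BASE}?url={quote(page_url, safe='')}"


print(chemicalize_url("http://en.wikipedia.org/wiki/Penicillin"))
# http://www.chemicalize.org/?url=http%3A%2F%2Fen.wikipedia.org%2Fwiki%2FPenicillin
```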

Page 4: Chemicalize org: Adding structures to web pages and data and Web links to structures: ACS Anaheim 2011

Demo – Structure based predictions

• Properties
• Manage views, move boxes
• Open MarvinView from double click on any structure image
• Calculate on demand
• Download results

Example: http://www.chemicalize.org/structure/#!mol=Penicillin&source=parser

Page 5: Chemicalize org: Adding structures to web pages and data and Web links to structures: ACS Anaheim 2011

Demo – Structure search

• Chem search pages
• Search from Calculate properties
• Open Marvin, power query features
• Similarity default search, see other types
• Choose a structure
• List of URLs, chemicalized links
• Show structures
• Combine chem search with URL
• Download results

Examples:
1. http://www.chemicalize.org/search/#m=Penicillin/t=t/h=0
2. http://www.chemicalize.org/search/#m=Penicillin/t=t/h=0/c=46260/p=0
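The search links above keep their state in the URL fragment as slash-separated `key=value` pairs. A minimal sketch of reading those pairs back, assuming only the pattern visible in the two examples; the slides do not say what the keys `m`, `t`, `h`, `c` and `p` mean, and `parse_search_fragment` is a hypothetical helper name:

```python
from urllib.parse import urlsplit


def parse_search_fragment(url: str) -> dict[str, str]:
    """Split the part after '#' into key/value pairs.

    Segments without '=' are ignored. No meaning is assigned to the
    keys; the slides do not document them.
    """
    fragment = urlsplit(url).fragment
    pairs = (part.split("=", 1) for part in fragment.split("/") if "=" in part)
    return {key: value for key, value in pairs}


state = parse_search_fragment(
    "http://www.chemicalize.org/search/#m=Penicillin/t=t/h=0/c=46260/p=0"
)
print(state)
```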

Page 6: Chemicalize org: Adding structures to web pages and data and Web links to structures: ACS Anaheim 2011

Demo – Web search

• Define chem and non-chem text query
• Structure synonyms in query
• Structures in results panel: “like web text search + structures in the results”

Example: http://www.chemicalize.org/websearch/#m=Serotonin+sexual+preference+site%3Anature.com/p=0

Page 7: Chemicalize org: Adding structures to web pages and data and Web links to structures: ACS Anaheim 2011

Who are we

• 70+ people making cheminformatics toolkits and GUIs in Budapest, Hungary

• 4 areas of technology:
  • Cheminformatics platform toolkits
  • Discovery toolkits
  • Desktop applications
  • Markush and IP

• Lots of web ready chemistry functionality to play with

• Emerging as industry leader in platform cheminformatics

Page 8: Chemicalize org: Adding structures to web pages and data and Web links to structures: ACS Anaheim 2011

Why did we do this

History
• Free academic package and FreeWeb licensing since 2005
• Marvin free for all desktops (since the beginning)
• Open support forum developed to allow support for free users (no login to see all threads)

Page 9: Chemicalize org: Adding structures to web pages and data and Web links to structures: ACS Anaheim 2011

So why did we do this

• There is a lot of content on the web
• Useful + increase visibility/utility of chemical structures
• Creates user interest in this type of functionality and so demand for chemistry and content for publishers
• Lets us develop directly with end users:
  • Functionality/feature development
  • GUI usability
  • Crowd sourced bug fixing: “Report Error” for naming
• Pushing state of the art
  • Browser tech (SVG, chunking, reducing calls)
  • ChemAxon tech (on the web, must be superfast, finalise features)
• We love cheminformatics: “cheminfomaniacs”

Page 10: Chemicalize org: Adding structures to web pages and data and Web links to structures: ACS Anaheim 2011

chemicalize.org under the hood

Web application (15 kloc):
• MySQL: DB engine, structure/text storage
• ChemAxon bits: see below
• Apache Tomcat: servlet container with code logic
• jQuery + plugins: UI interactions with code logic
• A fair bit of home grown (46% of code) here

Page 11: Chemicalize org: Adding structures to web pages and data and Web links to structures: ACS Anaheim 2011

ChemAxon bits

• Marvin: structure editor, viewer, image generation
• Name <> structure, Document to Structure: parsing, dictionaries and lexing IUPAC names
• JChem Base, JChem Web Services, Standardizer, MCES: structure database, duplicate checking, structure search, web services layer, canonicalization, hit highlighting

• Calculator Plugins: structure based predictions like pKa, logP, logD, charge, HBDA, tautomer, stereoisomers, etc. Notable combined predictions yield argument results – like “Lipinski-likeness” etc
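As an illustration of how a combined result such as "Lipinski-likeness" can be derived from individual predictions, here is a minimal sketch of the widely published rule of five. It is not ChemAxon's implementation: the `Properties` class and both function names are hypothetical, the property values are assumed to have been calculated elsewhere, and the numbers in the usage lines are made up for illustration rather than taken from real compounds.

```python
from dataclasses import dataclass


@dataclass
class Properties:
    """Precomputed property values for one structure (hypothetical container)."""
    mol_weight: float       # molecular weight in Da
    logp: float             # predicted octanol/water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(p: Properties) -> int:
    """Count how many of the four rule-of-five limits are exceeded."""
    checks = [
        p.mol_weight > 500,
        p.logp > 5,
        p.h_bond_donors > 5,
        p.h_bond_acceptors > 10,
    ]
    return sum(checks)


def is_lipinski_like(p: Properties, max_violations: int = 1) -> bool:
    """The usual reading of the rule tolerates at most one violation."""
    return lipinski_violations(p) <= max_violations


# Illustrative values only, not measurements or predictions for real compounds.
small = Properties(mol_weight=350.0, logp=2.1, h_bond_donors=2, h_bond_acceptors=5)
large = Properties(mol_weight=720.0, logp=6.3, h_bond_donors=2, h_bond_acceptors=5)
print(is_lipinski_like(small), is_lipinski_like(large))
```

Keeping the violation count separate from the yes/no answer lets a caller report which structures fail narrowly and which fail on several limits.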

Page 12: Chemicalize org: Adding structures to web pages and data and Web links to structures: ACS Anaheim 2011

Use cases:

• Wanted to know the logP of…
• What are the structures for known drugs (http://en.wikipedia.org/wiki/List_of_drugs)
• Seeing structures in relation to the name
• All Wikipedia pages with a “chembox” have been indexed by chemicalize.org so can be searched by structure search (substructure, similar, exact)
• See all similar structures (and names) for any similar structure: sildenafil = viagra, lodenafil, aildenafil, udenafil …
• Draw a structure and see its name
• Automatically chemicalize my blog (WordPress plugin)

Page 13: Chemicalize org: Adding structures to web pages and data and Web links to structures: ACS Anaheim 2011

Stats: Raw numbers (Apr 1, 2010 – Mar 25, 2011)

• URLs visited: 232,648
• Total number of names: 3,383,947 (14.58 names/page)
• Unique names extracted: 220,117
• Structures extracted: 175,598
• Total number of unique visitors: 44,535
• Average number of visitors/day (March 2011): 212
• Average/longest time on site: 4:03 / 28:41 (min:sec)

Page 14: Chemicalize org: Adding structures to web pages and data and Web links to structures: ACS Anaheim 2011

What are they doing on the site

Page 15: Chemicalize org: Adding structures to web pages and data and Web links to structures: ACS Anaheim 2011

How busy are they?

Page 16: Chemicalize org: Adding structures to web pages and data and Web links to structures: ACS Anaheim 2011

Top domains

Total domains: 13,390 (average 17.29 URLs per domain)

Page 17: Chemicalize org: Adding structures to web pages and data and Web links to structures: ACS Anaheim 2011

Top pages

1. en.wikipedia.org/wiki/List_of_anaesthetic_drugs
2. www.reactivereports.com/chemistry-blog/arty-with-a-capital-f-and-the-myth-of-absinthe.html/comment-page-1
3. en.wikipedia.org/wiki/Penicillin
4. en.wikipedia.org/wiki/Aspirin
5. en.wikipedia.org/wiki/Paracetamol
6. www.ncbi.nlm.nih.gov/sites/entrez?db=pccompound&term=aspirin
7. en.wikipedia.org/wiki/List_of_organic_compounds
8. www.biomedcentral.com/info/ifora/figuretypes/
9. www.freepatentsonline.com/y2005/0037033.html
10. www.vivo.colostate.edu/hbooks/pathphys/endocrine/pancreas/insulin_phys.html

Data only available for last 2 weeks

Page 18: Chemicalize org: Adding structures to web pages and data and Web links to structures: ACS Anaheim 2011

Usage statistics – predictions

Page 19: Chemicalize org: Adding structures to web pages and data and Web links to structures: ACS Anaheim 2011

Future plans..?

• Remaining free
• Crowdsourcing – new structures/names, bug reporting
• Working on sorting and ordering results (biggie)
• Personalization (login) = personal search history, profiles (notifications), dictionaries, calculation/search parameter settings
• Index page as window into internet chemistry use
• Browser plugins = chemicalize better, particularly in login/https pages (plugins tech approaching unity anyway)
• How about working up the chemistry side such as pharmacophore search, other screening, etc - there is a lot of ChemAxon tech here to play with
• Work on quality of name parsing, black lists etc
• What else guys – this is a provisional list

Page 20: Chemicalize org: Adding structures to web pages and data and Web links to structures: ACS Anaheim 2011

Thanks to

• Andras Stracz: Site implementation
• Daniel Bonniot: Document & Name to structure
• Alex Allardyce, Ferenc Csizmadia: Features, project management, idiot and advanced testing
• Zsolt Kocsmarszky: Design
• Roland Molnar: JChem Web Services

