TRANSCRIPT
- Slide 1
- Ontologies in Physical Science Onto Workshop, ed.ac.uk 2013-04-11 An #animalgarden production Peter Murray-Rust, University of Cambridge & Open Knowledge Foundation
- Slide 2
- PMR and friends want us to help build a computational chemistry ontology. Is it an important problem? $1,000,000,000/yr for compchem. They need OWL. Problem: how to build ontologies when people are uninterested or antagonistic, even though we have the technology.
- Slide 3
- Chemists don't use ontologies. Perhaps the chemists could use OWL-DL. Top-down schemas like AniML haven't (yet) taken off.
- Slide 4
- Are there any ontologies in physical science that work? Crystallographers build CIF dictionaries. The IUCr, right? Tell us about CIF. IUCr: International Union of Crystallography
- Slide 5
- CIF Core defines 500 common concepts. Like the wavelength of the radiation used. Or the volume of the crystal cell. CIF: http://www.iucr.org/cif
- Slide 6
- An example? Core dictionary (coreCIF) version 2.4.3. ID: _diffrn_ambient_temperature. For humans: Definition: The mean temperature in kelvins at which the intensities were measured. For machines (constraint + type): Range: 0.0 -> infinity. Type: numb. http://www.iucr.org/__data/iucr/cifdic_html/1/cif_core.dic/Idiffrn_ambient_temperature.html
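A minimal Python sketch of how a machine might use the constraint and type in such an entry. The `DictEntry` class and `validate` function are hypothetical names for illustration and are not part of any CIF software; only the item name, the type `numb`, and the range 0.0 -> infinity come from the slide. It assumes plain decimal values and does not handle other number forms a real CIF file may contain.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DictEntry:
    """Hypothetical in-memory form of one dictionary entry."""
    name: str
    type_code: str   # "numb" on the slide marks a numeric item
    minimum: float
    maximum: float


# Values copied from the slide: Type numb, Range 0.0 -> infinity.
AMBIENT_TEMPERATURE = DictEntry(
    name="_diffrn_ambient_temperature",
    type_code="numb",
    minimum=0.0,
    maximum=math.inf,
)


def validate(entry: DictEntry, raw: str) -> float:
    """Parse a raw string and check it against the entry's type and range."""
    if entry.type_code != "numb":
        raise ValueError(f"unsupported type: {entry.type_code}")
    value = float(raw)  # raises ValueError if raw is not numeric
    if not (entry.minimum <= value <= entry.maximum):
        raise ValueError(f"{entry.name}={value} is outside the allowed range")
    return value
```

Calling `validate(AMBIENT_TEMPERATURE, "293.0")` returns the parsed number, while a negative temperature is rejected because it falls below the declared minimum.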
- Slide 7
- Definition: The mean temperature in kelvins at which the intensities were measured. So everyone converts temperatures to use K? Yes! Today I swam at 273 K. But chemists want to use all sorts of different units. We MUST have a units ontology.
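To illustrate why a shared description of units matters, here is a small Python sketch that normalises temperatures to kelvin before they are stored. The unit identifiers (`"kelvin"`, `"celsius"`, `"fahrenheit"`) and the function name `to_kelvin` are assumptions made for this example; a real units ontology would supply its own identifiers.

```python
# Conversion functions keyed by a hypothetical unit identifier.
# Each one maps a value in that unit to kelvin.
TO_KELVIN = {
    "kelvin": lambda v: v,
    "celsius": lambda v: v + 273.15,
    "fahrenheit": lambda v: (v - 32.0) * 5.0 / 9.0 + 273.15,
}


def to_kelvin(value: float, unit: str) -> float:
    """Convert a temperature in a known unit to kelvin."""
    try:
        convert = TO_KELVIN[unit]
    except KeyError:
        raise ValueError(f"unknown temperature unit: {unit}") from None
    kelvin = convert(value)
    if kelvin < 0.0:
        raise ValueError("temperature below absolute zero")
    return kelvin
```

With a table like this, a value reported in any listed unit can be checked against a dictionary entry that is defined in kelvins.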
- Slide 8
- Is CIF a proper ontology? OWL? It's not RDF, but we've global URIs like cif:_diffrn_ambient_temperature, because IUCr controls the namespace. prefix: cif = http://www.iucr.org/cif
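A short Python sketch of how a prefixed name such as cif:_diffrn_ambient_temperature can be expanded into a global identifier. The prefix binding is the one given on the slide; the function name `expand` and the `/` used to join namespace and local name are assumptions, since the slide does not say how the two parts are joined.

```python
# Prefix table; the cif binding is the one shown on the slide.
PREFIXES = {"cif": "http://www.iucr.org/cif"}


def expand(qname: str, separator: str = "/") -> str:
    """Expand 'prefix:local' into a full URI using the prefix table.

    The separator is an assumption made for this sketch.
    """
    prefix, colon, local = qname.partition(":")
    if not colon or not local:
        raise ValueError(f"not a prefixed name: {qname!r}")
    if prefix not in PREFIXES:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return PREFIXES[prefix] + separator + local
```

Because one body controls the namespace, two files that both use the same prefixed name expand to the same identifier.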
- Slide 9
- CIF had 20 years of community involvement through IUCr. But most top-down chemistry projects don't work. So we'll do this bottom-up.
- Slide 10
- Every compchem program uses basically the same scientific concepts. We think each should build its own dictionary so we understand the output. Won't that just be a mess? No. It's the first step to interoperability.
- Slide 11
- The programs will use CML* for chemical output. Each annotates their own program output. Hyperchem builds ITS dictionary. NWChem builds ITS dictionary. CML*: Chemical Markup Language (PMR/Rzepa), http://www.xml-cml.org
- Slide 12
- Alpha-electrons: Hyperchem uses hchem:e_alpha. NWChem has nwchem:_alpha_elec. We agree they are the same, so create compchem:alphae in a communal cml:compchem dictionary that everyone uses.
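The mapping from program-specific terms to the communal term can be sketched in Python as a lookup table. The three term names are taken from the slide; the table layout and the helper names `to_communal` and `same_concept` are hypothetical and only illustrate the idea.

```python
# Program-specific terms mapped to the agreed communal term.
# Term names are from the slide; the table itself is illustrative.
COMMUNAL = {
    "hchem:e_alpha": "compchem:alphae",
    "nwchem:_alpha_elec": "compchem:alphae",
}


def to_communal(term: str) -> str:
    """Return the communal term, or the term unchanged if unmapped."""
    return COMMUNAL.get(term, term)


def same_concept(a: str, b: str) -> bool:
    """True when two terms resolve to the same communal term."""
    return to_communal(a) == to_communal(b)
```

Here `same_concept("hchem:e_alpha", "nwchem:_alpha_elec")` is true because both resolve to compchem:alphae, which is what lets output from the two programs be compared.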
- Slide 13
- What if the data structure or concepts don't map? CML provides conventions so each group can define their data structure. Data can then be machine validated against each convention!
- Slide 14
- But there are over 20 program codes. GULP, DPOLY, CASTEP, SIESTA, MOPAC: we've prototyped with many before. They'll be encouraged. I think it's going to work. BUT TTT*. TTT*: Things Take Time (Piet Hein)
- Slide 15
- Will it work? It depends on people. National labs CSIRO/AU and PNNL/US are committed. And we have companies like Hyperchem and Kitware. I wish we had some publishers.
- Slide 16
- We'll need tools. We've got FoX* for FORTRAN output, JUMBOTemplates to parse logfiles, RDF for navigating dictionaries. FoX*: XML/FORTRAN (Toby White, Andrew Walker)
- Slide 17
- Benefits of semantic dictionaries: FORTRAN logfile can be made semantic; high degree of interoperability in chemistry; semantic publication (HTML5, CML, MathML); interoperates with mainstream Web; easily scalable to other phys sci. Problems: closed code/minds is short-term market advantage; non-trivial commitment (updates, code revision); getting top-down approval (e.g. IUPAC).
- Slide 22
- Top-down schemas like AniML haven't (yet) taken off. Chemists don't use ANY ontologies. Perhaps the chemists could use OWL-DL.