ontologies in physical science onto workshop, ed.ac.uk 2013-04-11 an #animalgarden production peter...

Download Ontologies in Physical Science Onto Workshop, ed.ac.uk 2013-04-11 An #animalgarden production Peter Murray-Rust, University of Cambridge & Open Knowledge

If you can't read please download the document

Upload: luke-welch

Post on 22-Dec-2015

212 views

Category:

Documents


0 download

TRANSCRIPT

  • Slide 1
  • Ontologies in Physical Science Onto Workshop, ed.ac.uk 2013-04-11 An #animalgarden production Peter Murray-Rust, University of Cambridge & Open Knowledge Foundation
  • Slide 2
  • PMR and friends want us to help build a computational chemistry ontology Is it an important problem? $1,000,000,000/yr for compchem They need OWL Problem: How to build ontologies when people are uninterested or antagonistic even though we have the technology
  • Slide 3
  • Chemists dont use ontologies Perhaps the chemists could use OWL-DL Top-down schemas like AniML havent (yet) taken off
  • Slide 4
  • Are there any ontologies in physical science that work? Crystallo- graphers build CIF dictionaries The IUCr, right? Tell us about CIF IUCr: International Union of Crystallography
  • Slide 5
  • CIF Core defines 500 common concepts CIF: http://www.iucr.org/cif Or the volume of the crystal cell Like the wavelength of the radiation used
  • Slide 6
  • Core dictionary (coreCIF) version 2.4.3 _diffrn_ambient_temperature Definition: The mean temperature in kelvins at which the intensities were measured. Range: 0.0 -> infinity Type: numb An example ? ID For humans For machines: Constraint + type http://www.iucr.org/__data/iucr/cifdic_html/1/ cif_core.dic/Idiffrn_ambient_temperature.html
  • Slide 7
  • Definition: The mean temperature in kelvins at which the intensities were measured. So everyone converts temperatures to use K? Yes! today I swam at 273K But chemists want to use all sorts of different units We MUST have a units ontology
  • Slide 8
  • OWL? Is CIF a proper ontology? Its not RDF but weve global URIs, like cif:_diffrn_ambient_temperature Because IUCr controls the namespace prefix: cif= http://www.iucr.org/cif
  • Slide 9
  • CIF had 20 years of community involvement through IUCr But most top- down chemistry projects dont work So well do this bottom-up.
  • Slide 10
  • Every compchem program uses basically the same scientific concepts We think each should build its own dictionary so we understand the output Wont that just be a mess? No. Its the first step to interoperability.
  • Slide 11
  • Chemical Markup Language PMR/Rzepa http://www.xml-cml.org Hyperchem builds ITS dictionary Each annotates their own program output NWChem builds ITS dictionary The programs will use CML* for chemical output
  • Slide 12
  • in a communal cml:compchem dictionary that everyone uses We agree they are the same so create compchem:alphae Alpha-electrons: Hyperchem uses hchem:e_alpha NWChem has nwchem:_alpha_elec
  • Slide 13
  • What if the data structure or concepts dont map CML provides conventions so each group can define their data structure Data can then be machine validated against each convention!
  • Slide 14
  • But there are over 20 program codes. Weve prototyped with many before. Theyll be encouraged I think its going to work. BUT TTT* GULP, DPOLY, CASTEP, SIESTA, MOPAC TTT: Things Take Time (Piet Hein)
  • Slide 15
  • Will it work? It depends on people National labs CSIRO/AU and PNNL/US are committed And we have companies like Hyperchem and Kitware I wish we had some publishers
  • Slide 16
  • Well need tools Weve got FoX* for FORTRAN output JUMBOTemplates to parse logfiles RDF for navigating dictionaries FoX*: XML/FORTRAN Toby White, Andrew Walker
  • Slide 17
  • Benefits of semantic dictionaries: FORTRAN logfile can be made semantic High degree of interoperability in chemistry Semantic publication (HTML5, CML, MathML) Interoperates with mainstream Web Easily scalable to other phys sci. Problems: Closed code/minds is short-term market advantage Non-trivial commitment (updates, code revision) Getting top-down approval (e.g. IUPAC)
  • Slide 18
  • Benefits of semantic dictionaries: FORTRAN logfile can be made semantic High degree of interoperability in chemistry Semantic publication (HTML5, CML, MathML) Interoperates with mainstream Web Easily scalable to other phys sci. Problems: Closed code/minds is short-term market advantage Non-trivial commitment (updates, code revision) Getting top-down approval (e.g. IUPAC)
  • Slide 19
  • Benefits of semantic dictionaries: FORTRAN logfile can be made semantic High degree of interoperability in chemistry Semantic publication (HTML5, CML, MathML) Interoperates with mainstream Web Easily scalable to other phys sci. Problems: Closed code/minds is short-term market advantage Non-trivial commitment (updates, code revision) Getting top-down approval (e.g. IUPAC)
  • Slide 20
  • Benefits of semantic dictionaries: FORTRAN logfile can be made semantic High degree of interoperability in chemistry Semantic publication (HTML5, CML, MathML) Interoperates with mainstream Web Easily scalable to other phys sci. Problems: Closed code/minds is short-term market advantage Non-trivial commitment (updates, code revision) Getting top-down approval (e.g. IUPAC)
  • Slide 21
  • Benefits of semantic dictionaries: FORTRAN logfile can be made semantic High degree of interoperability in chemistry Semantic publication (HTML5, CML, MathML) Interoperates with mainstream Web Easily scalable to other phys sci. Problems: Closed code/minds is short-term market advantage Non-trivial commitment (updates, code revision) Getting top-down approval (e.g. IUPAC)
  • Slide 22
  • Top-down schemas like AniML havent (yet) taken off Chemists dont use ANY ontologies Perhaps the chemists could use OWL-DL