Term Project, 91.514
Luke P Immes, 05/09/2011
Translation from XML (and other sufficiently described input) to RDF/OWL, using Java/Groovy
Focus on two demonstrations
Translation Task
XML file, with elements and attributes
Desired path accesses each value
Or a well-defined string which, e.g., shows connectivity in a computer network (LAN).
We want to explicitly represent the above as instances of RDF/OWL classes. Optionally, we add relationships.
Why is Translation Important
XML, or another input description, leaves relationships and/or functional representation implicit.
Implied relationships are in neither the elements nor the attributes.
They may be added manually, by a human being, or automatically.
Make as much knowledge as possible explicitly represented.
Inputs
An existing RDF/OWL hierarchy
An existing XML file
<recipe name="server1">
  <baseOS edition="Enterprise" family="windows" language="en-US" os="2003" patchlevel="2"/>
</recipe>
An input string 'comp1 – comp2, comp23 – comp23...'
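The idea of a "desired path" that accesses each value can be sketched in plain Java with the JDK's XPath API. This is only an illustration, not the project's code: the class name RecipePaths, the method extract, and the chosen paths are assumptions; only the sample XML comes from the slide above.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper: evaluates a few "desired paths" against an XML string.
final class RecipePaths {
    // Sample XML taken from the Inputs slide.
    static final String SAMPLE =
        "<recipe name=\"server1\">"
      + "<baseOS edition=\"Enterprise\" family=\"windows\" "
      + "language=\"en-US\" os=\"2003\" patchlevel=\"2\"/>"
      + "</recipe>";

    // Returns path -> value, in the order the paths were given.
    static Map<String, String> extract(String xml, String... paths) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        Map<String, String> values = new LinkedHashMap<>();
        for (String path : paths) {
            try {
                // A fresh InputSource per path, because the reader is consumed.
                values.put(path, xpath.evaluate(path, new InputSource(new StringReader(xml))));
            } catch (XPathExpressionException e) {
                throw new IllegalArgumentException("Bad path or XML: " + path, e);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        extract(SAMPLE, "/recipe/@name", "/recipe/baseOS/@family", "/recipe/baseOS/@os")
            .forEach((path, value) -> System.out.println(path + " = " + value));
    }
}
```

Only the paths that matter for the ontology need to be listed; the rest of the XML is ignored.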
Program which parses XML
Input: XML file
Input: .owl template file, which imports the ontology
  Modifiable
Parse the XML file, scanning the desired paths (not every one is useful).
Output: new ontology with xmlFilename
  Original ontology with new instances, perhaps subclasses and relationships.
1st: TransXMLtoOwlqWrks.groovy
Input
/Users/lpimmes/Documents/lpimmesOnly/UMassLowell/91.514/project/testrange20f.xml
/Users/lpimmes/Documents/lpimmesOnly/UMassLowell/91.514/project/AFRL_network/hwb_core_IMPORT_THEN_MODIFY.owl
Processing
TransXMLtoOwlqWrks.java translates .xml into .owl, in rdf/owl (not .ttl) format.
Update *.owl with new subclasses and instances (names formed by concatenating tokens with '-').
Write new .xml via Groovy's append
  RDF.appendNode{...}
Parse the .xml path via Groovy's slurper to find the token to translate
  RDF = new XmlSlurper().parse(owlOut)
Output
New *.owl is loadable into Protege; new entities appear.
/Users/lpimmes/Documents/lpimmesOnly/UMassLowell/91.514/project/AFRL_network/hwb_core_TstRng20f.owl
Parsing...
http://www.dcl.mathcs.emory.edu/hwb/ontologies/hwb_core.owl, numof: 20
nameInstStr: dc.enclave1.net
nameInstStr: web.enclave1.net
nameInstStr: mail.enclave1.net
nameInstStr: client1.enclave1.net
...
...
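The append step can be approximated in plain Java with the JDK's DOM API instead of Groovy's XmlSlurper. The sketch below is an assumption-laden stand-in, not the original program: the class OwlInstanceAppender, the minimal TEMPLATE, the class name "Host", and the use of owl:NamedIndividual are all hypothetical; only the ontology IRI and the instance names come from the log above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical stand-in for the append step: adds individuals to an RDF/XML document.
final class OwlInstanceAppender {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String OWL_NS = "http://www.w3.org/2002/07/owl#";

    // Minimal made-up template; the real template imports the ontology.
    static final String TEMPLATE =
        "<rdf:RDF xmlns:rdf=\"" + RDF_NS + "\" xmlns:owl=\"" + OWL_NS + "\">"
      + "<owl:Ontology rdf:about=\"http://www.dcl.mathcs.emory.edu/hwb/ontologies/hwb_core.owl\"/>"
      + "</rdf:RDF>";

    // Instance names are tokens concatenated with '-'.
    static String instanceName(String... tokens) {
        return String.join("-", tokens);
    }

    // Appends one owl:NamedIndividual of the given class per name; returns the new RDF/XML.
    static String appendIndividuals(String owlXml, String baseIri, String className, String... names) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(owlXml)));
            Element root = doc.getDocumentElement();
            for (String name : names) {
                Element individual = doc.createElementNS(OWL_NS, "owl:NamedIndividual");
                individual.setAttributeNS(RDF_NS, "rdf:about", baseIri + "#" + name);
                Element type = doc.createElementNS(RDF_NS, "rdf:type");
                type.setAttributeNS(RDF_NS, "rdf:resource", baseIri + "#" + className);
                individual.appendChild(type);
                root.appendChild(individual);
            }
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not update the ontology", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(appendIndividuals(TEMPLATE,
            "http://www.dcl.mathcs.emory.edu/hwb/ontologies/hwb_core.owl",
            "Host", "dc.enclave1.net", "web.enclave1.net"));
    }
}
```

Working on a namespace-aware DOM keeps the rdf: and owl: prefixes consistent, which is the part of rdf/owl generation that is easy to get wrong by string concatenation.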
Program Parsing a String
Input: string showing connectivities in a LAN
In the program itself, subclasses are given showing node types. Also: connectsTo
  Recall: a humanly defined relationship
Output: .owl file with classes, subclasses, relationships, and instances (of subclasses).
2nd: VisioCnvrt.groovy
Input
String: 'Ethernet.46-652 - PC.45-5, Ethernet.46-652 - Server.11-49, Ethernet.46-652 – PC.45-66...'
(An external XML program has yielded this string; no need to put it back into .xml.)
Fname: visio.owl ;; changeable, and still loads
Semantic hdr: 'www.semanticweb.org/ontologies/2011/4' ;; changeable, and still loads
Processing
VisioCnvrt.groovy outputs fname in .ttl format, loadable in Protege.
• The .ttl format is much easier to generate in code than rdf/owl.
Entire file generated: namespaces, subclasses, relationship(s), instances derived from relationships.
• Mary loves John, Sue loves Mike; People love People.
Output
/Users/lpimmes/Documents/workspace/ulowellGroovy/visio.owl
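The string-to-Turtle step could look roughly like the Java sketch below. It is not VisioCnvrt.groovy: the class ConnectivityToTurtle, the superclass :Node, the rule that a node's type is the text before the first '.', and the replacement of '.' by '_' in local names are all assumptions made for illustration. Only connectsTo and the shape of the input string come from the slides.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch: turns "A - B, A - C" connectivity pairs into Turtle.
final class ConnectivityToTurtle {
    // Turtle-safe local name: anything outside [A-Za-z0-9_-] becomes '_'.
    static String local(String name) {
        return name.trim().replaceAll("[^A-Za-z0-9_-]", "_");
    }

    // Assumed convention: node type is the text before the first '.', e.g. "PC" in "PC.45-5".
    static String typeOf(String name) {
        String n = name.trim();
        int dot = n.indexOf('.');
        return local(dot < 0 ? n : n.substring(0, dot));
    }

    static String toTurtle(String input, String namespace) {
        Set<String> types = new LinkedHashSet<>();
        Set<String> nodes = new LinkedHashSet<>();
        StringBuilder links = new StringBuilder();
        for (String pair : input.split(",")) {
            // Separator is a hyphen or en dash surrounded by whitespace;
            // hyphens inside names such as "46-652" are left alone.
            String[] ends = pair.trim().split("\\s+[-\u2013]\\s+");
            if (ends.length != 2) continue; // skip malformed pairs
            for (String end : ends) {
                nodes.add(end.trim());
                types.add(typeOf(end));
            }
            links.append(':').append(local(ends[0]))
                 .append(" :connectsTo :").append(local(ends[1])).append(" .\n");
        }
        StringBuilder ttl = new StringBuilder();
        ttl.append("@prefix : <").append(namespace).append("> .\n")
           .append("@prefix owl: <http://www.w3.org/2002/07/owl#> .\n")
           .append("@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n\n")
           .append(":Node a owl:Class .\n")
           .append(":connectsTo a owl:ObjectProperty ; rdfs:domain :Node ; rdfs:range :Node .\n");
        for (String type : types) {      // one subclass per node type
            ttl.append(':').append(type).append(" a owl:Class ; rdfs:subClassOf :Node .\n");
        }
        for (String node : nodes) {      // one instance per distinct node
            ttl.append(':').append(local(node)).append(" a :").append(typeOf(node)).append(" .\n");
        }
        return ttl.append(links).toString();
    }

    public static void main(String[] args) {
        System.out.print(toTurtle(
            "Ethernet.46-652 - PC.45-5, Ethernet.46-652 - Server.11-49",
            "http://www.semanticweb.org/ontologies/2011/4/visio.owl#"));
    }
}
```

Because every Turtle statement is a single line of text, the whole file can be produced with string appends, which is the sense in which .ttl is easier to generate than rdf/owl.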
File Formats; Parsing Issues
Using the appropriate file format makes parsing and generation of code easier.
Easier: ttl
  Used in 2nd program
  Protege save_as is buggy with multiple namespaces; OK for a single namespace
Much harder: rdf/owl
  Used in 1st program
  RdfSerializer.java needs to be tested for multiple namespaces.