Semantic Spaces
Professor Nigel Shadbolt
Director of AKT
School of Electronics and Computer Science
University of Southampton
Structure of the Talk
• Introduction
• Semantic Spaces: The vision and the reality
• Ingredients for Semantic Spaces
– Ontologies
– Heterogeneous Information Sources
– Navigation and visualisation of the space
– Knowledge Processing Services
– Socio-technical Challenges
• Interleave with two primary examples from AKT
– CS AKTive Space: http://triplestore.aktors.org/demo/AKTiveSpace/
– MIAKT: www.aktors.org/miakt
Advanced Knowledge Technologies IRC
AKT started Sept 2000, 6 years, £8.8M, EPSRC
www.aktors.org
Around 65 investigators and research staff
Dramatis Personae
Departments, with PIs and CIs
Computing Sciences, Aberdeen. PI: Derek Sleeman. CIs: Peter Gray, Alun Preece
Informatics, Edinburgh. PI: Austin Tate. CIs: Dave Robertson
KMI, OU. PI: Enrico Motta. CIs: Simon Buckingham-Shum, John Domingue
Computer Science. PI: Yorick Wilks. CIs: Fabio Ciravegna, Hamish Cunningham
ECS, Southampton. PI: Nigel Shadbolt. CIs: Wendy Hall, Leslie Carr, Dave De Roure, Hugh Glaser, Kieron O’Hara
Semantic Spaces: The Vision
Structured Spaces
• Linkage of heterogeneous information
– web content
– databases
– meta-data repository
– multimedia
• Via ontologies as information mediation structures
• Using Semantic Web languages
Oncogene(MYC): Found_In_Organism(Human). Gene_Has_Function(Transcriptional_Regulation). Gene_Has_Function(Gene_Transcription). In_Chromosomal_Location(8q24). Gene_Associated_With_Disease(Burkitts_Lymphoma).
NCI Cancer Ontology (OWL)
<meta> <classifications> <classification type="MYC" subtype="old_arx_id">bcr-2-1-059</classification> </classifications> </meta>
BioMedCentral Metadata (XML)
Web data set (XHTML)
Vocabulary (RDFS)
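The mediation idea on this slide can be sketched in a few lines. This is a minimal illustration, not the AKT implementation: it takes the NCI ontology fragment and the BioMedCentral metadata record shown above, turns both into subject/predicate/object triples, and joins them through the shared gene name. The `mentions_gene` predicate, the reading of `bcr-2-1-059` as an article identifier, and all function names are assumptions made for the example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Facts copied from the NCI Cancer Ontology fragment on the slide.
ontology_facts = [
    Triple("MYC", "Found_In_Organism", "Human"),
    Triple("MYC", "Gene_Has_Function", "Transcriptional_Regulation"),
    Triple("MYC", "Gene_Has_Function", "Gene_Transcription"),
    Triple("MYC", "In_Chromosomal_Location", "8q24"),
    Triple("MYC", "Gene_Associated_With_Disease", "Burkitts_Lymphoma"),
]

# The BioMedCentral metadata record from the slide, as a plain dict.
# Treating the element text as an article id is an assumption.
article_metadata = {"id": "bcr-2-1-059", "classification_type": "MYC"}


def metadata_to_triples(record: dict) -> list:
    """Map an article record onto the shared vocabulary.

    `mentions_gene` is a hypothetical predicate, not part of a real schema.
    """
    return [Triple(record["id"], "mentions_gene", record["classification_type"])]


def articles_about_disease(triples: list, disease: str) -> set:
    """Join article metadata to ontology facts through the shared gene name."""
    genes = {
        t.subject
        for t in triples
        if t.predicate == "Gene_Associated_With_Disease" and t.obj == disease
    }
    return {
        t.subject
        for t in triples
        if t.predicate == "mentions_gene" and t.obj in genes
    }


space = ontology_facts + metadata_to_triples(article_metadata)
result = articles_about_disease(space, "Burkitts_Lymphoma")
```

Neither source mentions the other; the link between the article and the disease only appears once both are expressed against a common vocabulary.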
Services on the Space
Hendler 2003, Science
So what have we got?
• A very particular KRL for the web
• Very low take up of structured meta-data beyond XML
• What RDF exists is largely FOAF
• A variety of demonstrators on the small to medium scale
• Few deployed examples
• A lot of Good Old Fashioned Artificial Intelligence (GOFAI) proposals in the wings
• But this could be our big chance…
Semantic Spaces: Ontologies
Perspectives on ontologies
Source: Concepts for Automating Systems Integration, E. Barkmeyer, A. Feeney, P. Denno, D. Flater, D. Libes, M. Steves, E. Wallace. NISTIR 6928, NIST, Feb. 2003
• The semantic view: An ontology is the context needed to understand a specification, model, or other communication in the way that was intended.
• The specification / reference view: "An ontology is an explicit specification of a conceptualization." and "Commitment to a common ontology is a guarantee of consistency [in terminology]." Simple taxonomies and thesauri are included in this definition as degenerate cases.
• The modeling view: An ontology is a metamodel.
• The automation view: An ontology is, or is captured in, a knowledge base designed to support automatic reasoning.
Ontologies offer….
• Communication
– Normative models
– Networks of relationships
– Consistent and unambiguous
– Integrate multiple perspectives
• Inter-operability and Integration: Sharing & Reuse
– Inter-lingua
– Specifications
– Reliability
• Control
– Controlled vocabularies
– Accurate data collection or retrieval
– Classification
– Finding, sharing, discovering, navigation, indexing
Medicine: The UMLS®
• Extensive Medical Nomenclature Project
• Integrative
– SnoMed
• Translation work into OWL
• Being widely adopted
• High level Governmental support
[Figure: the UMLS Semantic Network. Semantic types (for example Organism, Anatomical Structure, Finding, Biologic Function, Pathologic Function, Injury or Poisoning) are connected by isa links and by non-isa relations such as part of, conceptual part of, property of, evaluation of, location of, adjacent to, co-occurs with, disrupts, process of, and produces/contains.]
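The isa links in the semantic network support simple subsumption reasoning. The sketch below is illustrative only: it encodes a small slice of the Organism branch as read off the figure (the exact links should be checked against the UMLS release in use), and the function names are assumptions.

```python
# Child -> parent "isa" links for a small slice of the Organism branch.
ISA = {
    "Human": "Mammal",
    "Mammal": "Vertebrate",
    "Bird": "Vertebrate",
    "Vertebrate": "Animal",
    "Animal": "Organism",
    "Virus": "Organism",
}


def ancestors(semantic_type: str) -> list:
    """Return every supertype of a semantic type, nearest first."""
    chain = []
    while semantic_type in ISA:
        semantic_type = ISA[semantic_type]
        chain.append(semantic_type)
    return chain


def is_a(child: str, parent: str) -> bool:
    """True if `child` is `parent` or inherits from it through isa links."""
    return child == parent or parent in ancestors(child)
```

A query for records about any Animal can then be widened automatically to records tagged Human or Bird, which is the kind of retrieval benefit a shared type hierarchy offers.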
Genetics: Gene Ontology
• One of the earliest examples of the benefits of ontologies
• Integration and interoperability were big wins
• Specific tool support
• Considerable resources invested and continuing in maintenance
• Translation into DLs
• Spawned more generic biological ontology efforts
Manufacturing: Aerospace
• Considerable work on ontologies for products and components
• Used in all stages of the life cycle, from design to in-service maintenance
• Need for multiple perspectives, e.g.
– Whole engine
– Heat transfer
– Cost model
– Manufacturing
– Assembling/Maintenance
Military: Coalition Operations
• Some of the original motivation behind DAML work
• Lots of activity to build ontologies in a range of contexts
• Particularly important in coalition operations
• Central requirement for the concept of Network Enabled Capability
Computer Science: The AKT Ontology
• Designed as a learning case for AKT
• Adopted for our own Semantic Web experiments including CS AKTive Space
• Uses a number of Upper Ontology Fragments
• Reusable in many University and Research Contexts
MIAKT: Multi-disciplinary Assessment
• Multiple stakeholders
• Multiple viewpoints and vocabularies
– Breast imaging: X-ray, ultrasound, MRI
– Clinical examination
– Microscopy: cells and tissues (also, hormone receptors)
• Local dialects in use
• Variation between countries due to factors such as insurance claims!
Ontologies: Observations
• In any domain
– Usually highly implicit
– Poorly documented
– Likely to be ambiguous, vague, inconsistent
• When modelling
– Interaction Problem: tasks influence ontologies
– Integration Problem: integrating multiple ontologies
– Modularity Problem: how to modularise and what grain size?
• Maintenance
– Ongoing maintenance overhead
– Ontologies evolve and change
– Design rationale is important
• Upside
– They do facilitate interoperability
– They do enhance reuse
– They are becoming part of the infrastructure
The Crucial Role Standards Play
[Figure: standards stack, with the labels Unicode, URI, HTML, XML + Name Space + XML Schema, RDF, RDF(S), XOL, Topic Maps, SMIL, OWL]
Semantic Spaces: Heterogeneous Information Sources
What might Heterogeneous Information Sources mean?
• Provenance
– Could be legacy
– Not necessarily under direct control
– Variable validity
• Form
– More or less structured
– Different syntactic and semantic formats
– Multimedia
– Distributed in space or time
• Function
– Collected for different reasons
MIAKT DEMO
• Clinical examination
– Notes
• Imaging
– X-ray
– Ultrasound
– MRI
• Microscopy
– Histopathology
• Treatment
– Protocol Records
– Re-assessment
• Medical Records
– Case sets
– Individual patient records
• Published background
– Epidemiology
– Medical Abstracts
AKTive Spaces
• Content harvested and published from multiple Heterogeneous Sources
– Higher Education directories
– 2001 RAE submissions
– UK EPSRC project database (all grants awarded by EPSRC in the past decade)
• Detailed data on personnel, projects and publications harvested for:
– all AKT partners
– all 5 or 5* CS departments in the UK
– Automatic NL mining: Armadillo
• Additional resources
– All the world's countries (from ISO 3166-1)
– All UK administrative areas (from ISO 3166-2)
– All UK settlements listed in the UN LOCODE service
– All the world's airports (from the IATA)
– (and they're all integrated via the AKT reference ontology)
Semantic Spaces: Navigation and Visualisation
Aspects to navigating or visualising a semantic space
• Semantic Interfaces
– Ontology as a navigable structure
– Semantic encoding visualisations
• Scope
– Local to global
– Domain specific or generic
• Function
– Reader to author
– Individual to collaborative
• Navigation and visualisation via graphical characterisation of ontology
• Ontological relations are also the essential relations that are used to navigate the information space
• Natural Language Generation is used to provide a summary of content held in the image
W3photo and AKTive photo
• An AKTive Space for photo annotation
• The annotation is direct from the ontology
• The navigation is also based on the ontology
• Re-ordering columns (classes) exposes different parts of the information space
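One way to picture the column re-ordering is as a left-to-right filter: a selection in one column narrows every column to its right. The sketch below assumes that reading. It is not the AKTive photo code, and the photo records, column names, and `browse` function are hypothetical.

```python
# Hypothetical annotated photos; each key stands in for an ontology class.
PHOTOS = [
    {"event": "Conference", "place": "Budapest", "person": "Alice"},
    {"event": "Conference", "place": "Edinburgh", "person": "Bob"},
    {"event": "Workshop", "place": "Budapest", "person": "Bob"},
]


def browse(items: list, columns: list, selections: dict) -> dict:
    """Return the values visible in each column, left to right.

    A selection made in one column filters all columns to its right,
    so the same data looks different under a different column order.
    """
    visible = {}
    remaining = list(items)
    for column in columns:
        visible[column] = sorted({item[column] for item in remaining})
        if column in selections:
            chosen = selections[column]
            remaining = [item for item in remaining if item[column] == chosen]
    return visible


by_event = browse(PHOTOS, ["event", "place", "person"], {"event": "Workshop"})
by_person = browse(PHOTOS, ["person", "event", "place"], {"person": "Alice"})
```

Putting `event` first answers "where was the workshop and who was there", while putting `person` first answers "which events and places involve Alice", using the same underlying annotations.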
CS AKTive Space
• An AKTive Space for CS research
• The navigation is also based on the AKT ontology
• Re-ordering columns (classes) exposes different parts of the information space
• Complex RDQL dispatched behind the direct manipulation interface
• Strong geographical overlay
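The slides do not show the queries that the interface dispatches. As a rough illustration of what a triple-pattern query does, here is a plain Python sketch; it is neither RDQL syntax nor the 3store API, and the triples, identifiers, and function names are all hypothetical.

```python
# Hypothetical triples in the spirit of a CS research space.
TRIPLES = [
    ("person:a", "works-at", "org:x"),
    ("person:b", "works-at", "org:y"),
    ("person:a", "researches", "topic:semantic-web"),
    ("person:b", "researches", "topic:semantic-web"),
    ("org:x", "located-in", "Southampton"),
    ("org:y", "located-in", "Edinburgh"),
]


def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`, else return None.

    Terms starting with "?" are variables; everything else must match exactly.
    """
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def query(patterns, triples):
    """Return every variable binding that satisfies all patterns (a join)."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


# "Who researches the Semantic Web at an organisation in Southampton?"
answers = query(
    [
        ("?p", "researches", "topic:semantic-web"),
        ("?p", "works-at", "?o"),
        ("?o", "located-in", "Southampton"),
    ],
    TRIPLES,
)
```

A click on a topic and a region in the interface corresponds to a conjunction of patterns like this one, which the user never has to write by hand.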
Semantic Spaces: Knowledge Processing Services
What constitutes a semantic service?
• Have semantic characterisation
– What is the goal or task-achieving effect?
– What are its “knowledge level” preconditions or inputs
• Compositionality
– Grain size
– Internal and external aspects
• Discoverable or locatable
– Accessible
– Maintained
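The three properties above (semantic characterisation, compositionality, discoverability) can be sketched with a small registry. This is a minimal illustration under stated assumptions: the service names, goals, and concept names are hypothetical and do not describe the actual MIAKT or CS AKTive Space services, and the composition strategy is a simple greedy forward chaining.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceDescription:
    name: str
    goal: str              # the task-achieving effect
    inputs: frozenset      # knowledge-level preconditions (concept names)
    outputs: frozenset     # concepts the service produces


# Hypothetical registry entries.
REGISTRY = [
    ServiceDescription(
        "image-segmenter", "segment-image",
        frozenset({"Image"}), frozenset({"RegionOfInterest"}),
    ),
    ServiceDescription(
        "abnormality-classifier", "classify-abnormality",
        frozenset({"RegionOfInterest"}), frozenset({"Classification"}),
    ),
]


def discover(goal: str, available: set) -> list:
    """Find services that achieve `goal` and whose inputs are all available."""
    return [s.name for s in REGISTRY if s.goal == goal and s.inputs <= available]


def compose(available: set, wanted: str):
    """Greedy forward chaining: apply any service whose inputs are met.

    Returns the list of service names applied, or None if `wanted`
    cannot be produced. The plan may include services that are not
    strictly needed; a real planner would prune them.
    """
    known = set(available)
    plan = []
    progress = True
    while wanted not in known and progress:
        progress = False
        for service in REGISTRY:
            if service.name not in plan and service.inputs <= known:
                plan.append(service.name)
                known |= service.outputs
                progress = True
    return plan if wanted in known else None
```

For example, `compose({"Image"}, "Classification")` chains the two services, whereas `discover("classify-abnormality", {"Image"})` finds nothing because the classifier's precondition is not yet met.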
MIAKT: Overall Framework and Current Services
CS AKTive Space Services
• Triple Store (3store) and associated browser navigation
• RDQL interface to 3store
• Harvesting and scraping
• M-space visualisation
• Community of Practice
• Armadillo: publication harvesting
The CS AKTive Space: Semantic Web Challenge Winner 2003
• 24/7 update of content
• Content continually harvested and acquired against community agreed ontology
• Easy access to information gestalts - who, what, where
• Hot spots
– Institutions
– Individuals
– Topics
• Impact of research
– citation services etc
– funding levels
– Changes and deltas
• Dynamic Communities of Practice…
CS AKTive Space
DEMO
Semantic Spaces: Socio-Technical Context
Semantic Spaces: A Challenge?
“Is this rocket science? Well, not really … We are not inventing relational models for data, or query systems or rule-based systems. We are just webizing them. We are just allowing them to work together in a decentralized system - without a human having to custom handcraft every connection.”
Tim Berners-Lee, Business Case for the Semantic Web, http://www.w3.org/DesignIssues/Business
Technical Challenges in Semantic Space
• Annotation
• Content capture/harvesting
• Ontology mapping and alignment
• Referential Integrity
• Reasoning Services including incorporation of statistical and probabilistic methods
• Semantic service composition
• Provenance and Trust
• Multimedia content
• Semantic HCI
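Referential integrity, in the sense of knowing when identifiers from different sources denote the same entity, can be sketched with a union-find structure. This is an illustration only: it assumes pairwise "same as" statements have already been established by some other means, and the identifiers and class name are hypothetical.

```python
class CoreferenceIndex:
    """Groups identifiers that have been asserted to denote the same entity."""

    def __init__(self):
        self._parent = {}

    def find(self, uri: str) -> str:
        """Return the canonical identifier for `uri`."""
        self._parent.setdefault(uri, uri)
        root = uri
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every node on the path at the root.
        while self._parent[uri] != root:
            self._parent[uri], uri = root, self._parent[uri]
        return root

    def same_as(self, a: str, b: str) -> None:
        """Record that `a` and `b` refer to the same entity."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            # Keep the lexicographically smaller id as canonical,
            # so the result does not depend on assertion order.
            keep, drop = sorted((root_a, root_b))
            self._parent[drop] = keep

    def canonicalise(self, triples: list) -> list:
        """Rewrite subjects and objects of triples to canonical identifiers."""
        return [(self.find(s), p, self.find(o)) for s, p, o in triples]


index = CoreferenceIndex()
index.same_as("grants:person/123", "directory:staff/abc")
index.same_as("directory:staff/abc", "rae:author/77")

merged = index.canonicalise([
    ("grants:person/123", "holds-grant", "grant:1"),
    ("rae:author/77", "wrote", "paper:9"),
])
```

After canonicalisation the grant and the paper hang off a single identifier, so a query about one person no longer misses facts harvested from a different source.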
Social Challenges of Semantic Spaces
• Social Issues
– How do you get communities to participate?
– Mandate and require
– The need to share information e.g. e-Science
– Become social and viral e.g. early days of web and FOAF
• Regulatory
– Fidelity of content is on the high side but even so… provenance and quality services
– Data Protection and information assurance