ic achies in knowledge analysis and ion infoion sciences...

29
Semantic Hierarchies in Knowledge Analysis and Integration Cliff Joslyn Information Sciences Group DIMACS Workshop on Recent Advances in Mathematics and Information Sciences for Analysis and Understanding of Massive and Diverse Sources of Data May 2007

Upload: others

Post on 03-Jul-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

Semantic Hierarchies inKnowledge Analysis and Integration

Cli! Joslyn

InformationSciences Group

DIMACS Workshop on Recent Advances in Mathematics and

Information Sciences for Analysis and Understanding ofMassive and Diverse Sources of Data

May 2007

Page 2: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

OUTLINE

• The challenge of semantic information for knowledge systems

• Large computational ontologies

– Analysis

– Induction

– Interoperability

• Order theoretical approaches

– Ontology anlaysis

– Concept lattices: Formal Concept Analysis

Cli! Joslyn, [email protected], p. 1, 5/14/2007

Page 3: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

APPLICATION CHALLENGES

Decision Support: Military, intelligence, disaster response

Intelligence Analysis: Multi-Int integration: IMINT, HUMINT,SIGINT, MASINT, etc.

Biomedicine: Biothreat response

Defense Applications: Defense transformation, situational aware-ness, global ISR

Bibliometrics: Digital libraries, retrieval and recommendation

Simulation: Interaction with knowledge management/decision

support environments

Nonproliferation: “Ubiquitous sensing”, information fusion

Cli! Joslyn, [email protected], p. 2, 5/14/2007

Page 4: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

KNOWLEDGE SYSTEMS• Challenge for database integration at the knowledge level:

Connectivity: Wiring everything up, everything accessible

Interoperability: Knowing what you have and where it is• Complement quantitative statistical techniques with qualita-

tive methods:

– Knowledge representation, natural language processing

– Search, retrieval, inference

– Focus on the meaning (semantics) of information in databases:use, interpretation

• In conjunction with existing capabilities in data mining, ma-chine learning, sensor technology, simulation, etc.

– Knowledge-based and data-rich sciences: Biology, as-tronomy, earth science

– Knowledge-based technologies for national security:Decision support, intelligence analysis

– Knowledge-based technologies supporting the scien-tific process: Semantic web, digital libraries, publicationprocess, communities of networked scientists

Cli! Joslyn, [email protected], p. 3, 5/14/2007

Page 5: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

MULTI-MODAL DATA FUSION

• Qualitative di!erence:

Sensors:

– Physics sensors: nuclear, radiological, chemical

– Electromagnetic spectrum

– Acoustic, seismic

– Images, video

Information Sources:

– Geospatial

– Structured and semi-structured data

– Relational databases

– Text, documents

– Plans, scenarios

• How to bridge?

– Meta-data

– Feature extraction from signals, images

– Feature ontologies and interoperability protocols

Cli! Joslyn, [email protected], p. 4, 5/14/2007

Page 6: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

LANL KNOWLEDGE ANDINFORMATION SYSTEMS SCIENCE

http://www.c3.lanl.gov/knowledge

Semantic Hierarchies for Knowledge Systems• Representations of semantic and symbolic information• Approach from mathematical systems theory:

– Discrete math, combinatorics, information theory

– Metric geometry approach to order theory (lattices andposets)

• Hybrid methodologies combining statistical, numerical, andquantitative with symbolic, logical, and qualitative

• Ontologies and Conceptual Semantic Systems: Discretemathematical approaches

• Computational Linguistics and Lexical Semantics: Fornatural language processing and text extraction

• Database Analysis: User-guided knowledge discovery incomplex, multi-dimensional data spaces

• Software Architectures: Parallel and high performance al-gorithms

Cli! Joslyn, [email protected], p. 5, 5/14/2007

Page 7: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

PARADIGM: SEMANTIC NETWORKS

• Lattice-

labeled

directed

multi-graphs

• Increasing

size and

prominence

for

databases:

Intelligence

analysis, lawenforcement,computa-tional

biology

Organism

Human BirdViral

PathogenBacterialPathogen

Animal Microbe Pathogen

Move

ContagiousNon-Contagious

Transmission

Mosquito

Richard

TransmitWest Nile

Transmit

George

West NileTransmit

Transmit

FACT BASE =INSTANCE GRAPH

Concept Hierarchy Relation HierarchyONTOLOGY = TYPE GRAPH

Insect

Contact

VectoredDirect

BiteBite

Infect

• Challenges: Typed-link network theory; morphisms of typedgraphs; ontology analysis, induction, and interoperability.

Cli! Joslyn, [email protected], p. 6, 5/14/2007

Page 8: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

REASONING WITHIN ONTOLOGIES FORTHE SEMANTIC WEB

• Proposed ba-sis for Seman-

tic Web• Ontological

database:interacting

hierarchies ofobjects andrelations

Trip(Traveler:By)

Event(When:Time)

Arrive(To:Place) Depart(From:Place)

Action(By:Entity)

Relations Objects

Entity

Animal

Person(Name)

Traveler

President(Country)

President-of-the-USA: President(USA)

Place

Country

Vietnam USA

• Semantic relations valued on objects• Description-logic queriesWho was the last president before Clinton to visit Vietnam?

>>: (Name(By)) ( Trip?x ( To:Vietman, By:President-of-the-USA )

.and. lub(When(x)) ! 1992)

Cli! Joslyn, [email protected], p. 7, 5/14/2007

Page 9: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

BIO-ONTOLOGIES• Domain-specific concepts, together with how they’re related

semantically• Crushing need driven by the genomic revolution• At least:

– Large terminological collections (controlled vocabularies,lexicons)

– Organized in taxonomic, hierarchical relationships• Sometimes in addition: Methods for inference over these struc-

tures• Molecular, anatomy, clinical, epidemiological, etc.:

Gene Ontology: Molecular function, biological process, cel-lular location

Fundamental Model of Anatomy

Unified Medical Language System: National Library of Medicine,meta-thesaurus

Open Biology Ontologies

MEdical Subject Headings (MeSH)

Enzyme Structures Database: EC numbers

Cli! Joslyn, [email protected], p. 8, 5/14/2007

Page 10: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

GENE ONTOLOGY (GO):DNA METABOLISM PORTION

• Taxonomiccontrolledvocabulary

• " 20K nodes

populated bygenes, proteins

• Two orders!isa,!has

• Major communitye!ort: assumingprimary positionin generalbioinformatics

Gene Ontology Consortium (2000): “Gene Ontology: ToolFor the Unification of Biology”, Nature Genetics, 25:25-29

• Tremendous computational resource: large, semantically rich,validated, middle ontology, first (?) in major use

Cli! Joslyn, [email protected], p. 9, 5/14/2007

Page 11: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

CATEGORIZATION IN THEGENE ONTOLOLGYhttp://www.c3.lanl.gov/posoc

• Develop functional hypotheses about hundreds of genes iden-tified through expression experiments

• Given the Gene Ontology (GO) . . .

• And a list of hundreds of genes of interest . . .

• “Splatter” them over the GO . . .

• Where do they end up?– Concentrated?– Dispersed

– Clustered?– High or low?– Overlapping or distinct?

• POSet Ontology Categorize (POSOC)

C Joslyn, S Mniszewski, A Fulmer, and G Heaton: (2004) “The Gene Ontology Categorizer”,Bioinformatics, v. 20:s1, pp. 169-177

Cli! Joslyn, [email protected], p. 10, 5/14/2007

Page 12: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

WHOLE GO CA. 2001

Courtesy of Robert Kue!ner, NCGR, 2001

Cli! Joslyn, [email protected], p. 11, 5/14/2007

Page 13: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

GO PORTION, HIERARCHICAL EYECHART

Cli! Joslyn, [email protected], p. 12, 5/14/2007

Page 14: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

HIERARCHIES ASPARTIALLY ORDERED SETS

ChainAntichain

DirectedGraph

Lattice

Tree

Partial Order =Poset = DAG

• Partial Order: Set P ; relation !#P2: reflexive, anti-symmetric, tran-sitive

• Poset: P = $P,!%• Simplest mathematical structures

which admit to descriptions interms of “levels” and “hierarchies”

• More specific than graphs or net-works: no cycles, equivalent to Di-

rected Acyclic Graphs (DAGs)• More general than trees, lattices:

single nodes, pairs of nodes canhave multiple parents

• Ubiquitous in knowledge systems:constructed, induced, empirical

Cli! Joslyn, [email protected], p. 13, 5/14/2007

Page 15: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

BASIC POSET CONCEPTS

Poset: P = $P,!%Comparable Nodes: a " b := a ! b or b ! a

Up-Set: &a = {b ' a}, Down-Set: (a = {b ! a}Chain: Collection of comparable nodes: a1 ! a2 ! . . . ! an

Height: Size maximal chain H(P)

Noncomparable Nodes: a )" b

Antichain: Collection of noncompara-ble nodes: A # P, a )" b, a, b * A

Width: Size maximal antichain W(P)

Interval: [a, b] := {c * P : a ! c ! b}, abounded sub-poset of P

Join, Meet: a + b, a , b # P

Lattice: Then a + b, a , b * P

Bounded: Min 0 * P , Max 1 * P

B

F G

A

I

H

C

E J

D

1

K

0

. Schroder, BS (2003): Ordered Sets, Birkhauser, Boston

Cli! Joslyn, [email protected], p. 14, 5/14/2007

Page 16: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

SOME GO QUANTITATIVE MEASURES

Nodes Leaves Interior Edges H WMF 7.0K 5.6K 1.3K 8.1K 13 ' 3.5KBP 7.7K 4.1K 3.6K 11.8K 15 ' 2.9KCC 1.3K 0.9K 0.4K 1.7K 13 ' 0.4KGO 16.0K 10.6K 5.4K 21.5K 16 ' 5.9K

1

10

100

1000

10000

0 10 20 30 40 50 60 70 80 90 100

# N

odes

#

Branching (BP Branch)

ChildrenParents

0 2 4 6 8 10 12 14 16 2

4 6

8 10

12 14

16

1

10

100

# Children

Branching by Interval Rank (BP Branch)

Average # ChildrenAverage # Parents

Top Rank

Bottom Rank

# Children

Joslyn, Cli!; Mniszewski, SM; Verspoor, KM; and JD Cohn: (2005) “Improved Order The-oretical Techniques for GO Functional Annotation”, poster at 2005 Conf. on IntelligentSystems for Molecular Biology (ISMB 05)

C Joslyn, S Mniszewski, A Fulmer, and G Heaton: (2004) “The Gene Ontology Categorizer”,Bioinformatics, v. 20:s1, pp. 169-177

Cli! Joslyn, [email protected], p. 15, 5/14/2007

Page 17: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

CHAIN DECOMPOSITION OF INTERVALS

Comparable Nodes: e.g. D ! 1 * P

Chain Decomposition: Set of all chains connecting them:

C(D, 1) = {Cj}= {D - E - I - B - 1, D - E - I - C - 1,

D - E - K - 1, D - J - C - 1,

D - J - K - 1} # 2P

Chain Lengths: hj := |Cj|. 1Vectors of Chain Lengths:

!h(a, b) :=!hj

"M

j=1=

$4,4,3,3,3%Extremes:

h!(a, b) = minhj"!h(a,b)

hj = 3

h!(a, b) = maxhj"!h(a,b)

h=4

B

F G

A

I

H

C

E J

D

1

K

0

Cli! Joslyn, [email protected], p. 16, 5/14/2007

Page 18: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

INTERVAL RANK LAYOUT

• Interval valued vertical position(rank)

• Chain decomposition guideshorizontal: short maximalchains to outside

CA Joslyn, SM Mniszewski, SA Smith, and PMWeber: (2006) “SpindleViz: A Three Dimensional,Order Theoretical Visualization Environment forthe Gene Ontology”, Joint BioLINK and 9thBio-Ontologies Meeting (JBB 06)

Cli! Joslyn, [email protected], p. 17, 5/14/2007

Page 19: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

CATEGORIZATION METHOD

• POSO: POSet OntologyO := $P, X, F % ,P = $P,!%Labels: finite, non-empty set X

Labeling Function: F :X /0 2P

• Given labels (genes) c, e, i . . .• What node(s)

P = {A, B, C, . . . , K} are best to

pay attention to?

B

F G

A

I

H

C

E J

D

1

K

a,b,c

b,d

e

f

g,h,i

j

b

• Scores to rank-order nodes wrt/gene locations, balancing:

– Coverage: Covering as many genes as possible

– Specificity: But at the “lowest level” possible• “Cluster” based on non-comparable high score nodes

C Joslyn, S Mniszewski, A Fulmer, and G Heaton: (2004) “The Gene Ontology Categorizer”,Bioinformatics, v. 20:s1, pp. 169-177

Cli! Joslyn, [email protected], p. 18, 5/14/2007

Page 20: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

AUTOMATED ONTOLOGICAL PROTEINFUNCTION ANNOTATION

Functions = GO

Keywords/Literature

Structures

x

Sequences

GO Branch(BP,MF,CC)

x1 x2

y

AnnotationsF(x1) Annotations

F(x2)

POSOCG(y)

POSOCG(y)

UnknownProtein y

Near BLASTneighbord

POSOCG(x1)

GO:1

GO:2

GO:3

GO:4

BLAST Space

• Mappings among regions of biological spaces . . .• . . . into spaces of biological functions• POSOC annotated BLAST neighborhoods of new proteins• How to measure quality of inferred annotations?

Verspoor, KM; Cohn, JD; Mniszewski, SM; and Joslyn, CA: (2006) “Categorization Approachto Automated Ontological Function Annotation”, Protein Science, v. 15, pp. 1544-1549

Cli! Joslyn, [email protected], p. 19, 5/14/2007

Page 21: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

HIERARCHICAL EVALUATION METRICS

• Hierarchical measures:

Precision:

HP =1

|G(x)|#

b"G(x)

maxa"F(x)

| &a 1 & b|| & b|

Recall:

HR =1

|F(x)|#

a"F(x)

maxb"G(x)

| &a 1 & b|| &a|

F -Score:

HF =2(HP)(HR)

HP + HR

GO Branch(BP,MF,CC)

GO:1

GO:2

GO:3

GO:4

GO:5

GO:6 GO:7

• Example: F(x) = {GO:4}, G(x) = {GO:6}&a = {GO:1,GO:2,GO:4}, & b = {GO:1,GO:2,GO:3,GO:5,GO:6}HP = 2/5, HR = 3/5

S Kiritchenko, S Matwin, and AF Famili: (2005) “Functional Annotation of Genes UsingHierarchical Text Categorization”, Proc. BioLINK SIG on Text Data Mining

Verspoor, KM; Cohn, JD; Mniszewski, SM; and Joslyn, CA: (2006) “Categorization Approachto Automated Ontological Function Annotation”, Protein Science, v. 15, pp. 1544-1549

Cli! Joslyn, [email protected], p. 20, 5/14/2007

Page 22: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

SEMANTIC SIMILARITIESPoset P = $P,!%, probability distributionp:P /0 [0,1],

$a"P p(a) = 1, “cumulative” "(a) :=

$b#a p(a)

Resnik: #(a, b) = maxc"a$b [. log2("(c))]Lin:

#(a, b) =2maxc"a$b[log2("(c))]

log2("(a)) + log2("(b))Jiang and Conrath:

#(a, b) = 2 maxc"a$b

[log2("(c))] . log2("(a)) . log2("(b))

Issues:

• General mathematical grounding inposet metrics

• Not rely on probabilistic weighting

A Budanitsky and G Hirst: (2006) “Evaluating WordNet-based measures of semantic distance.” Computational Lin-guistics, 32(1), 13–47.

Lord, PW; Stevens, Robert; Brass, A; and Goble, C: (2003)“Investigating Semantic Similarity Measures Across the GeneOntology: the Relationship Between Sequence and Annota-tion”, Bioinformatics, v. 10, pp. 1275-1283

B 0.70.0

F 0.00.0

.7

G 0.00.0

A 0.00.0

.7 I 0.70.5

H 0.00.0

0.0

.7

C 0.90.0

E: 0.20.0

J: 0.40.2

D: 0.20.2

.5

0 .2

.2

.5

1 1.00.0

.3 .1

K 0.50.1

.5

.1.3

0 0.00.0

Node: betap

Cli! Joslyn, [email protected], p. 21, 5/14/2007

Page 23: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

POSET METRICSAssume $P,!% finite, connected, bounded

aub := &a 1 & b, alb := ( a 1 ( b

Isotone Map: v:P /0 IR, a ! b 0 v(a) ! v(b)v+(a, b) := min

w"aubv(w)

(aub)v := {w * P : v(w) = v+(a, b)}Upper Valuation: 2z * alb,

v(a) + v(b) ' v+(a, b) + v(z)

Distance: v is an upper valuation i!d(a, b) := 2v+(a, b) . v(a) . v(b)

is a distance (triangle inequality)

a b

aub

albz

(aub)v

Upper Valuation Lower Valuationz " alb z " aub

Isotone v(a) + v(b) % v+(a, b) + v(z) v(a) + v(b) # v!(a, b) + v(z)d(a, b) = 2v+(a, b) & v(a) & v(b) d(a, b) = v(a) + v(b) & 2v!(a, b)

Antitone v(a) + v(b) # v+(a, b) + v(z) v(a) + v(b) % v!(a, b) + v(z)d(a, b) = v(a) + v(b) & 2v+(a, b) d(a, b) = 2v!(a, b) & v(a) & v(b)

Monjardet, B: (1981) “Metrics on Partially Ordered Sets - A Survey”, Discrete Mathematics,v. 35, pp. 173-184

Cli! Joslyn, [email protected], p. 22, 5/14/2007

Page 24: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

SOME LATTICE METRICS

Information Theoretical: Monotoneupper valuation

• Let v(a) = "(a), “cumulative”probability

• Proposition: Jiang and Conrath isa metric, others are not

• d(a, b) = 2"(a + b) . "(a) . "(b)• d(I, J) = 1.53, d(E, J) = 1.64

Purely Structural: Antitone uppervaluation• | & a 1 & b| = | &(a + b)|,

| ( a 1 ( b| = | &(a , b)|• Let v(a) = | & a|• d(a, b) = | & a| + | & b|. 2| & a 1 & b|• d(I, J) = 4, d(E, J) = 6

B 0.70.0

F 0.00.0

.7

G 0.00.0

A 0.00.0

.7 I 0.70.5

H 0.00.0

0.0

.7

C 0.90.0

E: 0.20.0

J: 0.40.2

D: 0.20.2

.5

0 .2

.2

.5

1 1.00.0

.3 .1

K 0.50.1

.5

.1.3

0 0.00.0

Node: betap

Cli! Joslyn, [email protected], p. 23, 5/14/2007

Page 25: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

INTEROPERABILITY AND ALIGNMENT

Matching: Measure similarity between

two regions of a single ontologyComparing: Twist one ontology on a

given term set into another orderingMerging: Given two completely

distinct ontologies:• Identify structurally similar

regions: intersection• Create encompassing

meta-ontologies: product orunion?

C

E J

D

1

K

g,h,i

j

b G F

I

1

A

g,h

j

b

i

GO EC

Joslyn, Cli!: (2004) “Poset Ontologies and Concept Lattices as Semantic Hierarchies”, in:Conceptual Structures at Work, Lecture Notes in Artificial Intelligence, v. 3127, ed. Wol!,Pfei!er and Delugach, pp. 287-302, Springer-Verlag, Berlin

Cli! Joslyn, [email protected], p. 24, 5/14/2007

Page 26: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

ALIGNMENT METHODSUltimate Goal: Construct order morhpismsNeighborhoods: Around given anchorsLexical: MatchesStructural: Nodes playing similar structural rolesCombinatoric: Sets of nodes playing similar structural rolesPoset Metrics: Measure candidate alignment, suggest new an-

chors

Horse

Palamino Arabian

Has-astragalusRideable

Cow Pony

Shetland

Thick-mane Ungulate

Bison

Has-mane

Highland

Given

Animal Mammal

Lexical

Structural

Combinatoric

Cli! Joslyn, [email protected], p. 25, 5/14/2007

Page 27: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

FORMAL CONCEPT ANALYSIS

• Semantichierarchies fromrelational data

• Unbiased,graphical, visualrepresentation

• Hypothesis and

rule generationand evaluation

• For ontologyinduction,interoperability

Ganter, Bernhard and Wille, Rudolf: (1999) Formal Concept Analysis, Springer-Verlag

Cli! Joslyn, [email protected], p. 26, 5/14/2007

Page 28: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

FCA ONTOLOGY MERGER, INDUCTION

• {g1, g2, g3}: annotated into an ontology O:C

A Bg1 g2

g3 (g1 g2)

• {g2, g3, g4}: annotated to keywords K = {k1, k2, k3}• Induce order on K while incoporating order on O• Amenable to metric treatment of attributes, objects

a b c

g1' '

g2' '

g3'

k1 k2 k3

g2' '

g3'

g4' '

a b c k1 k2 k3

g1' '

g2' ' ' '

g3' '

g4' '

Gessler, DDG, CA Joslyn, KM Verspoor: (2007) “Knowledge Integration in Open Worlds:Exploiting the Mathematics of Hierarchical Structure”, in preparation for ICSC 07

Cli! Joslyn, [email protected], p. 27, 5/14/2007

Page 29: ic achies in Knowledge Analysis and ion Infoion Sciences oupc4i.gmu.edu/OIC09/presentations/OIC2009_3talk_Joslyn.pdf · chine learning, sensor technology, simulation, etc. – od

ACKNOWLEDGEMENTS,COLLABORATIONS, AND

OTHER ASSORTED NAME-DROPPING

LANL Info. Sciences:

• Susan Mniszewski• Chris Orum• Karin Verspoor

• Michael WallLANL Elsewhere:

• Judith Cohn• Bill Bruno

• Steve SmithU. West Indies:

• Jonathan Farley

PNNL: Joe OliveiraU. Newcastle: Phillip Lord

NCGR: Damian GesslerTechnische Universitat Dresden:

• Stephan Schmidt• Tim Kaiser• Bjoern Koester

New Mexico State U.:

• Alex PogelP&G: Andy FulmerStanford Medical Informatics:

• Natasha Noy

Cli! Joslyn, [email protected], p. 28, 5/14/2007