contextual graph attention for answering logical queries...

20
I C M E C C G A A L Q I K G K-CAP 2019, N. 2019 Gengchen Mai 1 Krzysztof Janowicz 1 Bo Yan 1 Rui Zhu 1 Ling Cai 1 Ni Lao 2 1 STKO Lab, University of California Santa Barbara 2 SayMosaic Inc., Palo Alto, CA, USA S C G M 1 , K J 1 , B Y 1 , R Z 1 , L C 1 , N L 2

Upload: others

Post on 17-Oct-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Contextual Graph Attention for Answering Logical Queries ...gengchen_mai/presentations/2019-K-CAP2019… · SPARQL? Support and Centrality Gengchen Mai1, Krzysztof Janowicz1, Bo Yan1,

Introduction Contribution Method Experiment Conclusion

Contextual Graph Attention for Answering

Logical Queries over Incomplete Knowledge

Graphs

K-CAP 2019, Nov. 2019

Gengchen Mai1 Krzysztof Janowicz1 Bo Yan1Rui Zhu1 Ling Cai1 Ni Lao2

1 STKO Lab, University of California Santa Barbara2SayMosaic Inc., Palo Alto, CA, USA

Support and Centrality Gengchen Mai1

, Krzysztof Janowicz1

, Bo Yan1

, Rui Zhu1

, Ling Cai1

, Ni Lao2

Page 2: Contextual Graph Attention for Answering Logical Queries ...gengchen_mai/presentations/2019-K-CAP2019… · SPARQL? Support and Centrality Gengchen Mai1, Krzysztof Janowicz1, Bo Yan1,

Introduction Contribution Method Experiment Conclusion

Introduction

Knowledge Graph (KG): a data repository that describes entitiesand their relationships across domains according to some schemaProblem: Incompleteness, Sparsity, and Noise

Figure From https://medium.com/@sderymail/challenges-of-knowledge-graph-part-1-d9ffe9e35214

Support and Centrality Gengchen Mai1

, Krzysztof Janowicz1

, Bo Yan1

, Rui Zhu1

, Ling Cai1

, Ni Lao2

Page 3: Contextual Graph Attention for Answering Logical Queries ...gengchen_mai/presentations/2019-K-CAP2019… · SPARQL? Support and Centrality Gengchen Mai1, Krzysztof Janowicz1, Bo Yan1,

Introduction Contribution Method Experiment Conclusion

Introduction

Knowledge Graph Embedding for Knowledge Graph CompletionThe major KG Embedding models can be classified as twocategories (Wang et al. 2017):

Translation-based models (e.g. TransE, TransH, and TransR)

Semantic matching models (e.g. RESCAL, DisMult, and HolE)

Support and Centrality Gengchen Mai1

, Krzysztof Janowicz1

, Bo Yan1

, Rui Zhu1

, Ling Cai1

, Ni Lao2

Page 4: Contextual Graph Attention for Answering Logical Queries ...gengchen_mai/presentations/2019-K-CAP2019… · SPARQL? Support and Centrality Gengchen Mai1, Krzysztof Janowicz1, Bo Yan1,

Introduction Contribution Method Experiment Conclusion

Introduction

Training a KG embedding model over a knowledge graph (KG)G � (V ,R)

Task: link prediction and entity classificationThe model complexity is linear with respect to |V|Dealing with more complex tasks?

Support and Centrality Gengchen Mai1

, Krzysztof Janowicz1

, Bo Yan1

, Rui Zhu1

, Ling Cai1

, Ni Lao2

Page 5: Contextual Graph Attention for Answering Logical Queries ...gengchen_mai/presentations/2019-K-CAP2019… · SPARQL? Support and Centrality Gengchen Mai1, Krzysztof Janowicz1, Bo Yan1,

Introduction Contribution Method Experiment Conclusion

Conjunctive Graph Query (CGQ)

Using KG Embeddings for Conjunctive Graph Query (CGQ)A query q ∈ Q(G) that can be written as follows:

q � V?.∃V1 ,V2 , ..,Vm : b1 ∧ b2 ∧ ... ∧ bn

where bi � r (ek ,Vl ),Vl ∈ {V? ,V1 ,V2 , ..,Vm}, ek ∈ V , r ∈ Ror bi � r (Vk ,Vl ),Vk ,Vl ∈ {V? ,V1 ,V2 , ..,Vm}, k , l , r ∈ R

Requirements:Require one variable as the answer denotation: Target NodeNo variable in the predicate positionOnly consider the conjunction of graph patternsThe dependence graph of q must be a directed acyclic graph (DAG)

Support and Centrality Gengchen Mai1

, Krzysztof Janowicz1

, Bo Yan1

, Rui Zhu1

, Ling Cai1

, Ni Lao2

Page 6: Contextual Graph Attention for Answering Logical Queries ...gengchen_mai/presentations/2019-K-CAP2019… · SPARQL? Support and Centrality Gengchen Mai1, Krzysztof Janowicz1, Bo Yan1,

Introduction Contribution Method Experiment Conclusion

Conjunctive Graph Query (CGQ)

SPARQL?

Support and Centrality Gengchen Mai1

, Krzysztof Janowicz1

, Bo Yan1

, Rui Zhu1

, Ling Cai1

, Ni Lao2

Page 7: Contextual Graph Attention for Answering Logical Queries ...gengchen_mai/presentations/2019-K-CAP2019… · SPARQL? Support and Centrality Gengchen Mai1, Krzysztof Janowicz1, Bo Yan1,

Introduction Contribution Method Experiment Conclusion

Conjunctive Graph Query (CGQ)

KGEmbedding

Support and Centrality Gengchen Mai1

, Krzysztof Janowicz1

, Bo Yan1

, Rui Zhu1

, Ling Cai1

, Ni Lao2

Page 8: Contextual Graph Attention for Answering Logical Queries ...gengchen_mai/presentations/2019-K-CAP2019… · SPARQL? Support and Centrality Gengchen Mai1, Krzysztof Janowicz1, Bo Yan1,

Introduction Contribution Method Experiment Conclusion

Conjunctive Graph Query (CGQ)

Using KG Embedding to predict the answer to a CGQ:Projection Operation: Translate from the corresponding entitynodes via different relation embeddings through different paths (tripleT1 and T2).Intersection Operation: Integrate predicted embeddings for thesame variable (?Person) from different paths (triple T1 and T2).Recursively use these two operators until we get the embedding forthe target variable q.Nearest neighbor search for answer entities with q by cosinesimilarity.

UCLA

EscapeClause

?Person ?Disease

Guest

AlmaMaster -1 DeathCause

T1

T2

T3

Support and Centrality Gengchen Mai1

, Krzysztof Janowicz1

, Bo Yan1

, Rui Zhu1

, Ling Cai1

, Ni Lao2

Page 9: Contextual Graph Attention for Answering Logical Queries ...gengchen_mai/presentations/2019-K-CAP2019… · SPARQL? Support and Centrality Gengchen Mai1, Krzysztof Janowicz1, Bo Yan1,

Introduction Contribution Method Experiment Conclusion

Related Work

Wang et al. (2018): Pretrain KG embeddings and use it for CGQlacks flexibility: deterministic weighting approach for path embeddingintegrationNo end-to-end: does not directly optimized on the QA objective

Hamilton et al. (2018): An end-to-end model for logic queryanswering with an elementwise-mean intersection operator whichtreats query path equally

Fail to consider unequal contribution from different pathsDo not consider the original KG structure

Attention?

UCLA

EscapeClause

?Person ?Disease

Guest

AlmaMaster -1 DeathCause

T1

T2

T3

Support and Centrality Gengchen Mai1

, Krzysztof Janowicz1

, Bo Yan1

, Rui Zhu1

, Ling Cai1

, Ni Lao2

Page 10: Contextual Graph Attention for Answering Logical Queries ...gengchen_mai/presentations/2019-K-CAP2019… · SPARQL? Support and Centrality Gengchen Mai1, Krzysztof Janowicz1, Bo Yan1,

Introduction Contribution Method Experiment Conclusion

Attention Mechanism

Consider unequal contribution from different triple paths:Problem for Attention mechanism: The center nodeembedding/query embedding is a prerequisite for attention scorecomputing which is unknown in this case

UCLA

EscapeClause

?Person

Guest

AlmaMaster -1

T1

T2

450

5

e1:AlmaMaster-1(UCLA,?Person)

e2:Guest(EscapeClause,?Person)

e:?PersonIntersectionOperator

?

Support and Centrality Gengchen Mai1

, Krzysztof Janowicz1

, Bo Yan1

, Rui Zhu1

, Ling Cai1

, Ni Lao2

Page 11: Contextual Graph Attention for Answering Logical Queries ...gengchen_mai/presentations/2019-K-CAP2019… · SPARQL? Support and Centrality Gengchen Mai1, Krzysztof Janowicz1, Bo Yan1,

Introduction Contribution Method Experiment Conclusion

Entity Embedding

Entity embedding lookup:

ei �Zγxi

‖ Zγxi ‖L2(1)

Zγ ∈ Rd×mγ is the type-specific embedding matrices for all entitieswith type γ � Γ(ei ).

Support and Centrality Gengchen Mai1

, Krzysztof Janowicz1

, Bo Yan1

, Rui Zhu1

, Ling Cai1

, Ni Lao2

Page 12: Contextual Graph Attention for Answering Logical Queries ...gengchen_mai/presentations/2019-K-CAP2019… · SPARQL? Support and Centrality Gengchen Mai1, Krzysztof Janowicz1, Bo Yan1,

Introduction Contribution Method Experiment Conclusion

Projection Operation

e′i � P(ei , r) � Rrei (2)Rr ∈ Rd×d is a trainable and relation-specific matrix for relation type r.

EscapeClause ?Person

Guest

ei Rr ei'Support and Centrality Gengchen Mai

1, Krzysztof Janowicz

1, Bo Yan

1, Rui Zhu

1, Ling Cai

1, Ni Lao

2

Page 13: Contextual Graph Attention for Answering Logical Queries ...gengchen_mai/presentations/2019-K-CAP2019… · SPARQL? Support and Centrality Gengchen Mai1, Krzysztof Janowicz1, Bo Yan1,

Introduction Contribution Method Experiment Conclusion

Intersection Operation

e′′ � ICGA({e′1 , e′

2 , ..., e′

i , ..., e′

n}) (3)

UCLA

EscapeClause

?Person

Guest

AlmaMaster -1

T1

T2

450

5

e1:AlmaMaster-1(UCLA,?Person)

e2:Guest(EscapeClause,?Person)

e:?PersonIntersectionOperator

?

Support and Centrality Gengchen Mai1

, Krzysztof Janowicz1

, Bo Yan1

, Rui Zhu1

, Ling Cai1

, Ni Lao2

Page 14: Contextual Graph Attention for Answering Logical Queries ...gengchen_mai/presentations/2019-K-CAP2019… · SPARQL? Support and Centrality Gengchen Mai1, Krzysztof Janowicz1, Bo Yan1,

Introduction Contribution Method Experiment Conclusion

Intersection Operation

Multi-head attention inspired by Transformer (Vaswani et al. 2017).An initial intersection embedding layer (red) is used so that centervariable embedding is no longer a prerequisite.

NeighborEmbedding e1

Projected NeighborEmbedding e'1

NeighborEmbedding e2

Projected NeighborEmbedding e'2

NeighborEmbedding en

Projected NeighborEmbedding e'n

. . .Initial IntersectionEmbedding e''attn

Multi-HeadAttention

Min/Mean

Add & Norm

Feed ForwardNeural Network

Add & Norm

Support and Centrality Gengchen Mai1

, Krzysztof Janowicz1

, Bo Yan1

, Rui Zhu1

, Ling Cai1

, Ni Lao2

Page 15: Contextual Graph Attention for Answering Logical Queries ...gengchen_mai/presentations/2019-K-CAP2019… · SPARQL? Support and Centrality Gengchen Mai1, Krzysztof Janowicz1, Bo Yan1,

Introduction Contribution Method Experiment Conclusion

Model Training

Original KG Training Phase:

LKG �

∑ei∈V

∑e−i ∈Neg(ei )

max(0,∆ −Φ(HKG(ei ), ei ) +Φ(HKG(ei ), e−i ))

(4)

Φ: cosine similarity function.HKG(ei ) indicates a new embedding e′′i for entity ei given its 1-degreeneighborhood N(ei ).e−i ∈ Neg(ei ) is a negative sample.

Logical Query-Answer Pair Training Phase:

LQA �

∑(qi ,ai )∈S

∑a−i ∈Neg(qi ,ai )

max(0,∆ −Φ(qi , ai ) +Φ(qi , a−i )) (5)

qi : query embedding.ai , a−i : the embedding for the correct answer entity & negative answers.

The whole loss function:

L � LKG + LQA (6)Support and Centrality Gengchen Mai

1, Krzysztof Janowicz

1, Bo Yan

1, Rui Zhu

1, Ling Cai

1, Ni Lao

2

Page 16: Contextual Graph Attention for Answering Logical Queries ...gengchen_mai/presentations/2019-K-CAP2019… · SPARQL? Support and Centrality Gengchen Mai1, Krzysztof Janowicz1, Bo Yan1,

Introduction Contribution Method Experiment Conclusion

Datasets

The original Bio dataset (Hamilton et al. 2018)We constructed two datasets from publicly available DBpedia andWikidata: DB18, WikiGeo19Two metrics: ROC AUC score and average percentile rank (APR)

Support and Centrality Gengchen Mai1

, Krzysztof Janowicz1

, Bo Yan1

, Rui Zhu1

, Ling Cai1

, Ni Lao2

Page 17: Contextual Graph Attention for Answering Logical Queries ...gengchen_mai/presentations/2019-K-CAP2019… · SPARQL? Support and Centrality Gengchen Mai1, Krzysztof Janowicz1, Bo Yan1,

Introduction Contribution Method Experiment Conclusion

Evaluation Results

Adding the original KG training phase in the model training processimproves the model performance.Adding the attention mechanism further improves the modelperformance.CGA has less learnable parameters with better performance.CGA shows strong advantages over baseline models especially onquery types with hard negative sampling.

Support and Centrality Gengchen Mai1

, Krzysztof Janowicz1

, Bo Yan1

, Rui Zhu1

, Ling Cai1

, Ni Lao2

Page 18: Contextual Graph Attention for Answering Logical Queries ...gengchen_mai/presentations/2019-K-CAP2019… · SPARQL? Support and Centrality Gengchen Mai1, Krzysztof Janowicz1, Bo Yan1,

Introduction Contribution Method Experiment Conclusion

Evaluation Results

CGA outperforms the baseline models in almost all query types.

APR for WikiGeo19

Support and Centrality Gengchen Mai1

, Krzysztof Janowicz1

, Bo Yan1

, Rui Zhu1

, Ling Cai1

, Ni Lao2

Page 19: Contextual Graph Attention for Answering Logical Queries ...gengchen_mai/presentations/2019-K-CAP2019… · SPARQL? Support and Centrality Gengchen Mai1, Krzysztof Janowicz1, Bo Yan1,

Introduction Contribution Method Experiment Conclusion

Conclusion

We propose an end-to-end attention-based logical query answeringmodel, contextual graph attention model (CGA).The multi-head attention mechanism is utilized in the intersectionoperator to automatically learn different weights for different querypaths.Our models outperform the baseline models on three dataset (Bio,DB18, and WikiGeo19) despite using less parameters.

Support and Centrality Gengchen Mai1

, Krzysztof Janowicz1

, Bo Yan1

, Rui Zhu1

, Ling Cai1

, Ni Lao2

Page 20: Contextual Graph Attention for Answering Logical Queries ...gengchen_mai/presentations/2019-K-CAP2019… · SPARQL? Support and Centrality Gengchen Mai1, Krzysztof Janowicz1, Bo Yan1,

Introduction Contribution Method Experiment Conclusion

Future Work

Explore ways to use our model in an inductive learning setupConsider disjunction, negation, and filters in query answeringConsider variables in the predicate position

Support and Centrality Gengchen Mai1

, Krzysztof Janowicz1

, Bo Yan1

, Rui Zhu1

, Ling Cai1

, Ni Lao2