Source: hainanumeeting.net/yssnlp2019/file/1.pdf

Knowledge-driven Semantic Understanding

ZHAO Xin, Renmin University of China

BatmanFly@qq.com

Distributional semantics • Target word = "stars"

Distributional semantics • Collect the contextual words for "stars"

Distributional semantics • Distributional word representation • Distributional hypothesis: words that are used and occur in the same contexts tend to have similar meanings

• Implementations with distributed word embedding models • Word2vec • GloVe

Distributional semantics
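As a toy illustration of the distributional hypothesis, a word can be represented by the counts of its context words and compared to other words by cosine similarity. The corpus and the window size below are invented for illustration; real systems use large corpora and learned embeddings such as Word2vec or GloVe.

```python
# Count-based distributional vectors: each word is represented by the
# words that occur within a +/-2 window around it.
from collections import Counter, defaultdict
import math

corpus = [
    "the bright stars shine in the night sky".split(),
    "the bright sun shines in the clear sky".split(),
    "movie stars walk the red carpet".split(),
]

WINDOW = 2
vectors = defaultdict(Counter)
for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - WINDOW), min(len(sent), i + WINDOW + 1)):
            if j != i:
                vectors[w][sent[j]] += 1

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv)

# "stars" and "sun" share contexts ("the", "bright", "in"), so they score
# higher than the unrelated pair "stars" / "carpet".
print(cosine(vectors["stars"], vectors["sun"]))
print(cosine(vectors["stars"], vectors["carpet"]))
```

Note that this toy vector for "stars" mixes its astronomical and celebrity senses; that ambiguity is exactly what the contextualized models below address.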

A huge success of deep contextualized models • ELMo

• Char-level word encoding + 2 BiLSTM layers • Pretrained language models that can be fine-tuned for specific tasks

• Transformer • Pairwise interaction, called self-attention (with positional embeddings) • Multi-head mechanism • Deep architecture (six layers in the original encoder and decoder)

• BERT • Built on top of the Transformer • Bidirectional context • Masked LM + next-sentence prediction • Very deep

• BERT-Base: 12 layers, 768 hidden units, 12 heads • BERT-Large: 24 layers, 1024 hidden units, 16 heads
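The pairwise self-attention at the heart of these models can be sketched in a few lines. This is a single head without masking, layer normalization, or multi-head splitting; the shapes and random weights are purely illustrative.

```python
# Minimal scaled dot-product self-attention: every position attends to
# every other position, so context is captured in one step rather than
# sequentially as in a BiLSTM.
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d_model). Returns (output, attention weights)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])           # pairwise interactions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V, weights                       # mix values by weight

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                           # 4 tokens, d_model = 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out, attn = self_attention(X, Wq, Wk, Wv)
print(out.shape, attn.shape)  # (4, 8) (4, 4)
```

A multi-head layer runs several such attentions with smaller projections in parallel and concatenates the results; stacking these layers gives the deep architectures above.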

Basic motivations

• Deeper architecture and larger context scope • Pretrained language models that can be fine-tuned

Connecting the dots • Deep architecture

• NLP researchers have long wanted to use networks as deep as those in computer vision

• Contextualized models • Distributional semantics

• Word-by-word attention • Reasoning about entailment with neural attention

• Self-match • R-Net: Machine Reading Comprehension with Self-Matching Networks

The context scope can be even larger • Document-level information • Document Context Neural Machine Translation with Memory Networks

The context scope can be even larger • Document-level information • Improving the Transformer Translation Model with Document-Level Context

Structural knowledge • Triplets • Form: (h, r, t) • Examples:

• <YAO Ming, birthPlace, Shanghai> • <YAO Ming, gender, male>

• Embedding methods • E.g., TransE
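A minimal sketch of the TransE intuition: a relation acts as a translation in embedding space, so for a true triplet (h, r, t) we want h + r ≈ t. The 2-d embeddings below are hand-picked to make the example work; real TransE learns them from a knowledge base with a margin-based ranking loss.

```python
# TransE scores a triplet by the distance ||h + r - t||: lower is better.
import numpy as np

emb = {
    "YAO Ming":   np.array([1.0, 0.0]),
    "Shanghai":   np.array([1.0, 1.0]),
    "Beijing":    np.array([0.0, 1.0]),
    "birthPlace": np.array([0.0, 1.0]),   # relation = translation vector
}

def score(h, r, t):
    """Distance of the translated head from the tail; 0 for a perfect fit."""
    return np.linalg.norm(emb[h] + emb[r] - emb[t])

# The true triplet scores better (lower) than a corrupted one.
print(score("YAO Ming", "birthPlace", "Shanghai"))  # 0.0
print(score("YAO Ming", "birthPlace", "Beijing"))   # 1.0
```

Training pushes true triplets toward zero distance while keeping corrupted triplets (random head or tail swaps) at least a margin away.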

Structural knowledge • Knowledge base (or KB-like) information • Knowledgeable Reader: Enhancing Cloze-Style Reading Comprehension with External Commonsense Knowledge

Structural knowledge • Linguistic information • Type-Aware Question Answering over Knowledge Base with Attention-Based Tree-Structured Neural Networks

Structural knowledge • Logic rules • Harnessing Deep Neural Networks with Logic Rules

First-order logic rules on two tasks: sentiment classification and NER
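One way to read the rule-harnessing idea, sketched loosely rather than in the paper's exact formulation: a logic rule reshapes the network's softmax into a "teacher" distribution that downweights rule-violating labels, and the network is then distilled toward that teacher. The sentence, the rule encoding, and the constant C below are invented for illustration.

```python
# Project a prediction with a soft logic-rule constraint: classes that
# violate the rule are exponentially downweighted, then renormalized.
import math

def rule_teacher(p, satisfies, C=2.0):
    """p: class probabilities; satisfies[y] in [0, 1] says how well
    class y satisfies the rule. Returns the rule-constrained teacher."""
    q = [pi * math.exp(-C * (1.0 - s)) for pi, s in zip(p, satisfies)]
    z = sum(q)
    return [qi / z for qi in q]

# "The acting was fine, but the plot was dull."
# Rule: in "A but B", sentence sentiment should follow clause B.
# Clause B is negative, so class 0 (negative) satisfies the rule
# and class 1 (positive) violates it.
p = [0.4, 0.6]                               # network alone leans positive
q = rule_teacher(p, satisfies=[1.0, 0.0])
print(q)                                     # teacher now prefers negative
```

The student network is trained against a mix of the true labels and this teacher, so the rule's knowledge is absorbed into the weights rather than applied only at test time.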

Structural knowledge • Demographic attributes • Mining Product Adopter Information from Online Reviews for Improving Product Recommendation

• Based on an analysis of 13.9 million JD reviews, about 10.8% of reviews contain at least one adopter mention • We can even infer information about the buyer

• Marital status • Age range
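A toy version of the adopter-mention extraction step. The patterns and English example reviews here are invented stand-ins; the paper mines Chinese JD reviews with far richer patterns, then infers attributes such as marital status and age range from the extracted mentions.

```python
# Extract adopter mentions ("bought this for my son") from review text
# with a simple pattern; each mention hints at who will use the product.
import re

ADOPTER_PATTERN = re.compile(
    r"\b(?:bought|got|ordered) (?:this|it) for my (\w+)", re.I
)

reviews = [
    "Bought this for my son, he loves it.",
    "Great phone, battery lasts two days.",
    "I ordered it for my wife as a birthday gift.",
]

adopters = [m.group(1) for r in reviews if (m := ADOPTER_PATTERN.search(r))]
print(adopters)  # ['son', 'wife']
```

A mention like "wife" implies the buyer is married; "son" bounds the buyer's likely age range, which is how demographic attributes fall out of review text.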

Structural knowledge • Demographic attributes • Adversarial Removal of Demographic Attributes from Text Data

Knowledge utilization • Point #1: • Enriching information for NLP tasks • The widely used procedure

• Knowledge retrieval → Knowledge contextualization → Knowledge utilization

Knowledge-powered conversational agents


Commonsense Knowledge Aware Conversation Generation with Graph Attention

Knowledge utilization • Point #1: • Enriching information for NLP tasks • Challenging problems:

• How to identify suitable knowledge resources to use • How to learn knowledge representations that are useful for a specific task
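The three-step procedure above can be sketched end-to-end with a toy knowledge base. The entity matching, the verbalization template, and the [SEP]-style concatenation are all illustrative assumptions; real systems fuse retrieved knowledge with attention or memory rather than plain string concatenation.

```python
# Retrieval -> contextualization -> utilization, in miniature.
KB = [
    ("YAO Ming", "birthPlace", "Shanghai"),
    ("YAO Ming", "gender", "male"),
    ("Shanghai", "locatedIn", "China"),
]

def retrieve(text):
    """Knowledge retrieval: triplets whose head entity appears in the text."""
    return [t for t in KB if t[0].lower() in text.lower()]

def contextualize(triplets):
    """Knowledge contextualization: verbalize triplets for the model."""
    return " ".join(f"{h} {r} {t}." for h, r, t in triplets)

def utilize(text):
    """Knowledge utilization: here, plain concatenation as model input."""
    facts = contextualize(retrieve(text))
    return f"{facts} [SEP] {text}" if facts else text

print(utilize("Where was Yao Ming born?"))
```

Each stage maps to one of the challenges above: retrieval depends on choosing the right resource, and contextualization depends on learning task-useful representations of what was retrieved.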

Knowledge utilization • Point #2: • Making models more explainable

• e-SNLI: Natural Language Inference with Natural Language Explanations
• Improving Sequential Recommendation with Knowledge-Enhanced Memory Networks

• Challenging problems • How to define explainability • How to balance explainability and effectiveness

Knowledge utilization • Point #3: • Knowledge can guide the model design

• Sentence Encoding with Tree-constrained Relation Networks
• Taxonomy-Aware Multi-Hop Reasoning Networks for Sequential Recommendation

• Challenging problems: • Given some kind of knowledge, what is the suitable model to integrate it? • Given an existing model, how can it be adapted to fully utilize knowledge information?

Conclusion
