
Explore semantic similarity from distributional and other sources, e.g., Wikipedia, Wiktionary, WordNet, FrameNet, BabelNet, and LSA for different domains.

Brown cluster ID

First letter of the POS

Partial Tree Kernels (PTK)

Smoothed PTK or Semantic Kernel (SK)

Soft matching of tree fragments via similarity

Node similarity derived using LSA (see the sketch below)
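As an illustration of how such an LSA-derived node similarity can be obtained, here is a minimal sketch assuming an in-domain corpus of review texts; the TF-IDF plus truncated-SVD pipeline and the function names are illustrative choices, not necessarily the exact setup used for the poster.

# Sketch: an LSA-based word-similarity function for the smoothed PTK (SK).
# Assumptions: scikit-learn is available and `corpus` is a list of in-domain
# documents (e.g., restaurant reviews).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

def build_lsa_similarity(corpus, k=100):
    vectorizer = TfidfVectorizer()
    X = vectorizer.fit_transform(corpus)          # documents x terms
    svd = TruncatedSVD(n_components=k).fit(X)
    W = svd.components_.T                         # terms x k latent dimensions
    W = W / (np.linalg.norm(W, axis=1, keepdims=True) + 1e-12)
    vocab = vectorizer.vocabulary_

    def similarity(w1, w2):
        # Cosine similarity in the latent space; exact match only for OOV words.
        if w1 not in vocab or w2 not in vocab:
            return 1.0 if w1 == w2 else 0.0
        return float(W[vocab[w1]] @ W[vocab[w2]])
    return similarity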

Text/Speech “cheap italian restaurants in doha with take out”

Semantically segmented text: (cheap)<Price> (italian)<Cuisine> restaurants in (doha)<City> with (take out)<Amenity>

Semantic representation:
{"Price": "low", "Amenity": "take out", "City": "doha", "Cuisine": "italian"}

DB query:
{"$and": [{"cuisine": {"$regex": "italian"}},
          {"city": {"$regex": "doha"}},
          {"price": {"$regex": "low"}},
          {"amenity": {"$regex": "carry out"}}]}
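For illustration, a minimal sketch of the query-builder step, assuming the semantic representation is a Python dict; the field names and the value-normalization map (e.g., "take out" mapped to "carry out") are illustrative assumptions.

# Sketch of the query-builder step: semantic representation -> MongoDB-style filter.
# The normalization map and DB field names below are illustrative assumptions.
def build_db_query(semantics):
    normalize = {"take out": "carry out"}          # surface value -> DB value
    field_of = {"Price": "price", "Cuisine": "cuisine",
                "City": "city", "Amenity": "amenity"}
    clauses = [{field_of[label]: {"$regex": normalize.get(value, value)}}
               for label, value in semantics.items()]
    return {"$and": clauses}

print(build_db_query({"Price": "low", "Amenity": "take out",
                      "City": "doha", "Cuisine": "italian"}))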

How do we convert a spoken request like “cheap italian restaurants in doha with take out” into a database query?

Semantic Kernels for Semantic Parsing
Iman Saleh (Cairo University); Alessandro Moschitti, Preslav Nakov, Lluís Màrquez, Shafiq Joty (Qatar Computing Research Institute)

1. Motivation

3. Our Approach: Reranking with Semantic Kernels

4. Experiments

Processing steps: semantic parser → rule normalizer → query builder

Semi-CRF generates n-best hypotheses (H_i)

Pairwise reranking function trained with SVMs with the preference reranking kernel

Hypotheses are represented with tree-based structures (tree kernels are applied)

Focus on kernels that use semantics: LSA-based smoothing of lexical similarity, and Brown clusters
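A sketch of the pairwise preference setup: training pairs are built from the n-best list, and the preference kernel combines tree-kernel evaluations over the four cross-pair combinations, following the standard preference reranking formulation. The helper names, and tree_kernel standing in for PTK/SK, are assumptions.

# Sketch: preference reranking kernel over hypothesis pairs, assuming
# tree_kernel(t1, t2) computes the PTK/SK similarity between two hypothesis trees.
def preference_kernel(pair_a, pair_b, tree_kernel):
    """pair_a = (better, worse) hypothesis trees, likewise pair_b."""
    a1, a2 = pair_a
    b1, b2 = pair_b
    return (tree_kernel(a1, b1) + tree_kernel(a2, b2)
            - tree_kernel(a1, b2) - tree_kernel(a2, b1))

def make_preference_pairs(hypotheses, quality):
    """Build training pairs from an n-best list: the hypothesis with the higher
    quality (e.g., F1 against the gold annotation) comes first in a positive
    pair; swapping its members yields the corresponding negative pair."""
    pairs = []
    for i, h1 in enumerate(hypotheses):
        for h2 in hypotheses[i + 1:]:
            if quality(h1) == quality(h2):
                continue
            better, worse = (h1, h2) if quality(h1) > quality(h2) else (h2, h1)
            pairs.append(((better, worse), +1))
            pairs.append(((worse, better), -1))
    return pairs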

Dataset: human annotations (McGraw et al., 2012)

Oracle: shows large room for improvement

Brown clusters: induced from Yelp restaurant reviews

Lexical similarity for SK: LSA on TripAdvisor reviews
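A minimal sketch of loading such a Brown cluster resource as word-to-cluster-ID features, assuming the common three-column paths format (bit-string, word, count) produced by standard Brown clustering tools; both the file format and the prefix length are assumptions.

# Sketch: load Brown cluster IDs from a "paths" file.
def load_brown_clusters(path, prefix_len=6):
    clusters = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip("\n").split("\t")
            if len(parts) >= 2:
                bits, word = parts[0], parts[1]
                clusters[word] = bits[:prefix_len]   # truncated cluster ID
    return clusters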

Main results

The LSA-smoothed semantic kernel (SK) improves significantly over the semi-CRF (“baseline”) and over our previous state-of-the-art reranker, which uses shallow syntactic patterns and PTK (Saleh et al., COLING-2014; equivalent to PTK+all)

~10% relative improvement

Brown clusters (BCs) do not significantly improve any model

PTK+all is better than PTK, but its accuracy is lower than that of any SK variant

+all helps SK only for small training-set sizes

Note: the state of the art on the task is very hard to beat

5. Conclusions

6. Future Work

Oracle F1 at different n-best sizes N:

N          1       2       5       10      100
Oracle F1  83.03   87.76   92.63   95.23   98.72
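For clarity, a sketch of how such an oracle upper bound can be computed, assuming f1(hypothesis, gold) returns segment-level F1; scores are macro-averaged over sentences here as a simplification, whereas the numbers above may be computed at the corpus level.

# Sketch: oracle F1 at n-best size n (macro-averaged over sentences).
def oracle_f1(nbest_lists, golds, n, f1):
    best = [max(f1(h, gold) for h in hyps[:n])
            for hyps, gold in zip(nbest_lists, golds)]
    return sum(best) / len(best)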

SK Structures

SK + Brown Clusters (SK+BC)

Semantic Structures for CSL (Concept Segmentation and Labeling) Reranking

Tree Kernels capture structural similarities between trees

Brown cluster IDs in the structures capture semantic dependencies between labels
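A sketch of one possible shallow tree representation of a hypothesis, with concept labels as internal nodes and words (optionally with Brown cluster IDs, as in SK+BC) as leaves; the exact node layout used on the poster may differ.

# Sketch: build a shallow bracketed tree for the reranker from a segmented
# and labeled hypothesis, optionally attaching Brown cluster IDs (SK+BC).
def hypothesis_tree(segments, brown=None):
    """segments: list of (label, words) pairs, e.g. ("Cuisine", ["italian"])."""
    concept_nodes = []
    for label, words in segments:
        word_nodes = []
        for w in words:
            children = [w]
            if brown and w in brown:
                children.append("BC" + brown[w])        # Brown cluster ID
            word_nodes.append("(W " + " ".join(children) + ")")
        concept_nodes.append("(" + label + " " + " ".join(word_nodes) + ")")
    return "(ROOT " + " ".join(concept_nodes) + ")"

print(hypothesis_tree([("Price", ["cheap"]), ("Cuisine", ["italian"]),
                       ("O", ["restaurants", "in"]), ("City", ["doha"]),
                       ("O", ["with"]), ("Amenity", ["take", "out"])]))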

Dataset sizes:

Train    Test     Train Reranker    Test Reranker
6,922    1,521    28,482            7,605

Acknowledgements: This research was developed by the Arabic Language Technologies (ALT) group at the Qatar Computing Research Institute (QCRI), within the Qatar Foundation, in collaboration with MIT. It is part of the Interactive sYstems for Answer Search (Iyas) project.

Extra features in a flat vector (+all)

From the semi-CRF: probability of label sequences, label-sequence n-grams, DB-based features, etc.
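A sketch of how such a flat feature vector could be assembled per hypothesis; the feature names and the db_match helper are illustrative assumptions.

# Sketch: the flat "+all" feature vector attached to each hypothesis.
# db_match(label, text) stands in for a DB-based lookup and is an assumption.
def flat_features(hypothesis, db_match):
    """hypothesis: dict with 'segments' = [(label, words), ...] and
    'prob' = semi-CRF probability of the label sequence."""
    feats = {"semicrf_prob": hypothesis["prob"]}
    labels = [label for label, _ in hypothesis["segments"]]
    # Label-sequence n-grams (here: bigrams).
    for a, b in zip(labels, labels[1:]):
        feats["bigram=%s_%s" % (a, b)] = 1.0
    # DB-based features: does the segment text occur in the DB column
    # associated with its label?
    for label, words in hypothesis["segments"]:
        feats["db_match_%s" % label] = 1.0 if db_match(label, " ".join(words)) else 0.0
    return feats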


2. State-of-the-art Semantic Parser

Joint sequential segmentation/classification

Discriminative probabilistic sequential model

Undirected graphical model

Semi-Markov CRFs (Sarawagi & Cohen, 2004)
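To make the segment-level nature of the model concrete, a sketch of Viterbi-style decoding over segmentations in the spirit of a semi-Markov CRF; the score function, the maximum segment length, and the omission of n-best extraction are simplifying assumptions.

# Sketch: segment-level Viterbi decoding in the spirit of a semi-Markov CRF.
# Assumptions: score(words, i, j, label, prev_label) returns the log-score of
# labeling the span words[i:j] as `label` after a segment labeled `prev_label`;
# L is the maximum segment length.
def semicrf_decode(words, labels, score, L=4):
    n = len(words)
    best = {(0, "<s>"): (0.0, [])}   # (end position, last label) -> (score, segments)
    for j in range(1, n + 1):
        for i in range(max(0, j - L), j):
            for prev in [lab for (pos, lab) in best if pos == i]:
                s0, segs = best[(i, prev)]
                for y in labels:
                    s = s0 + score(words, i, j, y, prev)
                    if (j, y) not in best or s > best[(j, y)][0]:
                        best[(j, y)] = (s, segs + [(y, words[i:j])])
    # Best-scoring complete segmentation of the whole sentence.
    return max((v for (pos, _), v in best.items() if pos == n),
               key=lambda t: t[0])[1]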

Figure: the input sentence is fed to the semi-CRF semantic parser, which generates n-best hypotheses H1, H2, ..., Hn; these are combined into pairs <Hi, Hj> and passed to the pairwise reranker.

LSA-smoothed SKs make it possible to measure the similarity between tree fragments (based on LSA word similarity), thus generating powerful semantic patterns.