
Proceedings of the 2nd Workshop on Representation Learning for NLP, pages 146–156, Vancouver, Canada, August 3, 2017. ©2017 Association for Computational Linguistics

Prediction of Frame-to-Frame Relations in the FrameNet Hierarchy with Frame Embeddings

Teresa Botschen§†, Hatem Mousselly-Sergieh†, Iryna Gurevych§†
§Research Training Group AIPHES
†Ubiquitous Knowledge Processing (UKP) Lab
Department of Computer Science, Technische Universität Darmstadt

    www.aiphes.tu-darmstadt.de, www.ukp.tu-darmstadt.de

    Abstract

Automatic completion of frame-to-frame (F2F) relations in the FrameNet (FN) hierarchy has received little attention, although these relations incorporate meta-level commonsense knowledge and are used in downstream approaches. We address the problem of sparsely annotated F2F relations. First, we examine whether the manually defined F2F relations emerge from text by learning text-based frame embeddings. Our analysis reveals insights about the difficulty of reconstructing F2F relations purely from text. Second, we present different systems for predicting F2F relations; our best-performing one uses the FN hierarchy to train on and to ground embeddings in. A comparison of systems and embeddings exposes the crucial influence of knowledge-based embeddings on a system's performance in predicting F2F relations.

    1 Introduction

FrameNet (FN) (Baker et al., 1998) is a lexical-semantic resource manually built by FN experts. It embodies the theory of frame semantics (Fillmore, 1976): the frames capture units of meaning corresponding to prototypical situations. Besides FN's definitions of frame-specific roles and frame-evoking elements that are used for the task of Semantic Role Labeling, it also contains manual annotations for relations that connect pairs of frames. There are thirteen frame-to-frame (F2F) relations, of which ten form five antonym pairs (e.g., Precedes, Is Preceded by). To give an example, the frame "Waking up" is in relation Precedes to the frame "Being awake". Figure 1 further elaborates this example by demonstrating relationships to additional frames. Table 1 lists all F2F relation names with the number of frame pairs for each relation according to the FN hierarchy, and also restricted counts including only frame pairs that have lexical units (LU) in the FN hierarchy (e.g., the frame "Waking up" can be evoked by the LU "awake.v" of the verb "awake"). The FN hierarchy, a report version of FN, does not provide lexical units for 125 frames (e.g., the frame "Sleep wake cycle" has no LU). In fact, such frames are used as meta-frames for abstraction purposes; thus, they exist only to participate in F2F relations with other frames (Ruppenhofer et al., 2006). In general, each frame pair is connected via only one F2F relation, with occasional exceptions, and the F2F relations situate the frames in semantic space (Ruppenhofer et al., 2006). F2F relations are used in the context of other tasks, such as text understanding (Fillmore and Baker, 2001), paraphrase rule generation for the system LexPar (Coyne and Rambow, 2009) and recognition of textual entailment (Aharon et al., 2010). Furthermore, F2F relations can be used as a form of commonsense knowledge (Rastogi and Van Durme, 2014).

The incompleteness of the FN hierarchy is a known issue not only at the frame level (Rastogi and Van Durme, 2014; Pavlick et al., 2015; Hartmann and Gurevych, 2013) but also at the F2F relation level.

Figure 1: F2F relations example. Gray: FN hierarchy. White arrows: missing in FN hierarchy.


F2F Relation Name       Total    Restricted
Inherits from             617       383
Is inherited by           617       383
Uses                      491       430
Is Used by                490       430
Subframe of               119        29
Has Subframes             117        29
Perspective on             99        15
Is Perspectivized in       99        15
Precedes                   79        48
Is Preceded by             79        48
Is Causative of            48        47
See also                   41        40
Is Inchoative of           16        16
Sum                     2,912     1,913

Table 1: F2F relation pair counts and restricted pair counts of frames with lexical units.

Figure 1 exemplifies a missing precedence relation: "Fall asleep" is preceded by "Being awake", but in between yet another frame could be added, e.g., "Biological urge" (evoked by the predicate "tired"). Rastogi and Van Durme (2014) note a lack of research on automatically enriching the F2F relations, which would be beneficial given the large number of possible frame pairs for a relation and their use in other tasks. The automatic annotation of F2F relations involves three difficulties that stem from the nature of FN. First, F2F relation annotations occur sparsely, and for the majority of relations there are few annotated pairs (see Table 1). Second, the relations themselves have no direct lexical correspondences in text, and hence inferring them from text is not trivial. Third, if a relation involves a frame that does not have any lexical unit (see restricted counts in Table 1), this frame does not occur in text, and hence inferring this relation from text is even more difficult.

To the best of our knowledge there is no work (a) addressing the emergence of F2F relations from text data or (b) enriching F2F relations automatically. We aim to address these problems with the following contributions:

Contributions of the paper
1) We learn text-based frame embeddings and explore their limitations with respect to F2F relations to check whether the manually annotated F2F relations can naturally emerge from text. We find that, concerning the methods we explored, the embeddings have difficulties in showing structures directly corresponding to F2F relations.
2) We transfer the relation prediction task from research on Knowledge Graph Completion to the case of FN and present our best-performing system for predicting F2F relations. The system involves training on the FN hierarchy and uses embeddings also trained on the FN hierarchy.
3) We demonstrate the predictions of our best-performing system for unannotated frame pairs and suggest its application for automatic FN completion on the relation level.

Structure of the paper To start with, Section 2 reviews related research including algorithms and approaches that we will apply to our purposes. Next, Section 3 briefly presents the FN data that we will work with. Then, the paper is structured along our contributions: exploration of F2F relations in frame embedding space (Sec. 4) and the F2F Relation Prediction task (Sec. 5), including a demonstration of predictions for unannotated frame pairs (Sec. 5.3). Finally, Section 6 discusses our insights and traces options for future work.

    2 Related Work

    2.1 Frame Embeddings

Frame embeddings for frame identification To our knowledge, the only approach learning frame embeddings is a matrix factorization approach in the context of the task of Frame Identification (FrameId), which is the first step in FN Semantic Role Labeling. The state-of-the-art system (Hermann et al., 2014) for FrameId projects frames and predicates with their context words into the same latent space by using the WSABIE algorithm (Weston et al., 2011). Two projection matrices (one for frames and one for predicates) are learned using WARP loss and gradient-based updates such that the distance between the predicate's latent representation and that of the correct frame is minimized. Consequently, latent representations of frames will end up close to each other if they are evoked by similar predicates and context words. As the focus of such systems is on the FrameId task, the latent representations of the frames are rather a sub-step contributing to FrameId but are not studied further or applied to other tasks. We will extract these frame embeddings and explore them with respect to F2F relations.

Word2Vec embeddings The Neural Network (NN) architecture of the Word2Vec algorithm (Mikolov et al., 2013a) learns word embeddings by either predicting a target word given its context words (CBOW model) or by predicting context words given their target word (skip-gram model).


There are different tasks specifically designed for the evaluation of word embeddings (Mikolov et al., 2013b). They are formulated as analogy questions about syntax or semantics of the form "a is to b as c is to __". Mikolov et al. (2013b) suggest a vector offset method based on cosine distance to solve these analogy tasks. This assumes that relationships are expressed by vector offsets: given two word pairs (a, b) and (c, d), the question is to what extent the relations within the pairs are similar. We will apply this method to frame pairs that are connected via F2F relations in order to find out whether the frame embeddings incorporate F2F relations.
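To make the offset method concrete, the following minimal Python sketch compares the offsets of two embedding pairs by cosine similarity (the function names and the `emb` dictionary are our own illustration, not part of the cited work):

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two vectors.
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def offset(emb, a, b):
    # Vector offset from a to b, assumed to encode their relation.
    return emb[b] - emb[a]

def relation_similarity(emb, pair1, pair2):
    # emb: dict mapping frame (or word) names to numpy vectors.
    # Two pairs stand in a similar relation if their offsets point
    # in a similar direction, i.e. their cosine similarity is high.
    return cosine(offset(emb, *pair1), offset(emb, *pair2))
```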

There is an interest in abstracting away from word embeddings towards embeddings for more coarse-grained units: Word2Vec is used to learn embeddings for senses (Iacobacci et al., 2015) or for supersenses (Flekova and Gurevych, 2016). Iacobacci et al. (2015) use the CBOW model on texts annotated with BabelNet senses (Navigli and Ponzetto, 2012). Flekova and Gurevych (2016) use the skip-gram model on texts with mapped WordNet supersenses (Miller, 1990; Fellbaum, 1990). For evaluation, both works orient themselves towards the analogy tasks of Mikolov et al. (2013b) and perform qualitative analyses of the top k most similar embeddings for (super)senses or visualize the embeddings in vector space. To have text-based frame embeddings in line with related work, we will also use the Word2Vec algorithm to learn an additional version of frame embeddings.

    2.2 Relation Prediction

This task stems from automatic Knowledge Graph Completion (KGC) and is known as "Link Prediction" (Bordes et al., 2011, 2012, 2013). We will transfer this task to F2F Relation Prediction for frame pairs. For this task, knowledge-based embeddings are well suited; they are not learned on text but on the triples of a KG.

TransE embeddings We leverage an embedding learning approach from KGC to obtain embeddings for frames and for F2F relations that are grounded in the FN hierarchy. In translation models, all entities and relations of the triples of head-entity, relation and tail-entity (h, r, t) are projected into one latent vector space such that the relation vector connects the head vector to the tail vector as a translating vector operation. TransE (Bordes et al., 2013) introduced the idea of modeling relations as translations that operate on the embeddings of the entities. The model is formulated to minimize |h + r − t| over a training set, starting from randomly initialized embeddings. The function to minimize resembles the idea of the vector offset by Mikolov et al. (2013b).
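As an illustration of the translation principle, here is a simplified Python sketch of the TransE score and its margin-based training loss (a reduced illustration only; we actually use the implementation of Lin et al. (2015), whose details differ):

```python
import numpy as np

def transe_score(h, r, t):
    # TransE models a valid triple (h, r, t) such that h + r ≈ t;
    # a lower distance means a more plausible triple.
    return np.linalg.norm(h + r - t)

def margin_loss(positive, negative, margin=1.0):
    # positive/negative are (h, r, t) embedding triples; the loss
    # pushes the score of a true triple below that of a corrupted one.
    return max(0.0, margin + transe_score(*positive) - transe_score(*negative))
```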

Answer selection model Link Prediction is methodologically related to the key task of Answer Selection from Question Answering (QA). The task is to rank a set of possible answer candidates with respect to a given question (Tan et al., 2015). State-of-the-art QA models are presented by Feng et al. (2015) and by Tan et al. (2015). They jointly learn vector representations for both the questions and the answers. Representations of the same dimensionality in the same space allow one to compute the cosine similarity between these vectors. We will orient ourselves by NN models for Answer Selection in order to adapt their ideas to F2F Relation Prediction. In our case, a question corresponds to a frame pair and an answer corresponds to an F2F relation. Optionally, pretrained frame embeddings can be used as initialization.

    3 Data

Textual data In order to learn frame embeddings on textual data with WSABIE or Word2Vec, we take the FrameNet 1.5 sentences provided by the Dependency-Parsed FrameNet Corpus (Bauer et al., 2012), which contains more than 170,000 sentences annotated manually with frame labels for 700 frames. We denote a frame as f, where f ∈ Ft, the set of frames in the textual data.

Hierarchy data The FN hierarchy lists for each of the overall 1,019 frames the F2F relations to other frames. We denote with G the collection of triples (f1, r, f2) (standing for "frame f1 is in relation r to frame f2"), where f1, f2 ∈ Fh, the set of frames in the FN hierarchy, and r ∈ R, the set of F2F relations. As listed in Table 2, there are 2,912 triples in the FN hierarchy, with 1,913 triples remaining if considering only those where both frames have lexical units, and 1,447 triples remaining if considering only those where both frames occur in the textual data.

Corpus                        Frames    F2F Relations
FN Hierarchy                   1,019        2,912
FN Hierarchy restricted          894        1,913
to frames with LU
Textual data                     700        1,447
(FN 1.5 sentences)

Table 2: Counts for frames and F2F relations.

We split the obtained triples whose frames have lexical units into a training and a test set such that the training set contains the first 70% of all the triples for each relation.

Table 2 summarizes the frame counts per data source together with the counts of F2F relations where both frames occur in the underlying source.
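A minimal sketch of this per-relation split in Python (here `triples` is assumed to be a list of (f1, r, f2) tuples restricted to frames with lexical units; the paper specifies only that the first 70% per relation go to training):

```python
from collections import defaultdict

def split_by_relation(triples, train_ratio=0.7):
    # Group the triples by their F2F relation, then take the first
    # 70% of each relation's triples for training, the rest for test.
    by_relation = defaultdict(list)
    for f1, r, f2 in triples:
        by_relation[r].append((f1, r, f2))
    train, test = [], []
    for instances in by_relation.values():
        cut = int(len(instances) * train_ratio)
        train.extend(instances[:cut])
        test.extend(instances[cut:])
    return train, test
```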

    4 Exploration of Frame Embeddings

We aim at empirically analyzing whether F2F relations from the FN hierarchy are mirrored in frame embeddings learned on frame-labeled text in the context of other tasks. Thus, we want to identify whether a statistical analysis of text-based frame embeddings naturally yields the FN hierarchy. Indeed, the F2F relations are manually annotated by expert linguists, but there is no guarantee that F2F relations can be observed in text. If these relations could emerge from raw text, it would be reassuring for the definitions of the F2F relations that led to annotations of frame pairs, and furthermore the annotations could be generated automatically. We hypothesize that distances and directions between frame embeddings learned on textual data can correspond to F2F relations. Figure 2 exemplifies this in the way known from the word embeddings of Mikolov et al. (2013a): it highlights two frame pairs that are in the same relation: "Attempt" is in relation Precedes with "Success or failure", and so is "Existence" with "Ceasing to be"; the connecting vectors have roughly the same direction and length.

Figure 2: Intuition for frame embeddings incorporating F2F relations in vector space.

4.1 Methods

WSABIE frame embeddings Concerning the matrix factorization approach for learning text-based frame embeddings, we use the publicly available code provided by Hartmann et al. (2017). It follows Hermann et al.'s (2014) description of their state-of-the-art system and achieves comparable results on FrameId. Our hyperparameter choices follow Hartmann et al. (2017): embedding dimension: 100, maximum number of negative samples: 100, epochs: 1000, and initial representation of predicate and context: concatenation of pretrained dependency-based word embeddings (Levy and Goldberg, 2014).

Word2Vec frame embeddings Concerning the NN approach for learning text-based frame embeddings, we use the Word2Vec implementation in the Python library gensim (Řehůřek and Sojka, 2010). To obtain frame embeddings, we follow the same steps as if we were learning word embeddings on FN sentences, except that we replace all predicates with their frames. For instance, in the sequence "Officials claim that Iran has produced bombs", the predicates "claim" and "bombs" are replaced by "STATEMENT" and "WEAPON", respectively. This procedure corresponds to Flekova and Gurevych's (2016) setup for learning supersense embeddings, and our hyperparameter choices follow their best-performing ones: training algorithm: skip-gram model, embedding dimension: 300, minimal word frequency: 10, negative sampling of noise words: 5, window size: 2, initial learning rate: 0.025, and iterations: 10.
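A minimal sketch of this procedure with gensim (using gensim 4 parameter names; the preprocessing function and the `sentences` format are illustrative, and `frame_of` is an assumed mapping from annotated predicate tokens to frame labels):

```python
from gensim.models import Word2Vec

def replace_predicates(tokens, frame_of):
    # Replace every frame-evoking predicate by its frame label,
    # leaving all other tokens unchanged.
    return [frame_of.get(tok, tok) for tok in tokens]

def train_frame_embeddings(sentences, frame_of):
    # sentences: list of token lists from the FN 1.5 corpus.
    corpus = [replace_predicates(s, frame_of) for s in sentences]
    # Hyperparameters as in Section 4.1: skip-gram, dim 300,
    # min frequency 10, 5 negative samples, window 2, 10 iterations.
    return Word2Vec(corpus, sg=1, vector_size=300, min_count=10,
                    negative=5, window=2, alpha=0.025, epochs=10)
```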

Prototypical relation embeddings We denote the learned embedding of a frame f1 by e1. We use the frame embeddings to infer prototypical F2F relation embeddings er with the vector offset method in the following way: we denote with Ir the relation-specific subset of G containing all instances (f1, r, f2) of relation r (see the frame pair counts in Table 1). The vector offset o(e1, e2) for two frames (f1, f2) is the difference of their embeddings, see Equation 1.

offset(f1, f2) = e2 − e1     (1)

We denote with Or the relation-specific set of vector offsets of all (f1, f2) ∈ Ir. We define the prototypical embedding er for a relation r as the mean over all o(e1, e2) ∈ Or. For visualizations in vector space we use t-SNE plots (t-distributed Stochastic Neighbor Embedding (Maaten and Hinton, 2008)).

Difficulty of associating frame pairs with prototypical relations Associating the offset of a frame pair o(e1, e2) ∈ Or with the correct prototypical relation embedding er is easier if the intra-relation variation (i.e., the deviation of frame pair embeddings from their prototypical embedding) is smaller than the inter-relation variation (i.e., the distances between prototypical embeddings). This means the association is easier if two frame pairs which are members of the same F2F relation, on average, differ less from each other than they would differ from a member of another relation. As a way to capture this difficulty of association, we compare the mean cosine distance between all prototypical relation embeddings er of all r ∈ R to the relation-specific mean cosine distance between the frame pair embeddings in Or and the prototypical embedding er.
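Both variation measures can be computed as in the following Python sketch (`offsets_by_relation`, mapping each relation to the array of its frame-pair offsets, is our own illustrative structure):

```python
import numpy as np
from scipy.spatial.distance import cosine  # cosine distance, 1 - similarity

def prototype(offsets):
    # Prototypical relation embedding: mean over all pair offsets.
    return offsets.mean(axis=0)

def intra_relation_variation(offsets_by_relation):
    # Mean cosine distance between each pair offset and its own
    # prototype, averaged over all relations.
    return np.mean([np.mean([cosine(o, prototype(offs)) for o in offs])
                    for offs in offsets_by_relation.values()])

def inter_relation_variation(offsets_by_relation):
    # Mean cosine distance between all pairs of prototypes.
    protos = [prototype(offs) for offs in offsets_by_relation.values()]
    return np.mean([cosine(p, q)
                    for i, p in enumerate(protos) for q in protos[i + 1:]])
```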

4.2 Experiments and Results

Frame embeddings Once the frame embeddings are learned, we perform a sanity check by inspecting the most similar frame embeddings by cosine similarity. Checking the top 10 most similar frame embeddings confirms that known properties of word and sense embeddings also apply to frame embeddings: the top 10 most similar frames are semantically related, both for frame embeddings learned with WSABIE and with Word2Vec. This is exemplified in Table 3 for the two most frequently occurring frames in the text data evoked by nouns ("Weapon") and by verbs ("Statement"). For both WSABIE and Word2Vec, in many cases the most similar frames are obviously semantically related, with some exceptions where relatedness is hard to judge or holds only via an association chain.

Top 10 most similar frames

Weapon
  WSABIE: Substance, Shoot projectiles, Manufacturing, Bearing arms, Toxic substance, Store, Hostile encounter, Ingredients, Smuggling, Active substance
  Word2Vec: Military, Substance, Operational testing, Electricity, Process completed state, Information, Active substance, Range, Estimated value, Cause to make progress

Statement
  WSABIE: Evidence, Causation, Topic, Point of dispute, Request, Text creation, Awareness, Cognitive connection, Make agreement on action, Communication
  Word2Vec: Reveal secret, Telling, Chatting, Complaining, Reasoning, Communication response, Reassuring, Bragging, Questioning, Cogitation

Table 3: Top 10 most similar frames for two exemplary most frequent frames.

Figure 3: t-SNE plot of embeddings for the two most frequent relations. Small: frame pair embeddings (offsets). Large: prototypical embeddings (means).

For the frame "Weapon", the most similar frames by Word2Vec are weaker compared to WSABIE; however, this does not allow a general conclusion over all frames learned with Word2Vec or WSABIE.

F2F relations To check whether the frame embeddings directly mirror F2F relations, we measure the difficulty of associating frame pairs with the correct prototypical relation embedding.

First, we visualize the frame pair embeddings in the training set and the inferred prototypical relation embeddings in vector space with t-SNE plots. Figure 3 depicts examples of WSABIE embeddings for the most frequently occurring F2F relations, Inherits from and Uses, and shows that the prototypical embeddings are very close to each other, whilst there are no separate relation-specific clusters of frame pairs. Vector space visualizations of embeddings stemming from both Word2Vec and WSABIE hint that the embeddings have difficulties in mirroring the F2F relations.
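A sketch of such a visualization with scikit-learn and matplotlib (illustrative only; plotting details such as marker sizes are our own choices):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_offsets(offsets_by_relation):
    # Project all frame-pair offsets plus the per-relation prototype
    # (the mean offset) into 2D with t-SNE and scatter-plot them.
    labels, vectors = [], []
    for rel, offs in offsets_by_relation.items():
        labels.extend([(rel, False)] * len(offs))
        vectors.extend(offs)
        labels.append((rel, True))            # prototype marker
        vectors.append(np.mean(offs, axis=0))
    points = TSNE(n_components=2).fit_transform(np.array(vectors))
    for (x, y), (rel, is_proto) in zip(points, labels):
        plt.scatter(x, y, s=80 if is_proto else 8)
    plt.show()
```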

Second, we quantify the insights from the plots by comparing the distances between all prototypical embeddings to the mean over all mean distances between frame pair embeddings and their prototypical embeddings. Table 4 lists these vector space (cosine) distances. It shows that the distance between the prototypical embeddings (inter-relation) is smaller than that between frame pair embeddings and their corresponding prototypical embeddings (intra-relation). In other words, two frame pairs which are members of the same relation, on average, differ as much from each other as they would differ from a member of another relation.

Mean distances between           WSABIE        Word2Vec
inter-relation variation        0.73 ± 0.28   0.76 ± 0.28
  (between prototypes)
intra-relation variation        0.75 ± 0.04   0.78 ± 0.05
  (between frame pairs and their prototypes)

Table 4: Cosine distances between the F2F relation embeddings.

To sum up, we find that embeddings of frame pairs that are in the same relation do not have a similar vector offset corresponding to the F2F relation. The FN hierarchy could not be reconstructed by the statistical analysis of text-based embeddings because there is as much intra-relation variation as inter-relation variation. We conclude that, concerning the methods we explored, the frame embeddings learned with WSABIE and Word2Vec have difficulties in showing structures in vector space corresponding to F2F relations, and that F2F relations might not emerge purely from textual data. Hence, these text-based frame embeddings cannot be used as such to reliably infer the correct relation for a frame pair but might need some advanced learning. In the next section, we address the prediction of F2F relations with algorithms that learn from the knowledge contained in the FN hierarchy.

    5 Frame-to-Frame Relation Prediction

We aim at developing a system for finding the correct F2F relation given two frames, which can potentially be used for automatic completion of the F2F relation annotations in the FN hierarchy. This task transfers the principles of Link Prediction from KGC to the case of FN. As the previous experiment suggested that text-based frame embeddings do not mirror the F2F relations, we develop a system that learns from the knowledge contained in the FN hierarchy and that uses pretrained frame embeddings as input representations. Related work in KGC also demonstrates the strengths of representations trained directly on the KG for this task. For our systems involving learning, we experiment with different embeddings as input representations: in addition to the text-based frame embeddings, we also learn knowledge-based embeddings for frames and for F2F relations on the structure of the FN hierarchy with TransE, an approach well known from KGC. We want to quantify which combination of pretrained embeddings and system is most promising for the F2F Relation Prediction task.

    5.1 Methods

TransE embeddings In addition to the text-based frame embeddings, we also learn embeddings for frames as well as for F2F relations by applying the well-known translation model TransE. TransE leverages the structure of the knowledge base, which is in our case the FN hierarchy with its collection of (frame, relation, frame) triples, and learns low-dimensional vector representations for frames and for F2F relations in the same space. These embeddings have the property of being learned explicitly to incorporate the annotations from the FN hierarchy. For this knowledge-based approach to learning frame and F2F relation embeddings, we use the implementation of TransE provided by Lin et al. (2015), yielding embeddings of dimension 50.
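For this step, the FN hierarchy triples G need to be serialized in the (head, relation, tail) format that KGC toolkits expect; a sketch (the concrete file layout required by the Lin et al. (2015) code may differ):

```python
def write_triples(triples, path):
    # Serialize (f1, r, f2) triples as tab-separated lines, a common
    # input format for knowledge graph embedding toolkits.
    with open(path, "w", encoding="utf-8") as out:
        for f1, r, f2 in triples:
            out.write(f"{f1}\t{f2}\t{r}\n")
```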

Neural network for relation selection We propose a nonlinear model based on NNs to identify the best F2F relation r between a frame pair (f1, f2). Figure 4 shows the proposed NN architecture. Given a training instance, i.e., a triple (f1, r, f2), we feed a vector representation for each element into the NN. By default the input vector representations are initialized randomly, but they can also stem from a pretraining step (more details in Sec. 5.2). Within the NN, the initial vector representations of the two frames are combined into an internal dense layer c, followed by the calculation of the cosine similarity between this combination and the representation of the F2F relation r.

Figure 4: NN architecture for training on the F2F Relation Prediction task. (Schematic: the embeddings of f1 and f2 feed a dense combination layer c; the cosine similarities cos(c, r) for the positive relation r and cos(c, r′) for a sampled negative relation r′ feed the ranking loss.)

Meanwhile, a negative relation r′ is sampled randomly (by selecting an F2F relation which does not hold between the two frames) and its vector representation is also fed into the NN. The negative relation is processed in the same way as the correct one, yielding a second cosine similarity. Finally, the NN minimizes the following ranking loss:

loss = max{0, m − cos(c, r) + cos(c, r′)}     (2)

Here m is a margin and cos is the cosine similarity function. That is, the internal representations are trained to maximize the similarity between the frame pair and the correct relation and to minimize it for the negative relation. Our hyperparameter choices are: epochs: 550, size of dense layers: 128, dropout: 0.2, margin: 0.1, activation function: hyperbolic tangent, batch size: 2, learning rate: 0.001.
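A minimal NumPy sketch of the forward pass and the loss of Equation 2 (illustrative only: the paper specifies a dense combination layer but not how the two frame vectors enter it, so the concatenation below is our assumption):

```python
import numpy as np

def cos(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def ranking_loss(e1, e2, r_pos, r_neg, W, b, margin=0.1):
    # Dense combination layer c over the two frame embeddings
    # (tanh activation as in Section 5.1; combination by
    # concatenation is assumed).
    c = np.tanh(W @ np.concatenate([e1, e2]) + b)
    # Eq. (2): push cos(c, r) above cos(c, r') by at least the margin.
    return max(0.0, margin - cos(c, r_pos) + cos(c, r_neg))
```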

    5.2 Experiments and Results

Given a triple (f3, r, f4) from the test set, we want to predict the correct relation r for (f3, f4). As described in Sec. 3, 70% of the triples in the FN hierarchy are used for training. Our systems are:
• 0a) rand.bsl: A random guessing baseline that chooses a relation randomly out of R.
• 0b) maj.bsl: An informed majority baseline that leverages the skewed distribution in the training set and predicts the most frequent relation.
• 1) off: A test of the pretrained frame embeddings (WSABIE and Word2Vec) as introduced in Section 4. It computes the vector offset o(e3, e4) (therefore "off") between the test frame embeddings, measures the similarity with the prototypical mean relation embeddings er of the training set, and ranks the relations with respect to (cosine) similarity to output the closest one (see the sketch after this list). There is no further training with respect to the FN hierarchy.
• 2) reg: A test of the pretrained frame embeddings (WSABIE and Word2Vec) as introduced in Section 4, involving training with respect to the FN hierarchy. It is a multinomial logistic regression model (therefore "reg") that trains its weights and biases on the training triples. It takes the test frame embeddings e3, e4 as input and ranks the predictions for a relation via the softmax function.
• 3) NN: The NN architecture as described in Section 5.1, trained with respect to the FN hierarchy on the training triples. By default, it uses randomly initialized input representations, but it can also take pretrained representations as input: (a) the pretrained frame embeddings (WSABIE and Word2Vec) and inferred prototypical mean relation embeddings as introduced in Section 4, and (b) the TransE frame and relation embeddings trained on the training triples from the FN hierarchy as introduced in Section 5.1.
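A sketch of system 1 in Python (illustrative names; `prototypes` maps each relation to its prototypical mean offset from the training set):

```python
import numpy as np

def cos(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def predict_off(e3, e4, prototypes):
    # System 1 ("off"): rank all F2F relations by the cosine
    # similarity between the test pair's offset and each
    # prototypical relation embedding; the first element of the
    # returned list is the predicted relation.
    o = e4 - e3
    return sorted(prototypes, key=lambda r: cos(o, prototypes[r]),
                  reverse=True)
```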

To evaluate the predictions of our systems on the F2F Relation Prediction task, we report accuracy, the mean rank of the true relation, and hits amongst the 5 first predictions, see Table 5.
Accuracy measures the proportion of correctly predicted relations amongst all predictions.
For the next two measures, not only the one predicted relation is of interest, but the ranked list of all relations with the predicted relation at rank 1.
Mean rank measures the mean of the rank of the true relation label over all predictions, aiming at a low mean rank (best is mr = 1).
Hits@5 measures the proportion of true relation labels ranked in the top 5.
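The three measures can be computed as in the following sketch (illustrative; `rankings` pairs each test triple's ranked relation list with its true label):

```python
import numpy as np

def evaluate(rankings):
    # rankings: list of (ranked_relations, true_relation), where
    # ranked_relations has the system's prediction at index 0.
    ranks = np.array([rr.index(true) + 1 for rr, true in rankings])
    accuracy = np.mean(ranks == 1)    # top prediction is correct
    mean_rank = ranks.mean()          # lower is better, best is 1
    hits_at_5 = np.mean(ranks <= 5)   # true relation within top 5
    return accuracy, mean_rank, hits_at_5
```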

The random guessing baseline is a weak baseline that is outperformed by all approaches. The informed majority baseline, however, is a strong baseline given the skewed distribution of F2F relations in the FN hierarchy.

A comparison of this strong baseline with system 1 using the text-based frame embeddings (WSABIE and Word2Vec) and the similarity with prototypical relation embeddings emphasizes the difficulties of these embeddings in reconstructing the F2F relations. Concerning accuracy scores, system 1 performs slightly better than the strong baseline, but concerning the other two measures, mean rank and hits@5, it is the other way round.

System        Embed.      acc ↑    mr ↓    hits@5 ↑
0: rand.bsl   -            7.69    6.5      38.46
0: maj.bsl    -           22.48    3.27     87.51
1: off        WSABIE      25.22    4.50     68.52
1: off        Word2Vec    30.61    4.53     66.96
1: off        TransE      51.13    2.99     83.30
2: reg        WSABIE      35.65    3.14     84.00
2: reg        Word2Vec    41.91    2.81     88.00
2: reg        TransE      66.61    1.93     93.22
3: NN         random      26.89    3.67     77.00
3: NN         WSABIE      27.46    3.59     79.98
3: NN         Word2Vec    30.55    3.27     82.61
3: NN         TransE      67.73    1.83     94.39

Table 5: Performances on the relation prediction task.


Another point in favor of system 1 is the fact that it does not involve training on the triples but is still competitive with the strong baseline that leverages the underlying distribution of the triples. This indicates that to some extent the textual frame embeddings still capture useful information for the F2F Prediction task. In a further step, we also use the embeddings pretrained on the F2F relations of the FN hierarchy (TransE), even if in this setting we do not need to calculate prototypical relation embeddings, as TransE provides embeddings for frames and relations. Thus, system 1 uses the TransE embeddings directly to calculate the similarity between the frame embeddings' vector offset and the relation embeddings. The large improvement in all performance measures shows the strength of knowledge-based embeddings over text-based embeddings and confirms the difficulty of text-based embeddings in reconstructing the F2F relations.

Performance increases with system 2, the softmax regression model involving learning. This shows the effect of training with respect to the F2F relations. It indicates that training should be involved when leveraging the text-based frame embeddings for the F2F Prediction task. Using embeddings pretrained on the F2F relations of the FN hierarchy (TransE) instead again leads to a large improvement in all performance measures. This confirms that embeddings designed to incorporate the knowledge from the FN hierarchy are better suited for the F2F Relation Prediction task, and it emphasizes the large improvement over the textual embeddings.

Overall, we achieve the best results in all performance measures with system 3, the NN approach, in combination with the knowledge-based TransE embeddings as input representations. Interestingly, the difference between the NN and the regression model is only marginal when using the TransE embeddings, indicating the crucial influence of the knowledge-based embeddings and not necessarily of the system. Moreover, when using the text-based WSABIE and Word2Vec embeddings, the softmax regression model is stronger than the NN, which might be due to little training data. Furthermore, the randomly initialized embeddings for system 3 can be seen as another baseline, which is beaten not only by the knowledge-based TransE embeddings but also by the text-based WSABIE and Word2Vec embeddings in systems 2 and 3. This again indicates the capability of the textual frame embeddings to capture useful information for the F2F Prediction task, at least to some extent.

Figure 5: Relation-specific analysis of the best-performing model with respect to accuracy.

The systems could reach higher scores if the split of the data into training and test triples were done randomly per relation, such that the training and test sets had some (random) relation-specific overlap in the frames at position f1 of the triple. But in this case, it would not be clear whether the systems merely perform "lexical memorization", as pointed out by Levy et al. (2015), when the test set contains partial instances that were in the training set. We leave it to future work to contrast and explore different splits, e.g., a random split, or zero overlap by relation or by all relations.

To sum up, on the one hand, the results confirm the conclusions from the exploration in Section 4: the frame embeddings learned on frame-labeled text in the context of other tasks are not able to reliably mirror the F2F relations, not even when used as input representations to a classifier. On the other hand, our results clearly emphasize the influence of the knowledge-based embeddings on the performance of our best-performing system. Thus, we propose this NN architecture in combination with the TransE embeddings as the first system for automatic F2F relation annotation for frame pairs in the FN hierarchy.

Figure 5 depicts a relation-specific analysis of the best-performing model, showing good performance (above 60% accuracy) for frequent relations, a drop for the less frequent precedence relations, and no capability at all in predicting infrequent relations, such as Is Causative of, See also and Is Inchoative of.

    5.3 Demonstration of Predictions

We demonstrate the best-performing system's predictions for examples of frame pairs which are not annotated so far.


(f1, f2)                               top 3 F2F relation predictions
(Biological urge, Sleep wake cycle)    Subframe of, Inherits from, Uses
(Biological urge, Being awake)         Is Inherited by, Precedes, Is Preceded by
(Biological urge, Fall asleep)         Is Inherited by, Precedes, Is Preceded by

Table 6: F2F relation predictions of the best system.

Looking back at the motivational example from the beginning, Figure 1 illustrated the incompleteness of the FN hierarchy at the F2F relation level with the example of a possibly missing precedence relation from "Being awake" to "Biological urge" (evoked by the predicate "tired"). Table 6 displays the top 3 F2F relation predictions for the frame pairs around "Biological urge" in the figure. The expected F2F relation is indeed amongst the top 3 predictions of the best-performing system for this example, even for the precedence relation, which is rather underrepresented in the data. If this system were used to make suggestions to human expert annotators, they should be informed about the system being biased against the infrequent relations. However, it is hard to do a proper manual evaluation, as judging the suggested relations requires expert knowledge of the definitions and annotation best practices for the F2F relations. We propose using the best-performing system for semi-automatic FN completion on the relation level in cooperation with FN annotation experts. The system can be used to make reasonable suggestions of relations for frame pairs, and the final decision could be made by experienced FN annotators. This would be a first step towards addressing the incompleteness of F2F relation annotations in FN, which in turn could improve the performance in other tasks that take these F2F relations as input.

    6 Discussion and Future Work

As the F2F relations of the FN hierarchy did not emerge from frame embeddings learned on frame-labeled text, the F2F relations should be seen as meta-structures without direct evidence in text. On the one hand, more advanced approaches might be needed to distill F2F relations for frames occurring in raw text, by learning about commonsense knowledge involving frames and then inferring the implicit relations. Here, it could also be helpful to exploit inter-sentential clues, e.g., event chains, to enrich the frame embeddings, which so far are built on the sentence level. On the other hand, the automatic completion of F2F relations can rely on knowledge-based embeddings trained on the hierarchy. To this end, an expert evaluation of the best-performing system's predictions for frame pairs could give clues for further system improvements. It could also yield an expert upper bound and may pave the way for developing advanced systems using frame embeddings for the prediction of F2F relations. Finally, we plan to investigate the case of FN for embeddings learned on both frame-labeled texts and F2F relation annotations. With such a combination, the limitation of the text-based embeddings to frames that have LUs (and hence occur in text) can be overcome, as the knowledge-based embeddings also have access to frames without LUs. Last but not least, for different tasks, different representations of frames and relations might be better suited: embeddings purely learned on text, embeddings purely learned on the FN hierarchy, or a combination of both.

    7 Conclusion

We raised the question whether text-based frame embeddings naturally mirror F2F relations in the FN hierarchy. We set up the F2F Relation Prediction task as an adaptation of the link prediction task from KGC to the case of FN. Through this task, we quantify the ability of systems and embeddings to predict F2F relations. The F2F Relation Prediction task addresses the need for automatically completing F2F relations that are used in downstream tasks. Our best-performing system for predicting F2F relations is an NN trained on the FN hierarchy that uses knowledge-based embeddings which by design incorporate the F2F relations. It can be used to suggest more F2F relation annotations for the FN hierarchy. The comparison of our different systems and embeddings reveals insights about the difficulty of reconstructing F2F relations purely from text. We encourage the development of advanced systems and embeddings for the F2F Relation Prediction task.

    Acknowledgments

This work has been supported by the DFG-funded research training group "Adaptive Preparation of Information from Heterogeneous Sources" (AIPHES, GRK 1994/1). We also acknowledge the useful comments of the anonymous reviewers.


References

Roni Ben Aharon, Idan Szpektor, and Ido Dagan. 2010. Generating Entailment Rules from FrameNet. In Proceedings of the ACL 2010 Conference Short Papers. Association for Computational Linguistics, pages 241–246. http://www.aclweb.org/anthology/P10-2045.

Collin F. Baker, Charles J. Fillmore, and John B. Lowe. 1998. The Berkeley FrameNet Project. In Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics - Volume 1. Association for Computational Linguistics, Montreal, Quebec, Canada, pages 86–90. https://doi.org/10.3115/980451.980860.

Daniel Bauer, Hagen Fürstenau, and Owen Rambow. 2012. The Dependency-Parsed FrameNet Corpus. In Proceedings of the 8th Language Resources and Evaluation Conference (LREC 2012). Istanbul, Turkey, pages 3861–3867. http://hdl.handle.net/10022/AC:P:21192.

Antoine Bordes, Xavier Glorot, Jason Weston, and Yoshua Bengio. 2012. Joint Learning of Words and Meaning Representations for Open-Text Semantic Parsing. In Neil D. Lawrence and Mark Girolami, editors, Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics. PMLR, La Palma, Canary Islands, volume 22 of Proceedings of Machine Learning Research, pages 127–135. http://proceedings.mlr.press/v22/bordes12.html.

Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko. 2013. Translating Embeddings for Modeling Multi-relational Data. In C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 26. Curran Associates, Inc., pages 2787–2795. http://papers.nips.cc/paper/5071-translating-embeddings-for-modeling-multi-relational-data.pdf.

Antoine Bordes, Jason Weston, Ronan Collobert, and Yoshua Bengio. 2011. Learning Structured Embeddings of Knowledge Bases. In Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence. AAAI Press, AAAI'11, pages 301–306. http://dl.acm.org/citation.cfm?id=2900423.2900470.

Bob Coyne and Owen Rambow. 2009. LexPar: A Freely Available English Paraphrase Lexicon Automatically Extracted from FrameNet. In Proceedings of the Third IEEE International Conference on Semantic Computing (ICSC 2009), pages 53–58. https://doi.org/10.1109/ICSC.2009.56.

Christiane Fellbaum. 1990. English Verbs as a Semantic Net. International Journal of Lexicography 3(4):278–301. https://doi.org/10.1093/ijl/3.4.278.

Minwei Feng, Bing Xiang, Michael R. Glass, Lidan Wang, and Bowen Zhou. 2015. Applying Deep Learning to Answer Selection: A Study and An Open Task. In 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU). IEEE, pages 813–820. https://doi.org/10.1109/ASRU.2015.7404872.

Charles J. Fillmore. 1976. Frame Semantics and the Nature of Language. Annals of the New York Academy of Sciences 280(1):20–32. https://doi.org/10.1111/j.1749-6632.1976.tb25467.x.

Charles J. Fillmore and Collin F. Baker. 2001. Frame Semantics for Text Understanding. In Proceedings of WordNet and Other Lexical Resources Workshop, NAACL.

Lucie Flekova and Iryna Gurevych. 2016. Supersense Embeddings: A Unified Model for Supersense Interpretation, Prediction, and Utilization. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL 2016), Volume 1: Long Papers. Association for Computational Linguistics, Berlin, Germany, pages 2029–2041.

Silvana Hartmann and Iryna Gurevych. 2013. FrameNet on the Way to Babel: Creating a Bilingual FrameNet Using Wiktionary as Interlingual Connection. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL 2013), Volume 1. Association for Computational Linguistics, Stroudsburg, PA, USA, pages 1363–1373.

Silvana Hartmann, Ilia Kuznetsov, Teresa Martin, and Iryna Gurevych. 2017. Out-of-domain FrameNet Semantic Role Labeling. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2017). Association for Computational Linguistics, pages 471–482.

Karl Moritz Hermann, Dipanjan Das, Jason Weston, and Kuzman Ganchev. 2014. Semantic Frame Identification with Distributed Word Representations. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, Baltimore, Maryland, pages 1448–1458. http://www.aclweb.org/anthology/P14-1136.

Ignacio Iacobacci, Mohammad Taher Pilehvar, and Roberto Navigli. 2015. SensEmbed: Learning Sense Embeddings for Word and Relational Similarity. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Association for Computational Linguistics, Beijing, China, pages 95–105. http://www.aclweb.org/anthology/P15-1010.

Omer Levy and Yoav Goldberg. 2014. Dependency-Based Word Embeddings. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Association for Computational Linguistics, Baltimore, MD, USA, pages 302–308. https://doi.org/10.3115/v1/P14-2050.

Omer Levy, Steffen Remus, Chris Biemann, Ido Dagan, and Israel Ramat-Gan. 2015. Do Supervised Distributional Methods Really Learn Lexical Inference Relations? In HLT-NAACL, pages 970–976. https://doi.org/10.3115/v1/N15-1098.

Yankai Lin, Zhiyuan Liu, Maosong Sun, Yang Liu, and Xuan Zhu. 2015. Learning Entity and Relation Embeddings for Knowledge Graph Completion. In AAAI, pages 2181–2187.

Laurens van der Maaten and Geoffrey Hinton. 2008. Visualizing Data using t-SNE. Journal of Machine Learning Research 9(Nov):2579–2605.

Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013a. Efficient Estimation of Word Representations in Vector Space. In Proceedings of Workshop at ICLR.

Tomas Mikolov, Wen-tau Yih, and Geoffrey Zweig. 2013b. Linguistic Regularities in Continuous Space Word Representations. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2013). Association for Computational Linguistics, pages 746–751. https://www.microsoft.com/en-us/research/publication/linguistic-regularities-in-continuous-space-word-representations/.

George A. Miller. 1990. Nouns in WordNet: A Lexical Inheritance System. International Journal of Lexicography 3(4):245–264. https://doi.org/10.1093/ijl/3.4.245.

Roberto Navigli and Simone Paolo Ponzetto. 2012. BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network. Artificial Intelligence 193:217–250. https://doi.org/10.1016/j.artint.2012.07.001.

Ellie Pavlick, Travis Wolfe, Pushpendre Rastogi, Chris Callison-Burch, Mark Dredze, and Benjamin Van Durme. 2015. FrameNet+: Fast Paraphrastic Tripling of FrameNet. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Short Papers). Beijing, China, pages 408–413.

Pushpendre Rastogi and Benjamin Van Durme. 2014. Augmenting FrameNet Via PPDB. In Proceedings of the Second Workshop on EVENTS: Definition, Detection, Coreference, and Representation. Association for Computational Linguistics, Baltimore, Maryland, USA, pages 1–5.

Radim Řehůřek and Petr Sojka. 2010. Software Framework for Topic Modelling with Large Corpora. In Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks. ELRA, Valletta, Malta, pages 45–50. http://is.muni.cz/publication/884893/en.

Josef Ruppenhofer, Michael Ellsworth, Miriam R. L. Petruck, Christopher R. Johnson, and Jan Scheffczyk. 2006. FrameNet II: Extended Theory and Practice. Distributed with the FrameNet data.

Ming Tan, Bing Xiang, and Bowen Zhou. 2015. LSTM-based Deep Learning Models for Non-factoid Answer Selection. arXiv preprint arXiv:1511.04108.

Jason Weston, Samy Bengio, and Nicolas Usunier. 2011. WSABIE: Scaling Up to Large Vocabulary Image Annotation. In Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence (IJCAI'11). AAAI Press, Barcelona, Catalonia, Spain, pages 2764–2770. https://doi.org/10.5591/978-1-57735-516-8/IJCAI11-460.