productivity, reuse, and competition between...

143
Productivity, Reuse, and Competition between Generalizations Timothy J. O’Donnell MIT

Upload: others

Post on 24-Sep-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Productivity, Reuse, and Competition between

Generalizations

Timothy J. O’DonnellMIT

Page 2: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Two Problems

1. Problem of Competition

2. Problem of Productivity

Page 3: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

The Problem of Competition

When multiple ways of expressing a meaning exist, how do we decide

between them?

Page 4: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Competition (e.g., Aronoff, 1976; Plag, 2003; Rainer, 1988; van Marle, 1986)

• Examples

• Computed v. Stored

• goed v. went

• Computed v. Computed

• splinged v. splang (Albright & Hayes, 2003)

• Multi-way competition

Page 5: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Multi-way Competition

• Hierarchical and recursive structures often give rise to multi-way competition between different combinations of stored and computed subexpression.

Page 6: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Multi-way Competition (Aronoff, 1976)

6

Page 7: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Competition Resolution• Competition is resolved in general following the

elsewhere condition (subset principle, Pāṇini’s principle, blocking, pre-emption, etc.)

• “More specific” way of expressing meaning is preferred to “more general” way.

• Variability in strength of preferences

• goed v. went

• curiosity v. curiousness, depulsiveness v. depulsivity (Aronoff & Schvaneveldt, 1978)

• tolerance v. toleration (i.e., doublets, e.g., Kiparsky, 1982a)

• More frequent items are more strongly preferred (e.g., Marcus et al. 1992)

Page 8: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

The Problem of Productivity

Why can some potential generalizations actually generalize productively, while others remain “inert” in existing expressions?

Page 9: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Suffix

Productive (with Adjectives) -ness

Context-Dependent -ity

Unproductive -th

Productivity

Page 10: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Suffix

Productive (with Adjectives) -ness

Semi-productive -ity

Unproductive -th

Productivity

circuitousness, grandness, orderliness, pretentiousness, cheapness, ...Existing:

pine-scentednessNovel: pine-scented

Page 11: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Suffix

Productive (with Adjectives) -ness

Semi-productive -ity

Unproductive -th

Productivity

Page 12: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Suffix

Productive (with Adjectives) -ness

Context-Dependent -ity

Unproductive -thverticality,tractability,severity, seniority, inanity, electricity, ...

Existing:

*pine-scentedityNovel:

Productivity

Page 13: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Suffix

Productive (with Adjectives) -ness

Context-Dependent -ity

Unproductive -th-ile, -al, -able, -ic, -(i)an

subsequentiabilitysubsequentiable

Productivity

Page 14: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Suffix

Productive (with Adjectives) -ness

Context-Dependent -ity

Unproductive -th

Productivity

Page 15: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Suffix

Productive (with Adjectives) -ness

Context-Dependent -ity

Unproductive -th

warmth, width, truth, depth, ...Existing:*coolthNovel:

Productivity

Page 16: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Suffix

Productive (with Adjectives) -ness

Context-Dependent -ity

Unproductive -th

Productivity

Page 17: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Suffix

Most Productive -ness

Less Productive -ity

Least Productive -th

1. How can differences in productivity be represented?

2. How can differences be learned?

Productivity and Reuse

Page 18: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Unifying the Problems

• Fundamental problem: How to produce/comprehend linguistic expressions under uncertainty about how meaning is conventionally encoded by combinations of stored items and composed structures.

• Productivity and competition are often just special cases of this general problem.

Page 19: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Approach

• Build a model of computation and storage under uncertainty based on an inference which optimizes a tradeoff between productivity (computation) and reuse (storage).

• This implicitly explains many specific cases of productivity and competition.

Page 20: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Case Studies

1. What distributional factors signal productivity?

• Explaining Baayen’s hapax-based measures.

2. How is competition resolved?

• Derives elsewhere condition.

3. Multi-way competition.

• Explains productivity and ordering generalization.

• Handles exceptional cases of paradoxical suffix combinations.

Page 21: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Talk Outline

1. Introduction to productivity and reuse with Fragment Grammars (with Noah Goodman).

2. Case Studies on Productivity and Competition.

Page 22: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Talk Outline

1. Introduction to productivity and reuse with Fragment Grammars (with Noah Goodman).

2. Case Studies on Productivity and Competition.

Page 23: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

The Framework: Three Ideas

1. Model how expressions are built by composing stored pieces.

2. Treat productivity (computation) and reuse (storage) as properties which must be determined on a case-by-case basis.

3. Infer correct patterns of storage and computation by balancing ability to predict input data against simplicity biases.

Page 24: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

A Simple Formal Model: Fragment Grammars

1. Formalization of the hypothesis space.

• Arbitrary contiguous (sub)trees.

2. Formalization of the inference problem.

• Probabilistic conditioning to find good balance between computation and storage.

Page 25: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

A Simple Formal Model: Fragment Grammars

1. Formalization of the hypothesis space.

• Arbitrary contiguous (sub)trees.

2. Formalization of the inference problem.

• Probabilistic conditioning to find good balance between computation and storage.

Page 26: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Underlying Computational System

N

Adj

V

agree

-able

-ity

Page 27: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Underlying Computational System

N

Adj

V

agree

-able

-ity

Page 28: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Underlying Computational System

N

Adj

V

agree

-able

-ity

Page 29: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Underlying Computational System

N

Adj

V

agree

-able

-ity

Page 30: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Hypothesis Space

Any contiguous subtree can be stored in memory and reused as if it were a single rule from the starting grammar.

Page 31: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Hypothesis SpaceN

Adj

V

agree

-able

-ity

Page 32: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Hypothesis SpaceN

Adj

V

agree

-able

-ity

Page 33: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Hypothesis SpaceN

Adj

V

agree

-able

-ity

Page 34: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Computation with Stored items

N

Adj

V

agree

-able

-ity

N

Adj

V

agree

-able

-ity

Page 35: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Computation with Stored items

N

Adj

V

agree

-able

-ity

N

Adj

V

agree

-able

-ity

Page 36: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Computation with Stored items

N

Adj

V

agree

-able

-ity

N

Adj

V

agree

-able

-ity

Page 37: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

A Simple Formal Model: Fragment Grammars

1. Formalization of the hypothesis space.

• Arbitrary contiguous (sub)trees.

2. Formalization of the inference problem.

• Probabilistic conditioning to find good balance between computation and storage.

Page 38: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Inference Problem

Find and store the subcomputations which best predict the distribution of forms in the linguistic input taking into account prior expectations for simplicity.

Page 39: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Prior Expectations

Two Opposing Simplicity Biases

1. Fewer, more reusable stored items.- Chinese Restaurant process prior on lexica.

2. Small amounts of computation.- Geometric decrease in probability in number of

random choices.

Page 40: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Example Input

N

Adj

V

agree

-able

-ness

N

Adj

V

count

-able

-ity

N

Adj

V

agree

-able

-ity

N

Adj

V

agree

-able

-ity

Thursday, February 2, 2012

Page 41: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Storage of Minimal, General Structures

N

Adj

V

agree

-able

-ness

N

Adj

V

count

-able

-ity

N

Adj

V

agree

-able

-ity

N

Adj

V

agree

-able

-ity

Thursday, February 2, 2012

Page 42: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Computation per Expression

N

Adj

V

agree

-able

-ity

Thursday, February 2, 2012

Page 43: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Computation per Expression

N

Adj

V

agree

-able

-ity

Thursday, February 2, 2012

P( )

Page 44: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Computation per Expression

N

Adj

V

agree

-able

-ity

Thursday, February 2, 2012

P( ) ×

P( )

Page 45: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Computation per Expression

N

Adj

V

agree

-able

-ity

Thursday, February 2, 2012

P( ) ×

P( )

P( )

×

Page 46: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Sharing Across Expressions

N

Adj

V

agree

-able

-ness

N

Adj

V

count

-able

-ity

N

Adj

V

agree

-able

-ity

N

Adj

V

agree

-able

-ity

Thursday, February 2, 2012

Page 47: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Sharing Across Expressions

N

Adj

V

agree

-able

-ness

N

Adj

V

count

-able

-ity

N

Adj

V

agree

-able

-ity

N

Adj

V

agree

-able

-ity

Thursday, February 2, 2012

Page 48: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Storage of Maximal, Specific Structures

N

Adj

V

agree

-able

-ness

N

Adj

V

count

-able

-ity

N

Adj

V

agree

-able

-ity

N

Adj

V

agree

-able

-ity

Thursday, February 2, 2012

Page 49: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Computation per Expression

N

Adj

V

agree

-able

-ity

Thursday, February 2, 2012

Page 50: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Computation per Expression

N

Adj

V

agree

-able

-ity

Thursday, February 2, 2012

P( )

Page 51: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Sharing Across Expressions

N

Adj

V

agree

-able

-ness

N

Adj

V

count

-able

-ity

N

Adj

V

agree

-able

-ity

N

Adj

V

agree

-able

-ity

Thursday, February 2, 2012

Page 52: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Sharing Across Expressions

N

Adj

V

agree

-able

-ness

N

Adj

V

count

-able

-ity

N

Adj

V

agree

-able

-ity

N

Adj

V

agree

-able

-ity

Thursday, February 2, 2012

Page 53: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Storage of Intermediate Structures

N

Adj

V

agree

-able

-ness

N

Adj

V

count

-able

-ity

N

Adj

V

agree

-able

-ity

N

Adj

V

agree

-able

-ity

Thursday, February 2, 2012

Page 54: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Computation per Expression

N

Adj

V

agree

-able

-ity

Thursday, February 2, 2012

Page 55: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Computation per Expression

N

Adj

V

agree

-able

-ity

Thursday, February 2, 2012

P( )

Page 56: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Computation per Expression

N

Adj

V

agree

-able

-ity

Thursday, February 2, 2012

Computation per Expression

N

Adj

V

agree

-able

-ity

Thursday, February 2, 2012

P( ) ×

P( )

Page 57: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Sharing Across Expressions

N

Adj

V

agree

-able

-ness

N

Adj

V

count

-able

-ity

N

Adj

V

agree

-able

-ity

N

Adj

V

agree

-able

-ity

Thursday, February 2, 2012

Page 58: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Sharing Across Expressions

N

Adj

V

agree

-able

-ness

N

Adj

V

count

-able

-ity

N

Adj

V

agree

-able

-ity

N

Adj

V

agree

-able

-ity

Thursday, February 2, 2012

Page 59: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Remarks on Inference Tradeoff

• Nothing fancy here.

• The two simplicity biases are just Bayesian prior and likelihood applied to computation and storage problem.

• Lexicon code length and data code length given lexicon in (two part) MDL.

• Can be connected with many other frameworks.

Page 60: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Inference as Conditioning

60

• Inference Process: Probabilistic Conditioning.

• Define joint model.

P(Data, Fragments) =

P(Data | Fragments) * P(Fragments)

Page 61: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Inference as Conditioning

61

• Inference Process: Probabilistic Conditioning.

• Define joint model.

P(Data, Fragments) =

P(Data | Fragments) * P(Fragments)

Likelihood(derivation probabilities)

Page 62: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Inference as Conditioning

62

• Inference Process: Probabilistic Conditioning.

• Define joint model.

P(Data, Fragments) =

P(Data | Fragments) * P(Fragments)

Prior(lexicon probabilities)

Page 63: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Inference as Conditioning

63

• Inference Process: Probabilistic Conditioning.

• Condition on particular dataset.

P(Fragments | Data) ∝

P(Data | Fragments) * P(Fragments)

Page 64: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Probabilistic Conditioning

64

• Intuition: two-step algorithm.

1. Throw away lexicons not consistent with the data.

2. Renormalize remaining lexicons so that they sum to one.

• Maximally conservative: Relative beliefs are always conserved.

Page 65: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

The Mathematical Model: Fragment Grammars

• Generalization of Adaptor Grammars (Johnson et al., 2007).

• Allows storing of partial trees.

• Framework first proposed in MDL setting by De Marcken, 1996.

• Related to work on probabilistic tree-substitution grammars (e.g., Bod, 2003; Cohn, 2010; Goodman, 2003; Zuidema, 2007; Post, 2013).

Page 66: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Talk Outline

1. Introduction to productivity and reuse with Fragment Grammars (with Noah Goodman).

2. Case Studies on Productivity and Competition.

Page 67: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Case Studies

• Other approaches to productivity and reuse.

1. What distributions signal productivity?

2. How is competition resolved?

3. Multi-way competition.

Page 68: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Case Studies

• Other approaches to productivity and reuse.

1. What distributions signal productivity?

2. How is competition resolved?

3. Multi-way competition.

Page 69: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Four Strategies for Productivity and Reuse

Thursday, February 2, 2012

Page 70: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Four Strategies for Productivity and Reuse

• 5 Formal Models

Thursday, February 2, 2012

Page 71: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Four Strategies for Productivity and Reuse

• 5 Formal Models

• Capture historical proposals from the literature.

Thursday, February 2, 2012

Page 72: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Four Strategies for Productivity and Reuse

• 5 Formal Models

• Capture historical proposals from the literature.

• Minimally different.

Thursday, February 2, 2012

Page 73: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Four Strategies for Productivity and Reuse

• 5 Formal Models

• Capture historical proposals from the literature.

• Minimally different.

• Same inputs, same underlying space of representations.

Thursday, February 2, 2012

Page 74: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Four Strategies for Productivity and Reuse

• 5 Formal Models

• Capture historical proposals from the literature.

• Minimally different.

• Same inputs, same underlying space of representations.

• State-of-the-art probabilistic models.

Thursday, February 2, 2012

Page 75: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Full-Parsing (MAP Multinomial-Dirichlet Context-

Free Grammars)

- All generalizations are productive.

- Minimal abstract units.

- Johnson, et al. 2007a

- Estimated on token frequency.

N

Adj

V

agree

-able

-ness

N

Adj

V

count

-able

-ity

N

Adj

V

agree

-able

-ity

N

Adj

V

agree

-able

-ity

Full-Parsing (FP)

Page 76: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Full-Listing (MAP All-Adapted Adaptor

Grammars)- Store whole form after first use

(recursively).

- Maximally specific units.

- Johnson, et al. 2007

- Base system estimated on type frequencies.

- Formalization of classical lexical redundancy rules.

N

Adj

V

agree

-able

-ness

N

Adj

V

count

-able

-ity

N

Adj

V

agree

-able

-ity

N

Adj

V

agree

-able

-ity

Full-Parsing (FP)

N

Adj

V

agree

-able

-ness

N

Adj

V

count

-able

-ity

N

Adj

V

agree

-able

-ity

N

Adj

V

agree

-able

-ity

Full-Listing (FL)

Page 77: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Exemplar-Based (Data-Oriented Parsing)

- Store all generalizations consistent with input.

- Two Formalization: Data-Oriented Parsing 1 (DOP1; Bod, 1998), Data-Oriented Parsing: Equal-Node Estimator (ENDOP; Goodman, 2003).

- Argued to be exemplar model of syntax.

Full-Parsing (FP)

N

Adj

V

agree

-able

-ness

N

Adj

V

count

-able

-ity

N

Adj

V

agree

-able

-ity

N

Adj

V

agree

-able

-ity

Full-Listing (FL)

N

Adj

V

agree

-able

-ness

N

Adj

V

count

-able

-ity

N

Adj

V

agree

-able

-ity

N

Adj

V

agree

-able

-ity

N

Adj

V

agree

-able

-ness

N

Adj

V

count

-able

-ity

N

Adj

V

agree

-able

-ity

N

Adj

V

agree

-able

-ity

Exemplar-Based(EB)

Page 78: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Inference-Based (Fragment Grammars)

- Store set of subcomputations which best explains the data.

- Formalization: Fragment Grammars (O’Donnell, et al. 2009)

- Inference depends on distribution of tokens over types.

- Only model which infers variables.

Full-Parsing (FP)

N

Adj

V

agree

-able

-ness

N

Adj

V

count

-able

-ity

N

Adj

V

agree

-able

-ity

N

Adj

V

agree

-able

-ity

Full-Listing (FL)

N

Adj

V

agree

-able

-ness

N

Adj

V

count

-able

-ity

N

Adj

V

agree

-able

-ity

N

Adj

V

agree

-able

-ity

Exemplar-Based(EB)

N

Adj

V

agree

-able

-ness

N

Adj

V

count

-able

-ity

N

Adj

V

agree

-able

-ity

N

Adj

V

agree

-able

-ity

N

Adj

V

agree

-able

-ness

N

Adj

V

count

-able

-ity

N

Adj

V

agree

-able

-ity

N

Adj

V

agree

-able

-ity

Inference-Based(IB)

Page 79: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Empirical Domains

Past Tense(Inflectional)

DerivationalMorphology

Productive +ed (walked) +ness (goodness)

Context-Dependent I →æ (sang) +ity (ability)

Unproductive suppletion (go/went)

+th (width)

Page 80: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Case Studies

• Other approaches to productivity and reuse.

1. What distributions signal productivity?

2. How is competition resolved?

3. Multi-way competition.

Page 81: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Empirical Evaluations

Past Tense DerivationalMorphology

Productive +ed (walked) +ness (goodness)

Context-Dependent I →æ (sang) +ity (ability)

Unproductive suppletion (go/went)

+th (width)

Page 82: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

What (Distributional) Cues Signal Productivity?

• Many proposals in the literature:

• Type frequency.

• Token frequency (combined with something else, e.g., entropy).

• Heterogeneity of context (generalized type frequency).

Page 83: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Full-Listing (MAG)Full-Parsing (MDPCFG)

Inference-Based (FG)

Exemplar (ENDOP)Exemplar (DOP1)

Top 5 Most Productive Suffixes

Page 84: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Top 5 Most Productive Suffixes

Full-Listing (MAG)Full-Parsing (MDPCFG)

Inference-Based (FG)

Exemplar (ENDOP)Exemplar (DOP1)

Page 85: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Full-Listing (MAG)Full-Parsing (MDPCFG)

Inference-Based (FG)

Exemplar (GDMN)Exemplar (DOP1)

Top 5 Most Productive Suffixes

Page 86: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

What Evidences Productivity?

• Crucial evidence of productivity: Use of a lexical item (morpheme, rule, etc.) to generate new forms.

• Distributional consequence: Large proportion of low frequency forms.

Page 87: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

What Predicts Productivity?

Page 88: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Top 5 Most Productive Suffixes

Full-Listing (MAG)Full-Parsing (MDPCFG)

Inference-Based (FG)

Exemplar (GDMN)Exemplar (DOP1)

High Proportion of Low Frequency Types

Page 89: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Full-Listing (MAG)Full-Parsing (MDPCFG)

Inference-Based (FG)

Exemplar (GDMN)Exemplar (DOP1)

Top 5 Most Productive Suffixes

High Token Frequency

High Type Frequency

High Token Frequency

High Token Frequency

Page 90: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

• Baayen’s / (e.g., Baayen, 1992)

• Estimators of productivity based on the proportion of frequency-1 words in an input corpus.

• Various derivations.

• Rate of vocabulary change in urn model.

• Good-Turing estimation.

• Fundamentally, a rule-of-thumb.

• Only defined for single affix estimation.

Baayen’s Hapax-Based Measures

P P⇤

Page 91: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Productivity Correlations ( values from Hay & Baayen, 2002)P/P⇤

MDPCFG(Full-parsing)

MAG(Full-listing)

DOP1(Exemplar-based)

ENDOP(Exemplar-based)

FG(Inference)

Page 92: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Fragment Grammars and Hapaxes

• For the case of single affixes, Fragments Grammars behave approximately as if they were using hapaxes.

• Not an explicit assumption of the model

• Model is about how words are built. Given the fact that some new words are built, behavior arises automatically.

• Generalizes to multi-way competition.

Page 93: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Case Studies

• Other approaches to productivity and reuse.

1. What distributions signal productivity?

2. How is competition resolved?

3. Multi-way competition.

Page 94: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Empirical Domains

Past Tense DerivationalMorphology

Productive +ed (walked) +ness (goodness)

Context-Dependent I →æ (sang) +ity (ability)

Unproductive suppletion (go/went)

+th (width)

Page 95: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Crucial Facts

• Defaultness: Regular rule applies when all else fails.

• Blocking: Existence of irregular blocks regular rule.

• In this domain preferences are sharp.

Page 96: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

How can Correct Inflection be Represented?

Irregulars Regulars

Thursday, February 2, 2012

Page 97: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

How can Correct Inflection be Represented?

Irregulars Regulars

Thursday, February 2, 2012

Page 98: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

−4−2

02

46

8

Correct Inflection

Log

Odd

s C

orre

ctIrregular ( Regular ( Unattested (

FP FPFPFL FL FLE1E1 E1E2 E2 E2IB IBIB

98

FP Full-Parsing(Multinomial-Dirichlet CFG)

FL Full-Listing(Adaptor Grammars)

E1 Exemplar (Data-Oriented Parsing 1)

E2 Exemplar (DOP: ENDOP)

IB Inference-Based(Fragment Grammars)

Page 99: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

−4−2

02

46

8

Log

Odd

s C

orre

ctIrregular ( Regular ( Unattested (

FP FPFPFL FL FLE1E1 E1E2 E2 E2IB IBIB

Preference for

Correct Past Form

99

FP Full-Parsing(Multinomial-Dirichlet CFG)

FL Full-Listing(Adaptor Grammars)

E1 Exemplar (Data-Oriented Parsing 1)

E2 Exemplar (DOP: ENDOP)

IB Inference-Based(Fragment Grammars)

Correct Inflection

Page 100: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

−4−2

02

46

8

Log

Odd

s C

orre

ctIrregular ( Regular ( Unattested (

FP FPFPFL FL FLE1E1 E1E2 E2 E2IB IBIB

Preference for

Incorrect Past Form 100

FP Full-Parsing(Multinomial-Dirichlet CFG)

FL Full-Listing(Adaptor Grammars)

E1 Exemplar (Data-Oriented Parsing 1)

E2 Exemplar (DOP: ENDOP)

IB Inference-Based(Fragment Grammars)

Correct Inflection

Page 101: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

−4−2

02

46

8

Log

Odd

s C

orre

ctIrregular ( Regular ( Unattested (

FP FPFPFL FL FLE1E1 E1E2 E2 E2IB IBIB

Irregulars in Training

101

FP Full-Parsing(Multinomial-Dirichlet CFG)

FL Full-Listing(Adaptor Grammars)

E1 Exemplar (Data-Oriented Parsing 1)

E2 Exemplar (DOP: ENDOP)

IB Inference-Based(Fragment Grammars)

Correct Inflection

Page 102: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

−4−2

02

46

8

Log

Odd

s C

orre

ctIrregular ( Regular ( Unattested (

FP FPFPFL FL FLE1E1 E1E2 E2 E2IB IBIB

Regulars in Training102

FP Full-Parsing(Multinomial-Dirichlet CFG)

FL Full-Listing(Adaptor Grammars)

E1 Exemplar (Data-Oriented Parsing 1)

E2 Exemplar (DOP: ENDOP)

IB Inference-Based(Fragment Grammars)

Correct Inflection

Page 103: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

−4−2

02

46

8

Log

Odd

s C

orre

ctIrregular ( Regular ( Unattested (

FP FPFPFL FL FLE1E1 E1E2 E2 E2IB IBIB

Regulars and Irregulars not in Training

103

FP Full-Parsing(Multinomial-Dirichlet CFG)

FL Full-Listing(Adaptor Grammars)

E1 Exemplar (Data-Oriented Parsing 1)

E2 Exemplar (DOP: ENDOP)

IB Inference-Based(Fragment Grammars)

Correct Inflection

Page 104: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

−4−2

02

46

8

Log

Odd

s C

orre

ctIrregular ( Regular ( Unattested (

FP FPFPFL FL FLE1E1 E1E2 E2 E2IB IBIB

104

FP Full-Parsing(Multinomial-Dirichlet CFG)

FL Full-Listing(Adaptor Grammars)

E1 Exemplar (Data-Oriented Parsing 1)

E2 Exemplar (DOP: ENDOP)

IB Inference-Based(Fragment Grammars)

Correct Inflection

Page 105: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

−4−2

02

46

8

Log

Odd

s C

orre

ctIrregular ( Regular ( Unattested (

FP FPFPFL FL FLE1E1 E1E2 E2 E2IB IBIB

Exemplar (Data-Oriented Parsing 1)

105

FP Full-Parsing(Multinomial-Dirichlet CFG)

FL Full-Listing(Adaptor Grammars)

E1 Exemplar (Data-Oriented Parsing 1)

E2 Exemplar (DOP: ENDOP)

IB Inference-Based(Fragment Grammars)

Correct Inflection

Page 106: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

−4−2

02

46

8

Log

Odd

s C

orre

ctIrregular ( Regular ( Unattested (

FP FPFPFL FL FLE1E1 E1E2 E2 E2IB IBIB

FP Full-Parsing(Multinomial-Dirichlet CFG)

FL Full-Listing(Adaptor Grammars)

E1 Exemplar (Data-Oriented Parsing 1)

E2 Exemplar (DOP: ENDOP)

IB Inference-Based(Fragment Grammars)

Full-Listing (Adaptor Grammars)

106

Correct Inflection

Page 107: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

−4−2

02

46

8

Log

Odd

s C

orre

ctIrregular ( Regular ( Unattested (

FP FPFPFL FL FLE1E1 E1E2 E2 E2IB IBIB

Inference-Based (Fragment Grammars)

107

FP Full-Parsing(Multinomial-Dirichlet CFG)

FL Full-Listing(Adaptor Grammars)

E1 Exemplar (Data-Oriented Parsing 1)

E2 Exemplar (DOP: ENDOP)

IB Inference-Based(Fragment Grammars)

Correct Inflection

Page 108: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Why Does Blocking Occur?

• Consequence of two principles.

• Law of Conservation of Belief: Hypotheses that predict a greater variety of observed datasets place less probability on each.

• Conservativity of Conditioning: Posterior distributions have same relative probability as prior distributions.

108

Page 109: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Law of Conservation of Belief

109

Page 110: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Law of Conservation of Belief

110

Page 111: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Law of Conservation of Belief

111

Page 112: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Observation

112

x

Page 113: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Conservativity

113

∝P(H1|D)P(H1)

∝P(H2|D)P(H2)

x

Page 114: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Past Tense

114

Page 115: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Elsewhere(Kiparsky, 1973; Anderson, 1969; Kiparsky, 1982a; Andrews, 1982)

• Don’t need elsewhere condition as independent stipulation (cf. subset principle, premption, etc.).

• When a choice must be made between two analyses/derivations, prefer the one with highest P(form | meaning) more “tightly.”

• More general than original statement.

• Any factor influencing P(form | meaning)

• input conditions on rules, frequency, etc.

• Stored-stored, stored-computed, computed-computed, etc.

Page 116: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Case Studies

• Other approaches to productivity and reuse.

1. What distributions signal productivity?

2. How is competition resolved?

3. Multi-way competition.

Page 117: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Empirical Domains

Past Tense DerivationalMorphology

Productive +ed (walked) +ness (goodness)

Context-Dependent I →æ (sang) +ity (ability)

Unproductive suppletion (go/went)

+th (width)

Page 118: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Hierarchical Structure

• Derivational morphology hierarchical and recursive.

• Multiple suffixes can appear in a word.

Page 119: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Many HypothesesN

Adj

V

agree

-able

-ity

Page 120: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Many HypothesesN

Adj

V

agree

-able

-ity

Page 121: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Many HypothesesN

Adj

V

agree

-able

-ity

Page 122: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Empirical Problem: Suffix Ordering

• Many combinations of suffixes do not appear in words.

• Fabb (1988).

- 43 suffixes.

- 663 possible pairs (taking into account selectional restrictions)

- Only 50 exist.

Page 123: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Empirical Problem: Suffix Ordering

• Many theories

• Level-ordering (e.g., Siegel, 1974)

• Selectional-restriction based (e.g., Plag, 2003)

• Complexity-based ordering (Hay, 2004)

• Focus on two phenomena

• Productivity and ordering generalization

• Paradoxical suffix combinations

Page 124: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Productivity and Ordering Generalization

(Hay, 2004)

On average, more productive suffixes appear after less productive suffixes (Hay, 2002; Hay and Plag, 2004; Plag et al, 2009).

Page 125: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Productivity and Ordering Generalization

(Hay, 2004)

• Implicit in many earlier theories (e.g., Level-Ordering Generalization of Siegel 1974).

• Hay’s argues for processing-based view (Complexity-Based Ordering)

• But: Follows as a logically necessary consequence of pattern of storage and computation.

Page 126: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Productivity and Ordering Generalization

• Intuition:

• Less productive suffixes stored as part of words.

• More productive suffixes can attach to anything, including morphologically-complex stored forms.

Page 127: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

But: Paradoxical Suffix Combinations

• Combinations of suffixes which violate the Productivity and Ordering Generalization (as well as predictions of other earlier theories).

• -ability, -ation, -istic, -mental

Page 128: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Multi-way Competition: -ity v. -ness

• In general, -ness more productive than -ity.

• -ity more productive after:

-ile, -able, -(i)an, -ic. (Anshen & Aronoff, 1981; Aronoff & Schvaneveldt, 1978; Cutler, 1980)

Page 129: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Two Frequent Combinations: -ivity v. -bility

• -ive + -ity: -ivity (e.g., selectivity).

• Speaker prefer to use -ness with novel words (Aronoff & Schvaneveldt, 1978).

• depulsiveness > depulsivity.

• -ble + -ity: -bility (e.g., sensibility).

• Speakers prefer to use -ity with novel words (Anshen & Aronoff, 1981).

• remortibility > remortibleness.

Page 130: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

-ivity v. -bility

−50

5{{

-ness

-ity

ive ive ive ive ive iveble ble ble ble ble ble

-ive-ble Full-Parsing

(MDPCFG)Full-Listing(Adaptor Grammars)

Exemplar(DOP1)

Exemplar(GDMN)

Inference(FrMAGment Grammars)

Predicted

Page 131: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

-ivity v. -bility

−50

5{{

-ness

-ity

ive ive ive ive ive iveble ble ble ble ble ble

-ive-ble Full-Parsing

(MDPCFG)Full-Listing(Adaptor Grammars)

Exemplar(DOP1)

Exemplar(GDMN)

Inference(FrMAGment Grammars)

Predicted

Preference for -ness

Page 132: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Full-Parsing(MDPCFG)

Full-Listing(Adaptor Grammars)

Exemplar(DOP1)

Exemplar(GDMN)

Inference(FrMAGment Grammars)

Predicted

-ivity v. -bility

−50

5{{

-ness

-ity

ive ive ive ive ive iveble ble ble ble ble ble

-ive-ble Full-Parsing

(MDPCFG)Full-Listing(Adaptor Grammars)

Exemplar(DOP1)

Exemplar(GDMN)

Inference(FrMAGment Grammars)

Preference for -ity

Page 133: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Full-Parsing(MDPCFG)

Full-Listing(Adaptor Grammars)

Exemplar(DOP1)

Exemplar(GDMN)

Inference(FrMAGment Grammars)

Predicted

-ivity v. -bility

−50

5{{

-ness

-ity

ive ive ive ive ive iveble ble ble ble ble ble

-ive-ble Full-Parsing

(MDPCFG)Full-Listing(Adaptor Grammars)

Exemplar(DOP1)

Exemplar(GDMN)

Inference(FrMAGment Grammars)

Preceding suffix -ive

Page 134: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Full-Parsing(MDPCFG)

Full-Listing(Adaptor Grammars)

Exemplar(DOP1)

Exemplar(GDMN)

Inference(FrMAGment Grammars)

Predicted

-ivity v. -bility

−50

5{{

-ness

-ity

ive ive ive ive ive iveble ble ble ble ble ble

-ive-ble Full-Parsing

(MDPCFG)Full-Listing(Adaptor Grammars)

Exemplar(DOP1)

Exemplar(GDMN)

Inference(FrMAGment Grammars)

Preceding suffix -ble

Page 135: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Full-Parsing(MDPCFG)

Full-Listing(Adaptor Grammars)

Exemplar(DOP1)

Exemplar(GDMN)

Inference(FrMAGment Grammars)

Predicted

Full-Parsing (Multinomial-Dirichlet Context-Free Grammar)

−50

5{{

-ness

-ity

ive ive ive ive ive iveble ble ble ble ble ble

-ive-ble Full-Listing

(Adaptor Grammars)Exemplar

(DOP1)Exemplar

(GDMN)Inference

(FrMAGment Grammars)

Page 136: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Full-Parsing(MDPCFG)

Full-Listing(Adaptor Grammars)

Exemplar(DOP1)

Exemplar(GDMN)

Inference(FrMAGment Grammars)

Predicted

−50

5{{

-ness

-ity

ive ive ive ive ive iveble ble ble ble ble ble

-ive-ble Exemplar

(DOP1)Exemplar

(GDMN)Inference

(FrMAGment Grammars)

Full-Listing (Adaptor Grammars)

Page 137: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Full-Parsing(MDPCFG)

Full-Listing(Adaptor Grammars)

Exemplar(DOP1)

Exemplar(GDMN)

Inference(FrMAGment Grammars)

Predicted Exemplar(GDMN)

Inference(FrMAGment Grammars)

−50

5{{

-ness

-ity

ive ive ive ive ive iveble ble ble ble ble ble

-ive-ble

Exemplar-Based (Data-Oriented Parsing 1)

Page 138: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Full-Parsing(MDPCFG)

Full-Listing(Adaptor Grammars)

Exemplar(DOP1)

Exemplar(GDMN)

Inference(FrMAGment Grammars)

Predicted

−50

5{{

-ness

-ity

ive ive ive ive ive iveble ble ble ble ble ble

-ive-ble

Exemplar-Based (Data-Oriented Parsing: Goodman Estimator)

Page 139: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

−50

5{{

-ness

-ity

ive ive ive ive ive iveble ble ble ble ble ble

-ive-ble Full-Parsing

(MDPCFG)Full-Listing(Adaptor Grammars)

Exemplar(DOP1)

Exemplar(GDMN)

Inference(Fragment Grammars)

Predicted

Inference-Based (Fragment Grammars)

Page 140: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Multi-way Competition

• Explains productivity and ordering generalization.

• Explains difficult cases of competition involving paradoxical suffix combinations.

Page 141: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Global Summary

• Inference based on distribution of tokens over types.

• Derives Baayen’s hapax-based theory.

• View the choice of whether to retrieve or compute as an inference.

• Derives elsewhere condition.

• Storage of arbitrary structures explains ordering generalizations.

• Explains Productivity and Ordering Generalization.

• Also accounts for paradoxical suffix combinations such as -ability

Page 142: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Conclusion

• Model the problem of deriving word forms using a mixture of computation and storage as a tradeoff using standard inferential tools.

• Automatically solves many problems of productivity and competition resolution.

Page 143: Productivity, Reuse, and Competition between Generalizationsidiom.ucsd.edu/~bakovic/competition/slides/O... · Case Studies on Productivity and Competition. The Framework: Three Ideas

Thanks!