Page 1: Pattern Learning

Pattern Learning for Web Information Extraction

Justin Betteridge

IR Series Seminar

May 23, 2008

Tom Mitchell, Andrew Carlson, Sue Ann Hong, Edith Law, Sophie Wang, Estevam Hruschka

Page 2: Pattern Learning

Outline

1. Web Information Extraction (WIE) 1.0

2. Contextual Patterns

a. Definition

b. Assessment

c. Usage

3. Comparing Pattern Assessment Methods

4. WIE 2.0

Page 3: Pattern Learning

WIE 1.0

The Web

Cheese

Cheddar

Swiss

Gouda

Page 4: Pattern Learning

WIE 1.0

The Web

Cheese

Cheddar

Swiss

Gouda

__ such as __

such __ as __

… cheeses, including __ …

Page 5: Pattern Learning

WIE 1.0

The Web

Cheese

Google

Cheddar

Swiss

Gouda

__ such as __

such __ as __

… cheeses, including __ …

Kraft manufactures a variety of cheeses, including Feta cheese.

Companies who market a variety of cheeses, including Kraft Foods, …

Page 8: Pattern Learning

WIE 1.0

The Web

Cheese

Google

Cheddar

Swiss

Gouda

__ such as __

such __ as __

… cheeses, including __ …

Kraft manufactures a variety of cheeses, including Feta cheese.

Companies who market a variety of cheeses, including Kraft Foods, …

Feta

Kraft Foods
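
As a rough illustration of the slide above, applying a contextual pattern can be sketched as a regular-expression match in which the blank becomes a capture group. The function name `apply_pattern` and the approximation of a candidate as a run of capitalized words are assumptions made for this sketch; the talk does not specify how noun phrases are found.

```python
import re

# Assumption: a candidate instance is approximated as a run of capitalized words.
CANDIDATE = r"((?:[A-Z][a-z]+)(?: [A-Z][a-z]+)*)"

def apply_pattern(pattern, sentence):
    """Return the phrases that fill the `__` slot of `pattern` in `sentence`."""
    # Escape the literal context, then swap the slot for the capture group.
    regex = re.escape(pattern).replace(re.escape("__"), CANDIDATE)
    return re.findall(regex, sentence)

sentences = [
    "Kraft manufactures a variety of cheeses, including Feta cheese.",
    "Companies who market a variety of cheeses, including Kraft Foods, ...",
]
pattern = "cheeses, including __"
extracted = [m for s in sentences for m in apply_pattern(pattern, s)]
print(extracted)
```

Both "Feta" and "Kraft Foods" fill the slot, although only one of them is a cheese, which is why extracted candidates still need to be assessed.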

Page 9: Pattern Learning

WIE 1.1

The Web

Cheese

Google

Cheddar

Swiss

Gouda

cheeses such as __

cheeses, including __

Page 10: Pattern Learning

WIE 1.1

The Web

Cheese

Google

A slice of cheddar cheese goes well with …

Cheddar

Swiss

Gouda

cheeses such as __

cheeses, including __

Page 12: Pattern Learning

WIE 1.1

The Web

Cheese

Google

A slice of cheddar cheese goes well with …

Cheddar

Swiss

Gouda

cheeses such as __

cheeses, including __

a slice of __ cheese

__ goes well with
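
One way to picture how new patterns come out of a sentence that mentions a known instance is to replace the instance with a slot and keep a few neighbouring words. This is a minimal sketch under assumptions: the function `candidate_patterns`, whitespace tokenization, and the three-word window are all hypothetical choices, not the method used in the talk.

```python
def candidate_patterns(sentence, instance, window=3):
    """Return context patterns around `instance`, with `__` marking the slot."""
    tokens = sentence.lower().split()
    target = instance.lower().split()
    n = len(target)
    patterns = set()
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] != target:
            continue
        left = tokens[max(0, i - window):i]
        right = tokens[i + n:i + n + window]
        if left and right:
            # Left context plus one word to the right, e.g. "a slice of __ cheese".
            patterns.add(" ".join(left + ["__"] + right[:1]))
        if right:
            # Right context only, e.g. "__ goes well with".
            patterns.add(" ".join(["__"] + right))
    return patterns

sentence = "A slice of cheddar cheese goes well with ..."
from_cheddar = candidate_patterns(sentence, "cheddar")
from_cheddar_cheese = candidate_patterns(sentence, "cheddar cheese")
```

In this sketch the slide's two patterns arise from two different slot fillers: "a slice of __ cheese" when the instance is taken to be "cheddar", and "__ goes well with" when it is taken to be "cheddar cheese".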

Page 13: Pattern Learning

Dual Iterative Approach

1. Given seeds (instances and/or patterns)

2. Using instances, extract & assess candidate patterns

3. Using patterns, extract & assess candidate instances

4. Repeat
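
The four steps above can be sketched as a small loop over an in-memory corpus. Everything specific here is an assumption made for illustration: the function `bootstrap`, the toy sentences, patterns limited to the two words before an instance, and assessment reduced to a minimum-support threshold on patterns (instances are accepted without assessment, which a real system would not do).

```python
import re

def bootstrap(corpus, seed_instances, rounds=2, min_support=2):
    """Alternate between learning patterns from instances and instances from patterns."""
    instances, patterns = set(seed_instances), set()
    for _ in range(rounds):
        # Step 2: take the two words before each known instance as a candidate
        # pattern and record which instances support it.
        support = {}
        for sentence in corpus:
            for inst in instances:
                m = re.search(r"(\w+ \w+) " + re.escape(inst) + r"\b", sentence)
                if m:
                    support.setdefault(m.group(1) + " __", set()).add(inst)
        # Keep only patterns seen with at least `min_support` distinct instances.
        patterns |= {p for p, insts in support.items() if len(insts) >= min_support}
        # Step 3: fill each pattern's slot with the following word in every sentence.
        for sentence in corpus:
            for p in patterns:
                prefix = re.escape(p[:-len(" __")])
                instances |= set(re.findall(prefix + r" (\w+)", sentence))
    return instances, patterns

corpus = [
    "we sell cheeses like cheddar at the market",
    "imported cheeses like gouda are popular",
    "try cheeses like brie with fruit",
]
instances, patterns = bootstrap(corpus, {"cheddar", "gouda"})
```

Starting from the seeds "cheddar" and "gouda", the loop learns the pattern "cheeses like __" and uses it to pick up "brie".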

Page 14: Pattern Learning

WIE 1.1

The Web

Cheese

Google

A slice of cheddar cheese goes well with …

Cheddar

Swiss

Gouda

cheeses such as __

cheeses, including __

a slice of __ cheese

__ goes well with

Page 15: Pattern Learning

Assessing Instances

PMI: cheeses such as __

PMI: a slice of __

Cheese?

NB classifier
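
A sketch of how PMI scores might feed a classifier, assuming "NB" stands for Naive Bayes. Each candidate is described by one binary feature per pattern (1 if its PMI with that pattern exceeds some threshold). The thresholding, the function names `train_nb` and `prob_is_member`, and the training examples are all hypothetical.

```python
import math

def train_nb(examples, n_features):
    """examples: list of (features, label); features is a tuple of 0/1 values."""
    counts = {True: [0] * n_features, False: [0] * n_features}
    totals = {True: 0, False: 0}
    for features, label in examples:
        totals[label] += 1
        for i, f in enumerate(features):
            counts[label][i] += f
    # Laplace smoothing keeps every probability strictly between 0 and 1.
    return {
        label: {
            "prior": (totals[label] + 1) / (len(examples) + 2),
            "p": [(c + 1) / (totals[label] + 2) for c in counts[label]],
        }
        for label in (True, False)
    }

def prob_is_member(model, features):
    """Posterior probability of the positive class, treating features as independent."""
    log_scores = {}
    for label, params in model.items():
        s = math.log(params["prior"])
        for f, p in zip(features, params["p"]):
            s += math.log(p if f else 1 - p)
        log_scores[label] = s
    top = max(log_scores.values())
    weights = {k: math.exp(v - top) for k, v in log_scores.items()}
    return weights[True] / (weights[True] + weights[False])

# Features: (high PMI with "cheeses such as __", high PMI with "a slice of __").
examples = [((1, 1), True), ((1, 0), True), ((1, 1), True),
            ((0, 0), False), ((0, 1), False), ((0, 0), False)]
model = train_nb(examples, n_features=2)
```

With these made-up examples, a candidate that scores high on both patterns gets a posterior above 0.5, and one that scores high on neither gets a posterior below 0.5.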

Page 16: Pattern Learning

Point-wise Mutual Information

score(choice) = log2(p(problem & choice) / (p(problem)p(choice)))

Turney, 2001
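
The formula can be evaluated directly once the probabilities are estimated from counts. A minimal sketch, assuming the counts are search-engine hit counts and that all of them are nonzero; the function name `pmi` and the numbers are hypothetical.

```python
import math

def pmi(hits_both, hits_problem, hits_choice, total):
    """log2( p(problem & choice) / (p(problem) * p(choice)) ), estimated from counts."""
    p_both = hits_both / total
    p_problem = hits_problem / total
    p_choice = hits_choice / total
    return math.log2(p_both / (p_problem * p_choice))

# Hypothetical hit counts: the two terms co-occur 250 times more often
# than they would if they were independent.
score = pmi(hits_both=50, hits_problem=1000, hits_choice=200, total=1_000_000)
```

A score of 0 means the two terms co-occur exactly as often as independence predicts; positive scores mean they attract each other.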

Page 17: Pattern Learning

Patterns: FAQ

What does one look like?

How can I tell which ones are good?

What can I use them for?

Page 18: Pattern Learning

Patterns Defined

mayo and a lot of cheddar cheese can make a sandwich yummy

Page 19: Pattern Learning

Patterns Defined

mayo and a lot of cheddar cheese can make a sandwich yummy

... cheddar cheese soaked in brine

dairy products like cheddar cheese …

NP Prep

VP Prep NP

Page 20: Pattern Learning

Patterns Assessed

Cheese

Cheddar

Swiss

Gouda

cheeses such as __

cheeses, including __

a slice of __ cheese

__ goes well with

Page 21: Pattern Learning

Patterns Assessed

Cheese

Cheddar

Swiss

Gouda

cheeses such as __

cheeses, including __

a slice of __ cheese

__ goes well with

H1: Must co-occur with at least 2 seed instances

H2: Sort by estimated precision

H1: Must co-occur with at least 2 seed instances

H2: Must not co-occur with any negative seed instances

H3: Must contain a trigger word

Page 22: Pattern Learning

Patterns Assessed

Cheese

Cheddar

Swiss

Gouda

cheeses such as __

cheeses, including __

a slice of __ cheese

__ goes well with

H1: Must co-occur with at least 2 seed instances

H2: Sort by estimated precision

H1: Must co-occur with at least 2 seed instances

H2: Must not co-occur with any negative seed instances

H3: Must contain a trigger word

Etzioni et al, 2004

Talukdar et al, 2006
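
The heuristics on this slide can be read as filters and a ranking over candidate patterns. The sketch below combines the seed-count rule, the negative-seed rule, and the precision ordering in one function purely for illustration; the slide lists them as two separate sets. The function `select_patterns`, the co-occurrence data, and the estimate of precision as the share of slot fillers that are known seeds are assumptions. The trigger-word rule (H3) is left out.

```python
def select_patterns(cooccurrence, positive_seeds, negative_seeds, min_seeds=2):
    """Filter patterns by seed co-occurrence, then rank by estimated precision."""
    ranked = []
    for pattern, fillers in cooccurrence.items():
        pos = fillers & positive_seeds
        neg = fillers & negative_seeds
        if len(pos) < min_seeds:
            continue  # H1: must co-occur with at least 2 seed instances
        if neg:
            continue  # must not co-occur with any negative seed instance
        precision = len(pos) / len(fillers)  # assumed estimate of precision
        ranked.append((precision, pattern))
    # Highest estimated precision first.
    return [p for _, p in sorted(ranked, reverse=True)]

# Hypothetical slot fillers observed for each candidate pattern.
cooccurrence = {
    "cheeses such as __": {"cheddar", "swiss", "gouda"},
    "a slice of __ cheese": {"cheddar", "swiss", "provolone", "american"},
    "__ goes well with": {"cheddar", "gouda", "wine"},
    "imported from __": {"gouda", "france"},
}
selected = select_patterns(
    cooccurrence,
    positive_seeds={"cheddar", "swiss", "gouda"},
    negative_seeds={"wine"},
)
```

Here "__ goes well with" is dropped because it matches the negative seed "wine", and "imported from __" is dropped because it matches only one seed.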

Page 23: Pattern Learning

Patterns Assessed

PMI: cheddar

PMI: swiss

…

Cheese?

NB classifier

Page 24: Pattern Learning

Comparing Assessment Methods

Precision = fraction of co-occurring NPs that are instances of the category

Recall = fraction of category instances that co-occur with the pattern

Precision is more important (F0.5)
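
These definitions, together with the standard F-beta formula at beta = 0.5, can be computed as follows. A minimal sketch: the function name `pattern_scores` and the example sets are hypothetical.

```python
def pattern_scores(cooccurring_nps, category_instances, beta=0.5):
    """Return (precision, recall, F_beta) for one pattern."""
    hits = cooccurring_nps & category_instances
    precision = len(hits) / len(cooccurring_nps)
    recall = len(hits) / len(category_instances)
    if precision + recall == 0:
        return precision, recall, 0.0
    # beta < 1 weights precision more heavily than recall.
    f_beta = (1 + beta**2) * precision * recall / (beta**2 * precision + recall)
    return precision, recall, f_beta

# Hypothetical data: NPs seen in the pattern's slot vs. the known cheeses.
precision, recall, f_half = pattern_scores(
    {"cheddar", "swiss", "feta", "kraft foods"},
    {"cheddar", "swiss", "gouda", "feta", "brie"},
)
```

With three of the four co-occurring NPs being cheeses and three of the five cheeses covered, precision is 0.75, recall is 0.6, and F0.5 is 5/7, which sits closer to precision than to recall.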

Page 25: Pattern Learning

WIE 2.0

Category Discovery

Relation Discovery

from patterns!!

Continuous Learning

