
1

Learning Entity Specific Models

Stefan Niculescu

Carnegie Mellon University

November, 2003

2

Outline

► Introduction

• Learning Entity Specific Models from complete data

• Inference in Entity Specific models

• EM for learning Entity Specific models from incomplete data

• Learning in the presence of simple Domain Knowledge

• Summary / Future Work

3

Entities and Examples

• Today, huge databases track the evolution of users, companies, or other entities/objects across time

• Multiple examples are collected per entity:
  – Hospitals (Entities) treat many patients (Examples)
  – Users (Entities) are observed when handling various Emails (Examples)
  – In fMRI experiments, a Subject (Entity) is observed across a few tens of Trials (Examples)

4

Entity Specific and General Knowledge

• Each Entity has its own particularities

• A different number of attributes may be available for each entity
  – Different Hospitals may perform different tests on their patients to diagnose a given disease
  – A certain User may have some software installed while others may not

• Even two attributes that are present in two entities may relate in completely different ways

• However, there are things that are common across entities
  – Treatments for a disease are the same across hospitals

5

Goals and Approaches

• GOAL: Make inference about new examples
  – From available entities
  – From a new entity

• TWO EXTREME APPROACHES:

  – Learn a General Model by combining all the examples available
    • May have different attributes for different entities
    • Entity Specific params will be an “Average” across all Entities
      – BAD for making inference about existing Entities

  – Learn a separate Model for each Entity
    • May not have enough data to learn some dependencies accurately
    • Cannot be used when a new example comes from a previously unseen Entity

6

Outline

• Introduction

► Learning Entity Specific Models from complete data

• Inference in Entity Specific models

• EM for learning Entity Specific models from incomplete data

• Learning in the presence of simple Domain Knowledge

• Summary / Future Work

7

Entity Specific Models

• Proposed approach: build a model that is somewhere between the two extremes
  – Takes advantage of multiple Entities to better learn General Knowledge
  – Also adapts itself to whatever is specific to each Entity

• Bayes Nets will be adapted to deal with the two issues

8

Assumptions

• Examples are independent given the parameters of the distribution

• There is no uncertainty about the entity
  – This is the case in our studies

• Data is fully observable (no missing values)

• Entities have the same sets of attributes

• Model – Bayes Net
  – Structure of the Bayes Net is the same for all entities
  – Parameters of the Bayes Net may vary from entity to entity

• It is known which parameters are General and which are Entity Specific

9

Notations
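
A minimal notation sketch; the symbols below are assumptions used in the illustrative formulas on later slides and may differ from the notation on the original slide:

• $X_i$ – the $i$-th node (attribute) of the Bayes Net; $j$ indexes a configuration of its parents and $k$ a value of $X_i$
• $e \in \{1, \dots, L\}$ – an entity, each with its own set of examples
• $N^e_{ijk}$ – the number of examples from entity $e$ in which $X_i = k$ and its parents are in configuration $j$
• $\theta^e_{ijk} = P(X_i = k \mid \mathrm{Pa}_i = j, \text{entity } e)$ – the CPT entry for entity $e$
• $G_{ijk}$ – a General parameter, shared by all entities ($\theta^e_{ijk} = G_{ijk}$ for every $e$)
• $E^e_{ijk}$ – an Entity Specific parameter, estimated separately for entity $e$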

10

Conditional Probability Tables

[Figure: side-by-side CPTs for Entity e1 and Entity e2 – General parameters (Gijk) are shared by both tables, while Entity Specific parameters (Eilk) differ between them]

11

Maximum Data Likelihood setting

• Find the hypothesis that maximizes the likelihood of the data
  – Assumes that all hypotheses have equal priors

• Constraints: for all entities, each column of their CP tables must sum to 1 (see the sketch below)
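
Under the notation assumed on the Notations slide, a minimal sketch of the resulting constrained problem (the exact form on the original slide may differ):

\[ \max_{\theta} \; \sum_{e} \sum_{i,j,k} N^e_{ijk} \log \theta^e_{ijk} \quad \text{subject to} \quad \sum_{k} \theta^e_{ijk} = 1 \ \text{ for every entity } e \text{ and column } (i,j), \]

where $\theta^e_{ijk} = G_{ijk}$ if the parameter is General and $\theta^e_{ijk} = E^e_{ijk}$ if it is Entity Specific.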

12

Optimizing using Lagrange Multipliers

• Can be split into a set of independent optimization problems:

• Apply Lagrange Multipliers Theory:

• The solution for P_ik is among the solutions of the resulting stationarity conditions (sketched below)
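
A sketch of the Lagrangian and its stationarity conditions, under the assumed notation (a General parameter appears in one term per entity, which is what couples the columns of different entities):

\[ \Lambda = \sum_{e,i,j,k} N^e_{ijk} \log \theta^e_{ijk} + \sum_{e,i,j} \lambda^e_{ij} \Big( 1 - \sum_k \theta^e_{ijk} \Big) \]

\[ \frac{\partial \Lambda}{\partial E^e_{ijk}} = \frac{N^e_{ijk}}{E^e_{ijk}} - \lambda^e_{ij} = 0, \qquad \frac{\partial \Lambda}{\partial G_{ijk}} = \sum_e \frac{N^e_{ijk}}{G_{ijk}} - \sum_e \lambda^e_{ij} = 0. \]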

13

Optimizing using Lagrange Multipliers

• Sanity Check: If all parameters are general (no entity specific params), then this is equivalent to a normal Bayes Net

• Sanity Check: If all parameters are entity specific, first fraction cancels and we have a collection of independent Bayes Nets
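
The two sanity checks above correspond to the familiar closed forms; a sketch under the assumed notation:

\[ \text{all General:} \quad \hat{G}_{ijk} = \frac{\sum_e N^e_{ijk}}{\sum_e \sum_{k'} N^e_{ijk'}} \qquad \text{(pooled counts, a standard Bayes Net)}, \]

\[ \text{all Entity Specific:} \quad \hat{E}^e_{ijk} = \frac{N^e_{ijk}}{\sum_{k'} N^e_{ijk'}} \qquad \text{(per-entity counts, independent Bayes Nets)}. \]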

14

Outline

• Introduction

• Learning Entity Specific Models from complete data

► Inference in Entity Specific models

• EM for learning Entity Specific models from incomplete data

• Learning in the presence of simple Domain Knowledge

• Summary / Future Work

15

Inference in Entity Specific models

TWO CASES:

1. For a new (partial) example coming from a previously SEEN Entity:
   • Build the Bayes Network corresponding to that Entity, then use any existing BN inference algorithm

2. For a new (partial) example coming from a previously UNSEEN Entity:
   • Average / weighted-average the Entity Specific parameters corresponding to all previously seen entities into a General BN (see the sketch below)

   OR

   • Train a General BN based on all seen entities – gives some prior over all parameters
   • Apply the General BN to make inference about the new example
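
A sketch of the averaging option for a previously unseen entity, assuming $M_e$ denotes the number of examples observed for entity $e$ (the weighting scheme is an illustrative choice, not specified on the slide):

\[ \hat{\theta}_{ijk} = \frac{1}{L} \sum_{e=1}^{L} E^e_{ijk} \qquad \text{or} \qquad \hat{\theta}_{ijk} = \frac{\sum_e M_e \, E^e_{ijk}}{\sum_e M_e}, \]

with the General parameters $G_{ijk}$ carried over unchanged.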

16

Outline

• Introduction

• Learning Entity Specific Models from complete data

• Inference in Entity Specific models

► EM for learning Entity Specific models from incomplete data

• Learning in the presence of simple Domain Knowledge

• Summary / Future Work

17

EM for maximizing data likelihood

18

EM for learning Entity Specific models from incomplete data

• E STEP:
  – Compute expected counts under the current estimated parameters
  – Because of the incomplete data, the counts are random variables

• M STEP: Re-estimate the model parameters
  – Maximize the likelihood given the expected counts (both steps are sketched below)
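
A sketch of the two steps under the assumed notation, where $\theta^{(t)}$ are the current parameter estimates and $\mathcal{D}_e$ is the set of (partially observed) examples from entity $e$:

\[ \text{E step:} \quad \mathbb{E}\big[N^e_{ijk}\big] = \sum_{d \in \mathcal{D}_e} P\big(X_i = k, \mathrm{Pa}_i = j \mid \mathbf{x}^{\text{obs}}_d, \theta^{(t)}\big) \]

M step: plug the expected counts $\mathbb{E}[N^e_{ijk}]$ into the complete-data estimators in place of $N^e_{ijk}$ (pooled across entities for General parameters, per entity for Entity Specific ones), and iterate until convergence.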

19

Outline

• Introduction

• Learning Entity Specific Models from complete data

• Inference in Entity Specific models

• EM for learning Entity Specific models from incomplete data

► Learning in the presence of simple Domain Knowledge

• Summary / Future Work

20

Conditional Probability Tables

[Figure: CPT for Entity e, with entries marked according to the available Domain Knowledge]

• Given (treated as a known constant)

• Not given => Estimate!

• Given for Entity e
  – May be unknown for other entities

• Not given for Entity e => Estimate!
  – May be given for other entities
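
A sketch of how the given entries enter the optimization, assuming $C^e_{ij}$ denotes the set of values $k$ whose parameters in column $(i,j)$ are given as constants $c^e_{ijk}$ for entity $e$ (this set notation is an illustrative assumption):

\[ \sum_{k \notin C^e_{ij}} \theta^e_{ijk} = 1 - \sum_{k \in C^e_{ij}} c^e_{ijk}, \]

so only the parameters with $k \notin C^e_{ij}$ remain as unknowns, and the given constants shrink the probability mass available to each column.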

21

Maximum Data Likelihood setting

• Find the hypothesis that maximizes the likelihood of the data
  – Assumes that all hypotheses have equal priors

• Constraints: for all entities, each column of their CP tables must sum to 1

22

Optimizing using Lagrange Multipliers

• Can be split into a set of independent optimization problems:

• Apply Lagrange Multipliers Theory:

• The solution for P_ik is among the solutions of the resulting stationarity conditions

23

Optimizing using Lagrange Multipliers

THEOREM: The ML estimators exist and they are unique. In addition, they can be accurately approximated by a bisection method.

Proof (Sketch):

• Given parameters are treated as constants

  – Do not differentiate with respect to them!

• Differentiating with respect to the unknown parameters, we obtain the stationarity conditions sketched below
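
A sketch of the resulting conditions, under the assumed notation: the given constants appear in no derivative, only in the modified column constraints, and the unknown parameters satisfy

\[ \frac{N^e_{ijk}}{E^e_{ijk}} = \lambda^e_{ij} \quad (k \notin C^e_{ij}), \qquad \sum_e \frac{N^e_{ijk}}{G_{ijk}} = \sum_e \lambda^e_{ij} \quad \text{(for unknown General parameters)}. \]

Eliminating the multipliers and substituting into the constraints leads to the scalar equation $U(A) = 0$ described on the next slides.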

24

Proof sketch

25

Proof sketch

• Substituting in the constraints, we easily obtain that A is the solution of a polynomial equation: U(A) = 0

• U is strictly increasing on the domain of admissible values

• U takes both negative and positive values

• Therefore A exists, is unique, and can be determined by a bisection method (a sketch follows below)

• Once A is known, it is trivial to find the B_e values by substituting A in the constraints

• OBSERVATION: There is no closed form for the ML estimators, but they can be approximated arbitrarily closely!
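
Since U is strictly increasing and changes sign on its admissible interval, the root A can be bracketed and found by standard bisection. A minimal sketch in Python; U, lo, and hi are placeholders for the actual polynomial and its admissible interval, which depend on the counts and the given constants:

def bisect_root(U, lo, hi, tol=1e-10, max_iter=200):
    """Find the unique root of a strictly increasing function U on [lo, hi].

    Assumes U(lo) < 0 < U(hi), as guaranteed above: U is strictly increasing
    and takes both negative and positive values on the admissible domain.
    """
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if U(mid) < 0:
            lo = mid      # root lies in the upper half of the bracket
        else:
            hi = mid      # root lies in the lower half of the bracket
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

Example usage with a toy increasing polynomial: bisect_root(lambda a: a**3 + a - 2.0, 0.0, 2.0) returns approximately 1.0.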

26

Outline

• Introduction

• Learning Entity Specific Models from complete data

• Inference in Entity Specific models

• EM for learning Entity Specific models from incomplete data

• Learning in the presence of simple Domain Knowledge

► Summary / Future Work

27

Summary

• Derived ML estimators for Entity Specific Bayes Nets

• Modified EM to deal with learning in the presence of multiple entities

• Showed how simple Domain Knowledge can be incorporated into the learning algorithm

28

Future Work

• Test the Entity Specific Bayes Net model on artificially generated data

• Incorporate uncertainty about the entity in the model

• Modify the model to allow different network topologies for different entities

• Improve the representational power of the domain knowledge that can be incorporated in learning – perhaps probabilistic rules