Data Mining-Knowledge Presentation 2 Prof. Sin-Min Lee

Post on 01-Jan-2016


Overview

• Association rules are useful in that they suggest hypotheses for future research.

• Association rules integrated into the generic actual argument model can assist in identifying the most plausible claim from given data items in a forward inference way, or the likelihood of missing data values in a backward inference way.

What is data mining? What is knowledge discovery from databases (KDD)?

Knowledge discovery in databases (KDD) is the "non-trivial extraction of implicit, previously unknown, and potentially useful information from data".

KDD encompasses a number of different technical approaches, such as clustering, data summarization, learning classification rules, finding dependency networks, analyzing changes, and detecting anomalies.

KDD has emerged only recently because only recently have we been gathering vast quantities of data.

Examples of KDD studies

• Mangasarian et al (1997): breast cancer diagnosis. A sample from a breast lump mass is assessed by:
  – mammography (not sensitive, 68%-79%)
  – data mining from FNA test results and visual inspection (65%-98%)
  – surgery (100%, but invasive and expensive)

• Basket analysis: people who buy nappies also buy beer.

• NBA (National Basketball Association of America): player pattern profile, Bhandary et al (1997).

• Credit card fraud detection.

• Stranieri/Zeleznikow (1997) predict family law property outcomes.

• Rissland and Friedman (1997) discover a change in the concept of ‘good faith’ in US bankruptcy cases.

• Pannu (1995) discovers a prototypical case from a library of cases.

• Wilkins and Pillaipakkamnatt (1997) predict the time a case takes to be heard.

• Veliev et al (1999): association rules for economic analysis.

Overview of the process of knowledge discovery in databases

[Figure: the KDD process. Raw data -> Select -> Target data -> Preprocess -> Pre-processed data -> Transform -> Transformed data -> Data mining -> Patterns -> Interpret -> Knowledge. From Fayyad, Piatetsky-Shapiro, Smyth (1996).]

Phase 4. Data mining

Finding patterns in data or fitting models to data

Categories of techniques

Predictive (classification: neural networks, rule induction, linear, multiple regression)

Segmentation (clustering, k-means, k-median)

Summarisation (associations, visualisation)

Change detection/modelling

What Is Association Mining?

• Association rule mining:
  – Finding frequent patterns, associations, correlations, or causal structures among sets of items or objects in transaction databases, relational databases, and other information repositories.

• Applications:
  – Basket data analysis, cross-marketing, catalog design, loss-leader analysis, clustering, classification, etc.

• Examples:
  – Rule form: “Body → Head [support, confidence]”.
  – buys(x, “diapers”) → buys(x, “beers”) [0.5%, 60%]
  – major(x, “CS”) ^ takes(x, “DB”) → grade(x, “A”) [1%, 75%]

More examples

  – age(X, “20..29”) ^ income(X, “20..29K”) → buys(X, “PC”) [support = 2%, confidence = 60%]
  – contains(T, “computer”) → contains(T, “software”) [1%, 75%]

Association rules are a data mining technique

• An association rule tells us something about the association between two attributes.

• Agrawal et al (1993) introduced association rule mining; the widely used Apriori algorithm followed in Agrawal and Srikant (1994).

• A famous (but unsubstantiated) association rule from a hypothetical supermarket transaction database is "if nappies then beer (80%)". Read this as: when nappies are bought, beer is also bought 80% of the time.

• Association rules have only recently been applied to law, with promising results.

• Association rules can automatically discover rules that may prompt an analyst to think of hypotheses they would not otherwise have considered.

Rule Measures: Support and Confidence

• Find all the rules X & Y → Z with minimum confidence and support
  – support, s: probability that a transaction contains {X, Y, Z}
  – confidence, c: conditional probability that a transaction having {X, Y} also contains Z

Transaction ID   Items Bought
2000             A, B, C
1000             A, C
4000             A, D
5000             B, E, F

With minimum support 50% and minimum confidence 50%, we have
  – A → C (50%, 66.6%)
  – C → A (50%, 100%)

[Figure: Venn diagram of customers who buy diapers, customers who buy beer, and customers who buy both.]

Support and confidence are two independent notions.

Mining Association Rules—An Example

Min. support 50%, min. confidence 50%

Transaction ID   Items Bought
2000             A, B, C
1000             A, C
4000             A, D
5000             B, E, F

Frequent Itemset   Support
{A}                75%
{B}                50%
{C}                50%
{A, C}             50%

For rule A → C:
  support = support({A, C}) = 50%
  confidence = support({A, C}) / support({A}) = 66.6%
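The support and confidence figures for this four-transaction example can be checked with a short sketch. The function names `support` and `confidence` are illustrative choices, not part of the slides; exact fractions are used to avoid rounding questions.

```python
from fractions import Fraction

# Transaction database from the example (transaction ID -> items bought).
transactions = {
    2000: {"A", "B", "C"},
    1000: {"A", "C"},
    4000: {"A", "D"},
    5000: {"B", "E", "F"},
}

def support(itemset, db):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for items in db.values() if itemset <= items)
    return Fraction(hits, len(db))

def confidence(body, head, db):
    """support(body and head together) divided by support(body)."""
    return support(set(body) | set(head), db) / support(body, db)

print(float(support({"A", "C"}, transactions)))       # 0.5
print(float(confidence({"A"}, {"C"}, transactions)))  # about 0.667
print(float(confidence({"C"}, {"A"}, transactions)))  # 1.0
```

The two rules share the same support but have different confidence, because the denominator is the support of the rule body only.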

Two Step Association Rule Mining

Step 1: Frequent itemset generation – use Support

Step 2: Rule generation – use Confidence

{milk, bread} is a frequent itemset.

Folks buying milk also buy bread.

Is it also true that “folks buying bread also buy milk”?

Confidence and support of an association rule

• 80% is the confidence of the rule "if nappies then beer (80%)". This is calculated as n2/n1, where:
  – n1 = number of records where nappies were bought
  – n2 = number of records where nappies were bought and beer was also bought.

• If there are 1000 transactions for nappies and, of those, 800 also had beer, then the confidence is 80%.

• A rule may have a high confidence but not be interesting because it doesn’t apply to many records in the database. Support measures this: number of records where nappies were bought with beer / total records.

• Rules that may be interesting have a confidence level and a support level above a user-set threshold.
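As a minimal sketch of the n2/n1 calculation and the threshold test: the slides give 1000 nappy transactions of which 800 include beer, but no database size, so the total of 100,000 records below is a made-up figure for illustration only, as are the function names and threshold values.

```python
def rule_metrics(n_body, n_both, n_total):
    """Return (confidence, support) for a rule 'if body then head'.

    n_body  = records where the body (nappies) was bought      (n1)
    n_both  = records where body and head (beer) were bought   (n2)
    n_total = all records in the database (assumed, see above)
    """
    confidence = n_both / n_body
    support = n_both / n_total
    return confidence, support

def is_interesting(confidence, support, min_conf, min_sup):
    """A rule is a candidate for attention only if both thresholds are met."""
    return confidence >= min_conf and support >= min_sup

# 1000 and 800 come from the slide; 100_000 is a hypothetical database size.
conf, sup = rule_metrics(n_body=1000, n_both=800, n_total=100_000)
print(conf, sup)
print(is_interesting(conf, sup, min_conf=0.5, min_sup=0.01))
```

With these assumed numbers the rule has high confidence but fails a 1% support threshold, which is the situation the bullet above describes.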


Association rule screen shot with A-Miner from Split Up data set

• In 73.4% of cases where the wife's needs are some to high, the husband's future needs are few to some.

• This prompts an analyst to posit plausible hypotheses, e.g. the rule may reflect the fact that more women than men remain custodial parents of the children following divorce. The women who have some to high needs may do so because of their obligation to children.

Mining Frequent Itemsets: the Key Step

• Find the frequent itemsets: the sets of items that have minimum support
  – A subset of a frequent itemset must also be a frequent itemset (the Apriori principle)
    • i.e., if {AB} is a frequent itemset, both {A} and {B} must be frequent itemsets
  – Iteratively find frequent itemsets with cardinality from 1 to k (k-itemsets)

• Use the frequent itemsets to generate association rules.

The Apriori Algorithm

• Join Step: Ck is generated by joining Lk-1 with itself

• Prune Step: Any (k-1)-itemset that is not frequent cannot be a subset of a frequent k-itemset

• Pseudo-code:
    Ck: candidate itemsets of size k
    Lk: frequent itemsets of size k

    L1 = {frequent items};
    for (k = 1; Lk != ∅; k++) do begin
        Ck+1 = candidates generated from Lk;
        for each transaction t in database do
            increment the count of all candidates in Ck+1 that are contained in t
        Lk+1 = candidates in Ck+1 with min_support
    end
    return ∪k Lk;
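The pseudo-code can be sketched in Python as follows. This is one possible rendering under stated assumptions, not the original implementation: the function name `apriori`, the use of an absolute support count (`min_count`) rather than a percentage, and the simple "union of two frequent k-itemsets" join are all choices made for this illustration.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {itemset (frozenset): support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # L1: count single items and keep those meeting the minimum count.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: n for s, n in counts.items() if n >= min_count}

    frequent = dict(current)
    k = 1
    while current:
        # Join step: unions of two frequent k-itemsets that have k+1 items.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k + 1}
        # Prune step: every k-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k))}
        # Scan the database to count each surviving candidate.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: n for s, n in counts.items() if n >= min_count}
        frequent.update(current)
        k += 1
    return frequent

# Small database with four transactions, minimum support count 2.
D = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]
for itemset, count in sorted(apriori(D, 2).items(),
                             key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The loop stops as soon as a level yields no frequent itemsets, which mirrors the `Lk != ∅` condition in the pseudo-code.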

Association rules in law

• Association rule generators are typically packaged with very expensive data mining suites. We developed A-Miner (available from the authors) for a PC platform.

• Typically, too many association rules are generated for feasible analysis. So, our current research involves exploring metrics of interestingness to restrict the number of rules to those that might be interesting.

• In general, structured data is not collected in law as it is in other domains, so very large databases are rare.

• Our current research involves 380,000 records from a Legal Aid organization database that contains data on client features.

• ArgumentDeveloper is a shell that can be used by judges to structure their reasoning in a way that will facilitate data collection and reasoning.

The Apriori Algorithm — Example

Support = 2 (minimum support count)

Database D

TID   Items
100   1 3 4
200   2 3 5
300   1 2 3 5
400   2 5

Scan D -> C1

itemset   sup.
{1}       2
{2}       3
{3}       3
{4}       1
{5}       3

L1

itemset   sup.
{1}       2
{2}       3
{3}       3
{5}       3

C2

itemset
{1 2}
{1 3}
{1 5}
{2 3}
{2 5}
{3 5}

Scan D -> C2 with counts

itemset   sup
{1 2}     1
{1 3}     2
{1 5}     1
{2 3}     2
{2 5}     3
{3 5}     2

L2

itemset   sup
{1 3}     2
{2 3}     2
{2 5}     3
{3 5}     2

Join Operation — Example

L2

itemset   sup
{1 3}     2
{2 3}     2
{2 5}     3
{3 5}     2

L2 join L2

L2      L2      join
{1 3}   {1 3}   null
{1 3}   {2 3}   {1 2 3}
{1 3}   {2 5}   null
{1 3}   {3 5}   {1 3 5}
{2 3}   {2 3}   null
{2 3}   {2 5}   {2 3 5}
{2 3}   {3 5}   {2 3 5}
{2 5}   {2 5}   null
{2 5}   {3 5}   {2 3 5}

Infrequent subset: {1 2 3} contains {1 2} and {1 3 5} contains {1 5}, neither of which is in L2.

C3

itemset
{2 3 5}

Scan D -> L3

itemset   sup
{2 3 5}   2

Anti-Monotone Property

If a set cannot pass a test, all of its supersets will fail the same test as well.

If {2 3} does not have minimum support, neither will {1 2 3}, {2 3 5}, {1 2 3 5}, etc.

If {2 3} occurs only 5 times, can {2 3 5} occur 8 times? No: every transaction that contains {2 3 5} also contains {2 3}.

How to Generate Candidates?

• Suppose the items in Lk-1 are listed in an order

• Step 1: self-joining Lk-1

    insert into Ck
    select p.item1, p.item2, …, p.itemk-1, q.itemk-1
    from Lk-1 p, Lk-1 q
    where p.item1=q.item1, …, p.itemk-2=q.itemk-2, p.itemk-1 < q.itemk-1

• Step 2: pruning

    forall itemsets c in Ck do
        forall (k-1)-subsets s of c do
            if (s is not in Lk-1) then delete c from Ck

Example of Generating Candidates

• L3={abc, abd, acd, ace, bcd}

• Self-joining: L3*L3

– abcd from abc and abd

– acde from acd and ace

• Pruning:

– acde is removed because ade is not in L3

• C4={abcd}
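The self-join and pruning steps can be sketched in Python and checked against the L3 example. The function name `generate_candidates` and the representation of itemsets as sorted tuples are assumptions made for this illustration.

```python
from itertools import combinations

def generate_candidates(prev_frequent):
    """Build C_k from L_(k-1), given as a set of sorted tuples of equal length."""
    prev = sorted(prev_frequent)
    if not prev:
        return set()
    size = len(prev[0])  # this is k-1

    # Step 1: self-join. Two itemsets join when they agree on their first
    # k-2 items and the last item of p sorts before the last item of q.
    joined = set()
    for p in prev:
        for q in prev:
            if p[:-1] == q[:-1] and p[-1] < q[-1]:
                joined.add(p + (q[-1],))

    # Step 2: prune. Drop any candidate with a (k-1)-subset not in L_(k-1).
    return {c for c in joined
            if all(s in prev_frequent for s in combinations(c, size))}

L3 = {tuple(s) for s in ["abc", "abd", "acd", "ace", "bcd"]}
C4 = generate_candidates(L3)
print(["".join(c) for c in sorted(C4)])  # ['abcd']
```

Here `acde` is produced by the join of `acd` and `ace` but is pruned because `ade` is not in L3.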

Problem of generate-&-test heuristic

Association rules can be used for forward and backward inferences in the generic/actual argument model for sentencing armed robbery

[Figure: generic/actual argument model for sentencing armed robbery. Nodes include: severity of prior convictions constellation; serious offender status; degree of remorse displayed by offender; moral culpability of offender; seriousness of the offence relative to other armed robberies; seriousness of armed robbery as an offence relative to other offences; co-operation; offender's age; offender's plea; offender's health; and the extent to which retribution, rehabilitation, community protection, specific deterrence and general deterrence are appropriate purposes. Penalty values: imprisonment, hospital security order, intensive correction order, suspended sentence, youth training centre detention, combined custody and treatment order, community based order, fine, adjournment on conditions, discharge offender, dismiss offence, defer sentence.]

Generic/actual argument model for sentencing armed robbery

[Figure: expanded generic/actual argument model for sentencing armed robbery (diagram dated 19 May, 2001). In addition to the top-level nodes (severity of prior convictions constellation, serious offender status, degree of remorse, moral culpability, seriousness of the offence, co-operation, offender's age, plea, health, sentencing purposes, penalty), it shows lower-level nodes including: prior offence name, type, date, sentence and jurisdiction; serious offender status at time; police interview; degree of assistance offered to police by the offender; remarks to police; apology offered; restitution made; psychiatric illness; gambling; drug dependence; personal crisis; cultural adjustment; intellectual disability; personal background; impact of the crime on victims; extent to which the offender assisted the victim; impact of the crime on the community; degree of violence; degree of planning; value of property stolen; duration of offence; assistance to Crown; co-offender's penalty; reasons to depart from parity with co-offender penalty.]

Forward inference: confidence

[Figure: generic/actual argument model for sentencing armed robbery. The severity of prior convictions constellation node takes the values: extremely serious pattern of priors, very serious pattern of priors, serious pattern of priors, not so serious pattern of priors, no prior convictions.]

• In the sentence actual argument database the following outcomes were noted for the inputs suggested:

57%, 0.1%, 0%, 12%, 2%, 10%, 16%, 0%, 0%, 0%

Penalty outcomes: Imprisonment, Hospital security order, Intensive correction order, Suspended sentence, Youth training centre detention, Combined custody and treatment order, Community based order, Fine, Adjournment on conditions, Discharge offender, Dismiss offence, Defer sentence

If extremely serious pattern of priors then imprisonment
If very serious pattern of priors then imprisonment
If serious pattern of priors then imprisonment
If not so serious pattern of priors then imprisonment
If no prior convictions then imprisonment

Backward inference: constructing the strongest argument

If all the items you suggest AND

90%   2%
75%   7%
68%   17%
78%   17%
2%    3%

Conclusion

• Data mining, or knowledge discovery from databases, has not been appropriately exploited in law to date.

• Association rules are useful in that they suggest hypotheses for future research.

• Association rules integrated into the generic actual argument model can assist in identifying the most plausible claim from given data items in a forward inference way, or the likelihood of missing data values in a backward inference way.

Generating Association Rules

• For each nonempty proper subset s of l, output the rule

    s => (l - s)

  if support_count(l) / support_count(s) >= min_conf, where min_conf is the minimum confidence threshold (e.g., 75%).

For l = {2 3 5}, the nonempty proper subsets s of l are {2 3}, {3 5}, {2 5}, {2}, {3}, and {5}.

Candidate rules, with support_count(l) / support_count(s):

{2 3} => {5} : 2/2
{3 5} => {2} : 2/2
{2 5} => {3} : 2/3
{2} => {3 5} : 2/3
{3} => {2 5} : 2/3
{5} => {2 3} : 2/3

With min_conf = 75%, only {2 3} => {5} and {3 5} => {2} are introduced.

Support counts used:

itemset   sup.
{1}       2
{2}       3
{3}       3
{4}       1
{5}       3

itemset   sup
{1 2}     1
{1 3}     2
{1 5}     1
{2 3}     2
{2 5}     3
{3 5}     2

itemset   sup
{2 3 5}   2
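The rule-generation step for l = {2 3 5} can be sketched as below. The support counts are those listed in the tables; the function name `generate_rules` and the use of exact fractions are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations

# Support counts taken from the example tables.
support_count = {
    frozenset({2}): 3, frozenset({3}): 3, frozenset({5}): 3,
    frozenset({2, 3}): 2, frozenset({2, 5}): 3, frozenset({3, 5}): 2,
    frozenset({2, 3, 5}): 2,
}

def generate_rules(l, support_count, min_conf):
    """Return (body, head, confidence) for each rule s => (l - s) that passes."""
    l = frozenset(l)
    rules = []
    # Enumerate every nonempty proper subset s of l.
    for size in range(1, len(l)):
        for s in combinations(sorted(l), size):
            s = frozenset(s)
            conf = Fraction(support_count[l], support_count[s])
            if conf >= min_conf:
                rules.append((s, l - s, conf))
    return rules

for body, head, conf in generate_rules({2, 3, 5}, support_count, Fraction(3, 4)):
    print(sorted(body), "=>", sorted(head), float(conf))
```

Only the rules whose body has the same support count as l itself reach the 75% threshold in this example.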

Presentation of Association Rules (Table Form)

Visualization of Association Rule Using Plane Graph

Visualization of Association Rule Using Rule Graph

A decision tree is a classifier in the form of a tree structure where each node is either:

• a leaf node, indicating a class of instances, or

• a decision node that specifies some test to be carried out on a single attribute value, with one branch and sub-tree for each possible outcome of the test.

A decision tree can be used to classify an instance by starting at the root of the tree and moving through it until a leaf node is reached, which provides the classification of the instance.

 

Example: Decision making in the London stock market

 

Suppose that the major factors affecting the London stock market are:

• what it did yesterday;

• what the New York market is doing today;

• bank interest rate;

• unemployment rate;

• England’s prospects at cricket.

 

The process of predicting an instance with this decision tree can also be expressed by answering the questions in the following order:

Is unemployment high?
    YES: The London market will rise today.
    NO: Is the New York market rising today?
        YES: The London market will rise today.
        NO: The London market will not rise today.
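The question sequence above maps directly onto nested conditionals. The sketch below is only a transcription of that toy tree; the function and parameter names are hypothetical, and it uses just the two attributes the tree actually tests.

```python
def london_market_will_rise(unemployment_high: bool, new_york_rising: bool) -> bool:
    """Classify an instance by walking the tree from the root to a leaf."""
    # Root decision node: is unemployment high?
    if unemployment_high:
        return True  # leaf: the London market will rise today
    # Second decision node: is the New York market rising today?
    if new_york_rising:
        return True  # leaf: the London market will rise today
    return False     # leaf: the London market will not rise today

print(london_market_will_rise(unemployment_high=False, new_york_rising=True))
```

The other listed factors (yesterday's movement, bank interest rate, cricket) do not appear in the tree, so they play no part in the classification.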

 

Decision tree induction is a typical inductive approach to learning classification knowledge. The key requirements for mining with decision trees are:

• Attribute-value description: each object or case must be expressible in terms of a fixed collection of properties or attributes.

• Predefined classes: the categories to which cases are to be assigned must have been established beforehand (supervised data).

• Discrete classes: a case does or does not belong to a particular class, and there must be far more cases than classes.

• Sufficient data: usually hundreds or even thousands of training cases.

• “Logical” classification model: a classifier that can be expressed as decision trees or a set of production rules.

 

 

An appeal of market basket analysis comes from the clarity and utility of its results, which are in the form of association rules. There is an intuitive appeal to a market basket analysis because it expresses how tangible products and services relate to each other and how they tend to group together. A rule like “if a customer purchases three-way calling, then that customer will also purchase call waiting” is clear. Even better, it suggests a specific course of action, like bundling three-way calling with call waiting into a single service package. While association rules are easy to understand, they are not always useful.

The following three rules are examples of real rules generated from real data:

• On Thursdays, grocery store consumers often purchase diapers and beer together.

• Customers who purchase maintenance agreements are very likely to purchase large appliances.

• When a new hardware store opens, one of the most commonly sold items is toilet rings.

These three examples illustrate the three common types of rules produced by association rule analysis: the useful, the trivial, and the inexplicable.

OLAP (Summarization) Display Using MS/Excel 2000

Market-Basket-Analysis (Association)—Ball graph

Display of Association Rules in Rule Plane Form

Display of Decision Tree (Classification Results)

Display of Clustering (Segmentation) Results

3D Cube Browser