mining weighted sequential patterns in a sequence

Upload: daniel-dharmapuri

Post on 10-Apr-2018

TRANSCRIPT

  • 8/8/2019 Mining Weighted Sequential Patterns in a Sequence

    1/22

    1

    Mining weighted sequential patterns in a

    sequence database with a time-interval weight

    Author: Joong Hyuk Chang

    DCIT, Daegu University,

    ABSTRACT

    Weighted sequential pattern mining aims to find more interesting sequential patterns by taking into account the different significance of each data element in a sequence database.

    OUTLINE

    ABSTRACT

    KEYWORDS

    APPLICATIONS

    INTRODUCTION

    RELATED WORK

    PROBLEM DEFINITION

    TiWS Patterns

    Mining TiWS patterns in a large database

    EXPERIMENTAL RESULTS

    CONCLUSION

    KEY WORDS

    Sequence database

    Time-interval sequence database

    Sequential pattern mining

    Weighted sequential pattern

    TiWS pattern

    TiWS support

    KEY WORDS

    Sequence database: A database of sequences, each consisting of ordered elements or events.

    Time-interval sequence database: A sequence database in which each sequence has an associated time stamp list.

    Sequential pattern mining: Given a set of sequences and a support threshold, find the complete set of frequent subsequences.

    Example: given the support threshold min_sup = 2, a subsequence that occurs in at least two sequences of the database is a sequential pattern.

    Weighted sequential pattern mining: Sequential pattern mining over the weighted sequences in a sequence database.
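The support test in the definition of sequential pattern mining can be illustrated with a small sketch. The database, the pattern, and the helper names (`contains`, `support`, `seq`) below are made up for illustration and do not come from the slides; each sequence is modelled as a list of itemsets.

```python
from typing import FrozenSet, List, Sequence

Itemset = FrozenSet[str]


def contains(sequence: Sequence[Itemset], pattern: Sequence[Itemset]) -> bool:
    """Return True if `pattern` is a subsequence of `sequence`.

    Each pattern element must be a subset of a distinct sequence element,
    and the matched elements must appear in the same order.
    """
    pos = 0
    for element in sequence:
        if pos < len(pattern) and pattern[pos] <= element:
            pos += 1
    return pos == len(pattern)


def support(database: List[List[Itemset]], pattern: Sequence[Itemset]) -> int:
    """Count the sequences in the database that contain the pattern."""
    return sum(1 for sequence in database if contains(sequence, pattern))


def seq(*elements: str) -> List[Itemset]:
    """Build a sequence from strings, one string of item letters per itemset."""
    return [frozenset(e) for e in elements]


# Hypothetical toy database (not taken from the slides).
database = [
    seq("a", "abc", "ac", "d"),
    seq("ad", "c", "bc"),
    seq("ef", "ab", "df", "c", "b"),
]
pattern = seq("ab", "c")  # the sequence <(ab), c>
min_sup = 2

count = support(database, pattern)
print(count, count >= min_sup)  # 2 True
```

With min_sup = 2, the pattern <(ab), c> is frequent in this toy database because it is contained in the first and third sequences.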

    APPLICATIONS

    Applications of sequential pattern mining

    Customer purchase pattern analysis

    First buy a computer, then a CD-ROM, and then a digital camera, within 3 months.

    Medical treatments, natural disasters (e.g., earthquakes), science and engineering processes, stocks and markets, etc.

    Web access pattern analysis

    Telephone calling patterns, Weblog click streams

    DNA sequences and gene structures analysis.

    INTRODUCTION

    Sequential pattern mining aims to discover interesting patterns in a sequence database.

    Example:

    [Customer_A]: Laser printer    Jan
                  Scanner          Feb
                  CD burner        Mar

    [Customer_B]: Laser printer    Jan
                  Scanner          Jun
                  CD burner        Sep

    Both customers bought the same items in the same order, but the time intervals between their purchases differ.

    RELATED WORK

    To improve the usefulness of mining results in real-world applications, weighted pattern mining has been studied in both association rule mining and sequential pattern mining. Most weighted pattern mining algorithms require pre-assigned weights, which are generally derived from the quantitative information and the importance of items in the real-world application.

    General sequential pattern mining

    Closed and maximal sequential pattern mining, etc.

    SPADE and PrefixSpan are more efficient in terms of processing time.

    PROBLEM DEFINITION

    Let I = {i1, i2, ..., in} be the set of all items.

    A sequence S = <s1, s2, ..., sl> is an ordered list of itemsets, where each sj is an itemset.

    The time stamp list of S is TS(S) = <t1, t2, ..., tl>, where tj-1 ≤ tj for 2 ≤ j ≤ l.

    Subsequence vs. super sequence

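    Given two sequences α = <a1 a2 ... an> and β = <b1 b2 ... bm>,

    α is called a subsequence of β, denoted as α ⊑ β, if there exist integers 1 ≤ j1 < j2 < ... < jn ≤ m such that a1 ⊆ bj1, a2 ⊆ bj2, ..., an ⊆ bjn. β is then a super sequence of α.

    Example: α = <(ab), d> and β = <(abc), (de)>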

    TiWS-patterns

    A time interval between a pair of itemsets

    Time-interval weight of a pair of itemsets

    Time-interval weight of a sequence

        1. Strength of a pair of itemsets

    A time interval between a pair of itemsets

    Definition 1 (time interval between a pair of itemsets): Let S = <s1, s2, ..., sl> be a sequence and TS(S) = <t1, t2, ..., tl> be its time stamp list.

    The time interval between si and sj is TIij = tj - ti, where 1 ≤ i < j ≤ l.
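As a small illustration of Definition 1, the sketch below computes the interval for every pair of elements in a time stamp list. The function name `time_intervals` is hypothetical, and encoding the months of the two-customer example as the numbers 1 to 12 (one month as the time unit) is an assumption made only for this example.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def time_intervals(timestamps: List[float]) -> Dict[Tuple[int, int], float]:
    """Map each 1-based index pair (i, j) with i < j to t_j - t_i."""
    return {
        (i + 1, j + 1): timestamps[j] - timestamps[i]
        for i, j in combinations(range(len(timestamps)), 2)
    }


# Purchase months of the two customers, encoded as month numbers (assumption).
customer_a = [1, 2, 3]  # Jan, Feb, Mar
customer_b = [1, 6, 9]  # Jan, Jun, Sep

print(time_intervals(customer_a))  # {(1, 2): 1, (1, 3): 2, (2, 3): 1}
print(time_intervals(customer_b))  # {(1, 2): 5, (1, 3): 8, (2, 3): 3}
```

The two customers share the same item order, but every interval of Customer_B is larger, which is the difference a time-interval weight is meant to capture.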

    Time interval weight of a pair of itemsets.

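    Three weight functions are defined:

    WF_1: General scale weighting: $W_g(TI_{ij}) = \delta^{TI_{ij}/u} = \delta^{(t_j - t_i)/u}$

    WF_2: Log scale weighting: $W_l(TI_{ij}) = \delta^{\log_2(1 + TI_{ij}/u)} = \delta^{\log_2(1 + (t_j - t_i)/u)}$

    WF_3: General scale weighting with a ceiling: $W_c(TI_{ij}) = \delta^{\lceil TI_{ij}/u \rceil} = \delta^{\lceil (t_j - t_i)/u \rceil}$

    where $u$ ($u > 0$) is the size of the unit time and $\delta$ ($0 < \delta < 1$) is the base number that determines the amount of weight reduction per unit time.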
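The three weighting functions can be sketched as follows. The function names and the default values of `u` and `delta` are illustrative. The base-2 logarithm and the `1 +` offset in the log-scale variant are an assumed reading of the slide, not something it states legibly, and should be checked against the original paper.

```python
import math


def _check(u: float, delta: float) -> None:
    """Validate the unit time u (> 0) and the base number delta (0 < delta < 1)."""
    if u <= 0:
        raise ValueError("u must be positive")
    if not 0 < delta < 1:
        raise ValueError("delta must lie strictly between 0 and 1")


def weight_general(ti: float, u: float = 1.0, delta: float = 0.9) -> float:
    """WF_1, general scale: delta ** (TI / u)."""
    _check(u, delta)
    return delta ** (ti / u)


def weight_log(ti: float, u: float = 1.0, delta: float = 0.9) -> float:
    """WF_2, log scale (assumed form): delta ** log2(1 + TI / u)."""
    _check(u, delta)
    return delta ** math.log2(1 + ti / u)


def weight_ceiling(ti: float, u: float = 1.0, delta: float = 0.9) -> float:
    """WF_3, general scale with a ceiling: delta ** ceil(TI / u)."""
    _check(u, delta)
    return delta ** math.ceil(ti / u)


for ti in (1, 5, 8):
    print(ti, weight_general(ti), weight_log(ti), weight_ceiling(ti))
```

All three functions return 1 for a zero interval and decrease as the interval grows; the log-scale variant decays more slowly for long intervals, and the ceiling variant treats every started unit of time as a full unit.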

    Time-interval weight of a sequence

    Definition 2 (strength of a pair of itemsets):

    STij = length(si) × length(sj), where length(s) is the number of items in the itemset s.

    Time-interval weight of a sequence.
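The slide gives the strength of a pair but not the formula that combines the pair weights into a sequence weight. The sketch below assumes a strength-weighted average of the pair weights, with a weight of 1 for a sequence that has a single element; both choices are assumptions for illustration, as are the function name `sequence_weight` and the example data.

```python
from itertools import combinations
from typing import Callable, List, Set


def sequence_weight(
    itemsets: List[Set[str]],
    timestamps: List[float],
    pair_weight: Callable[[float], float],
) -> float:
    """Assumed W(S): sum(w(TI_ij) * ST_ij) / sum(ST_ij) over all pairs i < j.

    Itemsets are assumed to be non-empty, so every strength is positive.
    """
    if len(itemsets) != len(timestamps):
        raise ValueError("one time stamp is required per itemset")
    if len(itemsets) < 2:
        return 1.0  # assumption: no pair exists, so no weight reduction
    weighted_sum = 0.0
    strength_sum = 0.0
    for i, j in combinations(range(len(itemsets)), 2):
        strength = len(itemsets[i]) * len(itemsets[j])  # ST_ij
        interval = timestamps[j] - timestamps[i]        # TI_ij
        weighted_sum += pair_weight(interval) * strength
        strength_sum += strength
    return weighted_sum / strength_sum


# Hypothetical sequence <(a), (bc), (d)> with time stamps 1, 2, 3.
example = [{"a"}, {"b", "c"}, {"d"}]
print(sequence_weight(example, [1, 2, 3], lambda ti: 0.5 ** ti))  # 0.45
```

In the example the three pairs have strengths 2, 1 and 2 and pair weights 0.5, 0.25 and 0.5, so the assumed weight is (1.0 + 0.25 + 1.0) / 5 = 0.45.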

    TiWS-Support

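    Definition 3 (TiW-support of a sequence):

    The TiW-support of a sequence X in SDB, TiW-Supp(X), is defined as follows:

    $TiW\text{-}Supp(X) = \dfrac{\sum_{S \in SDB,\ X \sqsubseteq S} W(S)}{\sum_{S \in SDB} W(S)}$

    The numerator is the weight of the sequence X, that is, the total weight of the sequences in SDB that contain X; the denominator is the weight of the total sequences in SDB.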
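A minimal sketch of the TiW-support computation is shown below, assuming the weight of each sequence has already been computed and that the weight of X is the total weight of the sequences containing X. The helper names and the three-sequence database with its weights are hypothetical.

```python
from typing import List, Sequence, Set, Tuple

WeightedSequence = Tuple[List[Set[str]], float]  # (itemsets, W(S))


def contains(sequence: Sequence[Set[str]], pattern: Sequence[Set[str]]) -> bool:
    """True if every pattern itemset fits, in order, into a later sequence itemset."""
    remaining = iter(sequence)
    return all(any(p <= element for element in remaining) for p in pattern)


def tiw_support(weighted_db: List[WeightedSequence], pattern: Sequence[Set[str]]) -> float:
    """Weight of the sequences containing the pattern, divided by the total weight."""
    total = sum(weight for _, weight in weighted_db)
    matched = sum(weight for seq, weight in weighted_db if contains(seq, pattern))
    return matched / total


# Hypothetical database: each sequence is paired with a precomputed weight W(S).
weighted_db = [
    ([{"a"}, {"b"}], 1.0),
    ([{"a"}, {"c"}], 0.5),
    ([{"a", "b"}, {"b"}], 0.5),
]
print(tiw_support(weighted_db, [{"a"}, {"b"}]))  # 0.75
```

Plain support would give this pattern 2/3; here the first sequence carries more weight than the other two, which raises the value to 1.5 / 2.0 = 0.75.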

    TiWS-Support

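    Definition 6 (TiWS patterns):

    Given a support threshold minSupport (0 < minSupport ≤ 1), a sequence X whose TiW-support meets the threshold is called a TiWS pattern (time-interval weighted sequential pattern).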

    Anti-monotone property of TiWS-support

    Let A and B be sequences in an SDB, and let B be a super sequence of A.

    Since A ⊑ B, every sequence that contains B also contains A, so the weight of A is never smaller than the weight of B. Accordingly, the following holds:

    TiW-Supp(A) ≥ TiW-Supp(B)
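The argument can be written out step by step. The derivation below assumes that the TiW-support of a sequence is the total weight of the sequences containing it divided by the total weight of SDB, and that every sequence weight W(S) is non-negative.

```latex
\begin{align*}
A \sqsubseteq B
  &\;\Rightarrow\; \{\, S \in SDB : B \sqsubseteq S \,\} \subseteq \{\, S \in SDB : A \sqsubseteq S \,\} \\
  &\;\Rightarrow\; \sum_{S \in SDB,\ B \sqsubseteq S} W(S) \;\le\; \sum_{S \in SDB,\ A \sqsubseteq S} W(S)
     \qquad (\text{since } W(S) \ge 0) \\
  &\;\Rightarrow\; \text{TiW-Supp}(B) \;\le\; \text{TiW-Supp}(A)
     \qquad (\text{dividing both sides by } \textstyle\sum_{S \in SDB} W(S) > 0)
\end{align*}
```

This is what allows a pattern-growth miner to stop extending a sequence as soon as its TiW-support falls short of the threshold.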

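    Mining TiWS patterns in a large sequence database

    Algorithm: psTiWS

    Input: SDB, minSupp, time-interval weighting function

    Output: The complete set of TiWS patterns

    Method:

    1. Scan SDB once. For each sequence S, call GetWeight(S). Find each time-interval weighted frequent item such that TiWS-Supp > minSupp.

    2. For each time-interval weighted frequent item α, output α and call Span(α, l, S|α).

    [Figure: S = <s1, s2, s3, ..., sn> and TS(S) = <t1, t2, t3, ..., tn> are passed to Procedure GetWeight(S), which yields W(S); Procedure Span(α, l, S|α) grows single-item TiWS patterns into longer TiWS patterns.]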

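    Procedure GetWeight(S)

    Parameters:

    S is a sequence with l elements and TS(S) is its time stamp list, i.e., S = <s1, s2, s3, ..., sl> and TS(S) = <t1, t2, ..., tl>. Let w(TI) be a weighting function for a time interval TI.

    Method:

    1. For each pair of elements si and sj (i < j ≤ l), find the time interval TIij between the two elements and find the strength of the pair, STij.

    2. Find the time-interval weight of S, W(S), from the weights w(TIij) and the strengths STij of all pairs.

    3. Return W(S).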

    Procedure Span(α, l, S|α)

    Parameters:

    α is a sequential pattern; l is the length of α; S|α is the α-projected database.

    Method:

    1. Scan S|α once and find each time-interval weighted frequent item b such that b can be assembled to the last element of α to form a TiWS pattern, or b can be appended to α to form a TiWS pattern.

    2. For each time-interval weighted frequent item b, append it to α to form a TiWS pattern α', and output α'.

    3. For each α', construct the α'-projected database S|α' and call Span(α', l+1, S|α').
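The pattern-growth recursion can be sketched in a much simplified form. The code below is not the procedure from the slides: it assumes every element of a sequence is a single item (so the "assemble into the last element" case is omitted), takes the sequence weights W(S) as given, and treats a pattern as frequent when its weighted support is at least `min_supp`. The names `mine_tiws` and `span` and the toy database are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

WeightedSeq = Tuple[List[str], float]  # (sequence of single items, W(S))


def mine_tiws(db: List[WeightedSeq], min_supp: float) -> Dict[Tuple[str, ...], float]:
    """Return every pattern whose weighted support is >= min_supp (assumed test)."""
    total = sum(weight for _, weight in db)
    patterns: Dict[Tuple[str, ...], float] = {}

    def span(prefix: Tuple[str, ...], projected: List[WeightedSeq]) -> None:
        # Sum the weights of the projected sequences in which each item occurs.
        item_weight: Dict[str, float] = defaultdict(float)
        for suffix, weight in projected:
            for item in set(suffix):
                item_weight[item] += weight
        for item, weight_sum in sorted(item_weight.items()):
            supp = weight_sum / total
            if supp < min_supp:
                continue  # anti-monotone property: no extension can be frequent
            pattern = prefix + (item,)
            patterns[pattern] = supp
            # Project each sequence on the first occurrence of the item.
            new_projected = [
                (suffix[suffix.index(item) + 1:], weight)
                for suffix, weight in projected
                if item in suffix
            ]
            span(pattern, new_projected)

    span((), db)
    return patterns


# Hypothetical database with precomputed sequence weights.
db = [(["a", "b", "c"], 1.0), (["a", "c"], 0.5), (["b", "c"], 0.5)]
for pattern, supp in mine_tiws(db, 0.6).items():
    print(pattern, supp)
```

On this toy input the sketch reports `('a',)`, `('a', 'c')`, `('b',)`, `('b', 'c')` and `('c',)`; the pattern `('a', 'b')` is dropped because only the first sequence contains it, giving a weighted support of 0.5.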

    CONCLUSION

    A process to get the weight of the sequence is proposed.

    A new framework is developed for mining weighted sequential patterns.
