SeqStream: Mining Closed Sequential Pattern over Stream Sliding Windows Lei Chang Tengjiao Wang Dongqing Yang Hua Luan ICDM’08 111/05/14 1




Outline.

Preliminary.

Algorithm.

Experimental results.

Conclusion.


Preliminary.

The inverse sequence of a sequence s, denoted by s’

s = <abae>, s’= <eaba>

An s-projected database Ds

<b>-projected database is {<da>,<ae>,<cda>,<cdae>,<cda>}

The size of Ds denoted as R(Ds)

The size of <b>-projected database is 14.

<e>-projected database is {φ,φ,<bcda>,<ac>}

The size of <e>-projected database is 6.
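The definitions above can be tried out with a few lines of Python. This is only a sketch under assumptions: every element is taken to be a single item, so a sequence is written as a plain string and φ as the empty string. Projection is taken to mean "the suffix after the earliest match of the prefix", and sequences that do not contain the prefix are dropped. The function names are hypothetical.

```python
def inverse(seq: str) -> str:
    """Inverse sequence s': the items of s in reverse order."""
    return seq[::-1]


def project(seq: str, prefix: str):
    """Suffix of seq after the earliest match of prefix (as a subsequence).

    Returns None when seq does not contain prefix. This reading of
    'projected' is an assumption, not taken from the slides.
    """
    pos = 0
    for item in prefix:
        pos = seq.find(item, pos)
        if pos < 0:
            return None
        pos += 1
    return seq[pos:]


def projected_db(db, prefix):
    """Projected database: one suffix per sequence that contains prefix."""
    suffixes = (project(seq, prefix) for seq in db)
    return [s for s in suffixes if s is not None]


def size(pdb):
    """Size of a projected database: total number of items it holds."""
    return sum(len(s) for s in pdb)


print(inverse("abae"))                              # eaba
print(project("abae", "b"))                         # ae
print(size(["da", "ae", "cda", "cdae", "cda"]))     # 14
print(size(["", "", "bcda", "ac"]))                 # 6
```

The last two calls use the <b>- and <e>-projected databases exactly as listed on the slides, with φ written as "".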

The inverse database of D, denoted by D’

The database in the current sliding window after inserting (but before removing) is denoted by D^.

D^ : {<fbda>,<abaec>,<fbcdac>,<bcdae>,<ebcdaf>,<aeac>}

In the inverse database of D^, the set of sequences from users that appear in the current window is called an insertion database, denoted by D+.

The set of sequences from users that appear in the remove window is called a removal database, denoted by D-.


D^ : {<fbda>,<abaec>,<fbcdac>,<bcdae>,<ebcdaf>,<aeac>}

D^’: {<adbf>,<ceaba>,<cadcbf>,<eadcb>,<fadcbe>,<caea>}

D+ : {<ceaba>,<cadcbf>,<fadcbe>}

D- : {<cadcbf>,<eadcb>,<fadcbe>}
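The four sets above are related by simple list operations, which the sketch below reproduces. It assumes single-item elements (sequences as strings). The user positions that make up D+ and D- are read off the slide and hard-coded. They are not derived from a stream, so `plus_users` and `minus_users` are illustrative names only.

```python
# D^ as listed on the slide, one sequence per user.
d_hat = ["fbda", "abaec", "fbcdac", "bcdae", "ebcdaf", "aeac"]

# Inverse database: reverse every sequence.
d_hat_inv = [seq[::-1] for seq in d_hat]

# Positions (0-based) of the users in D+ and D-, copied from the slide.
plus_users = (1, 2, 4)
minus_users = (2, 3, 4)

d_plus = [d_hat_inv[i] for i in plus_users]
d_minus = [d_hat_inv[i] for i in minus_users]

print(d_hat_inv)  # ['adbf', 'ceaba', 'cadcbf', 'eadcb', 'fadcbe', 'caea']
print(d_plus)     # ['ceaba', 'cadcbf', 'fadcbe']
print(d_minus)    # ['cadcbf', 'eadcb', 'fadcbe']
```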


closed pattern : {<a>:6,<ae>:3,<c>:4,<ba>:5,<bda>:4,<bcda>:3,<e>:4}

closed pattern : {<a>:6,<ab>:5,<adb>:4,<adcb>:3,<c>:4,<e>:4,<ea>:3}
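Both pattern lists can be reproduced by brute force on a small window. The sketch below rests on two assumptions that are not stated on the slides. First, the window D is inferred by dropping the newest item from the three D+ sequences of D^, which is consistent with the <b>- and <e>-projected databases quoted earlier. Second, the minimum support is taken to be 3. The code is an exhaustive check for illustration, not the SeqStream algorithm.

```python
from itertools import combinations


def subsequences(seq):
    """All non-empty subsequences of seq, order preserved."""
    return {
        "".join(seq[i] for i in idx)
        for r in range(1, len(seq) + 1)
        for idx in combinations(range(len(seq)), r)
    }


def is_subsequence(pattern, seq):
    """True if pattern occurs in seq as a (not necessarily contiguous) subsequence."""
    it = iter(seq)
    return all(item in it for item in pattern)


def closed_patterns(db, min_sup):
    """Frequent patterns that have no super-pattern with the same support."""
    support = {}
    for seq in db:
        for pat in subsequences(seq):  # a set, so each sequence counts once
            support[pat] = support.get(pat, 0) + 1
    frequent = {p: s for p, s in support.items() if s >= min_sup}
    return {
        p: s
        for p, s in frequent.items()
        if not any(
            q != p and frequent[q] == s and is_subsequence(p, q)
            for q in frequent
        )
    }


# Assumed window D (inferred, see the paragraph above) and assumed min_sup.
d = ["fbda", "abae", "fbcda", "bcdae", "ebcda", "aeac"]
closed_d = closed_patterns(d, min_sup=3)
closed_d_inv = closed_patterns([s[::-1] for s in d], min_sup=3)

print(sorted(closed_d.items()))
print(sorted(closed_d_inv.items()))
```

Under these assumptions the first result matches the first list and the second result matches the second list, which is the first list with every pattern reversed.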


sn : A node n of an IST corresponds to the sequence along the path from the root node to n; this sequence is denoted by sn.

c-node : If sn is a closed sequential pattern in D’, n is a c-node.

t-node : If sn is not a closed sequential pattern in D’ and n does not have any t-node ancestor, n is a t-node.

i-node : n is an i-node if it is neither a c-node nor a t-node.
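To make the path-to-sequence correspondence concrete, the sketch below builds a small prefix tree from the second closed-pattern list quoted earlier (the patterns in the inverse database) and reports, for each node n, whether sn is one of those closed patterns. It is a simplified illustration: the `Node` class and its fields are hypothetical, the tree holds only the nodes needed to reach the listed patterns, and it does not attempt to tell t-nodes from i-nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node; sn is the item path from the root to it."""
    item: str
    support: int = 0
    closed: bool = False
    children: dict = field(default_factory=dict)


def build_tree(closed_patterns):
    """Insert each closed pattern as a root-to-node path."""
    root = Node(item="")
    for pattern, support in closed_patterns.items():
        node = root
        for item in pattern:
            node = node.children.setdefault(item, Node(item))
        node.support = support
        node.closed = True
    return root


def label_nodes(node, prefix=""):
    """Map sn -> 'c-node' or 'not closed' for every node below `node`."""
    labels = {}
    for item, child in sorted(node.children.items()):
        seq = prefix + item
        labels[seq] = "c-node" if child.closed else "not closed"
        labels.update(label_nodes(child, seq))
    return labels


inverse_closed = {"a": 6, "ab": 5, "adb": 4, "adcb": 3, "c": 4, "e": 4, "ea": 3}
labels = label_nodes(build_tree(inverse_closed))
for seq, kind in labels.items():
    print(f"<{seq}>: {kind}")
```

The nodes for <ad> and <adc> come out as "not closed": they lie on the path to <adb> and <adcb> but are not in the closed-pattern list themselves.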


Algorithm.

Element insertion

Element removal

State update


Element insertion

Theorem 2 : If the item of a depth-1 node does not occur in the newly arriving element, the nodes under that node do not change their attribute values, and no t-node under it changes its type after the element is inserted.

Theorem 3 : After inserting a new element, if the PDBSize and support of a t-node do not change, it remains a t-node.
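Theorem 2 suggests a cheap filter before any subtree is visited. The sketch below illustrates that filter only, under these assumptions: elements are single items, the newest item is the first item of each inverse sequence in D+, and the list of depth-1 items is hypothetical. The function name is illustrative, not taken from the paper.

```python
def touched_depth1_items(insertion_db):
    """Items of the newly arrived elements.

    Assumption: in the inverse database the newest item of a
    sequence is its first item.
    """
    return {seq[0] for seq in insertion_db if seq}


d_plus = ["ceaba", "cadcbf", "fadcbe"]          # D+ from the earlier example
depth1_items = ["a", "b", "c", "d", "e", "f"]   # hypothetical depth-1 nodes

touched = touched_depth1_items(d_plus)
to_visit = [item for item in depth1_items if item in touched]
skipped = [item for item in depth1_items if item not in touched]

print(to_visit)  # ['c', 'f']
print(skipped)   # ['a', 'b', 'd', 'e']
```

Only the subtrees under <c> and <f> would need to be examined for this insertion; by Theorem 2 the others keep their attribute values.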


Dc^’ : {<eaba>,<adbf>,<b>,<be>,<aea>}

Df^’ : {φ, φ,<adcbe>}

c : {<eaba>,<ab>,<b>,<be>,<aea>}

ca : {<ba>,<b>,<ea>}

cb : {<a>, φ,<e>}

ce : {<aba>, φ,<a>}


Element removal

Theorem 5 : After the removal of e_{tc−w}, a t-node may be deleted, but it never changes to a c-node or an i-node.

For each child node t of n, the algorithm computes the st-projected database in the removal database D−.


D− : {<cadcbf>,<eadcb>,<fadcbe>}

Da− : {<dcbf>,<dcb>,<dcbe>}

Db− : {<f>,φ,<e>}

Dc− : {<adbf>,<b>,<be>}

……

Df− : {φ,<adcbe>}
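The projected databases for <a>, <b> and <f> above can be recomputed from D− with a one-item projection. This sketch assumes single-item elements, φ written as the empty string, and projection as "the suffix after the first occurrence of the item". The <c> row is left out because its first entry on the slide does not follow from this simple reading. The helper names are hypothetical.

```python
def project_item(seq, item):
    """Suffix after the first occurrence of item, or None if item is absent."""
    i = seq.find(item)
    return None if i < 0 else seq[i + 1:]


def projected_db(db, item):
    """One suffix per sequence that contains item; '' stands for φ."""
    suffixes = (project_item(seq, item) for seq in db)
    return [s for s in suffixes if s is not None]


d_minus = ["cadcbf", "eadcb", "fadcbe"]  # D− as listed above

print(projected_db(d_minus, "a"))  # ['dcbf', 'dcb', 'dcbe']
print(projected_db(d_minus, "b"))  # ['f', '', 'e']
print(projected_db(d_minus, "f"))  # ['', 'adcbe']
```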


State update

Theorem 6 : Given a t-node n in an IST for the inverse database D, there must exist an i-node or a c-node t in the IST.

i-node => c-node

c-node => t-node


Experimental results.


Conclusion.

This paper proposes the SeqStream algorithm for mining closed sequential patterns over stream sliding windows.

Designed for multi-stream?
