![Page 1: Frequent-Pattern Tree](https://reader036.vdocuments.us/reader036/viewer/2022070417/568153ce550346895dc1c323/html5/thumbnails/1.jpg)
Frequent-Pattern Tree
![Page 2: Frequent-Pattern Tree](https://reader036.vdocuments.us/reader036/viewer/2022070417/568153ce550346895dc1c323/html5/thumbnails/2.jpg)
Bottleneck of Frequent-pattern Mining
- Multiple database scans are costly.
- Mining long patterns needs many scanning passes and generates lots of candidates.
  - To find the frequent itemset $i_1 i_2 \dots i_{100}$:
    - Number of scans: 100
    - Number of candidates: $\binom{100}{1} + \binom{100}{2} + \dots + \binom{100}{100} = 2^{100} - 1 \approx 1.27 \times 10^{30}$
- Bottleneck: candidate generation and test.
- Can we avoid candidate generation?
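The candidate count can be checked numerically. The following is a quick sketch using Python's standard `math.comb`; the variable names are illustrative only.

```python
from math import comb

n = 100  # length of the frequent itemset i1 i2 ... i100

# Every non-empty subset of a frequent itemset is itself frequent,
# so a candidate-generation method has to consider all of them.
candidates = sum(comb(n, k) for k in range(1, n + 1))

# The binomial sum collapses to 2^n - 1.
assert candidates == 2**n - 1
print(f"{candidates:.2e}")
```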
![Page 3: Frequent-Pattern Tree](https://reader036.vdocuments.us/reader036/viewer/2022070417/568153ce550346895dc1c323/html5/thumbnails/3.jpg)
Mining Freq Patterns w/o Candidate Generation
- Grow long patterns from short ones using local frequent items.
  - "abc" is a frequent pattern.
  - Get all transactions having "abc": DB|abc (the projected database on abc).
  - "d" is a local frequent item in DB|abc, so abcd is a frequent pattern.
  - Get all transactions having "abcd" (the projected database on "abcd") and find longer itemsets.
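The projection idea can be sketched in a few lines of Python. The toy transactions, the threshold `min_count`, and the helper names `project` and `local_frequent_items` are all assumptions made for illustration; only the "abc" to "abcd" step comes from the slide.

```python
from collections import Counter

# Toy data, made up for illustration.
transactions = [
    {"a", "b", "c", "d"},
    {"a", "b", "c", "d", "e"},
    {"a", "b", "c"},
    {"a", "b", "e"},
    {"b", "c", "d"},
]
min_count = 2  # assumed absolute support threshold


def project(db, pattern):
    """Return DB|pattern: the transactions containing every item of pattern."""
    return [t for t in db if pattern <= t]


def local_frequent_items(db, pattern, min_count):
    """Items outside `pattern` that are frequent inside DB|pattern."""
    counts = Counter(item for t in project(db, pattern) for item in t - pattern)
    return {item: n for item, n in counts.items() if n >= min_count}


abc = frozenset("abc")
local = local_frequent_items(transactions, abc, min_count)
print(local)  # local frequent items of DB|abc with their counts
```

Each local frequent item extends the current pattern by one item, and the same two helpers can then be applied to the longer pattern.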
![Page 4: Frequent-Pattern Tree](https://reader036.vdocuments.us/reader036/viewer/2022070417/568153ce550346895dc1c323/html5/thumbnails/4.jpg)
Mining Freq Patterns w/o Candidate Generation
- Compress a large database into a compact Frequent-Pattern tree (FP-tree) structure.
  - Highly condensed, but complete for frequent pattern mining.
  - Avoids costly database scans.
- Develop an efficient, FP-tree-based frequent pattern mining method.
  - A divide-and-conquer methodology: decompose mining tasks into smaller ones.
  - Avoid candidate generation: examine the sub-database (conditional pattern base) only!
![Page 5: Frequent-Pattern Tree](https://reader036.vdocuments.us/reader036/viewer/2022070417/568153ce550346895dc1c323/html5/thumbnails/5.jpg)
Construct FP-tree from a Transaction DB
min_sup = 50%

| TID | Items bought | (Ordered) frequent items |
| --- | --- | --- |
| 100 | {f, a, c, d, g, i, m, p} | {f, c, a, m, p} |
| 200 | {a, b, c, f, l, m, o} | {f, c, a, b, m} |
| 300 | {b, f, h, j, o} | {f, b} |
| 400 | {b, c, k, s, p} | {c, b, p} |
| 500 | {a, f, c, e, l, p, m, n} | {f, c, a, m, p} |

Steps:
1. Scan DB once, find frequent 1-itemsets (single-item patterns).
2. Order frequent items in frequency-descending order: f, c, a, b, m, p (L-order).
3. Process DB based on L-order.

| Item | Count | Item | Count |
| --- | --- | --- | --- |
| a | 3 | i | 1 |
| b | 3 | j | 1 |
| c | 4 | k | 1 |
| d | 1 | l | 2 |
| e | 1 | m | 3 |
| f | 4 | n | 1 |
| g | 1 | o | 2 |
| h | 1 | p | 3 |
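The three steps can be sketched in Python as follows. This is a minimal sketch, not a reference implementation: the variable names are illustrative, `min_count` assumes that 50% of 5 transactions rounds up to 3, and the tie-breaks in `l_order` are copied from the slide rather than derived, since items with equal counts could be ordered either way.

```python
from collections import Counter
from math import ceil

transactions = {
    100: ["f", "a", "c", "d", "g", "i", "m", "p"],
    200: ["a", "b", "c", "f", "l", "m", "o"],
    300: ["b", "f", "h", "j", "o"],
    400: ["b", "c", "k", "s", "p"],
    500: ["a", "f", "c", "e", "l", "p", "m", "n"],
}
min_count = ceil(0.5 * len(transactions))  # min_sup = 50% -> 3 transactions

# Step 1: one scan to count every item and keep the frequent ones.
counts = Counter(item for items in transactions.values() for item in items)
frequent = {item for item, n in counts.items() if n >= min_count}

# Step 2: L-order, taken from the slide; the checks confirm that it lists
# exactly the frequent items in non-increasing order of count.
l_order = ["f", "c", "a", "b", "m", "p"]
assert set(l_order) == frequent
assert all(counts[x] >= counts[y] for x, y in zip(l_order, l_order[1:]))

# Step 3: drop infrequent items and sort each transaction by L-order.
rank = {item: i for i, item in enumerate(l_order)}
ordered = {
    tid: sorted((i for i in items if i in frequent), key=rank.__getitem__)
    for tid, items in transactions.items()
}
for tid, items in ordered.items():
    print(tid, items)
```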
![Page 6: Frequent-Pattern Tree](https://reader036.vdocuments.us/reader036/viewer/2022070417/568153ce550346895dc1c323/html5/thumbnails/6.jpg)
Construct FP-tree from a Transaction DB
Initial FP-tree: {} (root only, no transactions inserted yet)
Header table (item: frequency, head): f: 0, nil; c: 0, nil; a: 0, nil; b: 0, nil; m: 0, nil; p: 0, nil
![Page 7: Frequent-Pattern Tree](https://reader036.vdocuments.us/reader036/viewer/2022070417/568153ce550346895dc1c323/html5/thumbnails/7.jpg)
Construct FP-tree from a Transaction DB
Insert {f, c, a, m, p} (TID 100)

{}
- f:1
  - c:1
    - a:1
      - m:1
        - p:1

Header table (item: frequency): f: 1, c: 1, a: 1, b: 0 (head nil), m: 1, p: 1
![Page 8: Frequent-Pattern Tree](https://reader036.vdocuments.us/reader036/viewer/2022070417/568153ce550346895dc1c323/html5/thumbnails/8.jpg)
Construct FP-tree from a Transaction DB
Insert {f, c, a, b, m} (TID 200)

{}
- f:2
  - c:2
    - a:2
      - m:1
        - p:1
      - b:1
        - m:1

Header table (item: frequency): f: 2, c: 2, a: 2, b: 1, m: 2, p: 1
![Page 9: Frequent-Pattern Tree](https://reader036.vdocuments.us/reader036/viewer/2022070417/568153ce550346895dc1c323/html5/thumbnails/9.jpg)
Construct FP-tree from a Transaction DB
Insert {f, b} (TID 300)

{}
- f:3
  - c:2
    - a:2
      - m:1
        - p:1
      - b:1
        - m:1
  - b:1

Header table (item: frequency): f: 3, c: 2, a: 2, b: 2, m: 2, p: 1
![Page 10: Frequent-Pattern Tree](https://reader036.vdocuments.us/reader036/viewer/2022070417/568153ce550346895dc1c323/html5/thumbnails/10.jpg)
Construct FP-tree from a Transaction DB
Insert {c, b, p} (TID 400)

{}
- f:3
  - c:2
    - a:2
      - m:1
        - p:1
      - b:1
        - m:1
  - b:1
- c:1
  - b:1
    - p:1

Header table (item: frequency): f: 3, c: 3, a: 2, b: 3, m: 2, p: 2
![Page 11: Frequent-Pattern Tree](https://reader036.vdocuments.us/reader036/viewer/2022070417/568153ce550346895dc1c323/html5/thumbnails/11.jpg)
Construct FP-tree from a Transaction DB
Insert {f, c, a, m, p} (TID 500)

{}
- f:4
  - c:3
    - a:3
      - m:2
        - p:2
      - b:1
        - m:1
  - b:1
- c:1
  - b:1
    - p:1

Header table (item: frequency): f: 4, c: 4, a: 3, b: 3, m: 3, p: 3
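The insertion procedure walked through on these slides can be sketched as below. This is a minimal sketch under stated assumptions: the `Node` class, `build_fp_tree`, and `show` are hypothetical names, and the header table keeps a plain list of nodes per item in place of the linked node-links drawn on the slides.

```python
class Node:
    """One FP-tree node: an item, its count, and its children keyed by item."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(ordered_transactions):
    root = Node(None)
    header = {}  # item -> list of nodes carrying it (stands in for node-links)
    for items in ordered_transactions:
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                # No shared prefix from here on: start a new branch.
                child = Node(item, node)
                node.children[item] = child
                header.setdefault(item, []).append(child)
            child.count += 1
            node = child
    return root, header


def show(node, depth=0):
    """Return the tree below `node` as indented 'item:count' lines."""
    lines = []
    for child in node.children.values():
        lines.append("  " * depth + f"{child.item}:{child.count}")
        lines.extend(show(child, depth + 1))
    return lines


# The (ordered) frequent items of TIDs 100 to 500.
ordered = [list("fcamp"), list("fcabm"), list("fb"), list("cbp"), list("fcamp")]
root, header = build_fp_tree(ordered)
print("\n".join(show(root)))

# Header-table frequencies: sum the counts of all nodes for each item.
totals = {item: sum(n.count for n in nodes) for item, nodes in header.items()}
print(totals)
```

Keeping a `parent` pointer on each node is a design choice that makes it possible to walk from any node back to the root, which is what prefix-path extraction needs later.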
![Page 12: Frequent-Pattern Tree](https://reader036.vdocuments.us/reader036/viewer/2022070417/568153ce550346895dc1c323/html5/thumbnails/12.jpg)
Benefits of FP-tree Structure
- Completeness:
  - Preserves complete DB information for frequent pattern mining (given a prior minimum support).
  - Each transaction is mapped to one FP-tree path; counts are stored at each node.
- Compactness:
  - One FP-tree path may correspond to multiple transactions; the tree is never larger than the original database (not counting node-links and counts).
  - Reduces irrelevant information: infrequent items are gone.
  - Frequency-descending ordering: more frequent items are closer to the top of the tree and more likely to be shared.
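The compactness claim can be checked on the running example by comparing the number of tree nodes with the number of item occurrences in the ordered transactions. The sketch below uses a nested-dictionary prefix tree without counts, which is an assumption made to keep the example short; it has the same shape as the FP-tree built from the five transactions.

```python
# The (ordered) frequent items of the five example transactions.
ordered = ["fcamp", "fcabm", "fb", "cbp", "fcamp"]

# Build a prefix tree: shared prefixes reuse the same nodes.
trie = {}
for transaction in ordered:
    node = trie
    for item in transaction:
        node = node.setdefault(item, {})


def count_nodes(node):
    """Number of nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(child) for child in node.values())


tree_nodes = count_nodes(trie)
item_occurrences = sum(len(t) for t in ordered)
print(tree_nodes, item_occurrences)
```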
![Page 13: Frequent-Pattern Tree](https://reader036.vdocuments.us/reader036/viewer/2022070417/568153ce550346895dc1c323/html5/thumbnails/13.jpg)
How Effective Is FP-tree?
[Figure: size (K), with axis ticks from 1 to 100000, versus support threshold (0% to 100%) for four series: Alphabetical FP-tree, Ordered FP-tree, Tran. DB, and Freq. Tran. DB. Dataset: Connect-4 (a dense dataset).]
![Page 14: Frequent-Pattern Tree](https://reader036.vdocuments.us/reader036/viewer/2022070417/568153ce550346895dc1c323/html5/thumbnails/14.jpg)
Mining Frequent Patterns Using FP-tree
- General idea (divide-and-conquer):
  - Recursively grow the frequent pattern path using the FP-tree.
- Frequent patterns can be partitioned into subsets according to L-order.
  - L-order = f-c-a-b-m-p
  - Patterns containing p
  - Patterns having m but no p
  - Patterns having b but no m or p
  - …
  - Patterns having c but none of a, b, m, p
  - Pattern f
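One way to read this partition: every pattern belongs to the subset named after its last item in L-order. A small sketch, with `partition_key` as a hypothetical helper name:

```python
l_order = ["f", "c", "a", "b", "m", "p"]
rank = {item: i for i, item in enumerate(l_order)}


def partition_key(pattern):
    """Return the item of `pattern` that comes last in L-order."""
    return max(pattern, key=rank.__getitem__)


# "cp" contains p; "fcam" has m but no p; "fc" has c but none of a, b, m, p.
for pattern in ["cp", "fcam", "fc", "f"]:
    print(pattern, "->", partition_key(pattern))
```

Because each pattern has exactly one last item, the subsets do not overlap and together cover all frequent patterns.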
![Page 15: Frequent-Pattern Tree](https://reader036.vdocuments.us/reader036/viewer/2022070417/568153ce550346895dc1c323/html5/thumbnails/15.jpg)
Mining Frequent Patterns Using FP-tree
- Step 1: Construct the conditional pattern base for each item in the header table.
- Step 2: Construct a conditional FP-tree from each conditional pattern base.
- Step 3: Recursively mine the conditional FP-trees and grow the frequent patterns obtained so far.
  - If a conditional FP-tree contains a single path, simply enumerate all patterns.
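The single-path shortcut can be sketched as follows: every non-empty combination of nodes on the path, joined with the current suffix, is a frequent pattern, and its support is the smallest count among the chosen nodes. The function name `patterns_from_single_path` is hypothetical, and the example path f:3, c:3, a:3 with suffix m is the conditional FP-tree of m in this deck's running example.

```python
from itertools import combinations


def patterns_from_single_path(path, suffix):
    """Enumerate patterns from a single-path conditional FP-tree.

    path: list of (item, count) pairs from the root downward.
    suffix: tuple of items the tree is conditioned on.
    """
    result = {}
    for size in range(1, len(path) + 1):
        for combo in combinations(path, size):
            items = tuple(item for item, _ in combo) + suffix
            # Support is limited by the least frequent node chosen.
            result[items] = min(count for _, count in combo)
    return result


m_patterns = patterns_from_single_path([("f", 3), ("c", 3), ("a", 3)], ("m",))
for items, support in m_patterns.items():
    print("".join(items), support)
```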
![Page 16: Frequent-Pattern Tree](https://reader036.vdocuments.us/reader036/viewer/2022070417/568153ce550346895dc1c323/html5/thumbnails/16.jpg)
Step 1: Construct Conditional Pattern Base
- Start at the header table of the FP-tree.
- Traverse the FP-tree by following the link of each frequent item.
- Accumulate all transformed prefix paths of the item to form a conditional pattern base.

Conditional pattern bases

| Item | Cond. pattern base |
| --- | --- |
| c | f:3 |
| a | fc:3 |
| b | fca:1, f:1, c:1 |
| m | fca:2, fcab:1 |
| p | fcam:2, cb:1 |

{}
- f:4
  - c:3
    - a:3
      - m:2
        - p:2
      - b:1
        - m:1
  - b:1
- c:1
  - b:1
    - p:1

Header table (item: frequency): f: 4, c: 4, a: 3, b: 3, m: 3, p: 3
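The conditional pattern bases can be reproduced with a short sketch. Note the shortcut, which is an assumption made to keep the example self-contained: instead of following node-links through the tree, it reads prefixes directly from the ordered transactions, which yields the same prefix paths and counts for this example. The function name `conditional_pattern_base` is hypothetical.

```python
from collections import Counter

# The (ordered) frequent items of the five example transactions.
ordered = ["fcamp", "fcabm", "fb", "cbp", "fcamp"]


def conditional_pattern_base(item):
    """Map each prefix that precedes `item` to the number of times it occurs."""
    base = Counter()
    for transaction in ordered:
        if item in transaction:
            prefix = transaction[: transaction.index(item)]
            if prefix:  # an empty prefix contributes nothing to the base
                base[prefix] += 1
    return dict(base)


for item in "fcabmp":
    print(item, conditional_pattern_base(item))
```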
![Page 17: Frequent-Pattern Tree](https://reader036.vdocuments.us/reader036/viewer/2022070417/568153ce550346895dc1c323/html5/thumbnails/17.jpg)
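Step 2: Construct Conditional FP-tree
- For each pattern base:
  - Accumulate the count for each item in the base.
  - Construct an FP-tree for the frequent items of the pattern base.

min_sup = 50%, # transactions = 5

Conditional pattern bases

| Item | Cond. pattern base |
| --- | --- |
| c | f:3 |
| a | fc:3 |
| b | fca:1, f:1, c:1 |
| m | fca:2, fcab:1 |
| p | fcam:2, cb:1 |

p conditional FP-tree
- p's pattern base written out as transactions: fcam, fcam, cb
- Item counts within the base: f: 2, c: 3, a: 2, m: 2, b: 1
- Resulting tree: {} with the single child c:3
- Header table (item: frequency): c: 3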
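The counting part of this step can be sketched as follows. The helper name `conditional_frequent_items` is hypothetical, and `min_count` assumes that 50% of 5 transactions rounds up to 3. The sketch only determines which items survive into the conditional FP-tree and with what counts; it does not build the tree itself.

```python
from collections import Counter
from math import ceil


def conditional_frequent_items(pattern_base, min_count):
    """Accumulate item counts over a pattern base and keep the frequent items."""
    counts = Counter()
    for prefix, n in pattern_base.items():
        for item in prefix:
            counts[item] += n  # each prefix path counts n times
    return {item: n for item, n in counts.items() if n >= min_count}


min_count = ceil(0.5 * 5)  # min_sup = 50% of 5 transactions

p_items = conditional_frequent_items({"fcam": 2, "cb": 1}, min_count)
m_items = conditional_frequent_items({"fca": 2, "fcab": 1}, min_count)
b_items = conditional_frequent_items({"fca": 1, "f": 1, "c": 1}, min_count)
print(p_items, m_items, b_items)
```

For p only c reaches the threshold, for m the items f, c, and a do, and for b nothing does, so b's conditional FP-tree is empty.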
![Page 18: Frequent-Pattern Tree](https://reader036.vdocuments.us/reader036/viewer/2022070417/568153ce550346895dc1c323/html5/thumbnails/18.jpg)
Mining Frequent Patterns by Creating Conditional Pattern-Bases

| Item | Conditional pattern-base | Conditional FP-tree |
| --- | --- | --- |
| p | {(fcam:2), (cb:1)} | {(c:3)}\|p |
| m | {(fca:2), (fcab:1)} | {(f:3, c:3, a:3)}\|m |
| b | {(fca:1), (f:1), (c:1)} | Empty |
| a | {(fc:3)} | {(f:3, c:3)}\|a |
| c | {(f:3)} | {(f:3)}\|c |
| f | Empty | Empty |
![Page 19: Frequent-Pattern Tree](https://reader036.vdocuments.us/reader036/viewer/2022070417/568153ce550346895dc1c323/html5/thumbnails/19.jpg)
Step 3: Recursively mine conditional FP-tree
Collect all patterns that end at p
- suffix: p(3)
  - FP: p(3)
  - CPB: fcam:2, cb:1
  - FP-tree: c(3)
  - suffix: cp(3)
    - FP: cp(3)
    - CPB: nil
![Page 20: Frequent-Pattern Tree](https://reader036.vdocuments.us/reader036/viewer/2022070417/568153ce550346895dc1c323/html5/thumbnails/20.jpg)
Step 3: Recursively mine conditional FP-tree
Collect all patterns that end at m
- suffix: m(3)
  - FP: m(3)
  - CPB: fca:2, fcab:1
  - FP-tree: f(3), c(3), a(3) (a single path)
  - suffix: fm(3)
    - FP: fm(3)
    - CPB: nil
  - suffix: cm(3)
    - FP: cm(3)
    - CPB: f:3
    - FP-tree: f(3)
    - suffix: fcm(3)
      - FP: fcm(3)
      - CPB: nil
  - suffix: am(3) (continued on the next page)
![Page 21: Frequent-Pattern Tree](https://reader036.vdocuments.us/reader036/viewer/2022070417/568153ce550346895dc1c323/html5/thumbnails/21.jpg)
Step 3: Recursively mine conditional FP-tree
Collect all patterns that end at m (cont'd)
- suffix: am(3)
  - FP: am(3)
  - CPB: fc:3
  - FP-tree: f(3), c(3) (a single path)
  - suffix: fam(3)
    - FP: fam(3)
    - CPB: nil
  - suffix: cam(3)
    - FP: cam(3)
    - CPB: f:3
    - FP-tree: f(3)
    - suffix: fcam(3)
      - FP: fcam(3)
      - CPB: nil
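The recursion traced on these slides can be sketched end to end. This is a simplified sketch rather than the slides' exact procedure: it represents each conditional pattern base as a list of (prefix, count) pairs instead of building an explicit conditional FP-tree, and the function name `fp_growth` and its parameters are assumptions. The recursion structure (suffix, frequent pattern, conditional pattern base) follows the trace.

```python
from collections import Counter

L_ORDER = ["f", "c", "a", "b", "m", "p"]


def fp_growth(base, suffix, min_count, out):
    """Grow patterns ending in `suffix` from `base`.

    base: list of (tuple of items in L-order, count) pairs.
    suffix: tuple of items already fixed at the end of the pattern.
    out: dict collecting pattern -> support.
    """
    counts = Counter()
    for items, n in base:
        for item in items:
            counts[item] += n
    for item in L_ORDER:
        if counts[item] < min_count:
            continue  # not a local frequent item
        pattern = (item,) + suffix
        out[pattern] = counts[item]
        # Conditional pattern base of `pattern`: the prefixes before `item`,
        # with items that are infrequent at this level dropped.
        cond = []
        for items, n in base:
            if item in items:
                prefix = tuple(
                    i for i in items[: items.index(item)] if counts[i] >= min_count
                )
                if prefix:
                    cond.append((prefix, n))
        fp_growth(cond, pattern, min_count, out)
    return out


ordered = ["fcamp", "fcabm", "fb", "cbp", "fcamp"]
patterns = fp_growth([(tuple(t), 1) for t in ordered], (), 3, {})

ending_at_m = {"".join(p): s for p, s in patterns.items() if p[-1] == "m"}
ending_at_p = {"".join(p): s for p, s in patterns.items() if p[-1] == "p"}
print(ending_at_m)
print(ending_at_p)
```

The patterns ending at m and at p printed here can be compared against the suffix traces on the slides.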
![Page 22: Frequent-Pattern Tree](https://reader036.vdocuments.us/reader036/viewer/2022070417/568153ce550346895dc1c323/html5/thumbnails/22.jpg)
FP-growth vs. Apriori: Scalability With the Support Threshold
[Figure: run time (sec.), 0 to 100, versus support threshold (%), 0 to 3, for two series: D1 FP-growth runtime and D1 Apriori runtime. Data set: T25I20D10K.]
![Page 23: Frequent-Pattern Tree](https://reader036.vdocuments.us/reader036/viewer/2022070417/568153ce550346895dc1c323/html5/thumbnails/23.jpg)
Why Is Frequent Pattern Growth Fast?
- A performance study shows that FP-growth is an order of magnitude faster than Apriori.
- Reasoning:
  - No candidate generation, no candidate test.
  - Uses a compact data structure.
  - Eliminates repeated database scans.
  - Basic operations are counting and FP-tree building.
![Page 24: Frequent-Pattern Tree](https://reader036.vdocuments.us/reader036/viewer/2022070417/568153ce550346895dc1c323/html5/thumbnails/24.jpg)
Weaknesses of FP-growth
- Support dependent; cannot accommodate a dynamic support threshold.
- Cannot accommodate incremental DB updates.
- Mining requires recursive operations.