![Page 1: Introduction to Machine Learning - San Jose State Universitystamp/ML/files/zz_figures.pdf · Introduction to Machine Learning with Applications in Information Security Mark Stamp](https://reader034.vdocuments.us/reader034/viewer/2022051602/5b42f83f7f8b9a17568b934d/html5/thumbnails/1.jpg)
Introduction to Machine Learning with Applications in Information Security
Mark Stamp
April 27, 2017
Chapter 2
Figure 2.1: Hidden Markov model [hidden states 𝑋0, 𝑋1, 𝑋2, …, 𝑋𝑇−1 linked by 𝐴; each state 𝑋𝑡 linked by 𝐵 to its observation 𝒪𝑡]
H: .06  .0448  .003136  .002822
C: .28  .0336  .014112  .000847
Figure 2.2: Dynamic programming
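The H and C values in Figure 2.2 can be reproduced with a short dynamic program that keeps, for each state, the highest-probability path ending there. The sketch below is illustrative only: the initial distribution, transition matrix, emission matrix, and observation sequence are assumptions (this figures file does not state them), picked because they yield the numbers shown in the figure.

```python
# Hypothetical HMM parameters (assumed, not given in this file); they were
# chosen because they reproduce the values shown in Figure 2.2.
STATES = ("H", "C")
PI = {"H": 0.6, "C": 0.4}                                  # initial distribution
A = {"H": {"H": 0.7, "C": 0.3}, "C": {"H": 0.4, "C": 0.6}}  # state transitions
B = {"H": (0.1, 0.4, 0.5), "C": (0.7, 0.2, 0.1)}            # emission probabilities
OBSERVATIONS = (0, 1, 0, 2)                                 # assumed observation indices


def best_path_scores(pi, a, b, observations):
    """Return, per time step, the best path probability ending in each state."""
    first = observations[0]
    scores = [{s: pi[s] * b[s][first] for s in STATES}]
    for obs in observations[1:]:
        prev = scores[-1]
        # Extend the best path into each state s, then emit the observation.
        scores.append(
            {s: max(prev[r] * a[r][s] for r in STATES) * b[s][obs] for s in STATES}
        )
    return scores


for t, column in enumerate(best_path_scores(PI, A, B, OBSERVATIONS)):
    print(t, {s: round(p, 6) for s, p in column.items()})
```

Under these assumed parameters the first column is H = .06, C = .28 and the last is H = .002822, C = .000847, matching the figure.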
Problem 2.15
I L I K E K I L L I N G P E O P L
E B E C A U S E I T I S S O M U C
H F U N I T I S M O R E F U N T H
A N K I L L I N G W I L D G A M E
I N T H E F O R R E S T B E C A U
S E M A N I S T H E M O S T D A N
G E R O U E A N A M A L O F A L L
T O K I L L S O M E T H I N G G I
V E S M E T H E M O S T T H R I L
L I N G E X P E R E N C E I T I S
E V E N B E T T E R T H A N G E T
T I N G Y O U R R O C K S O F F W
I T H A G I R L T H E B E S T P A
R T O F I T I S T H A E W H E N I
D I E I W I L L B E R E B O R N I
N P A R A D I C E A N D A L L T H
E I H A V E K I L L E D W I L L B
E C O M E M Y S L A V E S I W I L
L N O T G I V E Y O U M Y N A M E
B E C A U S E Y O U W I L L T R Y
T O S L O I D O W N O R A T O P M
Y C O L L E C T I O G O F S L A V
E S F O R M Y A F T E R L I F E E
B E O R I E T E M E T H H P I T I
Problem 2.16
Chapter 3
Figure 3.1: PHMM without gaps [states: begin, 𝑀1 to 𝑀4, end]
Figure 3.2: PHMM with insertions [states: begin, 𝑀1 to 𝑀4, end; insert states 𝐼0 to 𝐼4]
Figure 3.3: PHMM with deletions [states: begin, 𝑀1 to 𝑀4, end; delete states 𝐷1 to 𝐷4]
Figure 3.4: Profile hidden Markov model [states: begin, 𝑀1 to 𝑀4, end; insert states 𝐼0 to 𝐼4; delete states 𝐷1 to 𝐷4]
Figure 3.5: Minimum spanning tree [nodes 1 to 10; edge weights 78, 79, 79, 81, 84, 85, 85, 94, 105]
Figure 3.6: PHMM with 𝑁 = 2 illustrating paths in Table 3.12 [states: 𝑀0 to 𝑀3, 𝐼0 to 𝐼2, 𝐷1 and 𝐷2; edges labeled with path numbers 1 to 13]
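Figure 3.7: Forward algorithm recursion [nodes: $F^M_0(0)$; $F^M_1(1)$, $F^I_0(1)$, $F^D_1(0)$; $F^M_2(2)$, $F^I_1(2)$, $F^D_2(1)$; $F^M_1(2)$, $F^I_0(2)$, $F^D_1(1)$; $F^M_2(1)$, $F^I_1(1)$, $F^D_2(0)$; …; $F^M_N(L)$, $F^I_N(L)$, $F^D_N(L)$]
Figure 3.8: Final score [$F^M_{N+1}(L)$ obtained from $F^M_N(L)$, $F^I_N(L)$, $F^D_N(L)$ with edge labels $a_{M_N E}$, $a_{I_N E}$, $a_{D_N E}$]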
Chapter 4
Figure 4.1: Matrix multiplication example [vectors 𝑥 and 𝐴𝑥]
Figure 4.2: Eigenvector example [vectors 𝑥 and 𝐴𝑥]
(a) Experimental results (b) Direction of maximum variance
Figure 4.3: PCA and maximum variance
Figure 4.4: A better basis
Figure 4.5: Ferris wheel data [label: angle 𝜃]
Figure 4.6: Non-orthogonal data
Figure 4.7: Matrix transformation using SVD [labels: 𝑀, 𝑉ᵀ, 𝑆, 𝑈; singular values 𝜎1, 𝜎2]
Chapter 5
Figure 5.1: Scatterplot of training data
Figure 5.2: Separating hyperplanes
Figure 5.3: Maximizing the margin
Figure 5.4: Not linearly separable
Figure 5.5: Transformation to linearly separable [arrow labeled 𝜑]
Figure 5.6: Transformation from 2-d to 3-d [arrow labeled 𝜑]
Figure 5.7: Graph of 𝑓(𝑥, 𝑦) = 16 − (𝑥² + 𝑦²) [axes: 𝑥, 𝑦, 𝑓(𝑥, 𝑦)]
(a) Intersection (b) Feasible region
Figure 5.8: Constrained optimization example
Figure 5.9: Linearly separable example [label: 𝑚]
Figure 5.10: Support vectors
Figure 5.11: Errors and soft margin
Problem 5.15
Chapter 6
Figure 6.1: Dendrogram [leaves, left to right: 𝐸, 𝐴, 𝑂, 𝐼, 𝑈, 𝑇, 𝑁, 𝑆, 𝑅]
Figure 6.2: Euclidean vs Manhattan distance [points 𝐴 and 𝐵; paths labeled Euclidean distance and Manhattan distance]
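The contrast drawn in Figure 6.2 can be made concrete with a few lines of Python. This is a minimal sketch; the coordinates used for the two points are hypothetical and are not taken from the figure.

```python
import math


def euclidean(a, b):
    """Straight-line distance between points a and b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Distance when movement is restricted to axis-parallel steps."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical coordinates for the points labeled A and B.
POINT_A = (0.0, 0.0)
POINT_B = (3.0, 4.0)

print(euclidean(POINT_A, POINT_B))  # 5.0
print(manhattan(POINT_A, POINT_B))  # 7.0
```

For any pair of points the Manhattan distance is at least as large as the Euclidean distance, with equality when the points differ in only one coordinate.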
Figure 6.3: Distortion
(a) Suitable for clustering (b) Not as well-suited for clustering
Figure 6.4: Clusterability
(a) Correlation 𝑟𝑋𝑌 = 1 (b) Correlation 𝑟𝑋𝑌 = −1
(c) Correlation 0 < 𝑟𝑋𝑌 < 1 (d) Correlation −1 < 𝑟𝑋𝑌 < 0
Figure 6.5: Correlation coefficient and regression line examples
(a) Correlation 𝑟𝐴𝐷 = −0.8652 (b) Correlation 𝑟𝐴𝐷 = −0.5347
Figure 6.6: Correlation coefficient examples
X 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
1 0.00 0.36 0.70 0.85 0.75 1.61 0.95 1.41 2.17 2.25 1.71 2.49 2.81 3.28 3.50 3.82 4.46 4.46 3.96 3.20 2.97 3.86 4.23
2 0.36 0.00 0.56 0.49 0.51 1.25 0.78 1.10 2.23 2.42 1.77 2.55 2.94 3.16 3.37 3.67 4.25 4.25 3.75 3.01 2.75 3.62 4.00
3 0.70 0.56 0.00 0.62 1.03 1.27 1.31 1.44 1.72 2.01 1.28 2.03 2.48 2.61 2.83 3.13 3.76 3.76 3.26 2.50 2.27 3.16 3.53
4 0.85 0.49 0.62 0.00 0.61 0.76 0.89 0.83 2.31 2.63 1.88 2.61 3.09 2.97 3.16 3.42 3.92 3.92 3.42 2.72 2.42 3.25 3.65
5 0.75 0.51 1.03 0.61 0.00 1.14 0.29 0.70 2.73 2.93 2.28 3.05 3.46 3.56 3.75 4.02 4.52 4.52 4.03 3.32 3.03 3.85 4.26
6 1.61 1.25 1.27 0.76 1.14 0.00 1.31 0.76 2.75 3.21 2.38 3.02 3.59 3.01 3.14 3.32 3.64 3.64 3.16 2.57 2.21 2.92 3.35
7 0.95 0.78 1.31 0.89 0.29 1.31 0.00 0.71 3.01 3.18 2.55 3.33 3.72 3.85 4.04 4.30 4.79 4.79 4.29 3.60 3.30 4.11 4.52
8 1.41 1.10 1.44 0.83 0.70 0.76 0.71 0.00 3.13 3.45 2.71 3.43 3.92 3.66 3.82 4.02 4.39 4.39 3.91 3.29 2.94 3.68 4.10
9 2.17 2.23 1.72 2.31 2.73 2.75 3.01 3.13 0.00 0.74 0.46 0.32 0.85 1.63 1.90 2.30 3.21 3.21 2.79 2.05 2.11 2.90 3.11
10 2.25 2.42 2.01 2.63 2.93 3.21 3.18 3.45 0.74 0.00 0.86 0.83 0.63 2.33 2.60 3.00 3.92 3.92 3.52 2.79 2.85 3.64 3.84
11 1.71 1.77 1.28 1.88 2.28 2.38 2.55 2.71 0.46 0.86 0.00 0.78 1.22 1.88 2.15 2.53 3.38 3.38 2.93 2.15 2.13 2.98 3.24
12 2.49 2.55 2.03 2.61 3.05 3.02 3.33 3.43 0.32 0.83 0.78 0.00 0.68 1.51 1.78 2.18 3.11 3.11 2.72 2.03 2.15 2.87 3.04
13 2.81 2.94 2.48 3.09 3.46 3.59 3.72 3.92 0.85 0.63 1.22 0.68 0.00 2.09 2.33 2.73 3.68 3.68 3.33 2.68 2.82 3.51 3.65
14 3.28 3.16 2.61 2.97 3.56 3.01 3.85 3.66 1.63 2.33 1.88 1.51 2.09 0.00 0.27 0.67 1.60 1.60 1.25 0.75 1.08 1.48 1.56
15 3.50 3.37 2.83 3.16 3.75 3.14 4.04 3.82 1.90 2.60 2.15 1.78 2.33 0.27 0.00 0.40 1.35 1.35 1.03 0.70 1.07 1.30 1.33
16 3.82 3.67 3.13 3.42 4.02 3.32 4.30 4.02 2.30 3.00 2.53 2.18 2.73 0.67 0.40 0.00 0.96 0.96 0.72 0.75 1.13 1.05 0.98
17 4.46 4.25 3.76 3.92 4.52 3.64 4.79 4.39 3.21 3.92 3.38 3.11 3.68 1.60 1.35 0.96 0.00 0.00 0.50 1.27 1.50 0.74 0.32
18 4.46 4.25 3.76 3.92 4.52 3.64 4.79 4.39 3.21 3.92 3.38 3.11 3.68 1.60 1.35 0.96 0.00 0.00 0.50 1.27 1.50 0.74 0.32
19 3.96 3.75 3.26 3.42 4.03 3.16 4.29 3.91 2.79 3.52 2.93 2.72 3.33 1.25 1.03 0.72 0.50 0.50 0.00 0.79 1.00 0.38 0.32
20 3.20 3.01 2.50 2.72 3.32 2.57 3.60 3.29 2.05 2.79 2.15 2.03 2.68 0.75 0.70 0.75 1.27 1.27 0.79 0.00 0.39 0.85 1.10
21 2.97 2.75 2.27 2.42 3.03 2.21 3.30 2.94 2.11 2.85 2.13 2.15 2.82 1.08 1.07 1.13 1.50 1.50 1.00 0.39 0.00 0.90 1.26
22 3.86 3.62 3.16 3.25 3.85 2.92 4.11 3.68 2.90 3.64 2.98 2.87 3.51 1.48 1.30 1.05 0.74 0.74 0.38 0.85 0.90 0.00 0.43
23 4.23 4.00 3.53 3.65 4.26 3.35 4.52 4.10 3.11 3.84 3.24 3.04 3.65 1.56 1.33 0.98 0.32 0.32 0.32 1.10 1.26 0.43 0.00
(a) Heatmap corresponding to Figure 6.6 (a)
X 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
1 0.00 0.36 0.45 0.85 0.75 1.53 0.95 1.41 0.74 1.01 0.80 1.32 1.56 1.46 1.70 2.05 2.11 2.42 1.66 1.10 1.23 1.66 2.05
2 0.36 0.00 0.54 0.49 0.51 1.17 0.78 1.10 0.79 1.23 0.69 1.31 1.70 1.35 1.55 1.86 1.82 2.12 1.35 0.79 0.89 1.31 1.70
3 0.45 0.54 0.00 0.86 1.05 1.55 1.30 1.61 0.29 0.70 0.43 0.87 1.17 1.05 1.30 1.68 1.84 2.16 1.45 0.93 1.35 1.57 1.96
4 0.85 0.49 0.86 0.00 0.61 0.70 0.89 0.83 1.00 1.55 0.74 1.36 1.90 1.24 1.37 1.60 1.41 1.69 0.91 0.41 0.51 0.82 1.20
5 0.75 0.51 1.05 0.61 0.00 1.01 0.29 0.70 1.30 1.72 1.17 1.80 2.21 1.78 1.95 2.20 2.00 2.26 1.50 1.02 0.64 1.31 1.66
6 1.53 1.17 1.55 0.70 1.01 0.00 1.17 0.61 1.66 2.23 1.35 1.89 2.51 1.63 1.66 1.73 1.25 1.43 0.82 0.76 0.38 0.43 0.67
7 0.95 0.78 1.30 0.89 0.29 1.17 0.00 0.71 1.57 1.95 1.45 2.09 2.47 2.07 2.24 2.48 2.25 2.50 1.75 1.30 0.79 1.52 1.83
8 1.41 1.10 1.61 0.83 0.70 0.61 0.71 0.00 1.81 2.31 1.57 2.19 2.72 2.03 2.12 2.26 1.85 2.04 1.39 1.13 0.35 1.03 1.25
9 0.74 0.79 0.29 1.00 1.30 1.66 1.57 1.81 0.00 0.59 0.34 0.59 0.92 0.81 1.08 1.47 1.74 2.06 1.40 0.95 1.51 1.60 1.97
10 1.01 1.23 0.70 1.55 1.72 2.23 1.95 2.31 0.59 0.00 0.92 0.83 0.63 1.22 1.48 1.89 2.27 2.58 1.97 1.53 2.05 2.18 2.55
11 0.80 0.69 0.43 0.74 1.17 1.35 1.45 1.57 0.34 0.92 0.00 0.64 1.17 0.67 0.90 1.26 1.43 1.75 1.07 0.62 1.25 1.26 1.63
12 1.32 1.31 0.87 1.36 1.80 1.89 2.09 2.19 0.59 0.83 0.64 0.00 0.68 0.43 0.67 1.07 1.54 1.84 1.37 1.13 1.86 1.70 2.02
13 1.56 1.70 1.17 1.90 2.21 2.51 2.47 2.72 0.92 0.63 1.17 0.68 0.00 1.10 1.30 1.67 2.21 2.51 2.05 1.75 2.41 2.36 2.69
14 1.46 1.35 1.05 1.24 1.78 1.63 2.07 2.03 0.81 1.22 0.67 0.43 1.10 0.00 0.27 0.67 1.12 1.42 1.00 0.90 1.68 1.37 1.65
15 1.70 1.55 1.30 1.37 1.95 1.66 2.24 2.12 1.08 1.48 0.90 0.67 1.30 0.27 0.00 0.40 0.93 1.21 0.93 0.99 1.77 1.35 1.57
16 2.05 1.86 1.68 1.60 2.20 1.73 2.48 2.26 1.47 1.89 1.26 1.07 1.67 0.67 0.40 0.00 0.71 0.91 0.92 1.19 1.91 1.35 1.48
17 2.11 1.82 1.84 1.41 2.00 1.25 2.25 1.85 1.74 2.27 1.43 1.54 2.21 1.12 0.93 0.71 0.00 0.32 0.50 1.03 1.54 0.83 0.82
18 2.42 2.12 2.16 1.69 2.26 1.43 2.50 2.04 2.06 2.58 1.75 1.84 2.51 1.42 1.21 0.91 0.32 0.00 0.78 1.33 1.76 1.01 0.87
19 1.66 1.35 1.45 0.91 1.50 0.82 1.75 1.39 1.40 1.97 1.07 1.37 2.05 1.00 0.93 0.92 0.50 0.78 0.00 0.56 1.06 0.43 0.65
20 1.10 0.79 0.93 0.41 1.02 0.76 1.30 1.13 0.95 1.53 0.62 1.13 1.75 0.90 0.99 1.19 1.03 1.33 0.56 0.00 0.78 0.65 1.03
21 1.23 0.89 1.35 0.51 0.64 0.38 0.79 0.35 1.51 2.05 1.25 1.86 2.41 1.68 1.77 1.91 1.54 1.76 1.06 0.78 0.00 0.75 1.04
22 1.66 1.31 1.57 0.82 1.31 0.43 1.52 1.03 1.60 2.18 1.26 1.70 2.36 1.37 1.35 1.35 0.83 1.01 0.43 0.65 0.75 0.00 0.39
23 2.05 1.70 1.96 1.20 1.66 0.67 1.83 1.25 1.97 2.55 1.63 2.02 2.69 1.65 1.57 1.48 0.82 0.87 0.65 1.03 1.04 0.39 0.00
(b) Heatmap corresponding to Figure 6.6 (b)
Figure 6.7: Heatmaps
Figure 6.8: Silhouette coefficient example [point 𝑥𝑖; clusters 𝐶1, 𝐶2, 𝐶3]
(a) 𝐸 = 0.7632 and 𝑈 = 0.7272 (b) 𝐸 = 1.0280 and 𝑈 = 0.4545
Figure 6.9: Entropy and purity examples
Figure 6.10: Three clusters [labels: Cluster 1, Cluster 2, Cluster 3]
Figure 6.11: Stacked column chart for clusters in Figure 6.10 [categories: Cluster 1 to Cluster 3; axis: Number; legend: Ovals, Circles, Diamonds]
Figure 6.12: Stacked column chart (4-d model with 10 clusters) [categories: Cluster 1 to Cluster 10; axis: Number; legend: Zeroaccess, Zbot, Winwebsec, Benign]
Figure 6.13: EM clustering of Old Faithful eruption data [axes: Duration vs. Waiting time]
Figure 6.14: Old Faithful data for Gaussian mixture example [axes: Duration vs. Waiting time]
Figure 6.15: EM clusters for Old Faithful data [axes: Duration vs. Waiting time]
Problem 6.4
Problem 6.16
Chapter 7
Figure 7.1: Labeled training data
(a) 1-nearest neighbor (1-NN) (b) 3-nearest neighbor (3-NN)
Figure 7.2: 𝑘-NN examples [labels: 𝑋, 𝑟1, 𝑟2]
Figure 7.3: MLP with two hidden layers [input layer: 𝑋1, 𝑋2; 1st hidden layer: three 𝑓1 nodes; 2nd hidden layer: four 𝑓2 nodes; output layer: three 𝑔 nodes; output: 𝑌1, 𝑌2, 𝑌3]
Figure 7.4: Decision tree example [root: file size (large / small); second level: entropy (high / low); leaves: benign, malware, benign, benign]
Figure 7.5: Features in different order [root: entropy (high / low); second level: file size (large / small); leaves: benign, malware, benign, benign]
(a) Separating with LDA (b) Separating with QDA
Figure 7.6: LDA vs QDA
(a) A projection (b) A better projection
Figure 7.7: Projecting onto hyperplanes
(a) Means widely separated (b) Means closer together
Figure 7.8: Projecting the means [labels in each panel: 𝜇𝑥, 𝜇𝑦]
(a) Largest eigenvalue (b) Smallest eigenvalue
Figure 7.9: Projections of data in Table 7.4
Figure 7.10: Rounding as VQ [number line: …, −3, −2, −1, 0, 1, 2, 3, …, each integer inside a half-open interval [ )]
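Rounding can be read as a one-dimensional vector quantizer whose codewords are the integers. The sketch below assumes that the half-open intervals in Figure 7.10 are the cells [n − 0.5, n + 0.5) centered on each integer n; that reading is an assumption, and the function name is hypothetical. Python's built-in `round` is avoided because it rounds ties to the nearest even integer, which would not match half-open cells.

```python
import math


def quantize(x):
    """Return the integer codeword n whose cell [n - 0.5, n + 0.5) contains x."""
    return math.floor(x + 0.5)


for value in (-1.2, -0.5, 0.49, 0.5, 2.3):
    print(value, "->", quantize(value))
```

Every input in the same cell maps to the same codeword, so the quantization error never exceeds 0.5.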
(a) House size vs price (b) Linear regression
Figure 7.11: Regression line
(a) Linear regression (b) Piecewise linear
Figure 7.12: Regression examples
Figure 7.13: Logistic function [horizontal axis from −6 to 6; vertical axis ticks 0.25, 0.50, 0.75, 1.00]
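For reference, the curve in Figure 7.13 is consistent with the standard logistic function 1 / (1 + e^(−x)); treating the plotted function as this standard form is an assumption, since the formula itself does not appear in this file. A minimal, numerically careful sketch:

```python
import math


def logistic(x):
    """Standard logistic function 1 / (1 + e^(-x)), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # safe for large negative x: underflows to 0.0
    return z / (1.0 + z)


for x in (-6, -2, 0, 2, 6):
    print(x, round(logistic(x), 4))
```

The function maps 0 to 0.5 and satisfies logistic(x) + logistic(−x) = 1, which gives the symmetric S-shape over the plotted range.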
Figure 7.14: Graph structure of HMM [nodes: 𝑋0, 𝑋1, 𝑋2, …, 𝑋𝑇−1 and 𝒪0, 𝒪1, 𝒪2, …, 𝒪𝑇−1]
Figure 7.15: Linear chain CRF [nodes: 𝑋0, 𝑋1, 𝑋2, …, 𝑋𝑇−1 and 𝒪0, 𝒪1, 𝒪2, …, 𝒪𝑇−1]
Discriminative: Logistic regression | Linear chain CRF | Conditional random field
Generative: Naïve Bayes | Hidden Markov model | Generative directed model
Figure 7.16: Generative-discriminative pairs
Problem 7.16
Chapter 8
Figure 8.1: Scatterplot of scores [axes: Experiment vs. Score; legend: Match scores, Nomatch scores]
Figure 8.2: Thresholding is easy . . . sometimes
            Malware   Not malware
High score  TP        FP
Low score   FN        TN
Figure 8.3: Confusion matrix
Figure 8.4: Scatterplot and a point on the ROC curve [axes: FPR vs. TPR, each from 0 to 1]
Figure 8.5: ROC curve example [axes: FPR vs. TPR, each from 0 to 1]
Figure 8.6: Area under ROC curve (AUC) and AUC𝑝 [two panels; axes: FPR vs. TPR, each from 0 to 1]
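As a rough illustration of how a ROC curve and its AUC arise from a scatterplot of scores, the sketch below sweeps a threshold over two small score lists and integrates the resulting points with the trapezoidal rule. The scores are made up for illustration, the convention that higher scores indicate a match is an assumption, and the partial measure AUC𝑝 is not implemented here.

```python
def roc_points(match_scores, nomatch_scores):
    """Return (FPR, TPR) pairs as the threshold sweeps from high to low."""
    thresholds = sorted(set(match_scores) | set(nomatch_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        # Everything scoring at or above the threshold is classified as a match.
        tpr = sum(s >= t for s in match_scores) / len(match_scores)
        fpr = sum(s >= t for s in nomatch_scores) / len(nomatch_scores)
        points.append((fpr, tpr))
    return points


def auc(points):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical scores, not taken from the book's experiments.
MATCH = [0.9, 0.8, 0.6, 0.4]
NOMATCH = [0.7, 0.5, 0.3, 0.2]

print(auc(roc_points(MATCH, NOMATCH)))  # 0.8125
```

For these scores the AUC equals the fraction of (match, nomatch) pairs in which the match score is higher: 13 of 16.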
            Malware   Not malware
High score  TP = 99   FP = 1998
Low score   FN = 1    TN = 97,902
Figure 8.7: Confusion matrix
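The counts in Figure 8.7 make a useful worked example of why accuracy and TPR can look excellent while precision is poor on imbalanced data. The sketch below uses the standard definitions of these rates; the function name is hypothetical.

```python
# Counts taken from Figure 8.7.
TP, FP, FN, TN = 99, 1998, 1, 97_902


def rates(tp, fp, fn, tn):
    """Compute standard rates from confusion-matrix counts."""
    return {
        "tpr": tp / (tp + fn),               # recall (true positive rate)
        "fpr": fp / (fp + tn),               # false positive rate
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


for name, value in rates(TP, FP, FN, TN).items():
    print(f"{name}: {value:.4f}")
```

With these counts the TPR is 0.99, the FPR is 0.02, and the accuracy is about 0.98, yet the precision is only about 0.047, because the 1998 false positives far outnumber the 99 true positives.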
Figure 8.8: Scatterplot and a point on the PR curve [axes: Recall vs. Precision, each from 0 to 1]
Figure 8.9: PR curve example [axes: Recall vs. Precision, each from 0 to 1]
Problem 8.7
[Scatterplot; axes: Experiment vs. Score; legend: Match scores, Nomatch scores]
Chapter 9
Figure 9.1: NGVCK vs benign [axes: Sample vs. HMM score; legend: Malware, Benign]
letters A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
A 0.02 0.19 0.38 0.33 0.01 0.10 0.19 0.04 0.29 0.01 0.09 0.85 0.29 1.59 0.02 0.17 0.00 0.93 0.82 1.19 0.10 0.18 0.08 0.02 0.26 0.01
B 0.15 0.01 0.00 0.00 0.48 0.00 0.00 0.00 0.10 0.01 0.00 0.19 0.00 0.00 0.18 0.00 0.00 0.09 0.03 0.01 0.17 0.00 0.00 0.00 0.13 0.00
C 0.43 0.01 0.06 0.01 0.49 0.00 0.00 0.52 0.21 0.00 0.12 0.12 0.01 0.00 0.64 0.01 0.00 0.12 0.03 0.30 0.09 0.00 0.01 0.00 0.03 0.00
D 0.39 0.14 0.08 0.08 0.63 0.09 0.06 0.10 0.47 0.02 0.01 0.07 0.11 0.06 0.30 0.07 0.01 0.13 0.23 0.38 0.12 0.03 0.11 0.00 0.06 0.00
E 1.00 0.22 0.61 1.02 0.46 0.30 0.18 0.19 0.39 0.03 0.06 0.53 0.48 1.25 0.33 0.36 0.04 1.75 1.36 0.77 0.09 0.24 0.36 0.14 0.15 0.01
F 0.23 0.02 0.05 0.02 0.19 0.13 0.02 0.04 0.25 0.01 0.00 0.06 0.04 0.02 0.42 0.03 0.00 0.18 0.05 0.36 0.08 0.00 0.03 0.00 0.01 0.00
G 0.21 0.02 0.03 0.02 0.31 0.03 0.03 0.22 0.18 0.00 0.00 0.06 0.03 0.06 0.19 0.02 0.00 0.19 0.07 0.15 0.06 0.00 0.03 0.00 0.01 0.00
H 0.84 0.02 0.04 0.01 2.42 0.02 0.01 0.03 0.66 0.00 0.00 0.02 0.03 0.04 0.48 0.02 0.00 0.10 0.05 0.21 0.08 0.01 0.04 0.00 0.03 0.00
I 0.23 0.08 0.56 0.27 0.30 0.13 0.21 0.01 0.01 0.00 0.04 0.40 0.22 1.81 0.56 0.07 0.01 0.26 0.94 0.89 0.01 0.22 0.01 0.01 0.00 0.05
J 0.03 0.00 0.00 0.00 0.03 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.06 0.00 0.00 0.00 0.00 0.00 0.06 0.00 0.00 0.00 0.00 0.00
K 0.04 0.01 0.01 0.01 0.21 0.01 0.00 0.02 0.09 0.00 0.00 0.02 0.01 0.04 0.04 0.01 0.00 0.01 0.06 0.03 0.00 0.00 0.02 0.00 0.01 0.00
L 0.50 0.06 0.06 0.27 0.70 0.07 0.02 0.03 0.54 0.00 0.02 0.58 0.06 0.02 0.33 0.06 0.00 0.04 0.17 0.15 0.10 0.03 0.04 0.00 0.36 0.00
M 0.50 0.11 0.02 0.01 0.63 0.01 0.00 0.02 0.30 0.00 0.00 0.01 0.11 0.01 0.32 0.17 0.00 0.07 0.09 0.07 0.11 0.00 0.02 0.00 0.04 0.00
N 0.53 0.08 0.38 1.03 0.64 0.11 0.80 0.10 0.44 0.02 0.05 0.09 0.09 0.13 0.49 0.07 0.01 0.05 0.50 1.20 0.09 0.05 0.12 0.00 0.10 0.00
O 0.14 0.14 0.16 0.18 0.06 0.86 0.09 0.07 0.10 0.01 0.06 0.34 0.49 1.36 0.22 0.22 0.00 1.04 0.31 0.46 0.70 0.17 0.30 0.01 0.04 0.00
P 0.27 0.01 0.00 0.00 0.35 0.01 0.00 0.07 0.12 0.00 0.00 0.20 0.02 0.00 0.28 0.11 0.00 0.36 0.05 0.09 0.08 0.00 0.01 0.00 0.01 0.00
Q 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.10 0.00 0.00 0.00 0.00 0.00
R 0.65 0.06 0.16 0.21 1.43 0.08 0.10 0.08 0.64 0.01 0.10 0.13 0.19 0.17 0.64 0.09 0.00 0.12 0.50 0.51 0.13 0.06 0.08 0.00 0.20 0.00
S 0.66 0.14 0.24 0.09 0.73 0.13 0.05 0.36 0.65 0.02 0.05 0.12 0.16 0.10 0.56 0.26 0.01 0.07 0.48 1.33 0.24 0.02 0.21 0.00 0.05 0.00
T 0.63 0.09 0.11 0.05 0.97 0.08 0.03 2.89 1.09 0.01 0.01 0.14 0.10 0.05 1.03 0.07 0.00 0.35 0.40 0.49 0.20 0.01 0.19 0.00 0.20 0.00
U 0.09 0.08 0.13 0.08 0.11 0.02 0.10 0.00 0.07 0.00 0.00 0.26 0.10 0.37 0.01 0.11 0.00 0.38 0.37 0.34 0.00 0.00 0.00 0.00 0.01 0.00
V 0.09 0.00 0.00 0.00 0.65 0.00 0.00 0.00 0.22 0.00 0.00 0.00 0.00 0.00 0.05 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
W 0.32 0.01 0.01 0.01 0.31 0.01 0.00 0.32 0.33 0.00 0.00 0.01 0.02 0.06 0.21 0.01 0.00 0.03 0.04 0.03 0.00 0.00 0.01 0.00 0.02 0.00
X 0.02 0.00 0.02 0.00 0.01 0.00 0.00 0.00 0.02 0.00 0.00 0.00 0.00 0.00 0.01 0.05 0.00 0.00 0.00 0.03 0.00 0.00 0.00 0.00 0.00 0.00
Y 0.19 0.07 0.07 0.05 0.14 0.06 0.02 0.07 0.12 0.01 0.01 0.04 0.07 0.04 0.18 0.05 0.00 0.04 0.17 0.20 0.01 0.01 0.09 0.00 0.01 0.00
Z 0.02 0.00 0.00 0.00 0.04 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01
Figure 9.2: English digraph relative frequencies (as percentages)
Figure 9.3: Jakobsen’s algorithm [axes: Ciphertext length vs. Accuracy (percentage); legend: key, data]
Figure 9.4: Accuracy vs data size [axes: Ciphertext length vs. Accuracy; legend: 1 start, 10 restarts, 10² restarts, 10³ restarts, 10⁴ restarts, 10⁵ restarts]
Figure 9.5: Jakobsen’s algorithm vs HMM [axes: Ciphertext length vs. Accuracy; legend: HMM (10⁵ restarts), Jakobsen’s]
Figure 9.6: Accuracy vs data size vs restarts (200 iterations) [axes: Ciphertext length, Restarts, Accuracy]
Figure 9.7: Accuracy vs restarts vs data size (200 iterations) [axes: Restarts (10¹ to 10⁵), Ciphertext length, Accuracy]
Chapter 10
Figure 10.1: Comparison of HMM and weighted 𝑛-grams [axes: False positive rate vs. True positive rate; legend: 1-gram, 2-gram, 3-gram, HMM]
Figure 10.2: Detection results for PHMM [axes: False positive rate vs. True positive rate; legend: 4 sequences, 5 sequences, 10 sequences, 20 sequences, 50 sequences]
Figure 10.3: Comparison of PHMM, HMM, and 3-gram scores [axes: False positive rate vs. True positive rate; legend: PHMM, HMM, 3-gram]
Figure 10.4: HMM vs PHMM based on simulated data [axes: False positive rate vs. True positive rate; legend: PHMM, HMM]
Figure 10.5: HMM vs PHMM with limited training data [axes: False positive rate vs. True positive rate; legend: PHMM 200, PHMM 400, PHMM 800, HMM 200, HMM 400, HMM 800]
Figure 10.6: Results based on limited synthetic data [categories: 200 commands, 400 commands, 800 commands; legend: PHMM AUC_0.1, PHMM AUC, HMM AUC_0.1, HMM AUC]
(a) Scatterplot for static case (b) Scatterplot for dynamic case
(c) ROC curve for static case (d) ROC curve for dynamic case
Figure 10.7: Security Shield HMM results [scatterplots: axis Score, legend Malware, Benign; ROC curves: axes False Positive Rate vs. True Positive Rate]
Figure 10.8: PHMM vs HMMs [categories: Cridex, Harebot, SecurityShield, SmartHDD, Winwebsec, Zbot, ZeroAccess; axis: AUC; legend: PHMM, HMM (dynamic), HMM (static)]
Chapter 11
Figure 11.1: Training images
Figure 11.2: Eigenfaces of images in Figure 11.1
Figure 11.3: Graph of AUC for MWOR [axes: Padding ratio vs. AUC]
Figure 11.4: AUC comparison for MWOR [axes: Padding ratio vs. AUC; legend: SVD, HMM, SSD]
Figure 11.5: Image spam
Figure 11.6: Spam images from standard dataset
Figure 11.7: Projections onto eigenspace for images in Figure 11.6
[Plot axes: Number of eigenvalues (1 to 500), AUC (0.75 to 1.00)]
Figure 11.8: AUC for different numbers of eigenvalues (standard dataset)
Figure 11.9: Examples of improved spam images
Chapter 12
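[Graph nodes: ADD, CALL, JMP, NOP, SUB. Edge weights appearing in the figure: 1/6, 1/5, 1/4, 1/3, 2/5, 1/2 (the assignment of weights to edges is not recoverable from the extracted text)]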
Figure 12.1: Opcode graph
[Panels: (a) HMM, (b) OGS, (c) SSD, (d) SVM. Axis label: Score. Legend: Malware, Benign]
Figure 12.2: NGVCK score scatterplots
[Plot of AUC by SVM kernel. Kernels: Linear, Polynomial, Neural, Radial. AUC values in extraction order: 0.92, 0.86, 0.85, 1.00]
Figure 12.3: Comparison of SVM kernels (NGVCK at 80% morphing)
[Plot axes: Morphing percentage (0 to 120), AUC (0.50 to 1.00). Legend: HMM, OGS, SSD, SVM]
Figure 12.4: AUC at various morphing percentages (NGVCK)
[Plot of AUC by malware family. Families: Harebot, SecurityShield, SmartHDD, Winwebsec, Zbot, Zeroaccess. Legend: HMM, OGS, SSD, SVM]
Figure 12.5: AUC comparisons for Malicia families
[Panels: (a) Winwebsec, (b) Zeroaccess, (c) Zbot, (d) Harebot, (e) Security Shield, (f) Smart HDD. Axes in each panel: Morphing percentage (0 to 140), AUC (0.10 to 1.00). Legend: HMM, OGS, SSD, SVM]
Figure 12.6: AUC comparison for morphed Malicia families
(a) Ham image (b) Grayscale
(c) Canny edges (d) HOG
Figure 12.7: Features of a ham image
(a) Spam image (b) Grayscale
(c) Canny edges (d) HOG
Figure 12.8: Spam image feature extraction
[Panels: (a) Signal to noise ratio (axes: SNR, Frequency), (b) Compression ratio (axes: Compression ratio, Frequency), (c) LBP (axes: Entropy of LBP, Frequency), (d) Edge count (axes: Edge count, Frequency). Legend: Ham, Spam]
Figure 12.9: Ham and spam distributions for standard dataset
[Plot of AUC by feature. Features: Comp, Aspect, Edges, Edgelen, SNR, Noise, LBP, Color, HOG, Mean1, Mean2, Mean3, Variance1, Variance2, Variance3, Skew1, Skew2, Skew3, Kurtosis1, Kurtosis2, Kurtosis3. AUC values in extraction order: 0.95, 0.77, 0.87, 0.65, 0.95, 0.63, 0.85, 0.58, 0.81, 0.50, 0.50, 0.50, 0.96, 0.98, 0.97, 0.91, 0.95, 0.94, 0.92, 0.94, 0.94]
Figure 12.10: AUC for individual features
[Plot axes: Number of features selected (1 to 21), AUC (0.94 to 1.00), Accuracy (0.92 to 0.98). Legend: AUC, Accuracy]
Figure 12.11: RFE results for standard dataset
[Plot of SVM weight (0.0 to 1.0) by feature. Features: Comp, Aspect, Edges, Edgelen, SNR, Noise, LBP, Color, HOG, Mean1, Mean2, Mean3, Variance1, Variance2, Variance3, Skew1, Skew2, Skew3, Kurtosis1, Kurtosis2, Kurtosis3]
Figure 12.12: Linear SVM weights for standard dataset
[Plot axes: Number of features (1 to 21), AUC (0.94 to 1.00). Legend: RFE features, Ranked features]
Figure 12.13: RFE vs ranked features
[Panels: (a) Signal to noise ratio (axes: SNR, Frequency), (b) Compression ratio (axes: Compression ratio, Frequency), (c) LBP (axes: Entropy of LBP, Frequency), (d) Edge count (axes: Edge count, Frequency). Legend: Ham, Spam]
Figure 12.14: Ham and spam distributions for improved dataset
[Plot of AUC (0.0 to 1.0) by feature. Features: Comp, Aspect, Edges, Edgelen, SNR, Noise, LBP, Color, HOG, Mean1, Mean2, Mean3, Variance1, Variance2, Variance3, Skew1, Skew2, Skew3, Kurtosis1, Kurtosis2, Kurtosis3. Legend: Standard dataset, Improved dataset]
Figure 12.15: Comparison of standard and improved datasets
Chapter 13
Combination k=2 k=3 k=4 k=5 k=6 k=7 k=8 k=9 k=10 k=11 k=12 k=13 k=14 k=15
0000001 0.6714 0.6779 0.6822 0.6954 0.6959 0.6965 0.7001 0.7169 0.7245 0.7245 0.7219 0.7264 0.7304 0.7304
0000011 0.6132 0.6399 0.6559 0.6569 0.6613 0.6814 0.6823 0.6863 0.6863 0.7020 0.7054 0.7218 0.7306 0.7306
0000010 0.5650 0.6523 0.6849 0.6849 0.6958 0.6809 0.7160 0.7155 0.7250 0.7248 0.7256 0.7337 0.7450 0.7538
0000110 0.5591 0.5748 0.5745 0.5930 0.5961 0.5962 0.5962 0.5967 0.5967 0.5991 0.6036 0.6036 0.6011 0.6044
0000111 0.5657 0.6808 0.6874 0.6878 0.6803 0.7002 0.7015 0.7172 0.7172 0.7248 0.7255 0.7969 0.7359 0.8020
0000101 0.5598 0.5604 0.6157 0.5607 0.6387 0.6389 0.6444 0.6359 0.6430 0.6427 0.6530 0.6498 0.6610 0.6664
0000100 0.5656 0.6618 0.6668 0.6604 0.6768 0.6900 0.7017 0.7018 0.7213 0.7263 0.7018 0.7190 0.7336 0.7120
0001100 0.6769 0.6981 0.7165 0.7041 0.7715 0.7606 0.7683 0.7600 0.7636 0.7768 0.7718 0.7707 0.7703 0.7766
0001101 0.6767 0.6841 0.6836 0.7818 0.7774 0.7775 0.7753 0.7779 0.7759 0.7357 0.7755 0.7357 0.7510 0.7383
0001111 0.6779 0.6965 0.6842 0.7783 0.7110 0.7115 0.7686 0.7047 0.7712 0.7710 0.7703 0.7757 0.7794 0.7837
0001110 0.6769 0.6838 0.6836 0.7575 0.7101 0.7844 0.7689 0.7773 0.7683 0.7823 0.7719 0.7824 0.7727 0.7830
0001010 0.6769 0.6844 0.6788 0.7061 0.7737 0.7058 0.7733 0.7747 0.7727 0.7782 0.7746 0.7746 0.7702 0.7689
0001011 0.6768 0.6844 0.7637 0.7042 0.7838 0.7047 0.7837 0.7834 0.7834 0.7784 0.7823 0.7774 0.7812 0.7800
0001001 0.6777 0.6854 0.6858 0.7120 0.7879 0.7061 0.7818 0.7091 0.7842 0.7848 0.7850 0.7865 0.7798 0.7851
0001000 0.6763 0.6836 0.7644 0.6850 0.7871 0.7058 0.7852 0.7101 0.7862 0.7873 0.7833 0.7875 0.7724 0.7875
0011000 0.5591 0.5809 0.5606 0.5604 0.6394 0.6401 0.6009 0.6103 0.6103 0.6532 0.6531 0.6155 0.6535 0.6531
0011001 0.6768 0.6774 0.6837 0.6915 0.7047 0.7049 0.7115 0.7134 0.7068 0.7079 0.7356 0.7347 0.7052 0.7346
0011011 0.5589 0.6125 0.6073 0.5590 0.6446 0.6448 0.6467 0.6182 0.6398 0.7019 0.6269 0.6478 0.6928 0.7633
0011010 0.5650 0.6630 0.6694 0.6806 0.6899 0.6900 0.7005 0.7006 0.7032 0.7005 0.7224 0.7224 0.7068 0.7213
0011110 0.5591 0.5718 0.5626 0.5630 0.5830 0.5834 0.5834 0.5936 0.5936 0.5936 0.6117 0.5931 0.6009 0.6022
0011111 0.5657 0.6796 0.6894 0.6927 0.7052 0.7138 0.7102 0.7149 0.7156 0.7329 0.7073 0.7166 0.7382 0.7360
0011101 0.5589 0.5602 0.6034 0.5871 0.5872 0.6415 0.6415 0.6467 0.6515 0.6515 0.7011 0.6472 0.7020 0.7019
0011100 0.5656 0.6640 0.6705 0.6700 0.6918 0.6920 0.6920 0.7031 0.6958 0.7095 0.7093 0.7105 0.7146 0.7106
0010100 0.6768 0.6842 0.7036 0.7109 0.7798 0.6995 0.7710 0.7727 0.7712 0.7787 0.7771 0.7734 0.7784 0.7770
0010101 0.6768 0.6844 0.7652 0.7049 0.7850 0.7042 0.7841 0.7150 0.7841 0.7096 0.7779 0.7201 0.7792 0.7129
0010111 0.6777 0.6851 0.6772 0.7120 0.7884 0.7056 0.7815 0.7087 0.7847 0.7801 0.7846 0.7844 0.7887 0.7844
0010110 0.6759 0.6827 0.7657 0.6849 0.7873 0.7050 0.7851 0.7088 0.7847 0.7161 0.7862 0.7741 0.7784 0.7879
0010010 0.6768 0.6774 0.6847 0.7043 0.7001 0.7996 0.7000 0.7847 0.7141 0.7839 0.7939 0.7999 0.7939 0.7759
0010011 0.6767 0.6840 0.6845 0.7045 0.7092 0.7924 0.7001 0.7879 0.7128 0.7873 0.7119 0.8015 0.7834 0.7791
0010001 0.6773 0.6779 0.6854 0.7068 0.7068 0.8048 0.7008 0.7905 0.7084 0.7916 0.8038 0.8102 0.8032 0.7511
0010000 0.6754 0.6820 0.6820 0.7050 0.7058 0.7878 0.7059 0.7880 0.7174 0.7917 0.7178 0.7917 0.7879 0.7812
0110000 0.5658 0.6808 0.6940 0.7125 0.7125 0.7149 0.7104 0.7142 0.7142 0.7011 0.7156 0.7141 0.7014 0.7152
0110001 0.5658 0.6800 0.6837 0.6961 0.7195 0.7166 0.7181 0.7202 0.7322 0.7309 0.7373 0.7319 0.7279 0.7301
0110011 0.5656 0.6574 0.6847 0.6805 0.7046 0.7047 0.7077 0.7084 0.7120 0.7242 0.7231 0.7254 0.7266 0.7370
0110010 0.5658 0.6797 0.6849 0.6849 0.7069 0.7070 0.7143 0.7170 0.7160 0.7137 0.7301 0.7282 0.7346 0.7286
0110110 0.5658 0.6888 0.6895 0.7027 0.7169 0.7169 0.7160 0.7166 0.7168 0.7359 0.7168 0.7359 0.7154 0.7174
0110111 0.5658 0.6878 0.6886 0.6937 0.7022 0.7020 0.7156 0.7010 0.7155 0.7237 0.7445 0.7197 0.7154 0.7448
0110101 0.5658 0.6610 0.6617 0.6817 0.6888 0.6878 0.7005 0.7024 0.7105 0.7105 0.7184 0.7336 0.7177 0.7161
0110100 0.5658 0.6799 0.6805 0.6809 0.7017 0.7017 0.7018 0.7018 0.7187 0.7106 0.7264 0.7268 0.7346 0.7291
0111100 0.6765 0.6837 0.7099 0.7780 0.7730 0.7728 0.7741 0.7354 0.7741 0.7377 0.7797 0.7359 0.7794 0.7514
0111101 0.6767 0.6840 0.6837 0.7507 0.7346 0.7511 0.7331 0.7612 0.7374 0.7776 0.7373 0.7796 0.7315 0.7779
0111111 0.6774 0.6846 0.6837 0.7848 0.7105 0.7783 0.7727 0.7815 0.7755 0.7814 0.7753 0.7826 0.7773 0.7829
0111110 0.6760 0.6833 0.6799 0.6928 0.7055 0.7143 0.7339 0.7333 0.7332 0.7764 0.7377 0.7826 0.7429 0.7824
0111010 0.6765 0.6837 0.6900 0.7010 0.7716 0.7002 0.7703 0.7161 0.7755 0.7190 0.7730 0.7751 0.7757 0.7861
0111011 0.6767 0.6835 0.7551 0.7045 0.7727 0.7029 0.7732 0.7193 0.7778 0.7555 0.7857 0.7860 0.7815 0.7860
0111001 0.6770 0.6842 0.6837 0.7065 0.7821 0.7058 0.7820 0.7158 0.7839 0.7839 0.7839 0.7853 0.7778 0.7837
0111000 0.5656 0.6824 0.6840 0.7047 0.7055 0.7056 0.7059 0.7161 0.7803 0.7723 0.7719 0.7776 0.7234 0.7709
0101000 0.5658 0.6904 0.6911 0.7101 0.7102 0.7158 0.7019 0.7155 0.7173 0.7152 0.7166 0.7106 0.7106 0.7127
0101001 0.5658 0.6883 0.6891 0.6809 0.7015 0.7140 0.7140 0.7014 0.7140 0.7111 0.7143 0.7087 0.7223 0.7300
0101011 0.5658 0.6760 0.6767 0.6992 0.6992 0.6981 0.7001 0.7022 0.7081 0.7081 0.7102 0.7261 0.7166 0.7283
0101010 0.5658 0.6796 0.6796 0.6808 0.7006 0.7095 0.7008 0.7008 0.7142 0.7090 0.7209 0.7190 0.7213 0.7266
0101110 0.5659 0.6870 0.6941 0.7115 0.7116 0.7106 0.7168 0.7169 0.7169 0.7593 0.7169 0.7165 0.7587 0.7158
0101111 0.5658 0.6886 0.6891 0.6933 0.7149 0.7129 0.7154 0.7220 0.7222 0.7387 0.7183 0.7387 0.7388 0.7457
0101101 0.5658 0.6719 0.6787 0.7002 0.7002 0.7002 0.7004 0.7010 0.7095 0.7086 0.7109 0.7129 0.7255 0.7175
0101100 0.5658 0.6800 0.6808 0.6822 0.7018 0.7017 0.7161 0.7027 0.7110 0.7257 0.7114 0.7257 0.7275 0.7278
0100100 0.6765 0.6847 0.6786 0.7040 0.7796 0.6999 0.7747 0.7064 0.7751 0.7059 0.7924 0.7123 0.7798 0.7224
0100101 0.6763 0.6835 0.6920 0.7040 0.7812 0.7026 0.7814 0.7164 0.7807 0.7152 0.7794 0.7155 0.7815 0.7714
0100111 0.6770 0.6842 0.6837 0.7116 0.7869 0.7059 0.7820 0.7115 0.7852 0.7122 0.7851 0.7106 0.7816 0.7424
0100110 0.6753 0.6804 0.6836 0.7038 0.7042 0.7046 0.7727 0.7068 0.7735 0.7169 0.7816 0.7734 0.7789 0.7761
0100010 0.5658 0.6850 0.6855 0.7047 0.6991 0.7114 0.6996 0.7785 0.7134 0.7816 0.7143 0.7843 0.7952 0.7773
0100011 0.5656 0.6844 0.6849 0.7046 0.7147 0.7835 0.7004 0.7837 0.7015 0.7979 0.7127 0.7943 0.7961 0.7874
0100001 0.5658 0.6833 0.6840 0.7063 0.7008 0.7159 0.7006 0.7787 0.7083 0.7825 0.7149 0.7885 0.7848 0.7202
0100000 0.5656 0.6804 0.6811 0.7040 0.7004 0.7794 0.7011 0.7793 0.7150 0.7853 0.7168 0.7917 0.7844 0.7759
1100000 0.6767 0.6838 0.7002 0.7764 0.7721 0.7669 0.7643 0.7953 0.7964 0.7950 0.7988 0.7983 0.7803 0.7807
1100001 0.6759 0.6832 0.6799 0.7615 0.7784 0.7816 0.7624 0.7682 0.7650 0.7694 0.7603 0.7696 0.7605 0.7706
1100011 0.6713 0.6750 0.6836 0.6845 0.7078 0.7791 0.7753 0.7753 0.7739 0.7828 0.7741 0.7766 0.7769 0.7792
1100010 0.6722 0.6796 0.6838 0.6799 0.7041 0.7818 0.7812 0.7636 0.7619 0.7706 0.7703 0.7732 0.7793 0.7720
1100110 0.6767 0.6850 0.6856 0.7051 0.6996 0.7969 0.7001 0.7848 0.7688 0.7825 0.8015 0.7805 0.8005 0.7833
1100111 0.5656 0.6814 0.6805 0.6849 0.7042 0.7014 0.7058 0.7974 0.7055 0.7965 0.7134 0.8037 0.7956 0.7693
1100101 0.5658 0.6769 0.6776 0.7000 0.7054 0.7054 0.7031 0.8070 0.7837 0.7735 0.8029 0.7891 0.8212 0.7693
1100100 0.5656 0.6796 0.6842 0.6849 0.7834 0.7022 0.7069 0.7756 0.7058 0.7797 0.7168 0.7800 0.7806 0.7701
1101100 0.6767 0.6838 0.7102 0.7123 0.7829 0.7759 0.7753 0.7756 0.7756 0.7824 0.7888 0.7706 0.7934 0.7803
1101101 0.6768 0.6841 0.6837 0.7847 0.7814 0.7821 0.7768 0.7766 0.7771 0.7697 0.7773 0.7701 0.7816 0.7734
1101111 0.6774 0.6947 0.6837 0.7887 0.7861 0.7054 0.7796 0.7783 0.7853 0.7784 0.7943 0.7962 0.7762 0.8012
1101110 0.6772 0.6844 0.6836 0.7924 0.7830 0.7830 0.7765 0.7837 0.7760 0.7801 0.7798 0.7847 0.7889 0.7806
1101010 0.6767 0.6842 0.7052 0.7834 0.7278 0.7824 0.7762 0.7734 0.7750 0.7851 0.7818 0.7761 0.7748 0.7784
1101011 0.6768 0.6844 0.6851 0.7045 0.7867 0.7842 0.7883 0.7737 0.7725 0.7803 0.7723 0.7802 0.7757 0.7655
1101001 0.6774 0.6847 0.6837 0.7120 0.7928 0.7842 0.7839 0.7867 0.7894 0.7797 0.7843 0.7793 0.7842 0.7782
1101000 0.6772 0.6844 0.6806 0.7065 0.7885 0.7859 0.7871 0.7851 0.7764 0.7814 0.7780 0.7885 0.7928 0.7859
1111000 0.6776 0.6850 0.6855 0.7056 0.7073 0.7002 0.7032 0.7956 0.7689 0.7984 0.7009 0.7984 0.6997 0.7809
1111001 0.6738 0.6815 0.6801 0.7029 0.7009 0.7008 0.7989 0.7925 0.7983 0.7983 0.7675 0.7967 0.7104 0.7741
1111011 0.6714 0.6782 0.6788 0.7013 0.7050 0.7050 0.8012 0.7095 0.8044 0.8062 0.7084 0.7223 0.7155 0.7218
1111010 0.5652 0.6795 0.6801 0.6808 0.7029 0.7037 0.7873 0.7812 0.7924 0.7917 0.7700 0.7911 0.7714 0.7869
1111110 0.5658 0.6776 0.6849 0.7059 0.6992 0.6993 0.7008 0.7152 0.8160 0.7005 0.8152 0.8152 0.8019 0.8157
1111111 0.5656 0.6799 0.6804 0.6846 0.7145 0.8040 0.7013 0.7152 0.8114 0.7149 0.8112 0.7278 0.8112 0.8111
1111101 0.5658 0.6722 0.6796 0.7010 0.7010 0.7002 0.7059 0.7047 0.7997 0.7042 0.8023 0.7181 0.7125 0.8062
1111100 0.5656 0.6797 0.6804 0.6847 0.7059 0.7948 0.7014 0.7019 0.7142 0.7069 0.8033 0.7150 0.7967 0.8011
1110100 0.6767 0.6841 0.7052 0.7869 0.7074 0.7815 0.7768 0.7732 0.7774 0.7770 0.7796 0.7794 0.7673 0.7792
1110101 0.6768 0.6844 0.6847 0.7045 0.7871 0.7844 0.7888 0.7843 0.7885 0.7778 0.7775 0.7778 0.7773 0.7771
1110111 0.6774 0.6846 0.6837 0.7120 0.7124 0.7834 0.7775 0.7843 0.7811 0.7810 0.7811 0.7814 0.7810 0.7812
1110110 0.6770 0.6841 0.6799 0.7060 0.7885 0.7855 0.7894 0.7848 0.7887 0.7768 0.7814 0.7848 0.7885 0.7842
1110010 0.6767 0.6841 0.6846 0.7049 0.7942 0.6997 0.7938 0.6996 0.7802 0.7802 0.7796 0.7802 0.7770 0.7796
1110011 0.6767 0.6844 0.7747 0.7045 0.7891 0.7001 0.7819 0.7055 0.7860 0.7848 0.7800 0.7810 0.7776 0.7805
1110001 0.6774 0.6849 0.6855 0.7068 0.7898 0.7060 0.7883 0.6996 0.7893 0.7996 0.7848 0.7864 0.7801 0.7884
1110000 0.6768 0.6841 0.6847 0.7061 0.7941 0.7055 0.7879 0.7027 0.7855 0.7888 0.7802 0.7888 0.7843 0.7853
1010000 0.6776 0.6844 0.7049 0.7702 0.7598 0.7735 0.7591 0.7721 0.7624 0.7489 0.7807 0.7538 0.7792 0.7524
1010001 0.6750 0.6809 0.6814 0.7020 0.7020 0.7571 0.7589 0.7574 0.7614 0.7614 0.7611 0.7637 0.7551 0.7637
1010011 0.6717 0.6788 0.6836 0.7046 0.7045 0.7086 0.7105 0.7136 0.7710 0.7643 0.7588 0.7756 0.7596 0.7671
1010010 0.6723 0.6794 0.6795 0.6799 0.7005 0.7074 0.7511 0.7628 0.7568 0.7574 0.7674 0.7560 0.7686 0.7696
1010110 0.5658 0.6841 0.7524 0.7055 0.7125 0.7047 0.7136 0.7721 0.7687 0.7748 0.7213 0.7698 0.7545 0.7747
1010111 0.5658 0.6801 0.6803 0.7041 0.7036 0.7061 0.7067 0.7538 0.7815 0.7815 0.7175 0.7818 0.7175 0.7687
1010101 0.5658 0.6788 0.6840 0.7006 0.7058 0.7055 0.7020 0.7775 0.7101 0.7718 0.7609 0.7807 0.7915 0.7694
1010100 0.5658 0.6794 0.6813 0.6815 0.7033 0.7061 0.7058 0.7058 0.7105 0.7138 0.7140 0.7128 0.7191 0.7742
1011100 0.6765 0.6838 0.7050 0.7775 0.7721 0.7733 0.7725 0.7747 0.7719 0.7891 0.7769 0.7730 0.7724 0.7974
1011101 0.6767 0.6840 0.7042 0.7794 0.7691 0.7787 0.7729 0.7820 0.7770 0.7814 0.7770 0.7783 0.7776 0.7534
1011111 0.6774 0.6846 0.6846 0.7867 0.7843 0.7787 0.7743 0.7759 0.7782 0.7774 0.7783 0.7551 0.7958 0.7956
1011110 0.6770 0.6841 0.6837 0.7814 0.7756 0.7756 0.7756 0.7748 0.7768 0.7773 0.7785 0.7855 0.7762 0.7818
1011010 0.6767 0.6838 0.6800 0.7004 0.7773 0.7798 0.7805 0.7738 0.7785 0.7732 0.7743 0.7766 0.7784 0.7816
1011011 0.6767 0.6838 0.7574 0.7040 0.7824 0.7010 0.7856 0.7721 0.7847 0.7733 0.7756 0.7138 0.7756 0.7618
1011001 0.6774 0.6846 0.6837 0.7061 0.7844 0.7807 0.7828 0.7805 0.7788 0.7773 0.7819 0.7652 0.7823 0.7669
1011000 0.6768 0.6841 0.7559 0.7055 0.7851 0.7825 0.7724 0.7814 0.7793 0.7101 0.7803 0.7702 0.7857 0.7759
1001000 0.5658 0.6837 0.6783 0.7050 0.7125 0.7038 0.7050 0.7210 0.7765 0.7869 0.7727 0.7768 0.7091 0.7727
1001001 0.5658 0.6794 0.6817 0.7018 0.7029 0.7063 0.7830 0.7054 0.7823 0.7820 0.7714 0.7824 0.7170 0.7715
1001011 0.5658 0.6790 0.6842 0.7005 0.6990 0.7059 0.7844 0.7092 0.7097 0.7885 0.7727 0.7885 0.7646 0.7727
1001010 0.5658 0.6795 0.6818 0.7004 0.7033 0.7061 0.7768 0.7056 0.7159 0.7159 0.7718 0.7161 0.7748 0.7779
1001110 0.5658 0.6833 0.6838 0.7049 0.7001 0.7987 0.7008 0.7143 0.7277 0.7010 0.8158 0.7175 0.8011 0.7138
1001111 0.5658 0.6799 0.6804 0.7026 0.7024 0.7065 0.7068 0.7068 0.7058 0.7145 0.8025 0.7265 0.8114 0.7355
1001101 0.5658 0.6790 0.6796 0.7009 0.7001 0.7938 0.7010 0.7042 0.7110 0.7096 0.7213 0.7179 0.7186 0.7170
1001100 0.5658 0.6795 0.6801 0.6800 0.7010 0.7037 0.7011 0.7069 0.7102 0.7197 0.7915 0.7049 0.7997 0.7165
1000100 0.6767 0.6838 0.7001 0.7766 0.7019 0.7803 0.7748 0.7727 0.7789 0.7759 0.7810 0.7773 0.7787 0.7821
1000101 0.6767 0.6838 0.6859 0.7043 0.7832 0.7812 0.7823 0.7741 0.7856 0.7759 0.7771 0.7756 0.7771 0.7789
1000111 0.6774 0.6846 0.6837 0.7061 0.7841 0.7824 0.7826 0.7818 0.7802 0.7766 0.7830 0.7797 0.7832 0.7843
1000110 0.6768 0.6833 0.6799 0.7056 0.7860 0.7829 0.7834 0.7820 0.7873 0.7757 0.7861 0.7839 0.7869 0.7684
1000010 0.6767 0.6850 0.6854 0.7050 0.7825 0.7046 0.7770 0.6997 0.7818 0.7771 0.7801 0.7821 0.7756 0.7785
1000011 0.6767 0.6837 0.7666 0.7045 0.7834 0.6995 0.7800 0.7105 0.7776 0.7785 0.7778 0.7787 0.7800 0.7782
1000001 0.6770 0.6842 0.7647 0.7067 0.7866 0.7051 0.7847 0.7024 0.7832 0.7791 0.7824 0.7869 0.7760 0.7906
1000000 0.5656 0.6833 0.7655 0.7056 0.7054 0.7018 0.7009 0.7079 0.7819 0.7824 0.7033 0.7824 0.7104 0.7775
Figure 13.1: Heatmap of HMM scores (Gray code order)
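The row labels of Figure 13.1 are listed in Gray code order, so consecutive labels differ in exactly one bit. As a minimal sketch, assuming the ordering is the standard reflected binary Gray code over the nonzero 7-bit combinations (the function name `gray_code_labels` is hypothetical and not from the source), the label sequence can be generated as follows:

```python
def gray_code_labels(bits=7):
    """Return all nonzero bit strings of the given width in reflected Gray code order.

    Assumption: the i-th label is i XOR (i >> 1), the standard reflected
    binary Gray code; the all-zero combination is skipped.
    """
    return [format(i ^ (i >> 1), f"0{bits}b") for i in range(1, 2 ** bits)]


labels = gray_code_labels()
print(labels[:3])   # first three labels in the ordering
print(labels[-1])   # final label in the ordering
print(len(labels))  # number of nonzero 7-bit combinations
```

Under this assumption the sequence starts 0000001, 0000011, 0000010 and ends with 1000000, which agrees with the first and last row labels of the heatmap.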
[Panels: (a) 2-dimensional, (b) 3-dimensional, (c) 4-dimensional, (d) 5-dimensional. Axes in each panel: Clusters (2 to 10), Purity Score (0.00 to 1.00). Legend: 𝐾-means clustering, EM clustering]
Figure 13.2: Purity scores for EM and 𝐾-means clustering
[Panels: (a) 𝐾-means clustering, (b) EM clustering. Axes in each panel: Dimensions (2 to 5), Clusters (2 to 10), Purity Score (0.60 to 1.00)]
Figure 13.3: Clustering stem plots