machine learning speaker :chia-shing huang advisor :dr. kai-wei ke 2016/01/14 1

1

Machine Learning

Speaker: Chia-Shing Huang

Advisor: Dr. Kai-Wei Ke

2016/01/14

2

Outline

• Machine learning
• Decision tree
• Artificial neural network
• Conclusion

3

Machine Learning Definition

Field of study that gives computers the ability to learn without being explicitly programmed - Arthur Samuel

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E - Tom M. Mitchell

https://en.wikipedia.org/wiki/Arthur_Samuel

https://en.wikipedia.org/wiki/Tom_M._Mitchell

4

Simple Learning Flow

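[Figure: learning flow. An unknown target function generates the training examples; the learning algorithm A uses the training examples to pick a final hypothesis from the hypothesis set H.]

Unknown target function

Training examples

Learning algorithm A

Hypothesis set H

Final hypothesis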

5

Method

Supervised learning

Unsupervised learning

Semi-supervised learning

Reinforcement learning

6

Decision Tree

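[Figure: example decision tree for "Play game or not?". The root node tests "What time is it?" (< 19:00 / > 19:00); the next level tests "Has homework?" and "Has date?" (Y / N); each leaf is labelled true or false.]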

A decision tree is a flowchart-like structure in which each internal node represents a "test" on an attribute, each branch represents the outcome of the test, and each leaf node represents a class label.

The paths from root to leaf represent classification rules.

7

Classification and Regression Tree (CART)

Number of branches = 2 (binary tree)

Base hypothesis = optimal constant
• Binary/multiclass classification (0/1 error): majority of {y_n} (result)
• Regression (squared error): average of {y_n} (result)

Termination criteria = until forced to terminate
• All y_n the same
• All x_n the same

8

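Branching criteria = purifying

$b(\mathbf{x}) = \operatorname*{argmin}_{\text{decision stumps } h(\mathbf{x})} \sum_{c=1}^{2} |D_c \text{ with } h| \cdot \mathrm{impurity}(D_c \text{ with } h)$

$|D_c \text{ with } h|$: data rate in total data (each branch is weighted by its share of the data)

• for classification error: $\mathrm{impurity}(D) = \frac{1}{N} \sum_{n=1}^{N} \mathbf{1}[y_n \neq y^*]$ with $y^*$ = majority of $\{y_n\}$

• for regression error: $\mathrm{impurity}(D) = \frac{1}{N} \sum_{n=1}^{N} (y_n - \bar{y})^2$ with $\bar{y}$ = average of $\{y_n\}$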
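The branching step can be illustrated with a small Python sketch. It is only a minimal illustration under stated assumptions, not the implementation used in the talk: the function names (`classification_impurity`, `regression_impurity`, `best_stump`), the choice of midpoints as candidate thresholds, and the toy data are all hypothetical. Each candidate stump splits the data in two, and the stump with the lowest size-weighted impurity wins.

```python
from collections import Counter


def classification_impurity(labels):
    """0/1 error: fraction of labels that differ from the majority label."""
    if not labels:
        return 0.0
    majority_count = Counter(labels).most_common(1)[0][1]
    return 1.0 - majority_count / len(labels)


def regression_impurity(values):
    """Squared error: mean squared deviation from the average."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def best_stump(xs, ys, impurity):
    """Return (feature index, threshold, score) of the decision stump
    with the lowest size-weighted impurity, or None if no split exists."""
    best = None
    for j in range(len(xs[0])):
        values = sorted({x[j] for x in xs})
        # Assumption: candidate thresholds are midpoints between
        # consecutive distinct feature values.
        for lo, hi in zip(values, values[1:]):
            theta = (lo + hi) / 2
            left = [y for x, y in zip(xs, ys) if x[j] <= theta]
            right = [y for x, y in zip(xs, ys) if x[j] > theta]
            score = len(left) * impurity(left) + len(right) * impurity(right)
            if best is None or score < best[2]:
                best = (j, theta, score)
    return best


# Made-up toy data: feature 0 = hour of day, feature 1 = has homework (0/1).
xs = [(17, 0), (18, 1), (20, 0), (21, 1)]
ys = [1, 1, 0, 0]  # 1 = play game, 0 = do not play
feature, threshold, score = best_stump(xs, ys, classification_impurity)
print(feature, threshold, score)
```

When every x_n is identical there is no candidate threshold and `best_stump` returns `None`, which corresponds to the "All x_n the same" termination case.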

9

Simple Data Set

One more example

Let's play online! http://cn.akinator.com/

10

Artificial Neural Network (ANN)

Definition: Artificial neural networks (ANNs) are a family of models inspired by biological neural networks and are used to estimate or approximate functions that can depend on a large number of inputs and are generally unknown.

11

Single Neuron

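[Figure: single neuron. Inputs X0, X1, X2, X3, ..., Xn are multiplied by weights w0, w1, w2, w3, ..., wn and added in a SUM unit; a transform function F turns the sum into the output.]

Xi = nonlinear information (input)

Wi = weight of data features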

Perceptron Algorithm

$\mathrm{sum} = \sum_{i=0}^{n} x_i \cdot w_i$

$f(x) = \begin{cases} +1, & x > 0 \\ -1, & x < 0 \end{cases}$
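As a minimal sketch of the single neuron above, the Python snippet below computes the weighted sum and applies the sign-like transform function. The function names and the numbers are hypothetical. The slide leaves the case sum = 0 undefined, so mapping it to -1 here is an assumption.

```python
def weighted_sum(x, w):
    """sum = x_0*w_0 + x_1*w_1 + ... + x_n*w_n"""
    return sum(xi * wi for xi, wi in zip(x, w))


def transform(s):
    """+1 if s > 0, otherwise -1 (s == 0 is mapped to -1 by assumption)."""
    return 1 if s > 0 else -1


def neuron_output(x, w):
    """Output of a single perceptron-style neuron."""
    return transform(weighted_sum(x, w))


# Made-up example: x[0] = 1 is the constant input, so w[0] acts as a bias.
x = [1.0, 2.0, -1.0]
w = [-0.5, 1.0, 1.0]
print(weighted_sum(x, w), neuron_output(x, w))
```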

12

Hidden layer

$g_i(x)$: output of the $i$-th hidden neuron

13

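[Figure: inputs X0 = 1, X1, X2, ..., Xn feed two hidden neurons g1 and g2 through weights w1 and w2; an output neuron combines a constant +1 input, g1 and g2 into G(x).]

Output-neuron weights: -1 on the constant input, +1 on g1, +1 on g2

$G(x) = \mathrm{sign}(-1 + g_1(x) + g_2(x))$

• $g_1(x) = g_2(x) = +1$: $G(x) = +1$

• Otherwise: $G(x) = -1$

$G(x) \equiv g_1(x) \text{ AND } g_2(x)$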
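The AND combination can be checked with a few lines of Python. This is a sketch that assumes the three values -1, +1, +1 on the slide are the output neuron's weights for the constant input, g1 and g2 respectively; the names `sign` and `g_and` are hypothetical.

```python
def sign(s):
    """+1 if s > 0, otherwise -1."""
    return 1 if s > 0 else -1


def g_and(g1, g2):
    """Output neuron with weights -1 (constant input), +1 (g1), +1 (g2)."""
    return sign(-1 + g1 + g2)


# Enumerate every combination of hidden-neuron outputs.
table = {(a, b): g_and(a, b) for a in (1, -1) for b in (1, -1)}
for (a, b), out in table.items():
    print(f"g1={a:+d} g2={b:+d} -> G={out:+d}")
```

Only the row with both hidden outputs at +1 produces +1, which is the AND behaviour.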

14

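[Figure: network with inputs X1, X2, X3, ..., Xn and bias b, input weights w1, w2, w3, ..., wn, hidden neurons g1, g2, ..., gn, output weights a1, a2, a3, ..., an, and output neuron G.]

Feedforward Network

Feedback Network

How are the weights optimized? Use gradient descent.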

15

Example: DDoS attack detection

Distributed Denial of Service (DDoS) attack: an attempt to make an online service unavailable by overwhelming it with traffic from multiple sources.
• SYN flood
• UDP flood
• ICMP flood
• LAND attack

16

Example: DDoS attack detection (cont'd)

Training data
• CPU idle rate
• Memory usage
• Network packet inflows
• Network packet outflows
• Current number of system processes
• Ideal target (normal = 0 / attack = 1)

17

[Figure: connection from neuron i to neuron j with weight $W_{ij}$.]

Notation used on this slide: weights ($W_{ij}$), internal variable, transform function, threshold, output, expected output, real output, error function, learning rate

Logistic regression
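To make the roles of the error function and the learning rate concrete, here is a hedged Python sketch of one logistic neuron trained by gradient descent. Several choices are assumptions rather than details from the slides: the cross-entropy error (whose gradient is the real output minus the expected output, times the input), per-sample (stochastic) updates, a bias `b` in place of an explicit threshold, and the made-up training samples.

```python
import math


def sigmoid(s):
    """Logistic transform function."""
    return 1.0 / (1.0 + math.exp(-s))


def predict(w, b, x):
    """Real output of the neuron for input x."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train(samples, learning_rate=0.5, epochs=2000):
    """Per-sample gradient descent on the cross-entropy error (assumed)."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, expected in samples:
            error = predict(w, b, x) - expected  # real output - expected output
            w = [wi - learning_rate * error * xi for wi, xi in zip(w, x)]
            b -= learning_rate * error
    return w, b


# Made-up, linearly separable samples: (features, expected output).
samples = [([0.9, 0.1], 0), ([0.8, 0.2], 0), ([0.2, 0.9], 1), ([0.1, 0.8], 1)]
w, b = train(samples)
print(predict(w, b, [0.9, 0.1]), predict(w, b, [0.1, 0.8]))
```

A larger learning rate takes bigger steps per update; the value 0.5 here is arbitrary.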

18

Schematic Simulation Environment

19

Simulation Environment Hardware Standard

20

Artificial Neural Network Settings

Input = 5 neurons
• CPU idle rate
• Memory usage
• Network packet inflows
• Network packet outflows
• Current number of system processes

Hidden layer = 10 neurons

Output = 1 neuron (true or false)

Weights & thresholds = random (0~1)
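The layer sizes above can be sketched as a small feedforward pass in Python. This is an illustration only: the sigmoid transform function, subtracting the threshold from the weighted sum, the 0.5 decision cut-off, the fixed random seed, and the feature values are assumptions, and all function names are hypothetical.

```python
import math
import random


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def init_layer(n_inputs, n_neurons, rng):
    """Random weights and thresholds in [0, 1) for one layer."""
    return {
        "weights": [[rng.random() for _ in range(n_inputs)]
                    for _ in range(n_neurons)],
        "thresholds": [rng.random() for _ in range(n_neurons)],
    }


def layer_forward(layer, inputs):
    """Each neuron: sigmoid(weighted sum - threshold) (assumed form)."""
    return [
        sigmoid(sum(w * x for w, x in zip(ws, inputs)) - theta)
        for ws, theta in zip(layer["weights"], layer["thresholds"])
    ]


def network_forward(hidden, output, features):
    """5 inputs -> 10 hidden neurons -> 1 output neuron."""
    return layer_forward(output, layer_forward(hidden, features))[0]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
hidden = init_layer(5, 10, rng)
output = init_layer(10, 1, rng)

# Made-up features scaled to [0, 1]: CPU idle rate, memory usage,
# packet inflows, packet outflows, number of system processes.
features = [0.35, 0.60, 0.80, 0.10, 0.25]
score = network_forward(hidden, output, features)
is_attack = score >= 0.5  # assumed cut-off: normal = 0, attack = 1
print(score, is_attack)
```

With untrained random weights the result carries no meaning; training would adjust the weights and thresholds to match the ideal targets.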

25

Conclusion - Decision tree

Pros:
• Human-explainable, widely used in business/medical data analysis
• Simple
• Efficient in prediction and training

Cons:
• Heuristic: little theoretical justification
• Confusing to beginners

26

Conclusion - Artificial Neural Network

Pros:
• Good at modeling non-linear data with a large number of input features
• Robustness & fault tolerance
• Strong adaptability

Cons:
• So many possible solutions that it is hard to identify which one is best
• Prone to overfitting
• Requires greater computational resources

27

Reference

http://supercomputer.ncku.edu.tw/ezfiles/343/1343/img/1609/125202900.pdf

https://www.youtube.com/watch?v=nQvpFSMPhr0&list=PLXVfgk9fNX2I7tB6oIINGBmW50rrmFTqf

https://class.coursera.org/ntumltwo-002/lecture

http://bryannotes.blogspot.tw/2014/11/algorithm-stochastic-gradient_4.html

https://en.wikipedia.org/wiki/Decision_tree

https://en.wikipedia.org/wiki/Artificial_neural_network

Ashraf, J. and Latif, S., "Handling intrusion and DDoS attacks in Software Defined Networks using machine learning techniques," in National Software Engineering Conference (NSEC), 2014, pp. 55-60.

紀宏宜, 張偉德, 陳志榮, "Applying Neural Networks to the Prediction of Denial-of-Service Attacks" (應用類神經網路於阻斷式服務攻擊之預測), Journal of Internet Technology (網際網路技術學刊), vol. 9, no. 2, pp. 173-178, Apr. 2008.

28

Thank you for listening

Happy winter vacation & happy new year

29

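[Figure: neural network with inputs X0, X1, X2, X3, ..., Xn, input weights w0, w1, w2, w3, ..., wn, hidden neurons g0, g1, g2, ..., gn with outputs $g_i(x)$, output weights a0, a1, a2, a3, ..., an, and output neuron G.]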
