Statistical Methods in Artificial Intelligence
CSE471 - Monsoon 2015: Lecture 03
Avinash Sharma, CVIT, IIIT Hyderabad
Lecture 03: Plan
• Linear Algebra Recap
• Linear Discriminant Functions (LDFs)
• The Perceptron
• Generalized LDFs
• The Two-Category Linearly Separable Case
• Learning LDF: Basic Gradient Descent
• Perceptron Criterion Function
Basic Linear Algebra Operations
• Vector
• Vector Operations
  – Scaling
  – Transpose
  – Addition
  – Subtraction
  – Dot Product
• Equation of a Plane
Vector Operations

(Figure: vector $\mathbf{v}$ with components $x_1, y_1, z_1$ in the $x$-$y$-$z$ coordinate frame.)

A vector written as a row of its components:

$\mathbf{v} = [x_1 \;\; y_1 \;\; z_1]$

Scaling: only the magnitude changes, not the direction:

$a\mathbf{v} = [a x_1 \;\; a y_1 \;\; a z_1]$

$\|\mathbf{v}\| = \sqrt{x_1^2 + y_1^2 + z_1^2}, \qquad \|a\mathbf{v}\| = |a|\sqrt{x_1^2 + y_1^2 + z_1^2}$

Transpose: turns a row vector into a column vector (and vice versa):

$(a\mathbf{v})^T = [a x_1 \;\; a y_1 \;\; a z_1]^T$
Vector Operations

Addition and subtraction are component-wise; subtraction is addition of the negated vector: $\mathbf{u}-\mathbf{v} = \mathbf{u} + (-\mathbf{v})$.

(Figure: vectors $\mathbf{u}$ and $\mathbf{v}$ in the $x$-$y$ plane, with the parallelogram constructions of $\mathbf{u}+\mathbf{v}$ and $\mathbf{u}-\mathbf{v}$.)
Vector Operations

• The dot product (inner product) of two vectors is a scalar. For $\mathbf{u} = [x_1 \;\; x_2]^T$ and $\mathbf{v} = [y_1 \;\; y_2]^T$:

$\mathbf{u} \cdot \mathbf{v} = \mathbf{u}^T \mathbf{v} = x_1 y_1 + x_2 y_2 = \|\mathbf{u}\|\|\mathbf{v}\|\cos\alpha$

where $\alpha$ is the angle between the two vectors:

$\cos\alpha = \dfrac{\mathbf{u}^T \mathbf{v}}{\|\mathbf{u}\|\|\mathbf{v}\|}$

• The dot product of two perpendicular vectors is 0.
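The dot-product identities above can be checked numerically. A minimal sketch (the vectors below are illustrative values, not from the slides):

```python
import math

# Two perpendicular vectors (assumed example values).
u = [3.0, 0.0]
v = [0.0, 4.0]

# u . v = x1*y1 + x2*y2
dot = sum(ui * vi for ui, vi in zip(u, v))

# ||u||, ||v||, and cos(alpha) = u.v / (||u|| ||v||)
norm_u = math.sqrt(sum(ui * ui for ui in u))
norm_v = math.sqrt(sum(vi * vi for vi in v))
cos_alpha = dot / (norm_u * norm_v)

print(dot)        # 0.0 -> the vectors are perpendicular
print(cos_alpha)  # 0.0 -> alpha = 90 degrees
```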
Equation of a Plane

Let $P_0$ be a known point on the plane with position vector $\mathbf{r}_0$ from the origin $O$, let $\mathbf{n}$ be the normal to the plane, and let $P$ be any point on the plane with position vector $\mathbf{r}$. Then $\mathbf{r}-\mathbf{r}_0$ lies in the plane, so it makes a $90°$ angle with $\mathbf{n}$:

$(\mathbf{r}-\mathbf{r}_0) \cdot \mathbf{n} = 0$
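The plane equation gives a direct membership test. A small sketch, with an assumed plane through $P_0=(1,0,0)$ and normal $\mathbf{n}=(0,0,1)$ (i.e. the plane $z=0$):

```python
def on_plane(r, r0, n, tol=1e-9):
    # A point r lies on the plane iff (r - r0) . n == 0 (within tolerance).
    diff = [ri - r0i for ri, r0i in zip(r, r0)]
    return abs(sum(di * ni for di, ni in zip(diff, n))) < tol

r0 = (1.0, 0.0, 0.0)   # a known point on the plane
n = (0.0, 0.0, 1.0)    # normal vector -> the plane z = 0

print(on_plane((5.0, 2.0, 0.0), r0, n))  # True: z-component is 0
print(on_plane((0.0, 0.0, 1.0), r0, n))  # False: off the plane
```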
Linear Discriminant Functions

• Assumes a 2-class classification setup.
• The decision boundary is represented explicitly in terms of the components of $\mathbf{X}$.
• The aim is to seek parameters of a linear discriminant function which minimize the training error.
• Why linear?
  – Simplest possible form
  – Can be generalized
Linear Discriminant Functions

$g(\mathbf{X}) = \mathbf{W}^T \mathbf{X}$, or with a bias term:

$g(\mathbf{X}) = \mathbf{W}^T \mathbf{X} + w_0$

Decision rule:

$g(\mathbf{X}) \begin{cases} > 0 & (+ve)\ \text{class } A \\ < 0 & (-ve)\ \text{class } B \\ = 0 & \text{decision boundary} \end{cases}$

(Figure: classes A and B separated by the boundary $g(\mathbf{X})=0$, with normal vector $\mathbf{W}$; a sample point $\mathbf{X}$ is shown.)
The Perceptron

$g(\mathbf{X}) = \mathbf{W}^T \mathbf{X} + w_0$

(Figure: inputs $x_0 = 1, x_1, x_2, \dots, x_d$ feed a summing unit through weights $w_0, w_1, w_2, \dots, w_d$; the unit outputs $g(\mathbf{X})$.)

$\mathbf{X} = [x_1 \; \cdots \; x_d]^T, \qquad \mathbf{W} = [w_1 \; \cdots \; w_d]^T$
Perceptron Decision Boundary
Perceptron Summary

• The decision boundary surface (a hyperplane) divides the feature space into two regions.
• The orientation of the boundary surface is decided by the normal vector $\mathbf{W}$.
• The location of the boundary surface is determined by the bias term $w_0$.
• $g(\mathbf{X})$ is proportional to the distance of $\mathbf{X}$ from the boundary surface.
• $g(\mathbf{X}) > 0$ on the positive side and $g(\mathbf{X}) < 0$ on the negative side; $\mathbf{W}$ points toward the positive side.
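The perceptron decision rule $g(\mathbf{X}) = \mathbf{W}^T\mathbf{X} + w_0$ can be sketched directly; the weights and bias below are arbitrary illustrative values, not learned ones:

```python
def g(X, W, w0):
    # Linear discriminant: g(X) = W^T X + w0
    return sum(wi * xi for wi, xi in zip(W, X)) + w0

def classify(X, W, w0):
    # g > 0 -> class A, g < 0 -> class B; g == 0 lies on the boundary.
    return "A" if g(X, W, w0) > 0 else "B"

W = [1.0, -2.0]   # assumed weight vector
w0 = 0.5          # assumed bias

print(classify([3.0, 1.0], W, w0))  # g = 3 - 2 + 0.5 = 1.5 > 0 -> "A"
print(classify([0.0, 1.0], W, w0))  # g = -2 + 0.5 = -1.5 < 0 -> "B"
```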
Generalized LDFs

• Linear:

$g(\mathbf{X}) = w_0 + \mathbf{W}^T \mathbf{X} = w_0 + \sum_{i=1}^{d} w_i x_i$, where $\mathbf{X} = [x_1 \; \cdots \; x_d]^T$ and $\mathbf{W} = [w_1 \; \cdots \; w_d]^T$

• Non-linear (quadratic):

$g(\mathbf{X}) = w_0 + \sum_{i=1}^{d} w_i x_i + \sum_{i=1}^{d} \sum_{j=1}^{d} w_{ij}\, x_i x_j$
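The quadratic discriminant is still linear in an expanded feature vector: collect $1$, each $x_i$, and each product $x_i x_j$ into one list, and the weights $w_0, w_i, w_{ij}$ become a single weight vector. A minimal sketch of that expansion:

```python
def quadratic_features(x):
    # Expanded feature vector: [1, x_1..x_d, all products x_i * x_j].
    feats = [1.0] + list(x)
    for i in range(len(x)):
        for j in range(len(x)):
            feats.append(x[i] * x[j])
    return feats

# For d = 2 this yields 1 + 2 + 4 = 7 features.
print(quadratic_features([2.0, 3.0]))
# [1.0, 2.0, 3.0, 4.0, 6.0, 6.0, 9.0]
```

Note the products $x_1 x_2$ and $x_2 x_1$ both appear, matching the double sum over $i$ and $j$ in the slide.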
Generalized LDFs

• Linear, using augmented vectors (with $x_0 = 1$):

$\mathbf{Y} = [x_0 \; x_1 \; \cdots \; x_d]^T = \begin{bmatrix} x_0 \\ \mathbf{X} \end{bmatrix}, \qquad \mathbf{a} = [w_0 \; w_1 \; \cdots \; w_d]^T = \begin{bmatrix} w_0 \\ \mathbf{W} \end{bmatrix}$

$g(\mathbf{X}) = w_0 + \mathbf{W}^T \mathbf{X} = \mathbf{a}^T \mathbf{Y} = \sum_{i=0}^{d} w_i x_i$

• Non-linear, using a mapping $\mathbf{Y} = \varphi(\mathbf{X})$ into a $\hat{d}$-dimensional space:

$g(\mathbf{X}) = \mathbf{a}^T \mathbf{Y} = \sum_{i=1}^{\hat{d}} a_i y_i, \qquad \mathbf{a} = [a_1 \; \cdots \; a_{\hat{d}}]^T$
Generalized LDFs
Generalized LDFs Summary

• $\varphi$ can be any arbitrary mapping function that projects the original data points $\mathbf{X}$ to points $\mathbf{Y} = \varphi(\mathbf{X})$ in a $\hat{d}$-dimensional space.
• The hyperplane decision surface $\mathbf{a}^T \mathbf{Y} = 0$ passes through the origin of the mapped space.
• Advantage: in the mapped higher-dimensional space the data might be linearly separable.
• Disadvantage: the mapping is computationally intensive, and learning the classification parameters can be non-trivial.
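The advantage above can be made concrete with a toy 1-D example (my own illustration, not from the slides): the classes $A = \{-2, 2\}$ and $B = \{0\}$ cannot be split by any threshold on $x$, but the mapping $\varphi(x) = (1, x, x^2)$ makes them linearly separable:

```python
def phi(x):
    # Assumed mapping to a 3-dimensional space: y = (1, x, x^2).
    return (1.0, x, x * x)

# Assumed weight vector a, giving g(x) = a^T phi(x) = x^2 - 1.
a = (-1.0, 0.0, 1.0)

def g(x):
    return sum(ai * yi for ai, yi in zip(a, phi(x)))

print([g(x) > 0 for x in (-2.0, 2.0)])  # class A samples: [True, True]
print(g(0.0) > 0)                        # class B sample: False
```

No linear rule on the raw $x$ separates these classes, but in the mapped space a single hyperplane ($x^2 - 1 = 0$) does.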
Two-Category Linearly Separable Case

$g(\mathbf{X}) = \mathbf{a}^T \mathbf{Y} = \sum_{i=1}^{\hat{d}} a_i y_i \begin{cases} > 0 & (+ve)\ \text{class } A \\ < 0 & (-ve)\ \text{class } B \\ = 0 & \text{decision boundary} \end{cases}$

(Figure: samples of both classes in the $y_1$-$y_2$ plane, with the weight vector $\mathbf{a}$ normal to the separating boundary.)
Two-Category Linearly Separable Case

Normalized case: replace every class-B sample $\mathbf{Y}$ by its negation $-\mathbf{Y}$; then a single condition covers all samples:

$g(\mathbf{X}) = \mathbf{a}^T \mathbf{Y} = \sum_{i=1}^{\hat{d}} a_i y_i > 0$

(Figure: after normalization, all samples lie on the positive side of the boundary in the $y_1$-$y_2$ plane.)
Two-Category Linearly Separable Case
(Figure: data vectors.)
Two-Category Linearly Separable Case
Learning LDF: Basic Gradient Descent

• Define a scalar criterion function $J(\mathbf{a})$ which captures the classification error for a specific boundary plane described by the parameter vector $\mathbf{a}$.
• Minimize $J(\mathbf{a})$ using gradient descent:
  – Start with an arbitrary value $\mathbf{a}(1)$ for $\mathbf{a}$.
  – Iteratively refine the estimate of $\mathbf{a}$:
    $\mathbf{a}(k+1) = \mathbf{a}(k) - \eta(k)\,\nabla J(\mathbf{a}(k))$
• $\eta(k)$ is a positive scale factor, also known as the learning rate:
  – A too-small $\eta(k)$ makes convergence very slow.
  – A too-large $\eta(k)$ can diverge due to overshooting of corrections.
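The update rule can be sketched as a short loop. The criterion here is a toy scalar $J(a) = (a-3)^2$ of my own choosing, standing in for the perceptron criterion introduced next; `grad_J`, `eta`, and `steps` are assumed names:

```python
def gradient_descent(grad_J, a0, eta=0.1, steps=100):
    # Iterate a(k+1) = a(k) - eta * grad J(a(k)) from the start value a0.
    a = a0
    for _ in range(steps):
        a = a - eta * grad_J(a)
    return a

# Toy criterion J(a) = (a - 3)^2, so grad J(a) = 2(a - 3); minimum at a = 3.
a_star = gradient_descent(lambda a: 2.0 * (a - 3.0), a0=0.0)
print(round(a_star, 6))  # converges to the minimizer a = 3
```

With `eta=0.1` the error shrinks by a factor of 0.8 per step, illustrating the slow-convergence vs. overshoot trade-off: `eta > 1.0` would make this toy iteration diverge.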