![Page 1: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/1.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Quick Tour of Basic Linear Algebra andProbability Theory
CS246: Mining Massive Data SetsWinter 2011
![Page 2: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/2.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Outline
1 Basic Linear Algebra
2 Basic Probability Theory
![Page 3: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/3.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Matrices and Vectors
Matrix: A rectangular array of numbers, e.g., A ∈ Rm×n:
A =
a11 a12 . . . a1na21 a22 . . . a2n
......
......
am1 am2 . . . amn
Vector: A matrix consisting of only one column (default) orone row, e.g., x ∈ Rn
x =
x1x2...
xn
![Page 4: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/4.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Matrices and Vectors
Matrix: A rectangular array of numbers, e.g., A ∈ Rm×n:
A =
a11 a12 . . . a1na21 a22 . . . a2n
......
......
am1 am2 . . . amn
Vector: A matrix consisting of only one column (default) orone row, e.g., x ∈ Rn
x =
x1x2...
xn
![Page 5: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/5.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Matrix Multiplication
If A ∈ Rm×n, B ∈ Rn×p, C = AB, then C ∈ Rm×p:
Cij =n∑
k=1
AikBkj
Special cases: Matrix-vector product, inner product of twovectors. e.g., with x , y ∈ Rn:
xT y =n∑
i=1
xiyi ∈ R
![Page 6: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/6.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Matrix Multiplication
If A ∈ Rm×n, B ∈ Rn×p, C = AB, then C ∈ Rm×p:
Cij =n∑
k=1
AikBkj
Special cases: Matrix-vector product, inner product of twovectors. e.g., with x , y ∈ Rn:
xT y =n∑
i=1
xiyi ∈ R
![Page 7: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/7.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Properties of Matrix Multiplication
Associative: (AB)C = A(BC)
Distributive: A(B + C) = AB + ACNon-commutative: AB 6= BABlock multiplication: If A = [Aik ], B = [Bkj ], where Aik ’s andBkj ’s are matrix blocks, and the number of columns in Aik isequal to the number of rows in Bkj , then C = AB = [Cij ]where Cij =
∑k AikBkj
Example: If −→x ∈ Rn and A = [−→a1|−→a2| . . . |
−→an] ∈ Rm×n,B = [
−→b1|−→b2| . . . |
−→bp] ∈ Rn×p:
A−→x =n∑
i=1
xi−→ai
AB = [A−→b1|A
−→b2| . . . |A
−→bp]
![Page 8: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/8.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Properties of Matrix Multiplication
Associative: (AB)C = A(BC)
Distributive: A(B + C) = AB + AC
Non-commutative: AB 6= BABlock multiplication: If A = [Aik ], B = [Bkj ], where Aik ’s andBkj ’s are matrix blocks, and the number of columns in Aik isequal to the number of rows in Bkj , then C = AB = [Cij ]where Cij =
∑k AikBkj
Example: If −→x ∈ Rn and A = [−→a1|−→a2| . . . |
−→an] ∈ Rm×n,B = [
−→b1|−→b2| . . . |
−→bp] ∈ Rn×p:
A−→x =n∑
i=1
xi−→ai
AB = [A−→b1|A
−→b2| . . . |A
−→bp]
![Page 9: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/9.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Properties of Matrix Multiplication
Associative: (AB)C = A(BC)
Distributive: A(B + C) = AB + ACNon-commutative: AB 6= BA
Block multiplication: If A = [Aik ], B = [Bkj ], where Aik ’s andBkj ’s are matrix blocks, and the number of columns in Aik isequal to the number of rows in Bkj , then C = AB = [Cij ]where Cij =
∑k AikBkj
Example: If −→x ∈ Rn and A = [−→a1|−→a2| . . . |
−→an] ∈ Rm×n,B = [
−→b1|−→b2| . . . |
−→bp] ∈ Rn×p:
A−→x =n∑
i=1
xi−→ai
AB = [A−→b1|A
−→b2| . . . |A
−→bp]
![Page 10: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/10.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Properties of Matrix Multiplication
Associative: (AB)C = A(BC)
Distributive: A(B + C) = AB + ACNon-commutative: AB 6= BABlock multiplication: If A = [Aik ], B = [Bkj ], where Aik ’s andBkj ’s are matrix blocks, and the number of columns in Aik isequal to the number of rows in Bkj , then C = AB = [Cij ]where Cij =
∑k AikBkj
Example: If −→x ∈ Rn and A = [−→a1|−→a2| . . . |
−→an] ∈ Rm×n,B = [
−→b1|−→b2| . . . |
−→bp] ∈ Rn×p:
A−→x =n∑
i=1
xi−→ai
AB = [A−→b1|A
−→b2| . . . |A
−→bp]
![Page 11: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/11.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Properties of Matrix Multiplication
Associative: (AB)C = A(BC)
Distributive: A(B + C) = AB + ACNon-commutative: AB 6= BABlock multiplication: If A = [Aik ], B = [Bkj ], where Aik ’s andBkj ’s are matrix blocks, and the number of columns in Aik isequal to the number of rows in Bkj , then C = AB = [Cij ]where Cij =
∑k AikBkj
Example: If −→x ∈ Rn and A = [−→a1|−→a2| . . . |
−→an] ∈ Rm×n,B = [
−→b1|−→b2| . . . |
−→bp] ∈ Rn×p:
A−→x =n∑
i=1
xi−→ai
AB = [A−→b1|A
−→b2| . . . |A
−→bp]
![Page 12: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/12.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Operators and properties
Transpose: A ∈ Rm×n, then AT ∈ Rn×m: (AT )ij = Aji
Properties:(AT )T = A(AB)T = BT AT
(A + B)T = AT + BT
Trace: A ∈ Rn×n, then: tr(A) =∑n
i=1 Aii
Properties:tr(A) = tr(AT )tr(A + B) = tr(A) + tr(B)tr(λA) = λtr(A)If AB is a square matrix, tr(AB) = tr(BA)
![Page 13: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/13.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Operators and properties
Transpose: A ∈ Rm×n, then AT ∈ Rn×m: (AT )ij = Aji
Properties:(AT )T = A(AB)T = BT AT
(A + B)T = AT + BT
Trace: A ∈ Rn×n, then: tr(A) =∑n
i=1 Aii
Properties:tr(A) = tr(AT )tr(A + B) = tr(A) + tr(B)tr(λA) = λtr(A)If AB is a square matrix, tr(AB) = tr(BA)
![Page 14: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/14.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Operators and properties
Transpose: A ∈ Rm×n, then AT ∈ Rn×m: (AT )ij = Aji
Properties:(AT )T = A(AB)T = BT AT
(A + B)T = AT + BT
Trace: A ∈ Rn×n, then: tr(A) =∑n
i=1 Aii
Properties:tr(A) = tr(AT )tr(A + B) = tr(A) + tr(B)tr(λA) = λtr(A)If AB is a square matrix, tr(AB) = tr(BA)
![Page 15: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/15.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Operators and properties
Transpose: A ∈ Rm×n, then AT ∈ Rn×m: (AT )ij = Aji
Properties:(AT )T = A(AB)T = BT AT
(A + B)T = AT + BT
Trace: A ∈ Rn×n, then: tr(A) =∑n
i=1 Aii
Properties:tr(A) = tr(AT )tr(A + B) = tr(A) + tr(B)tr(λA) = λtr(A)If AB is a square matrix, tr(AB) = tr(BA)
![Page 16: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/16.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Special types of matrices
Identity matrix: I = In ∈ Rn×n:
Iij =
1 i=j,0 otherwise.
∀A ∈ Rm×n: AIn = ImA = A
Diagonal matrix: D = diag(d1,d2, . . . ,dn):
Dij =
di j=i,0 otherwise.
Symmetric matrices: A ∈ Rn×n is symmetric if A = AT .Orthogonal matrices: U ∈ Rn×n is orthogonal ifUUT = I = UT U
![Page 17: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/17.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Special types of matrices
Identity matrix: I = In ∈ Rn×n:
Iij =
1 i=j,0 otherwise.
∀A ∈ Rm×n: AIn = ImA = ADiagonal matrix: D = diag(d1,d2, . . . ,dn):
Dij =
di j=i,0 otherwise.
Symmetric matrices: A ∈ Rn×n is symmetric if A = AT .Orthogonal matrices: U ∈ Rn×n is orthogonal ifUUT = I = UT U
![Page 18: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/18.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Special types of matrices
Identity matrix: I = In ∈ Rn×n:
Iij =
1 i=j,0 otherwise.
∀A ∈ Rm×n: AIn = ImA = ADiagonal matrix: D = diag(d1,d2, . . . ,dn):
Dij =
di j=i,0 otherwise.
Symmetric matrices: A ∈ Rn×n is symmetric if A = AT .
Orthogonal matrices: U ∈ Rn×n is orthogonal ifUUT = I = UT U
![Page 19: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/19.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Special types of matrices
Identity matrix: I = In ∈ Rn×n:
Iij =
1 i=j,0 otherwise.
∀A ∈ Rm×n: AIn = ImA = ADiagonal matrix: D = diag(d1,d2, . . . ,dn):
Dij =
di j=i,0 otherwise.
Symmetric matrices: A ∈ Rn×n is symmetric if A = AT .Orthogonal matrices: U ∈ Rn×n is orthogonal ifUUT = I = UT U
![Page 20: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/20.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Linear Independence and Rank
A set of vectors x1, . . . , xn is linearly independent if@α1, . . . , αn:
∑ni=1 αixi = 0
Rank: A ∈ Rm×n, then rank(A) is the maximum number oflinearly independent columns (or equivalently, rows)Properties:
rank(A) ≤ minm,nrank(A) = rank(AT )rank(AB) ≤ minrank(A), rank(B)rank(A + B) ≤ rank(A) + rank(B)
![Page 21: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/21.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Linear Independence and Rank
A set of vectors x1, . . . , xn is linearly independent if@α1, . . . , αn:
∑ni=1 αixi = 0
Rank: A ∈ Rm×n, then rank(A) is the maximum number oflinearly independent columns (or equivalently, rows)
Properties:rank(A) ≤ minm,nrank(A) = rank(AT )rank(AB) ≤ minrank(A), rank(B)rank(A + B) ≤ rank(A) + rank(B)
![Page 22: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/22.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Linear Independence and Rank
A set of vectors x1, . . . , xn is linearly independent if@α1, . . . , αn:
∑ni=1 αixi = 0
Rank: A ∈ Rm×n, then rank(A) is the maximum number oflinearly independent columns (or equivalently, rows)Properties:
rank(A) ≤ minm,nrank(A) = rank(AT )rank(AB) ≤ minrank(A), rank(B)rank(A + B) ≤ rank(A) + rank(B)
![Page 23: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/23.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Matrix Inversion
If A ∈ Rn×n, rank(A) = n, then the inverse of A, denotedA−1 is the matrix that: AA−1 = A−1A = IProperties:
(A−1)−1 = A(AB)−1 = B−1A−1
(A−1)T = (AT )−1
![Page 24: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/24.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Range and Nullspace of a Matrix
Span: span(x1, . . . , xn) = ∑n
i=1 αixi |αi ∈ R
Projection:Proj(y ; xi1≤i≤n) = argminv∈span(xi1≤i≤n)||y − v ||2Range: A ∈ Rm×n, then R(A) = Ax | x ∈ Rn is the spanof the columns of AProj(y ,A) = A(AT A)−1AT yNullspace: null(A) = x ∈ Rn|Ax = 0
![Page 25: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/25.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Range and Nullspace of a Matrix
Span: span(x1, . . . , xn) = ∑n
i=1 αixi |αi ∈ RProjection:Proj(y ; xi1≤i≤n) = argminv∈span(xi1≤i≤n)||y − v ||2
Range: A ∈ Rm×n, then R(A) = Ax | x ∈ Rn is the spanof the columns of AProj(y ,A) = A(AT A)−1AT yNullspace: null(A) = x ∈ Rn|Ax = 0
![Page 26: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/26.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Range and Nullspace of a Matrix
Span: span(x1, . . . , xn) = ∑n
i=1 αixi |αi ∈ RProjection:Proj(y ; xi1≤i≤n) = argminv∈span(xi1≤i≤n)||y − v ||2Range: A ∈ Rm×n, then R(A) = Ax | x ∈ Rn is the spanof the columns of A
Proj(y ,A) = A(AT A)−1AT yNullspace: null(A) = x ∈ Rn|Ax = 0
![Page 27: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/27.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Range and Nullspace of a Matrix
Span: span(x1, . . . , xn) = ∑n
i=1 αixi |αi ∈ RProjection:Proj(y ; xi1≤i≤n) = argminv∈span(xi1≤i≤n)||y − v ||2Range: A ∈ Rm×n, then R(A) = Ax | x ∈ Rn is the spanof the columns of AProj(y ,A) = A(AT A)−1AT y
Nullspace: null(A) = x ∈ Rn|Ax = 0
![Page 28: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/28.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Range and Nullspace of a Matrix
Span: span(x1, . . . , xn) = ∑n
i=1 αixi |αi ∈ RProjection:Proj(y ; xi1≤i≤n) = argminv∈span(xi1≤i≤n)||y − v ||2Range: A ∈ Rm×n, then R(A) = Ax | x ∈ Rn is the spanof the columns of AProj(y ,A) = A(AT A)−1AT yNullspace: null(A) = x ∈ Rn|Ax = 0
![Page 29: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/29.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Determinant
A ∈ Rn×n, a1, . . . ,an the rows of A,S =
∑ni=1 αiai |0 ≤ αi ≤ 1, then det(A) is the volume of
S.Properties:
det(I) = 1det(λA) = λdet(A)det(AT ) = det(A)det(AB) = det(A)det(B)det(A) 6= 0 if and only if A is invertible.If A invertible, then det(A−1) = det(A)
![Page 30: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/30.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Quadratic Forms and Positive Semidefinite Matrices
A ∈ Rn×n, x ∈ Rn, xT Ax is called a quadratic form:
xT Ax =∑
1≤i,j≤n
Aijxixj
A is positive definite if ∀ x ∈ Rn : xT Ax > 0A is positive semidefinite if ∀ x ∈ Rn : xT Ax ≥ 0A is negative definite if ∀ x ∈ Rn : xT Ax < 0A is negative semidefinite if ∀ x ∈ Rn : xT Ax ≤ 0
![Page 31: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/31.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Quadratic Forms and Positive Semidefinite Matrices
A ∈ Rn×n, x ∈ Rn, xT Ax is called a quadratic form:
xT Ax =∑
1≤i,j≤n
Aijxixj
A is positive definite if ∀ x ∈ Rn : xT Ax > 0A is positive semidefinite if ∀ x ∈ Rn : xT Ax ≥ 0A is negative definite if ∀ x ∈ Rn : xT Ax < 0A is negative semidefinite if ∀ x ∈ Rn : xT Ax ≤ 0
![Page 32: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/32.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Eigenvalues and Eigenvectors
A ∈ Rn×n, λ ∈ C is an eigenvalue of A with thecorresponding eigenvector x ∈ Cn (x 6= 0) if:
Ax = λx
eigenvalues: the n possibly complex roots of thepolynomial equation det(A− λI) = 0, and denoted asλ1, . . . , λn
Properties:tr(A) =
∑ni=1 λi
det(A) =∏n
i=1 λirank(A) = |1 ≤ i ≤ n|λi 6= 0|
![Page 33: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/33.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Linear Algebra
Matrix Eigendecomposition
A ∈ Rn×n, λ1, . . . , λn the eigenvalues, and x1, . . . , xn theeigenvectors. X = [x1|x2| . . . |xn], Λ = diag(λ1, . . . , λn),then AX = XΛ.A called diagonalizable if X invertible: A = XΛX−1
If A symmetric, then all eigenvalues real, and X orthogonal(hence denoted by U = [u1|u2| . . . |un]):
A = UΛUT =n∑
i=1
λiuiuTi
A special case of Signular Value Decomposition
![Page 34: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/34.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Probability Theory
Outline
1 Basic Linear Algebra
2 Basic Probability Theory
![Page 35: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/35.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Probability Theory
Elements of Probability
Sample Space Ω: Set of all possible outcomesEvent Space F : A family of subsets of Ω
Probability Measure: Function P : F → R with properties:1 P(A) ≥ 0 (∀A ∈ F)2 P(Ω) = 13 Ai ’s disjoint, then P(
⋃i Ai ) =
∑i P(Ai )
![Page 36: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/36.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Probability Theory
Conditional Probability and Independence
For events A,B:
P(A|B) =P(A
⋂B)
P(B)
A, B independent if P(A|B) = P(A) or equivalently:P(A
⋂B) = P(A)P(B)
![Page 37: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/37.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Probability Theory
Random Variables and Distributions
A random variable X is a function X : Ω→ RExample: Number of heads in 20 tosses of a coinProbabilities of events associated with random variablesdefined based on the original probability function. e.g.,P(X = k) = P(ω ∈ Ω|X (ω) = k)Cumulative Distribution Function (CDF) FX : R→ [0,1]:FX (x) = P(X ≤ x)
Probability Mass Function (pmf): X discrete thenpX (x) = P(X = x)
Probability Density Function (pdf): fX (x) = dFX (x)/dx
![Page 38: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/38.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Probability Theory
Properties of Distribution Functions
CDF:0 ≤ FX (x) ≤ 1FX monotone increasing, with limx→−∞FX (x) = 0,limx→∞FX (x) = 1
pmf:0 ≤ pX (x) ≤ 1∑
x pX (x) = 1∑x∈A pX (x) = pX (A)
pdf:fX (x) ≥ 0∫∞−∞ fX (x)dx = 1∫x∈A fX (x)dx = P(X ∈ A)
![Page 39: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/39.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Probability Theory
Expectation and Variance
Assume random variable X has pdf fX (x), and g : R→ R.Then
E [g(X )] =
∫ ∞−∞
g(x)fX (x)dx
for discrete X , E [g(X )] =∑
x g(x)pX (x)
Properties:for any constant a ∈ R, E [a] = aE [ag(X )] = aE [g(X )]Linearity of Expectation:E [g(X ) + h(X )] = E [g(X )] + E [h(X )]
Var [X ] = E [(X − E [X ])2]
![Page 40: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/40.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Probability Theory
Some Common Random Variables
X ∼ Bernoulli(p) (0 ≤ p ≤ 1):
pX (x) =
p x=1,1− p x=0.
X ∼ Geometric(p) (0 ≤ p ≤ 1): pX (x) = p(1− p)x−1
X ∼ Uniform(a,b) (a < b):
fX (x) =
1
b−a a ≤ x ≤ b,0 otherwise.
X ∼ Normal(µ, σ2):
fX (x) =1√2πσ
e−1
2σ2 (x−µ)2
![Page 41: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/41.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Probability Theory
Multiple Random Variables and Joint Distributions
X1, . . . ,Xn random variablesJoint CDF: FX1,...,Xn (x1, . . . , xn) = P(X1 ≤ x1, . . . ,Xn ≤ xn)
Joint pdf: fX1,...,Xn (x1, . . . , xn) =∂nFX1,...,Xn (x1,...,xn)
∂x1...∂xn
Marginalization:fX1(x1) =
∫∞−∞ . . .
∫∞−∞ fX1,...,Xn (x1, . . . , xn)dx2 . . . dxn
Conditioning: fX1|X2,...,Xn (x1|x2, . . . , xn) =fX1,...,Xn (x1,...,xn)
fX2,...,Xn (x2,...,xn)
Chain Rule: f (x1, . . . , xn) = f (x1)∏n
i=2 f (xi |x1, . . . , xi−1)
Independence: f (x1, . . . , xn) =∏n
i=1 f (xi).More generally, events A1, . . . ,An independent ifP(⋂
i∈S Ai) =∏
i∈S P(Ai) (∀S ⊆ 1, . . . ,n).
![Page 42: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/42.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Probability Theory
Random Vectors
X1, . . . ,Xn random variables. X = [X1X2 . . .Xn]T random vector.
If g : Rn → R, thenE [g(X )] =
∫Rn g(x1, . . . , xn)fX1,...,Xn (x1, . . . , xn)dx1 . . . dxn
if g : Rn → Rm, g = [g1 . . . gm]T , thenE [g(X )] =
[E [g1(X )] . . .E [gm(X )]
]TCovariance Matrix:Σ = Cov(X ) = E
[(X − E [X ])(X − E [X ])T ]
Properties of Covariance Matrix:Σij = Cov [Xi ,Xj ] = E
[(Xi − E [Xi ])(Xj − E [Xj ])
]Σ symmetric, positive semidefinite
![Page 43: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/43.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Probability Theory
Multivariate Gaussian Distribution
µ ∈ Rn, Σ ∈ Rn×n symmetric, positive semidefiniteX ∼ N (µ,Σ) n-dimensional Gaussian distribution:
fX (x) =1
(2π)n/2det(Σ)1/2 exp(− 1
2(x − µ)T Σ−1(x − µ)
)E [X ] = µ
Cov(X ) = Σ
![Page 44: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/44.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Probability Theory
Parameter Estimation: Maximum Likelihood
Parametrized distribution fX (x ; θ) with parameter(s) θ unknown.
IID samples x1, . . . , xn observed.
Goal: Estimate θ
MLE: θ = argmaxθf (x1, . . . , xn; θ)
![Page 45: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/45.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Probability Theory
MLE Example
X ∼ Gaussian(µ, σ2). θ = (µ, σ2) unknown. Samples x1, . . . , xn.Then:
f (x1, . . . , xn;µ, σ2) = (1
2πσ2 )n/2 exp(−∑n
i=1(xi − µ)2
2σ2
)Setting: ∂ log f
∂µ = 0 and ∂ log f∂σ = 0
Gives:
µMLE =
∑ni=1 xi
n, σ2
MLE =
∑ni=1(xi − µ)2
nIf not possible to find the optimal point in closed form, iterativemethods such as gradient decent can be used.
![Page 46: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/46.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Probability Theory
Some Useful Inequalities
Markov’s Inequality: X random variable, and a > 0. Then:
P(|X | ≥ a) ≤ E [|X |]a
Chebyshev’s Inequality: If E [X ] = µ, Var(X ) = σ2, k > 0,then:
Pr(|X − µ| ≥ kσ) ≤ 1k2
Chernoff bound: X1, . . . ,Xn iid random variables, withE [Xi ] = µ, Xi ∈ 0,1 (∀1 ≤ i ≤ n). Then:
P(|1n
n∑i=1
Xi − µ| ≥ ε) ≤ 2 exp(−2nε2)
Multiple variants of Chernoff-type bounds exist, which canbe useful in different settings
![Page 47: Quick Tour of Basic Linear Algebra and Probability Theory](https://reader035.vdocuments.us/reader035/viewer/2022071523/613cffc70c37c14a830ced39/html5/thumbnails/47.jpg)
Quick Tour of Basic Linear Algebra and Probability Theory
Basic Probability Theory
References
1 CS229 notes on basic linear algebra and probability theory2 Wikipedia!