linear algebra and probability theory

47
Quick Tour of Basic Linear Algebra and Probability Theory Quick Tour of Basic Linear Algebra and Probability Theory CS246: Mining Massive Data Sets Winter 2011

Upload: stephen-baker

Post on 06-Apr-2018

243 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 1/47

Quick Tour of Basic Linear Algebra and Probability Theory

Quick Tour of Basic Linear Algebra and

Probability Theory

CS246: Mining Massive Data Sets

Winter 2011

Page 2: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 2/47

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Linear Algebra

Outline

1 Basic Linear Algebra

2 Basic Probability Theory

Page 3: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 3/47

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Linear Algebra

Matrices and Vectors

Matrix: A rectangular array of numbers, e.g., A ∈ Rm ×n :

A =

a 11 a 12 . . . a 1n a 21 a 22 . . . a 2n

... ... ... ...

a m 1 a m 2 . . . a mn

Page 4: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 4/47

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Linear Algebra

Matrices and Vectors

Matrix: A rectangular array of numbers, e.g., A ∈ Rm ×n :

A =

a 11 a 12 . . . a 1n a 21 a 22 . . . a 2n

... ... ... ...

a m 1 a m 2 . . . a mn

Vector: A matrix consisting of only one column (default) or

one row, e.g., x ∈Rn

x =

x 1x 2...

x n

Page 5: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 5/47

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Linear Algebra

Matrix Multiplication

If A ∈ Rm ×n , B ∈ Rn ×p , C = AB , then C ∈ Rm ×p :

C ij =

n k =1

Aik B kj

Page 6: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 6/47

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Linear Algebra

Matrix Multiplication

If A ∈ Rm ×n , B ∈ Rn ×p , C = AB , then C ∈ Rm ×p :

C ij =

n k =1

Aik B kj

Special cases: Matrix-vector product, inner product of two

vectors. e.g., with x , y

∈Rn :

x T y =n

i =1

x i y i ∈ R

Q i k T f B i Li Al b d P b bili Th

Page 7: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 7/47

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Linear Algebra

Properties of Matrix Multiplication

Associative: (AB )C = A(BC )

Q i k T f B i Li Al b d P b bilit Th

Page 8: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 8/47

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Linear Algebra

Properties of Matrix Multiplication

Associative: (AB )C = A(BC )

Distributive: A(B + C ) = AB + AC

Quick Tour of Basic Linear Algebra and Probability Theory

Page 9: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 9/47

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Linear Algebra

Properties of Matrix Multiplication

Associative: (AB )C = A(BC )

Distributive: A(B + C ) = AB + AC

Non-commutative: AB = BA

Quick Tour of Basic Linear Algebra and Probability Theory

Page 10: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 10/47

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Linear Algebra

Properties of Matrix Multiplication

Associative: (AB )C = A(BC )

Distributive: A(B + C ) = AB + AC

Non-commutative: AB = BA

Block multiplication: If A = [Aik ], B = [B kj ], where Aik ’s and

B kj ’s are matrix blocks, and the number of columns in Aik is

equal to the number of rows in B kj , then C = AB = [C ij ]where C ij =

k Aik B kj

Quick Tour of Basic Linear Algebra and Probability Theory

Page 11: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 11/47

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Linear Algebra

Properties of Matrix Multiplication

Associative: (AB )C = A(BC )

Distributive: A(B + C ) = AB + AC

Non-commutative: AB = BA

Block multiplication: If A = [Aik ], B = [B kj ], where Aik ’s and

B kj ’s are matrix blocks, and the number of columns in Aik is

equal to the number of rows in B kj , then C = AB = [C ij ]where C ij =

k Aik B kj

Example: If−→x ∈ Rn and A = [

−→a 1| −→a 2| . . . | −→a n ] ∈ Rm ×n ,

B = [−→b 1| −→b 2| . . . | −→b p ] ∈ Rn ×p

:

A−→x =

n i =1

x i −→a i

AB = [A−→b 1|A−→b 2| . . . |A−→b p ]

Quick Tour of Basic Linear Algebra and Probability Theory

Page 12: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 12/47

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Linear Algebra

Operators and properties

Transpose: A ∈ Rm ×n , then AT ∈ Rn ×m : (AT )ij = A ji

Quick Tour of Basic Linear Algebra and Probability Theory

Page 13: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 13/47

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Linear Algebra

Operators and properties

Transpose: A ∈ Rm ×n , then AT ∈ Rn ×m : (AT )ij = A ji Properties:

(AT

)T

= A(AB )T = B T AT

(A + B )T = AT + B T

Quick Tour of Basic Linear Algebra and Probability Theory

Page 14: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 14/47

g y y

Basic Linear Algebra

Operators and properties

Transpose: A ∈ Rm ×n , then AT ∈ Rn ×m : (AT )ij = A ji Properties:

(AT

)T

= A(AB )T = B T AT

(A + B )T = AT + B T

Trace: A ∈ Rn ×n , then: tr (A) =n

i =1 Aii

Quick Tour of Basic Linear Algebra and Probability Theory

Page 15: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 15/47

g y y

Basic Linear Algebra

Operators and properties

Transpose: A ∈ Rm ×n , then AT ∈ Rn ×m : (AT )ij = A ji Properties:

(A

T

)

T

= A(AB )T = B T AT

(A + B )T = AT + B T

Trace: A ∈ Rn ×n , then: tr (A) =n

i =1 Aii

Properties:

tr (A) = tr (AT )tr (A + B ) = tr (A) + tr (B )tr (λA) = λtr (A)If AB is a square matrix, tr (AB ) = tr (BA)

Quick Tour of Basic Linear Algebra and Probability Theory

Page 16: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 16/47

Basic Linear Algebra

Special types of matrices

Identity matrix: I = I n ∈ Rn ×n :

I ij =

1 i=j,

0 otherwise.

∀A ∈ Rm ×n : AI n = I m A = A

Quick Tour of Basic Linear Algebra and Probability Theory

Page 17: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 17/47

Basic Linear Algebra

Special types of matrices

Identity matrix: I = I n ∈ Rn ×n :

I ij =

1 i=j,

0 otherwise.

∀A ∈ Rm ×n : AI n = I m A = A

Diagonal matrix: D = diag (d 1,d 2, . . . , d n ):

D ij =d i j=i,

0 otherwise.

Quick Tour of Basic Linear Algebra and Probability Theory

Page 18: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 18/47

Basic Linear Algebra

Special types of matrices

Identity matrix: I = I n ∈ Rn ×n :

I ij =

1 i=j,

0 otherwise.

∀A ∈ Rm ×n : AI n = I m A = A

Diagonal matrix: D = diag (d 1,d 2, . . . , d n ):

D ij =d i j=i,

0 otherwise.

Symmetric matrices: A ∈ Rn ×n is symmetric if A = AT .

Quick Tour of Basic Linear Algebra and Probability Theory

Page 19: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 19/47

Basic Linear Algebra

Special types of matrices

Identity matrix: I = I n ∈ Rn ×n :

I ij =

1 i=j,

0 otherwise.

∀A ∈ Rm ×n : AI n = I m A = A

Diagonal matrix: D = diag (d 1,d 2, . . . , d n ):

D ij =d i j=i,

0 otherwise.

Symmetric matrices: A ∈ Rn ×n is symmetric if A = AT .

Orthogonal matrices: U

∈Rn ×n is orthogonal if

UU T = I = U T U

Quick Tour of Basic Linear Algebra and Probability Theory

Page 20: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 20/47

Basic Linear Algebra

Linear Independence and Rank

A set of vectors x 1, . . . , x n is linearly independent if

α1, . . . , αn

: n

i =1 αi x i = 0

Quick Tour of Basic Linear Algebra and Probability Theory

Page 21: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 21/47

Basic Linear Algebra

Linear Independence and Rank

A set of vectors x 1, . . . , x n is linearly independent if

α1, . . . , αn

: n

i =1 αi x i = 0

Rank: A ∈ Rm ×n , then rank (A) is the maximum number of

linearly independent columns (or equivalently, rows)

Quick Tour of Basic Linear Algebra and Probability Theory

Page 22: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 22/47

Basic Linear Algebra

Linear Independence and Rank

A set of vectors x 1, . . . , x n is linearly independent if

α1, . . . , αn

: n

i =1 αi x i = 0

Rank: A ∈ Rm ×n , then rank (A) is the maximum number of

linearly independent columns (or equivalently, rows)

Properties:

rank (A) ≤ minm ,n rank (A) = rank (A

T

)rank (AB ) ≤ minrank (A), rank (B )rank (A + B ) ≤ rank (A) + rank (B )

Quick Tour of Basic Linear Algebra and Probability Theory

Page 23: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 23/47

Basic Linear Algebra

Matrix Inversion

If A∈Rn ×n , rank (A) = n , then the inverse of A, denoted

A−1 is the matrix that: AA−1 = A−1A = I

Properties:

(A−1)−1 = A(AB )−1 = B −1A−1

(A

−1

)

T

= (A

T

)

−1

Quick Tour of Basic Linear Algebra and Probability Theory

B i Li Al b

Page 24: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 24/47

Basic Linear Algebra

Range and Nullspace of a Matrix

Span: span (x 1, . . . , x n ) =

n i =1 αi x i |αi ∈ R

Quick Tour of Basic Linear Algebra and Probability Theory

B i Li Al b

Page 25: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 25/47

Basic Linear Algebra

Range and Nullspace of a Matrix

Span: span (x 1, . . . , x n ) =

n i =1 αi x i |αi ∈ R

Projection:Proj (y ; x i 1≤i ≤n ) = argmin v ∈span (x i 1≤i ≤n )||y − v ||2

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Linear Algebra

Page 26: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 26/47

Basic Linear Algebra

Range and Nullspace of a Matrix

Span: span (x 1, . . . , x n ) =

n i =1 αi x i |αi ∈ R

Projection:Proj (y ; x i 1≤i ≤n ) = argmin v ∈span (x i 1≤i ≤n )||y − v ||2Range: A ∈ Rm ×n , then R(A) = Ax | x ∈ R n is the span

of the columns of A

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Linear Algebra

Page 27: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 27/47

Basic Linear Algebra

Range and Nullspace of a Matrix

Span: span (x 1, . . . , x n ) =

n i =1 αi x i |αi ∈ R

Projection:Proj (y ; x i 1≤i ≤n ) = argmin v ∈span (x i 1≤i ≤n )||y − v ||2Range: A ∈ Rm ×n , then R(A) = Ax | x ∈ R n is the span

of the columns of A

Proj (y ,A) = A(A

T

A)

−1

A

T

y

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Linear Algebra

Page 28: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 28/47

Basic Linear Algebra

Range and Nullspace of a Matrix

Span: span (x 1, . . . , x n ) =

n i =1 αi x i |αi ∈ R

Projection:Proj (y ; x i 1≤i ≤n ) = argmin v ∈span (x i 1≤i ≤n )||y − v ||2Range: A ∈ Rm ×n , then R(A) = Ax | x ∈ R n is the span

of the columns of A

Proj (y ,A

) =A

(AT A

)

−1AT y

Nullspace: null (A) = x ∈ Rn |Ax = 0

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Linear Algebra

Page 29: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 29/47

Basic Linear Algebra

Determinant

A ∈ Rn ×n , a 1, . . . , a n the rows of A,

S = n i =1 αi a i | 0 ≤ αi ≤ 1, then det (A) is the volume of

S .Properties:

det (I ) = 1det (λA) = λdet (A)det (AT ) = det (A)

det (AB ) = det (A)det (B )det (A) = 0 if and only if A is invertible.If A invertible, then det (A−1) = det (A)

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Linear Algebra

Page 30: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 30/47

as c ea geb a

Quadratic Forms and Positive Semidefinite Matrices

A ∈ Rn ×n , x ∈ Rn , x T Ax is called a quadratic form:

x

T

Ax =

1≤i , j ≤n Aij x i x j

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Linear Algebra

Page 31: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 31/47

g

Quadratic Forms and Positive Semidefinite Matrices

A ∈ Rn ×n , x ∈ Rn , x T Ax is called a quadratic form:

x

T

Ax =

1≤i , j ≤n Aij x i x j

A is positive definite if ∀ x ∈ Rn : x T Ax > 0

A is positive semidefinite if∀x

∈Rn : x T Ax

≥0

A is negative definite if ∀ x ∈ Rn : x T Ax < 0

A is negative semidefinite if ∀ x ∈ Rn : x T Ax ≤ 0

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Linear Algebra

Page 32: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 32/47

Eigenvalues and Eigenvectors

A ∈ Rn ×n , λ ∈ C is an eigenvalue of A with the

corresponding eigenvector x ∈ Cn (x = 0) if:

Ax = λx

eigenvalues: the n possibly complex roots of the

polynomial equation det (A− λI ) = 0, and denoted as

λ1, . . . , λn

Properties:tr (A) =

n i =1 λi

det (A) =n

i =1 λi rank (A) = |1 ≤ i ≤ n |λi = 0|

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Linear Algebra

Page 33: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 33/47

Matrix Eigendecomposition

A ∈ Rn ×n , λ1, . . . , λn the eigenvalues, and x 1, . . . , x n the

eigenvectors. X = [x 1|x 2| . . . |x n ], Λ = diag (λ1, . . . , λn ),

then AX = X Λ.

A called diagonalizable if X invertible: A = X ΛX −1

If A symmetric, then all eigenvalues real, and X orthogonal

(hence denoted by U = [u 1|u 2| . . . |u n ]):

A = U ΛU T =

n i =1

λi u i u T i

A special case of Signular Value Decomposition

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Probability Theory

Page 34: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 34/47

Outline

1 Basic Linear Algebra

2 Basic Probability Theory

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Probability Theory

Page 35: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 35/47

Elements of Probability

Sample Space Ω: Set of all possible outcomes

Event Space F : A family of subsets of Ω

Probability Measure: Function P : F → R with properties:

1 P (A) ≥ 0 (∀A ∈ F )2 P (Ω) = 13 A

i ’s disjoint, then P

(

i Ai ) =

i P

(Ai )

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Probability Theory

Page 36: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 36/47

Conditional Probability and Independence

For events A,B :

P (A|B ) =P (A

B )

P (B )

A, B independent if P (A|B ) = P (A) or equivalently:

P (AB ) = P (A)P (B )

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Probability Theory

Page 37: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 37/47

Random Variables and Distributions

A random variable X is a function X : Ω → RExample: Number of heads in 20 tosses of a coin

Probabilities of events associated with random variables

defined based on the original probability function. e.g.,

P (X = k ) = P (ω ∈ Ω|X (ω) = k )

Cumulative Distribution Function (CDF) F X : R → [0, 1]:F X (x ) = P (X

≤x )

Probability Mass Function (pmf): X discrete thenp X (x ) = P (X = x )

Probability Density Function (pdf): f X (x ) = dF X (x )/dx

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Probability Theory

Page 38: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 38/47

Properties of Distribution Functions

CDF:

0 ≤ F X (x ) ≤ 1F X monotone increasing, with lim x →−∞F X (x ) = 0,

lim x →∞

F X (x ) = 1pmf:

0 ≤ p X (x ) ≤ 1x p X (x ) = 1

x ∈A p X (x ) = p X (A)

pdf:f X (x ) ≥ 0 ∞

−∞f X (x )dx = 1

x ∈Af X (x )dx = P (X ∈ A)

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Probability Theory

Page 39: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 39/47

Expectation and Variance

Assume random variable X has pdf f X (x ), and g : R → R.

Then

E [g (X )] = ∞

−∞

g (x )f X (x )dx

for discrete X , E [g (X )] =

x g (x )p X (x )

Properties:

for any constant a ∈ R, E [a ] = a E

[ag

(X

)] =aE

[g

(X

)]Linearity of Expectation:E [g (X ) + h (X )] = E [g (X )] + E [h (X )]

Var [X ] = E [(X − E [X ])2]

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Probability Theory

Page 40: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 40/47

Some Common Random Variables

X ∼ Bernoulli (p ) (0 ≤ p ≤ 1):

p X (x ) =

p x=1,

1 − p x=0.

X ∼ Geometric (p ) (0 ≤ p ≤ 1): p X (x ) = p (1 − p )x −1

X ∼ Uniform (a ,b ) (a < b ):

f X (x ) = 1

b −a a ≤ x ≤ b ,

0 otherwise.

X ∼ Normal (µ, σ2):

f X (x ) =1√2πσ

e − 1

2σ2 (x −µ)2

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Probability Theory

Page 41: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 41/47

Multiple Random Variables and Joint Distributions

X 1, . . . ,X n random variables

Joint CDF: F X 1,...,X n (x 1, . . . , x n ) = P (X 1 ≤ x 1, . . . ,X n ≤ x n )

Joint pdf: f X 1,...,X n (x 1, . . . , x n ) =∂ n F X 1,...,X n (x 1,...,x n )

∂ x 1...∂ x n

Marginalization:f X 1 (x 1) =

∞−∞ . . .

∞−∞ f X 1,...,X n (x 1, . . . , x n )dx 2 . . .dx n

Conditioning: f X 1|X 2,...,X n (x 1|x 2, . . . , x n ) =f X 1,...,X n (x 1,...,x n )

f X 2,...,X n (x 2,...,x n )

Chain Rule: f (x 1, . . . , x n ) = f (x 1)n

i =2 f (x i |x 1, . . . , x i −1)Independence: f (x 1, . . . , x n ) =

n i =1 f (x i ).

More generally, events A1, . . . ,An independent if

P (

i ∈S Ai ) =

i ∈S P (Ai ) (∀S ⊆ 1, . . . , n ).

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Probability Theory

Page 42: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 42/47

Random Vectors

X 1, . . . ,X n random variables. X = [X 1X 2 . . .X n ]T random vector.

If g : Rn → R, then

E [g (X )] = Rn g (x 1, . . . ,x n )f X 1,...,X n (x 1, . . . ,x n )dx 1 . . .dx n

if g : Rn → Rm , g = [g 1 . . . g m ]T , then

E [g (X )] =E [g 1(X )] . . .E [g m (X )]

T Covariance Matrix:

Σ = Cov (X ) = E

(X − E [X ])(X − E [X ])T

Properties of Covariance Matrix:

Σij = Cov [X i ,X j ] = E

(X i − E [X i ])(X j − E [X j ])

Σ symmetric, positive semidefinite

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Probability Theory

Page 43: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 43/47

Multivariate Gaussian Distribution

µ ∈ Rn , Σ ∈ Rn ×n symmetric, positive semidefinite

X ∼ N (µ,Σ) n -dimensional Gaussian distribution:

f X (x ) =1

(2π)n /2det (Σ)1/2exp

− 1

2(x − µ)T Σ−1(x − µ)

E [X ] = µCov (X ) = Σ

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Probability Theory

Page 44: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 44/47

Parameter Estimation: Maximum Likelihood

Parametrized distribution f X (x ; θ) with parameter(s) θ unknown.IID samples x 1, . . . ,x n observed.

Goal: Estimate θ

MLE: θ = argmax θf (x 1, . . . , x n ; θ)

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Probability Theory

Page 45: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 45/47

MLE Example

X ∼ Gaussian (µ, σ2). θ = (µ, σ2) unknown. Samples x 1, . . . , x n .Then:

f (x 1, . . . , x n ;µ, σ2) = ( 12πσ2 )n /2 exp

− n

i =1(x i − µ)2

2σ2

Setting: ∂ log f

∂µ = 0 and ∂ log f ∂σ = 0

Gives:

µMLE =

n i =1 x i n

, σ2MLE =

n i =1(x i − µ)2

n

If not possible to find the optimal point in closed form, iterative

methods such as gradient decent can be used.

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Probability Theory

Page 46: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 46/47

Some Useful Inequalities

Markov’s Inequality: X random variable, and a > 0. Then:

P (|X | ≥ a ) ≤ E [|X |]a

Chebyshev’s Inequality: If E [X ] = µ, Var (X ) = σ2, k > 0,

then:

Pr (|X − µ| ≥ k σ) ≤ 1

k 2

Chernoff bound: X 1, . . . ,X n iid random variables, with

E [X i ] = µ, X i

∈ 0,1

(

∀1

≤i

≤n ). Then:

P (|1

n

n i =1

X i − µ| ≥ ) ≤ 2 exp(−2n 2)

Multiple variants of Chernoff-type bounds exist, which can

be useful in different settings

Quick Tour of Basic Linear Algebra and Probability Theory

Basic Probability Theory

Page 47: Linear Algebra and Probability Theory

8/3/2019 Linear Algebra and Probability Theory

http://slidepdf.com/reader/full/linear-algebra-and-probability-theory 47/47

References

1 CS229 notes on basic linear algebra and probability theory

2 Wikipedia!