Matrix Calculus
Jie Fu
Saturday, April 8, 2023
https://sites.google.com/site/bigaidream/
Matrix calculus collects the various partial derivatives of a single function with respect to many variables, and/or of a multivariate function with respect to a single variable, into vectors and matrices that can be treated as single entities. Reference: http://en.wikipedia.org/wiki/Matrix_calculus
Contents
Matrix Calculus
Scope
Notation
Derivatives with Vectors
    Vector-by-scalar
    Scalar-by-vector
    Vector-by-vector
Derivatives with Matrices
    Matrix-by-scalar
    Scalar-by-matrix
Identities
    Vector-by-vector identities
    Scalar-by-vector identities
    Vector-by-scalar identities
    Scalar-by-matrix identities
    Matrix-by-scalar identities
    Scalar-by-scalar identities
        With vectors involved
        With matrices involved
    Identities in differential form
Scope
For a scalar function of three independent variables, $f(x_1, x_2, x_3)$, the gradient is given by the vector equation

$$\nabla f = \frac{\partial f}{\partial x_1}\hat{x}_1 + \frac{\partial f}{\partial x_2}\hat{x}_2 + \frac{\partial f}{\partial x_3}\hat{x}_3,$$

where $\hat{x}_i$ represents a unit vector in the $x_i$ direction for $1 \le i \le 3$. This type of generalized derivative can be seen as the derivative of a scalar, $f$, with respect to a vector, $\mathbf{x}$, and its result can be easily collected in vector form.
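The gradient described above can be checked numerically. The following is a minimal sketch, not part of the original reference: the helper `gradient` and the example function `f` are hypothetical names chosen for illustration, and central differences are assumed as the approximation scheme.

```python
def gradient(f, x, h=1e-6):
    """Approximate the gradient of a scalar function f at point x
    using central differences, one coordinate at a time."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad


# Hypothetical example: f(x1, x2, x3) = x1^2 + 3*x2 + x1*x3
def f(x):
    return x[0] ** 2 + 3 * x[1] + x[0] * x[2]


point = [1.0, 2.0, 3.0]
numeric = gradient(f, point)

# Analytic gradient of the example: [2*x1 + x3, 3, x1]
analytic = [2 * point[0] + point[2], 3.0, point[0]]
```

Each entry of `numeric` approximates one partial derivative, so the list as a whole plays the role of the collected vector form.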
Notation
All functions are assumed to be of differentiability class $C^1$ unless otherwise noted. Generally, letters from the first half of the alphabet (a, b, c, …) are used to denote constants, and letters from the second half (t, x, y, …) to denote variables.
Derivatives with Vectors
Vector-by-scalar
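The derivative of a vector $\mathbf{y} = [y_1 \; y_2 \; \cdots \; y_m]^T$ by a scalar $x$ is written (in numerator layout notation) as

$$\frac{\partial \mathbf{y}}{\partial x} = \begin{bmatrix} \dfrac{\partial y_1}{\partial x} \\ \dfrac{\partial y_2}{\partial x} \\ \vdots \\ \dfrac{\partial y_m}{\partial x} \end{bmatrix}.$$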
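As a small numerical illustration of a vector-by-scalar derivative, the sketch below differentiates each component of a vector-valued function of one scalar. The function names and the example `y` are assumptions made for this illustration only.

```python
import math


def vector_by_scalar(y, x, h=1e-6):
    """Approximate dy/dx for a vector-valued y(x) by central differences.
    The result has one entry per component of y (an m x 1 column
    in numerator layout, stored here as a flat list)."""
    y_plus = y(x + h)
    y_minus = y(x - h)
    return [(p - m) / (2 * h) for p, m in zip(y_plus, y_minus)]


# Hypothetical example: y(x) = [x^2, sin(x), exp(x)]
def y(x):
    return [x ** 2, math.sin(x), math.exp(x)]


x0 = 0.5
column = vector_by_scalar(y, x0)

# Analytic derivative of the example: [2x, cos(x), exp(x)]
expected = [2 * x0, math.cos(x0), math.exp(x0)]
```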
Scalar-by-vector
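The derivative of a scalar $y$ by a vector $\mathbf{x} = [x_1 \; x_2 \; \cdots \; x_n]^T$ is written (in numerator layout notation) as

$$\frac{\partial y}{\partial \mathbf{x}} = \begin{bmatrix} \dfrac{\partial y}{\partial x_1} & \dfrac{\partial y}{\partial x_2} & \cdots & \dfrac{\partial y}{\partial x_n} \end{bmatrix}.$$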
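In numerator layout the scalar-by-vector derivative is a row vector of shape 1 x n. The sketch below makes that shape explicit by returning a nested list with a single row; the helper name and the linear example are hypothetical.

```python
def scalar_by_vector(y, x, h=1e-6):
    """Approximate dy/dx for scalar y and vector x by central differences.
    Returns a 1 x n nested list (a row vector in numerator layout)."""
    row = []
    for j in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += h
        x_minus[j] -= h
        row.append((y(x_plus) - y(x_minus)) / (2 * h))
    return [row]


# Hypothetical example: y(x) = a . x, whose derivative is the row a^T
a = [2.0, -1.0, 4.0]


def y(x):
    return sum(a_j * x_j for a_j, x_j in zip(a, x))


row_vector = scalar_by_vector(y, [1.0, 1.0, 1.0])
```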
Vector-by-vector
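The derivative of a vector function (a vector whose components are functions) $\mathbf{y} = [y_1 \; y_2 \; \cdots \; y_m]^T$ of an independent vector $\mathbf{x} = [x_1 \; x_2 \; \cdots \; x_n]^T$ is written (in numerator layout notation) as

$$\frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \begin{bmatrix} \dfrac{\partial y_1}{\partial x_1} & \dfrac{\partial y_1}{\partial x_2} & \cdots & \dfrac{\partial y_1}{\partial x_n} \\ \dfrac{\partial y_2}{\partial x_1} & \dfrac{\partial y_2}{\partial x_2} & \cdots & \dfrac{\partial y_2}{\partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial y_m}{\partial x_1} & \dfrac{\partial y_m}{\partial x_2} & \cdots & \dfrac{\partial y_m}{\partial x_n} \end{bmatrix}.$$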
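The vector-by-vector derivative in numerator layout is the m x n Jacobian matrix, with row i holding the partial derivatives of the i-th component. A minimal sketch follows, assuming central differences; `jacobian` and the example function are illustrative names, not taken from the original.

```python
import math


def jacobian(y, x, h=1e-6):
    """Approximate the m x n Jacobian of y at x (numerator layout):
    entry [i][j] is the partial derivative of y_i with respect to x_j."""
    m = len(y(x))
    n = len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += h
        x_minus[j] -= h
        y_plus = y(x_plus)
        y_minus = y(x_minus)
        for i in range(m):
            J[i][j] = (y_plus[i] - y_minus[i]) / (2 * h)
    return J


# Hypothetical example with m = 3 outputs and n = 2 inputs
def y(x):
    return [x[0] * x[1], x[0] + x[1] ** 2, math.sin(x[0])]


x0 = [1.0, 2.0]
J = jacobian(y, x0)

# Analytic Jacobian of the example
J_expected = [
    [x0[1], x0[0]],
    [1.0, 2 * x0[1]],
    [math.cos(x0[0]), 0.0],
]
```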
Derivatives with Matrices
Matrix-by-scalar
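The derivative of an $m \times n$ matrix function $\mathbf{Y}$ with entries $y_{ij}$ by a scalar $x$ is known as the tangent matrix and is given (in numerator layout notation) by

$$\frac{\partial \mathbf{Y}}{\partial x} = \begin{bmatrix} \dfrac{\partial y_{11}}{\partial x} & \dfrac{\partial y_{12}}{\partial x} & \cdots & \dfrac{\partial y_{1n}}{\partial x} \\ \dfrac{\partial y_{21}}{\partial x} & \dfrac{\partial y_{22}}{\partial x} & \cdots & \dfrac{\partial y_{2n}}{\partial x} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial y_{m1}}{\partial x} & \dfrac{\partial y_{m2}}{\partial x} & \cdots & \dfrac{\partial y_{mn}}{\partial x} \end{bmatrix}.$$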
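A matrix-by-scalar derivative keeps the shape of the matrix and differentiates each entry. The sketch below illustrates this for a small example; the helper name and the example matrix function are assumptions for illustration.

```python
import math


def matrix_by_scalar(Y, x, h=1e-6):
    """Approximate dY/dx entrywise by central differences.
    The result has the same shape as Y(x)."""
    Y_plus = Y(x + h)
    Y_minus = Y(x - h)
    return [
        [(p - m) / (2 * h) for p, m in zip(row_plus, row_minus)]
        for row_plus, row_minus in zip(Y_plus, Y_minus)
    ]


# Hypothetical 2 x 2 example
def Y(x):
    return [[x ** 2, math.sin(x)], [math.exp(x), 3.0]]


x0 = 0.5
tangent = matrix_by_scalar(Y, x0)

# Analytic entrywise derivative of the example
tangent_expected = [[2 * x0, math.cos(x0)], [math.exp(x0), 0.0]]
```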
Scalar-by-matrix
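The derivative of a scalar function $y$ of a $p \times q$ matrix $\mathbf{X}$ of independent variables, with respect to the matrix $\mathbf{X}$, is given (in numerator layout notation) by

$$\frac{\partial y}{\partial \mathbf{X}} = \begin{bmatrix} \dfrac{\partial y}{\partial x_{11}} & \dfrac{\partial y}{\partial x_{21}} & \cdots & \dfrac{\partial y}{\partial x_{p1}} \\ \dfrac{\partial y}{\partial x_{12}} & \dfrac{\partial y}{\partial x_{22}} & \cdots & \dfrac{\partial y}{\partial x_{p2}} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial y}{\partial x_{1q}} & \dfrac{\partial y}{\partial x_{2q}} & \cdots & \dfrac{\partial y}{\partial x_{pq}} \end{bmatrix}.$$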
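In numerator layout the scalar-by-matrix derivative of a p x q matrix has shape q x p, that is, the transpose of the shape of X. The sketch below shows this with a hypothetical linear example whose derivative is known exactly; all names are illustrative assumptions.

```python
def scalar_by_matrix(y, X, h=1e-6):
    """Approximate dy/dX for scalar y and p x q matrix X.
    Returns a q x p nested list: entry [c][r] is the partial
    derivative of y with respect to X[r][c] (numerator layout)."""
    p, q = len(X), len(X[0])
    D = [[0.0] * p for _ in range(q)]
    for r in range(p):
        for c in range(q):
            X_plus = [row[:] for row in X]
            X_minus = [row[:] for row in X]
            X_plus[r][c] += h
            X_minus[r][c] -= h
            D[c][r] = (y(X_plus) - y(X_minus)) / (2 * h)
    return D


# Hypothetical example: y(X) = sum of C[r][c] * X[r][c] with fixed weights C
C = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]


def y(X):
    return sum(C[r][c] * X[r][c] for r in range(2) for c in range(3))


X0 = [[0.5, -1.0, 2.0], [1.5, 0.0, -2.5]]
D = scalar_by_matrix(y, X0)  # expected to approximate the transpose of C
```

Here X is 2 x 3 while the result `D` is 3 x 2, matching the transposed arrangement of partial derivatives.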
Identities
Vector-by-vector identities
Scalar-by-vector identities
Vector-by-scalar identities
Scalar-by-matrix identities
Matrix-by-scalar identities
Scalar-by-scalar identities
With vectors involved
With matrices involved
Identities in differential form
It is often easier to work in differential form and then convert back to normal derivatives. This only works well using the numerator layout.
To convert to normal derivative form, first convert it to one of the following canonical forms, and then use these identities:
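As one illustration of such a conversion, consider $y = \operatorname{tr}(\mathbf{X}^T \mathbf{X})$, whose differential is $dy = \operatorname{tr}(2\mathbf{X}^T \, d\mathbf{X})$. Assuming the standard numerator-layout correspondence that $dy = \operatorname{tr}(\mathbf{A} \, d\mathbf{X})$ yields $\partial y / \partial \mathbf{X} = \mathbf{A}$ (stated here as an assumption, since the table of canonical forms is not reproduced in this text), the derivative would be $2\mathbf{X}^T$. The sketch below checks the differential numerically; all names in it are hypothetical.

```python
def trace_of_product(A, B):
    """Compute tr(A B) for A of shape q x p and B of shape p x q."""
    q, p = len(A), len(A[0])
    return sum(A[i][k] * B[k][i] for i in range(q) for k in range(p))


def y(X):
    """y = tr(X^T X), which equals the sum of squared entries of X."""
    return sum(v * v for row in X for v in row)


X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]            # p x q = 2 x 3
dX = [[1e-6, -2e-6, 3e-6], [-1e-6, 2e-6, 1e-6]]   # small perturbation

p, q = len(X), len(X[0])

# Canonical form dy = tr(A dX) with A = 2 X^T, a q x p matrix
A = [[2 * X[r][c] for r in range(p)] for c in range(q)]

X_new = [[X[r][c] + dX[r][c] for c in range(q)] for r in range(p)]
actual_change = y(X_new) - y(X)            # true change in y
predicted_change = trace_of_product(A, dX)  # first-order prediction
```

The two quantities differ only by a term that is second order in the perturbation, which is what the differential form predicts.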