Mathematical Methods for Business and Economics
Mario Maggi
Dipartimento di Economia Politica e Metodi Quantitativi
Universita di Pavia
a.a. 2010/2011
M. Maggi (MIBE) Mathematical Methods for Business and Economics a.a. 2010/2011 1 / 79
Vectors
Real vector
$x \in \mathbb{R}^n$
components $x_i$, $i = 1, \dots, n$
$x = [x_1, x_2, \dots, x_n]$
Superscripts denote different vectors, e.g. $x^1$, $x^2$:
$x^1 = [x^1_1, x^1_2, \dots, x^1_n]$
Real numbers (scalars): $\alpha \in \mathbb{R}$
Vectors
Row/column vectors
$x = [x_1, x_2, \dots, x_n]$ (row vector), $y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$ (column vector)
Transposition: $x'$
$x = [x_1, x_2, \dots, x_n]$, $x' = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$, $(x')' = [x_1, x_2, \dots, x_n]$
Special vectors
Special vectors:
The null vector $[0] = [0 \; \cdots \; 0]$
The sum vector $\mathbf{1} = [1 \; \cdots \; 1]$
The basis vectors, each of which has null components except the one in the $i$-th position, which equals 1:
$e^1 = [1 \; 0 \; 0 \; \cdots \; 0]$
$e^2 = [0 \; 1 \; 0 \; \cdots \; 0]$
$e^n = [0 \; 0 \; \cdots \; 0 \; 1]$
Vector comparison
$x = y$ if $x_i = y_i$, $\forall i$ ($x \neq y$ otherwise);
$x > y$ (greater than) if $x_i > y_i$, $\forall i$;
$x \geqq y$ (greater than or equal to) if $x_i \geqq y_i$, $\forall i$;
$x \geq y$ (quasi-greater than) if $x \geqq y$ and $x \neq y$.
The opposite relations ($<$, $\leqq$, $\leq$) and the negations of all six relations are introduced in a similar way.
Remark: We use the same convention for scalars too, so "$\geqq$" stands for greater than or equal to (likewise for "$\leqq$").
Comparison between the vector $x$ and $[0]$:
$x > [0]$: $x$ is positive
$x \geqq [0]$: $x$ is non-negative
$x \geq [0]$: $x$ is semi-positive
(and likewise for the negative cases).
Vector operations
Sum: Given two vectors $x, y \in \mathbb{R}^n$, both row or both column,
$$z = x + y = \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix} = \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{bmatrix}, \quad \text{or } z_i = x_i + y_i, \; i = 1, \dots, n$$
Product by a scalar: Given a vector $x \in \mathbb{R}^n$ and a scalar $\alpha \in \mathbb{R}$,
$$z = \alpha x = [\alpha x_1, \alpha x_2, \dots, \alpha x_n], \quad \text{or } z_i = \alpha x_i, \; i = 1, \dots, n$$
Scalar product: Given two vectors $x \in \mathbb{R}^n$ (row) and $y \in \mathbb{R}^n$ (column),
$$xy = \sum_{i=1}^{n} x_i y_i$$
$xy$ is a scalar: $xy \in \mathbb{R}$
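The three operations above can be sketched in a few lines of plain Python. This is only an illustration: the function names `vec_add`, `scale` and `dot` are hypothetical, and vectors are represented as Python lists.

```python
def vec_add(x, y):
    """Component-wise sum: z_i = x_i + y_i."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    return [xi + yi for xi, yi in zip(x, y)]


def scale(alpha, x):
    """Product by a scalar: z_i = alpha * x_i."""
    return [alpha * xi for xi in x]


def dot(x, y):
    """Scalar product: sum over i of x_i * y_i."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    return sum(xi * yi for xi, yi in zip(x, y))


x = [1, 2, 3]
y = [4, 5, 6]
print(vec_add(x, y))  # [5, 7, 9]
print(scale(2, x))    # [2, 4, 6]
print(dot(x, y))      # 32
```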
Vector operations
$\{x, y \in X\} \Rightarrow \{x + y \in X\}$: $X$ is closed with respect to the sum
$\{x, y \in X\} \Rightarrow \{x + y = y + x\}$: the sum is commutative
$\{x, y, z \in X\} \Rightarrow \{(x + y) + z = x + (y + z)\}$: the sum is associative
$\{x \in X\} \Rightarrow \{\exists\, [0] \in X : x + [0] = x\}$: the null (neutral) vector exists
$\{x \in X\} \Rightarrow \{\exists\, (-x) \in X : x + (-x) = [0]\}$: the opposite vector exists
Vector operations
$\{x \in X, \lambda \in \mathbb{R}\} \Rightarrow \{\lambda x \in X\}$: $X$ is closed with respect to the multiplication by a scalar
$\{x \in X, \lambda, \mu \in \mathbb{R}\} \Rightarrow \{(\lambda + \mu) x = \lambda x + \mu x\}$: distributive property
$\{x, y \in X, \lambda \in \mathbb{R}\} \Rightarrow \{\lambda (x + y) = \lambda x + \lambda y\}$: distributive property
$\{x \in X, \lambda, \mu \in \mathbb{R}\} \Rightarrow \{\mu (\lambda x) = (\lambda \mu) x\}$: associative property
$\{x \in X\} \Rightarrow \{1x = x\}$: the neutral element exists
Norm and distance
A norm is a function $\|\cdot\| : \mathbb{R}^n \to \mathbb{R}$ which associates a real number to each vector of $\mathbb{R}^n$.
Norm properties: $\forall x, y \in \mathbb{R}^n$, $\forall \alpha \in \mathbb{R}$
1 $\|x\| > 0$, $\forall x \neq [0]$, and $\|[0]\| = 0$
2 $\|\alpha x\| = |\alpha| \, \|x\|$
3 $\|x + y\| \leqq \|x\| + \|y\|$ (triangle inequality).
There exist different kinds of norms.
The $p$-norms are widely used:
$$\|x\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}, \quad 1 \leqq p < +\infty,$$
Norm and distance
in particular
$\|x\|_1 = \sum_{i=1}^{n} |x_i|$
$\|x\|_2 = \sqrt{\sum_{i=1}^{n} x_i^2}$, the Euclidean norm, corresponding to the length of the segment from $[0]$ to $x$ in the Cartesian space $\mathbb{R}^n$;
$\|x\|_\infty = \max_{i \in \{1, \dots, n\}} \{|x_i|\}$
Remark: Given $x \in \mathbb{R}^n$ (column),
$$x'x = \sum_{i=1}^{n} (x_i)^2 = \|x\|_2^2.$$
Norm and distance
Given two vectors $x, y \in \mathbb{R}^n$, the Euclidean norm of their difference is
$$\|x - y\| = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$
The function $d : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_+$ which associates to each pair $(x, y)$ of $\mathbb{R}^n$ vectors the value $\|x - y\|$ is called the Euclidean distance
The Euclidean distance between $x$ and $y$ corresponds to the length of the segment from $x$ to $y$ in the Cartesian space $\mathbb{R}^n$
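The norms and the Euclidean distance defined above can be computed directly from their formulas. The sketch below uses only the Python standard library; the function names `p_norm`, `inf_norm` and `euclidean_distance` are hypothetical helpers, not part of the course material.

```python
def p_norm(x, p):
    """p-norm: (sum_i |x_i|^p)^(1/p), defined for 1 <= p < +infinity."""
    if p < 1:
        raise ValueError("p must satisfy p >= 1")
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)


def inf_norm(x):
    """Infinity norm: the largest absolute component."""
    return max(abs(xi) for xi in x)


def euclidean_distance(x, y):
    """Euclidean norm of the difference x - y."""
    return p_norm([xi - yi for xi, yi in zip(x, y)], 2)


x = [3.0, -4.0]
print(p_norm(x, 1))   # 7.0
print(p_norm(x, 2))   # 5.0
print(inf_norm(x))    # 4.0
print(euclidean_distance([1.0, 1.0], [4.0, 5.0]))  # 5.0
```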
Linear space
Consider a set $X$ on whose elements the operations of sum and product by a scalar are defined as above. The set $X$ is called a linear space if
given any pair $(x, y)$ of elements of $X$, $x + y \in X$
for any $x \in X$ and $\alpha \in \mathbb{R}$, $\alpha x \in X$
If the set $Y$ is a linear space and $Y \subseteq X$, then $Y$ is called a linear subspace of $X$
If the Euclidean norm is defined in the linear space $X$, then $X$ is a Euclidean space
The $n$-dimensional Cartesian space $\mathbb{R}^n$ is a Euclidean space
Definition
The two vectors $x, y \in \mathbb{R}^n$ are orthogonal if their scalar product is null
From the geometric point of view, this means that the two segments from $[0]$ to $x$ and from $[0]$ to $y$ form a right angle in the $n$-dimensional Cartesian space
Definition
The vectors $\{x^i \in \mathbb{R}^n, \; i = 1, \dots, n\}$ are linearly independent if it is not possible to find $n$ scalars $\alpha_i$, $i = 1, \dots, n$, not all null, such that
$$\sum_{i=1}^{n} \alpha_i x^i = [0].$$
Matrices
A matrix $A$ of order $(m \times n)$ is a set of $mn$ scalars endowed with a complete double order: given the set of indexes $(i, j) \in \{1, \dots, m\} \times \{1, \dots, n\}$, $i$ is the row index and $j$ is the column index
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \ddots & \vdots \\ \vdots & \ddots & \ddots & a_{(m-1)n} \\ a_{m1} & \cdots & a_{m(n-1)} & a_{mn} \end{bmatrix}.$$
The elements of a matrix $A$: $a_{ij}$, with $i = 1, \dots, m$ and $j = 1, \dots, n$
$A = [a_{ij}]$
$A_i$: $i$-th row of $A$
$A^j$: $j$-th column of $A$
Matrices
Columnwise:
$$A = [A^1 \,|\, A^2 \,|\, \cdots \,|\, A^n], \quad A^j \in \mathbb{R}^m, \; j = 1, \dots, n$$
Rowwise:
$$A = \begin{bmatrix} A_1 \\ A_2 \\ \vdots \\ A_m \end{bmatrix}, \quad A_i \in \mathbb{R}^n, \; i = 1, \dots, m.$$
Matrices
Transposition
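$$A = [A^1 \,|\, A^2 \,|\, \cdots \,|\, A^n] \in \mathbb{R}^{m \times n}, \quad A' = \begin{bmatrix} (A^1)' \\ (A^2)' \\ \vdots \\ (A^n)' \end{bmatrix} \in \mathbb{R}^{n \times m},$$
or
$$A = \begin{bmatrix} A_1 \\ A_2 \\ \vdots \\ A_m \end{bmatrix} \in \mathbb{R}^{m \times n}, \quad A' = [(A_1)' \,|\, (A_2)' \,|\, \cdots \,|\, (A_m)'] \in \mathbb{R}^{n \times m}.$$
A square matrix $A$ is symmetric if
$$A = A', \quad a_{ij} = a_{ji}, \; \forall i, j$$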
Matrix operations
Product by a scalar
Given $A \in \mathbb{R}^{m \times n}$ and $\lambda \in \mathbb{R}$, then
$C = \lambda A \in \mathbb{R}^{m \times n}$, and $c_{ij} = \lambda a_{ij}$, $i = 1, \dots, m$, $j = 1, \dots, n$
Matrix product
Given $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{p \times q}$, the product $AB$ is defined if $n = p$; its elements are
$$(AB)_{ij} = A_i B^j.$$
The element in place $(i, j)$ is obtained as the scalar product between the row $A_i$ and the column $B^j$
In general, this product is not commutative: $AB \neq BA$.
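As a small illustration of the row-by-column rule, the sketch below multiplies two matrices stored as lists of rows and shows a pair for which $AB \neq BA$. The function name `matmul` is a hypothetical helper chosen for this example.

```python
def matmul(A, B):
    """Matrix product: (AB)_ij is the scalar product of row A_i and column B^j."""
    n = len(A[0])
    if any(len(row) != n for row in A) or len(B) != n:
        raise ValueError("the product AB requires columns(A) == rows(B)")
    q = len(B[0])
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(q)]
        for i in range(len(A))
    ]


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
print(matmul(A, B))  # [[2, 1], [4, 3]]
print(matmul(B, A))  # [[3, 4], [1, 2]]
```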
Matrix operations
Element by element (Hadamard) product
Given $A, B \in \mathbb{R}^{m \times n}$
$C = A * B$, $C \in \mathbb{R}^{m \times n}$, $c_{ij} = a_{ij} b_{ij}$, $i = 1, \dots, m$, $j = 1, \dots, n$
Kronecker (tensor) product
Given $A \in \mathbb{R}^{m \times n}$, $B \in \mathbb{R}^{p \times q}$, the Kronecker product $C = A \otimes B$ yields a matrix $C \in \mathbb{R}^{mp \times nq}$ defined by blocks as follows
$$C = \begin{bmatrix} a_{11}B & a_{12}B & \cdots & a_{1n}B \\ a_{21}B & a_{22}B & \cdots & a_{2n}B \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1}B & a_{m2}B & \cdots & a_{mn}B \end{bmatrix}.$$
In general, this product is not commutative: $A \otimes B \neq B \otimes A$.
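Both products can be sketched with list comprehensions. The helper names `hadamard` and `kron` are hypothetical; the index arithmetic in `kron` simply reads off which block $a_{ij}B$ a given entry of $C$ belongs to.

```python
def hadamard(A, B):
    """Element-by-element product: c_ij = a_ij * b_ij (same order required)."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("A and B must have the same order")
    return [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def kron(A, B):
    """Kronecker product: block (i, j) of the result equals a_ij * B."""
    p, q = len(B), len(B[0])
    return [
        [A[i // p][j // q] * B[i % p][j % q] for j in range(len(A[0]) * q)]
        for i in range(len(A) * p)
    ]


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
print(hadamard(A, B))  # [[0, 2], [3, 0]]
print(kron(A, B))      # [[0, 1, 0, 2], [1, 0, 2, 0], [0, 3, 0, 4], [3, 0, 4, 0]]
print(kron(B, A))      # [[0, 0, 1, 2], [0, 0, 3, 4], [1, 2, 0, 0], [3, 4, 0, 0]]
```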
Special matrices
The elements $a_{ij}$ with $i = j$ form the (principal) diagonal of the matrix $A$
Diagonal matrices: the off-diagonal elements are null
Identity matrix:
$$I = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & 1 \end{bmatrix}$$
Remark: $AI = A$, $IA = A$, for every $A$ for which the matrix product is defined
Special matrices
A square matrix is called upper (lower) triangular if the elements below (above) the diagonal are null
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ 0 & a_{22} & \ddots & \vdots \\ \vdots & \ddots & \ddots & a_{(n-1)n} \\ 0 & \cdots & 0 & a_{nn} \end{bmatrix}, \quad \text{upper triangular}$$
$$A = \begin{bmatrix} a_{11} & 0 & \cdots & 0 \\ a_{21} & a_{22} & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ a_{n1} & \cdots & a_{n(n-1)} & a_{nn} \end{bmatrix}, \quad \text{lower triangular}$$
Block partitioned matrices
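For example:
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix}, \quad \left[ \begin{array}{c|c} \begin{matrix} 1 & 3 & 5 \\ -9 & 0 & 4 \\ 0 & 1 & -2 \end{matrix} & \begin{matrix} 4 & -2 \\ 2 & -1 \\ 0 & 1 \end{matrix} \\ \hline \begin{matrix} -1 & 1 & -1 \end{matrix} & \begin{matrix} 0 & -1 \end{matrix} \end{array} \right]$$
$$\left[ \begin{array}{c|c} \begin{matrix} 1 & 2 \\ 0 & 0 \end{matrix} & \begin{matrix} 0 & 1 \\ 1 & 1 \end{matrix} \end{array} \right], \quad \begin{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} & [0] & [0] \\ [0] & \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} & [0] \\ [0] & [0] & \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} \end{bmatrix}$$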
Determinant
The determinant is a function $\det : \mathbb{R}^{n \times n} \to \mathbb{R}$ which associates a real number to each square matrix
Determinant calculation
Minors: given a square submatrix $\tilde{A}$ of $A$, its determinant $\det(\tilde{A})$ is a minor of $A$
Order $k$ minor: $\tilde{A}$ contains the elements of some $k$ rows and some $k$ columns of $A$, with $k = 1, 2, \dots, n$
Order $k$ principal minor: $\tilde{A}$ contains the elements of some $k$ rows and the corresponding $k$ columns of $A$, with $k = 1, 2, \dots, n$
Leading (or North-West) minor of order $k$: $\tilde{A}$ contains the elements of the first $k$ rows and columns of $A$
Complement: given $a_{ij}$, $\tilde{A}$ is obtained by deleting the row $A_i$ and the column $A^j$ from $A$; $\det(\tilde{A})$ is of order $(n - 1)$
Cofactor of $a_{ij}$: the product $(-1)^{i+j} \det(\tilde{A})$
Laplace rule
Given the square matrix $A \in \mathbb{R}^{n \times n}$, its determinant is obtained as follows:
fix a row index $i$, then
$$\det(A) = \sum_{j=1}^{n} a_{ij} c_{ij};$$
or fix a column index $j$, then
$$\det(A) = \sum_{i=1}^{n} a_{ij} c_{ij},$$
where $c_{ij}$ is the cofactor of the element $a_{ij}$
The determinant of a matrix of order $n$ is the sum of $n$ determinants of order $n - 1$ → recursion
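The recursion in the Laplace rule translates almost literally into code. The sketch below always expands along the first row; `det` and `complement` are hypothetical names, and the approach is only meant for small matrices, since the number of operations grows factorially with $n$.

```python
def complement(A, i, j):
    """Submatrix obtained by deleting row i and column j (0-based indexes)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]


def det(A):
    """Determinant by Laplace expansion along the first row."""
    n = len(A)
    if any(len(row) != n for row in A):
        raise ValueError("the determinant is defined for square matrices only")
    if n == 1:
        return A[0][0]
    # With 0-based indexes the sign (-1)^(0 + j) equals the 1-based cofactor sign.
    return sum((-1) ** j * A[0][j] * det(complement(A, 0, j)) for j in range(n))


print(det([[2, 1], [0, 3]]))                      # 6 (triangular: product of the diagonal)
print(det([[1, 3, 5], [-9, 0, 4], [0, 1, -2]]))   # -103
```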
Determinant
Given two square matrices $A$ and $B$ of order $n$,
$$\det(AB) = \det(A) \det(B).$$
If $A$ is triangular, $\det(A) = a_{11} a_{22} \cdots a_{nn}$
Transposition: $\det(A) = \det(A')$
Block matrices:
$$\det \left( \begin{bmatrix} A & B \\ [0] & D \end{bmatrix} \right) = \det \left( \begin{bmatrix} A & [0] \\ B & D \end{bmatrix} \right) = \det(A) \det(D)$$
with $A$ and $D$ square matrices
Rank
The rank of the matrix $A \in \mathbb{R}^{m \times n}$ is the maximum number of its linearly independent rows, which equals the maximum number of its linearly independent columns
$$\operatorname{rk}(A) \leqq \min\{m, n\}.$$
The rank of a matrix $A \in \mathbb{R}^{m \times n}$ equals the maximum order of its non-null minors
Theorem
Consider the matrices $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{n \times q}$, then
$$\operatorname{rk}(AB) \leqq \min\{\operatorname{rk}(A), \operatorname{rk}(B)\}.$$
If $B = A'$, then
$$\operatorname{rk}(AA') = \operatorname{rk}(A'A) = \operatorname{rk}(A).$$
Inverse
A matrix $A \in \mathbb{R}^{n \times n}$ is invertible (non-singular) if a matrix $A^{-1} \in \mathbb{R}^{n \times n}$ exists such that
$$AA^{-1} = I, \quad A^{-1}A = I.$$
A matrix $A$ is invertible if and only if $\det(A) \neq 0$.
When it exists, the inverse is unique
Remark: The inverse of a diagonal, upper triangular, or lower triangular matrix is diagonal, upper triangular, or lower triangular, respectively
Linear transformations
Definition
A function $f : \mathbb{R}^n \to \mathbb{R}^m$ endowed with the properties
$$f(x + y) = f(x) + f(y), \quad \forall x, y \in \mathbb{R}^n,$$
$$f(\alpha x) = \alpha f(x), \quad \forall \alpha \in \mathbb{R}, \; \forall x \in \mathbb{R}^n,$$
is called a linear transformation
A linear transformation $f : \mathbb{R}^n \to \mathbb{R}^m$ can be identified with an $m \times n$ matrix, the coefficient matrix
Given a column vector $x \in \mathbb{R}^n$, the product $Ax$ is an $\mathbb{R}^m$ column vector
$x \mapsto Ax$ satisfies
$$\begin{cases} A(x + y) = Ax + Ay, & \forall x, y \in \mathbb{R}^n, \\ A(\alpha x) = \alpha (Ax), & \forall \alpha \in \mathbb{R}, \; \forall x \in \mathbb{R}^n, \end{cases}$$
so it is a linear transformation $\mathbb{R}^n \ni x \mapsto Ax \in \mathbb{R}^m$.
Linear transformation
Consider the linear transformation $y = Ax$, $A \in \mathbb{R}^{m \times n}$
$\operatorname{rk}(A) = n$: each $x \in \mathbb{R}^n$ produces a different $y \in \mathbb{R}^m$
$\operatorname{rk}(A) = m$: every $y \in \mathbb{R}^m$ can be obtained by transforming (at least one) vector $x \in \mathbb{R}^n$
$\operatorname{rk}(A) < n$: different $x \in \mathbb{R}^n$ can produce the same $y \in \mathbb{R}^m$
$\operatorname{rk}(A) < m$: there are some $y \in \mathbb{R}^m$ which cannot be obtained by transforming an $x \in \mathbb{R}^n$
Special case: $\operatorname{rk}(A) = n = m$, one-to-one transformation
$A \in \mathbb{R}^{n \times n}$, $y = Ax$, $x = A^{-1}y$
Linear transformation
Consider a set of $n$ (column) vectors of $\mathbb{R}^m$, $\{A^1, A^2, \dots, A^n\}$. The set of all their linear combinations is a linear space of dimension $k = \operatorname{rk}(A)$, where $A = [A^1 \,|\, A^2 \,|\, \cdots \,|\, A^n] \in \mathbb{R}^{m \times n}$:
$$\{y \in \mathbb{R}^m \mid y = Ax, \; x \in \mathbb{R}^n\}$$
is a linear space; it is called the span of the set $\{A^1, A^2, \dots, A^n\}$ or the linear space generated by it.
Linear systems
Consider the system of $m$ linear equations in the $n$ variables $x_1, x_2, \dots, x_n$,
$$\begin{cases} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = b_2 \\ \quad \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = b_m. \end{cases}$$
Collecting the $a_{ij}$ coefficients into the matrix $A \in \mathbb{R}^{m \times n}$ and the right-hand side terms into the vector $b \in \mathbb{R}^m$, the system can be written in the form
$$Ax = b$$
Theorem (Rouché–Capelli)
The linear system $Ax = b$ admits a solution if and only if $\operatorname{rk}(A) = \operatorname{rk}(A \,|\, b)$.
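The Rouché–Capelli condition can be checked numerically by comparing two ranks. The sketch below computes the rank by Gaussian elimination with exact fractions; this is one possible way to obtain the rank (the slides define it through minors), and the names `rank` and `has_solution` are hypothetical.

```python
from fractions import Fraction


def rank(M):
    """Rank by Gaussian elimination on a copy of M, using exact arithmetic."""
    rows = [[Fraction(v) for v in row] for row in M]
    n_cols = len(rows[0]) if rows else 0
    rk = 0
    for col in range(n_cols):
        # Look for a non-null pivot in this column, at or below row rk.
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk


def has_solution(A, b):
    """Rouché–Capelli: Ax = b is solvable iff rk(A) == rk(A | b)."""
    augmented = [row + [bi] for row, bi in zip(A, b)]
    return rank(A) == rank(augmented)


A = [[1, 2], [2, 4]]
print(has_solution(A, [3, 6]))  # True: b lies in the span of the columns of A
print(has_solution(A, [3, 7]))  # False: rk(A) = 1 but rk(A | b) = 2
```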
Linear systems
The system $Ax = b$ can be written in the form
$$[A^1 \,|\, A^2 \,|\, \cdots \,|\, A^n] \, x = b$$
$$A^1 x_1 + A^2 x_2 + \cdots + A^n x_n = b$$
that is:
Is it possible to find $n$ real numbers $(x_1, x_2, \dots, x_n)$ such that the linear combination of the columns of $A$ is equal to $b$?
In other words, does the vector $b$ belong to the span of the columns of $A$?
The Rouché–Capelli theorem checks exactly this.
Linear systems
Consider the system $Ax = b$, $A \in \mathbb{R}^{m \times n}$, with $\operatorname{rk}(A) = \operatorname{rk}(A \,|\, b)$ (i.e. a solution exists).
$\operatorname{rk}(A) < m$: $m - \operatorname{rk}(A)$ equations are redundant
$\operatorname{rk}(A) < n$: $n - \operatorname{rk}(A)$ variables can be moved to the right-hand side
$\operatorname{rk}(A) = m < n$: there are $\infty^{n-m}$ solutions for every $b$
$\operatorname{rk}(A) = m = n$: Cramerian system, one solution for every $b$
Eigenvalues and eigenvectors
Consider the square matrix $A \in \mathbb{R}^{n \times n}$.
Do there exist a (complex) number $\lambda$ and a (complex) vector $x \in \mathbb{C}^n$, $x \neq [0]$, such that $Ax = \lambda x$?
That is, solve the problem
$$\begin{cases} Ax = \lambda x \\ x \neq [0] \end{cases}$$
In each pair $(\lambda, x)$ which solves this problem:
$\lambda$ is an eigenvalue of $A$
$x$ is an eigenvector of $A$ associated with the eigenvalue $\lambda$
The linear transformation acts as a simple scalar, transforming an eigenvector into a vector proportional to it
Eigenvalues and eigenvectors
$$Ax = \lambda x \;\Rightarrow\; Ax = \lambda I x \;\Rightarrow\; Ax - \lambda I x = [0] \;\Rightarrow\; (A - \lambda I)\, x = [0], \quad x \neq [0]$$
The last relation is a linear homogeneous system in $x$; therefore it admits a non-null solution if and only if its coefficient matrix $(A - \lambda I)$ is singular, that is, its determinant is null:
$$\det(A - \lambda I) = 0$$
The left-hand side is a polynomial of degree $n$ in $\lambda$: the characteristic polynomial of $A$
The roots (real, complex, single, multiple) of the characteristic polynomial are the eigenvalues of $A$
The set of the eigenvalues of $A$ is the spectrum of $A$
Eigenvalues and eigenvectors
The multiplicity of an eigenvalue as a root of the characteristic polynomial is called the algebraic multiplicity of the eigenvalue
If $x$ is an eigenvector of $A$ associated with the eigenvalue $\lambda$, then $\alpha x$, for every $\alpha \neq 0$, is an eigenvector of $A$ associated with $\lambda$ as well:
$$\begin{cases} Ax = \lambda x \\ x \neq [0] \\ \alpha \neq 0 \end{cases} \;\Rightarrow\; \begin{cases} A(\alpha x) = \lambda (\alpha x) \\ \alpha x \neq [0]. \end{cases}$$
$A$ is singular if and only if one of its eigenvalues is null
If $A$ is diagonal or triangular, its diagonal elements are its eigenvalues
$\det(A)$ is equal to the product of all the eigenvalues (counted with their algebraic multiplicity):
$$\det(A) = \prod_{i=1}^{n} \lambda_i.$$
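For a $2 \times 2$ matrix the characteristic polynomial is $\lambda^2 - \operatorname{tr}(A)\lambda + \det(A)$, so the eigenvalues follow from the quadratic formula. The sketch below is restricted to that special case; the function name `eigenvalues_2x2` is hypothetical, and `cmath` is used so that complex roots are handled as well.

```python
import cmath


def eigenvalues_2x2(A):
    """Roots of det(A - lambda I) = lambda^2 - tr(A) lambda + det(A) = 0."""
    (a, b), (c, d) = A
    trace = a + d
    determinant = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * determinant)
    return (trace - root) / 2, (trace + root) / 2


# Symmetric example: real eigenvalues 1/2 and 3/2, whose product equals det(A) = 3/4.
A = [[1.0, -0.5], [-0.5, 1.0]]
l1, l2 = eigenvalues_2x2(A)
print(l1, l2)
print(abs(l1 * l2 - 0.75) < 1e-12)  # the product of the eigenvalues matches det(A)

# Rotation by 90 degrees: the eigenvalues are the complex pair -i and +i.
print(eigenvalues_2x2([[0.0, -1.0], [1.0, 0.0]]))
```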
Eigenvalues and eigenvectors
Theorem
Eigenvectors associated with different eigenvalues are linearly independent
Theorem
The eigenvalues (and the eigenvectors) of a real symmetric matrix are real
Theorem
A symmetric matrix always has $n$ linearly independent eigenvectors.
It is possible to choose them to be orthogonal with norm 1
Theorem
If $A$ is non-singular and has eigenvalue-eigenvector pairs $(\lambda_i, x^i)$, $i = 1, \dots, n$, then the pairs $\left( \frac{1}{\lambda_i}, x^i \right)$, $i = 1, \dots, n$, are the eigenvalue-eigenvector pairs of $A^{-1}$
Diagonalization
Theorem: The matrix $A$ has $n$ linearly independent eigenvectors $x^1, x^2, \dots, x^n$ if and only if the matrix
$$X = [x^1 \; x^2 \; \cdots \; x^n]$$
is such that the product $X^{-1}AX$ is a diagonal matrix, with diagonal elements equal to the eigenvalues:
$$D = X^{-1}AX = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}$$
Theorem: If $A$ is symmetric, then the columns of $X$ can be chosen such that $X^{-1} = X'$ (orthogonal matrix)
Spectral decomposition
Theorem
A symmetric matrix $A$ can be decomposed as follows
$$A = \lambda_1 \left[ x^1 (x^1)' \right] + \lambda_2 \left[ x^2 (x^2)' \right] + \cdots + \lambda_n \left[ x^n (x^n)' \right],$$
where $\lambda_1, \lambda_2, \dots, \lambda_n$ are the eigenvalues of $A$ and $x^1, x^2, \dots, x^n$ are (column) eigenvectors respectively associated with them, chosen orthogonal with norm 1
An equivalent form is
$$A = XDX'.$$
Each matrix $\left[ x^i (x^i)' \right]$ has rank 1
Moreover, if $A = A'$ is non-singular, then
$$A^{-1} = (\lambda_1)^{-1} \left[ x^1 (x^1)' \right] + (\lambda_2)^{-1} \left[ x^2 (x^2)' \right] + \cdots + (\lambda_n)^{-1} \left[ x^n (x^n)' \right].$$
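A quick numerical check of the spectral decomposition, as a sketch: the matrix $\begin{bmatrix} 1 & -\frac12 \\ -\frac12 & 1 \end{bmatrix}$ has eigenvalues $\frac12$ and $\frac32$ with orthonormal eigenvectors $[1, 1]'/\sqrt{2}$ and $[1, -1]'/\sqrt{2}$. The helper names `outer` and `spectral_sum` are hypothetical.

```python
import math


def outer(x, y):
    """Rank-one matrix x y' built from two vectors."""
    return [[xi * yj for yj in y] for xi in x]


def spectral_sum(eigenpairs):
    """Sum of lambda_i * x^i (x^i)' over the given (lambda_i, x^i) pairs."""
    n = len(eigenpairs[0][1])
    total = [[0.0] * n for _ in range(n)]
    for lam, x in eigenpairs:
        rank_one = outer(x, x)
        for i in range(n):
            for j in range(n):
                total[i][j] += lam * rank_one[i][j]
    return total


s = 1.0 / math.sqrt(2.0)
pairs = [(0.5, [s, s]), (1.5, [s, -s])]

A_rebuilt = spectral_sum(pairs)                                   # close to [[1, -0.5], [-0.5, 1]]
A_inverse = spectral_sum([(1.0 / lam, x) for lam, x in pairs])    # close to [[4/3, 2/3], [2/3, 4/3]]
print(A_rebuilt)
print(A_inverse)
```

Replacing each eigenvalue with its reciprocal while keeping the same eigenvectors yields the inverse, as stated in the last formula above.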
Quadratic forms
Let the function $q : \mathbb{R}^n \to \mathbb{R}$ be defined as follows
$$q(x) = x'Ax + cx + c_0,$$
where $A \in \mathbb{R}^{n \times n}$, $c \in \mathbb{R}^n$ and $c_0 \in \mathbb{R}$
The function $q$ is a (complete) quadratic form
The function $q$ can be rewritten as follows
$$q(x) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j + \sum_{j=1}^{n} c_j x_j + c_0.$$
When $c_0$ and $c$ are null, the function
$$q(x) = x'Ax = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j$$
is called a homogeneous quadratic form, or simply a quadratic form
Quadratic forms
Remark: A function $f : \mathbb{R}^n \to \mathbb{R}^m$ is homogeneous of degree $k$ if
$$f(\alpha x) = \alpha^k f(x), \quad \forall \alpha \geqq 0.$$
A linear transformation is homogeneous of degree 1
A quadratic form is homogeneous of degree 2.
Let $q(x) = x'Ax$, $A \in \mathbb{R}^{n \times n}$, be a quadratic form.
The quadratic form $x'Bx$ with $B = \frac{1}{2}(A + A')$ is equivalent to $q$; in fact
$$x'Ax = x'Bx, \quad \forall x \in \mathbb{R}^n$$
Moreover, there is a one-to-one relation between quadratic forms and symmetric matrices
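The equivalence between $x'Ax$ and $x'Bx$ with $B = \frac{1}{2}(A + A')$ is easy to verify on a small example. The sketch below uses hypothetical helper names `quad_form` and `symmetrize`.

```python
def quad_form(A, x):
    """Homogeneous quadratic form: sum_i sum_j a_ij x_i x_j."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def symmetrize(A):
    """Symmetric matrix B = (A + A') / 2 associated with the same quadratic form."""
    n = len(A)
    return [[(A[i][j] + A[j][i]) / 2 for j in range(n)] for i in range(n)]


A = [[1, 4], [0, 3]]       # not symmetric
B = symmetrize(A)          # [[1.0, 2.0], [2.0, 3.0]]
x = [2, -1]
print(quad_form(A, x))     # -1
print(quad_form(B, x))     # -1.0
```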
Quadratic forms – Classification
A quadratic form is (see examples.sce)
positive definite if
$$x \neq [0] \Rightarrow x'Ax > 0;$$
negative definite if
$$x \neq [0] \Rightarrow x'Ax < 0;$$
semi-positive definite if
$$x \neq [0] \Rightarrow x'Ax \geqq 0, \quad \exists x \neq [0] : x'Ax = 0;$$
semi-negative definite if
$$x \neq [0] \Rightarrow x'Ax \leqq 0, \quad \exists x \neq [0] : x'Ax = 0;$$
indefinite if it can assume both positive and negative values:
$$\exists x^1, x^2 \in \mathbb{R}^n : (x^1)'Ax^1 > 0, \quad (x^2)'Ax^2 < 0.$$
Quadratic forms – Classification (test 1)
Let $A \neq [0]$ be a symmetric real matrix with eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$. Then
$A$ is positive definite $\Leftrightarrow$ $\lambda_j > 0$, $\forall j$,
$A$ is negative definite $\Leftrightarrow$ $\lambda_j < 0$, $\forall j$,
$A$ is positive semi-definite $\Leftrightarrow$ $\lambda_j \geqq 0$, $\forall j$, and $\exists h \in \{1, \dots, n\} : \lambda_h = 0$,
$A$ is negative semi-definite $\Leftrightarrow$ $\lambda_j \leqq 0$, $\forall j$, and $\exists h \in \{1, \dots, n\} : \lambda_h = 0$,
$A$ is indefinite $\Leftrightarrow$ $\exists h, k \in \{1, \dots, n\} : \lambda_h > 0, \; \lambda_k < 0$.
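The eigenvalue test can be sketched for the $2 \times 2$ symmetric case, where the two eigenvalues have the closed form $\frac{a + d}{2} \pm \sqrt{\left(\frac{a - d}{2}\right)^2 + b^2}$. The function name `classify_symmetric_2x2` and the tolerance `tol` are assumptions made for this illustration; a general implementation would rely on a numerical eigenvalue routine.

```python
import math


def classify_symmetric_2x2(A, tol=1e-12):
    """Classify the quadratic form x'Ax through the signs of the two eigenvalues."""
    (a, b), (b_lower, d) = A
    if abs(b - b_lower) > tol:
        raise ValueError("the matrix must be symmetric")
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    low, high = mean - radius, mean + radius   # the two (real) eigenvalues
    if low > tol:
        return "positive definite"
    if high < -tol:
        return "negative definite"
    if low < -tol and high > tol:
        return "indefinite"
    if high > tol:
        return "positive semi-definite"   # smallest eigenvalue is null
    if low < -tol:
        return "negative semi-definite"   # largest eigenvalue is null
    return "null matrix"


print(classify_symmetric_2x2([[1, -0.5], [-0.5, 1]]))  # positive definite
print(classify_symmetric_2x2([[1, 1], [1, 1]]))        # positive semi-definite
print(classify_symmetric_2x2([[1, 2], [2, 1]]))        # indefinite
print(classify_symmetric_2x2([[-2, 0], [0, -3]]))      # negative definite
```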
Quadratic forms – Classification (test 2)
Let $A \neq [0]$ be a symmetric real matrix; then $A$ is
positive definite $\Leftrightarrow$ all its $n$ leading minors are positive
negative definite $\Leftrightarrow$ its $n$ leading minors have alternating signs $\{-, +, -, +, \cdots\}$, starting from a negative minor of order 1 (so $\det(A)$ has the sign of $(-1)^n$)
positive semi-definite $\Leftrightarrow$ all its $(2^n - 1)$ principal minors are $\geqq 0$, and $\det(A) = 0$;
negative semi-definite $\Leftrightarrow$ its principal minors of order $k$ are $\geqq 0$ if $k$ is even, and $\leqq 0$ if $k$ is odd, and $\det(A) = 0$;
indefinite in all other cases
Functions of several variables
Consider the function $f : \mathbb{R}^n \supseteq X \to \mathbb{R}$. If the limit for $t \to 0$ of the partial difference quotient
$$\frac{f(x + t e^i) - f(x)}{t}$$
is finite, it is the partial derivative of $f(x)$ with respect to $x_i$:
$$\frac{\partial f(x)}{\partial x_i} = \lim_{t \to 0} \frac{f(x + t e^i) - f(x)}{t}$$
The (usually row) vector function $\nabla f : \mathbb{R}^n \to \mathbb{R}^n$ which collects the partial derivatives of $f$ is the gradient of $f$:
$$\nabla f = \left[ \frac{\partial f(x)}{\partial x_1} \;\; \frac{\partial f(x)}{\partial x_2} \;\; \cdots \;\; \frac{\partial f(x)}{\partial x_n} \right]$$
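The limit definition suggests a simple numerical approximation of the gradient. The sketch below uses a central difference, $\frac{f(x + t e^i) - f(x - t e^i)}{2t}$, which is a common variant of the one-sided quotient in the definition; the function name `numerical_gradient` and the step `t` are assumptions made for this example.

```python
def numerical_gradient(f, x, t=1e-6):
    """Approximate each partial derivative with a central difference of step t."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += t     # x + t e^i
        backward[i] -= t    # x - t e^i
        grad.append((f(forward) - f(backward)) / (2 * t))
    return grad


def f(x):
    # f(x) = x1^2 - x1 x2 + x2^2, whose exact gradient is [2 x1 - x2, -x1 + 2 x2]
    return x[0] ** 2 - x[0] * x[1] + x[1] ** 2


print(numerical_gradient(f, [1.0, 2.0]))  # approximately [0.0, 3.0]
```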
Functions of several variables
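The partial derivative of $\frac{\partial f(x)}{\partial x_i}$ with respect to $x_j$,
$$\frac{\partial \left[ \frac{\partial f(x)}{\partial x_i} \right]}{\partial x_j} = \lim_{t \to 0} \frac{\frac{\partial f(x + t e^j)}{\partial x_i} - \frac{\partial f(x)}{\partial x_i}}{t},$$
when finite, is the second-order partial derivative of $f$ with respect to $x_i$ and $x_j$, denoted by
$$\frac{\partial^2 f(x)}{\partial x_i \partial x_j}$$
The square $n \times n$ matrix which collects the second-order partial derivatives of $f$,
$$H_f(x) = \left[ \frac{\partial^2 f(x)}{\partial x_i \partial x_j} \right],$$
is the Hessian matrix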
Functions of several variables
A scalar function $f(x)$, defined on $X \subseteq \mathbb{R}^n$, belongs to the $C^0$ class ($f(x) \in C^0$) if $f(x)$ is continuous $\forall x \in X$
It belongs to the $C^k$ class, with $k \geqq 1$ an integer ($f(x) \in C^k$), if $f(x)$ is continuous $\forall x \in X$, together with all its partial derivatives of order $1, 2, \dots, k$.
Theorem (Schwarz)
If the scalar function $f(x)$, with $x \in X \subseteq \mathbb{R}^n$, belongs to $C^2$, then its Hessian matrix is symmetric:
$$H_f(x) = (H_f(x))'.$$
Functions of several variables
Consider a vector function of several variables $f : \mathbb{R}^n \to \mathbb{R}^m$
$$x \in \mathbb{R}^n, \quad f(x) = \begin{bmatrix} f_1(x) \\ f_2(x) \\ \vdots \\ f_m(x) \end{bmatrix} \in \mathbb{R}^m$$
Partial derivatives are defined on every component:
$$\frac{\partial f_j(x)}{\partial x_i}, \quad i = 1, \dots, n, \; j = 1, \dots, m$$
The matrix which collects all the first partial derivatives is the Jacobian matrix
$$J_f(x) = \begin{bmatrix} \nabla f_1(x) \\ \vdots \\ \nabla f_m(x) \end{bmatrix} = \left[ \frac{\partial f_i(x)}{\partial x_j} \right]$$
Functions of several variables
Definition
A function $f : \mathbb{R} \supseteq X \to \mathbb{R}$ is differentiable in $x^\circ \in \operatorname{int}(X)$ if a number $m$ exists (depending on $f$ and $x^\circ$ only) such that
$$f(x^\circ + h) - f(x^\circ) = mh + o(h)$$
where $o(h)$ is infinitesimal of higher order with respect to $h$ (that is, $\lim_{h \to 0} \frac{o(h)}{h} = 0$)
Definition
A function $f : \mathbb{R}^n \supseteq X \to \mathbb{R}$ is differentiable in $x^\circ \in \operatorname{int}(X)$ if there exists a vector $a \in \mathbb{R}^n$ (depending on $f$ and $x^\circ$ only) such that
$$f(x^\circ + h) - f(x^\circ) = ah + o(\|h\|)$$
Functions of several variables
Definition
A function $f : \mathbb{R}^n \supseteq X \to \mathbb{R}^m$ is differentiable in $x^\circ \in \operatorname{int}(X)$ if there exists a matrix $M \in \mathbb{R}^{m \times n}$ such that
$$f(x^\circ + h) - f(x^\circ) = Mh + o(\|h\|)$$
Definition
A function is called differentiable on the open set $X$ if it is differentiable at every point of $X$
Theorem
If $f : \mathbb{R}^n \supseteq X \to \mathbb{R}$ is differentiable in $x^\circ$, then at that point $f$ is continuous, it admits its $n$ partial derivatives and
$$f(x^\circ + h) - f(x^\circ) = ah + o(\|h\|), \quad \text{with } a = \nabla f(x^\circ)$$
Functions of several variables
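Theorem (First order Taylor expansion with Peano remainder)
Let $X \subseteq \mathbb{R}^n$ be an open set, $f : X \to \mathbb{R}$, $f \in C^2$ in $X$, and let the segment joining $x^\circ$ and $x^\circ + h$ lie in $X$, that is $\lambda x^\circ + (1 - \lambda)(x^\circ + h) \in X$, $\forall \lambda \in [0, 1]$. Then
$$f(x^\circ + h) = f(x^\circ) + \nabla f(x^\circ)\, h + o(\|h\|), \quad \text{for } h \to [0]$$
Theorem (Second order Taylor expansion with Peano remainder)
Let $X \subseteq \mathbb{R}^n$ be an open set, $f : X \to \mathbb{R}$, $f \in C^2$ in $X$, and let the segment joining $x^\circ$ and $x^\circ + h$ lie in $X$, that is $\lambda x^\circ + (1 - \lambda)(x^\circ + h) \in X$, $\forall \lambda \in [0, 1]$. Then
$$f(x^\circ + h) = f(x^\circ) + \nabla f(x^\circ)\, h + \frac{1}{2} h' H_f(x^\circ)\, h + o\left( \|h\|^2 \right), \quad \text{for } h \to [0]$$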
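The meaning of the remainder $o(\|h\|^2)$ can be illustrated numerically: the approximation error divided by $\|h\|^2$ should shrink as $h \to [0]$. The sketch below uses the hypothetical test function $f(x) = e^{x_1} x_2$ at $x^\circ = [0, 1]'$, where the gradient is $[1, 1]$ and the Hessian is $\begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}$ (both computed by hand for this example).

```python
import math


def f(x):
    return math.exp(x[0]) * x[1]


x0 = [0.0, 1.0]
grad = [1.0, 1.0]                  # gradient of f at x0, computed by hand
hess = [[1.0, 1.0], [1.0, 0.0]]    # Hessian of f at x0, computed by hand


def taylor2(h):
    """Second order Taylor polynomial of f around x0, evaluated at x0 + h."""
    linear = sum(g * hi for g, hi in zip(grad, h))
    quadratic = 0.5 * sum(hess[i][j] * h[i] * h[j] for i in range(2) for j in range(2))
    return f(x0) + linear + quadratic


def relative_error(t):
    """|f(x0 + h) - taylor2(h)| / ||h||^2 along the direction h = [t, t]."""
    h = [t, t]
    error = abs(f([x0[0] + h[0], x0[1] + h[1]]) - taylor2(h))
    return error / (h[0] ** 2 + h[1] ** 2)


for t in (0.1, 0.01, 0.001):
    print(t, relative_error(t))   # the ratio decreases roughly in proportion to t
```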
Unconstrained optimization
Definition
A point $x^\circ \in X$ is a maximum (minimum) point for $f : \mathbb{R}^n \supseteq X \to \mathbb{R}$ if
$$f(x^\circ) \geqq (\leqq)\; f(x), \quad \forall x \in X$$
It is called strong if the inequality is strict $\forall x \in X$, $x \neq x^\circ$
Definition
A point $x^\circ \in X$ is a local maximum (minimum) point for $f : \mathbb{R}^n \supseteq X \to \mathbb{R}$ if a neighborhood $I(x^\circ)$ exists such that
$$f(x^\circ) \geqq (\leqq)\; f(x), \quad \forall x \in X \cap I(x^\circ)$$
It is called strong if the inequality is strict $\forall x \in X \cap I(x^\circ)$, $x \neq x^\circ$.
$$\min_x f(x), \quad \text{with } f : \mathbb{R}^n \to \mathbb{R} \text{ differentiable}$$
For maximization problems, find the minimum of $g(x) = -f(x)$
Unconstrained optimization – Optimality conditions
Theorem (Fermat)
Let $f : \mathbb{R}^n \supseteq X \to \mathbb{R}$ be differentiable in $X$. If the vector $x^\circ \in \operatorname{int}(X)$ is a local maximum or minimum point for $f$, then
$$\nabla f(x^\circ) = [0]$$
The point $x^\circ$ is called stationary or critical for $f$.
The conditions (a system of $n$ equations in $n$ unknowns)
$$\nabla f(x^\circ) = [0]$$
are called first order (necessary) conditions (FOC)
(small digression) Convexity
Definition
A set $X \subseteq \mathbb{R}^n$ is convex if $\forall x^1, x^2 \in X$ and $\forall \lambda \in [0, 1]$, $\left[ \lambda x^1 + (1 - \lambda) x^2 \right] \in X$
Definition
A function $f : X \to \mathbb{R}$, defined on a convex set $X$, is convex (concave) if $\forall x^1, x^2 \in X$ and $\forall \lambda \in [0, 1]$,
$$f\left( \lambda x^1 + (1 - \lambda) x^2 \right) \leqq \lambda f(x^1) + (1 - \lambda) f(x^2) \quad (\geqq \text{ concave})$$
Definition
It is strictly convex (concave) if $\forall x^1, x^2 \in X$ with $x^1 \neq x^2$ and $\forall \lambda \in (0, 1)$,
$$f\left( \lambda x^1 + (1 - \lambda) x^2 \right) < \lambda f(x^1) + (1 - \lambda) f(x^2) \quad (> \text{ concave})$$
Unconstrained optimization – Optimality conditions
Theorem
Consider the convex (concave) function $f : X \to \mathbb{R}$, with $X$ convex in $\mathbb{R}^n$. Then
1 Each local minimum (maximum) point is global as well
2 The set of the minimum (maximum) points is convex
3 If $f$ is differentiable on the open convex set $X$, then each stationary point of $f$ is a global minimum (maximum) point
Remark: The FOC are necessary and sufficient for convex (concave) functions
Remark: If $f$ is strictly convex (concave) and $x^\circ \in X$ is a local minimum (maximum) point, then it is the unique global strict minimum (maximum) point
Constrained optimization – Equality constraints
Consider the problem
$$\min_x f(x) \quad \text{subject to } h(x) = \begin{bmatrix} h_1(x) \\ \vdots \\ h_m(x) \end{bmatrix} = [0]$$
objective function $f : \mathbb{R}^n \to \mathbb{R}$ ($n$ variables), $f \in C^2$
constraint function $h : \mathbb{R}^n \to \mathbb{R}^m$ ($m$ constraints)
The Lagrangian function is
$$\mathcal{L}(x, \lambda) = f(x) - \lambda h(x) = f(x) - \sum_{i=1}^{m} \lambda_i h_i(x)$$
($x \in \mathbb{R}^n$ column, $h(x) \in \mathbb{R}^m$ column, $\lambda \in \mathbb{R}^m$ row)
Constrained optimization – First order conditions
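Theorem (First order necessary conditions)
Let $x^\circ$ be an interior local minimum or maximum point of $f$ constrained to $h(x) = [0]$, and let the Jacobian $J_h(x^\circ)$ have rank $m$. Then a row vector $\lambda^\circ \in \mathbb{R}^m$ exists such that $(x^\circ, \lambda^\circ)$ is stationary for $\mathcal{L}(x, \lambda)$:
$$\nabla \mathcal{L}(x^\circ, \lambda^\circ) = \left. \begin{bmatrix} \frac{\partial \mathcal{L}}{\partial x_1} \\ \vdots \\ \frac{\partial \mathcal{L}}{\partial x_n} \\ \frac{\partial \mathcal{L}}{\partial \lambda_1} \\ \vdots \\ \frac{\partial \mathcal{L}}{\partial \lambda_m} \end{bmatrix} \right|_{x = x^\circ, \, \lambda = \lambda^\circ} = [0]$$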
Constrained optimization – Optimality conditions
Theorem
Consider the problem
$$\min_x f(x) \quad \text{subject to } h(x) = \begin{bmatrix} h_1(x) \\ \vdots \\ h_m(x) \end{bmatrix} = [0]$$
If the objective function $f$ is convex (concave for maximum problems) and the constraint function $h$ is linear (that is, each $h_i$ is linear, $i = 1, \dots, m$), then the stationary points of the Lagrangian function solve the problem.
Constrained optimization – Examples
Solve
$$\min_x f(x) = (x_1)^2 + (x_2)^2 \quad \text{subject to } x_1 + x_2 = 10$$
The Lagrangian function is
$$\mathcal{L}(x_1, x_2, \lambda) = (x_1)^2 + (x_2)^2 - \lambda (x_1 + x_2 - 10)$$
The gradient of the Lagrangian is
$$\nabla \mathcal{L}(x_1, x_2, \lambda) = [2x_1 - \lambda, \;\; 2x_2 - \lambda, \;\; -x_1 - x_2 + 10]$$
The FOC are
$$\begin{cases} 2x_1 - \lambda = 0 \\ 2x_2 - \lambda = 0 \\ -x_1 - x_2 + 10 = 0 \end{cases}$$
Constrained optimization – Examples
The stationary point for the Lagrangian is $(x_1^*, x_2^*, \lambda^*) = (5, 5, 10)$
[Figure: level sets of the objective function (blue) and the constraint (black)]
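The stationary point $(5, 5, 10)$ can be checked by substituting it into the gradient of the Lagrangian, and the minimum property can be probed by moving along the constraint $x_1 + x_2 = 10$. This is only a numerical sanity check under the formulas above; the function names are hypothetical.

```python
def f(x1, x2):
    """Objective function of the example."""
    return x1 ** 2 + x2 ** 2


def grad_lagrangian(x1, x2, lam):
    """Gradient of L = x1^2 + x2^2 - lam * (x1 + x2 - 10)."""
    return (2 * x1 - lam, 2 * x2 - lam, -x1 - x2 + 10)


print(grad_lagrangian(5, 5, 10))  # (0, 0, 0): the FOC hold

# Points (5 + d, 5 - d) satisfy the constraint; f should be larger there than at (5, 5).
for d in (-1, -0.1, 0.1, 1):
    print(d, f(5 + d, 5 - d) > f(5, 5))  # True for every d != 0
```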
Constrained optimization – Examples
Solve
$$\min_x f(x) = (x_1 + x_2) \quad \text{subject to } h(x) = (x_1)^2 + 2(x_2)^2 - 6 = 0$$
The Lagrangian function is
$$\mathcal{L}(x, \lambda) = x_1 + x_2 - \lambda \left( (x_1)^2 + 2(x_2)^2 - 6 \right),$$
the first order conditions are
$$\begin{cases} \mathcal{L}_{x_1} = 1 - 2\lambda x_1 = 0 \\ \mathcal{L}_{x_2} = 1 - 4\lambda x_2 = 0 \\ \mathcal{L}_{\lambda} = -\left( (x_1)^2 + 2(x_2)^2 - 6 \right) = 0 \end{cases}$$
Constrained optimization – Examples
From the first two equations, $x_1$ and $x_2$ cannot be null and
$$\lambda = \frac{1}{2x_1} = \frac{1}{4x_2}, \quad \text{therefore } x_1 = 2x_2.$$
Plugging this into the third equation,
$$-\left( (2x_2)^2 + 2(x_2)^2 - 6 \right) = 0,$$
we obtain the two solutions
$$(x^*, \lambda^*) = \left( [2, 1], \tfrac{1}{4} \right), \quad (x^{**}, \lambda^{**}) = \left( [-2, -1], -\tfrac{1}{4} \right)$$
The values of the objective function are
$$f(x^*) = f([2, 1]) = 3, \quad f(x^{**}) = f([-2, -1]) = -3$$
Constrained optimization – Examples
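$$\min_x \; (5x_1 + 2x_2 - x_3) \quad \text{subject to } \begin{cases} x_1 x_2 - 3 = 0 \\ x_1 x_3 - 1 = 0 \end{cases}$$
The Lagrangian function is
$$\mathcal{L}(x, \lambda) = 5x_1 + 2x_2 - x_3 - \lambda_1 (x_1 x_2 - 3) - \lambda_2 (x_1 x_3 - 1)$$
and the system $\nabla \mathcal{L}(x, \lambda) = [0]$ is
$$\begin{cases} \mathcal{L}_{x_1} = 5 - \lambda_1 x_2 - \lambda_2 x_3 = 0 \\ \mathcal{L}_{x_2} = 2 - \lambda_1 x_1 = 0 \\ \mathcal{L}_{x_3} = -1 - \lambda_2 x_1 = 0 \\ \mathcal{L}_{\lambda_1} = 3 - x_1 x_2 = 0 \\ \mathcal{L}_{\lambda_2} = 1 - x_1 x_3 = 0 \end{cases}$$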
Constrained optimization – Examples
Its solutions (the Lagrangian's stationary points) are
$$(x^*, \lambda^*) = \left( [1, 3, 1]', [2, -1] \right), \quad (x^{**}, \lambda^{**}) = \left( [-1, -3, -1]', [-2, 1] \right)$$
The values of the objective function at the stationary points are
$$f(x^*) = f([1, 3, 1]') = 10, \quad f(x^{**}) = f([-1, -3, -1]') = -10$$
Remark: Despite the function's values, it is possible to show that $x^*$ is a constrained local minimum, whereas $x^{**}$ is a constrained local maximum
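The remark can be made plausible with a small numerical experiment. On the constraint set, $x_2 = 3/x_1$ and $x_3 = 1/x_1$, so feasible points can be parametrized by $x_1 \neq 0$ alone; this parametrization and the names `feasible_point` and `restricted` are assumptions introduced for the sketch, not part of the original example.

```python
def f(x):
    """Objective function of the example."""
    return 5 * x[0] + 2 * x[1] - x[2]


def feasible_point(x1):
    """Point satisfying x1*x2 = 3 and x1*x3 = 1 for a given x1 != 0."""
    return [x1, 3 / x1, 1 / x1]


def restricted(x1):
    """Objective evaluated along the constraint set."""
    return f(feasible_point(x1))


# Near x1 = 1 the feasible values are above f(x*) = 10: local minimum.
print(restricted(0.9), restricted(1.0), restricted(1.1))
# Near x1 = -1 the feasible values are below f(x**) = -10: local maximum.
print(restricted(-0.9), restricted(-1.0), restricted(-1.1))
```

The two stationary points lie on different branches of the constraint set ($x_1 > 0$ and $x_1 < 0$), which is why a local minimum can have a larger value than a local maximum.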
Constrained optimization – Examples
Find the maximum and minimum of
$$f(x) = (x_1)^2 - x_1 x_2 + (x_2)^2$$
subject to
$$h(x) = (x_1)^2 + (x_2)^2 - 1 = 0$$
The Lagrangian function is
$$\mathcal{L}(x, \lambda) = (x_1)^2 - x_1 x_2 + (x_2)^2 - \lambda \left( (x_1)^2 + (x_2)^2 - 1 \right)$$
therefore, the FOC are
$$\begin{cases} \mathcal{L}_{x_1} = 2x_1 - x_2 - 2\lambda x_1 = 0 \\ \mathcal{L}_{x_2} = -x_1 + 2x_2 - 2\lambda x_2 = 0 \\ \mathcal{L}_{\lambda} = 1 - (x_1)^2 - (x_2)^2 = 0. \end{cases}$$
Constrained optimization – Examples
From the first two equations, we get
$$\begin{cases} 2(1 - \lambda) x_1 - x_2 = 0 \\ x_1 - 2(1 - \lambda) x_2 = 0. \end{cases}$$
The solution $x = [0]$ does not satisfy the constraint
Note that the system is linear and homogeneous in $x$; therefore, we have a non-null solution when the determinant of the coefficient matrix is null:
$$\det \left( \begin{bmatrix} 2(1 - \lambda) & -1 \\ 1 & -2(1 - \lambda) \end{bmatrix} \right) = -\left( 4\lambda^2 - 8\lambda + 3 \right) = 0 \;\Rightarrow\; \lambda_1 = \tfrac{1}{2}, \; \lambda_2 = \tfrac{3}{2}$$
With $\lambda_1 = \frac{1}{2}$ we get $x_1 = x_2$ and, substituting into the constraint, we obtain $x_1 = x_2 = \pm\sqrt{\frac{1}{2}}$
With $\lambda_2 = \frac{3}{2}$ we get $x_1 = -x_2$, and we obtain $x_1 = \pm\sqrt{\frac{1}{2}}$, $x_2 = -x_1$
Constrained optimization – Examples
The 4 stationary points are:
$$a = \left[ \sqrt{\tfrac{1}{2}}, \sqrt{\tfrac{1}{2}} \right]', \quad b = -a, \quad c = \left[ \sqrt{\tfrac{1}{2}}, -\sqrt{\tfrac{1}{2}} \right]', \quad d = -c.$$
The objective function can be written as the quadratic form associated with the positive definite symmetric matrix (eigenvalues $\frac{1}{2}$, $\frac{3}{2}$)
$$\begin{bmatrix} 1 & -\frac{1}{2} \\ -\frac{1}{2} & 1 \end{bmatrix}$$
The values of the function at the four stationary points are
$$f(a) = f(b) = \tfrac{1}{2}, \quad f(c) = f(d) = \tfrac{3}{2},$$
therefore $a$ and $b$ are minimum points and $c$ and $d$ are maximum points
Constrained optimization – Examples
The level sets are ellipses and the constraint defines a circle
[Figure: the constraint (black), the level sets, and the critical points]
Constrained optimization – Sensitivity analysis example
Consider the Cobb–Douglas production function $Q(L, K) = 20 L^{\frac{1}{2}} K^{\frac{1}{2}}$, with unit costs of labour $L$ and capital $K$ of 10 and 4 respectively; the available budget is 200
Maximize the production under the budget constraint
$$\max_{L, K} \; 20 L^{\frac{1}{2}} K^{\frac{1}{2}} \quad \text{subject to } 10L + 4K = 200$$
The Lagrangian function is
$$\mathcal{L}(L, K, \lambda) = 20 L^{\frac{1}{2}} K^{\frac{1}{2}} - \lambda (10L + 4K - 200)$$
and the FOC are
$$\begin{cases} 10 L^{-\frac{1}{2}} K^{\frac{1}{2}} = 10\lambda \\ 10 L^{\frac{1}{2}} K^{-\frac{1}{2}} = 4\lambda \\ 10L + 4K = 200 \end{cases}$$
Constrained optimization – Sensitivity analysis example
...
dividing the first equation by the second one, we get
10L−12K
12
10L12K− 1
2
=10λ
4λ⇒
K
L=
5
2, ⇒ K =
5
2L
substituting this relation into the constraint, we obtain $L^* = 10$, $K^* = 25$
and, using the first equation to solve for $\lambda$, we get $\lambda^* = \frac{5}{\sqrt{10}} \approx 1.58$
Consider the problem where the budget is $b$: the optimal value of the
Lagrange multiplier is
$$\lambda^* = \frac{dQ^*}{db}$$
that is
$$dQ^* = \lambda^*\,db, \qquad \Delta Q^* \simeq \lambda^*\,\Delta b \quad \text{(locally)}$$
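The interpretation of the multiplier can be checked numerically. The following sketch, assuming only the relations derived above (K = (5/2) L and the budget constraint 10L + 4K = b), recomputes the optimum for b = 200 and b = 201 and compares the change in optimal production with λ*. The helper names `production` and `optimum` are hypothetical.

```python
import math

def production(L, K):
    # Cobb-Douglas production function Q(L, K) = 20 * L^(1/2) * K^(1/2)
    return 20.0 * math.sqrt(L) * math.sqrt(K)

def optimum(b):
    # From the FOC, K = (5/2) L; the budget 10 L + 4 K = b then gives L = b / 20
    L = b / 20.0
    K = 2.5 * L
    lam = math.sqrt(K / L)  # first FOC: L^(-1/2) * K^(1/2) = lambda
    return L, K, lam

L_star, K_star, lam_star = optimum(200.0)
Q_star = production(L_star, K_star)

# Raise the budget by one unit and compare the change in Q* with lambda*
L_new, K_new, _ = optimum(201.0)
delta_Q = production(L_new, K_new) - Q_star

print(L_star, K_star, lam_star)
print(delta_Q)
```

In this particular example the optimal production is linear in the budget, so ΔQ* and λ*Δb agree up to rounding; in general the relation holds only locally.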
Quadratic programming
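Consider the problem
$$\begin{aligned}
\min_x \quad & \tfrac{1}{2}x'Qx + p'x && \text{quadratic objective function} \\
\text{s.t.} \quad & C^{(eq)}x = b^{(eq)} && m \text{ linear equality constraints} \\
& Cx \leqq b && s \text{ linear inequality constraints} \\
& l \leqq x \leqq u && \text{lower and upper bounds}
\end{aligned}$$
where:
$x \in \mathbb{R}^n$, $Q \in \mathbb{R}^{n\times n}$, $p \in \mathbb{R}^n$, $C^{(eq)} \in \mathbb{R}^{m\times n}$, $C \in \mathbb{R}^{s\times n}$
$b^{(eq)} \in \mathbb{R}^m$, $b \in \mathbb{R}^s$, $l \in \mathbb{R}^n$, $u \in \mathbb{R}^n$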
Quadratic programming – Example (vector notation)
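$$\begin{aligned} \min_x \quad & \tfrac{1}{2}x'Qx \\ \text{sub} \quad & u'x = 1 \end{aligned}$$
where
$x, u \in \mathbb{R}^n$ (column), $Q \in \mathbb{R}^{n\times n}$, $Q = Q'$, positive definite
The Lagrangian function is
$$\mathcal{L}(x,\lambda) = \tfrac{1}{2}x'Qx - \lambda(u'x - 1)$$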
Quadratic programming – Example (vector notation)
Remark The gradient of the linear function f (x) = u′x is ∇f (x) = u′
Remark The gradient of the homogeneous quadratic form f (x) = x ′Qx
with Q symmetric, is ∇f (x) = 2x ′Q
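The gradient of the Lagrangian is
$$\nabla\mathcal{L}(x,\lambda) = \left[\, x'Q - \lambda u',\ \ -u'x + 1 \,\right]$$
FOC
$$\left[\, x'Q - \lambda u',\ \ -u'x + 1 \,\right] = [0, \dots, 0]$$
$$\begin{bmatrix} (x'Q - \lambda u')' \\ -u'x + 1 \end{bmatrix} = \begin{bmatrix} [0] \\ 0 \end{bmatrix} \;\Rightarrow\; \begin{bmatrix} Qx - \lambda u \\ -u'x \end{bmatrix} = \begin{bmatrix} [0] \\ -1 \end{bmatrix}$$
...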
Quadratic programming – Example (vector notation)
...
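Rewrite the FOC in this way
$$\begin{bmatrix} Qx - \lambda u \\ -u'x \end{bmatrix} = \begin{bmatrix} [0] \\ -1 \end{bmatrix} \;\Leftrightarrow\; \begin{bmatrix} Q & -u \\ -u' & 0 \end{bmatrix}\begin{bmatrix} x \\ \lambda \end{bmatrix} = \begin{bmatrix} [0] \\ -1 \end{bmatrix}$$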
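If the matrix $Q$ is invertible and $u'Q^{-1}u \neq 0$ (both hold here, since $Q$ is positive definite and the constraint $u'x = 1$ requires $u \neq [0]$), then the matrix
$$\begin{bmatrix} Q & -u \\ -u' & 0 \end{bmatrix}$$
is invertible as well, therefore the stationary point is
$$\begin{bmatrix} x^* \\ \lambda^* \end{bmatrix} = \begin{bmatrix} Q & -u \\ -u' & 0 \end{bmatrix}^{-1}\begin{bmatrix} [0] \\ -1 \end{bmatrix}$$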
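To make the block system concrete, here is a minimal sketch that assembles it for a small hypothetical instance (a 2×2 positive definite Q and u equal to the sum vector) and solves it with a hand-written Gaussian elimination, so that only the Python standard library is needed. The data and the helper `solve_linear_system` are assumptions made for illustration. Solving the FOC by hand (Qx = λu together with u'x = 1) gives λ* = 1/(u'Q⁻¹u) and x* = λ* Q⁻¹u, which the numerical solution should reproduce.

```python
def solve_linear_system(A, rhs):
    # Gaussian elimination with partial pivoting on a copy of [A | rhs]
    n = len(A)
    M = [list(map(float, row)) + [float(r)] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the resulting upper triangular system
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (M[r][n] - tail) / M[r][r]
    return sol

# Hypothetical data: Q symmetric positive definite, u the sum vector
Q = [[2.0, 1.0], [1.0, 3.0]]
u = [1.0, 1.0]
n = len(u)

# Assemble the block matrix [[Q, -u], [-u', 0]] and the right-hand side [[0], -1]
kkt = [Q[i] + [-u[i]] for i in range(n)] + [[-ui for ui in u] + [0.0]]
rhs = [0.0] * n + [-1.0]

solution = solve_linear_system(kkt, rhs)
x_star, lam_star = solution[:n], solution[n]
print(x_star, lam_star)
```

For this instance Q⁻¹u = [2/5, 1/5]' and u'Q⁻¹u = 3/5, so the closed form gives x* = [2/3, 1/3]' and λ* = 5/3.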
Modelling examples
Collect the n realizations $x_i$, $i = 1, \dots, n$ of the random variable $X$ in the
vector $x \in \mathbb{R}^n$
Sample mean $E[X] = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1}{n}\mathbf{1}'x$, where $\mathbf{1}$ is the column sum
vector (all components equal to 1)
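Sample variance
$$\begin{aligned}
\mathrm{Var}(x) &= \frac{1}{n}\sum_{i=1}^{n}(x_i - E[X])^2 = \frac{1}{n}(x - E[X]\mathbf{1})'(x - E[X]\mathbf{1}) \\
&= \frac{1}{n}\left(x'x - x'E[X]\mathbf{1} - E[X]\mathbf{1}'x + E[X]^2\,\mathbf{1}'\mathbf{1}\right) \\
&= \frac{1}{n}\left(x'x - 2E[X]\,nE[X] + nE[X]^2\right) = \frac{1}{n}x'x - E[X]^2 = \frac{1}{n}x'x - \left(\frac{1}{n}\mathbf{1}'x\right)^2
\end{aligned}$$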
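The equality between the definition and the final vector formula can be verified on a small sample. The sketch below uses plain Python lists and a hypothetical four-point sample; like the slide, it divides by n (not n - 1).

```python
def dot(a, b):
    # Inner product a'b of two vectors of equal length
    return sum(ai * bi for ai, bi in zip(a, b))

x = [2.0, 4.0, 4.0, 6.0]   # hypothetical sample
n = len(x)
ones = [1.0] * n           # the sum vector 1

mean = dot(ones, x) / n                                  # (1/n) 1'x
deviations = [xi - mean for xi in x]                     # x - E[X] 1
var_definition = dot(deviations, deviations) / n         # (1/n)(x - E[X]1)'(x - E[X]1)
var_shortcut = dot(x, x) / n - (dot(ones, x) / n) ** 2   # (1/n) x'x - ((1/n) 1'x)^2
print(mean, var_definition, var_shortcut)
```

Both expressions give the same value, here 2 with mean 4.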
Modelling examples
Let x , y ∈ Rn collect the n realizations of the random variables X and Y
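Sample covariance between X and Y
$$\begin{aligned}
\mathrm{Cov}(X,Y) &= E[(X - E[X])(Y - E[Y])] = \frac{1}{n}\sum_{i=1}^{n}(x_i - E[X])(y_i - E[Y]) \\
&= \frac{1}{n}(x - E[X]\mathbf{1})'(y - E[Y]\mathbf{1}) \\
&= \frac{1}{n}\left(x'y - E[Y]x'\mathbf{1} - E[X]\mathbf{1}'y + E[X]E[Y]\,\mathbf{1}'\mathbf{1}\right) \\
&= \frac{1}{n}\left(x'y - E[Y]\,nE[X] - E[X]\,nE[Y] + nE[X]E[Y]\right) = \frac{1}{n}x'y - E[X]E[Y] \\
&= \frac{1}{n}x'y - \frac{1}{n^2}(\mathbf{1}'x)(\mathbf{1}'y)
\end{aligned}$$
Note that $\mathrm{Cov}(X,Y) = \mathrm{Cov}(Y,X)$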
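A similar check works for the covariance. This sketch, with hypothetical data and illustrative function names, compares the definition with the final vector formula and confirms the symmetry Cov(X,Y) = Cov(Y,X) on one example.

```python
def dot(a, b):
    # Inner product a'b of two vectors of equal length
    return sum(ai * bi for ai, bi in zip(a, b))

def sample_cov(x, y):
    # Vector formula: (1/n) x'y - (1/n^2) (1'x)(1'y)
    n = len(x)
    ones = [1.0] * n
    return dot(x, y) / n - dot(ones, x) * dot(ones, y) / n ** 2

def sample_cov_definition(x, y):
    # Definition: (1/n) * sum of products of deviations from the sample means
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / n

x = [1.0, 2.0, 3.0, 4.0]   # hypothetical realizations of X
y = [2.0, 1.0, 4.0, 3.0]   # hypothetical realizations of Y
print(sample_cov(x, y), sample_cov_definition(x, y), sample_cov(y, x))
```

Calling `sample_cov(x, x)` returns the sample variance of x, since the covariance formula reduces to the variance formula when the two vectors coincide.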
Modelling examples
Linear regression
$$y_i = \beta_0 + \beta_1 x_i^1 + \beta_2 x_i^2 + \dots + \beta_n x_i^n + \varepsilon_i, \quad i = 1, \dots, T$$
observations $y_i$, $x_i^j$, $i = 1, \dots, T$, $j = 1, \dots, n$
parameters $\beta_j$, $j = 0, \dots, n$
errors $\varepsilon_i$, $i = 1, \dots, T$, mean zero: $\frac{1}{T}\sum_{i=1}^{T}\varepsilon_i = 0$
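Matrix notation
$$Y = \begin{bmatrix} y_1 \\ \vdots \\ y_T \end{bmatrix}, \quad X = \begin{bmatrix} x_1^1 & \dots & x_1^n \\ \vdots & \ddots & \vdots \\ x_T^1 & \dots & x_T^n \end{bmatrix}, \quad \beta = \begin{bmatrix} \beta_1 \\ \vdots \\ \beta_n \end{bmatrix}, \quad \varepsilon = \begin{bmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_T \end{bmatrix}$$
$$Y = \beta_0\mathbf{1} + X\beta + \varepsilon$$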
Modelling examples
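A more common matrix notation
$$Y = \begin{bmatrix} y_1 \\ \vdots \\ y_T \end{bmatrix}, \quad X = \begin{bmatrix} 1 & x_1^1 & \dots & x_1^n \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_T^1 & \dots & x_T^n \end{bmatrix}, \quad \beta = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_n \end{bmatrix}, \quad \varepsilon = \begin{bmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_T \end{bmatrix}$$
$$Y = X\beta + \varepsilon$$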
Minimize the sum of the squares of the errors
$$\sum_{i=1}^{T}\varepsilon_i^2 = \varepsilon'\varepsilon = \|\varepsilon\|^2 = \varepsilon' I \varepsilon$$
Sample errors
$$\underbrace{\varepsilon}_{\text{errors}} = \underbrace{Y}_{\text{reality}} - \underbrace{X\beta}_{\text{model}}$$
Modelling examples
The objective function to be minimized is
$$\varepsilon'\varepsilon = (Y - X\beta)'(Y - X\beta) = Y'Y - Y'X\beta - \beta'X'Y + \beta'X'X\beta = Y'Y - 2Y'X\beta + \beta'X'X\beta$$
a complete quadratic form in $\beta$
The matrix $X'X \in \mathbb{R}^{(n+1)\times(n+1)}$ is
1 symmetric: $(X'X)' = X'X$
2 positive definite or semi-definite: $q(z) = z'(X'X)z = (z'X')(Xz) = \|Xz\|^2 \geqq 0$, $\forall z \in \mathbb{R}^{n+1}$
3 $\mathrm{rk}(X'X) = \mathrm{rk}(X)$
Modelling examples
The gradient of the objective function is
$$2\beta'(X'X) - 2Y'X$$
The FOC are
$$(X'X)\beta = X'Y$$
If $X'X$ is invertible ($\Leftrightarrow \mathrm{rk}(X) = n + 1$), then
$$\beta^* = (X'X)^{-1}X'Y$$
This $\beta^*$ is commonly known as the ordinary least squares (OLS) parameter
estimate in linear regression models
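As an illustration of the normal equations, the sketch below fits a line to a hypothetical data set with one regressor (n = 1), using plain Python lists. It builds X'X and X'Y, solves the resulting 2×2 system by Cramer's rule (a shortcut that is only convenient at this size and is not implied by the slides), and checks that the residuals are orthogonal to the columns of X, which is just the FOC rewritten as X'(Y - Xβ) = [0].

```python
# Hypothetical data: T = 4 observations of one regressor (n = 1)
x = [0.0, 1.0, 2.0, 3.0]
Y = [1.1, 2.9, 5.2, 6.8]

# Design matrix with a leading column of ones, so beta = [beta_0, beta_1]
X = [[1.0, xi] for xi in x]
T, k = len(X), len(X[0])

# Normal equations (X'X) beta = X'Y
XtX = [[sum(X[t][i] * X[t][j] for t in range(T)) for j in range(k)] for i in range(k)]
XtY = [sum(X[t][i] * Y[t] for t in range(T)) for i in range(k)]

# Solve the 2x2 system by Cramer's rule (valid because det(X'X) != 0 here)
det = XtX[0][0] * XtX[1][1] - XtX[0][1] * XtX[1][0]
beta = [
    (XtY[0] * XtX[1][1] - XtX[0][1] * XtY[1]) / det,
    (XtX[0][0] * XtY[1] - XtX[1][0] * XtY[0]) / det,
]

# At the optimum the residuals are orthogonal to every column of X: X' eps = [0]
eps = [Y[t] - sum(X[t][j] * beta[j] for j in range(k)) for t in range(T)]
Xt_eps = [sum(X[t][i] * eps[t] for t in range(T)) for i in range(k)]
print(beta, Xt_eps)
```

For these data X'X = [[4, 6], [6, 14]] and X'Y = [16, 33.7]', so the estimate is β0 = 1.09 and β1 = 1.94 up to rounding. The first component of X'ε being zero is the zero-mean condition on the errors.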