









Matrix Algebra Final Notes (Cheat Sheet)
A matrix is a way of organizing information.
It is a rectangular array of elements arranged in rows and columns. For example, the following matrix A has m rows and n columns:

A = \begin{bmatrix}
a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\
a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\
a_{31} & a_{32} & a_{33} & \cdots & a_{3n} \\
\vdots & \vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mn}
\end{bmatrix}

All elements can be identified by a typical element a_{ij}, where i = 1, 2, ..., m denotes rows and j = 1, 2, ..., n denotes columns.
A matrix is of order (or dimension) m by n (also denoted as (m x n)).
A matrix that has a single column is called a column vector.
A matrix that has a single row is called a row vector.
The transpose of a matrix or vector is formed by interchanging the rows and the columns. A matrix of
order (m x n) becomes of order (n x m) when transposed.
For example, if a (2 x 3) matrix A is defined by

A = \begin{bmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23}
\end{bmatrix}

then the transpose of A, denoted by A', is the (3 x 2) matrix

A' = \begin{bmatrix}
a_{11} & a_{21} \\
a_{12} & a_{22} \\
a_{13} & a_{23}
\end{bmatrix}
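As a quick numerical illustration, here is a minimal Python sketch (assuming numpy is available; the matrix values are illustrative, not from the text):

```python
import numpy as np

# Illustrative (2 x 3) matrix
A = np.array([[1, 2, 3],
              [4, 5, 6]])

A_t = A.T                        # rows and columns interchanged
print(A.shape, A_t.shape)        # (2, 3) (3, 2)
print(np.array_equal(A_t.T, A))  # True: transposing twice recovers A
```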
A symmetric matrix has the same number of rows as it has columns (it is square), and the off-diagonal elements are symmetric (i.e., a_{ij} = a_{ji} for all i and j). For example,

A = \begin{bmatrix}
1 & 2 \\
2 & 4
\end{bmatrix}

is symmetric.
A special case is the identity matrix, which has 1's on the diagonal positions and 0's on the off-diagonal positions.

The identity matrix is a diagonal matrix. A diagonal matrix can be denoted by diag(a_1, a_2, ..., a_n), where a_i is the i-th element on the diagonal position and zeros occur elsewhere. So, we can write the identity matrix as I_n = diag(1, 1, ..., 1).
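In numpy terms (a trivial but handy sketch):

```python
import numpy as np

D = np.diag([1.0, 2.0, 3.0])   # diag(a_1, a_2, a_3): zeros off the diagonal
I = np.eye(3)                  # the identity matrix I_3 = diag(1, 1, 1)
print(D)
print(I)
```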
Matrices can be added and subtracted as long as they are of the same dimension. The addition of
matrix A and matrix B is the addition of the corresponding elements of A and B. So, C = A + B
implies that cij = aij + bij for all i and j.
For example, if

A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix}

then

C = A + B = \begin{bmatrix} 6 & 8 \\ 10 & 12 \end{bmatrix}
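The same addition in numpy (element-wise; the shapes must match):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

C = A + B     # element-wise: c_ij = a_ij + b_ij
print(C)      # [[ 6  8]
              #  [10 12]]
print(A - B)  # subtraction works element-wise as well
```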
A matrix with elements all zero is called a null matrix.
The trace of a square matrix A, denoted by tr(A), is defined to be the sum of its diagonal elements.
tr(A) = a_{11} + a_{22} + a_{33} + \cdots + a_{nn}

A useful property of the trace is

tr(AA') = tr(A'A) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}^{2}
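Both trace formulas are easy to verify in numpy (values illustrative):

```python
import numpy as np

A = np.array([[1., 2.],
              [3., 4.]])

print(np.trace(A))        # 5.0 = a_11 + a_22
# tr(AA') = tr(A'A) = sum of squared elements
print(np.trace(A @ A.T))  # 30.0
print(np.trace(A.T @ A))  # 30.0
print((A ** 2).sum())     # 30.0
```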
The determinant of a square matrix A, denoted by det(A) or |A|, is a uniquely defined scalar number associated with the matrix.
i) for a single-element matrix (a scalar, A = a_{11}), det(A) = a_{11}.
ii) in the (2 x 2) case,

A = \begin{bmatrix}
a_{11} & a_{12} \\
a_{21} & a_{22}
\end{bmatrix}

the determinant is defined to be the difference of two terms as follows,

|A| = a_{11} a_{22} - a_{12} a_{21}

which is obtained by multiplying the two elements in the principal diagonal of A and then subtracting the product of the two off-diagonal elements.
iii) in the (3 x 3) case,

A = \begin{bmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33}
\end{bmatrix}

the determinant can be computed by expanding along the first row:

|A| = a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
    - a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}
    + a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}
iv) for general cases, we start first by defining the minor of element a_{ij} as the determinant of the submatrix of A that arises when the i-th row and the j-th column are deleted; it is usually denoted |A_{ij}|. The cofactor of the element a_{ij} is c_{ij} = (-1)^{i+j} |A_{ij}|. Finally, the determinant of an (n x n) matrix can be defined as

|A| = \sum_{j=1}^{n} a_{ij} c_{ij} \quad \text{for any row } i = 1, 2, \ldots, n

|A| = \sum_{i=1}^{n} a_{ij} c_{ij} \quad \text{for any column } j = 1, 2, \ldots, n
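The cofactor expansion translates directly into code. Here is a minimal recursive sketch in Python with numpy, for illustration only; in practice np.linalg.det is far more efficient:

```python
import numpy as np

def det_cofactor(A):
    """Determinant by cofactor expansion along the first row."""
    n = A.shape[0]
    if n == 1:                     # case i): a single-element matrix
        return A[0, 0]
    total = 0.0
    for j in range(n):
        # minor |A_1j|: delete the first row and the j-th column (0-based)
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        cofactor = (-1) ** j * det_cofactor(minor)   # (-1)^(1+j) in 1-based terms
        total += A[0, j] * cofactor
    return total

A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 10.]])
print(det_cofactor(A), np.linalg.det(A))   # both -3.0 (up to rounding)
```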
Some useful properties of determinants follow. Multiplying a single row or column of A by a scalar k multiplies the determinant by k:

\begin{vmatrix} a & kc \\ b & kd \end{vmatrix} = \begin{vmatrix} ka & c \\ kb & d \end{vmatrix} = k \begin{vmatrix} a & c \\ b & d \end{vmatrix}

Consequently, |kA| = k^{n} |A|, for scalar k and (n x n) matrix A.

If one column (or row) is a scalar multiple of another column (or row), the determinant is zero, e.g.,

\begin{vmatrix} a & ka \\ b & kb \end{vmatrix} = k \begin{vmatrix} a & a \\ b & b \end{vmatrix} = k(ab - ab) = 0
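Both properties are easy to verify numerically (a sketch with illustrative values):

```python
import numpy as np

A = np.array([[1., 2.],
              [3., 4.]])
k, n = 3.0, 2

# |kA| = k^n |A|
print(np.isclose(np.linalg.det(k * A), k**n * np.linalg.det(A)))  # True

# A column that is a multiple of another column gives a zero determinant
B = np.array([[1., 2.],
              [3., 6.]])    # second column = 2 x first column
print(np.isclose(np.linalg.det(B), 0.0))                          # True
```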
Rank and linear dependency are key concepts for econometrics. The rank of any (m x n) matrix can be
defined (i.e., the matrix does not need to be square, as was the case for the determinant and trace) and
is inherently linked to the invertibility of the matrix.
The rank of a matrix A is equal to the dimension of the largest square submatrix of A that has a
nonzero determinant. A matrix is said to be of rank r if and only if it has at least one submatrix of
order r with a nonzero determinant but has no submatrices of order greater than r with nonzero
determinants.
For example, the matrix

A = \begin{bmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
5 & 7 & 9
\end{bmatrix}

has rank 2: its third row is the sum of the first two, so det(A) = 0, but the (2 x 2) submatrix formed from the first two rows and columns has determinant 1 \cdot 5 - 2 \cdot 4 = -3 \neq 0.
Furthermore, the maximum number of linearly independent (m x 1) vectors is m. For example, consider a set of two linearly independent (2 x 1) vectors, v_1 and v_2. If there is a third vector

b = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}

where b_1 and b_2 can be any numbers, then three scalars c_1, c_2, and c_3 (not all zero) can always be found by solving the following set of equations,

c_1 v_1 + c_2 v_2 + c_3 \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = 0

In other words, the addition of any third vector will result in a (2 x 3) matrix that is not of full column rank (its rank is still 2); the third vector is linearly dependent on the first two.
Generally speaking, this is because any set of m linearly independent (m x 1) vectors are said to span
m-dimensional space. This means, by definition, that any (m x 1) vector can be represented as a linear
combination of the m vectors that span the space. The set of m vectors therefore is also said to form a
basis for m-dimensional space.
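These rank facts are easy to check numerically; a minimal numpy sketch (the matrix mirrors the rank-2 example above):

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [5., 7., 9.]])   # third row = first row + second row

print(np.linalg.matrix_rank(A))           # 2: a (2 x 2) submatrix is nonsingular
print(np.isclose(np.linalg.det(A), 0.0))  # True: the (3 x 3) determinant vanishes
```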
Recall that |A| = 0 if and only if A is singular.
There are operations on the rows/columns of a matrix that leave its rank unchanged:
i) interchanging any two rows (or columns);
ii) multiplying a row (or column) by a nonzero scalar;
iii) adding a scalar multiple of one row (or column) to another row (or column).
The inverse of a nonsingular (n x n) matrix A is another (n x n) matrix, denoted by A^{-1}, that satisfies the following equalities: A^{-1} A = A A^{-1} = I. The inverse of a nonsingular (n x n) matrix is unique.
The inverse of a matrix A in terms of its elements can be obtained from the following formula:

A^{-1} = \frac{1}{|A|} C'

where C = [c_{ij}] and c_{ij} = (-1)^{i+j} |A_{ij}|. Note that C' is the transpose of the matrix of cofactors of A as defined in the section on determinants. C' is also called the adjoint of A.
For example, let

A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}

Then det(A) = -2 and the cofactors are c_{11} = 4, c_{22} = 1, c_{12} = -3, c_{21} = -2. So, the inverse is calculated as,

A^{-1} = \frac{1}{-2} \begin{bmatrix} 4 & -2 \\ -3 & 1 \end{bmatrix} = \begin{bmatrix} -2 & 1 \\ 3/2 & -1/2 \end{bmatrix}

Some useful properties of the inverse:

(A')^{-1} = (A^{-1})'

A^{-1} is nonsingular whenever A is nonsingular.

(AB)^{-1} = B^{-1} A^{-1}
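Here is a minimal numpy check of the cofactor formula against the worked example above:

```python
import numpy as np

A = np.array([[1., 2.],
              [3., 4.]])

C = np.array([[ 4., -3.],   # [c_11, c_12]
              [-2.,  1.]])  # [c_21, c_22]
A_inv = C.T / np.linalg.det(A)   # A^{-1} = (1/|A|) C'

print(A_inv)                                 # [[-2.   1. ]
                                             #  [ 1.5 -0.5]]
print(np.allclose(A_inv, np.linalg.inv(A)))  # True
print(np.allclose(A @ A_inv, np.eye(2)))     # True: A A^{-1} = I
```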
Consider the following system of linear equations: Ax = b, where A is an (m x n) matrix of known coefficients, x is an (n x 1) vector of unknown variables, and b is an (m x 1) vector of known coefficients.
We want to find the conditions under which: 1) the system has no solution, 2) the system has infinitely
many solutions, 3) the system has a unique solution. Define the matrix A|b as the augmented matrix of
A. This is just the matrix A with the b vector attached on the end. The dimension of A|b is therefore
(m x (n+1)).
Succinctly put, the conditions for the three types of solutions are as follows. (Note: there are numerous ways of characterizing the solutions, but we will stick to the simplest representation):
1) No solution if and only if rank(A|b) > rank(A).
2) Infinitely many solutions if and only if rank(A|b) = rank(A) < n.
3) A unique solution if and only if rank(A|b) = rank(A) = n.
Let’s look at each case.
Case 1: No Solution
Intuition: if rank(A|b) > rank(A), then b is not in the space spanned by A; so b cannot be represented as
a linear combination of A; so there is no x that solves (Ax = b); so there is no solution.
Case 2: Infinitely Many Solutions
Intuition: if rank(A|b) = rank(A) < n, then b is in the space spanned by A, so a solution exists; but there are fewer independent equations than unknowns, which leaves “free variables” that can take any value. Each choice of the free variables gives a solution, so there are infinitely many.
Case 3: Unique Solution
Intuition: if rank(A|b) = rank(A), then b is in the space spanned by A; so b can be represented as a
linear combination of A; so there exists an x that solves (Ax = b). Because rank(A) = n, there are equal
numbers of variables and equations. This gives us no “free variables” and therefore a single solution.
Consider the following system,

x_1 + x_2 = 3
x_1 - x_2 = 1
2x_1 + x_2 = 5

or

\begin{bmatrix} 1 & 1 \\ 1 & -1 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \\ 5 \end{bmatrix}

Here rank(A) = 2 because \begin{vmatrix} 1 & 1 \\ 1 & -1 \end{vmatrix} = -2 \neq 0, and rank(A|b) = 2 because the only (3 x 3) submatrix of A|b is A|b itself and its determinant is zero.

So, rank(A|b) = rank(A) = 2 = n < m. There is full column rank, and the system can be uniquely solved. In fact, any two independent equations can be used to solve for the x's. The solution is x_1 = 2, x_2 = 1.
In econometrics, we often deal with square matrices, so the following is important for us: when A is square (m = n) and nonsingular, rank(A|b) = rank(A) = n automatically, and the unique solution can be written as x = A^{-1} b.
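A numpy sketch of the rank conditions and the solution, using the example system above (np.linalg.lstsq is used here only as a convenient solver for the consistent system):

```python
import numpy as np

# The (3 x 2) example system above: rank(A|b) = rank(A) = 2 = n < m
A = np.array([[1.,  1.],
              [1., -1.],
              [2.,  1.]])
b = np.array([3., 1., 5.])

Ab = np.column_stack([A, b])   # the augmented matrix A|b
print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(Ab))  # 2 2 -> unique solution

x, *_ = np.linalg.lstsq(A, b, rcond=None)   # solves the consistent system
print(x)                                    # [2. 1.]

# Square nonsingular case: x = A^{-1} b, using any two independent equations
print(np.linalg.solve(A[:2], b[:2]))        # [2. 1.]
```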
Let A be an (M x N) matrix and B be a (K x L) matrix. Then the Kronecker product (or direct
product) of A and B, written as A ⊗ B , is defined as the (MK x NL) matrix
A \otimes B = \begin{bmatrix}
a_{11}B & a_{12}B & \cdots & a_{1N}B \\
a_{21}B & a_{22}B & \cdots & a_{2N}B \\
\vdots & \vdots & & \vdots \\
a_{M1}B & a_{M2}B & \cdots & a_{MN}B
\end{bmatrix}
For example, if

A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 0 & 5 \\ 6 & 7 \end{bmatrix}

their Kronecker product is

A \otimes B = \begin{bmatrix} 1B & 2B \\ 3B & 4B \end{bmatrix} = \begin{bmatrix} 0 & 5 & 0 & 10 \\ 6 & 7 & 12 & 14 \\ 0 & 15 & 0 & 20 \\ 18 & 21 & 24 & 28 \end{bmatrix}
Note that (A \otimes B)^{-1} = A^{-1} \otimes B^{-1} for nonsingular A and B.
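A short numpy check of the Kronecker product and its inverse property, using the example values above:

```python
import numpy as np

A = np.array([[1., 2.],
              [3., 4.]])
B = np.array([[0., 5.],
              [6., 7.]])

K = np.kron(A, B)    # the (4 x 4) block matrix with blocks a_ij * B
print(K.shape)       # (4, 4)

# (A kron B)^{-1} = A^{-1} kron B^{-1} for nonsingular A and B
lhs = np.linalg.inv(K)
rhs = np.kron(np.linalg.inv(A), np.linalg.inv(B))
print(np.allclose(lhs, rhs))   # True
```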
In least squares and maximum likelihood estimation, we need to take derivatives of the objective
function with respect to a vector of parameters.
Let a function relating y, a scalar, to a set of variables x_1, x_2, ..., x_n be y = f(x_1, x_2, ..., x_n), or, compactly, y = f(x), where x is an (n x 1) vector.
The gradient of y is the vector of derivatives of y with respect to each element of x:

\frac{\partial y}{\partial x} = \begin{bmatrix}
\partial y / \partial x_{1} \\
\partial y / \partial x_{2} \\
\vdots \\
\partial y / \partial x_{n}
\end{bmatrix}
Notice the matrix of derivatives of y is a column vector because y is differentiated with respect to x, an
(n x 1) column vector.
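Gradients are easy to check numerically. A minimal sketch with an illustrative function f(x_1, x_2) = x_1^2 + 3 x_1 x_2:

```python
import numpy as np

# Central-difference gradient of an illustrative f(x1, x2)
f = lambda v: v[0] ** 2 + 3 * v[0] * v[1]
x = np.array([1.0, 2.0])

eps = 1e-6
grad = np.array([(f(x + eps * np.eye(2)[i]) - f(x - eps * np.eye(2)[i])) / (2 * eps)
                 for i in range(2)])
print(grad)   # approximately [8. 3.] = [2*x1 + 3*x2, 3*x1]
```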
The same operations can be extended to derivatives with respect to an (m x n) matrix X, such as \partial y / \partial X = [\partial y / \partial x_{ij}], the (m x n) matrix of partial derivatives of y with respect to each element x_{ij}.
Based on the previous definitions, the rules of derivatives in matrix notation can be established for reference. Consider the following function z = c'x, where c is an (n x 1) vector that does not depend on x, x is an (n x 1) vector, and z is a scalar. Then

\frac{\partial z}{\partial x} = \begin{bmatrix}
\partial z / \partial x_{1} \\
\partial z / \partial x_{2} \\
\vdots \\
\partial z / \partial x_{n}
\end{bmatrix} = \begin{bmatrix}
c_{1} \\
c_{2} \\
\vdots \\
c_{n}
\end{bmatrix} = c
If z = C'x, where C is an (n x n) matrix and x is an (n x 1) vector, then

\frac{\partial z}{\partial x} = \begin{bmatrix} c_{1} & c_{2} & \cdots & c_{n} \end{bmatrix} = C

where c_i is the i-th column (remember each c_i is a vector) of C.
Now consider the quadratic form z = x'Ax, where A is an (n x n) matrix. Then

\frac{\partial z}{\partial x} = (A + A')x

The proof of this result is given in the appendix.
If A is a symmetric matrix (A = A'), then

\frac{\partial z}{\partial x} = 2Ax
For the second derivatives, for any square matrix A,

\frac{\partial^{2} (x'Ax)}{\partial x \, \partial x'} = A + A'

and if A = A' (if A is symmetric), then

\frac{\partial^{2} (x'Ax)}{\partial x \, \partial x'} = 2A
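Both first-derivative results can be checked by finite differences. A minimal numpy sketch with a randomly drawn (hence generally non-symmetric) A:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))   # a general (typically non-symmetric) square matrix
x = rng.standard_normal(n)

z = lambda v: v @ A @ v           # the quadratic form z = x'Ax

eps = 1e-6
grad_num = np.array([(z(x + eps * np.eye(n)[i]) - z(x - eps * np.eye(n)[i])) / (2 * eps)
                     for i in range(n)])
print(np.allclose(grad_num, (A + A.T) @ x, atol=1e-5))   # True: dz/dx = (A + A')x
```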
Some other rules (x is a scalar, unless noted otherwise):

\frac{\partial (x'By)}{\partial B} = x y', where x and y are (n x 1) column vectors and B is an (n x n) matrix

\frac{\partial |A|}{\partial A} = |A| \, (A^{-1})'

\frac{\partial \ln |A|}{\partial A} = (A^{-1})'

\frac{\partial A^{-1}}{\partial x} = -A^{-1} \left( \frac{\partial A}{\partial x} \right) A^{-1}
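The determinant rules can also be verified by finite differences. A sketch for \partial \ln|A| / \partial A = (A^{-1})'; the shift by 3I simply keeps the illustrative A safely away from singularity:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 3 * np.eye(3)   # keep A comfortably nonsingular

eps = 1e-6
num = np.zeros_like(A)
for i in range(3):
    for j in range(3):
        E = np.zeros_like(A)
        E[i, j] = eps
        num[i, j] = (np.log(abs(np.linalg.det(A + E)))
                     - np.log(abs(np.linalg.det(A - E)))) / (2 * eps)

print(np.allclose(num, np.linalg.inv(A).T, atol=1e-5))   # True: d ln|A| / dA = (A^{-1})'
```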
Since this review was by no means complete, if you want to learn more about matrix algebra, the
following are good references:
Anton, Howard (1994), Elementary Linear Algebra, 7th edition, New York: John Wiley & Sons.
The math behind it all. Check out chapters 1, 2, 5.6.
Judge, George G., R. Carter Hill, William E. Griffiths, Helmut Lutkepohl, and Tsoung-Chao Lee (1988), Introduction to the Theory and Practice of Econometrics, 2nd edition, New York: John Wiley & Sons, Appendix A.
These notes follow the Appendix fairly closely.
Leon, Steven J. (1994), Linear Algebra with Applications, 4th edition, New Jersey: Prentice Hall.
Simon, Carl P. and Lawrence Blume (1994), Mathematics for Economists , New York: W.W. Norton.
Look at chapters 6 – 9, & 26.
Appendix: proof that the derivative of x'Ax is (A' + A)x

Write z = x'Ax = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_{i} x_{j}. Differentiating with respect to a typical element x_{k} collects every term containing x_{k}:

\frac{\partial z}{\partial x_{k}} = \sum_{j=1}^{n} a_{kj} x_{j} + \sum_{i=1}^{n} a_{ik} x_{i}

which is the k-th element of (A + A')x. Stacking the n derivatives gives

\frac{\partial z}{\partial x} = (A' + A)x

If A is symmetric, then a_{ij} = a_{ji} for all i, j, so

\frac{\partial z}{\partial x} = 2Ax

The same expansion goes through when n is replaced by n + 1, so, by induction, the result holds for any (n x n) matrix.