


Abstract—This document will review the fundamental ideas of linear algebra. We will learn about matrices, matrix operations, linear transformations and discuss both the theoretical and computational aspects of linear algebra. The tools of linear algebra open the gateway to the study of more advanced mathematics. A lot of knowledge buzz awaits you if you choose to follow the path of understanding, instead of trying to memorize a bunch of formulas.
I. INTRODUCTION

Linear algebra is the math of vectors and matrices. Let n be a positive integer and let R denote the set of real numbers; then R^n is the set of all n-tuples of real numbers. A vector ~v ∈ R^n is an n-tuple of real numbers. The notation “∈ S” is read “element of S.” For example, consider a vector that has three components:

~v = (v1, v2, v3) ∈ (R, R, R) ≡ R^3.
A matrix A ∈ R^{m×n} is a rectangular array of real numbers with m rows and n columns. For example, a 3 × 2 matrix looks like this:

    [ a11  a12 ]
    [ a21  a22 ]
    [ a31  a32 ]
The purpose of this document is to introduce you to the mathematical operations that we can perform on vectors and matrices and to give you a feel for the power of linear algebra. Many problems in science, business, and technology can be described in terms of vectors and matrices, so it is important that you understand how to work with these.
Prerequisites

The only prerequisite for this tutorial is a basic understanding of high school math concepts^1 like numbers, variables, equations, and the fundamental arithmetic operations on real numbers: addition (denoted +), subtraction (denoted −), multiplication (denoted implicitly), and division (fractions). You should also be familiar with functions that take real numbers as inputs and give real numbers as outputs, f : R → R. Recall that, by definition, the inverse function f^{-1} undoes the effect of f. If you are given f(x) and you want to find x, you can use the inverse function as follows: f^{-1}(f(x)) = x. For example, the function f(x) = ln(x) has the inverse f^{-1}(x) = e^x, and the inverse of g(x) = √x is g^{-1}(x) = x^2.
II. DEFINITIONS
A. Vector operations
We now define the math operations for vectors. The operations we can perform on vectors ~u = (u1, u2, u3) and ~v = (v1, v2, v3) are: addition, subtraction, scaling, norm (length), dot product, and cross product:
~u + ~v = (u1 + v1, u2 + v2, u3 + v3)
~u − ~v = (u1 − v1, u2 − v2, u3 − v3)
α~u = (αu1, αu2, αu3)
‖~u‖ = √(u1^2 + u2^2 + u3^2)
~u · ~v = u1 v1 + u2 v2 + u3 v3
~u × ~v = (u2 v3 − u3 v2, u3 v1 − u1 v3, u1 v2 − u2 v1)
The dot product and the cross product of two vectors can also be described in terms of the angle θ between the two vectors. The formula for the dot product of the vectors is ~u · ~v = ‖~u‖‖~v‖ cos θ. We say two vectors ~u and ~v are orthogonal if the angle between them is 90°. The dot product of orthogonal vectors is zero: ~u · ~v = ‖~u‖‖~v‖ cos(90°) = 0. The norm of the cross product is given by ‖~u × ~v‖ = ‖~u‖‖~v‖ sin θ. The cross product is not commutative: ~u × ~v ≠ ~v × ~u, in fact ~u × ~v = −~v × ~u.
^1 A good textbook to (re)learn high school math is minireference.com
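As a quick sanity check on these formulas, here is a minimal sketch in Python, assuming NumPy is available; the vectors are arbitrary example values:

    import numpy as np

    u = np.array([1.0, 2.0, 3.0])   # example vector ~u
    v = np.array([4.0, 5.0, 6.0])   # example vector ~v

    print(u + v)                    # vector addition
    print(u - v)                    # vector subtraction
    print(2 * u)                    # scaling by alpha = 2
    print(np.linalg.norm(u))        # norm: sqrt(u1^2 + u2^2 + u3^2)
    print(np.dot(u, v))             # dot product u1*v1 + u2*v2 + u3*v3
    print(np.cross(u, v))           # cross product of the two 3-vectors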
B. Matrix operations

We denote by A the matrix as a whole and refer to its entries as aij. The mathematical operations defined for matrices include addition, the matrix product, the transpose, and the trace. Matrix addition is performed entry by entry:

C = A + B  ⇔  cij = aij + bij.

The matrix product of A ∈ R^{m×n} and B ∈ R^{n×ℓ} is another matrix C ∈ R^{m×ℓ} given by the formula

C = AB  ⇔  cij = Σ_{k=1}^{n} aik bkj.

For example, the product of a 3 × 2 matrix and a 2 × 2 matrix is

[ a11 a12 ]                [ a11 b11 + a12 b21   a11 b12 + a12 b22 ]
[ a21 a22 ] [ b11 b12 ]  =  [ a21 b11 + a22 b21   a21 b12 + a22 b22 ]
[ a31 a32 ] [ b21 b22 ]    [ a31 b11 + a32 b21   a31 b12 + a32 b22 ]

The transpose A^T is obtained by swapping the rows and columns of A, (A^T)ij = aji. For example,

[ α1 β1 ]^T
[ α2 β2 ]    =  [ α1 α2 α3 ]
[ α3 β3 ]       [ β1 β2 β3 ]

The trace of a square matrix A ∈ R^{n×n} is the sum of its diagonal entries: Tr(A) ≡ Σ_{i=1}^{n} aii.
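A minimal NumPy sketch of these matrix operations; the particular matrices are arbitrary example values:

    import numpy as np

    A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])   # 3x2 example matrix
    B = np.array([[1.0, 0.0], [2.0, 1.0]])               # 2x2 example matrix

    print(A + A)        # entrywise addition: c_ij = a_ij + a_ij
    print(A @ B)        # matrix product: c_ij = sum_k a_ik b_kj  (3x2 result)
    print(A.T)          # transpose: swaps rows and columns (2x3 result)
    print(np.trace(B))  # trace: sum of diagonal entries of a square matrix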
C. Matrix-vector product

The matrix-vector product is an important special case of the matrix-matrix product. The product of a 3 × 2 matrix A and the 2 × 1 column vector ~x results in a 3 × 1 vector ~y given by:
~y = A~x  ⇔

[ y1 ]   [ a11 a12 ]            [ a11 x1 + a12 x2 ]
[ y2 ] = [ a21 a22 ] [ x1 ]  =  [ a21 x1 + a22 x2 ]
[ y3 ]   [ a31 a32 ] [ x2 ]     [ a31 x1 + a32 x2 ]

          [ a11 ]        [ a12 ]
    = x1  [ a21 ]  + x2  [ a22 ]            (C)
          [ a31 ]        [ a32 ]

      [ (a11, a12) · ~x ]
    = [ (a21, a22) · ~x ]                   (R)
      [ (a31, a32) · ~x ]
There are two^2 fundamentally different yet equivalent ways to interpret the matrix-vector product. In the column picture, (C), the multiplication of the matrix A by the vector ~x produces a linear combination of the columns of the matrix: ~y = A~x = x 1 A[:,1] + x 2 A[:,2], where A[:,1] and A[:,2] are the first and second columns of the matrix A. In the row picture, (R), multiplication of the matrix A by the vector ~x produces a column vector with coefficients equal to the dot products of rows of the matrix with the vector ~x.
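The equivalence of the two pictures can be checked numerically; a small sketch with arbitrary example values, assuming NumPy (note that the text's A[:,1], A[:,2] are 1-indexed, while NumPy indexing starts at 0):

    import numpy as np

    A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # 3x2 example matrix
    x = np.array([7.0, 8.0])                             # example input vector

    y_product = A @ x                                    # matrix-vector product
    y_columns = x[0] * A[:, 0] + x[1] * A[:, 1]          # column picture (C)
    y_rows = np.array([A[i, :] @ x for i in range(3)])   # row picture (R)

    print(np.allclose(y_product, y_columns))  # True
    print(np.allclose(y_product, y_rows))     # True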
D. Linear transformations

The matrix-vector product is used to define the notion of a linear transformation, which is one of the key notions in the study of linear algebra. Multiplication by a matrix A ∈ R^{m×n} can be thought of as computing a linear transformation TA that takes n-vectors as inputs and produces m-vectors as outputs:

TA : R^n → R^m.

^2 For more info see the video of Prof. Strang’s MIT lecture: bit.ly/10vmKcL
Instead of writing ~y = TA(~x) for the linear transformation TA applied to the vector ~x, we simply write ~y = A~x. Applying the linear transformation TA to the vector ~x corresponds to the product of the matrix A and the column vector ~x. We say TA is represented by the matrix A. You can think of linear transformations as “vector functions” and describe their properties in analogy with the regular functions you are familiar with:
function f : R → R           ⇔  linear transformation TA : R^n → R^m
input x ∈ R                  ⇔  input ~x ∈ R^n
output f(x)                  ⇔  output TA(~x) = A~x ∈ R^m
g ∘ f = g(f(x))              ⇔  TB(TA(~x)) = BA~x
function inverse f^{-1}      ⇔  matrix inverse A^{-1}
zeros of f                   ⇔  N(A) ≡ null space of A
range of f                   ⇔  C(A) ≡ column space of A = range of TA
Note that the combined effect of applying the transformation TA followed by TB on the input vector ~x is equivalent to the matrix product BA~x.
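A quick numerical check of this composition rule; a sketch with arbitrary example matrices, assuming NumPy:

    import numpy as np

    A = np.array([[1.0, 2.0], [3.0, 4.0]])   # represents T_A : R^2 -> R^2
    B = np.array([[0.0, 1.0], [1.0, 1.0]])   # represents T_B : R^2 -> R^2
    x = np.array([5.0, 6.0])                 # example input vector

    step_by_step = B @ (A @ x)   # apply T_A first, then T_B
    combined = (B @ A) @ x       # single transformation represented by BA

    print(np.allclose(step_by_step, combined))  # True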
E. Fundamental vector spaces
A vector space consists of a set of vectors and all linear combinations of these vectors. For example, the vector space S = span{~v1, ~v2} consists of all vectors of the form ~v = α~v1 + β~v2, where α and β are real numbers. We now define three fundamental vector spaces associated with a matrix A. The column space of a matrix A is the set of vectors that can be produced as linear combinations of the columns of the matrix A:
C(A) ≡ {~y ∈ R^m | ~y = A~x for some ~x ∈ R^n}.
The column space is the range of the linear transformation TA (the set of possible outputs). You can convince yourself of this fact by reviewing the definition of the matrix-vector product in the column picture (C). The vector A~x contains x1 times the 1st column of A, x2 times the 2nd column of A, etc. Varying over all possible inputs ~x, we obtain all possible linear combinations of the columns of A, hence the name “column space.” The null space N(A) of a matrix A ∈ R^{m×n} consists of all the vectors that the matrix A sends to the zero vector:

N(A) ≡ {~x ∈ R^n | A~x = ~0}.
The vectors in the null space are orthogonal to all the rows of the matrix. We can see this from the row picture (R): the output vector is ~0 if and only if the input vector ~x is orthogonal to all the rows of A. The row space of a matrix A, denoted R(A), is the set of linear combinations of the rows of A. The row space R(A) is the orthogonal complement of the null space N(A). This means that for all vectors ~v ∈ R(A) and all vectors ~w ∈ N(A), we have ~v · ~w = 0. Together, the null space and the row space form the domain of the transformation TA, R^n = N(A) ⊕ R(A), where ⊕ stands for orthogonal direct sum.
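These fundamental spaces can be computed symbolically; a minimal sketch using SymPy with an arbitrary example matrix whose columns are linearly dependent:

    from sympy import Matrix

    A = Matrix([[1, 2], [2, 4], [3, 6]])  # example 3x2 matrix, second column = 2 * first column

    print(A.columnspace())  # basis vectors for C(A)
    print(A.nullspace())    # basis vectors for N(A)
    print(A.rowspace())     # basis vectors for R(A)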
F. Matrix inverse
By definition, the inverse matrix A^{-1} undoes the effects of the matrix A. The cumulative effect of applying A^{-1} after A is the identity matrix 1:

              [ 1  0  ···  0 ]
A^{-1}A = 1 =  [ 0  1  ···  0 ]
              [ ⋮        ⋱  ⋮ ]
              [ 0  0  ···  1 ]
The identity matrix (ones on the diagonal and zeros everywhere else) corresponds to the identity transformation: T1(~x) = 1~x = ~x, for all ~x. The matrix inverse is useful for solving matrix equations. Whenever we want to get rid of the matrix A in some matrix equation, we can “hit” A with its inverse A^{-1} to make it disappear. For example, to solve for the matrix X in the equation XA = B, multiply both sides of the equation by A^{-1} from the right: X = BA^{-1}. To solve for X in ABCXD = E, multiply both sides of the equation by D^{-1} on the right and by A^{-1}, B^{-1} and C^{-1} (in that order) from the left: X = C^{-1}B^{-1}A^{-1}ED^{-1}.
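A numerical illustration of “hitting” an equation with an inverse; a sketch with arbitrary invertible example matrices, assuming NumPy:

    import numpy as np

    A = np.array([[1.0, 2.0], [3.0, 9.0]])   # example invertible matrix
    B = np.array([[1.0, 0.0], [0.0, 2.0]])   # example right-hand side

    # Solve X A = B for X by multiplying both sides by A^{-1} on the right.
    X = B @ np.linalg.inv(A)

    print(np.allclose(X @ A, B))  # True: X really satisfies X A = B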
III. COMPUTATIONAL LINEAR ALGEBRA

Okay, I hear what you are saying: “Dude, enough with the theory talk, let’s see some calculations.” In this section we’ll look at one of the fundamental algorithms of linear algebra, called Gauss–Jordan elimination.
A. Solving systems of equations

Suppose we’re asked to solve the following system of equations:

1x1 + 2x2 = 5,
3x1 + 9x2 = 21.    (1)
Without a knowledge of linear algebra, we could use substitution, elimination, or subtraction to find the values of the two unknowns x1 and x2. Gauss–Jordan elimination is a systematic procedure for solving systems of equations based on the following row operations:
α) Adding a multiple of one row to another row
β) Swapping two rows
γ) Multiplying a row by a constant
These row operations allow us to simplify the system of equations without changing their solution. To illustrate the Gauss–Jordan elimination procedure, we’ll now show the sequence of row operations required to solve the system of linear equations described above. We start by constructing an augmented matrix as follows:

[ 1  2 |  5 ]
[ 3  9 | 21 ]
The first column in the augmented matrix corresponds to the coefficients of the variable x1, the second column corresponds to the coefficients of x2, and the third column contains the constants from the right-hand side. The Gauss–Jordan elimination procedure consists of two phases. During the first phase, we proceed left-to-right by choosing a row with a leading one in the leftmost column (called a pivot) and systematically subtracting that row from all rows below it to obtain zeros everywhere below it in that column. In the second phase, we start with the rightmost pivot and use it to eliminate all the numbers above it in the same column. Let’s see this in action. First we subtract three times the first row from the second row:

R2 ← R2 − 3R1:   [ 1  2 | 5 ]
                 [ 0  3 | 6 ]

Next we divide the second row by 3 to create a pivot:

R2 ← (1/3)R2:    [ 1  2 | 5 ]
                 [ 0  1 | 2 ]

Finally, we subtract two times the second row from the first row to clear the entry above the second pivot:

R1 ← R1 − 2R2:   [ 1  0 | 1 ]
                 [ 0  1 | 2 ]

The matrix is now in reduced row echelon form (RREF), which is the “simplest” form it could be in. The solutions are: x1 = 1, x2 = 2.
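The same computation can be checked with SymPy’s rref routine; a minimal sketch:

    from sympy import Matrix

    # Augmented matrix [A | b] for the system x1 + 2*x2 = 5, 3*x1 + 9*x2 = 21.
    aug = Matrix([[1, 2, 5],
                  [3, 9, 21]])

    rref_matrix, pivot_columns = aug.rref()
    print(rref_matrix)    # Matrix([[1, 0, 1], [0, 1, 2]])  ->  x1 = 1, x2 = 2
    print(pivot_columns)  # (0, 1)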
B. Systems of equations as matrix equations

We will now discuss another approach for solving the system of equations. Using the definition of the matrix-vector product, we can express the system of equations (1) as a matrix equation:

[ 1  2 ] [ x1 ]   [  5 ]
[ 3  9 ] [ x2 ] = [ 21 ].

This matrix equation has the form A~x = ~b, where A is a 2 × 2 matrix, ~x is the vector of unknowns, and ~b is a vector of constants. We can solve for ~x by multiplying both sides of the equation by the matrix inverse A^{-1}:

A^{-1}A~x = 1~x =  [ x1 ]  =  A^{-1}~b  =  [  3  −2/3 ] [  5 ]  =  [ 1 ]
                   [ x2 ]                 [ −1   1/3 ] [ 21 ]     [ 2 ]
But how did we know what the inverse matrix A^{-1} is?
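Numerically, one would usually let a linear solver do this work rather than forming the inverse explicitly; a small NumPy sketch for the system above:

    import numpy as np

    A = np.array([[1.0, 2.0], [3.0, 9.0]])
    b = np.array([5.0, 21.0])

    x_via_inverse = np.linalg.inv(A) @ b   # x = A^{-1} b, as in the text
    x_via_solver = np.linalg.solve(A, b)   # preferred in practice (no explicit inverse)

    print(x_via_inverse)  # [1. 2.]
    print(x_via_solver)   # [1. 2.]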
C. Dimension and bases for vector spaces
The dimension of a vector space is defined as the number of vectors in a basis for that vector space. Consider the following vector space: S = span{(1, 0, 0), (0, 1, 0), (1, 1, 0)}. Seeing that the space is described by three vectors, we might think that S is 3-dimensional. This is not the case, however, since the three vectors are not linearly independent, so they don’t form a basis for S. Two vectors are sufficient to describe any vector in S; we can write S = span{(1, 0, 0), (0, 1, 0)}, and we see these two vectors are linearly independent, so they form a basis and dim(S) = 2. There is a general procedure for finding a basis for a vector space. Suppose you are given a description of a vector space in terms of m vectors, V = span{~v1, ~v2, ..., ~vm}, and you are asked to find a basis for V and the dimension of V. To find a basis for V, you must find a set of linearly independent vectors that span V. We can use the Gauss–Jordan elimination procedure to accomplish this task. Write the vectors ~vi as the rows of a matrix M. The vector space V corresponds to the row space of the matrix M. Next, use row operations to find the reduced row echelon form (RREF) of the matrix M. Since row operations do not change the row space of the matrix, the row space of the reduced row echelon form of M is the same as the row space of the original set of vectors. The nonzero rows in the RREF of the matrix form a basis for the vector space V, and the number of nonzero rows is the dimension of V.
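A sketch of this procedure in SymPy, using the example vectors from above:

    from sympy import Matrix

    # Rows of M are the vectors that span S = span{(1,0,0), (0,1,0), (1,1,0)}.
    M = Matrix([[1, 0, 0],
                [0, 1, 0],
                [1, 1, 0]])

    rref_M, pivots = M.rref()
    print(rref_M)       # Matrix([[1, 0, 0], [0, 1, 0], [0, 0, 0]])
    # The nonzero rows (1,0,0) and (0,1,0) form a basis, so dim(S) = 2.
    print(len(pivots))  # 2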
D. Row space, column space, and rank of a matrix

Recall the fundamental vector spaces for matrices that we defined in Section II-E: the column space C(A), the null space N(A), and the row space R(A). A standard linear algebra exam question is to give you a certain matrix A and ask you to find the dimension and a basis for each of its fundamental spaces. In the previous section we described a procedure based on Gauss–Jordan elimination which can be used to “distill” a set of linearly independent vectors which form a basis for the row space R(A). We will now illustrate this procedure with an example, and also show how to use the RREF of the matrix A to find bases for C(A) and N(A). Consider the following matrix and its reduced row echelon form:
    [ 1  3  3   3 ]              [ 1  3  0  0 ]
A = [ 2  6  7   6 ],   rref(A) =  [ 0  0  1  0 ]
    [ 3  9  9  10 ]              [ 0  0  0  1 ]
The reduced row echelon form of the matrix A contains three pivots. The locations of the pivots will play an important role in the following steps. The vectors {(1, 3, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)} form a basis for R(A). To find a basis for the column space C(A) of the matrix A we need to find which of the columns of A are linearly independent. We can do this by identifying the columns which contain the leading ones in rref(A). The corresponding columns in the original matrix form a basis for the column space of A. Looking at rref(A) we see the first, third, and fourth columns of the matrix are linearly independent, so the vectors {(1, 2, 3)^T, (3, 7, 9)^T, (3, 6, 10)^T} form a basis for C(A). Now let’s find a basis for the null space, N(A) ≡ {~x ∈ R^4 | A~x = ~0}. The second column does not contain a pivot, therefore it corresponds to a free variable, which we will denote s. We are looking for a vector with three unknowns and one free variable, (x1, s, x3, x4)^T, that obeys the conditions:
[ 1  3  0  0 ] [ x1 ]   [ 0 ]
[ 0  0  1  0 ] [ s  ] = [ 0 ]
[ 0  0  0  1 ] [ x3 ]   [ 0 ]
               [ x4 ]

⇒   1x1 + 3s = 0,   1x3 = 0,   1x4 = 0.
Let’s express the unknowns x1, x3, and x4 in terms of the free variable s. We immediately see that x3 = 0 and x4 = 0, and we can write x1 = −3s. Therefore, any vector of the form (−3s, s, 0, 0), for any s ∈ R, is in the null space of A. We write N(A) = span{(−3, 1, 0, 0)^T}. Observe that dim(C(A)) = dim(R(A)) = 3; this is known as the rank of the matrix A. Also, dim(R(A)) + dim(N(A)) = 3 + 1 = 4, which is the dimension of the input space of the linear transformation TA.
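The same bases can be obtained programmatically; a sketch in SymPy using the example matrix discussed above:

    from sympy import Matrix

    A = Matrix([[1, 3, 3, 3],
                [2, 6, 7, 6],
                [3, 9, 9, 10]])   # the example matrix from above

    rref_A, pivots = A.rref()
    print(rref_A)            # Matrix([[1, 3, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])
    print(pivots)            # (0, 2, 3): pivot columns -> which columns of A form a basis for C(A)
    print(A.columnspace())   # basis for C(A): columns 1, 3, 4 of A
    print(A.nullspace())     # basis for N(A): [Matrix([[-3], [1], [0], [0]])]
    print(A.rank())          # 3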
E. Invertible matrix theorem

There is an important distinction between matrices that are invertible and those that are not, as formalized by the following theorem.
Theorem. For an n × n matrix A, the following statements are equivalent:
1) A is invertible
2) The RREF of A is the n × n identity matrix
3) The rank of A is n
4) The columns of A form a basis for R^n, i.e., C(A) = R^n
5) The null space of A contains only the zero vector, N(A) = {~0}
6) The determinant of A is nonzero, det(A) ≠ 0
F. Determinants

The determinant of a matrix, denoted det(A) or |A|, is a special way to combine the entries of a matrix that serves to check whether a matrix is invertible or not. The determinant formulas for 2 × 2 and 3 × 3 matrices are

| a11 a12 |
| a21 a22 |  =  a11 a22 − a12 a21,    and

| a11 a12 a13 |
| a21 a22 a23 |  =  a11 | a22 a23 |  −  a12 | a21 a23 |  +  a13 | a21 a22 |
| a31 a32 a33 |         | a32 a33 |        | a31 a33 |         | a31 a32 |

If |A| = 0 then A is not invertible. If |A| ≠ 0 then A is invertible.
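A quick numerical check of the invertibility criterion, with arbitrary example matrices, assuming NumPy:

    import numpy as np

    A = np.array([[1.0, 2.0], [3.0, 9.0]])   # det = 1*9 - 2*3 = 3, so A is invertible
    B = np.array([[1.0, 2.0], [2.0, 4.0]])   # det = 1*4 - 2*2 = 0, so B is not invertible

    print(np.linalg.det(A))   # approximately 3.0
    print(np.linalg.det(B))   # approximately 0.0 (up to floating-point rounding)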
G. Eigenvalues and eigenvectors

The set of eigenvectors of a matrix is a special set of input vectors for which the action of the matrix is described as a simple scaling. When a matrix is multiplied by one of its eigenvectors the output is the same eigenvector multiplied by a constant: A~eλ = λ~eλ. The constant λ is called an eigenvalue of A. To find the eigenvalues of a matrix we start from the eigenvalue equation A~eλ = λ~eλ, insert the identity 1, and rewrite it as a null-space problem: A~eλ = λ1~eλ ⇒ (A − λ1)~eλ = ~0. This equation will have a solution whenever |A − λ1| = 0. The eigenvalues of A ∈ R^{n×n}, denoted {λ1, λ2, ..., λn}, are the roots of the characteristic polynomial p(λ) = |A − λ1|. The eigenvectors associated with the eigenvalue λi are the vectors in the null space of the matrix (A − λi1). Certain matrices can be written entirely in terms of their eigenvectors and their eigenvalues. Consider the matrix Λ that has the eigenvalues of the matrix A on the diagonal, and the matrix Q constructed from the eigenvectors of A as columns:
    [ λ1  ···  0  ]         [   |           |   ]
Λ = [  ⋮   ⋱   ⋮  ],   Q =  [ ~eλ1   ···  ~eλn ],    then A = QΛQ^{-1}.
    [ 0   ···  λn ]         [   |           |   ]
Matrices that can be written this way are called diagonalizable. The decomposition of a matrix into its eigenvalues and eigenvectors gives valuable insights into the properties of the matrix. Google’s original PageRank algorithm for ranking webpages by “importance” can be formalized as an eigenvector calculation on the matrix of web hyperlinks.
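The eigendecomposition can be computed numerically; a minimal sketch with an arbitrary diagonalizable example matrix, assuming NumPy:

    import numpy as np

    A = np.array([[2.0, 1.0], [1.0, 2.0]])   # example symmetric (hence diagonalizable) matrix

    eigenvalues, Q = np.linalg.eig(A)         # columns of Q are eigenvectors of A
    Lambda = np.diag(eigenvalues)             # eigenvalues placed on the diagonal

    # Verify the decomposition A = Q Lambda Q^{-1}.
    print(np.allclose(A, Q @ Lambda @ np.linalg.inv(Q)))  # True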
VI. TEXTBOOK PLUG If you’re interested in learning more about linear algebra, you can check out my new book, the NO BULLSHIT GUIDE TO LINEAR ALGEBRA. A pre-release version of the book is available here: gum.co/noBSLA