





Math 480 Diagonalization and the Singular Value Decomposition
These notes cover diagonalization and the Singular Value Decomposition.
Definition 1.1. We say that an n × n matrix A is diagonalizable if there exists an invertible matrix S such that S^{-1}AS is diagonal.

Note that if D = S^{-1}AS is diagonal, then we can equally well write A = SDS^{-1}. So diagonalizable matrices are those that admit a factorization A = SDS^{-1} with D diagonal.

Example: If D is a diagonal n × n matrix and S is an invertible n × n matrix, then A = SDS^{-1} is diagonalizable, since

S^{-1}AS = S^{-1}(SDS^{-1})S = D.

For instance, if S is any invertible 2 × 2 matrix and

D = ( −6   0
       0   2 ),

then A = SDS^{-1} is diagonalizable.
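Here is a short NumPy sketch of this construction; the diagonal matrix matches the example above, while the particular invertible S is just an illustrative choice, not one taken from the notes.

    import numpy as np

    # An invertible S (arbitrary choice) and the diagonal D from the example.
    S = np.array([[1.0,  1.0],
                  [1.0, -1.0]])
    D = np.diag([-6.0, 2.0])

    # A = S D S^{-1} is diagonalizable by construction.
    A = S @ D @ np.linalg.inv(S)

    # Conjugating A by S recovers D, so S^{-1} A S is diagonal.
    print(np.linalg.inv(S) @ A @ S)    # approximately diag(-6, 2)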
Fact 1.2. If A is a diagonalizable n × n matrix, with S^{-1}AS = D, then the columns of S are eigenvectors of A, and the diagonal entries of D are eigenvalues of A. In particular, if A is diagonalizable then there must exist a basis for R^n consisting of eigenvectors of A.

This follows from a simple computation: since S^{-1}AS = D, multiplying both sides on the left by S yields AS = SD.
Write S = [~v_1 ... ~v_n], and write D as the n × n diagonal matrix with diagonal entries λ_1, λ_2, ..., λ_n.
Since multiplying S by a diagonal matrix (on the right) just scales the columns,
SD = [λ_1~v_1 ... λ_n~v_n].
On the other hand, AS = A[~v_1 ... ~v_n] = [A~v_1 ... A~v_n]. So the equation AS = SD tells us that A~v_i = λ_i~v_i for each i, which says precisely that ~v_i is an eigenvector of A with eigenvalue λ_i.

The previous discussion also works in reverse, and yields the following conclusion.

Fact 1.3. If A is an n × n matrix and there exists a basis ~v_1, ..., ~v_n for R^n such that ~v_i is an eigenvector of A with eigenvalue λ_i, then A is diagonalizable. More specifically, if S = [~v_1 ... ~v_n], then S^{-1}AS = D, where D is the n × n diagonal matrix with diagonal entries λ_1, ..., λ_n.

Example. I claim that the matrix A
has eigenvalues 2 and 8. To find the corresponding eigenvectors, you can analyze N(A − 2I) and N(A − 8I). By considering the parametric form for the homogeneous systems (A − 2I)~x = ~0 and (A − 8I)~x = ~0, you'll find two vectors that form a basis for the eigenspace associated to the eigenvalue 2, and one vector that forms a basis for the eigenspace associated with the eigenvalue 8. We can then conclude that S^{-1}AS = D, where S is the matrix whose first two columns are the eigenvectors for 2 and whose last column is the eigenvector for 8, and D is the diagonal matrix with diagonal entries 2, 2, 8.
Note that order is important here: since we put eigenvectors corresponding to 2 into the first two columns of S, we have to put the eigenvalue 2 into the first two diagonal entries of D. We could, however, have switched the order of the eigenvectors corresponding to 2 without changing D, giving a second way of diagonalizing A. A third way of diagonalizing A would be to reorder the eigenvectors, putting the eigenvector for 8 into a different column of the matrix T; with E the diagonal matrix whose entries are reordered in the same way, we again have T^{-1}AT = E.

Exercise 1: Check these formulas without computing S^{-1} and T^{-1}. (Multiply both sides of the equations S^{-1}AS = D and T^{-1}AT = E on the left by S and T, respectively.)
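In the spirit of Exercise 1, a diagonalization can be checked without computing any inverses: it suffices to verify AS = SD. Below is a minimal NumPy sketch of that check; the matrix used is a stand-in (the example matrix from the notes is not reproduced above), and np.linalg.eig supplies the eigenvectors and eigenvalues.

    import numpy as np

    # Stand-in symmetric matrix, chosen only for illustration.
    A = np.array([[4.0, 2.0, 2.0],
                  [2.0, 4.0, 2.0],
                  [2.0, 2.0, 4.0]])

    # Columns of S are eigenvectors; the eigenvalues go on the diagonal of D.
    eigenvalues, S = np.linalg.eig(A)
    D = np.diag(eigenvalues)

    # Exercise 1's idea: instead of forming S^{-1}, check that A S = S D.
    print(np.allclose(A @ S, S @ D))    # True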
This is again easy to check (the claim being that the eigenvalues of A^T A are non-negative). If λ is an eigenvalue of A^T A, then we can always find a (non-zero) eigenvector ~v associated with λ, and dividing ~v by ||~v|| yields a length-one eigenvector. So let's just assume that ||~v|| = 1 and A^T A~v = λ~v. Then we have
||A~v||^2 = 〈A~v, A~v〉 = (A~v)^T A~v = ~v^T (A^T A~v) = ~v^T (λ~v) = λ〈~v, ~v〉 = λ.
So λ = ||A~v||^2, which is a non-negative real number. In this computation, we used the fact that ~v is an eigenvector of A^T A (where?) and the fact that ||~v|| = 1 (where?).
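A quick numerical sanity check of this fact (a sketch only; the matrix here is random rather than one from the notes):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((4, 3))      # any matrix will do

    # Each eigenvalue of A^T A should equal ||A v||^2 >= 0 for a unit eigenvector v.
    eigenvalues, V = np.linalg.eigh(A.T @ A)
    print(eigenvalues)                                            # all non-negative
    print([np.linalg.norm(A @ V[:, i])**2 for i in range(3)])     # matches the eigenvalues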
Exercise 3: Write each of the following symmetric matrices in the form SDS^{-1} with D diagonal. In the second case, the eigenvalues are −1 and 11.
Exercise 4: Explain the following statement: if A is an orthogonal n × n matrix, then A is invertible and A^T = A^{-1}. (This came up when we discussed the QR factorization.)
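The statement in Exercise 4 is also easy to test numerically. In the sketch below, an orthogonal matrix Q is produced by the QR factorization of a random matrix (just one convenient way to obtain one), and Q^T Q is compared with the identity.

    import numpy as np

    rng = np.random.default_rng(1)
    Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # Q has orthonormal columns

    # For an orthogonal matrix, the transpose is the inverse: Q^T Q = I.
    print(np.allclose(Q.T @ Q, np.eye(4)))             # True
    print(np.allclose(Q.T, np.linalg.inv(Q)))          # True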
Definition 3.1. A Singular Value Decomposition of an m × n matrix A is an expression
A = UΣV^T
where U is an m × m matrix with orthonormal columns, V is an n × n matrix with orthonormal columns, and Σ = (σ_{i,j}) is an m × n matrix with σ_{i,j} = 0 for i ≠ j and

σ_{1,1} ≥ σ_{2,2} ≥ σ_{3,3} ≥ · · · ≥ 0.
Example: Here is an example of an SVD:

( 6    30   −21
  17   10   −22 ) = UΣV^T.
Exercise 5: Check that the above decomposition is a Singular Value Decomposition. (You need to check that the left-hand matrix in the decomposition has orthonormal columns, that the rows of the right-hand matrix are orthonormal, and that the middle matrix is “diagonal” with decreasing, positive entries on the diagonal. Of course no work is required to check this third condition.)
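The checks in Exercise 5 can also be carried out numerically. The sketch below applies NumPy's np.linalg.svd to the 2 × 3 matrix from the example; NumPy returns its own choice of factors, which need not match the ones displayed in the original notes, but the same three conditions can be verified.

    import numpy as np

    A = np.array([[ 6.0, 30.0, -21.0],
                  [17.0, 10.0, -22.0]])

    # full_matrices=True gives U (2x2), the singular values, and V^T (3x3).
    U, s, Vt = np.linalg.svd(A, full_matrices=True)
    Sigma = np.zeros((2, 3))
    Sigma[:2, :2] = np.diag(s)

    print(np.allclose(U.T @ U, np.eye(2)))      # columns of U are orthonormal
    print(np.allclose(Vt @ Vt.T, np.eye(3)))    # rows of V^T are orthonormal
    print(s)                                    # decreasing, non-negative
    print(np.allclose(U @ Sigma @ Vt, A))       # the product recovers A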
Here are the key facts about the SVD:
Theorem 3.2. Every m × n matrix A admits (many) Singular Value Decompositions.
Fact 3.3. If A = UΣV^T is a Singular Value Decomposition of an m × n matrix A, then the diagonal entries of Σ are determined by A (their squares are the eigenvalues of A^T A), the columns of V are eigenvectors of A^T A, and the columns of U are eigenvectors of AA^T.

The terms σ_{i,i} are called the singular values of A. We'll set σ_i = σ_{i,i} for convenience.
It is not hard to check the second two statements in Fact 3.3, and in doing so we'll also check the first statement. Let V = [~v_1 ... ~v_n]. To see why each ~v_i is an eigenvector of A^T A, we simply write out A^T A in terms of the given SVD:
(1)   A^T A~v_i = (UΣV^T)^T (UΣV^T)~v_i = V Σ^T (U^T U)(Σ V^T ~v_i) = V Σ^T Σ ~e_i.
The last step uses Exercise 4 and the fact that the columns of V are orthonormal (think about this!). Now, Σ^T Σ is diagonal with diagonal entries σ_i^2, so Σ^T Σ~e_i = σ_i^2 ~e_i and hence

A^T A~v_i = V Σ^T Σ~e_i = V(σ_i^2 ~e_i) = σ_i^2 ~v_i.

So ~v_i is an eigenvector of A^T A with eigenvalue σ_i^2.
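This relationship is easy to confirm numerically. In the sketch below (using a random matrix rather than one from the notes), each column ~v_i of V satisfies A^T A ~v_i = σ_i^2 ~v_i.

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.standard_normal((4, 3))
    U, s, Vt = np.linalg.svd(A)
    V = Vt.T

    # For each i, A^T A v_i should equal sigma_i^2 v_i.
    for i in range(len(s)):
        v = V[:, i]
        print(np.allclose(A.T @ A @ v, s[i]**2 * v))   # True for each i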
Exercise 6: Give a similar explanation for why the columns of U are eigenvectors of AA^T.
Exercise 7: Show that for any square matrix A, the eigenvectors of A are also eigenvectors of A^2. What are the eigenvalues for A^2?
Example: If A is a symmetric n × n matrix, then we learned in the previous section that A is diagonalizable; that is, A = SDS^{-1} for some diagonal matrix D and some invertible matrix S. In fact, we saw that the columns of S corresponding to different eigenvalues are orthogonal to one another. If you apply Gram–Schmidt to the columns of S, you'll get an orthogonal matrix T that also diagonalizes A. If the eigenvalues of A are non-negative and you order the columns of T and D so that the diagonal entries of D are decreasing, then you've found an SVD for A.
For instance, let

A = ( 6    0    6
      0   12    6
      6    6    9 ).

To find an SVD for A, we can just diagonalize A (making sure the matrix S has orthonormal columns). We need to figure out the eigenvalues of A^T A. Since A is symmetric, A^T A = A^2, and the eigenvalues of A^2 are just the squares of the eigenvalues of A (see Exercise 7). So the singular values of A are just the eigenvalues of A! To compute these eigenvalues/singular values, we first compute the characteristic polynomial, which is the determinant of

( 6 − λ    0        6
  0        12 − λ   6
  6        6        9 − λ ).
Using cofactor expansion and simplifying shows that the characteristic polynomial is

−λ^3 + 27λ^2 − 162λ = −λ(λ^2 − 27λ + 162) = −λ(λ − 9)(λ − 18),

so the eigenvalues of A (which are also its singular values) are 18, 9, and 0.
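The arithmetic can be double-checked numerically; a minimal sketch (np.linalg.eigvalsh applies since A is symmetric):

    import numpy as np

    A = np.array([[ 6.0,  0.0, 6.0],
                  [ 0.0, 12.0, 6.0],
                  [ 6.0,  6.0, 9.0]])

    # Eigenvalues of the symmetric matrix A, in ascending order: 0, 9, 18.
    print(np.linalg.eigvalsh(A))

    # Since A is symmetric with non-negative eigenvalues, these are also its singular values.
    print(np.linalg.svd(A, compute_uv=False))   # 18, 9, 0 (descending)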
Fact 4.1. If A has r non-zero singular values, counted according to their multiplicity (i.e. if Σ has r non-zero entries), then A has rank r.
Why is this? Row reduce A = UΣV^T by multiplying on the left by elementary matrices. Since U is invertible, you'll eventually get down to IΣV^T = ΣV^T, whose rows are σ_i~v_i^T. The first r of these rows are non-zero and linearly independent, while the rest are zero. So ΣV^T has r pivotal rows, and you'll get r pivots when you finish row reducing A = UΣV^T.
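This is also how rank is computed in practice: count the singular values larger than a small tolerance. A minimal sketch (the low-rank matrix here is random and chosen only for illustration):

    import numpy as np

    rng = np.random.default_rng(3)
    # Build a 5 x 4 matrix of rank 2 as a product of thin factors.
    A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))

    s = np.linalg.svd(A, compute_uv=False)
    r = np.sum(s > 1e-10)                  # number of (numerically) non-zero singular values
    print(r, np.linalg.matrix_rank(A))     # both 2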
We can now explain how to find orthonormal bases for the four fundamental subspaces.
The Range of A: The first r columns of U form an orthonormal basis for R(A): for 1 ≤ i ≤ r we have

A~v_i = UΣV^T~v_i = UΣ~e_i = U(σ_i~e_i) = σ_i~u_i,

so the vectors ~u_i are in R(A), and they're linearly independent (because they're orthonormal). Since dim R(A) = rank(A) = r, these vectors must form an (orthonormal) basis.
The Null Space of A: The last n − r columns of V form a basis for N(A). These vectors are linearly independent (because they're orthonormal), and they lie in N(A) because

A~v_i = UΣV^T~v_i = UΣ~e_i = U~0 = ~0

for i > r. By the Rank–Nullity Theorem, dim N(A) = n − r, so the linearly independent set {~v_{r+1}, ..., ~v_n} must form a basis for N(A).
To find bases for N(A^T) and R(A^T), just notice that A^T = (UΣV^T)^T = VΣ^T U^T. This is an SVD for A^T, because the diagonal entries of Σ and Σ^T are the same, and V and U are orthogonal. Applying the same reasoning as above to A^T = VΣ^T U^T, we find that the first r columns of V are a basis for R(A^T) and the last m − r columns of U are a basis for N(A^T).
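Putting this all together, the four orthonormal bases can be read off from the factors returned by np.linalg.svd. A sketch (again with a random matrix, and with a small tolerance deciding which singular values count as non-zero):

    import numpy as np

    rng = np.random.default_rng(4)
    A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))   # rank 2, 5 x 4

    U, s, Vt = np.linalg.svd(A)        # U is 5x5, Vt is 4x4
    V = Vt.T
    r = np.sum(s > 1e-10)

    range_A = U[:, :r]     # orthonormal basis for R(A)
    null_A  = V[:, r:]     # orthonormal basis for N(A)
    row_A   = V[:, :r]     # orthonormal basis for R(A^T)
    null_At = U[:, r:]     # orthonormal basis for N(A^T)

    print(np.allclose(A @ null_A, 0))      # columns of null_A lie in N(A)
    print(np.allclose(A.T @ null_At, 0))   # columns of null_At lie in N(A^T)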
Example: Consider the SVD

( 4   11   14
  8    7   −2 ) = UΣV^T.

Letting A denote the matrix on the left, we have: the first two columns of U form a basis for R(A); the last column of V is a basis for N(A); and the first two columns of V form a basis for R(A^T).
Exercise 8: Find bases (and dimensions) of the four fundamental subspaces of the matrix whose singular value decomposition is given below:
( 6    30   −21
  17   10   −22 ) = UΣV^T.
Use your answer to compute the projection of (1 1 1) onto the row space of A.
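The projection asked for here only needs the first r columns of V: projecting ~x onto the row space R(A^T) amounts to computing V_r V_r^T ~x, where V_r consists of those columns. A sketch of that computation (using a random stand-in matrix rather than the one in the exercise):

    import numpy as np

    rng = np.random.default_rng(5)
    A = rng.standard_normal((2, 3))            # stand-in 2 x 3 matrix
    x = np.array([1.0, 1.0, 1.0])

    U, s, Vt = np.linalg.svd(A)
    r = np.sum(s > 1e-10)
    Vr = Vt[:r, :].T                           # first r columns of V

    proj = Vr @ (Vr.T @ x)                     # projection of x onto R(A^T)
    print(proj)
    print(np.allclose(A @ (x - proj), 0))      # the residual x - proj lies in N(A)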
Fact 5.1. If the columns of V are ~v_i and the columns of U are ~u_i, then we have

A_k = U_k Σ_k (V_k)^T,

where U_k = [~u_1 ... ~u_k ~0 ... ~0] and V_k = [~v_1 ... ~v_k ~0 ... ~0]. (Here A_k = UΣ_kV^T is the rank k approximation of A, with Σ_k obtained from Σ by replacing every diagonal entry after the kth by zero.) In other words, we can just replace all the columns of U and V after the kth column by zeros.
This is straightforward to check: just think about multiplying U(Σ_kV^T), and you'll see that only the first k columns of U and V are actually used in computing these products.

Note that this new description of our rank k approximation really does contain a lot less data than was in the original matrix A: there are just k non-zero columns of U_k, each containing m entries, and just k non-zero columns of V_k, each containing n entries. So there are a total of km + kn + k non-zero entries in U_k, Σ_k, and V_k, which is much less than the mn entries in the original matrix A (if k is much less than m and n). Note also that the entries of the vectors ~u_i and ~v_i are necessarily small numbers, since these are unit vectors, while the entries of A could all be quite large.
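A sketch of this compression idea in NumPy (the matrix here is random; in practice k would be chosen much smaller than m and n):

    import numpy as np

    rng = np.random.default_rng(6)
    m, n, k = 100, 80, 5
    A = rng.standard_normal((m, n))

    U, s, Vt = np.linalg.svd(A, full_matrices=False)

    # Keep only the first k singular triples.
    Uk, sk, Vtk = U[:, :k], s[:k], Vt[:k, :]
    Ak = Uk @ np.diag(sk) @ Vtk                 # rank-k approximation of A

    stored = Uk.size + sk.size + Vtk.size       # k*m + k + k*n numbers
    print(stored, "numbers instead of", A.size)
    print(np.linalg.matrix_rank(Ak))            # k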