Math 480 Diagonalization and the Singular Value Decomposition

These notes cover diagonalization and the Singular Value Decomposition.

  1. Diagonalization. Recall that a diagonal matrix is a square matrix with all off-diagonal entries equal to zero. Here are a few examples of diagonal matrices:

\begin{pmatrix} -6 & 0 \\ 0 & 2 \end{pmatrix}, \qquad \begin{pmatrix} 4 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad \begin{pmatrix} 4 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.

Definition 1.1. We say that an n × n matrix A is diagonalizable if there exists an invertible matrix S such that S^{-1}AS is diagonal.

Note that if D = S^{-1}AS is diagonal, then we can equally well write A = SDS^{-1}. So diagonalizable matrices are those that admit a factorization A = SDS^{-1} with D diagonal.

Example: If D is a diagonal n × n matrix and S is an invertible n × n matrix, then A = SDS^{-1} is diagonalizable, since

S^{-1}AS = S^{-1}(SDS^{-1})S = D.

For instance, the matrix S = \begin{pmatrix} 1 & 2 \\ -2 & 4 \end{pmatrix} is invertible, so

S \begin{pmatrix} -6 & 0 \\ 0 & 2 \end{pmatrix} S^{-1} = \begin{pmatrix} -2 & 2 \\ 8 & -2 \end{pmatrix}

is diagonalizable.
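If you want to check this numerically, here is a minimal numpy sketch; it uses the S and D from the example above, though any invertible S and diagonal D would behave the same way:

    import numpy as np

    # S and D as in the example above (any invertible S and diagonal D would do)
    S = np.array([[1.0, 2.0], [-2.0, 4.0]])
    D = np.diag([-6.0, 2.0])

    A = S @ D @ np.linalg.inv(S)        # A = S D S^{-1} is diagonalizable by construction
    check = np.linalg.inv(S) @ A @ S    # should recover the diagonal matrix D
    print(np.allclose(check, D))        # True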

Fact 1.2. If A is a diagonalizable n × n matrix, with S^{-1}AS = D, then the columns of S are eigenvectors of A, and the diagonal entries of D are eigenvalues of A. In particular, if A is diagonalizable then there must exist a basis for R^n consisting of eigenvectors of A.

This follows from a simple computation: since S^{-1}AS = D,

multiplying both sides on the left by S yields AS = SD.

Write S = [~v_1 ... ~v_n] and set

D = \begin{pmatrix} λ_1 & 0 & \cdots & 0 \\ 0 & λ_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & λ_n \end{pmatrix}.

Since multiplying S by a diagonal matrix (on the right) just scales the columns,

SD = [λ_1~v_1 ... λ_n~v_n].

On the other hand, AS = A[~v_1 ... ~v_n] = [A~v_1 ... A~v_n]. So the equation AS = SD tells us that A~v_i = λ_i~v_i (for each i), which precisely says that ~v_i is an eigenvector with eigenvalue λ_i.

The previous discussion also works in reverse, and yields the following conclusion.

Fact 1.3. If A is an n × n matrix and there exists a basis ~v_1, ..., ~v_n for R^n such that ~v_i is an eigenvector of A with eigenvalue λ_i, then A is diagonalizable. More specifically, if S = [~v_1 ... ~v_n], then S^{-1}AS = D, where D is the n × n diagonal matrix with diagonal entries λ_1, ..., λ_n.

Example. I claim that the matrix A has eigenvalues 2 and 8. To find the corresponding eigenvectors, you can analyze N(A − 2I) and N(A − 8I). By considering the parametric form for the homogeneous systems (A − 2I)~x = ~0 and (A − 8I)~x = ~0, you'll find that two vectors form a basis for the eigenspace associated to the eigenvalue 2, and a third forms a basis for the eigenspace associated with the eigenvalue 8. We can then conclude that S^{-1}AS = D, where S is the matrix with those three eigenvectors as its columns and

D = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 8 \end{pmatrix}.

Note that order is important here: since we put eigenvectors corresponding to 2 into the first two columns of S, we have to put the eigenvalue 2 into the first two diagonal entries of D. We could, however, have switched the order of the eigenvectors corresponding to 2 without changing D, giving a second way of diagonalizing A. A third way of diagonalizing A would be to reorder the eigenvectors, setting T to be the matrix with those columns in the new order and E to be the diagonal matrix listing the eigenvalues in the matching order, and again we have T^{-1}AT = E.

Exercise 1: Check these formulas without computing S^{-1} and T^{-1}. (Multiply both sides of the equations S^{-1}AS = D and T^{-1}AT = E on the left by S and T, respectively, and verify the resulting equations AS = SD and AT = TE.)
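Here is a sketch of the same check in numpy, carried out on a small matrix chosen only for illustration (np.linalg.eig supplies an eigenvector matrix S and the eigenvalues, so we can compare AS with SD directly):

    import numpy as np

    # An arbitrary small matrix, chosen only for illustration.
    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])

    eigvals, S = np.linalg.eig(A)   # columns of S are eigenvectors of A
    D = np.diag(eigvals)

    # The check from Exercise 1: compare AS with SD instead of computing S^{-1}.
    print(np.allclose(A @ S, S @ D))   # True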

This is again easy to check. If λ is an eigenvalue of A^T A, then we can always find a (non-zero) eigenvector ~v associated with λ, and dividing ~v by ||~v|| yields a length-one eigenvector. So let's just assume that ||~v|| = 1 and A^T A~v = λ~v. Then we have

||A~v||^2 = ⟨A~v, A~v⟩ = (A~v)^T A~v = ~v^T(A^T A~v) = ~v^T(λ~v) = λ⟨~v, ~v⟩ = λ.

So λ = ||A~v||^2, which is a non-negative real number. In this computation, we used the fact that ~v is an eigenvector of A^T A (where?) and the fact that ||~v|| = 1 (where?).
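A quick numerical sanity check of this fact, using an arbitrary (randomly generated) matrix:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((4, 3))            # an arbitrary, not even square, matrix

    # A^T A is symmetric, so eigvalsh applies; all of its eigenvalues should be >= 0.
    eigvals = np.linalg.eigvalsh(A.T @ A)
    print(np.all(eigvals >= -1e-12))           # True (up to floating-point error)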

Exercise 3: Write each of the following symmetric matrices in the form SDS^{-1} with D diagonal. In the second case, the eigenvalues are −1 and 11.

  3. The Singular Value Decomposition. Lots of matrices that arise in practice are not diagonalizable, and are often not even square. However, there is something sort of similar to diagonalization that works for any m × n matrix. We will call a square matrix orthogonal if its columns are orthonormal.

Exercise 4: Explain the following statement: if A is an orthogonal n × n matrix, then A is invertible and A^T = A^{-1}. (This came up when we discussed the QR factorization.)
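You can also see this numerically; in the sketch below the orthogonal matrix Q comes from numpy's QR factorization of an arbitrary matrix:

    import numpy as np

    rng = np.random.default_rng(1)
    Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # Q has orthonormal columns

    # For an orthogonal matrix, Q^T Q = I, i.e. Q^T is the inverse of Q.
    print(np.allclose(Q.T @ Q, np.eye(4)))             # True
    print(np.allclose(Q.T, np.linalg.inv(Q)))          # True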

Definition 3.1. A Singular Value Decomposition of an m × n matrix A is an expression

A = UΣV^T

where U is an m × m matrix with orthonormal columns, V is an n × n matrix with orthonormal columns, and Σ = (σ_{i,j}) is an m × n matrix with σ_{i,j} = 0 for i ≠ j and

σ_{1,1} ≥ σ_{2,2} ≥ σ_{3,3} ≥ · · · ≥ σ_{m,m} ≥ 0.

Example: Here is an example of an SVD:

\begin{pmatrix} 6 & 30 & -21 \\ 17 & 10 & -22 \end{pmatrix} = \begin{pmatrix} 4/5 & -3/5 \\ 3/5 & 4/5 \end{pmatrix} \begin{pmatrix} 45 & 0 & 0 \\ 0 & 15 & 0 \end{pmatrix} \begin{pmatrix} 1/3 & 2/3 & -2/3 \\ 2/3 & -2/3 & -1/3 \\ -2/3 & -1/3 & -2/3 \end{pmatrix}.

Exercise 5: Check that the above decomposition is a Singular Value Decomposition. (You need to check that the left-hand matrix in the decomposition has orthonormal columns, that the rows of the right-hand matrix are orthonormal, and that the middle matrix is “diagonal” with decreasing, positive entries on the diagonal. Of course no work is required to check this third condition.)
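If you prefer to let the computer do the arithmetic, the sketch below checks the decomposition with np.linalg.svd; numpy returns one valid choice of factors, which may differ from the matrices above by signs, but the singular values 45 and 15 agree with any SVD of A:

    import numpy as np

    A = np.array([[6.0, 30.0, -21.0],
                  [17.0, 10.0, -22.0]])

    U, s, Vt = np.linalg.svd(A)            # s holds the singular values, largest first
    Sigma = np.zeros(A.shape)
    Sigma[:len(s), :len(s)] = np.diag(s)   # embed the singular values in a 2 x 3 "diagonal" matrix

    print(s)                               # approximately [45. 15.]
    print(np.allclose(U @ Sigma @ Vt, A))  # True: A = U Sigma V^T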

Here are the key facts about the SVD:

Theorem 3.2. Every m × n matrix A admits (many) Singular Value Decompositions.

Fact 3.3. If A = UΣV^T is a Singular Value Decomposition of an m × n matrix A, then

  • The numbers σ_{i,i} are the square roots of the eigenvalues of A^T A, repeated according to their multiplicities as roots of the characteristic polynomial of A^T A. (Note that these eigenvalues are non-negative, as we checked above, so their square roots are real.)
  • The columns of V are eigenvectors of A^T A.
  • The columns of U are eigenvectors of AA^T.

The terms σ_{i,i} are called the singular values of A. We'll set σ_i = σ_{i,i} for convenience.

It is not hard to check the second two statements in Fact 3.3, and in doing so we'll also check the first statement. Let V = [~v_1 ... ~v_n]. To see why each ~v_i is an eigenvector of A^T A, we simply write out A^T A in terms of the given SVD:

(1)    A^T A~v_i = (UΣV^T)^T(UΣV^T)~v_i = VΣ^T(U^T U)(ΣV^T~v_i) = VΣ^T Σ~e_i.

The last step uses Exercise 4 and the fact that the columns of V are orthonormal (think about this!). Now, Σ^T Σ is diagonal with diagonal entries σ_i^2, so Σ^T Σ~e_i = σ_i^2 ~e_i and hence

A^T A~v_i = VΣ^T Σ~e_i = Vσ_i^2 ~e_i = σ_i^2 ~v_i.

So ~v_i is an eigenvector of A^T A with eigenvalue σ_i^2.
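Here is a numerical illustration of the first statement of Fact 3.3, on an arbitrary matrix: the squares of the singular values match the largest eigenvalues of A^T A (the remaining eigenvalues of A^T A are zero):

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.standard_normal((3, 5))

    s = np.linalg.svd(A, compute_uv=False)        # singular values, largest first
    lams = np.linalg.eigvalsh(A.T @ A)[::-1]      # eigenvalues of A^T A, largest first

    # sigma_i^2 matches the i-th largest eigenvalue of A^T A.
    print(np.allclose(s**2, lams[:len(s)]))       # True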

Exercise 6: Give a similar explanation for why the columns of U are eigenvectors of AA^T.

Exercise 7: Show that for any square matrix A, the eigenvectors of A are also eigenvectors of A^2. What are the eigenvalues for A^2?
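A quick numerical illustration of Exercise 7's claim (it does not replace the argument the exercise asks for; the matrix is an arbitrary choice):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [0.0, 3.0]])
    eigvals, V = np.linalg.eig(A)      # columns of V are eigenvectors of A

    for lam, v in zip(eigvals, V.T):
        # If Av = lam*v, then A^2 v = A(lam*v) = lam^2 * v.
        print(np.allclose(A @ A @ v, lam**2 * v))   # True for each eigenvector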

Example: If A is a symmetric n × n matrix, then we learned in the previous section that A is diagonalizable; that is, A = S^{-1}DS for some diagonal matrix D and some invertible matrix S. In fact, we saw that the columns of S corresponding to different eigenvalues are orthogonal to one another. If you apply Gram–Schmidt to the columns of S, you'll get an orthogonal matrix T that also diagonalizes A. If you order the columns of T and D so that the diagonal entries of D are decreasing, then you've found an SVD for A.

For instance, let

A = \begin{pmatrix} 6 & 0 & 6 \\ 0 & 12 & 6 \\ 6 & 6 & 9 \end{pmatrix}.

To find an SVD for A, we can just diagonalize A (making sure the matrix S has orthonormal columns). We need to figure out the eigenvalues of A^T A. Since A is symmetric, A^T A = A^2, and the eigenvalues of A^2 are just the squares of the eigenvalues of A (see Exercise 7). So the singular values of A are just the eigenvalues of A! To compute these eigenvalues/singular values, we first compute the characteristic polynomial, which is the determinant of

A − λI = \begin{pmatrix} 6-λ & 0 & 6 \\ 0 & 12-λ & 6 \\ 6 & 6 & 9-λ \end{pmatrix}.

Using cofactor expansion and simplifying shows that the characteristic polynomial is

−λ^3 + 27λ^2 − 162λ = −λ(λ^2 − 27λ + 162) = −λ(λ − 9)(λ − 18),

so the eigenvalues of A, and hence its singular values, are 18, 9, and 0.
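As a check on this arithmetic, the sketch below computes the eigenvalues and the singular values of A; eigvalsh lists eigenvalues in increasing order, while svd lists singular values in decreasing order:

    import numpy as np

    A = np.array([[6.0, 0.0, 6.0],
                  [0.0, 12.0, 6.0],
                  [6.0, 6.0, 9.0]])

    print(np.linalg.eigvalsh(A))               # approximately [ 0.  9. 18.]
    print(np.linalg.svd(A, compute_uv=False))  # approximately [18.  9.  0.]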

  4. Application: the four fundamental subspaces. Given an SVD A = UΣV^T, it's relatively easy to find bases for the four fundamental subspaces R(A), N(A), R(A^T), and N(A^T). This is based on the following fact.

Fact 4.1. If A has r non-zero singular values, counted according to their multiplicity (i.e. if Σ has r non-zero entries), then A has rank r.

Why is this? Row reduce A = UΣV^T by multiplying on the left by elementary matrices. Since U is invertible you'll eventually get down to ΣV^T, whose rows are σ_i~v_i^T. The first r of these rows are non-zero and linearly independent, while the rest are zero. So ΣV^T has r pivotal rows, so you'll get r pivots when you finish row reducing A = UΣV^T.
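A small numerical illustration of Fact 4.1, using a matrix deliberately built to be rank-deficient:

    import numpy as np

    # A 3 x 4 matrix whose third row is the sum of the first two, so its rank is 2.
    A = np.array([[1.0, 2.0, 0.0, 1.0],
                  [0.0, 1.0, 1.0, 3.0],
                  [1.0, 3.0, 1.0, 4.0]])

    s = np.linalg.svd(A, compute_uv=False)
    r = int(np.sum(s > 1e-10))                 # number of non-zero singular values

    print(r, np.linalg.matrix_rank(A))         # both are 2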

We can now explain how to find orthonormal bases for the four fundamental subspaces.

The Range of A: The first r columns of U form an orthonormal basis for R(A): for 1 ≤ i ≤ r we have

A~v_i = UΣV^T~v_i = UΣ~e_i = Uσ_i~e_i = σ_i~u_i,

so the vectors ~u_i are in R(A), and they're linearly independent (because they're orthonormal). Since dim R(A) = rank(A) = r, these vectors must form an (orthonormal) basis.

The Null Space of A: The last n − r columns of V form a basis for N (A). These vectors are linearly independent (because they’re orthonormal), and they lie in N (A) because

A~v_i = UΣV^T~v_i = UΣ~e_i = U~0 = ~0

for i > r. By the Rank–Nullity Theorem, dim N(A) = n − r, so the linearly independent set {~v_{r+1}, ..., ~v_n} must form a basis for N(A).

To find bases for N(A^T) and R(A^T), just notice that A^T = (UΣV^T)^T = VΣ^T U^T. This is an SVD for A^T, because the diagonal entries of Σ and Σ^T are the same, and V and U are orthogonal. Applying the same reasoning as above to A^T = VΣ^T U^T, we find that the first r columns of V are a basis for R(A^T) and the last m − r columns of U are a basis for N(A^T).

Example: Consider the SVD

\begin{pmatrix} 4 & 11 & 14 \\ 8 & 7 & -2 \end{pmatrix} = \begin{pmatrix} 3/\sqrt{10} & 1/\sqrt{10} \\ 1/\sqrt{10} & -3/\sqrt{10} \end{pmatrix} \begin{pmatrix} 6\sqrt{10} & 0 & 0 \\ 0 & 3\sqrt{10} & 0 \end{pmatrix} \begin{pmatrix} 1/3 & 2/3 & 2/3 \\ -2/3 & -1/3 & 2/3 \\ 2/3 & -2/3 & 1/3 \end{pmatrix}.

Letting A denote the matrix on the left, we have:

  • the two columns of U, (3/√10, 1/√10) and (1/√10, −3/√10), form a basis for R(A), so R(A) is all of R^2;
  • the last column of V, (2/3, −2/3, 1/3), is a basis for N(A);
  • the first two columns of V, (1/3, 2/3, 2/3) and (−2/3, −1/3, 2/3), form a basis for R(A^T);
  • and N(A^T) = {~0} (so, if you like, it has an empty basis).
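The same bases can be read off numerically; np.linalg.svd may return these basis vectors with different signs, but the subspaces they span are the same:

    import numpy as np

    A = np.array([[4.0, 11.0, 14.0],
                  [8.0, 7.0, -2.0]])
    U, s, Vt = np.linalg.svd(A)
    r = int(np.sum(s > 1e-10))        # number of non-zero singular values (here r = 2)

    range_A  = U[:, :r]               # orthonormal basis for R(A): first r columns of U
    null_A   = Vt[r:, :].T            # basis for N(A): last n - r columns of V
    range_At = Vt[:r, :].T            # basis for R(A^T): first r columns of V
    null_At  = U[:, r:]               # basis for N(A^T): last m - r columns of U (none here)

    print(null_A.ravel())             # approximately +/- [2/3, -2/3, 1/3]
    print(null_At.shape)              # (2, 0): N(A^T) is the zero subspace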

Exercise 8: Find bases (and dimensions) of the four fundamental subspaces of the matrix

A = \begin{pmatrix} 6 & 30 & -21 \\ 17 & 10 & -22 \end{pmatrix},

whose singular value decomposition was given in the example following Definition 3.1. Use your answer to compute the projection of (1, 1, 1) onto the row space of A.
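For the last part, an orthonormal basis for the row space R(A^T) sits in the first r columns of V, and the projection is then a single matrix product; here is a sketch of those mechanics:

    import numpy as np

    A = np.array([[6.0, 30.0, -21.0],
                  [17.0, 10.0, -22.0]])
    U, s, Vt = np.linalg.svd(A)
    r = int(np.sum(s > 1e-10))

    Vr = Vt[:r, :].T                  # first r columns of V: orthonormal basis for R(A^T)
    x = np.array([1.0, 1.0, 1.0])
    proj = Vr @ (Vr.T @ x)            # orthogonal projection of x onto the row space
    print(proj)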

  5. Application: data compression. One of the principal uses of the SVD is in data compression. The basic idea is that the important information in a matrix is really contained in its largest singular values, and one may often ignore the smaller singular values without losing the essential features of the data.

If A = UΣV^T is a singular value decomposition of an m × n matrix, and Σ has precisely r non-zero entries, then the rank of A is r (Fact 4.1). Similarly, if we let Σ_k be the matrix formed from Σ by replacing σ_{k+1}, ..., σ_r by zeros, then A_k = UΣ_kV^T is a rank k matrix, and is the best approximation to A by a rank k matrix. (This can be made precise, but we won't worry about the details.) We will refer to A_k = UΣ_kV^T as the rank k approximation to A.

Now, deleting entries from the diagonal of Σ doesn't really get rid of all that much information: if you had to transmit the entire matrices U, V, and Σ_k, you'd be transmitting m^2 + n^2 + k numbers, which is actually more numbers than were in the original m × n matrix A! However, we can re-write the rank k approximation A_k = UΣ_kV^T in a way that requires much less information than was contained in the original matrix A.

Fact 5.1. If the columns of V are ~v_i and the columns of U are ~u_i, then we have

A_k = U_kΣ_k(V_k)^T,

where U_k = [~u_1 ... ~u_k ~0 ... ~0] and V_k = [~v_1 ... ~v_k ~0 ... ~0]. In other words, we can just replace all the columns of U and V after the kth column by zeros.

This is straightforward to check: just think about multiplying U(Σ_kV^T), and you'll see that only the first k columns of U and V are actually used in computing these products.

Note that this new description of our rank k approximation really does contain a lot less data than was in the original matrix A: there are just k non-zero columns of U_k, each containing m entries, and just k non-zero columns of V_k, each containing n entries. So there are a total of km + kn + k non-zero entries in U_k, Σ_k, and V_k, which is much less than the mn entries in the original matrix A (if k is much less than m and n). Note also that the entries of the vectors ~u_i and ~v_i are necessarily small numbers, since these are unit vectors, while the entries of A could all be quite large.
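Here is a sketch of the rank k approximation and the km + kn + k storage count in numpy; the matrix size and the value of k are arbitrary choices for illustration:

    import numpy as np

    rng = np.random.default_rng(3)
    A = rng.standard_normal((50, 40))
    U, s, Vt = np.linalg.svd(A, full_matrices=False)   # "thin" SVD

    k = 5
    Ak = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]         # rank k approximation A_k

    print(np.linalg.matrix_rank(Ak))                   # k, i.e. 5
    stored = k * A.shape[0] + k * A.shape[1] + k       # km + kn + k numbers to keep
    print(stored, A.size)                              # 455 versus 2000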