






THE CAYLEY-HAMILTON AND JORDAN NORMAL FORM THEOREMS

GABRIEL DAY
Abstract. We present three proofs for the Cayley-Hamilton Theorem. The final proof is a corollary of the Jordan Normal Form Theorem, which will also be proved here.
Date: September 8, 2017.
The first proof relies on the representation of every matrix by an upper triangular matrix. While this lemma is also used in Section 3, the proof presented there relies on analysis, namely, the density of diagonalizable matrices among all matrices. The third proof follows from the Jordan Normal Form Theorem.
Lemma 2.1. Every operator on a nonzero, finite-dimensional complex vector space has an eigenvalue.
Proof. Suppose that $V$ is a complex vector space with $\dim V = n > 0$, and let $T$ be an operator on $V$. Choose $v \in V$ with $v \neq 0$. The list of vectors

\[
(v, Tv, T^2v, \dots, T^n v)
\]

must be linearly dependent, as it contains $n + 1$ vectors while the dimension of $V$ is $n$. Thus there exist scalars $a_0, a_1, \dots, a_n \in \mathbb{C}$, not all of which equal zero, such that

(2.2) $0 = a_0 v + a_1 Tv + a_2 T^2 v + \cdots + a_n T^n v.$

Let $m$ be the largest index with $a_m \neq 0$. Note that $m \geq 1$: if $a_1 = \cdots = a_n = 0$, then (2.2) would read $a_0 v = 0$ with $v \neq 0$, forcing $a_0 = 0$ as well, a contradiction. The Fundamental Theorem of Algebra ensures that the polynomial with coefficients $a_0, \dots, a_m$ factors into $m$ linear terms with roots $\lambda_1, \dots, \lambda_m \in \mathbb{C}$. That is, for all $x \in \mathbb{C}$,

(2.3) $a_0 + a_1 x + a_2 x^2 + \cdots + a_m x^m = c(x - \lambda_1) \cdots (x - \lambda_m),$

where $c \in \mathbb{C}$ is a nonzero constant. Substituting $T$ for $x$ in (2.3) and applying both sides to $v$, (2.2) gives

\[
0 = c(T - \lambda_1 I) \cdots (T - \lambda_m I)v.
\]

Since $v \neq 0$, at least one of the factors $T - \lambda_j I$ is not injective. So $T$ has an eigenvalue.
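The construction in this proof can be carried out numerically. The following is a minimal sketch in Python with numpy, not part of the paper's argument; the matrix $T$ and vector $v$ are arbitrary illustrative choices:

```python
import numpy as np

# Illustrative 2x2 example: T rotates the plane by 90 degrees,
# so its eigenvalues are the non-real pair +i and -i.
T = np.array([[0.0, -1.0],
              [1.0,  0.0]])
v = np.array([1.0, 0.0])  # any nonzero vector works
n = T.shape[0]

# The n + 1 vectors v, Tv, ..., T^n v must be linearly dependent in C^n.
cols = np.column_stack([np.linalg.matrix_power(T, k) @ v for k in range(n + 1)])

# A nonzero coefficient vector a with cols @ a = 0 lies in the null space;
# the right singular vector for the smallest singular value spans it here.
a = np.linalg.svd(cols)[2][-1]

# Roots of a_0 + a_1 x + ... + a_n x^n; np.roots wants highest degree first.
roots = np.roots(a[::-1])
print(sorted(roots, key=np.imag))                  # [-1j, 1j] up to rounding
print(sorted(np.linalg.eigvals(T), key=np.imag))   # matches: ±i are eigenvalues of T
```

As the lemma predicts, some root of the dependence polynomial is an eigenvalue of $T$, even though $T$ has no real eigenvalues.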
Using this result, we prove that every operator has an upper triangular matrix with respect to some basis, a lemma critical to this and later proofs of C-H. First, however, it is useful to note the following equivalence:
Remark 2.4. For an operator $T$ on a vector space $V$ with basis $(v_1, \dots, v_n)$, the matrix of $T$ with respect to $(v_1, \dots, v_n)$ is upper triangular if, and only if,

\[
T v_k \in \operatorname{span}(v_1, \dots, v_k)
\]

for each $k = 1, \dots, n$.
This equivalence follows from the definitions of upper triangular matrix and the matrix for an operator with respect to a basis. Now we prove the crucial lemma for this and the following proof of C-H.
As a result of this conclusion, if $\lambda_1, \dots, \lambda_n$ are the eigenvalues of $T$, listed with multiplicities, then an upper triangular matrix for $T$ has the form

(2.11)
\[
\begin{pmatrix} \lambda_1 & & * \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}.
\]
Since each eigenvalue appears on the diagonal as many times as its multiplicity, and the diagonal has length $\dim V$, if $U_1, \dots, U_m$ are the eigensubspaces of $T$, then

(2.12) $\dim V = \dim U_1 + \cdots + \dim U_m.$
We will give here a brief proof of an important structural theorem, which, though not used in this first proof of C-H, intimately relates to the concepts just stated, and will be employed later on in the paper to prove the Jordan Normal Form Theorem.
Theorem 2.13. Suppose that $T$ is an operator on a vector space $V$, with distinct eigenvalues $\lambda_1, \dots, \lambda_m$ and corresponding eigensubspaces $U_1, \dots, U_m$. Then $V$ is equal to the direct sum of these subspaces:

\[
V = U_1 \oplus \cdots \oplus U_m.
\]
Proof. Because $U_1 + \cdots + U_m$ is a subspace of $V$, and (2.12) holds, this sum and $V$ must be equal. So

\[
V = U_1 + \cdots + U_m.
\]

Along with (2.12), this is sufficient to conclude that the sum is direct.
Definition 2.14. The polynomial $(x - \lambda_1)^{d_1} \cdots (x - \lambda_m)^{d_m}$ is the characteristic polynomial of $T$, where $\lambda_1, \dots, \lambda_m$ are the distinct eigenvalues of $T$ and $d_1, \dots, d_m$ denote their respective multiplicities. We denote the characteristic polynomial of $T$ by $f_T$.
With that, we have arrived at the first proof of the Cayley-Hamilton Theorem.
Theorem 2.15 (Cayley-Hamilton). Suppose that $T$ is an operator on a complex vector space $V$. Then

\[
f_T(T) = 0.
\]
Proof. Suppose that $(v_1, \dots, v_n)$ is a basis for $V$ with respect to which $T$ has the form given in (2.11). To show that $f_T(T) = 0$, we need to show that $f_T(T)v_j = 0$ for all $j$. To do this, it suffices to show that

(2.16) $(T - \lambda_1 I) \cdots (T - \lambda_j I)v_j = 0,$

as the polynomial by which $v_j$ is multiplied is a factor of $f_T$. We prove this by induction on $j$. The case $j = 1$ is given by $Tv_1 = \lambda_1 v_1$, the definition of an eigenvector. Now suppose that $1 < j \leq n$ and that (2.16) holds for all smaller indices. The form in (2.11) gives that the $j$th column of $T - \lambda_j I$ has zeros in the $j$th entry and all entries below, so $(T - \lambda_j I)v_j$ is a linear combination of $v_1, \dots, v_{j-1}$. By the induction hypothesis, applying $(T - \lambda_1 I) \cdots (T - \lambda_{j-1} I)$ to $(T - \lambda_j I)v_j$ gives zero. Therefore (2.16) is satisfied, and the proof is complete.
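As a sanity check on the theorem, one can evaluate the characteristic polynomial at a matrix numerically. The following is a small sketch in Python with numpy; the matrix $A$ is an arbitrary example, and np.poly returns the coefficients of $\det(tI - A)$:

```python
import numpy as np

# An arbitrary example matrix (any square matrix would do).
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 3.0]])
n = A.shape[0]

# Coefficients of the characteristic polynomial, highest degree first.
coeffs = np.poly(A)

# Evaluate f_A(A) by Horner's rule, with matrix products in place of scalar ones.
f_of_A = np.zeros_like(A)
for c in coeffs:
    f_of_A = f_of_A @ A + c * np.eye(n)

print(np.allclose(f_of_A, np.zeros((n, n))))  # True: f_A(A) = 0
```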
With this definition, C-H is stated precisely as in (2.15). This proof requires several steps. First, we prove C-H for diagonal matrices, then for diagonalizable matrices, and finally for all operators. A matrix is diagonal if its entries are all zero, except possibly those on the main diagonal. One can easily see that if $A = \operatorname{diag}(a_1, \dots, a_n)$ and $B = \operatorname{diag}(b_1, \dots, b_n)$ are two diagonal matrices, then

\[
A + B = \operatorname{diag}(a_1 + b_1, \dots, a_n + b_n).
\]

Likewise, $AB = \operatorname{diag}(a_1 b_1, \dots, a_n b_n)$. So if a polynomial $f$ is applied to $A$, then $f(A) = \operatorname{diag}(f(a_1), \dots, f(a_n))$. The determinant of a diagonal matrix is simply the product of its diagonal entries, so $f_A(x) = (x - a_1) \cdots (x - a_n)$, and each diagonal entry $a_i$ is a root of $f_A$. Hence $f_A(A) = \operatorname{diag}(f_A(a_1), \dots, f_A(a_n)) = 0$, and diagonal matrices satisfy C-H.
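As a concrete instance of this computation (a worked $2 \times 2$ example with arbitrarily chosen diagonal entries):

\[
A = \operatorname{diag}(2, 5), \qquad f_A(x) = (x - 2)(x - 5),
\]
\[
f_A(A) = (A - 2I)(A - 5I)
       = \begin{pmatrix} 0 & 0 \\ 0 & 3 \end{pmatrix}
         \begin{pmatrix} -3 & 0 \\ 0 & 0 \end{pmatrix}
       = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} = 0.
\]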
Definition 3.1. Two matrices $A$ and $B$ are said to be similar, denoted $A \sim B$, if there exists some invertible matrix $S$ such that

\[
A = S^{-1}BS.
\]

Stating that two matrices are "similar" is synonymous with saying that they represent the same operator with respect to different bases; the invertible matrix represents a change of basis. So (2.5) can be rephrased: "every matrix over a complex vector space is similar to an upper triangular matrix."
Lemma 3.2. If two matrices $A$ and $B$ are similar, then their characteristic polynomials are equal.

Proof. Let $f_A$ and $f_B$ denote the characteristic polynomials of $A$ and $B$, respectively. Since $A \sim B$, there exists an invertible matrix $S$ such that $A = S^{-1}BS$. The following lines proceed from the definitions of characteristic polynomial and similarity, together with some basic properties of the determinant:

\[
f_A = \det(tI - S^{-1}BS) = \det(tS^{-1}S - S^{-1}BS) = \det(S^{-1}(tI - B)S)
\]
\[
= \det(S^{-1})\det(tI - B)\det(S) = \det(S^{-1}S)\det(tI - B) = \det(tI - B) = f_B.
\]
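A quick numerical illustration of the lemma (a numpy sketch; $B$ and the invertible matrix $S$ are arbitrary choices made for the example):

```python
import numpy as np

# Arbitrary matrix B and invertible change-of-basis matrix S.
B = np.array([[1.0, 2.0],
              [3.0, 4.0]])
S = np.array([[1.0, 1.0],
              [0.0, 1.0]])

A = np.linalg.inv(S) @ B @ S   # A ~ B by construction

# np.poly gives the characteristic polynomial coefficients of each matrix.
print(np.allclose(np.poly(A), np.poly(B)))  # True: f_A = f_B
```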
Definition 3.3. A matrix is diagonalizable if it is similar to a diagonal matrix.
eigenvectors to distinct eigenvalues. Because these vectors are linearly independent by the inductive hypothesis, $a_i = 0$ for all $i$. Thus, returning to (3.7), we find that $a_{k+1}v_{k+1} = 0$, so $a_{k+1} = 0$. Hence the eigenvectors are linearly independent.
Lemma 3.10. If a $k \times k$ matrix $A$ has $k$ linearly independent eigenvectors, then it is diagonalizable.

Proof. Suppose $A$ has $k$ linearly independent eigenvectors $x_1, \dots, x_k$, with $Ax_i = \lambda_i x_i$. Let $S = [x_1, \dots, x_k]$ and let $D = \operatorname{diag}(\lambda_1, \dots, \lambda_k)$ be the diagonal matrix of corresponding eigenvalues. By the definition of eigenvector, $AS = SD$. The columns of $S$ are linearly independent, so $S$ is invertible. Thus $A = ASS^{-1} = SDS^{-1}$, and $A$ is diagonalizable.
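The construction in this proof can be carried out numerically (a numpy sketch; the example matrix is an arbitrary choice with distinct eigenvalues, so its eigenvectors are independent):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])   # eigenvalues 5 and 2, distinct

# Columns of S are eigenvectors of A; D carries the matching eigenvalues.
eigvals, S = np.linalg.eig(A)
D = np.diag(eigvals)

# AS = SD, and S is invertible, so A = S D S^{-1}: A is diagonalizable.
print(np.allclose(A @ S, S @ D))                 # True
print(np.allclose(S @ D @ np.linalg.inv(S), A))  # True
```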
One more lemma is needed to prove the general case of C-H, employing a bit of analysis.

Lemma 3.11. Diagonalizable matrices are dense in the set of all complex square matrices.
Proof. Consider a non-diagonalizable matrix $A$. By (2.5), there exists an upper triangular matrix

\[
B = \begin{pmatrix} \lambda_1 & & & * \\ & \lambda_2 & & \\ & & \ddots & \\ 0 & & & \lambda_n \end{pmatrix}
\]

such that $A \sim B$. Take some $\epsilon > 0$, and consider the matrix $B_\epsilon$ produced by making slight variations to the diagonal entries of $B$ by factors of $\epsilon$, so that all of the diagonal entries of $B_\epsilon$ are mutually distinct. For example, if $B$ is a $2 \times 2$ matrix with both diagonal entries equal to $1$, then $B_\epsilon$ could have diagonal entries $1$ and $1 - 2\epsilon$. Because $B_\epsilon$ has $n$ distinct eigenvalues equal to its diagonal entries (we know these are eigenvalues from (2.11)), $B_\epsilon$ has $n$ linearly independent eigenvectors and is thus diagonalizable. As $\epsilon \to 0$, the $B_\epsilon$ go to $B$, and $B \sim A$. Thus diagonalizable matrices are dense among all complex square matrices.
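The $2 \times 2$ example from the proof can be checked directly (a numpy sketch; the perturbation $1 - 2\epsilon$ follows the example above, and the chosen values of $\epsilon$ are arbitrary):

```python
import numpy as np

B = np.array([[1.0, 1.0],
              [0.0, 1.0]])   # repeated eigenvalue 1; not diagonalizable

for eps in [1e-1, 1e-2, 1e-3]:
    B_eps = B.copy()
    B_eps[1, 1] = 1.0 - 2.0 * eps   # distinct diagonal entries 1 and 1 - 2*eps
    w, S = np.linalg.eig(B_eps)
    # Distinct eigenvalues give two independent eigenvectors: rank(S) = 2,
    # while B_eps differs from B by at most 2*eps, so B_eps -> B as eps -> 0.
    print(eps, np.linalg.matrix_rank(S), np.max(np.abs(B_eps - B)))
```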
Finally, we can prove C-H using density of diagonalizable matrices:
Theorem 3.12. For all complex square matrices $A$, $f_A(A) = 0$.
Proof. If $A$ is diagonalizable, the result follows from Lemma 3.2 and the diagonal case treated above. So consider a non-diagonalizable matrix $A$ as above. For each $\epsilon$,

\[
f_{B_\epsilon}(B_\epsilon) = 0.
\]

Since $B_\epsilon \to B$, we have $f_{B_\epsilon} \to f_B = f_A$, and taking the determinant is continuous (changing the values on the diagonal simply changes the coefficients of the polynomial), so

\[
f_A(A) = \lim_{\epsilon \to 0} f_{B_\epsilon}(B_\epsilon) = 0.
\]
This important result is probably more consequential than C-H, and once we have it in hand a third and final proof for C-H is simple. The proof of JNFT presented here, originally found in Axler, makes extensive use of nilpotent operators.
Definition 4.1. An operator (or matrix) $A$ is nilpotent if there exists a positive integer $k$ such that $A^k = 0$.
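For instance (a minimal $2 \times 2$ example):

\[
N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
N^2 = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} = 0,
\]

while $N$ itself is nonzero, so $k = 2$ works here. In the notation introduced below, $m(e_1) = 0$ and $m(e_2) = 1$ for the standard basis vectors $e_1, e_2$.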
This proof of JNFT relies primarily on a lemma which guarantees that a basis for the vector space can be found using a nilpotent operator. For a vector $v$ and nilpotent operator $N$, let $m(v)$ denote the maximum nonnegative integer such that $N^{m(v)}v \neq 0$.
Lemma 4.2. If an operator $N$ on a complex vector space $V$ is nilpotent, then there exist vectors $v_1, \dots, v_k \in V$ such that

(a) the list

(4.3) $(v_1, Nv_1, \dots, N^{m(v_1)}v_1, \dots, v_k, Nv_k, \dots, N^{m(v_k)}v_k)$

is a basis for $V$, and

(b) $(N^{m(v_1)}v_1, \dots, N^{m(v_k)}v_k)$ is a basis for $\ker(N)$.
Proof. We prove this by induction on $\dim V$. The $\dim V = 1$ case is trivial, as $N$ must equal $[0]$. So we assume that the lemma holds for vector spaces of dimension less than that of $V$. Since $N$ is nilpotent, it is not injective, hence $\dim \operatorname{Im} N < \dim V$. Applying our inductive hypothesis (where we replace $V$ with $\operatorname{Im} N$ and $N$ with $N|_{\operatorname{Im} N}$), there are vectors $u_1, \dots, u_j \in \operatorname{Im} N$ such that

(i) $(u_1, Nu_1, \dots, N^{m(u_1)}u_1, \dots, u_j, Nu_j, \dots, N^{m(u_j)}u_j)$ is a basis for $\operatorname{Im} N$, and

(ii) $(N^{m(u_1)}u_1, \dots, N^{m(u_j)}u_j)$ is a basis for $\operatorname{Im} N \cap \ker N$.

Since each $u_r$ lies in $\operatorname{Im} N$, we may define $v_1, \dots, v_j$ by $Nv_r = u_r$ for all $r$. Note that $m(v_r) = m(u_r) + 1$. There exists a subspace $W$ such that

(4.4) $\ker N = (\ker N \cap \operatorname{Im} N) \oplus W.$
We may choose a basis $v_{j+1}, \dots, v_k$ for $W$. Each of these is in $\ker N$, and thus $0 = m(v_{j+1}) = \cdots = m(v_k)$. The list $v_1, \dots, v_k$ is our candidate list of vectors satisfying (a) and (b). Showing that the list of vectors in (4.3) is linearly independent involves applying $N$ to a linear combination of all of these vectors, along with careful use of the $m(v_r)$ values. We can clearly see from the bases we have defined that
\[
\dim \ker N = k.
\]

Moreover, (i) gives that

\[
\dim \operatorname{Im} N = \sum_{r=1}^{j} \bigl(m(u_r) + 1\bigr) = \sum_{r=1}^{j} m(v_r).
\]
Combining these last two equations, we see that the list of vectors in (a) has length

\[
\sum_{r=1}^{k} \bigl(m(v_r) + 1\bigr) = \dim \operatorname{Im} N + \dim \ker N = \dim V,
\]

so the linearly independent list in (4.3) is in fact a basis for $V$.
Now consider an operator $T$ with distinct eigenvalues $\lambda_1, \dots, \lambda_m$ and corresponding eigensubspaces $U_1, \dots, U_m$. From (2.13),

\[
V = U_1 \oplus \cdots \oplus U_m.
\]

Each $(T - \lambda_j I)|_{U_j}$ is nilpotent, a fact which arises from the definition of $U_j$: since $U_j$ is the null space of a power of $T - \lambda_j I$, raising the restriction $(T - \lambda_j I)|_{U_j}$ to that power sends every vector of $U_j$ to zero. So this restriction is nilpotent. Thus there exists a basis for $U_j$ which is a Jordan basis for $(T - \lambda_j I)|_{U_j}$. Putting these bases together into a single basis for $V$, we obtain a Jordan basis for $T$.
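The resulting decomposition can be observed with a computer algebra system. Here is a short sketch using sympy's jordan_form; the example matrix is an arbitrary choice with a repeated eigenvalue, and is not drawn from the paper:

```python
from sympy import Matrix

# An arbitrary example: T has the single eigenvalue 2 with multiplicity 2,
# but only a one-dimensional eigenspace, so its Jordan form has a 2x2 block.
T = Matrix([[3, 1],
            [-1, 1]])

# jordan_form returns P and J with T = P * J * P**-1 and J block diagonal.
P, J = T.jordan_form()
print(J)                      # Matrix([[2, 1], [0, 2]])
print(T == P * J * P.inv())   # True
```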
As previously stated, C-H follows almost directly from JNFT. So here is one last proof of C-H:

Corollary 4.8. If $T$ is an operator on a complex vector space $V$, then

\[
f_T(T) = 0.
\]
Proof. By (3.2) and (3.4), we may assume $T$ is in Jordan Normal Form. We use Axler's definition of the characteristic polynomial, (2.14). Each factor multiplied to obtain $f_T(T)$ is of the form $T - \lambda_j I$, so the Jordan block of $T - \lambda_j I$ for $\lambda_j$ has the form (4.7). Taking powers of this Jordan block, we may observe that the superdiagonal line of ones recedes toward the corner and disappears: the block vanishes once it is raised to the power $d_j$, the multiplicity of $\lambda_j$. Since the power to which $T - \lambda_j I$ is raised in $f_T$ is exactly $d_j$, this Jordan block is annihilated by the characteristic polynomial. Since this holds for each Jordan block, the entire matrix is annihilated.
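To illustrate the receding line of ones (a worked $3 \times 3$ block, where $d_j = 3$):

\[
J - \lambda_j I = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix},
\qquad
(J - \lambda_j I)^2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},
\qquad
(J - \lambda_j I)^3 = 0.
\]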
The power and usefulness of the Cayley-Hamilton Theorem arise from the fact that it holds not only over the complex numbers, as we have proved, but over every field. We hope that through the proofs presented here, readers have gained a better understanding of the interactions among eigenvalues, the determinant, and matrix form, and of linear algebra as a whole.

Acknowledgments. Many thanks to my mentor, Jingren Chi, for his suggestions for the topic of my paper and reading material, and for his help in editing and improving my work. I would also like to thank Dr. Edwin Ihrig for discussing the paper with me and offering his suggestions. I owe much of my understanding of linear algebra to Professor Babai and his apprentice program this summer, in which he provided the second proof presented in this paper. Finally, thank you to Professor Peter May for organizing this excellent REU.
References
[1] S. Axler, Linear Algebra Done Right, 2nd ed., Springer, New York, 1997, pp. 81, 83-85, 164, 167-168, 171-175, 183-187.
[2] L. Babai and M. Tulsiani, Apprentice Program - Summer 2013 REU, class of 7/25. Accessed 27 July 2017. Available online at http://ttic.uchicago.edu/~madhurt/courses/reu2013/class725.pdf.
[3] Wikipedia, the Free Encyclopedia, 2017: Jordan normal form. Accessed 3 August 2017. Available online at https://en.wikipedia.org/wiki/Jordan_normal_form#cite_note-1.