There was a lot going on in the example of the SN decomposition (2 of 3). First off, we found eigenvalues of a non-diagonable matrix A, and constructed a diagonal matrix D from them. Then we found 2 eigenvectors and 1 generalized eigenvector of A, and used them to construct a transition matrix P. We used that transition matrix to go from our diagonal D back to the original basis, and find S similar to D.
So S is diagonable while A is not. And A and S have the same eigenvalues; and the columns of P should be eigenvectors of S. They are. The generalized eigenvector that we found for A is an (ordinary) eigenvector of S, but we had to get a generalized eigenvector of A in order to construct S from D.
I wonder. Can I understand the distinction between eigenvectors and generalized eigenvectors by studying S and A? We’ll see.
I would also remark that A was special in one sense: it was lower triangular. It is not an accident that the eigenvalues of A are its diagonal elements. We could have written the diagonal matrix D by inspection. Instead, I got it together with the eigenvectors.
A question: what does P do to A? We compute
It brings A to Jordan Canonical Form. (Did I make an especially good choice for the generalized eigenvector? I don’t know. It may be that any choice would have led to the JCF; or maybe not. My vague recollection from examples years ago is that I lucked out, that in general I have to use some of that considerable freedom I found in v to get JCF.)
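As a sanity check on this kind of computation, here is a sketch in sympy. The matrix below is a hypothetical stand-in (the post's own matrices are not reproduced here): like A, it is 3×3 lower triangular with a repeated eigenvalue and only two independent eigenvectors, so it is not diagonable.

```python
import sympy as sp

# Hypothetical stand-in for A (not the matrix from the post):
# lower triangular, eigenvalues 1, 2, 2, but only one eigenvector
# for the repeated eigenvalue 2, so it is not diagonalizable.
A = sp.Matrix([[2, 0, 0],
               [1, 2, 0],
               [0, 1, 1]])

print(A.is_diagonalizable())   # False

# sympy builds a transition matrix P and the Jordan form J = P^-1 A P
P, J = A.jordan_form()
print(J)
print(P.inv() * A * P == J)    # True
```

Note that sympy's jordan_form chooses a generalized eigenvector for us, so its P brings the matrix to JCF by construction; that sidesteps the question of how much freedom in the generalized eigenvector still yields JCF.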
If finding generalized eigenvectors is relatively painless for you, you may be happy with the N+S decomposition. (Of course, if you have a “matrix exponential” command available, you’re done.) If not, another possibility is to use the Cayley-Hamilton theorem: any matrix satisfies its own characteristic equation. In this example, the characteristic equation is
(The roots of that are the eigenvalues, of course.) The Cayley-Hamilton theorem says that
where the RHS must be the 3×3 zero matrix, and the –4 on the LHS must multiply the 3×3 identity matrix. And, indeed, A and its powers satisfy that equation.
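As a quick check of the theorem itself, here is a sketch with a hypothetical matrix (not the one from the post) whose characteristic polynomial also happens to have constant term –4:

```python
import sympy as sp

# Hypothetical example matrix; its characteristic polynomial is
# lambda^3 - 5 lambda^2 + 8 lambda - 4, with constant term -4.
A = sp.Matrix([[2, 0, 0],
               [1, 2, 0],
               [0, 1, 1]])

lam = sp.symbols('lambda')
p = A.charpoly(lam).as_expr()
print(p)

# Cayley-Hamilton: substitute A for lambda, with the constant
# term multiplying the 3x3 identity; the result is the zero matrix.
Z = A**3 - 5*A**2 + 8*A - 4*sp.eye(3)
print(Z == sp.zeros(3, 3))   # True
```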
I used to wonder what the Cayley-Hamilton theorem was good for. One thing it’s good for is turning higher powers of A into lower ones. Use it to express A^3 in terms of I, A, and A^2. Then reduce A^4, and keep going. For our example, we could replace the infinite series in A by 3 infinite series of scalars: one series multiplies I, another multiplies A, and the third multiplies A^2. Maybe we could see the patterns for the three scalar series more easily than a pattern for the series in A itself.
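That reduction can be mechanized: reducing lambda^n modulo the characteristic polynomial yields exactly the three scalar coefficients that multiply I, A, and A^2. A sketch, again with a hypothetical cubic characteristic polynomial (not the post's):

```python
import sympy as sp

lam = sp.symbols('lambda')
# Hypothetical characteristic polynomial (degree 3, constant term -4)
char = lam**3 - 5*lam**2 + 8*lam - 4

def reduce_power(n):
    """Coefficients c0, c1, c2 with A^n = c0*I + c1*A + c2*A^2,
    obtained as the remainder of lambda^n mod the char. polynomial."""
    r = sp.rem(lam**n, char, lam)
    return [r.coeff(lam, k) for k in (0, 1, 2)]

print(reduce_power(3))   # [4, -8, 5], i.e. A^3 = 4I - 8A + 5A^2

# Check against a direct power of a matrix with that char. polynomial
A = sp.Matrix([[2, 0, 0], [1, 2, 0], [0, 1, 1]])
c0, c1, c2 = reduce_power(5)
print(A**5 == c0*sp.eye(3) + c1*A + c2*A**2)   # True
```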
There is yet another way to do this; we’ll see it when I look at the spectral decomposition theorem out of Halmos.
Ah, there’s one last point I want to make. Schur’s lemma told us that any matrix could be brought to upper triangular form. I didn’t say this, but any nilpotent matrix is similar to a strictly upper triangular matrix (i.e. upper triangular with zero diagonal). It’s also similar to a strictly lower triangular form; I think upper or lower should just depend on an ordering of the basis.
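The remark about basis ordering can be illustrated directly: conjugating by the permutation matrix that reverses the order of the basis vectors turns strictly lower triangular into strictly upper triangular. A sketch, with a hypothetical nilpotent matrix:

```python
import sympy as sp

# Hypothetical strictly lower triangular (hence nilpotent) matrix
B = sp.Matrix([[0, 0, 0],
               [1, 0, 0],
               [2, 3, 0]])

# Reversing the basis order is conjugation by the "flip" permutation
# matrix (its own inverse); it swaps strictly lower and strictly upper.
Pflip = sp.Matrix([[0, 0, 1],
                   [0, 1, 0],
                   [1, 0, 0]])

U = Pflip.inv() * B * Pflip
print(U)                          # strictly upper triangular
print(B**3 == sp.zeros(3, 3))     # True: B is nilpotent
```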
So why not have just split our triangular matrix into diagonal plus nilpotent? Because we need S and N to commute. Our example is a perfect illustration: we have a lower triangular matrix
so we can write it as the sum of diagonal
and nilpotent B
While B is, indeed, nilpotent, it turns out that B^3 is equal to zero, not B^2.
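To see that behavior concretely, here is a sketch with a hypothetical lower triangular matrix split the same way, diagonal part plus strictly lower triangular part B:

```python
import sympy as sp

# Hypothetical lower triangular stand-in (the post's matrices are not shown)
A = sp.Matrix([[2, 0, 0],
               [1, 2, 0],
               [0, 1, 1]])

# Strictly lower triangular part: nilpotent, but of index 3, not 2
B = A - sp.diag(2, 2, 1)
print(B**2 == sp.zeros(3, 3))   # False: B^2 is not zero
print(B**3 == sp.zeros(3, 3))   # True: only B^3 vanishes
```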
Do B and the diagonal matrix commute? We compute one product
and the other
No, they do not commute. It is nontrivial that we can find N and S which commute. We needed to go from D back to S, and get N = A – S.
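The whole contrast can be checked in a few lines of sympy. With a hypothetical non-diagonable matrix (again, not the post's A), the naive diagonal-plus-strictly-lower split fails to commute, while S = P D P^-1 and N = A – S do commute:

```python
import sympy as sp

# Hypothetical non-diagonalizable lower triangular matrix
A = sp.Matrix([[2, 0, 0],
               [1, 2, 0],
               [0, 1, 1]])

# Naive split: diagonal part plus strictly lower part -- does NOT commute
Dnaive = sp.diag(2, 2, 1)
B = A - Dnaive
print(B * Dnaive == Dnaive * B)   # False

# S + N split: bring the diagonal of eigenvalues back to the original basis
P, J = A.jordan_form()
D = sp.diag(J[0, 0], J[1, 1], J[2, 2])   # diagonal of the Jordan form
S = P * D * P.inv()
N = A - S
print(S * N == N * S)             # True: this pair commutes
print(N**2 == sp.zeros(3, 3))     # True: N is nilpotent
```

The design point is exactly the one in the text: S is built by conjugating the diagonal of eigenvalues back through P, and only then does N = A – S commute with S.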
Incidentally, the online reference linked back at the beginning of 1 of 3 shows that Schur’s lemma itself can be used to compute the matrix exponential numerically, although the triangular form which Schur’s lemma gives us cannot be used to get N and S. (I.e. we can use Schur’s lemma as the basis for a different algorithm.)