## discussion

There was a lot going on in the example of the SN decomposition (2 of 3). First off, we found eigenvalues of a non-diagonable matrix A, and constructed a diagonal matrix D from them. Then we found 2 eigenvectors and 1 generalized eigenvector of A, and used them to construct a transition matrix P. We used that transition matrix to go from our diagonal D back to the original basis, and find S similar to D.

So S is diagonable while A is not. And A and S have the same eigenvalues; and the columns of P should be eigenvectors of S. They are. The generalized eigenvector that we found for A is an (ordinary) eigenvector of S, but we had to get a generalized eigenvector of A in order to construct S from D.

I wonder. Can I understand the distinction between eigenvectors and generalized eigenvectors by studying S and A? We’ll see.

I would also remark that A was special in one sense: it was lower triangular. It is not an accident that the eigenvalues of A are its diagonal elements. We could have written the diagonal matrix D by inspection. Instead, I got it together with the eigenvectors.

A question: what does P do to A? We compute

$P^{-1}\ A\ P = \left(\begin{array}{lll} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1\end{array}\right)$

It brings A to Jordan Canonical Form. (Did I make an especially good choice for the generalized eigenvector? I don’t know. It may be that any choice would have led to the JCF; or maybe not. My vague recollection from examples years ago is that I lucked out, that in general I have to use some of that considerable freedom I found in v to get JCF.)
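As a rough numerical check on the question in parentheses, here is a minimal NumPy sketch. The vectors in it are assumptions: they were recomputed from A for this illustration and need not match the P used in the example. The helper name `transition_matrix` and the parameter `c` (the free third component of the generalized eigenvector) are hypothetical. It only probes this particular A, so it says nothing about the general case.

```python
import numpy as np

A = np.array([[1.0, 0.0, 0.0],
              [-1.0, 2.0, 0.0],
              [1.0, 1.0, 2.0]])

# Target Jordan form, as displayed above.
J = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 1.0]])


def transition_matrix(c):
    """Build a candidate P (hypothetical helper, not from the original post).

    Columns, in order: eigenvector for eigenvalue 2, generalized
    eigenvector for eigenvalue 2, eigenvector for eigenvalue 1.
    The generalized eigenvector w solves (A - 2I) w = v, which leaves
    its third component c free.
    """
    v = np.array([0.0, 0.0, 1.0])    # A v = 2 v
    w = np.array([0.0, 1.0, c])      # (A - 2I) w = v
    u = np.array([1.0, 1.0, -2.0])   # A u = u
    return np.column_stack([v, w, u])


# Try a few values of the free parameter and compare with J.
for c in (0.0, 1.0, -3.5):
    P = transition_matrix(c)
    print(c, np.allclose(np.linalg.inv(P) @ A @ P, J))
```

For this A, every value of c tried gives the same Jordan form, provided the generalized eigenvector w sits right after the eigenvector v = (A - 2I) w that it maps to.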

If finding generalized eigenvectors is relatively painless for you, you may be happy with the SN decomposition. (Of course, if you have a “matrix exponential” command available, you’re done.) If not, another possibility is to use the Cayley-Hamilton theorem: any square matrix satisfies its own characteristic equation. In this example, the characteristic equation is

$\lambda ^3-5 \lambda ^2+8 \lambda -4 = 0$

(The roots of that are the eigenvalues, of course.) The Cayley-Hamilton theorem says that

$A^3\ - 5\ A^2 + 8\ A - 4 = 0$

where the RHS must be the 3×3 zero matrix, and the –4 on the LHS must multiply the 3×3 identity matrix. And, indeed, A and its powers satisfy that equation.
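The claim that A satisfies its characteristic equation is easy to confirm by machine. The following is a minimal sketch assuming NumPy is available; the variable names `coeffs` and `residual` are made up for this illustration. It uses `np.poly` to recover the characteristic polynomial and then evaluates the left-hand side with the constant term multiplying the identity.

```python
import numpy as np

A = np.array([[1, 0, 0],
              [-1, 2, 0],
              [1, 1, 2]])
I = np.eye(3, dtype=int)

# Characteristic polynomial coefficients, highest power first.
# np.poly works through the eigenvalues, so the result is floating
# point and only approximately equal to [1, -5, 8, -4].
coeffs = np.poly(A)

# Left-hand side of the Cayley-Hamilton equation; the constant -4
# multiplies the 3x3 identity. Integer arithmetic, so this is exact.
residual = A @ A @ A - 5 * (A @ A) + 8 * A - 4 * I

print(coeffs)
print(residual)
```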

I used to wonder what the Cayley-Hamilton theorem was good for. One thing it’s good for is turning higher powers of A into lower ones. Use it to express A^3 in terms of I, A, and A^2. Then reduce A^4 the same way, and keep going. For our example, we could replace the infinite series in A by 3 infinite series of scalars: one series multiplies I, another multiplies A, and the third multiplies A^2. Maybe we could see the patterns for the three scalar series more easily than a pattern for the series in A itself.
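To make the reduction concrete, here is a sketch of one way it could be coded, assuming NumPy. If A^n/n! = a I + b A + c A^2, then multiplying by A, substituting A^3 = 5 A^2 - 8 A + 4 I, and dividing by n+1 gives the next triple (4c, a - 8c, b + 5c)/(n+1). The function names and the truncation at 40 terms are choices made for this illustration, not part of the original example. The result is compared against a plain truncated power series in A.

```python
import numpy as np

A = np.array([[1.0, 0.0, 0.0],
              [-1.0, 2.0, 0.0],
              [1.0, 1.0, 2.0]])
I = np.eye(3)
A2 = A @ A


def exp_via_cayley_hamilton(terms=40):
    """Sum exp(A) as three scalar series multiplying I, A, and A^2."""
    a, b, c = 1.0, 0.0, 0.0   # A^0 / 0! = 1*I + 0*A + 0*A^2
    sa = sb = sc = 0.0        # running sums of the three scalar series
    for n in range(terms):
        sa, sb, sc = sa + a, sb + b, sc + c
        # Step from A^n/n! to A^(n+1)/(n+1)!, replacing A^3 by
        # 5 A^2 - 8 A + 4 I (Cayley-Hamilton for this A).
        a, b, c = (4.0 * c / (n + 1),
                   (a - 8.0 * c) / (n + 1),
                   (b + 5.0 * c) / (n + 1))
    return sa * I + sb * A + sc * A2, (sa, sb, sc)


def exp_via_power_series(terms=40):
    """Reference: truncated power series summed directly in matrices."""
    total = np.zeros((3, 3))
    term = np.eye(3)          # holds A^n / n!
    for n in range(terms):
        total += term
        term = term @ A / (n + 1)
    return total


E_ch, scalars = exp_via_cayley_hamilton()
E_ps = exp_via_power_series()
print(scalars)
print(np.allclose(E_ch, E_ps))
```

Only the 3x3 products A and A^2 are ever formed; everything else is scalar bookkeeping, which is the point of the reduction.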

There is yet another way to do this; we’ll see it when I look at the spectral decomposition theorem out of Halmos.

Ah, there’s one last point I want to make. Schur’s lemma told us that any matrix could be brought to upper triangular form. I didn’t say this, but any nilpotent matrix is similar to a strictly upper triangular matrix (i.e. upper triangular with zero diagonal). It’s also similar to a strictly lower triangular form; I think upper or lower should just depend on an ordering of the basis.

So why not just split our triangular matrix into diagonal plus nilpotent? Because we need S and N to commute. Our example is a perfect illustration: we have a lower triangular matrix

$A = \left(\begin{array}{lll} 1 & 0 & 0 \\ -1 & 2 & 0 \\ 1 & 1 & 2\end{array}\right)$

so we can write it as the sum of diagonal $\Sigma$

$\Sigma = \left(\begin{array}{lll} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2\end{array}\right)$

and nilpotent B

$B = \left(\begin{array}{lll} 0 & 0 & 0 \\ -1 & 0 & 0 \\ 1 & 1 & 0\end{array}\right)$

While B is, indeed, nilpotent, it turns out that it is B^3 that equals zero rather than B^2. (For N, by contrast, N^2 is already zero.)

Do B and $\Sigma$ commute? We compute one

$B\ \Sigma = \left(\begin{array}{lll} 0 & 0 & 0 \\ -1 & 0 & 0 \\ 1 & 2 & 0\end{array}\right)$

and the other

$\Sigma \ B = \left(\begin{array}{lll} 0 & 0 & 0 \\ -2 & 0 & 0 \\ 2 & 2 & 0\end{array}\right)$

No, they do not commute. It is nontrivial that we can find N and S which commute. We needed to go from D back to S, and get N = A – S.
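The contrast between the two splittings can be checked side by side. This is a minimal sketch assuming NumPy; the P below was recomputed from A for this illustration (eigenvector for 2, generalized eigenvector for 2, eigenvector for 1, as columns) and may differ from the P in the example by the choice of generalized eigenvector.

```python
import numpy as np

A = np.array([[1, 0, 0],
              [-1, 2, 0],
              [1, 1, 2]])

# Naive split: diagonal part plus strictly lower triangular part.
Sigma = np.diag(np.diag(A))
B = A - Sigma

# SN split: S = P D P^{-1}, N = A - S.
# P is an assumption, rebuilt from A for this sketch.
P = np.array([[0.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, -2.0]])
D = np.diag([2.0, 2.0, 1.0])
S = P @ D @ np.linalg.inv(P)
N = A - S

print("B and Sigma commute:", np.array_equal(B @ Sigma, Sigma @ B))
print("S and N commute:    ", np.allclose(S @ N, N @ S))
print("B^2 is zero:", not np.any(B @ B), " B^3 is zero:", not np.any(B @ B @ B))
print("N^2 is zero:", np.allclose(N @ N, 0))
```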

Incidentally, the online reference linked back at the beginning of 1 of 3 shows that Schur’s lemma itself can be used to compute the matrix exponential numerically, although the triangular form which Schur’s lemma gives us cannot be used to get N and S. (I.e. we can use Schur’s lemma as the basis for a different algorithm.)