i bumped into someone last night who asked me about schur’s lemma, something about bringing a matrix to triangular form. i’ve spent so much time looking at diagonalizing things that i hadn’t appreciated schur’s lemma, and it deserves to be appreciated.
it says that we can bring any (complex) matrix A to upper triangular form using a unitary similarity transform. in this form, the restriction to “unitary” is a bonus: a perfectly useful but weaker statement is that any matrix is similar to an upper triangular matrix.
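to make that concrete, here is a minimal sketch of the usual inductive construction: pick one eigenvector, complete it to an orthonormal basis, and recurse on the smaller block. the function name `schur_sketch` and the test matrix are my own placeholders, and this is numpy rather than mathematica; it is an illustration of the idea, not a numerically careful routine.

```python
import numpy as np

def schur_sketch(A):
    """Return (Q, T) with Q unitary and T upper triangular, A = Q T Q^*."""
    A = np.asarray(A, dtype=complex)
    n = A.shape[0]
    if n == 1:
        return np.eye(1, dtype=complex), A.copy()
    # any one eigenvector of A serves as the first basis vector
    _, vecs = np.linalg.eig(A)
    v = vecs[:, :1]
    # QR of [v | I] completes v to an orthonormal basis of C^n
    Q1, _ = np.linalg.qr(np.hstack([v, np.eye(n)]))
    # in the new basis the first column is (lambda, 0, ..., 0)
    M = Q1.conj().T @ A @ Q1
    # recurse on the trailing (n-1) x (n-1) block
    Q2, _ = schur_sketch(M[1:, 1:])
    Q = Q1.copy()
    Q[:, 1:] = Q1[:, 1:] @ Q2
    T = Q.conj().T @ A @ Q
    return Q, T

# arbitrary (hypothetical) test matrix, not taken from the post
A = np.array([[4.0, 1.0, 2.0],
              [3.0, 0.0, 1.0],
              [1.0, 5.0, 2.0]])
Q, T = schur_sketch(A)

print(np.allclose(Q.conj().T @ Q, np.eye(3)))   # Q is unitary
print(np.allclose(Q @ T @ Q.conj().T, A))       # A = Q T Q^*
print(np.allclose(np.tril(T, -1), 0))           # T is upper triangular
```

the only ingredient is that every complex matrix has at least one eigenvector, which is why the statement needs complex numbers but no other hypothesis on A.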
now, we’re usually interested in diagonalizing a matrix. when can we go that far?
easy: that upper triangular matrix is in fact diagonal iff the original matrix A is normal; that is, iff A commutes with its conjugate transpose:
$$A A^* = A^* A.$$
so, any normal matrix can be diagonalized; furthermore, the similarity transform is unitary.
the combined result deserves to be rephrased. any matrix A can be diagonalized using a unitary similarity transform if and only if A is normal. now the restriction to “unitary” is not a bonus.
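a quick numerical illustration of the “if” direction, assuming a 90-degree rotation matrix as a stand-in normal matrix (my choice, not from any reference above): it commutes with its conjugate transpose, and its eigenvector matrix comes out unitary.

```python
import numpy as np

# hypothetical normal matrix: a 90-degree rotation (not from the post)
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])
Rh = R.conj().T

# normality: R commutes with its conjugate transpose
is_normal = np.allclose(R @ Rh, Rh @ R)

# eigenvalues are +i and -i; numpy returns unit-length eigenvectors as columns
eigvals, V = np.linalg.eig(R)

# the eigenvector matrix is unitary, and V^* R V is diagonal
V_is_unitary = np.allclose(V.conj().T @ V, np.eye(2))
D = V.conj().T @ R @ V
D_is_diagonal = np.allclose(D, np.diag(eigvals))

print(is_normal, V_is_unitary, D_is_diagonal)
```

one caveat: with repeated eigenvalues, `np.linalg.eig` is not obliged to return an orthonormal set even for a normal matrix, so this check is cleanest when the eigenvalues are distinct, as they are here.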
strang’s “linear algebra” is the perfect reference for all that.
anyway, the next time we see a theorem that says, “if A is normal…”, we could read it as “if A can be diagonalized by a unitary similarity transformation….”
OTOH, if we look in stewart’s “intro matrix computations”, we find a theorem saying that any non-defective matrix can be brought to diagonal form by a similarity transform P, but P need not be unitary; an nxn matrix is called non-defective if it has n linearly independent eigenvectors.
that’s a way of saying what we already know: if A is nxn and we have n linearly independent eigenvectors of A, then the nxn matrix P whose columns are those eigenvectors is a similarity transform which will bring A to diagonal form.
this didn’t say anything about the similarity transform being unitary. i have got to ask: can i find a non-defective matrix which is not normal? are there matrices which can be diagonalized even though it cannot be done by a unitary similarity transform?
here we go. here’s an upper triangular matrix:
since this matrix is already upper triangular, we might expect that it cannot be diagonalized by a unitary matrix: what it can be brought to, is precisely itself. schur’s lemma is trivial when applied to this matrix.
it’s easy enough to let mathematica find the schur decomposition. i do indeed get the identity matrix for the unitary similarity transform, and i get the original A as the upper triangular form. that’s good.
we expect that A is not normal. since it is real, the conjugate transpose is just the transpose; we compute $A A^T$ and $A^T A$.
they are not equal, so A is not normal, and therefore it cannot be diagonalized by a unitary similarity transform.
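the same check is easy to reproduce in numpy. the matrix below is a hypothetical stand-in, not necessarily the A used in this post: i’m assuming an upper triangular matrix with eigenvalues 1 and 2, chosen so that its eigenvectors are (1,0) and (1,1).

```python
import numpy as np

# hypothetical stand-in for A (assumed entries, chosen for illustration)
A = np.array([[1.0, 1.0],
              [0.0, 2.0]])

# A is real, so the conjugate transpose is just the transpose
AAt = A @ A.T      # [[2, 2], [2, 4]]
AtA = A.T @ A      # [[1, 1], [1, 5]]

# normal would mean these two products agree; they do not
A_is_normal = np.allclose(AAt, AtA)

print(AAt)
print(AtA)
print(A_is_normal)   # False
```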
so if we read a theorem that says, “if A is normal…”, we should wonder whether it’s still true under the weaker hypothesis “if A can be diagonalized…” (i.e. by a possibly non-unitary similarity transform).
let’s find the eigenstructure to see if it can be diagonalized at all. we get the following eigenvector matrix P, with the eigenvectors as columns:
$$P = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$$
that is, A can be diagonalized by P: $P^{-1} A P$ is diagonal.
but P is not orthogonal: $P^T P \neq I$.
having computed that P is not orthogonal, let’s actually look at P. the second eigenvector (= the second column), (1,0), is a basis for the x-axis; the eigenspace is the x-axis. the other eigenvector, (1,1), is a basis for the line y = x, which is at 45° to the x-axis. that eigenspace just is not orthogonal to the x-axis. nothing we do is going to change that.
by finding two linearly independent eigenvectors of A, we have shown that it can be diagonalized; but the similarity transform P which does so is not unitary.
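here is a sketch of those three checks in numpy: the similarity transform gives a diagonal matrix, P fails the orthogonality test, and the two eigenvectors sit at 45°. P uses the eigenvectors (1,1) and (1,0) named above; the matrix A is a hypothetical stand-in with assumed eigenvalues 2 and 1, since any upper triangular matrix with those eigenvectors behaves the same way.

```python
import numpy as np

# hypothetical stand-in for A (assumed eigenvalues 2 and 1)
A = np.array([[1.0, 1.0],
              [0.0, 2.0]])

# eigenvectors as columns: (1,1) first, (1,0) second
P = np.array([[1.0, 1.0],
              [1.0, 0.0]])

# the similarity transform diagonalizes A
B = np.linalg.inv(P) @ A @ P          # diag(2, 1)

# P orthogonal would mean P^T P = I; instead we get [[2, 1], [1, 1]]
gram = P.T @ P
P_is_orthogonal = np.allclose(gram, np.eye(2))

# angle between the two eigenvectors, in degrees
p1, p2 = P[:, 0], P[:, 1]
cos_angle = (p1 @ p2) / (np.linalg.norm(p1) * np.linalg.norm(p2))
angle_deg = np.degrees(np.arccos(cos_angle))

print(B)
print(gram)
print(P_is_orthogonal, angle_deg)
```

rescaling the columns of P changes the diagonal of `gram` but never the angle, which is the numerical version of “nothing we do is going to change that.”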
for another view, let’s compute the SVD of A: find u, v, w such that $A = u\,w\,v^T$, where u and v are unitary. we get
if we compute $u\,w\,v^T$ we will see that it is A, as it should be. we can also confirm that u and v are both unitary (in this case, orthogonal), but u and v are not the same. rather than leave you to confirm that those expressions don’t simplify to the same thing, i evaluate u and v:
that u and v are different confirms that A cannot be diagonalized by a unitary similarity transform. it can indeed be diagonalized, to w…
by two unitary (in this case, orthogonal) matrices u ≠ v, but not by one such.
i suppose it is worth noting that the diagonal elements of w are not the eigenvalues. that would require that u and v be the same, so that the SVD is equivalent to the eigendecomposition. (so we have an example of a diagonal matrix from the similarity transform which is different from the diagonal matrix from the SVD.)
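a numpy sketch of the same comparison, again on a hypothetical stand-in for A (assumed eigenvalues 1 and 2, not necessarily the post’s numbers). note that `np.linalg.svd` returns the singular values as a vector and returns $v^T$ rather than v.

```python
import numpy as np

# hypothetical stand-in for A (assumed entries, chosen for illustration)
A = np.array([[1.0, 1.0],
              [0.0, 2.0]])

# numpy returns u, the singular values s, and vh = v^T
u, s, vh = np.linalg.svd(A)
v = vh.T
w = np.diag(s)

# u w v^T reproduces A
reconstructed = u @ w @ vh

# both factors are orthogonal, but they are different matrices
u_is_orthogonal = np.allclose(u.T @ u, np.eye(2))
v_is_orthogonal = np.allclose(v.T @ v, np.eye(2))
u_equals_v = np.allclose(u, v)

# the singular values are not the eigenvalues
eigvals = np.sort(np.linalg.eigvals(A))

print(s)         # square roots of 3 + sqrt(5) and 3 - sqrt(5)
print(eigvals)   # 1 and 2
print(u_equals_v)
```

for this stand-in the squared singular values are the eigenvalues of $A^T A$, namely $3 \pm \sqrt{5}$, while the eigenvalues of A itself are 1 and 2; the two sets only share their product, which is |det A| = 2.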
it is also worth confessing i didn’t start with the upper triangular matrix A. i started with B and P, B being diagonal with distinct eigenvalues and P being invertible but not orthogonal. i computed A as $A = P B P^{-1}$ and then took it as my starting point.
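for what it’s worth, here is that product worked out symbolically. this is a sketch assuming P has columns (1,1) and (1,0), as described above, and writing B = diag(b_1, b_2), where b_1 and b_2 are placeholder names for the two eigenvalues (their values are not restated here).

```latex
\[
P = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
P^{-1} = \begin{pmatrix} 0 & 1 \\ 1 & -1 \end{pmatrix}, \qquad
B = \begin{pmatrix} b_1 & 0 \\ 0 & b_2 \end{pmatrix}
\]
\[
A = P B P^{-1}
  = \begin{pmatrix} b_1 & b_2 \\ b_1 & 0 \end{pmatrix}
    \begin{pmatrix} 0 & 1 \\ 1 & -1 \end{pmatrix}
  = \begin{pmatrix} b_2 & b_1 - b_2 \\ 0 & b_1 \end{pmatrix}
\]
```

so with this P the result is upper triangular whatever the eigenvalues are, and the off-diagonal entry b_1 - b_2 vanishes only when the two eigenvalues coincide, which the “distinct eigenvalues” choice rules out.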