## schur’s lemma: any matrix is unitarily similar to an upper triangular

i bumped into someone last night who asked me about schur’s lemma, something about bringing a matrix to triangular form. i’ve spent so much time looking at diagonalizng things that i didn’t appreciate schur’s lemma, and it deserves to be appreciated.
it says that we can bring any (complex) matrix A to upper triangular form using a unitary similarity transform. in this form, the restriction to “unitary” is a bonus: a perfectly useful but weaker statement is that any matrix is similar to an upper triangular matrix.
now, we’re usually interested in diagonalizing a matrix. when can we go that far?
easy: that upper triangular matrix is in fact diagonal iff the original matrix A is normal; that is, iff A commutes with its conjugate transpose:
$A \ A^{\dagger } = A^{\dagger }\ A.$

so, any normal matrix can be diagonalized; furthermore, the similarity transform is unitary.

the combined result deserves to be rephrased. any matrix A can be diagonalized using a unitary similarity transform if and only if A is normal. now the restriction to “unitary” is not a bonus.
strang’s “linear algebra” is the perfect reference for all that.
anyway, the next time we see a theorem that says, “if A is normal…”, we could read it as “if A can be diagonalized by a unitary similarity transformation….”
OTOH, if we look in stewart’s “intro matrix computations”, we find a theorem saying that any non-defective matrix can be brought to diagonal form by a similarity transform P, but P need not be unitary; an nxn matrix is called non-defective if it has n linearly indepenent eigenvectors.
that’s a way of saying what we already know: if A is nxn and if we have enough distinct eigenvectors of A to make an nxn matrix P from them, then P is a similarity transform which will bring A to diagonal form.
this didn’t say anything about the similarity transform being unitary. i have got to ask: can i find a non-defective matrix which is not normal? are there matrices which can be diagonalized even though it cannot be done by a unitary similarity transform?
here we go. here’s an upper triangular matrix:
$A = \left(\begin{array}{cc} 1&1\\ 0.&2\end{array}\right)$

since this matrix is already upper triangular, we might expect that it cannot be diagonalized by a unitary matrix: what it can be brought to, is precisely itself. schur’s lemma is trivial when applied to this matrix.
it’s easy enough to let mathematica find the schur decomposition. i do indeed get the identity matrix for the unitary similarity transform, and i get the original A as the upper triangular form. that’s good.
we expect that A is not normal. since it is real, the conjugate transpose is just the transpose; we compute $A \ A^T$ and  $A^T \ A$

$A \ A^T = \left(\begin{array}{cc} 2&2\\ 2&4\end{array}\right)$

$A^T \ A = \left(\begin{array}{cc} 1&1\\ 1&5\end{array}\right)$

they are not equal, so A is not normal, and therefore it cannot be diagonalized by a unitary similarity transform.
so if we read a theorem that says, “if A is normal…”, we should wonder if it’s true under the weaker hypothesis “if A can be diagonalized…. (i.e. by a non-unitary similarity transform).”
let’s find the eigenstructure to see if it can be diagonalized at all. we get the following eigenvector matrix P…

$P = \left(\begin{array}{cc} 1&1\\ 1&0.\end{array}\right)$

that is, A can be diagonalized by P;$B = P^{-1}\ A \ P$ is diagonal:

$B = \left(\begin{array}{cc} 2&0.\\ 0.&1\end{array}\right)$

but P is not orthogonal:

$P^T \ P= \left(\begin{array}{cc} 2&1\\ 1&1\end{array}\right)$

having computed that P is not orthogonal, let’s actually look at P. the second eigenvector (= the second column), (1,0), is a basis for the x-axis; the eigenspace is the x-axis. the other eigenvector (1,1) is a basis for the line y = x: it’s at $45^{\circ}$ to the x-axis. the second eigenspace just is not orthogonal to the x-axis. nothing we do is going to change that.
by finding two linearly independent eigenvectors of A, we have shown that it can be diagonalized; but the similarity transform P which does is not unitary.

for another view, let’s compute the SVD of A: find u, v, w such that $A = u \ w \ v^T$, where u and v are unitary. we get
$u = \left(\begin{array}{cc} \frac{-1+\sqrt{5}}{\sqrt{4+\left(-1+\sqrt{5}\right)^2}}&\frac{-1-\sqrt{5}}{\sqrt{4+\left(-1-\sqrt{5}\right)^2}}\\ \frac{2}{\sqrt{4+\left(-1+\sqrt{5}\right)^2}}&\frac{2}{\sqrt{4+\left(-1-\sqrt{5}\right)^2}}\end{array}\right)$

$w = \left(\begin{array}{cc} \sqrt{3+\sqrt{5}}&0.\\ 0.&\sqrt{3-\sqrt{5}}\end{array}\right)$

$v = \left(\begin{array}{cc} \frac{-2+\sqrt{5}}{\sqrt{1+\left(-2+\sqrt{5}\right)^2}}&\frac{-2-\sqrt{5}}{\sqrt{1+\left(-2-\sqrt{5}\right)^2}}\\ \frac{1}{\sqrt{1+\left(-2+\sqrt{5}\right)^2}}&\frac{1}{\sqrt{1+\left(-2-\sqrt{5}\right)^2}}\end{array}\right)$

if we compute $u \ w \ v^T$ we will see that it is A, as it should be. we can also confirm that u and v are both unitary (in this case, orthogonal), but u and v are not the same. rather than leave you to confirm that those expressions don’t simplify to the same thing, i evaluate u and v:

$u = \left(\begin{array}{cc} 0.525731&-0.850651\\ 0.850651&0.525731\end{array}\right)$

$v = \left(\begin{array}{cc} 0.229753&-0.973249\\ 0.973249&0.229753\end{array}\right)$

that u and v are different confirms that A cannot be diagonalized by a unitary similarity transform. it can indeed be diagonalized, to w…

$\left(\begin{array}{cc} \sqrt{3+\sqrt{5}}&0.\\ 0.&\sqrt{3-\sqrt{5}}\end{array}\right)$

by two unitary (in this case, orthogonal) matrices u ≠ v, but not by one such.
i suppose it is worth noting that the diagonal elements of w are not the eigenvalues. that reqiuires that u and v be the same, so that the SVD be equivalent to the eigendecomposition. (so we have an example of a diagonal matrix from the similarity transform which is different from the diagonal matrix from the SVD.)
it is also worth confessing i didn’t start with the upper triangular matrix A. i started with B and P, B being diagonal with distinct eigenvalues and P being invertible but not orthogonal. i computed A as
$A = P \ B \ P^{-1}$

and then took it as my starting point.

### 4 Responses to “schur’s lemma: any matrix is unitarily similar to an upper triangular”

1. The style of writing is quite familiar to me. Have you written guest posts for other bloggers?

2. rip Says:

Nope.

If I’ve picked up someone else’s style, I have no idea who or from where.

3. Vish Says:

can u provide a general proof for the above as well ?