recall the post about schur’s lemma. i provided an example of a matrix A which could be diagonalized (to B), but not by an orthogonal transition matrix. A is not a normal matrix: it does not commute with its transpose, i.e. A A^T ≠ A^T A.
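here’s a minimal numpy sketch of that normality check. the matrix A below is a hypothetical stand-in (diagonalizable but not normal), not the matrix from the original post:

```python
import numpy as np

# hypothetical stand-in for A: diagonalizable, but not normal
A = np.array([[1.0, 1.0],
              [0.0, 2.0]])

# normality test: does A commute with its transpose?
print(np.allclose(A @ A.T, A.T @ A))   # False, so A is not normal
```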
it was the monday morning after the post when i woke up thinking, “but B is a diagonal matrix; it is its own transpose, B = B^T, and therefore it is trivially normal: B B^T = B^T B. but i know that’s wrong.”
that is, i seem to have converted A into a normal matrix, but that’s wrong, because normality should be a property of the operator represented by A or B. if one matrix is normal, the other should be, since they represent the same operator. (now may be a good time to remind you that the whole point of the change-of-basis equation for square matrices, B = P^-1 A P, is that the matrix B represents the same linear operator L as A does, but taken wrt a new basis whose transition matrix is P.)
what’s going on here? A and B represent the same non-normal operator L; how can B be its own transpose? that’s the wrong question. the correct question is: how can B represent both the operator and its adjoint?
i wish i could say that the answer sprang full-blown into my head right after i said all that, but it didn’t. i knew three pieces of the answer, and as soon as i saw the complete answer in halmos, i understood it.
i am going to be a little sloppy. when i refer to the adjoint of a matrix M, i really mean the adjoint operator whose matrix is M wrt some basis. OTOH, the term transpose refers only to a matrix, and means exactly what it has always meant: interchange the rows and columns. in everything that follows, we have only two linear operators (an operator and its adjoint), which for one brief shining moment i will call L and L*. what i have given you is that wrt the original basis, they are represented by the matrices A and its transpose A^T, respectively.
we found a transition matrix P which diagonalizes A; it also defines what i’m calling the new basis. we compute P^-1 A P, giving us what we called B. since B is diagonal, it is its own transpose. the problem is that B appears to be not only normal, but in fact self-adjoint. but it can’t be, because A isn’t normal.
the first question that comes to my mind is: what does the transition matrix P do to A^T? let’s just write the result as P^-1 A^T P rather than give it its own letter. (not C, anyway; that would lead to a notational nightmare down the road.) since A^T represents the adjoint of A wrt the original basis, then P^-1 A^T P must represent the adjoint of A wrt the new basis.
we compute P^-1 A^T P, and that’s not B. it’s not supposed to be. somehow.
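a quick sketch of that computation, using a hypothetical A and its eigenvector matrix P as stand-ins for the post’s actual matrices: applying the same similarity transformation to A^T does not give B:

```python
import numpy as np

# hypothetical stand-ins: A is diagonalizable but not normal,
# and the columns of P are its eigenvectors
A = np.array([[1.0, 1.0],
              [0.0, 2.0]])
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])

B = np.linalg.inv(P) @ A @ P        # diagonal: diag(1, 2)
Bq = np.linalg.inv(P) @ A.T @ P     # the adjoint's matrix wrt the new basis

print(B)
print(Bq)                           # not diagonal
print(np.allclose(Bq, B))           # False: not B, and not supposed to be
```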
so the transpose matrix is not always the adjoint operator? what’s going on?
i knew three things.
- one, that whenever i have a basis (in this case, the new basis) which is not orthonormal, i have to jump through a hoop to compute the dot product of two vectors. i have to introduce what’s called the reciprocal basis, and compute the components of one vector (either one, but only one) wrt the reciprocal basis. then i compute the dot product in the usual way, but using the components wrt the new basis for one vector and wrt the reciprocal basis for the other vector. (you may have seen this in tensor analysis as “lowering an index”.)
- two, an orthonormal basis is its own reciprocal basis.
- three, the transpose of a matrix corresponds to the adjoint operator if the matrix is wrt an orthonormal basis. i knew that, but i wasn’t comfortable with it. nobody wrote it, at least not where i saw it.
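piece one can be sketched numerically. assuming a hypothetical non-orthonormal basis (the columns of P below), the reciprocal basis is given by the columns of (P^T)^-1, and the usual multiply-and-sum gives the right dot product only if one vector’s components are taken wrt the reciprocal basis:

```python
import numpy as np

# hypothetical non-orthonormal basis: the columns of P
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])
# reciprocal basis: the columns of (P^T)^-1
Q = np.linalg.inv(P.T)

x = np.array([3.0, -1.0])
y = np.array([2.0, 5.0])

a = np.linalg.inv(P) @ x   # components of x wrt the new basis
b = np.linalg.inv(P) @ y   # components of y wrt the new basis
c = np.linalg.inv(Q) @ y   # components of y wrt the reciprocal basis ("index lowered")

print(np.isclose(a @ b, x @ y))   # False: new-basis components alone get it wrong
print(np.isclose(a @ c, x @ y))   # True: mixing in the reciprocal basis fixes it
```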
here’s the truth: we can always form the matrix of the adjoint operator by transposing the matrix B of the operator; but that transpose matrix is wrt a different basis, the so-called reciprocal basis. since an orthonormal basis is its own reciprocal basis, we recover the familiar special case (transpose wrt the same basis) exactly when the basis is orthonormal.
it would help if i gave you the definition of the adjoint operator. in an inner product space, the adjoint operator L* of a linear operator L (operators, not matrices) is defined by
⟨L x, y⟩ = ⟨x, L* y⟩ for all vectors x and y,
where ⟨ , ⟩ indicates the inner product (dot product) of two vectors. so, the definition of the adjoint operator hinges on the inner product.
(to be quite explicit about the order of operations, ⟨L x, y⟩ means ⟨(L x), y⟩, although no other interpretation holds water. on the one hand, L does not apply to the real number ⟨x, y⟩; and on the other, a product of a vector with an operator is not defined.)
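wrt the standard (orthonormal) basis, the defining relation is easy to spot-check numerically. the matrix A below is a hypothetical stand-in for L, and its transpose then represents L*:

```python
import numpy as np

# hypothetical stand-in for the matrix of L wrt the standard basis
A = np.array([[1.0, 1.0],
              [0.0, 2.0]])

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

lhs = (A @ x) @ y      # <L x, y>
rhs = x @ (A.T @ y)    # <x, L* y>, with L* represented by A^T

print(np.isclose(lhs, rhs))   # True, for any choice of x and y
```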
trust me, we could show that wrt any orthonormal basis, the matrices of L and L* are transposes. the closest thing to a challenge in working that out is to define the matrix of an operator wrt a basis.
what i claim is that the matrices of L and L* are always transposes, provided they are taken wrt two different bases, each being called the reciprocal basis of the other. if the basis is orthonormal, it is its own reciprocal basis; and in that case only, the matrices are wrt the same basis.
we need the reciprocal basis because the computation of the dot product of two vectors cannot be just the usual multiplication of corresponding components and summing. the definition of the adjoint operator requires the inner product; writing that in matrices essentially requires that we change the matrix of the inner product from the identity to whatever it is in the new non-orthonormal basis.
all i’m going to do next is show you how to get the reciprocal basis and confirm that it diagonalizes A^T to the matrix B, which is the diagonalized matrix A.
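as a preview, here’s that confirmation in numpy, with a hypothetical A and P standing in for the post’s matrices: the transition matrix (P^T)^-1 of the reciprocal basis diagonalizes A^T, and the result is exactly B:

```python
import numpy as np

# hypothetical stand-ins: A is diagonalizable but not normal,
# and the columns of P are its eigenvectors
A = np.array([[1.0, 1.0],
              [0.0, 2.0]])
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])

B = np.linalg.inv(P) @ A @ P     # B = P^-1 A P, diagonal

# transition matrix of the reciprocal basis
Q = np.linalg.inv(P.T)
# the adjoint's matrix wrt the reciprocal basis
Bq = np.linalg.inv(Q) @ A.T @ Q  # = P^T A^T (P^T)^-1 = (P^-1 A P)^T = B^T

print(np.allclose(Bq, B.T))      # True: the transpose of B, taken wrt the reciprocal basis
print(np.allclose(Bq, B))        # True, since the diagonal B equals B^T
```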