Transpose & Adjoint: review & example

introduction

Let me pick up an old problem. The mathematics comes from the two posts “transpose matrix and adjoint operator” part 1 & part 2. The problem itself comes from the schur’s lemma post.

Upon further reflection, I am going to change the problem a little bit. Do not expect to see the same answers as before.

I am also going to work it twice, assuming that we are given different information as our starting point, but I’ll do it for the very same problem.

As I have said, the appropriate question for an introduction to ABO blood groups is: Can your mother donate blood to you? Until you can answer that question, you’re missing something about blood groups.

This example is not that good, but it does tie together the following concepts and computations:

  • the transition matrix;
  • the attitude matrix;
  • the change of basis formula;
  • the reciprocal basis;
  • the transpose of a matrix;
  • the adjoint of a linear operator;
  • the matrix of the adjoint operator.

This example is intended as a guide to and a review of my relevant posts, and will not repeat very much of the explanations in them.

Well, I have found that the explanations are growing.

What is the problem? Given a matrix and a non-orthonormal basis, find the matrix of the adjoint operator with respect to the non-orthonormal basis.

What is the key? We only know one way to find the matrix of the adjoint operator: take the transpose of some matrix.

Here is the general case: given a matrix X with respect to a basis E, we can transpose it, getting X^T\ . But that is the matrix of the adjoint with respect to the reciprocal basis F. To get the matrix with respect to the basis E, we must do a change of basis to get from X^T\ with respect to F to, say, Y^T\ with respect to E.

There is, however, a very special case: if E is an orthonormal basis, then it is its own reciprocal basis, F = E, and we were done as soon as we computed the transpose, because X^T\ is the matrix of the adjoint with respect to E.

The general problem, then, is: given a matrix, find the matrix of its adjoint with respect to a non-orthonormal basis. We will consider two possibilities: first, that the original basis is orthonormal; second, that it is not.

I would like to make one final point before we dive in. There is a fundamental truth here, regardless of the precise definition of the adjoint operator. The fact is that two matrices A and A^T\ represent two different but related linear operators (with respect to an orthonormal basis); but if we transform those matrices to a non-orthonormal basis, the two results will not, in general, be the transposes of each other.

In other words, the relationship between two linear operators which is represented by a matrix and its transpose is guaranteed only in an orthonormal basis.

first form of the question

First, let us suppose we are given a matrix A with respect to an orthonormal basis; and a description of a non-orthonormal basis.

Here is the given matrix A:

A = \left(\begin{array}{cc} 1 & 1 \\ 0 & 2\end{array}\right)

(This much is the same as before; this is the same matrix as in the previous posts.)

Here is the given non-orthonormal basis \{f_1, f_2\}\ , specified by an attitude matrix whose rows are the components of f_1 and f_2 with respect to the orthonormal basis…

att = \left(\begin{array}{cc} 1 & -1 \\ 1 & 0\end{array}\right)

Find the matrix, which I will call C^T\ , of the adjoint operator with respect to the non-orthonormal basis.

Easy enough, once you’re used to it: we have A with respect to an orthonormal basis, so we form its transpose A^T\ . This is the matrix of the adjoint with respect to the orthonormal basis. Now we just need to transform A^T\ to the non-orthonormal basis.

We want the transition matrix P, which is the transpose of the attitude matrix: P = {att}^T\ .

P = \left(\begin{array}{cc} 1 & 1 \\ -1 & 0\end{array}\right)

One form of the change-of-basis formula tells us how to find the matrix B with respect to the new basis corresponding to A with respect to the original:

B = P^{-1} A P\ ,

so we get

B = \left(\begin{array}{cc} 2 & 0 \\ -2 & 1\end{array}\right)

That is, A and B are matrix representations, with respect to two bases, of a single linear operator. We don’t actually need the matrix B, but it serves to display the general change-of-basis formula, from A to B. And we will want to contrast it with C^T\ .
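
For anyone checking the arithmetic by hand, here are the intermediate products. Nothing is assumed beyond the A and P given above; the grouping into three steps is just one convenient way to organize the work.

```latex
P^{-1} = \left(\begin{array}{cc} 0 & -1 \\ 1 & 1\end{array}\right), \qquad
A P = \left(\begin{array}{cc} 0 & 1 \\ -2 & 0\end{array}\right), \qquad
P^{-1} (A P) = \left(\begin{array}{cc} 2 & 0 \\ -2 & 1\end{array}\right) = B
```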

We have been asked to find the matrix C^T\ corresponding to A^T\ with respect to a new basis. But that uses the change-of-basis formula, with A^T\ and C^T\ in place of A and B respectively:

C^T = P^{-1} A^T P\ ,

so we get

C^T = \left(\begin{array}{cc} 1 & -1 \\ 0 & 2\end{array}\right)

What we see (as we should expect) is that B and C^T\ are not the transposes of each other.

We have, then, four matrices. A and A^T\ represent a linear operator and its adjoint with respect to an orthonormal basis. B and C^T\ represent the same linear operator and its adjoint with respect to the non-orthonormal basis \{f_1, f_2\}\ .

(In principle, of course, we only needed three of those matrices: A, A^T\ , and C^T\ ; but it may be informative to compute B.)
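
All four matrices can be checked mechanically. What follows is a minimal sketch in plain Python, using exact fractions so that the equality tests are exact; the helper names (matmul, transpose, inverse2, change_basis) are made up for this illustration and do not come from any library.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inverse2(X):
    """Exact inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = X
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def change_basis(M, P):
    """Return P^{-1} M P, i.e. M re-expressed in the basis with transition matrix P."""
    return matmul(inverse2(P), matmul(M, P))

A = [[1, 1], [0, 2]]       # the operator, orthonormal basis
att = [[1, -1], [1, 0]]    # attitude matrix: rows are the new basis vectors
P = transpose(att)         # transition matrix: columns are the new basis vectors

B = change_basis(A, P)                 # the operator, non-orthonormal basis
C_T = change_basis(transpose(A), P)    # the adjoint, non-orthonormal basis

print(B == [[2, 0], [-2, 1]])     # True
print(C_T == [[1, -1], [0, 2]])   # True
print(transpose(B) == C_T)        # False: B and C^T are not transposes of each other
```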

second form of the question
first of three solutions

We are given the matrix B, representing a linear operator with respect to a non-orthonormal basis…

B = \left(\begin{array}{cc} 2 & 0 \\ -2 & 1\end{array}\right)

and we are given the attitude matrix describing that non-orthonormal basis…

att = \left(\begin{array}{cc} 1 & -1 \\ 1 & 0\end{array}\right)

We want to find the matrix C^T\ of the adjoint operator with respect to the non-orthonormal basis.

The quickest way to do it is to transform B to the orthonormal basis, which gives us A. We still use the transition matrix P = {att}^T \

P = \left(\begin{array}{cc} 1 & 1 \\ -1 & 0\end{array}\right)

and the change of basis formula (but note that we’re going in the other direction, to the orthonormal basis instead of from it)…

A = P B P^{-1}\ ,

so we get

A = \left(\begin{array}{cc} 1 & 1 \\ 0 & 2\end{array}\right)

We have just reduced this question to the previous problem: now we can get C^T\ by transforming A^T\ to the non-orthonormal basis…

C^T = P^{-1} A^T P\ .

We’re done. We could stop here. Maybe we should. But it’s nice to know more than one way to do something, because then we can check our work.
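
The two steps of this first solution (back to the orthonormal basis, transpose, then forward again) can be sketched as follows. This is plain Python with exact fractions; the helper names are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inverse2(X):
    """Exact inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = X
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

B = [[2, 0], [-2, 1]]      # given: the operator in the non-orthonormal basis
att = [[1, -1], [1, 0]]    # given: attitude matrix of that basis
P = transpose(att)
P_inv = inverse2(P)

# Step 1: go back to the orthonormal basis, A = P B P^{-1}.
A = matmul(P, matmul(B, P_inv))

# Step 2: transpose there, then return, C^T = P^{-1} A^T P.
C_T = matmul(P_inv, matmul(transpose(A), P))

print(A == [[1, 1], [0, 2]])      # True
print(C_T == [[1, -1], [0, 2]])   # True
```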

second of three solutions

Alternatively, we could use the reciprocal basis.

This gets a little messy only because we’re introducing a 3rd basis, so we get 2 more matrices.

We transpose B, and we get the matrix of the adjoint operator with respect to the reciprocal basis:

B^T = \left(\begin{array}{cc} 2 & -2 \\ 0 & 1\end{array}\right)\ .

But we’ve only just begun. We don’t want the adjoint with respect to the reciprocal basis; we want it with respect to the given non-orthonormal basis. It’s just that, ultimately, we have to take the transpose with respect to some basis; it’s the only way to compute the matrix of the adjoint. Then the question

how do we transform from the reciprocal basis (where we have B^T\ ) to the non-orthonormal basis where we want C^T\ ?

becomes

what is the transition matrix from the reciprocal basis to the given non-orthonormal basis?

And that leads to the question: exactly what is our reciprocal basis?

I could simply say (!) that the attitude matrix for the reciprocal basis is the inverse of the transition matrix P for the non-orthonormal basis; that is, the transition matrix Q for the reciprocal basis is the inverse transpose of P:

Q = P^{-T}\ ,

Q = \left(\begin{array}{cc} 0 & 1 \\ -1 & 1\end{array}\right)

But I think it’s only fair to admit that I don’t remember the description of the attitude matrix for the reciprocal basis. What I have in my head is a matrix equation, which says that the product of the attitude matrix Q^T\ for the reciprocal basis with the transition matrix P for the given basis is the identity:

Q^T P = I\ .

(And that matrix equation simply says that the dot products of the reciprocal basis vectors – the rows of Q^T\ – with the given non-orthonormal basis vectors – the columns of P – are either 0 or 1.)

Then I solve for Q:

Q = P^{-T}\ .

Of course Q is what I said it was, but the point is that I remember the derivation rather than the answer. Fortunately it’s a very short and easy derivation, so I see the answer almost immediately….
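
As a quick check of that derivation, here is a small sketch that builds Q from P and confirms the defining equation. It is plain Python with exact fractions, and the helper names are assumptions made for this example only.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inverse2(X):
    """Exact inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = X
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

P = [[1, 1], [-1, 0]]         # columns: the non-orthonormal basis vectors
Q = transpose(inverse2(P))    # Q = P^{-T}; columns: the reciprocal basis vectors

# Entry (i, j) of Q^T P is (reciprocal vector i) . (basis vector j).
dots = matmul(transpose(Q), P)

print(Q == [[0, 1], [-1, 1]])      # True
print(dots == [[1, 0], [0, 1]])    # True
```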

Having Q, we can transform B^T\ to (not from) the orthonormal basis…

D^T = Q B^T Q^{-1}\ ,

so we get

D^T = \left(\begin{array}{cc} 1 & 0 \\ 1 & 2\end{array}\right)

and then we could transform D^T\ to the non-orthonormal basis:

C^T = P^{-1} D^T P\ ,

C^T = \left(\begin{array}{cc} 1 & -1 \\ 0 & 2\end{array}\right)

That is, as it should be, the same answer.
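
Written out, using Q^{-1} = P^T\ (which follows directly from Q = P^{-T}\ ), the intermediate steps for D^T\ are:

```latex
Q^{-1} = P^T = \left(\begin{array}{cc} 1 & -1 \\ 1 & 0\end{array}\right), \qquad
B^T Q^{-1} = \left(\begin{array}{cc} 0 & -2 \\ 1 & 0\end{array}\right), \qquad
Q (B^T Q^{-1}) = \left(\begin{array}{cc} 1 & 0 \\ 1 & 2\end{array}\right) = D^T
```

Note that D^T\ is just A^T\ again, as it must be: both are the matrix of the adjoint with respect to the orthonormal basis.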

third of three solutions

And the second solution shows us another way to have done it. From

C^T = P^{-1} D^T P\

and

D^T = Q B^T Q^{-1}\

we get

C^T = P^{-1} Q B^T Q^{-1} P\ ,

which we write as

C^T = R^{-1} B^T R\ ,

with R defined as

R = Q^{-1} P\ .

By writing out the algebra first, we can dispense with the explicit computation and transformation of D^T\ .

In other words, the transition matrix R from the reciprocal basis to the non-orthonormal basis is given by R = Q^{-1} P\ . Equivalently, the transition matrix from the non-orthonormal basis to its reciprocal basis is given by P^{-1} Q\ .

(But even if we use R, we are implicitly going through the original orthonormal basis.)
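
Here is a sketch of the third solution, going from B^T\ to C^T\ in one step via R. Since Q^{-1} = P^T\ , the matrix R works out to P^T P, and the code checks that as well. As before, this is plain Python with exact fractions and hypothetical helper names.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inverse2(X):
    """Exact inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = X
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

P = [[1, 1], [-1, 0]]
B = [[2, 0], [-2, 1]]
Q = transpose(inverse2(P))       # Q = P^{-T}

R = matmul(inverse2(Q), P)       # R = Q^{-1} P
C_T = matmul(inverse2(R), matmul(transpose(B), R))   # C^T = R^{-1} B^T R

print(R == [[2, 1], [1, 1]])              # True
print(R == matmul(transpose(P), P))       # True, because Q^{-1} = P^T
print(C_T == [[1, -1], [0, 2]])           # True
```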

transpose matrix & adjoint operator 2

(
begin digression
just in case you’ve seen this in another form, let me make some connections. if you don’t recognize any of this digression, that’s ok. you can move along, there’s nothing to see here.
i am taking a linear operator L: V\rightarrow V, from a vector space V to the same vector space V. in fact, i have more than a vector space: V is an inner product space; i have a “dot product”. 
what i call the reciprocal basis is often called the dual basis. in fact, halmos calls it the dual basis. but that terminology is also associated specifically with the so-called dual space V* of linear functionals on V; in that case, the dual basis is a basis in V*. there can be a great deal of confusion here. the dual space V* can be defined without an inner product on V; the inner product on V can be defined without ever mentioning the dual space V*. but if we introduce both inner product and V*, then there is a natural isomorphism between elements of V and of V*. i have seen people think of the one-to-one relationship between elements of V and V* as an identity, and confuse the inner product of two vectors in V with the effect of a linear functional on a vector. (worse, i have seen people assert that an inner product involves one element of V and one element of V*.)
there is a one-to-one correspondence between my right shoe and my left, but they are not identical. isomorphism is not always identity.
here, i have a finite-dimensional vector space V with an inner product on it. i have two bases (original and new) on V, and i want to construct a third basis for V. i call it the reciprocal basis to emphasize that it is not a basis on V*.
end digression
)
let’s see how this plays out.
our new basis defined by the columns of P is not orthonormal. we need to figure out what the reciprocal basis needs to be. its purpose is to make dot products – and transposes – come out right.
the columns of P are the basis vectors for the new basis in which our matrix A became the diagonal matrix B.
recall P:
P = \left(\begin{array}{cc} 1&1\\ 1&0\end{array}\right)
what are the dot products of those two basis vectors with each other? well, we have them wrt the original basis, so we can compute their dot products as P^T\ P (that’s a convenient way of getting them in one fell swoop):
P^T\ P = \left(\begin{array}{cc} 1&1\\ 1&0\end{array}\right) \left(\begin{array}{cc} 1&1\\ 1&0\end{array}\right)
= \left(\begin{array}{cc} 2&1\\ 1&1\end{array}\right)
in fact, we did that in the schur’s theorem post, but i let it slide after saying “but P is not orthogonal”. 
the “2” says the first vector is of length \sqrt{2}; the diagonal “1” says the second vector is of length 1. each off-diagonal “1” says that the cross-term dot products are 1 instead of the 0 we get from orthogonal vectors: that is, if the basis vectors are e_1 and e_2:
1 = e_1\cdot e_2 = |e_1| \ |e_2| \ \cos \theta = \sqrt{2}\ \cos \theta
so
\cos \theta = \frac{1}{\sqrt{2}}\ and \theta = 45^{\circ}
“we knew that.”
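
the same lengths and angle can be read off numerically. this is a minimal sketch in plain python; the names dot, cols and gram are made up for the illustration.

```python
import math

P = [[1, 1], [1, 0]]     # columns are the new basis vectors
cols = list(zip(*P))     # e_1 = (1, 1), e_2 = (1, 0)

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

# all pairwise dot products at once: the same entries as P^T P
gram = [[dot(u, v) for v in cols] for u in cols]

# cos(theta) = e_1 . e_2 / (|e_1| |e_2|)
cos_theta = gram[0][1] / math.sqrt(gram[0][0] * gram[1][1])
theta = math.degrees(math.acos(cos_theta))

print(gram)               # [[2, 1], [1, 1]]
print(round(theta, 6))    # 45.0
```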
now, what do these two basis vectors look like wrt the new basis, i.e. wrt themselves? what are their new components? they’re trivial wrt themselves:
(1,\ 0)
(0,\ 1)
the first new basis vector is 1 times itself, plus no part of the second vector; the second new basis vector is no part of the first vector plus 1 times itself.
how could we compute the euclidean inner product of the new basis vectors using new components? 
we can’t do it by just multiplying together the components. if we did that, we would get the identity matrix, and we would mistakenly conclude that the two basis vectors were orthonormal. 
there’s something going on with the inner product and the transformation to the new basis. (we should get to this, but not today. what we’re looking at is called the induced metric; what we’re about to do is the linear algebra equivalent of lowering indices using the metric tensor g_{ ij} in tensor analysis.)
i want another basis, the so-called reciprocal basis. i want a pair of vectors whose dot products with the new basis are 1 and 0. to put it another way, for this reciprocal basis i want an attitude matrix \alpha such that when i multiply \alpha times P (the rows of the reciprocal basis times the columns of the new basis) i get the identity. bear in mind that i am writing a matrix equation in the original basis, where the inner products come out correctly.
that is, i want
\alpha \ P = I
but P is invertible, inverses are unique, and therefore 
\alpha = P^{-1},
so we want to define the reciprocal basis by that attitude matrix.
\alpha = P^{-1} = \left(\begin{array}{cc} 0&1\\ 1&-1\end{array}\right)
and then i would write the transition matrix for the reciprocal basis as the transpose of its attitude matrix. call it Q:
Q = \alpha^T = \left(\begin{array}{cc} 0&1\\ 1&-1\end{array}\right)
(yes, that was a trivial computation.) 
we now have three bases: original, new, and reciprocal.
it’s worth noting for future use that the transition matrix for the reciprocal basis is the inverse transpose of the transition matrix of the new basis: 
Q = P^{-T}.
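for the record, the inverse quoted above comes from the usual formula for a 2 by 2 matrix (swap the diagonal entries, negate the off-diagonal ones, divide by the determinant); here \det P = -1:

```latex
P^{-1} = \frac{1}{\det P}\left(\begin{array}{cc} 0&-1\\ -1&1\end{array}\right)
= \frac{1}{-1}\left(\begin{array}{cc} 0&-1\\ -1&1\end{array}\right)
= \left(\begin{array}{cc} 0&1\\ 1&-1\end{array}\right)
```

this particular P happens to be symmetric, so its inverse is symmetric too, which is why \alpha and Q = \alpha^T come out as the same array of numbers.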
what i have claimed is that: as A is diagonal in the new basis, the transpose of A is diagonal in the reciprocal basis; more, i claim that it’s the very same diagonal matrix B. in our original basis, we have A and its transpose:
A = \left(\begin{array}{cc} 1&1\\ 0&2\end{array}\right)
A^T = \left(\begin{array}{cc} 1&0\\ 1&2\end{array}\right)
they represent L and L* in the original basis. in the new basis, whose transition matrix is P, we knew that A becomes the diagonal matrix B:
B = \left(\begin{array}{cc} 2&0\\ 0&1\end{array}\right)
but that A^T becomes C^T which is not diagonal:
C^T = \left(\begin{array}{cc} 3&1\\ -2&0\end{array}\right)
we also believe that C^T represents L* wrt the new basis; it was defined by the change-of-basis formula for a matrix. now, in the reciprocal basis, whose transition matrix is Q, we compute Q^{-1}\ A^T\ Q to see what A^T becomes.
Q^{-1}\ A^T\ Q = \left(\begin{array}{cc} 1&1\\ 1&0\end{array}\right) \left(\begin{array}{cc} 1&0\\ 1&2\end{array}\right) \left(\begin{array}{cc} 0&1\\ 1&-1\end{array}\right)
= \left(\begin{array}{cc} 2&0\\ 0&1\end{array}\right)
which is B, as promised.
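written out in two steps (a sketch of the arithmetic only; note that Q^{-1} = P^T, and this particular P is symmetric, so Q^{-1} has the same entries as P):

```latex
A^T Q = \left(\begin{array}{cc} 1&0\\ 1&2\end{array}\right)\left(\begin{array}{cc} 0&1\\ 1&-1\end{array}\right)
= \left(\begin{array}{cc} 0&1\\ 2&-1\end{array}\right)

Q^{-1} (A^T Q) = \left(\begin{array}{cc} 1&1\\ 1&0\end{array}\right)\left(\begin{array}{cc} 0&1\\ 2&-1\end{array}\right)
= \left(\begin{array}{cc} 2&0\\ 0&1\end{array}\right)
```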
to repeat: after we diagonalize A, getting B, we still say that the transpose of B – which is B itself – is the matrix of the adjoint of A. the catch is that we have to be using a different basis, the reciprocal basis, for B^T.
it is so tempting to say the following, that i must:
the adjoint of B is B^T, but wrt a different basis.
there, i’ve said it. it’s terribly sloppy, sloppier than even i am comfortable being. the word adjoint should be reserved for an operator, as the word transpose is reserved for a matrix. but if we started with a matrix A, and we never really got our hands on the operator L, it can be awkward. say what works for you, but be ready to introduce the operator L as soon as you need the clarity.
in our case, A and A^T represent L and its adjoint L* wrt the original basis; B and C^T represent L and L* wrt the new basis; and B represents L* wrt the reciprocal basis. in tabular form, what we have so far is:
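\begin{array}{c|ccc} & \text{original} & \text{new} & \text{reciprocal} \\ \hline L & A & B & \\ L^* & A^T & C^T & B \end{array}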
or, A and B represent L wrt the original and new bases; A^T, C^T and B represent L* wrt the original, new, and reciprocal bases.
so what represents L wrt the reciprocal basis?
i hope you answered C, and not just because it wasn’t listed! with that hole filled, we have:
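\begin{array}{c|ccc} & \text{original} & \text{new} & \text{reciprocal} \\ \hline L & A & B & C \\ L^* & A^T & C^T & B \end{array}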
to check on C, we transform A to the reciprocal basis by computing Q^{-1}\ A \ Q: we believe this is C.
Q^{-1}\ A \ Q = \left(\begin{array}{cc} 1&1\\ 1&0\end{array}\right) \left(\begin{array}{cc} 1&1\\ 0&2\end{array}\right) \left(\begin{array}{cc} 0&1\\ 1&-1\end{array}\right)
=\left(\begin{array}{cc} 3&-2\\ 1&0\end{array}\right)
and that is, indeed, the transpose of C^T, which is C:
C = \left(\begin{array}{cc} 3&-2\\ 1&0\end{array}\right)
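all six representations (L and L* in each of the three bases) can be generated and cross-checked in one go. this is a minimal sketch in plain python with exact fractions; the helper names and the dictionary layout are assumptions made for the illustration.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inverse2(X):
    """Exact inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = X
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def change_basis(M, T):
    """Return T^{-1} M T, i.e. M re-expressed in the basis with transition matrix T."""
    return matmul(inverse2(T), matmul(M, T))

A = [[1, 1], [0, 2]]
A_T = transpose(A)
P = [[1, 1], [1, 0]]            # transition matrix of the new basis
Q = transpose(inverse2(P))      # transition matrix of the reciprocal basis

table = {
    "L":  {"original": A,   "new": change_basis(A, P),   "reciprocal": change_basis(A, Q)},
    "L*": {"original": A_T, "new": change_basis(A_T, P), "reciprocal": change_basis(A_T, Q)},
}

# the matrix of L* in one basis is the transpose of the matrix of L in the other
print(table["L*"]["reciprocal"] == transpose(table["L"]["new"]))   # True
print(table["L*"]["new"] == transpose(table["L"]["reciprocal"]))   # True
```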
let me close by saying that if we start with a given matrix M, instead of a linear operator L, the presumption is that M is wrt an orthonormal basis. if that’s not true, someone should have said something.
we started with the matrix A; we never had any other definition of the linear operator it represents.
the pair of operators L and L* are represented by a pair of matrices M and M^T so long as that pair are wrt a basis and its reciprocal basis.
confusing? if a basis is not orthonormal, you’ve got to introduce the reciprocal basis because dot products using components are messed up; and you can always get the matrix of an adjoint L* by transposing the matrix of L, but it’s wrt the reciprocal basis.
if you think you’ve got it, then i have to ask you: what would have happened if we had started with a self-adjoint or normal matrix?