Color: decomposing the RGB A transpose matrix

(abuse of) terminology

Sometimes I get tired of writing “xbar, ybar, zbar tables” — and I just write “xyz bar tables” or even “XYZ tables”. Similarly for rbar, gbar, bbar tables — rgb bar. I’m not talking about anything new, just abbreviating the names.

Introduction

This is the third post about the example on p. 160 of W&S. Once again, I am going to decompose the reflected spectrum into its fundamental and its residual.

This time, however, I’m going to use the rbar, gbar, bbar tables (RGB) instead of the xbar, ybar, zbar (XYZ) tables. They did not do this.

I’ll tell you now there is one little twist in these calculations. We will need the ybar table, because we still need to use it to scale our results.

In addition to showing, as I did previously, the dual basis spectra and the orthonormal basis spectra for the non-nullspace, I will display the orthonormal basis for the nullspace.

As usual, I am using the CIE 1931 tables, and I am working at 20 nm intervals.

Our ingredients were: a reflectance spectrum, an illiminant spectrum, and some choice of color matching functions. For this post, I am choosing a different set of color matching functions.

Our results in the previous post were a few sets of 3 numbers. The final result was called xyz — meaning the triple (x, y, z) — for plotting a point on the CIE chromaticity chart. We will not do this again.

That result came from a triple denoted XYZ, which are values in XYZ color-space. That space has the advantage of being device-independent. We could — and did — convert those values to an RGB color space, specifically the one defined by the rgb bar (i.e. rbar, gbar, bbar) color-matching tables.

This post will compute the RGB values directly, and we will find that we get almost exactly the same RGB values. (We really ought to.)

The XYZ result in turn came from a triple for which I have no convenient notation, but I keep saying without proof, and confirming by computation, that that triple is the components of the fundamental of the reflected spectrum wrt the basis dual to the chosen color-matching functions.

Let us go see that, this time for the rbar, gbar, bbar tables instead of xbar, ybar, zbar.

The same old calculations, but with a twist

But first, let me review the calculations. Feel free to skip to the next section.

Our first computation is to get the reflected spectrum, which is the pointwise product of the illuminant spectrum and the reflectance spectrum. For this problem, the reflected spectrum is the perceived spectrum, the input to the eye of “the standard observer”. The data itself is in the fourth case of the first post about this example.

Here is what the three of those spectra looked like:

g1107a

As before, I’ve decided to avoid red, green, and blue — except where they are appropriate. For this post, they will be appropriate in some cases. The illuminant is shown in white, the reflectance spectrum is shown in purple, and the reflected spectrum is shown in yellow.

Having said that, I am going to use red, green, blue colors for the three columns of our A matrix, because these entires are rbar, gbar, bbar values. These are the 1931 color matching functions (i.e. a subset, at 20 nm intervals)

g1107b

BTW, we can tell that we are using RGB (i.e. rbar, gbar, bbar tables) because one of these functions is clearly negative over part of its range. In fact all of them are negative somewhere, but it’s only obvious for the red one.

Since I’ve just introduced new data, let me display it:

A = \left(\begin{array}{ccc} 0.00003 & -0.00001 & 0.00117 \\ 0.0003 & -0.00014 & 0.01214 \\ 0.00211 & -0.0011 & 0.11541 \\ -0.00261 & 0.00149 & 0.31228 \\ -0.02608 & 0.01485 & 0.29821 \\ -0.04939 & 0.03914 & 0.14494 \\ -0.07173 & 0.08536 & 0.04776 \\ -0.09264 & 0.17468 & 0.01221 \\ -0.03152 & 0.21466 & 0.00146 \\ 0.0906 & 0.19702 & -0.0013 \\ 0.24526 & 0.1361 & -0.00108 \\ 0.34429 & 0.06246 & -0.00049 \\ 0.29708 & 0.01828 & -0.00015 \\ 0.15968 & 0.00334 & -0.00003 \\ 0.05932 & 0.00037 & 0 \\ 0.01687 & 0.00003 & 0 \\ 0.0041 & 0 & 0 \\ 0.00105 & 0 & 0 \\ 0.00025 & 0 & 0 \\ 0.00006 & 0 & 0 \\ 0 & 0 & 0\end{array}\right)

Our second computation (or third, this order doesn’t matter) should be to apply the matrix A^T — the (transpose of) the color matching functions A — to the reflected spectrum S:

A^T\ S = \{70.5889,\ 61.249,\ 47.8412\}\ .

As before these numbers are components of (the fundamental of) the reflected spectrum wrt the basis dual to A^T\ . In Cohen’s notation, these 3 numbers are the components wrt the columns of the E matrix.

The other computation, whether we do it second or third, is to find the constant for normalizing things, using the illuminant spectrum; that is, to set the white point (1,1,1) in our color space.

Here’s the twist. We still have to use the second column of the XYZ (xyz bar) tables; not one of the columns of our new A is the total cone response. (Nor is the sum of them, and I’ll confess that I don’t know why. My WAG (my guess) is that the three color matching functions each have to be weighted differently in order to sum to the total cone response.)

The number we got by applying the second column of the XYZ tables to the illuminant was 529.764 .

Now we move on to the fourth computation: we divide our 3 values A^T\ S by that to get RGB.

RGB = {0.133246, 0.115616, 0.0903067}.

It is encouraging to see that these RGB values are in [0, 1]. These are tristimulus values in an RGB color space, namely the one I keep showing on the chromaticity chart.

Let me be clear. The red, green, and blue dots on this diagram are the projections of (1,0,0), (0,1,0) and (0,0,1) onto xy-space. The values I just got are in this RGB space.
CIE with orig RGB

Oh, in the previous post we transformed the computed XYZ values into RGB; what did we get?

{0.133246, 0.115616, 0.0903103}

So. Almost exactly the same answers: only the B value differs. That’s good. I said I would confirm this equivalence.

The blue value is off a tiny bit. (The simple fact is that the two sets of tables — xyz bar and rgb bar — and the transition matrix between them are not consistent. In fact, it appears that the columns of the two tables do not quite span the same 3D space. If I’m right, I could fix this by using one table and the transition matrix to construct a very slightly different version of the second table. But bear in mind that I am printing far more digits than W&S showed.)

the dual basis

Now, let’s get the dual basis E. We recall that instead of

E^T = A^{-1}\ ,

we use the pseudo inverse and write

E^T = (A'A)^{-1}\ A'\

It is easier to work with E^T but easier to display E:

E = \left(\begin{array}{ccc} 0.000561675 & -0.000849316 & 0.00535311 \\ 0.00583075 & -0.00906171 & 0.0555629 \\ 0.0527362 & -0.0836627 & 0.52778 \\ 0.110419 & -0.184984 & 1.42203 \\ 0.0130902 & -0.0537785 & 1.34031 \\ -0.149518 & 0.244672 & 0.616947 \\ -0.311401 & 0.664664 & 0.135783 \\ -0.492205 & 1.35497 & -0.0912093 \\ -0.339342 & 1.5831 & -0.142044 \\ 0.0806418 & 1.32314 & -0.0962321 \\ 0.656427 & 0.713092 & 0.00326589 \\ 1.0634 & 0.072223 & 0.0910496 \\ 0.956881 & -0.194913 & 0.10246 \\ 0.521479 & -0.151607 & 0.0594277 \\ 0.194688 & -0.0626132 & 0.0226816 \\ 0.0554502 & -0.0183495 & 0.00649827 \\ 0.0134844 & -0.00451219 & 0.00158394 \\ 0.00345331 & -0.00115556 & 0.000405644 \\ 0.000822217 & -0.000275134 & 0.0000965819 \\ 0.000197332 & -0.0000660321 & 0.0000231797 \\ 0. & 0. & 0.\end{array}\right)

Check that we do have E^T\ A = I\ . I didn’t say it last time, so let me emphasize: I didn’t prove that the pseudo-inverse would give me the dual basis. This easy “check” is the real confirmation that my recipe actually works. This check is crucial, not merely convenient.

Let me call the basis vectors E1, E2, E3, and let’s look at them. In fact, let me juxtapose the dual basis (right) with the color matching functions (left).

g1107c

If we multiply each of those dual basis vectors (E1, E2, E3) by our original three numbers…

{70.5889, 61.249, 47.8412}

and add the results.. we get numbers…

{0.243727,2.51476,23.848,64.4959,61.7521,33.9471,25.2246,
43.8826,66.214,82.1295,90.1689,83.8434,60.5088,30.3679,
10.993,3.10116,0.751257,0.192395,0.0458083,0.010994,0.}

and a picture…

g1107e

where I have shown the sum in white.

Let’s look quickly at the reflected spectrum (in yellow) and that sum (still in white):

g1107f

Yes, we’ve seen that before. And as before, that white curve is the fundamental. It differs from the yellow curve by a metameric black, a spectrum in the null space of A^T\ .

As before, we could stop here. We have gotten RGB values from the reflected spectrum… we have found the dual basis… we have found the fundamental of the reflected spectrum… we know that the first three numbers we computed (A^T\ S\ ) are the components of the fundamental wrt the dual basis.

The SVD, orthonormal basis, and projection operator

Now let’s do the SVD (Singular Value Decomposition), because we can get a nice orthonormal basis for the entire domain of A^T\ including its null space, i.e. for spectra including the metameric blacks.

As before, we will get an orthonormal basis v1 for the pre-image of A^T\ , then a projection operator R onto that pre-image, then recompute the fundamental by applying the projection operator to the reflected spectrum.

First, let

A^T = u\ w\ v^T\ .

(The first time I did this, I let B = A^T\ . Now we’re more experienced, but if you want to use B instead of A^T\ , feel free.)

As always, our vectors have length 21 because we used 20 nm intervals from 380 to 780 nm. Then, the leftmost 3 columns of v are an orthonormal basis for the pre-image of A^T\ , and the rightmost 18 (i.e. 21-3) are an orthonormal basis for the nullspace of A^T\ .

Here are the leftmost 3 columns of v, denoted v1 as is my habit:

v1 = \left(\begin{array}{ccc} 0.000242117 & 0.00237304 & 0.000773655 \\ 0.00254521 & 0.0246016 & 0.00811386 \\ 0.0253371 & 0.23387 & 0.0760077 \\ 0.0806111 & 0.633731 & 0.188135 \\ 0.110853 & 0.608184 & 0.12797 \\ 0.102268 & 0.306133 & -0.04696 \\ 0.0972532 & 0.129667 & -0.243487 \\ 0.0880827 & 0.101349 & -0.512971 \\ -0.0317461 & 0.111471 & -0.580417 \\ -0.228205 & 0.117862 & -0.449608 \\ -0.460923 & 0.112998 & -0.181177 \\ -0.596203 & 0.0923988 & 0.0811638 \\ -0.500375 & 0.0613641 & 0.162946 \\ -0.266387 & 0.0296408 & 0.104503 \\ -0.0986158 & 0.0105714 & 0.0410965 \\ -0.0280157 & 0.00296645 & 0.0118833 \\ -0.00680593 & 0.000717077 & 0.00290704 \\ -0.00174298 & 0.000183642 & 0.000744487 \\ -0.000414996 & 0.0000437242 & 0.000177259 \\ -0.000099599 & 0.0000104938 & 0.0000425421 \\ 0. & 0. & 0.\end{array}\right)

OK, I have to know: did v1 change? Yes. I’ll show you, soon.

Let me juxtapose the orthonormal basis v1 (right) with the dual basis E vectors. Both bases span the same space, the pre-image of A^T\ , but they are different.

g1107g

And in this case, I can’t see that any vectors in one basis look similar to vectors in the other basis.

Let me recall an image from the previous post: here are the dual basis E and the o/n basis v1 from the previous post. This is the old counterpart to the preceding drawing.

g1107h

And the right hand pieces of those two drawings are the new v1 and the old v1. We can see that they are quite different.

Having an orthonormal basis for the preimage of A^T\ , we know that we can construct a projection operator R from the domain of A^T\ onto the preimage; we just take the product of v1 and v1^T\ in the other order:

R = v1\ v1^T\ .

It’s 21×21, as before, so I’m not going to show it to you. But, for checking, I’ll show you the first column:

g1107i

And I’ll tell you that the last column, as before, is identically 0.

the fundamental and the residual

So. Apply the projection operator R to the reflected spectrum S to get the fundamental f…

f = {0.243727,2.51476,23.848,64.4959,61.7521,33.9471,25.2246,
43.8826,66.214,82.1295,90.1689,83.8434,60.5088,30.3679,
10.993,3.10116,0.751257,0.192395,0.0458083,0.010994,0.}

and subtract to get the residual n = S – f…

n = {4.85627,26.2996,19.116,-14.6684,-7.32845,18.6715,27.0686,
15.2246,3.00319,-13.0295,-23.7795,-12.7434,13.5977,41.0282,
59.1236,63.6887,60.2519,52.106,59.2832,33.0258,39.625}

Did we get the same fundamental? They are very close. Again, I think the real problem is that the two tables are not exactly consistent. Here’s

old f minus new f =

{-0.00362232,0.000221807,0.00281357,-0.00101889,-0.00138667,
-0.00346164,-0.00173991,-0.00126245,-0.00325203,-0.000486578,
0.00271479,0.00422385,0.00120929,-0.00248099,-0.00107542,
0.0013427,0.00223839,-0.00306751,0.00309787,0.00376061,0.}

We already know that the fundamental is the same whether we compute it using the dual basis or using the projection operator. Let me illustrate that the fundamental computed using RGB is the same as the fundamental computed using XYZ.

I have plotted points using the previous fundamental, and I have plotted a piecewise linear curve using the new fundamental:

g1107j

I should probably emphasize that although this drawing looks like one from the previous post, it illustrates something else. The apparently identical drawing in the previous post showed the equivalence of the fundamental computed two ways using XYZ; this drawing shows the equivalence of the fundamental from the previous post with the fundamental from this post.

And the null? Close, too. We have

old n minus new n =

{0.00362232,-0.000221807,-0.00281357,0.00101889,0.00138667,
0.00346164,0.00173991,0.00126245,0.00325203,0.000486578,
-0.00271479,-0.00422385,-0.00120929,0.00248099,0.00107542,
-0.0013427,-0.00223839,0.00306751,-0.00309787,-0.00376061,0.}

We can check that n is in the nullspace of A^T\ by computing A^T\ n\ :

{-6.77791*10^-14, 6.72509*10^-15, -3.50254*10^-14}.

We should also check that the fundamental has the same 3 components as the entire reflected spectrum. That is, we had

A^T\ S = \{70.5889,\ 61.249,\ 47.8412\}\ .

for A^T\ applied to the reflected spectrum. If we apply A^T\ to the fundamental, we get

A^T\ f = \{70.5889,\ 61.249,\ 47.8412\}.

The same. As they should be.

You know what? There’s another pair of calculations I can do, to make a point.

What are the components of the fundamental wrt the orthonormal basis v1? We just dot f into the v1 matrix, because its columns are orthonormal basis vectors:

f \cdot v1 =\
{-129.491,141.916,-79.7421,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}.

That confirms that no part of the fundamental lies in the nullspace. That was the whole point of the projection operator.

And what are the components of the residual? I hope you’ve got it: exactly the first three are zero:

n \cdot v1 =\
{0,0,0,-24.6623,-15.3786,17.8955,33.3365,29.6755,19.4414,
-0.412828,-18.9652,-15.4907,8.59491,37.8598,57.8821,63.3301,
60.1642,52.0835,59.2778,33.0245,39.625}.

The decomposition looks the same as in the previous post. The reflected spectrum is in yellow, the fundamental in white, and their difference n, the residual, is in black.

g1107j2

NOTE that although reflected and fundamental are non-negative, the residual is not. The good news is that, in principle, we could generate the fundamental spectrum, as well as the reflected. But I can’t imagine how to generate a spectrum with negative values.

We face this problem not just for the residual, but for just about every basis vector we’ve computed!

(The middle vector of the orthonormal basis, i.e. the middle column of v1, is the only exception so far; it is everywhere non-negative. But that’s only for the orthonormal basis for the RGB color matching functions; for the previous post, every column of v1 has some negative values.)

Because the basis vectors are, frankly, weird spectra, don’t expect anything nice if you try to find their RGB, XYZ, or xyz values. Most of them lie outside the CIE chromaticity chart! Incidentally, I have computed Cohen’s F1 and F2 bases for this case, and some of them are still negative, too. Well, they have to be: things have to cancel out so that the dot products among the orthonormal vectors are actually zero.

But while we’re looking at the linear algebra, let’s take a look at the orthonormal basis for the null space.

the null space basis

Now for something new.

We have 18 basis vectors (metameric black spectra) for the nullspace. It’s rather confusing to look at all of them… it’s not much better to look at a few of them close together. In the top left, we have the 1st 5th, 9th, and 13th (1,5,9,13)… the top right is 2,6,10,14… bottom left is 3,7,11,15… bottom right is 4,8,12,16:

g1107j3

There are two more to show (17 & 18) and I thought I show a few more adjacent pairs, too. I have added 1 & 2 and 9 & 10:

g1107k

While I’m playing around…. What about four spectra centered around 500? And add them up, to get another metameric black.

g1107m

Let me do one last thing.

Graphs at 5 nm

What would the bases look like if I used the full 5 nm tables? The fundamental depended on the reflected spctrum, but the bases depend only on my choice of color matching functions. So, let me show them for the 1931 CIE rgb bar tables at 5nm intervals.

Here are the orthonormal basis vectors “v1″ — at 20 nm on the left, 5 nm on the right:

g1107n

And the dual basis vectors “E”:

g1107o

And the dual basis “E” juxtaposed with the orthonormal basis “v1″:

g1107p

I might as well do this for the XYZ color matching functions, too. The more detailed curves are from the 1931 xyz bar tables, at 5 nm intervals.

Here are the orthonormal basis vectors “v1″:

g1107p2

The dual basis vectors “E”:

g1107q

And the dual basis “E” juxtaposed with the orthonormal basis “v1″:

g1107r

Color: decomposing the A transpose matrix and the reflected spectrum

This is the second post about the example on p. 160 of W&S. Here’s the first. (Incidentally, I am going to decompose the reflected spectrum into its fundamental and its residual. I do not think that they did this. Oh, this is what I did for Cohen’s toy example, but now it’s for real. Approximate, but for real.)

review the XYZ for this

Let’s review one case from the previous post: 1931 CIE at 20 nm intervals. As is my custom, I took a subset of their values — I did not average the numbers to account for the data I omitted. If this approximation isn’t good enough, simply go to 5 nm or 1 nm intervals.

Our ingredients were: a reflectance spectrum, an illiminant spectrum, and some choice of color matching functions.

Our results were a few sets of 3 numbers. The final result was called xyz — meaning the triple (x, y, z) — for plotting a point on the CIE chromaticity chart.

That result came from a triple denoted XYZ, which are values in XYZ color-space. That space has the advantage of being device-independent. We could — and will shortly — convert those values to an RGB color space, specifically the one defined by the CIE 1931 rgb bar (i.e. rbar, gbar, bbar) color-matching tables.

That XYZ result in turn came from a triple for which I have no convenient notation, but I keep saying without demonstration that that triple is the components of the fundamental of the reflected spectrum wrt the basis dual to the chosen color-matching functions. (Try saying that with a mouthfull of rocks!)

Let us go see that.

But first, let me review the calculations. Feel free to skip to the next section.

Our first computation is to get the reflected spectrum, which is the pointwise product of the illuminant spectrum and the reflectance spectrum. For this problem, the reflected spectrum is the perceived spectrum, the input to the eye of “the standard observer”.

Here is what the three of those spectra looked like:

g1102a

For this post, I’ve decided to avoid red, green, and blue, except where it is really appropriate. The illuminant is shown in white, the reflectance spectrum is shown in purple, and the reflected spectrum is shown in yellow.

For this example, I had chosen the 1931 xyzbar tables for our color functions, and they looked like this:

g1102b

Once again, I’ve changed the colors. X is orange, Y is white, and Z is yellow. It might help if you remember that the Y column of this A is nothing like green, being instead the “photopic response” — the total energy received by the cones — of the standard observer; white seems a sensible color for it.

BTW, we can tell that we are using XYZ (i.e. xbar, ybar, zbar tables) because these functions are never negative; we recall that each of the rbar, gbar, bbar color matching functions has some negative values.

Our second computation (or third, it doesn’t matter) should be to apply the matrix A^T\ — the (transpose of) the color matching functions A — to the reflected spectrum S, getting:

A^T\ S = \{356.817,\ 354.641,\ 271.109\}

I claim that these numbers are components of the fundamental of the reflected spectrum wrt the basis dual to A^T\ . In Cohen’s notation, these 3 numbers are the components wrt the columns of the E matrix.

Please be patient. We’ll get there real soon now.

The other computation, whether we do it second or third, is to normalize things using the illuminant spectrum; that is, to set the white point (1,1,1) in our XYZ space.

We did this easily because the second column of A — the second row of A^T\ — is the total cone response — the photopic response — of the standard observer.

Here’s the second row of A^T\ applied to D65: 529.764

Now we move on to the fourth computation: we divide our 3 values by that to get X,Y,Z.

XYZ = {0.673539, 0.669432, 0.511754}

NOTE that the XYZ values are in [0, 1]. These are tristimulus values in a sort-of color space. (Let’s see. It is not HSI, even though that middle number is an intensity. In fact, it does not directly specify a color; that’s done by a change-of-basis. Maybe I should call it a pre-color space? Maybe I’m getting carried away here.)

And then our fifth computation is to get x,y,z. Here’s the sum X+Y+Z: 1.85472

and we divide XYZ by it to get xyz values: xyz = {0.363148, 0.360933, 0.275919}.

Where was that on the chromaticity chart? The yellow dot.

g1102c

All that was done in the previous post. Now let’s move on.

Moving on.

Get the RGB

Let’s do one last thing before putting this down. I want to get RGB values from these XYZ. I have in my hands, and have shown you before, the transition matrix T31 (having once needed the 1964 CIE, I have renamed some things)…

T31 = \left(\begin{array}{ccc} 2.76888 & 1.75175 & 1.13016 \\ 1 & 4.5907 & 0.0601 \\ 0 & 0.05651 & 5.59427\end{array}\right)

It treats XYZ as old, RGB as new, so we have

XYZ = T31 RGB,

so

RGB = T31^{-1}\ XYZ = \{0.133246,\ 0.115616,\ 0.0903103\}\ .

We’ll come back to these in the next post. These should be RGB color-space values using the 3 sources shown on the CIE chromaticity diagram; alternatively, these should be the RGB values computed directly from the rbar, gbar, bbar tables. That’s what I’ll confirm in the next color post.

Moving on, finally, there are two things we need to do. Actually, we might get away with only one of them, and that’s the dual basis, so let’s go get it first.

getting the dual basis E.

As I have said before, given a basis with transition matrix A we can find a basis with transition matrix E dual to it by requiring that the dot products of the vectors satisfy

E^T A = I\ .

That is, by laying out the E vectors as rows and the A vectors as columns, we compute all of the possible dot products as E^T A\ .

A digression on dual bases

I’ll remind you that we are working in finite dimensional vector spaces, and the dual basis and reciprocal basis look exactly the same. (So if you need to review, look at my posts on reciprocal bases. See the tag cloud.) But I think it’s worthwhile to think of reciprocal bases as living in the same vector space, while dual bases live in a vector space V and its dual V*.

Oh, the dual space V* is the vector space of all linear functionals whose domain is V, i.e. all real-valued linear operators from V to R.

I’ll also remind you that although one usually starts with a basis in V and finds “the dual basis” in V*, we are going backwards. The rows of A^T\ are a basis for V*, and we wish to find a matrix E whose columns are a basis for V, and more, a dual basis to the rows of A^T\ .

What a tangled knot! In my head I was going backwards, but I computed it the other way! What I really computed was a basis E^T dual to A — but that is the same as finding a basis E dual to A^T\ . The two equations

E^T A = I

and

A^T E = I

are equivalent. I chose to solve the first instead of the second, even though I was thinking of the second. (Aside: it’s a very good thing that the dual of the dual space is the original space, for finite dimensional vector spaces.)

Finally, I suppose I should say again that although we often compute the dot product of two vectors (u and v, say) by writing one as a row (say u) and the other (v) as a column, we may view that matrix product as the effect of applying the linear functional u to the vector v. And what I did was the equivalent of talking about u’v but computing the equivalent, v’u.

Why might we view it that way? Among other things, we can apply u to v without even having a dot product!

Let’s return to finding the dual basis E.
end digression

If A is square and of full rank — i.e. we really do have a basis, hence a linearly independent set of vectors — then we can invert A and write

E^T A = I

E^T = A^{-1}\ .

In this case, however — with A our color matching functions –as we first saw in the first Cohen post, A is not square and therefore cannot possibly be invertible. The 3×3 matrix A’A (A^T A\ ), however, is invertible, and we construct the pseudo-inverse

(A'A)^{-1} A^T

and substitute it for A^{-1}\ , getting

E^T = A (A'A)^{-1}\ .

(We remember that A’A is symmetric, i.e. its own transpose.)

BTW, it is very convenient to write the equation that way, because to check that we have a dual basis we want to compute E^T A\ . We might as well have found E^T first.

When we check it, we see that E^T A = I\ .

Let me call the three dual basis vectors E1, E2, E3, and let’s look at them. Here they are in columns…

E = \left(\begin{array}{ccc} 0.000358806 & -0.000360371 & 0.000884809 \\ 0.00330737 & -0.00334373 & 0.00929769 \\ 0.0302172 & -0.0308581 & 0.0885709 \\ 0.0643748 & -0.0678399 & 0.241909 \\ 0.0116003 & -0.0190679 & 0.237446 \\ -0.0843071 & 0.0839105 & 0.126398 \\ -0.190772 & 0.216831 & 0.060478 \\ -0.329572 & 0.420346 & 0.0457603 \\ -0.286455 & 0.453811 & 0.0276004 \\ -0.0869933 & 0.321465 & -0.00307957 \\ 0.209686 & 0.0758441 & -0.0425853 \\ 0.438522 & -0.150742 & -0.0706912 \\ 0.418257 & -0.20126 & -0.0640185 \\ 0.232068 & -0.121142 & -0.0349616 \\ 0.0872125 & -0.0467666 & -0.0130635 \\ 0.0249132 & -0.0134701 & -0.00372503 \\ 0.00608984 & -0.00330711 & -0.000909677 \\ 0.00157141 & -0.000868448 & -0.000233816 \\ 0.000348975 & -0.000172577 & -0.0000531557 \\ 0.0000923158 & -0.0000402793 & -0.0000143874 \\ 0. & 0. & 0.\end{array}\right)

… and here is what they look like (yellow, violet, orange in order):

g1102d

If we multiply each of those spectra by our original numbers {356.817,354.641,271.109}… and add the results.. we get

Vdb = {0.240105,2.51498,23.8508,64.4948,61.7507,33.9437,25.2229,43.8813,
66.2108,82.129,90.1716,83.8477,60.51,30.3654,10.9919,3.10251,0.753495,
0.189327,0.0489062,0.0147546,0.}

which looks like this (you can see that the orange one shrank a little compared to the other two):

g1102e

where I have shown the scaled basis vectors (i.e. the scalar component times each vector), and added their sum in white.

Let’s look quickly at the reflected spectrum (in yellow) and that sum (in white):

g1102f

They are quite different. What’s going on?

What does the dual basis describe?

Well, look at the long wavelengths of the white curve. We — i.e. the standard observer — can’t see that part of the reflected spectrum at all. Parts of the reflected spectrum are invisible to us.

Duh. Been there, done that. “Metameric black”.

What we have — a basis of 3 vectors — is a basis for the visible-to-us part of the reflected spectrum — i.e. a basis for what we called the fundamental. We have in fact found that part of the reflected spectrum which the standard observer actually sees; and by subtraction we could find what part of it is– let’s call it the residual — what we have called a metameric black. (“Metameric” because it is a nonzero spectrum; it is perceptually black, but it is not physically the absence of light.)

We have the fundamental; it’s that white curve. But I haven’t really proved that.

It’s not hard. By definition, each of the 3 vectors in the dual basis is in the pre-image of A^T\ . And, since A^T\ has 3 rows, it is of rank at most 3, so its pre-image is of dimension at most 3. In fact, we know that this A^T\ is of full rank, hence of rank 3, and therefore its pre-image is of dimension 3.

If the 3 dual basis vectors are linearly independent (i.e. the E matrix is of full rank 3), they span a 3D space which is a subspace of the 3D pre-image — so they span the entire pre-image.

And I’m certain that A^T\ being of full rank guarantees that the dual basis is of full rank.

But exactly how do I prove that? Oh, the pre-image of A^T\ is the range of A, hence of dimension 3.

Do you find it convincing?

Let’s look at it another way. But remember that we really did just get the fundamental f of the reflected spectrum S, and we could compute the residual n by subtraction; the residual is the reflected spectrum minus the fundamental.

n = S – f.

An orthonormal basis for fundamental spectra.

For the other way, we need a really nice basis for the entire domain of the A^T\ matrix, one which decomposes its domain into nullspace and not-nullspace (i.e. nullspace and preimage of the range). But we know how to do that.

The SVD (Singular Value Decomposition).

We will get an orthonormal basis v1 for the pre-image of A^T\ , then a projection operator R onto that pre-image, then get the fundamental by applying the projection operator to the reflected spectrum.

First, let

A^T\ = u\ w\ v^T\ .

(The first time I did this, I let B = A^T\ . Now we’re more experienced, but if you want to use B instead of A^T\ , feel free.)

Now, v is 21×21. Then the leftmost 3 (= rank of A^T\ ) columns of v are an orthonormal basis for the preimage of A^T\ , and the rightmost 18 (= 21-3) are an orthonormal basis for the nullspace of A^T\ .

Here are the leftmost 3 columns of v:

v1 = \left(\begin{array}{ccc} -0.00201177 & 0.00143737 & 0.000374614 \\ -0.0210175 & 0.0149756 & 0.00335666 \\ -0.199597 & 0.142652 & 0.0305073 \\ -0.539263 & 0.385948 & 0.0611556 \\ -0.513526 & 0.367224 & -0.00344907 \\ -0.258102 & 0.159433 & -0.11971 \\ -0.121682 & -0.0133979 & -0.265642 \\ -0.134798 & -0.177324 & -0.481195 \\ -0.193185 & -0.30181 & -0.471062 \\ -0.246728 & -0.377774 & -0.254046 \\ -0.283178 & -0.410206 & 0.0979204 \\ -0.273373 & -0.377298 & 0.391838 \\ -0.201477 & -0.270573 & 0.408489 \\ -0.102016 & -0.135415 & 0.232377 \\ -0.0370629 & -0.0489598 & 0.0880787 \\ -0.0104736 & -0.0138137 & 0.0252276 \\ -0.00254533 & -0.00335417 & 0.00617542 \\ -0.000641283 & -0.000842035 & 0.00160261 \\ -0.000163267 & -0.00021855 & 0.000343642 \\ -0.0000487128 & -0.0000661713 & 0.0000876566 \\ 0. & 0. & 0.\end{array}\right)

We should look at those. First alone:

g1102g

… and then juxtaposed with the dual basis:
g1102g2

OK, the orthonormal basis is different from the dual basis. (Actually, one pair look similar. I’ll take a look in a moment.) These vectors (spectra) are orthonormal, the dual basis is not. (To check that, just compute v1^T v1\ ; it’s a 3×3 identity matrix.)

Let me look at the two basis vectors that were similar. From the dual basis, the first vector; from the orthonormal basis, the 3rd vector.

g1102h

Interesting. I do not know that this result has any significance, but what’s the point of drawing pictures and not looking at them?

Let’s continue.

The projection operator R.

Having an orthonormal basis for the preimage of A^T, we know that we can construct a projection operator R from the domain of A^T\ onto the preimage; we just take the product of v1 and v1^T in the other order:

R = v1\ v1^T

It’s 21×21, so I’m not going to show it to you. But, for checking, I’ll show you the first column:

g1102 R

And I’ll tell you that the last column is identically 0.

The fundamental, again? and more.

So. Apply the projection operator R to the reflected spectrum S to get the fundamental f…

R S = f = {0.240105,2.51498,23.8508,64.4948,61.7507,33.9437,25.2229,43.8813,
66.2108,82.129,90.1716,83.8477,60.51,30.3654,10.9919,3.10251,0.753495,
0.189327,0.0489062,0.0147546,0.}

and subtract to get the residual n = S – f:

S – f = n = {4.85989,26.2994,19.1132,-14.6673,-7.32707,18.6749,27.0703,15.2259,
3.00644,-13.029,-23.7822,-12.7477,13.5965,41.0307,59.1247,63.6874,
60.2497,52.1091,59.2801,33.022,39.625}

We can check that n is in the nullspace of A^T\ by computing A^T\ n\ :

A^T\ n = \{-1.01618*10^-13,\ -1.09023*10^-13,\ 4.93919*10^-15\}

We should also check that the fundamental has the same 3 values as the entire reflected spectrum. That is, we had

A^T\ S = \{356.817,\ 354.641,\ 271.109\}

for A^T\ applied to the reflected spectrum. If we now apply A^T\ to the fundamental, we get

A^T\ f = \{356.817,\ 354.641,\ 271.109\}

The same. Of course. (I hope you think so.)

Let’s look at the decomposition. The reflected spectrum is in yellow, the fundamental in white, and their difference n, the residual, is in black.

g1102i

Yes, where the fundamental is zero, the reflected spectrum and the residual coincide.

At the risk of annoying you, let me emphasize: the standard observer (at 20 nm intervals!) would perceive the yellow spectrum and the white spectrum as being exactly the same color. We would say they are metameric spectra. He is actually perceiving the yellow spectrum, but if we somehow generated the white one, it would look the same to him.

And now we should compare the fundamental with the vector Vdb constructed from the dual basis. We could look at the numbers, of course, but an easy way to see that they coincide is to plot only points for Vdb and only a piecewise-linear curve for the fundamental.

g1102j

They are the same.

Thus we confirm that the 3 values A^T S are the components of the fundamental (not of the entire reflected spectrum) wrt the basis E dual to A.

The dual basis tells us what those first 3 numbers mean, and would let us construct the fundamental, and thence the residual.

The orthonormal basis, on the other hand, let us compute a projection operator, which gets us the fundamental by applying the projection operator to the reflected spectrum.

If i could only do one of these, it would be to construct the dual basis rather than the projection operator… but computing power is cheap and there’s no reason to do only one of these.

Incidentally, we could have simply appled the Gram-Schmidt orthogonalization procedure to the dual basis… but I really like having a basis for the null space, and that’s what the rest of the v matrix from the SVD got me.

I’m going to wait, however, before looking at the basis vectors for the null space.

PCA / FA example 4: davis. Reciprocal basis 4.

(this has nothing to do with the covariance matrix of X; we’re back with A^R for the SVD of X.)

in the course of computing the reciprocal basis for my cut down  A^R

\left(\begin{array}{ll} 7.48331 & 0. \\ -3.74166 & -2.44949 \\ -3.74166 & 2.44949\end{array}\right)

i came up with the following matrix:

\beta = \left(\begin{array}{ll} 0.0445435 & 0 \\ -0.0890871 & -0.204124 \\ -0.0890871 & 0.204124\end{array}\right)

now, \beta is very well behaved. Its two columns are orthogonal to each other:

\left(0.0445435,-0.0890871,-0.0890871\right) \left(\begin{array}{l} 0 \\ -0.204124 \\ 0.204124\end{array}\right) = 0

And each column of \beta is orthogonal to one of the A^R vectors…

\left(0.0445435,-0.0890871,-0.0890871\right) \left(\begin{array}{l} 0. \\ -2.44949 \\ 2.44949\end{array}\right) = 0

\left(0,-0.204124,0.204124\right) \left(\begin{array}{l} 7.48331 \\ -3.74166 \\ -3.74166\end{array}\right) = 0

and each column of \beta has dot product 1 with the other of the A^R vectors:

\left(0.0445435,-0.0890871,-0.0890871\right) \left(\begin{array}{l} 7.48331 \\ -3.74166 \\ -3.74166\end{array}\right) = 1

\left(0,-0.204124,0.204124\right) \left(\begin{array}{l} 0. \\ -2.44949 \\ 2.44949\end{array}\right) = 1

i know perfectly well that i could have simply said that \beta is diagonal and that \beta^T\ A^R = I, for my cut-down A^R.

I wanted to write out the orthogonality relations for \beta explicitly. you see, there’s just one tiny problem: i know that \beta is wrong. It is not the reciprocal basis for A^R.

What could be wrong? how can it be so well-behaved but wrong? on a hunch, i computed the normals to the planes spanned by the columns of B and by the columns of \beta, and by the (first two) columns of A^R. in 3D we just use the cross product of two vectors to find their normal.

The cross product of the first two columns of A^R is:

\{-18.3303,-18.3303,-18.3303\}

That’s a normal to the plane spanned by the first two columns of A^R. it’s also (anti-)parallel to the third column of v:

\{0.57735,0.57735,0.57735\}.

Of course: the eigenvector basis described by v is orthogonal, so the third one is normal to the plane spanned by the first two.

Similarly, the cross product of the first two columns of B is:

\{-0.0181848,-0.0181848,-0.0181848\}.

(how convenient that the normal vectors have all components equal!) that, too, is (anti-)parallel to the third column of v, so it’s also normal to the plane spanned by the first two columns of A^R.

and the cross product of the columns of \beta?

\{-0.0363696,-0.00909241,-0.00909241\}.

well, shiver my timbers. that is not like the other two cross products, hence not normal to the plane spanned by the first two columns of A^R.

In contrast to the two vectors of B, the two vectors of \beta do not lie in the same plane as the A^R vectors. They do not span the same plane. They are not a reciprocal basis for the same plane.

\beta is very nice 2D basis. it just happens to span a plane other than the one spanned by A^R. it’s not normal to the third column of v.

It isn’t sufficient to require \beta^T\ A = I; we also need to require that the columns of \beta span the same space – in this case, the same plane – as the columns of A^R!

And how would we realize that our alleged reciprocal basis \beta wasn’t one? well, the projections of X onto the reciprocal basis should be the S^R. if they’re not, something’s wrong.

(guess what? now you know how i discovered that \beta was wrong.)

i would hate weighted eigenvector matrices when there are eigenvalues equal to zero, except that i think i can handle them. this is when i decided to replace zero eigenvalues by 1s, take the inverse transpose, and then take a sub-basis (i.e. the appropriate number of columns).

My own list of things to do now includes constructing the reciprocal basis in precisely that fashion. i choose to construct it because it corresponds to S^R, and let’s me compute the data values for the A^R basis.

PCA / FA example 4: davis. Davis & harman 3.

Hey, since we have the reciprocal basis, we can project onto it to get the components wrt the A^R basis. After all that work to show that the S^R are the components wrt the reciprocal basis, we ought to find the components wrt the A^R basis. And now we know that that’s just a projection onto the reciprocal basis: we want to compute X B.

Recall X:

X = \left(\begin{array}{lll} -6 & 3 & 3 \\ 2 & 1 & -3 \\ 0 & -1 & 1 \\ 4 & -3 & -1\end{array}\right)

Recall B:

B = \left(\begin{array}{ll} 0.0890871 & 0. \\ -0.0445435 & -0.204124 \\ -0.0445435 & 0.204124\end{array}\right)

The product is:

X\ B = \left(\begin{array}{ll} -0.801784 & 0 \\ 0.267261 & -0.816497 \\ 0 & 0.408248 \\ 0.534522 & 0.408248\end{array}\right)

What are the column variances?

\{0.333333,0.333333\}

Does that surprise you? the 1/3 comes from our using eigenvalues of X^T\ X instead of eigenvalues of the covariance matrix. The new variables have a common variance, but it isn’t 1. this corresponds to our initial work in harman, when we discovered that using the weighted eigenvector matrix gave us new variables with variance 1.

at first glance, it seems not worthwhile to work that out using the covariance matrix, but i suppose it might be worthwhile to come at the covariance matrix from the SVD. (oh, yes! it is!)

Let’s do it.

The covariance matrix of X is \frac{X^T\ X}{N-1}

We should just replace X by X2 = \frac{X}{\sqrt{N-1}} and find the SVD of X2. but there’s a catch.

Do it anyway. here’s our X2:

X2 = \left(\begin{array}{lll} -3.4641 & 1.73205 & 1.73205 \\ 1.1547 & 0.57735 & -1.73205 \\ 0. & -0.57735 & 0.57735 \\ 2.3094 & -1.73205 & -0.57735\end{array}\right)

Compute the variances…

\{6.22222,2.22222,2.22222\}

and their sum … 10.6667 .

Hang on. The data X2 has 1/3 the variance of the original data. We’re trying to find new forms of X (not of X2) with variance 1. but let’s just keep going.

Get the SVD of X2:

here are new u…

u2 = \left(\begin{array}{llll} -0.801784 & 0 & -0.571429 & 0.174964 \\ 0.267261 & -0.816497 & -0.449762 & -0.24417   \\ 0. & 0.408248 & -0.267261 & -0.872872 \\ 0.534522 & 0.408248 & -0.632262 & 0.384531\end{array}\right)

new v…

v2 = \left(\begin{array}{lll} 0.816497 & 0 & 0.57735 \\ -0.408248 & -0.707107 & 0.57735 \\ -0.408248 & 0.707107 & 0.57735\end{array}\right)

and new w…

w2 = \left(\begin{array}{lll} 5.2915 & 0. & 0. \\ 0. & 2. & 0. \\ 0. & 0. & 0. \\ 0. & 0. & 0.\end{array}\right)

Then i compute new A’s and S’s from

A^R = v\ w^T.

A^Q = u\ w.

S^R = X\ A^R.

S^Q = X^T\ A^Q.

(using X2, u2, v2, and w2, of course)

A^Q are the components of the data X2 wrt the orthogonal eigenvector matrix v2.

A^Q = \left(\begin{array}{lll} -4.24264 & 0 & 0 \\ 1.41421 & -1.63299 & 0 \\ 0. & 0.816497 & 0 \\ 2.82843 & 0.816497 & 0\end{array}\right)

Compute the variances…

\{9.33333,1.33333,0.\}

and their sum is 10.6667 . Yes, the new orthogonal eigenvector matrix v has redistributed the total variance of X2 among two new variables.

What about the R-mode scores?

S^R = \left(\begin{array}{llll} -22.4499 & 0 & 0 & 0 \\ 7.48331 & -3.26599 & 0 & 0 \\ 0. & 1.63299 & 0 & 0 \\ 14.9666 & 1.63299 & 0 & 0\end{array}\right)

Compute the variances:

\{261.333,5.33333,0.,0.\}

i say again, yikes!

For the record, here are the new

A^R = \left(\begin{array}{llll} 4.32049 & 0 & 0 & 0 \\ -2.16025 & -1.41421 & 0 & 0 \\ -2.16025 & 1.41421 & 0 & 0\end{array}\right)

and

S^Q = \left(\begin{array}{lll} 22.8619 & 0 & 0 \\ -11.431 & -2.82843 & 0 \\ -11.431 & 2.82843 & 0\end{array}\right)

From the orthonormal basis v2 , i construct a reciprocal basis for the \sqrt{\text{eigenvalue}}-weighted matrix A^R. i’ll just use the diagonal of w2 (instead of the \sqrt{\text{eigenvalue}} since i haven’t computed them) to scale two of the vectors. Recall the new w matrix:

w2 = \left(\begin{array}{lll} 5.2915 & 0. & 0. \\ 0. & 2. & 0. \\ 0. & 0. & 0. \\ 0. & 0. & 0.\end{array}\right)

i construct a diagonal matrix using 1/w when possible, 1 otherwise.

\left(\begin{array}{lll} 0.188982 & 0. & 0. \\ 0. & 0.5 & 0. \\ 0. & 0. & 1.\end{array}\right)

then i premultiply that by the new v, and keep only the first two columns; the new reciprocal basis is:

B2 = \left(\begin{array}{ll} 0.154303 & 0 \\ -0.0771517 & -0.353553 \\ -0.0771517 & 0.353553\end{array}\right)

Having the reciprocal basis, we project the data onto it get the new data wrt the A^R basis. But we project the original data. That’s the catch. As i said, it’s the variance of X we’re trying to scale to 1, not the variance of X2. we compute X B2:

X\ B2 = \left(\begin{array}{ll} -1.38873 & 0 \\ 0.46291 & -1.41421 \\ 0 & 0.707107 \\ 0.92582 & 0.707107\end{array}\right)

And the variances?

\{1.,1.\}.

Indeed. this corresponds even more closely to what we first saw in harman. (if instead we project X2 onto B2, we get a common variance of 1/3 again.)

PCA / FA example 4: davis. Reciprocal basis 2 & 3.

The reciprocal basis for A^R using explicit bases.

Here I go again, realizing that I’m being sloppy. i call

\{2,1,-3\}

the second data vector, but of course, those are the components of the second data vector. a lot of us blur this distinction a lot of the time, between a vector and its components. So long as we work only with components, this isn’t an issue, but we’re about to write vectors. i wouldn’t go so far as to say that the whole point of matrix algebra is to let us blur that distinction, but it’s certainly a major reason why we do matrix algebra. but for now, let’s look at the linear algebra, distinquishing between vectors and their components.

i need a name for the second data vector; let’s call it s. we will write it two ways, with respect to two bases, and show that the two ways are equivalent.

We have the original basis. We never write it, because the basis vectors have components (1,0,0), (0,1,0), and (0,0,1). let me call this basis e_i. because it’s an orthonormal basis, it is its own reciprocal basis, and i will call the reciprocal basis e^i. yes, i use the same symbol e, but the original basis has subscripts, the reciprocal basis has superscripts.

We have the basis described by A^R. let me call those vectors f_i. we found the reciprocal basis B, and i will call it f^i. same convention: subscripts or superscripts on the common symbol f.

let us recall the (transition matrix B) for the reciprocal basis:

\left(\begin{array}{ll} 0.0890871 & 0. \\ -0.0445435 & -0.204124 \\ -0.0445435 & 0.204124\end{array}\right)

let us also recall the design matrix X:

\left(\begin{array}{lll} -6 & 3 & 3 \\ 2 & 1 & -3 \\ 0 & -1 & 1 \\ 4 & -3 & -1\end{array}\right)

When we say the components of the second data vector are

\{2,1,-3\}

we are asserting that the second data vector is

s = 2\ e_1 + e_2 - 3\ e_3.

When we say that the first reciprocal basis vector has components given by the first column of B…

\{0.0890871,-0.0445435,-0.0445435\}

we are asserting that the first reciprocal basis vector is

f^1= 0.0890871\ e^1\ -0.0445435\ e^2 - 0.0445435\ e^3

Similarly, the second reciprocal basis vector, whose components are the second column of B, is

f^2 = 0. e^1 - 0.204124\ e^2 + 0.204124\ e^3

Let us recall (the first two columns of) S^R:

\left(\begin{array}{ll} -67.3498 & 0. \\ 22.4499 & -9.79796 \\ 0. & 4.89898 \\ 44.8999 & 4.89898\end{array}\right)

When we say that the second row of S^R

\{22.4499,-9.79796\}

are the components of the second data vector wrt the reciprocal basis \left(f^i\right), we are saying that

s = 22.4499 f^1 - 9.79796 f^2.

but we defined the vector s as

s = 2\ e_1 + e_2 - 3\ e_3.

The two forms of s should be the same vector. We start with…

s = 22.4499 f^1 - 9.79796 f^2.

Plug in the expressions for f^1 and f^2

s = 22.4499 \left(0.0890871 e^1 - 0.0445435   e^2 - 0.0445435 e^3\right)
- 9.79796 \left(0.   e^1 - 0.204124 e^2 + 0.204124 e^3\right)

and Mathematica® simplifies that to

s = 2. e^1 + 1. e^2 - 3. e^3,

just what it should be.

The reciprocal basis for A^R using tensors.

Third, i can write this another way, and some of you will have seen it this way. If you have, i just want to refresh your memory. Using tensor notation (and the einstein summation convention), we could have written

s = s_i\ f^i and s = s^i\ f_i.

which is another way of saying that the components of s wrt the reciprocal basis f^i are s_i and the components of s wrt the basis f_i are s^i. It is absolutely crucial that corresponding components and vectors do not both have superscripts or subscripts, but are mixed.

what is significant are these equations

s\ \cdot\ f_i = s_i

and

s\ \cdot\ f^i = s^i,

which I offer as reminders; i’m not going to work them out for you at this time. they are another way of writing what we’ve been saying all along: dotting a vector into some basis vectors gives you the components wrt the reciprocal basis.

PCA / FA example 4: davis. Scores & reciprocal basis 1.

The reciprocal basis for A^R, using matrix multiplication.

i keep emphasizing that A^Q is the new data wrt the orthogonal eigenvector matrix; and i hope i’ve said that the R-mode scores S^R = X\ A^R are not the new data wrt the weighted eigenvector matrix A^R. In a very real sense, the R-mode scores S^R do not correspond to the R-mode loadings A^R. This is important enough that i want to work out the linear algebra. In fact, i’ll work it out thrice. (And i’m going to do it differently from the original demonstration that A^Q is the new data wrt v.)

Of course, people will expect us compute the scores S^R, and I’ll be perfectly happy to. I just won’t let anyone tell me they correspond to the A^R.

First off, we have been viewing the original data as vectors wrt an orthonormal basis. Recall the design matrix:

\left(\begin{array}{lll} -6 & 3 & 3 \\ 2 & 1 & -3 \\ 0 & -1 & 1 \\ 4 & -3 & -1\end{array}\right)

Let’s take just the second observation:

X_2 = (2,\ 1,\ -3)

Those are the components of a vector.

Second, we have an orthonormal basis defined by the orthogonal eigenvector matrix v. Recall A^Q:

\left(\begin{array}{lll} -7.34847 & 0. & 0. \\ 2.44949 & -2.82843 & 0. \\ 0. & 1.41421 & 0. \\ 4.89898 & 1.41421 & 0.\end{array}\right)

for the second row, in particular, we have new components….

A^{Q}_2 = (2.44949,\ -2.82843,\ 0.)

We’ve already seen that old and new components (x and y, resp) are related by

X_2 = v\ A^{Q}_2,

although we wrote the whole thing as A^Q = X\ v. yes? we confirm it simply by computing…

v\ A^{Q}_2 = \left(\begin{array}{lll} 0.816497&0&0.57735\\ -0.408248&-0.707107&0.57735\\ -0.408248&0.707107&0.57735\end{array}\right) x \left(\begin{array}{l} 2.44949\\ -2.82843\\ 0.\end{array}\right)

= \left(\begin{array}{l} 2.\\ 1.\\ -3.\end{array}\right)

which is X_2.

The second row of A^Q represents the same vector as the second row of X, but the components are different because the basis vectors are different.

All that was review, albeit from a slightly different point of view. let’s move on to the weighted eigenvector matrix A^R; it is not an orthonormal basis because the basis vectors are not of length 1, although they are orthogonal to each other. Here’s my A^R from the SVD:

\left(\begin{array}{llll} 7.48331&0.&0.&0.\\ -3.74166&-2.44949&0.&0.\\ -3.74166&2.44949&0.&0.\end{array}\right)

i just want the first two columns. Here’s my A^R cut down…

\left(\begin{array}{ll} 7.48331&0.\\ -3.74166&-2.44949\\ -3.74166&2.44949\end{array}\right)

and my S^R cut down:

\left(\begin{array}{ll} -67.3498&0.\\ 22.4499&-9.79796\\ 0.&4.89898\\ 44.8999&4.89898\end{array}\right)

I’ve been implying, if not saying, that the projections of X onto A^R, namely X\ A^R (which are the R-mode scores S^R), are the components of the data wrt a basis other than A^R. it’s what may best be called the reciprocal basis.

(if you haven’t looked at “the reciprocal basis” in http://rip94550.wordpress.com/2008/03/21/transpose-matrix-adjoint-operator-2/, you might want to.)

Briefly, a reciprocal basis is constructed to make the dot product of vectors work out right when we have a non-orthonormal basis. What we do is get the components of one vector wrt the basis, the other vector wrt the reciprocal basis, and then we can compute their dot product in the usual way, as the sum of products of corresponding entries; i.e. as a row vector times a column vector.

If we have a transition matrix A, then the transition matrix B for the reciprocal basis is B = A^{-T}. (because it gives us B^T\ A = A^{-1}\ A = I, which says that the dot product of a row of B^T with a column of A is either 0 or 1. it’s equivalent to saying that the dot product of a column of B with a column of A is either 0 or 1. either way, the column vectors of B and of A are pairwise orthogonal.)

we have a problem, however: we have a matrix A^R which is not invertible. Another damned dragon to cope with.

We could look for a matrix B such that B^T\ A = I. unfortunately, such matrices are not unique.

Let’s construct the reciprocal basis geometrically. As A^R was constructed from v by multiplying columns by \sqrt{\text{eigenvalue}} we will construct B by dividing v by \sqrt{\text{eigenvalue}}, but we’ll just do it for the first two columns. The quickest way to accomplish that is to post-multiply by a diagonal matrix whose entries are \frac{1}{\sqrt{\text{Eigenvalue}}}, when possible and 1 otherwise… the diagonal eigenvalue matrix is…

\left(\begin{array}{lll} 9.16515&0.&0.\\ 0.&3.4641&0.\\ 0.&0.&0.\end{array}\right)

so i form a new diagonal matrix

\left(\begin{array}{lll} 0.109109&0.&0.\\ 0.&0.288675&0.\\ 0.&0.&1.\end{array}\right)

Note the 1 instead of 0 in the (3, 3) slot.

recall v…

\left(\begin{array}{lll} 0.816497&0&0.57735\\ -0.408248&-0.707107&0.57735\\ -0.408248&0.707107&0.57735\end{array}\right)

and compute the product of v and the new diagonal matrix

\left(\begin{array}{lll} 0.816497&0&0.57735\\ -0.408248&-0.707107&0.57735\\ -0.408248&0.707107&0.57735\end{array}\right) x \left(\begin{array}{lll} 0.109109&0.&0.\\ 0.&0.288675&0.\\ 0.&0.&1.\end{array}\right)

= \left(\begin{array}{lll} 0.0890871&0.&0.57735\\ -0.0445435&-0.204124&0.57735\\ -0.0445435&0.204124&0.57735\end{array}\right)

and since i only want the first two columns, i cut that down to

B = \left(\begin{array}{ll} 0.0890871&0.\\ -0.0445435&-0.204124\\ -0.0445435&0.204124\end{array}\right)

and call it B.

It is true that B^T\ A = I. (so the columns of B are mutually orthogonal to the columns of A.)

Now, B is the transition matrix for the reciprocal basis of A^R. S^R was defined as X\ A^R, the projections of the data onto the A^R basis. i claim that S^R is the components of the data wrt B, the reciprocal basis for A^R.

i want to confirm this just for the second data vector, namely that

X_2 = B\ S^{R}_2,

i.e. the old components are found by applying the transition matrix B to the new components. the RHS is

B\ S^{R}_2 = \left(\begin{array}{ll} 0.0890871&0.\\ -0.0445435&-0.204124\\ -0.0445435&0.204124\end{array}\right) x \left(\begin{array}{l} 22.4499\\ -9.79796\end{array}\right)

= \left(\begin{array}{l} 2\\ 1\\ -3\end{array}\right)

and, once again, we have found the old components of the second observation. More importantly, we have exhibited a basis B \ne\ A^R wrt which the R-mode scores represent the new components of the data.

transpose matrix & adjoint operator 2

(
begin digression
just in case you’ve seen this in another form, let me make some connections. if you don’t recognize any of this digression, that’s ok. you can move along, there’s nothing to see here.
i am taking a linear operator L: V\rightarrow V, from a vector space V to the same vector space V. in fact, i have more than a vector space: V is an inner product space; i have a “dot product”. 
what i call the reciprocal basis is often called the dual basis. in fact, halmos calls it the dual basis. but that terminolgy is also associated specifically with the so-called dual space V* of linear functionals on V; in that case, the dual basis is a basis in V*. there can be a great deal of confusion here. the dual space V* can be defined without an inner product on V; the inner product on V can be defined without ever mentioning the dual space V*. but if we introduce both inner product and V*, then there is a natural isomorphism between elements of V and of V*. i have seen people think of the one-to-one relationship between elements of V and V* as an identity, and to confuse the inner product of two vectors in V with the effect of a linear functional on a vector. (worse, i have seen people assert that an inner product involves one element of V and one element of V*.
there is a one-to-one correspondence between my right shoe and my left, but they are not identical. isomorphism is not always identity.
here, i have a finite-dimensional vector space V with an inner product on it. i have two bases (original and new) on V, and i want to construct a third basis for V. i call it the reciprocal basis to emphasize that it is not a basis on V*.
end digression
)
let’s see how this plays out.
our new basis defined by the columns of P is not orthonormal. we need to figure out what the reciprocal basis needs to be. its purpose is to make dot products – and transposes – come out right.
the columns of P are the basis vectors for the new basis in which our matrix A became the diagonal matrix B.
recall P:
P = \left(\begin{array}{cc} 1&1\\ 1&0\end{array}\right)
what are the dot products of those two basis vectors with each other? well, we have them wrt the original basis, so we can compute their dot products as P^T\ P (that’s a convenient way of getting them in one fell swoop):
P^T\ P =\left(\begin{array}{cc} 1&1\\ 1&0\end{array}\right) x \left(\begin{array}{cc} 1&1\\ 1&0\end{array}\right) 
= \left(\begin{array}{cc} 2&1\\ 1&1\end{array}\right)
in fact, we did that in the schur’s theorem post, but i let it slide after saying “but P is not orthogonal”. 
the “2″ says the first vector is of length \sqrt{2}; the diagonal “1″ says the second vector is of length 1. each off-diagonal “1″ says that the cross-term dot products are 1 instead of the 0 we get from orthogonal vectors: that is, if the basis vectors are e_1 and e_2:
1 = e_1\cdot e_2 = |e_1| \ |e_2| \ cos \theta = \sqrt{2}\ cos \theta
so
cos \theta = \frac{1}{\sqrt{2}}\ and \theta = 45{}^{\circ}
“we knew that.”
now, what do these two basis vectors look like wrt the new basis, i.e. wrt themselves? what are their new components? they’re trival wrt themselves:
{(1,\ 0.)}
{(0,\ 1)}
the first new basis vector is 1 times itself, plus no part of the second vector; the second new basis vector is no part of the first vector plus 1 times itself.
how could we compute the euclidean inner product of the new basis vectors using new components? 
we can’t do it by just multiplying together the components. if we did that, we would get the identity matrix, and we would mistakenly conclude that the two basis vectors were orthonormal. 
there’s something going on with the inner product and the transformation to the new basis. (we should get to this, but not today. what we’re looking at is called the induced metric; what we’re about to do is the linear algebra equivalent of lowering indices using the metric tensor g_{ ij} in tensor analysis.)
i want another basis, the so-called reciprocal basis. i want a pair of vectors whose dot products with the new basis are 1 and 0. to put it another way, for this reciprocal basis i want an attitude matrix \alpha such that when i multiply \alpha times P (the rows of the reciprocal basis times the columns of the new basis) i get the identity. bear in mind that i am writing a matrix equation in the original basis, where the inner products come out correctly.
that is, i want
\alpha \ P = I
but P is invertible, inverses are unique, and therefore 
\alpha = P^{-1},
so we want to define the reciprocal basis by that attitude matrix.
\alpha = P^{-1} = \left(\begin{array}{cc} 0&1\\ 1&-1\end{array}\right)
and then i would write the transition matrix for the reciprocal basis as the transpose of its attitude mattrix. call it Q:
Q = \alpha^T = \left(\begin{array}{cc} 0&1\\ 1&-1\end{array}\right)
(yes, that was a trivial computation.) 
we now have three bases: original, new, and reciprocal.
it’s worth noting for future use that the transition matrix for the reciprocal basis is the inverse transpose of the transition matrix of the new basis: 
Q = P^{-T}.
what i have claimed is that: as A is diagonal in the new basis, the transpose of A is diagonal in the reciprocal basis; more, i claim that it’s the very same diagonal matrix B. in our original basis, we have A and its transpose:
A = \left(\begin{array}{cc} 1&1\\ 0&2\end{array}\right)
A^T = \left(\begin{array}{cc} 1&0\\ 1&2\end{array}\right)
they represent L and L* in the original basis. in the new basis, whose transition matrix is P, we knew that A becomes the diagonal matrix B:
B = \left(\begin{array}{cc} 2&0\\ 0&1\end{array}\right)
but that A^T becomes C^T which is not diagonal:
C^T = \left(\begin{array}{cc} 3&1\\ -2&0\end{array}\right)
we also believe that C^T represents L* wrt the new basis; it was defined by the change-of-basis formula for a matrix. now, in the reciprocal basis, whose transition matrix is Q, we compute Q^{-1}\ A^T\ Q to see what A^T becomes.
Q^{-1}\ A^T\ Q = \left(\begin{array}{cc} 1&1\\ 1&0\end{array}\right) x \left(\begin{array}{cc} 1&0\\ 1&2\end{array}\right) x \left(\begin{array}{cc} 0&1\\ 1&-1\end{array}\right)
=\left(\begin{array}{cc} 2&0\\ 0&1\end{array}\right)
which is B, as promised.
to repeat: after we diagonalize A, getting B, we still say that the transpose of B – which is B itself – is the matrix of the adjoint of A. the catch is that we have to be using a different basis, the reciprocal basis, for B^T.
it is so tempting to say the following, that i must:
the adjoint of B is B^T, but wrt a different basis.
there, i’ve said it. it’s terribly sloppy, sloppier than even i am comfortable being. the word adjoint should be reserved for an operator, as the word transpose is reserved for a matrix. but if we started with a matrix A, and we never really got our hands on the operator L, it can be awkward. say what works for you, but be ready to introduce the operator L as soon as you need the clarity.
in our case, A and A^T represent L and its adjoint L* wrt the original basis; B and C^T represent L and L* wrt the new basis; and B represents L* wrt the reciprocal basis. in tabular form, what we have so far is:
picture-8.png
or, A and B represent L wrt the original and new bases; A^T, C^T and B represent L* wrt the original, new, and reciprocal bases.
so what represents L wrt the reciprocal basis?
i hope you answered C, and not just because it wasn’t listed! with that hole filled, we have:
picture-10.png
to check on C, we transform A to the reciprocal basis by computing Q^{-1}\ A \ Q: we belive this is C.
Q^{-1}\ A \ Q = \left(\begin{array}{cc} 1&1\\ 1&0\end{array}\right) x \left(\begin{array}{cc} 1&1\\ 0&2\end{array}\right)x \left(\begin{array}{cc} 0&1\\ 1&-1\end{array}\right)
=\left(\begin{array}{cc} 3&-2\\ 1&0\end{array}\right)
and that is, indeed, the transpose of C^T, which is C:
C = \left(\begin{array}{cc} 3&-2\\ 1&0\end{array}\right)
let me close by saying that if we start with a given matrix M, instead of a linear operator L, the presumption is that M is wrt an orthonormal basis. if that’s not true, someone should have said something.
we started with the matrix A; we never had any other definition of the linear operator it represents.
the pair of operators L and L* are represented by a pair of matrices M and M^T so long as that pair are wrt a basis and its reciprocal basis.
confusing? if a basis is not orthonormal, you’ve got to introduce the reciprocal basis because dot products using components are messed up; and you can always get the matrix of an adjoint L* by transposing the matrix of L, but it’s wrt the reciprocal basis.
if you think you’ve got it, then i have to ask you: what would have happened if we had started with a self-adjoint or normal matrix?

transpose matrix and adjoint operator 1


recall the post about schur’s lemma. i provided an example of a matrix A which could be diagonalized (to B) but not by an orthogonal transition matrix. A is not a normal matrix: it does not commute with its transpose:

A\ A^T \neq A^T \ A.

it was the monday morning after the post when i woke up thinking, “but B is a diagonal matrix, it is its own transpose… 

B^T = B 

and therefore it is trivially normal:

B \ B^T = B^2 = B^T \ B,

but i know that’s wrong.”

that is, i seem to have converted A into a normal matrix, but that’s wrong because normality should be a property of the operator represented by A or B. if one matrix is normal, the other should be, since they represent the same operator. (now may be a good time to remind you that the whole point of the change-of-basis equation for square matrices

B = P^{-1}\ A \ P

is that the matrix B represents the same linear operator L as A does, but taken wrt a new basis whose transition matrix is P.)

what’s going on here? A and B represent the same non-normal operator L; how is B its own transpose?  that’s the wrong question. correct is, how can B represent both the operator and its adjoint?

i wish i could say that the answer sprang full-blown into my head right after i said all that, but it didn’t. i knew three pieces of the answer, and as soon as i saw the complete answer in halmos, i understood it.

i am going to be a little sloppy. when i refer to the adjoint of a matrix M, i really mean the adjoint operator whose matrix is M wrt some basis. OTOH, the term transpose refers only to a matrix, and means exactly what it has always meant: interchange the rows and columns. in everything that follows, we have only two linear operators (an operator and its adjoint), which for one brief shining moment i will call L and L*. what i have given you is that wrt the original basis, they are represented by matrices A and A^T.

recall A…

\left(\begin{array}{cc} 1&1\\ 0&2\end{array}\right)

its transpose is…

\left(\begin{array}{cc} 1&0\\ 1&2\end{array}\right)

we found a transition matrix P which diagonalizes A; it also defines what i’m calling the new basis. we compute 

B = P^{-1}\ A \ P

= \left(\begin{array}{cc} 0&1\\ 1&-1\end{array}\right) x \left(\begin{array}{cc} 1&1\\ 0&2\end{array}\right) x \left(\begin{array}{cc} 1&1\\ 1&0\end{array}\right)

= \left(\begin{array}{cc} 2&0\\ 0&1\end{array}\right)

giving us what we called B. since B is diagonal, it is its own transpose. the problem is that B appears to be not only normal, but in fact self-adjoint. but it can’t be, because A isn’t normal.

the first question that comes to my mind is, what does the transition matrix P do to A^T? let’s call the result C^T. (not C; that would lead to a notational nightmare down the road.) since A^T represents the adjoint of A wrt the original basis, then C^T must represent the adjoint of A wrt the new basis.

we compute C^T = P^{-1}\ A^T\ P:

= \left(\begin{array}{cc} 0&1\\ 1&-1\end{array}\right) x \left(\begin{array}{cc} 1&0\\ 1&2\end{array}\right) x \left(\begin{array}{cc} 1&1\\ 1&0\end{array}\right)

= \left(\begin{array}{cc} 3&1\\ -2&0\end{array}\right)

and that’s not B. it’s not supposed to be. somehow.

so the transpose matrix B^T is not always the adjoint operator? what’s going on?

i knew three things.
  • one, that whenever i have a basis (in this case, the new basis) which is not orthonormal, i have to jump through a hoop to compute the dot product of two vectors. i have to introduce what’s called the reciprocal basis, and compute the components of one vector (either one, but only one) wrt the reciprocal basis. then i compute the dot product in the usual way, but using the components wrt the new basis for one vector and wrt the reciprocal basis for the other vector. (you may have seen this in tensor analysis as “lowering an index”.
  • two, an orthonormal basis is its own reciprocal basis.
  • three, the transpose of a matrix corresponds to the adjoint operator if the matrix is wrt an orthonormal basis. i knew that, but i wasn’t comfortable with it. nobody wrote it, at least not where i saw it.

here’s the truth: we can always form the matrix of the adjoint operator \left( B^T\right) by transposing the matrix B of the operator; but that transpose matrix is wrt a different basis, the so-called reciprocal basis; hence we get the same basis as a special case if the basis is orthonormal.

it would help if i gave you the definition of the adjoint operator. in an inner product space, the adjoint operator L* of a linear operator L (operators, not matrices) is defined by

L\ x\cdot y = x\cdot L^*\ y

where \cdot indicates the inner product (dot product) of two vectors. so, the definition of the adjoint operator hinges on the inner product.

(to be quite explicit about the order of operations, \left(L\ x\right)\cdot y = x\cdot \left(L^*\ y\right), although no other interpretation holds water. on the left, L does not apply to the real number x \cdot y; and on the right, x \cdot L^* is not defined, not for a vector and an operator.)

trust me, we could show that wrt any orthonormal basis, the matrices of L and L* are transposes. the closest thing to a challenge in working that out is to define the matrix of an operator wrt a basis.

what i claim is that the matrices of L and L* are always transposes, provided they are taken wrt two different bases, each being called the reciprocal basis of the other. if the basis is orthonormal, it is its own reciprocal basis; and in that case only, the matrices are wrt the same basis.

we need the reciprocal basis because the computation of the dot product of two vectors cannot be just the usual multiplication of corresponding components and summing. the definition of the adjoint operator requires the inner product; writing that in matrices essentially requires that we change the matrix of the inner product from the identity to whatever it is in the new non-orthonormal basis.

all i’m i’m going to do next is show you how to get the reciprocal basis and confirm that it diagonalizes A^T to the matrix B which is the diagonalized matrix A.