Color: decomposing the RGB A transpose matrix

(abuse of) terminology

Sometimes I get tired of writing “xbar, ybar, zbar tables” — and I just write “xyz bar tables” or even “XYZ tables”. Similarly for rbar, gbar, bbar tables — rgb bar. I’m not talking about anything new, just abbreviating the names.

Introduction

This is the third post about the example on p. 160 of W&S. Once again, I am going to decompose the reflected spectrum into its fundamental and its residual.

This time, however, I’m going to use the rbar, gbar, bbar tables (RGB) instead of the xbar, ybar, zbar (XYZ) tables. They did not do this.

I’ll tell you now there is one little twist in these calculations. We will need the ybar table, because we still need to use it to scale our results.

In addition to showing, as I did previously, the dual basis spectra and the orthonormal basis spectra for the non-nullspace, I will display the orthonormal basis for the nullspace.

As usual, I am using the CIE 1931 tables, and I am working at 20 nm intervals.

Our ingredients were: a reflectance spectrum, an illiminant spectrum, and some choice of color matching functions. For this post, I am choosing a different set of color matching functions.

Our results in the previous post were a few sets of 3 numbers. The final result was called xyz — meaning the triple (x, y, z) — for plotting a point on the CIE chromaticity chart. We will not do this again.

That result came from a triple denoted XYZ, which are values in XYZ color-space. That space has the advantage of being device-independent. We could — and did — convert those values to an RGB color space, specifically the one defined by the rgb bar (i.e. rbar, gbar, bbar) color-matching tables.

This post will compute the RGB values directly, and we will find that we get almost exactly the same RGB values. (We really ought to.)

The XYZ result in turn came from a triple for which I have no convenient notation, but I keep saying without proof, and confirming by computation, that that triple is the components of the fundamental of the reflected spectrum wrt the basis dual to the chosen color-matching functions.

Let us go see that, this time for the rbar, gbar, bbar tables instead of xbar, ybar, zbar.

The same old calculations, but with a twist

But first, let me review the calculations. Feel free to skip to the next section.

Our first computation is to get the reflected spectrum, which is the pointwise product of the illuminant spectrum and the reflectance spectrum. For this problem, the reflected spectrum is the perceived spectrum, the input to the eye of “the standard observer”. The data itself is in the fourth case of the first post about this example.

Here is what the three of those spectra looked like:

g1107a

As before, I’ve decided to avoid red, green, and blue — except where they are appropriate. For this post, they will be appropriate in some cases. The illuminant is shown in white, the reflectance spectrum is shown in purple, and the reflected spectrum is shown in yellow.

Having said that, I am going to use red, green, blue colors for the three columns of our A matrix, because these entires are rbar, gbar, bbar values. These are the 1931 color matching functions (i.e. a subset, at 20 nm intervals)

g1107b

BTW, we can tell that we are using RGB (i.e. rbar, gbar, bbar tables) because one of these functions is clearly negative over part of its range. In fact all of them are negative somewhere, but it’s only obvious for the red one.

Since I’ve just introduced new data, let me display it:

A = \left(\begin{array}{ccc} 0.00003 & -0.00001 & 0.00117 \\ 0.0003 & -0.00014 & 0.01214 \\ 0.00211 & -0.0011 & 0.11541 \\ -0.00261 & 0.00149 & 0.31228 \\ -0.02608 & 0.01485 & 0.29821 \\ -0.04939 & 0.03914 & 0.14494 \\ -0.07173 & 0.08536 & 0.04776 \\ -0.09264 & 0.17468 & 0.01221 \\ -0.03152 & 0.21466 & 0.00146 \\ 0.0906 & 0.19702 & -0.0013 \\ 0.24526 & 0.1361 & -0.00108 \\ 0.34429 & 0.06246 & -0.00049 \\ 0.29708 & 0.01828 & -0.00015 \\ 0.15968 & 0.00334 & -0.00003 \\ 0.05932 & 0.00037 & 0 \\ 0.01687 & 0.00003 & 0 \\ 0.0041 & 0 & 0 \\ 0.00105 & 0 & 0 \\ 0.00025 & 0 & 0 \\ 0.00006 & 0 & 0 \\ 0 & 0 & 0\end{array}\right)

Our second computation (or third, this order doesn’t matter) should be to apply the matrix A^T — the (transpose of) the color matching functions A — to the reflected spectrum S:

A^T\ S = \{70.5889,\ 61.249,\ 47.8412\}\ .

As before these numbers are components of (the fundamental of) the reflected spectrum wrt the basis dual to A^T\ . In Cohen’s notation, these 3 numbers are the components wrt the columns of the E matrix.

The other computation, whether we do it second or third, is to find the constant for normalizing things, using the illuminant spectrum; that is, to set the white point (1,1,1) in our color space.

Here’s the twist. We still have to use the second column of the XYZ (xyz bar) tables; not one of the columns of our new A is the total cone response. (Nor is the sum of them, and I’ll confess that I don’t know why. My WAG (my guess) is that the three color matching functions each have to be weighted differently in order to sum to the total cone response.)

The number we got by applying the second column of the XYZ tables to the illuminant was 529.764 .

Now we move on to the fourth computation: we divide our 3 values A^T\ S by that to get RGB.

RGB = {0.133246, 0.115616, 0.0903067}.

It is encouraging to see that these RGB values are in [0, 1]. These are tristimulus values in an RGB color space, namely the one I keep showing on the chromaticity chart.

Let me be clear. The red, green, and blue dots on this diagram are the projections of (1,0,0), (0,1,0) and (0,0,1) onto xy-space. The values I just got are in this RGB space.
CIE with orig RGB

Oh, in the previous post we transformed the computed XYZ values into RGB; what did we get?

{0.133246, 0.115616, 0.0903103}

So. Almost exactly the same answers: only the B value differs. That’s good. I said I would confirm this equivalence.

The blue value is off a tiny bit. (The simple fact is that the two sets of tables — xyz bar and rgb bar — and the transition matrix between them are not consistent. In fact, it appears that the columns of the two tables do not quite span the same 3D space. If I’m right, I could fix this by using one table and the transition matrix to construct a very slightly different version of the second table. But bear in mind that I am printing far more digits than W&S showed.)

the dual basis

Now, let’s get the dual basis E. We recall that instead of

E^T = A^{-1}\ ,

we use the pseudo inverse and write

E^T = (A'A)^{-1}\ A'\

It is easier to work with E^T but easier to display E:

E = \left(\begin{array}{ccc} 0.000561675 & -0.000849316 & 0.00535311 \\ 0.00583075 & -0.00906171 & 0.0555629 \\ 0.0527362 & -0.0836627 & 0.52778 \\ 0.110419 & -0.184984 & 1.42203 \\ 0.0130902 & -0.0537785 & 1.34031 \\ -0.149518 & 0.244672 & 0.616947 \\ -0.311401 & 0.664664 & 0.135783 \\ -0.492205 & 1.35497 & -0.0912093 \\ -0.339342 & 1.5831 & -0.142044 \\ 0.0806418 & 1.32314 & -0.0962321 \\ 0.656427 & 0.713092 & 0.00326589 \\ 1.0634 & 0.072223 & 0.0910496 \\ 0.956881 & -0.194913 & 0.10246 \\ 0.521479 & -0.151607 & 0.0594277 \\ 0.194688 & -0.0626132 & 0.0226816 \\ 0.0554502 & -0.0183495 & 0.00649827 \\ 0.0134844 & -0.00451219 & 0.00158394 \\ 0.00345331 & -0.00115556 & 0.000405644 \\ 0.000822217 & -0.000275134 & 0.0000965819 \\ 0.000197332 & -0.0000660321 & 0.0000231797 \\ 0. & 0. & 0.\end{array}\right)

Check that we do have E^T\ A = I\ . I didn’t say it last time, so let me emphasize: I didn’t prove that the pseudo-inverse would give me the dual basis. This easy “check” is the real confirmation that my recipe actually works. This check is crucial, not merely convenient.

Let me call the basis vectors E1, E2, E3, and let’s look at them. In fact, let me juxtapose the dual basis (right) with the color matching functions (left).

g1107c

If we multiply each of those dual basis vectors (E1, E2, E3) by our original three numbers…

{70.5889, 61.249, 47.8412}

and add the results.. we get numbers…

{0.243727,2.51476,23.848,64.4959,61.7521,33.9471,25.2246,
43.8826,66.214,82.1295,90.1689,83.8434,60.5088,30.3679,
10.993,3.10116,0.751257,0.192395,0.0458083,0.010994,0.}

and a picture…

g1107e

where I have shown the sum in white.

Let’s look quickly at the reflected spectrum (in yellow) and that sum (still in white):

g1107f

Yes, we’ve seen that before. And as before, that white curve is the fundamental. It differs from the yellow curve by a metameric black, a spectrum in the null space of A^T\ .

As before, we could stop here. We have gotten RGB values from the reflected spectrum… we have found the dual basis… we have found the fundamental of the reflected spectrum… we know that the first three numbers we computed (A^T\ S\ ) are the components of the fundamental wrt the dual basis.

The SVD, orthonormal basis, and projection operator

Now let’s do the SVD (Singular Value Decomposition), because we can get a nice orthonormal basis for the entire domain of A^T\ including its null space, i.e. for spectra including the metameric blacks.

As before, we will get an orthonormal basis v1 for the pre-image of A^T\ , then a projection operator R onto that pre-image, then recompute the fundamental by applying the projection operator to the reflected spectrum.

First, let

A^T = u\ w\ v^T\ .

(The first time I did this, I let B = A^T\ . Now we’re more experienced, but if you want to use B instead of A^T\ , feel free.)

As always, our vectors have length 21 because we used 20 nm intervals from 380 to 780 nm. Then, the leftmost 3 columns of v are an orthonormal basis for the pre-image of A^T\ , and the rightmost 18 (i.e. 21-3) are an orthonormal basis for the nullspace of A^T\ .

Here are the leftmost 3 columns of v, denoted v1 as is my habit:

v1 = \left(\begin{array}{ccc} 0.000242117 & 0.00237304 & 0.000773655 \\ 0.00254521 & 0.0246016 & 0.00811386 \\ 0.0253371 & 0.23387 & 0.0760077 \\ 0.0806111 & 0.633731 & 0.188135 \\ 0.110853 & 0.608184 & 0.12797 \\ 0.102268 & 0.306133 & -0.04696 \\ 0.0972532 & 0.129667 & -0.243487 \\ 0.0880827 & 0.101349 & -0.512971 \\ -0.0317461 & 0.111471 & -0.580417 \\ -0.228205 & 0.117862 & -0.449608 \\ -0.460923 & 0.112998 & -0.181177 \\ -0.596203 & 0.0923988 & 0.0811638 \\ -0.500375 & 0.0613641 & 0.162946 \\ -0.266387 & 0.0296408 & 0.104503 \\ -0.0986158 & 0.0105714 & 0.0410965 \\ -0.0280157 & 0.00296645 & 0.0118833 \\ -0.00680593 & 0.000717077 & 0.00290704 \\ -0.00174298 & 0.000183642 & 0.000744487 \\ -0.000414996 & 0.0000437242 & 0.000177259 \\ -0.000099599 & 0.0000104938 & 0.0000425421 \\ 0. & 0. & 0.\end{array}\right)

OK, I have to know: did v1 change? Yes. I’ll show you, soon.

Let me juxtapose the orthonormal basis v1 (right) with the dual basis E vectors. Both bases span the same space, the pre-image of A^T\ , but they are different.

g1107g

And in this case, I can’t see that any vectors in one basis look similar to vectors in the other basis.

Let me recall an image from the previous post: here are the dual basis E and the o/n basis v1 from the previous post. This is the old counterpart to the preceding drawing.

g1107h

And the right hand pieces of those two drawings are the new v1 and the old v1. We can see that they are quite different.

Having an orthonormal basis for the preimage of A^T\ , we know that we can construct a projection operator R from the domain of A^T\ onto the preimage; we just take the product of v1 and v1^T\ in the other order:

R = v1\ v1^T\ .

It’s 21×21, as before, so I’m not going to show it to you. But, for checking, I’ll show you the first column:

g1107i

And I’ll tell you that the last column, as before, is identically 0.

the fundamental and the residual

So. Apply the projection operator R to the reflected spectrum S to get the fundamental f…

f = {0.243727,2.51476,23.848,64.4959,61.7521,33.9471,25.2246,
43.8826,66.214,82.1295,90.1689,83.8434,60.5088,30.3679,
10.993,3.10116,0.751257,0.192395,0.0458083,0.010994,0.}

and subtract to get the residual n = S – f…

n = {4.85627,26.2996,19.116,-14.6684,-7.32845,18.6715,27.0686,
15.2246,3.00319,-13.0295,-23.7795,-12.7434,13.5977,41.0282,
59.1236,63.6887,60.2519,52.106,59.2832,33.0258,39.625}

Did we get the same fundamental? They are very close. Again, I think the real problem is that the two tables are not exactly consistent. Here’s

old f minus new f =

{-0.00362232,0.000221807,0.00281357,-0.00101889,-0.00138667,
-0.00346164,-0.00173991,-0.00126245,-0.00325203,-0.000486578,
0.00271479,0.00422385,0.00120929,-0.00248099,-0.00107542,
0.0013427,0.00223839,-0.00306751,0.00309787,0.00376061,0.}

We already know that the fundamental is the same whether we compute it using the dual basis or using the projection operator. Let me illustrate that the fundamental computed using RGB is the same as the fundamental computed using XYZ.

I have plotted points using the previous fundamental, and I have plotted a piecewise linear curve using the new fundamental:

g1107j

I should probably emphasize that although this drawing looks like one from the previous post, it illustrates something else. The apparently identical drawing in the previous post showed the equivalence of the fundamental computed two ways using XYZ; this drawing shows the equivalence of the fundamental from the previous post with the fundamental from this post.

And the null? Close, too. We have

old n minus new n =

{0.00362232,-0.000221807,-0.00281357,0.00101889,0.00138667,
0.00346164,0.00173991,0.00126245,0.00325203,0.000486578,
-0.00271479,-0.00422385,-0.00120929,0.00248099,0.00107542,
-0.0013427,-0.00223839,0.00306751,-0.00309787,-0.00376061,0.}

We can check that n is in the nullspace of A^T\ by computing A^T\ n\ :

{-6.77791*10^-14, 6.72509*10^-15, -3.50254*10^-14}.

We should also check that the fundamental has the same 3 components as the entire reflected spectrum. That is, we had

A^T\ S = \{70.5889,\ 61.249,\ 47.8412\}\ .

for A^T\ applied to the reflected spectrum. If we apply A^T\ to the fundamental, we get

A^T\ f = \{70.5889,\ 61.249,\ 47.8412\}.

The same. As they should be.

You know what? There’s another pair of calculations I can do, to make a point.

What are the components of the fundamental wrt the orthonormal basis v1? We just dot f into the v1 matrix, because its columns are orthonormal basis vectors:

f \cdot v1 =\
{-129.491,141.916,-79.7421,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}.

That confirms that no part of the fundamental lies in the nullspace. That was the whole point of the projection operator.

And what are the components of the residual? I hope you’ve got it: exactly the first three are zero:

n \cdot v1 =\
{0,0,0,-24.6623,-15.3786,17.8955,33.3365,29.6755,19.4414,
-0.412828,-18.9652,-15.4907,8.59491,37.8598,57.8821,63.3301,
60.1642,52.0835,59.2778,33.0245,39.625}.

The decomposition looks the same as in the previous post. The reflected spectrum is in yellow, the fundamental in white, and their difference n, the residual, is in black.

g1107j2

NOTE that although reflected and fundamental are non-negative, the residual is not. The good news is that, in principle, we could generate the fundamental spectrum, as well as the reflected. But I can’t imagine how to generate a spectrum with negative values.

We face this problem not just for the residual, but for just about every basis vector we’ve computed!

(The middle vector of the orthonormal basis, i.e. the middle column of v1, is the only exception so far; it is everywhere non-negative. But that’s only for the orthonormal basis for the RGB color matching functions; for the previous post, every column of v1 has some negative values.)

Because the basis vectors are, frankly, weird spectra, don’t expect anything nice if you try to find their RGB, XYZ, or xyz values. Most of them lie outside the CIE chromaticity chart! Incidentally, I have computed Cohen’s F1 and F2 bases for this case, and some of them are still negative, too. Well, they have to be: things have to cancel out so that the dot products among the orthonormal vectors are actually zero.

But while we’re looking at the linear algebra, let’s take a look at the orthonormal basis for the null space.

the null space basis

Now for something new.

We have 18 basis vectors (metameric black spectra) for the nullspace. It’s rather confusing to look at all of them… it’s not much better to look at a few of them close together. In the top left, we have the 1st 5th, 9th, and 13th (1,5,9,13)… the top right is 2,6,10,14… bottom left is 3,7,11,15… bottom right is 4,8,12,16:

g1107j3

There are two more to show (17 & 18) and I thought I show a few more adjacent pairs, too. I have added 1 & 2 and 9 & 10:

g1107k

While I’m playing around…. What about four spectra centered around 500? And add them up, to get another metameric black.

g1107m

Let me do one last thing.

Graphs at 5 nm

What would the bases look like if I used the full 5 nm tables? The fundamental depended on the reflected spctrum, but the bases depend only on my choice of color matching functions. So, let me show them for the 1931 CIE rgb bar tables at 5nm intervals.

Here are the orthonormal basis vectors “v1″ — at 20 nm on the left, 5 nm on the right:

g1107n

And the dual basis vectors “E”:

g1107o

And the dual basis “E” juxtaposed with the orthonormal basis “v1″:

g1107p

I might as well do this for the XYZ color matching functions, too. The more detailed curves are from the 1931 xyz bar tables, at 5 nm intervals.

Here are the orthonormal basis vectors “v1″:

g1107p2

The dual basis vectors “E”:

g1107q

And the dual basis “E” juxtaposed with the orthonormal basis “v1″:

g1107r

Color: decomposing the A transpose matrix and the reflected spectrum

This is the second post about the example on p. 160 of W&S. Here’s the first. (Incidentally, I am going to decompose the reflected spectrum into its fundamental and its residual. I do not think that they did this. Oh, this is what I did for Cohen’s toy example, but now it’s for real. Approximate, but for real.)

review the XYZ for this

Let’s review one case from the previous post: 1931 CIE at 20 nm intervals. As is my custom, I took a subset of their values — I did not average the numbers to account for the data I omitted. If this approximation isn’t good enough, simply go to 5 nm or 1 nm intervals.

Our ingredients were: a reflectance spectrum, an illiminant spectrum, and some choice of color matching functions.

Our results were a few sets of 3 numbers. The final result was called xyz — meaning the triple (x, y, z) — for plotting a point on the CIE chromaticity chart.

That result came from a triple denoted XYZ, which are values in XYZ color-space. That space has the advantage of being device-independent. We could — and will shortly — convert those values to an RGB color space, specifically the one defined by the CIE 1931 rgb bar (i.e. rbar, gbar, bbar) color-matching tables.

That XYZ result in turn came from a triple for which I have no convenient notation, but I keep saying without demonstration that that triple is the components of the fundamental of the reflected spectrum wrt the basis dual to the chosen color-matching functions. (Try saying that with a mouthfull of rocks!)

Let us go see that.

But first, let me review the calculations. Feel free to skip to the next section.

Our first computation is to get the reflected spectrum, which is the pointwise product of the illuminant spectrum and the reflectance spectrum. For this problem, the reflected spectrum is the perceived spectrum, the input to the eye of “the standard observer”.

Here is what the three of those spectra looked like:

g1102a

For this post, I’ve decided to avoid red, green, and blue, except where it is really appropriate. The illuminant is shown in white, the reflectance spectrum is shown in purple, and the reflected spectrum is shown in yellow.

For this example, I had chosen the 1931 xyzbar tables for our color functions, and they looked like this:

g1102b

Once again, I’ve changed the colors. X is orange, Y is white, and Z is yellow. It might help if you remember that the Y column of this A is nothing like green, being instead the “photopic response” — the total energy received by the cones — of the standard observer; white seems a sensible color for it.

BTW, we can tell that we are using XYZ (i.e. xbar, ybar, zbar tables) because these functions are never negative; we recall that each of the rbar, gbar, bbar color matching functions has some negative values.

Our second computation (or third, it doesn’t matter) should be to apply the matrix A^T\ — the (transpose of) the color matching functions A — to the reflected spectrum S, getting:

A^T\ S = \{356.817,\ 354.641,\ 271.109\}

I claim that these numbers are components of the fundamental of the reflected spectrum wrt the basis dual to A^T\ . In Cohen’s notation, these 3 numbers are the components wrt the columns of the E matrix.

Please be patient. We’ll get there real soon now.

The other computation, whether we do it second or third, is to normalize things using the illuminant spectrum; that is, to set the white point (1,1,1) in our XYZ space.

We did this easily because the second column of A — the second row of A^T\ — is the total cone response — the photopic response — of the standard observer.

Here’s the second row of A^T\ applied to D65: 529.764

Now we move on to the fourth computation: we divide our 3 values by that to get X,Y,Z.

XYZ = {0.673539, 0.669432, 0.511754}

NOTE that the XYZ values are in [0, 1]. These are tristimulus values in a sort-of color space. (Let’s see. It is not HSI, even though that middle number is an intensity. In fact, it does not directly specify a color; that’s done by a change-of-basis. Maybe I should call it a pre-color space? Maybe I’m getting carried away here.)

And then our fifth computation is to get x,y,z. Here’s the sum X+Y+Z: 1.85472

and we divide XYZ by it to get xyz values: xyz = {0.363148, 0.360933, 0.275919}.

Where was that on the chromaticity chart? The yellow dot.

g1102c

All that was done in the previous post. Now let’s move on.

Moving on.

Get the RGB

Let’s do one last thing before putting this down. I want to get RGB values from these XYZ. I have in my hands, and have shown you before, the transition matrix T31 (having once needed the 1964 CIE, I have renamed some things)…

T31 = \left(\begin{array}{ccc} 2.76888 & 1.75175 & 1.13016 \\ 1 & 4.5907 & 0.0601 \\ 0 & 0.05651 & 5.59427\end{array}\right)

It treats XYZ as old, RGB as new, so we have

XYZ = T31 RGB,

so

RGB = T31^{-1}\ XYZ = \{0.133246,\ 0.115616,\ 0.0903103\}\ .

We’ll come back to these in the next post. These should be RGB color-space values using the 3 sources shown on the CIE chromaticity diagram; alternatively, these should be the RGB values computed directly from the rbar, gbar, bbar tables. That’s what I’ll confirm in the next color post.

Moving on, finally, there are two things we need to do. Actually, we might get away with only one of them, and that’s the dual basis, so let’s go get it first.

getting the dual basis E.

As I have said before, given a basis with transition matrix A we can find a basis with transition matrix E dual to it by requiring that the dot products of the vectors satisfy

E^T A = I\ .

That is, by laying out the E vectors as rows and the A vectors as columns, we compute all of the possible dot products as E^T A\ .

A digression on dual bases

I’ll remind you that we are working in finite dimensional vector spaces, and the dual basis and reciprocal basis look exactly the same. (So if you need to review, look at my posts on reciprocal bases. See the tag cloud.) But I think it’s worthwhile to think of reciprocal bases as living in the same vector space, while dual bases live in a vector space V and its dual V*.

Oh, the dual space V* is the vector space of all linear functionals whose domain is V, i.e. all real-valued linear operators from V to R.

I’ll also remind you that although one usually starts with a basis in V and finds “the dual basis” in V*, we are going backwards. The rows of A^T\ are a basis for V*, and we wish to find a matrix E whose columns are a basis for V, and more, a dual basis to the rows of A^T\ .

What a tangled knot! In my head I was going backwards, but I computed it the other way! What I really computed was a basis E^T dual to A — but that is the same as finding a basis E dual to A^T\ . The two equations

E^T A = I

and

A^T E = I

are equivalent. I chose to solve the first instead of the second, even though I was thinking of the second. (Aside: it’s a very good thing that the dual of the dual space is the original space, for finite dimensional vector spaces.)

Finally, I suppose I should say again that although we often compute the dot product of two vectors (u and v, say) by writing one as a row (say u) and the other (v) as a column, we may view that matrix product as the effect of applying the linear functional u to the vector v. And what I did was the equivalent of talking about u’v but computing the equivalent, v’u.

Why might we view it that way? Among other things, we can apply u to v without even having a dot product!

Let’s return to finding the dual basis E.
end digression

If A is square and of full rank — i.e. we really do have a basis, hence a linearly independent set of vectors — then we can invert A and write

E^T A = I

E^T = A^{-1}\ .

In this case, however — with A our color matching functions –as we first saw in the first Cohen post, A is not square and therefore cannot possibly be invertible. The 3×3 matrix A’A (A^T A\ ), however, is invertible, and we construct the pseudo-inverse

(A'A)^{-1} A^T

and substitute it for A^{-1}\ , getting

E^T = A (A'A)^{-1}\ .

(We remember that A’A is symmetric, i.e. its own transpose.)

BTW, it is very convenient to write the equation that way, because to check that we have a dual basis we want to compute E^T A\ . We might as well have found E^T first.

When we check it, we see that E^T A = I\ .

Let me call the three dual basis vectors E1, E2, E3, and let’s look at them. Here they are in columns…

E = \left(\begin{array}{ccc} 0.000358806 & -0.000360371 & 0.000884809 \\ 0.00330737 & -0.00334373 & 0.00929769 \\ 0.0302172 & -0.0308581 & 0.0885709 \\ 0.0643748 & -0.0678399 & 0.241909 \\ 0.0116003 & -0.0190679 & 0.237446 \\ -0.0843071 & 0.0839105 & 0.126398 \\ -0.190772 & 0.216831 & 0.060478 \\ -0.329572 & 0.420346 & 0.0457603 \\ -0.286455 & 0.453811 & 0.0276004 \\ -0.0869933 & 0.321465 & -0.00307957 \\ 0.209686 & 0.0758441 & -0.0425853 \\ 0.438522 & -0.150742 & -0.0706912 \\ 0.418257 & -0.20126 & -0.0640185 \\ 0.232068 & -0.121142 & -0.0349616 \\ 0.0872125 & -0.0467666 & -0.0130635 \\ 0.0249132 & -0.0134701 & -0.00372503 \\ 0.00608984 & -0.00330711 & -0.000909677 \\ 0.00157141 & -0.000868448 & -0.000233816 \\ 0.000348975 & -0.000172577 & -0.0000531557 \\ 0.0000923158 & -0.0000402793 & -0.0000143874 \\ 0. & 0. & 0.\end{array}\right)

… and here is what they look like (yellow, violet, orange in order):

g1102d

If we multiply each of those spectra by our original numbers {356.817,354.641,271.109}… and add the results.. we get

Vdb = {0.240105,2.51498,23.8508,64.4948,61.7507,33.9437,25.2229,43.8813,
66.2108,82.129,90.1716,83.8477,60.51,30.3654,10.9919,3.10251,0.753495,
0.189327,0.0489062,0.0147546,0.}

which looks like this (you can see that the orange one shrank a little compared to the other two):

g1102e

where I have shown the scaled basis vectors (i.e. the scalar component times each vector), and added their sum in white.

Let’s look quickly at the reflected spectrum (in yellow) and that sum (in white):

g1102f

They are quite different. What’s going on?

What does the dual basis describe?

Well, look at the long wavelengths of the white curve. We — i.e. the standard observer — can’t see that part of the reflected spectrum at all. Parts of the reflected spectrum are invisible to us.

Duh. Been there, done that. “Metameric black”.

What we have — a basis of 3 vectors — is a basis for the visible-to-us part of the reflected spectrum — i.e. a basis for what we called the fundamental. We have in fact found that part of the reflected spectrum which the standard observer actually sees; and by subtraction we could find what part of it is– let’s call it the residual — what we have called a metameric black. (“Metameric” because it is a nonzero spectrum; it is perceptually black, but it is not physically the absence of light.)

We have the fundamental; it’s that white curve. But I haven’t really proved that.

It’s not hard. By definition, each of the 3 vectors in the dual basis is in the pre-image of A^T\ . And, since A^T\ has 3 rows, it is of rank at most 3, so its pre-image is of dimension at most 3. In fact, we know that this A^T\ is of full rank, hence of rank 3, and therefore its pre-image is of dimension 3.

If the 3 dual basis vectors are linearly independent (i.e. the E matrix is of full rank 3), they span a 3D space which is a subspace of the 3D pre-image — so they span the entire pre-image.

And I’m certain that A^T\ being of full rank guarantees that the dual basis is of full rank.

But exactly how do I prove that? Oh, the pre-image of A^T\ is the range of A, hence of dimension 3.

Do you find it convincing?

Let’s look at it another way. But remember that we really did just get the fundamental f of the reflected spectrum S, and we could compute the residual n by subtraction; the residual is the reflected spectrum minus the fundamental.

n = S – f.

An orthonormal basis for fundamental spectra.

For the other way, we need a really nice basis for the entire domain of the A^T\ matrix, one which decomposes its domain into nullspace and not-nullspace (i.e. nullspace and preimage of the range). But we know how to do that.

The SVD (Singular Value Decomposition).

We will get an orthonormal basis v1 for the pre-image of A^T\ , then a projection operator R onto that pre-image, then get the fundamental by applying the projection operator to the reflected spectrum.

First, let

A^T\ = u\ w\ v^T\ .

(The first time I did this, I let B = A^T\ . Now we’re more experienced, but if you want to use B instead of A^T\ , feel free.)

Now, v is 21×21. Then the leftmost 3 (= rank of A^T\ ) columns of v are an orthonormal basis for the preimage of A^T\ , and the rightmost 18 (= 21-3) are an orthonormal basis for the nullspace of A^T\ .

Here are the leftmost 3 columns of v:

v1 = \left(\begin{array}{ccc} -0.00201177 & 0.00143737 & 0.000374614 \\ -0.0210175 & 0.0149756 & 0.00335666 \\ -0.199597 & 0.142652 & 0.0305073 \\ -0.539263 & 0.385948 & 0.0611556 \\ -0.513526 & 0.367224 & -0.00344907 \\ -0.258102 & 0.159433 & -0.11971 \\ -0.121682 & -0.0133979 & -0.265642 \\ -0.134798 & -0.177324 & -0.481195 \\ -0.193185 & -0.30181 & -0.471062 \\ -0.246728 & -0.377774 & -0.254046 \\ -0.283178 & -0.410206 & 0.0979204 \\ -0.273373 & -0.377298 & 0.391838 \\ -0.201477 & -0.270573 & 0.408489 \\ -0.102016 & -0.135415 & 0.232377 \\ -0.0370629 & -0.0489598 & 0.0880787 \\ -0.0104736 & -0.0138137 & 0.0252276 \\ -0.00254533 & -0.00335417 & 0.00617542 \\ -0.000641283 & -0.000842035 & 0.00160261 \\ -0.000163267 & -0.00021855 & 0.000343642 \\ -0.0000487128 & -0.0000661713 & 0.0000876566 \\ 0. & 0. & 0.\end{array}\right)

We should look at those. First alone:

g1102g

… and then juxtaposed with the dual basis:
g1102g2

OK, the orthonormal basis is different from the dual basis. (Actually, one pair look similar. I’ll take a look in a moment.) These vectors (spectra) are orthonormal, the dual basis is not. (To check that, just compute v1^T v1\ ; it’s a 3×3 identity matrix.)

Let me look at the two basis vectors that were similar. From the dual basis, the first vector; from the orthonormal basis, the 3rd vector.

g1102h

Interesting. I do not know that this result has any significance, but what’s the point of drawing pictures and not looking at them?

Let’s continue.

The projection operator R.

Having an orthonormal basis for the preimage of A^T, we know that we can construct a projection operator R from the domain of A^T\ onto the preimage; we just take the product of v1 and v1^T in the other order:

R = v1\ v1^T

It’s 21×21, so I’m not going to show it to you. But, for checking, I’ll show you the first column:

g1102 R

And I’ll tell you that the last column is identically 0.

The fundamental, again? and more.

So. Apply the projection operator R to the reflected spectrum S to get the fundamental f…

R S = f = {0.240105,2.51498,23.8508,64.4948,61.7507,33.9437,25.2229,43.8813,
66.2108,82.129,90.1716,83.8477,60.51,30.3654,10.9919,3.10251,0.753495,
0.189327,0.0489062,0.0147546,0.}

and subtract to get the residual n = S – f:

S – f = n = {4.85989,26.2994,19.1132,-14.6673,-7.32707,18.6749,27.0703,15.2259,
3.00644,-13.029,-23.7822,-12.7477,13.5965,41.0307,59.1247,63.6874,
60.2497,52.1091,59.2801,33.022,39.625}

We can check that n is in the nullspace of A^T\ by computing A^T\ n\ :

A^T\ n = \{-1.01618*10^-13,\ -1.09023*10^-13,\ 4.93919*10^-15\}

We should also check that the fundamental has the same 3 values as the entire reflected spectrum. That is, we had

A^T\ S = \{356.817,\ 354.641,\ 271.109\}

for A^T\ applied to the reflected spectrum. If we now apply A^T\ to the fundamental, we get

A^T\ f = \{356.817,\ 354.641,\ 271.109\}

The same. Of course. (I hope you think so.)

Let’s look at the decomposition. The reflected spectrum is in yellow, the fundamental in white, and their difference n, the residual, is in black.

g1102i

Yes, where the fundamental is zero, the reflected spectrum and the residual coincide.

At the risk of annoying you, let me emphasize: the standard observer (at 20 nm intervals!) would perceive the yellow spectrum and the white spectrum as being exactly the same color. We would say they are metameric spectra. He is actually perceiving the yellow spectrum, but if we somehow generated the white one, it would look the same to him.

And now we should compare the fundamental with the vector Vdb constructed from the dual basis. We could look at the numbers, of course, but an easy way to see that they coincide is to plot only points for Vdb and only a piecewise-linear curve for the fundamental.

g1102j

They are the same.

Thus we confirm that the 3 values A^T S are the components of the fundamental (not of the entire reflected spectrum) wrt the basis E dual to A.

The dual basis tells us what those first 3 numbers mean, and would let us construct the fundamental, and thence the residual.

The orthonormal basis, on the other hand, let us compute a projection operator, which gets us the fundamental by applying the projection operator to the reflected spectrum.

If i could only do one of these, it would be to construct the dual basis rather than the projection operator… but computing power is cheap and there’s no reason to do only one of these.

Incidentally, we could have simply appled the Gram-Schmidt orthogonalization procedure to the dual basis… but I really like having a basis for the null space, and that’s what the rest of the v matrix from the SVD got me.

I’m going to wait, however, before looking at the basis vectors for the null space.

Color: from spectrum to tristimulus

Introduction

Edit 30 Oct 2009:
This is the first post about an example on p. 160 of Wyszecki & Stiles (see the bibliography).

and find another edit about the components wrt the dual basis.
End edit.

As always, if my chatter is confusing, just start following the examples.

In a sense, what I am about to do now is one of the most important color calculations I will show you.

How do we get from a spectrum to tristimulus values?

I am going to get from a spectrum to XYZ values, and then to xyz values. Recall that XYZ come from xyz bar tables, which in turn come from rbar, gbar, bbar tables by a change of basis. Then xyz values come from XYZ by making the sum = 1.

For more about the CIE, see the previous post.

The question before us is to get from a spectrum to XYZ (and then to xyz).

What we are talking about is the perception of color, rather than the reproduction. Sort of. We wil use “the standard observer” rather than my eyes or your eyes, but we are going to translate an arbitrary spectrum into 3 scalar values, which can then be transformed to some RGB values in a color space. But not in this post.

This example comes from w&s (Wyszecki & Stiles, in the bibliography), page 160. Believe it or not, I am going to work it four ways — but I will give you the full details for only two of them, pictures for all four of them.

The original problem uses data at 10 nm intervals, and it uses the 1964 CIE. Working this example as they did showed me that there were 3 typos in their solution.

First, however, I will work the problem using the data at 20 nm intervals, with the 1964 CIE. This will show you explicitly how the calculations proceed. At 20 nm intervals, I can display the data itself.

Second, I will work the original problem as they posed it. I will not get their answers, so I will also show you their typos.

Third and fourth, I will again work the problem at both 10 nm and 20 nm intervals, but using the 1931 CIE. As will be my ongoing custom, I will show you all the numbers only for the 20 nm case. Ultimately, what choice we make isn’t going to matter a lot. And that’s worth knowing.

We are given a reflectance spectrum for a plastic object illuminated by daylight. In fact, we will use “standard illuminant D65″ for “daylight”. The “reflectance spectrum” is the reflection from a perfectly uniform — perfectly equal energy — illuminant.

Let me say that another way. We are going to assume that the object is illuminated by one spectrum (“daylight” D65) and that the reflected spectrum which we perceive is the combination of the D65 spectrum and the so-called reflectance spectrum of the object.

The reflectance spectrum is a property of the object; by combining it with any arbitrary illumination spectrum, we get the reflected spectrum.

We are asked, in the given example, to find the XYZ tristimulus values, and the xy chromaticity coordinates, using the 1964 CIE color matching functions. That is, we will use the 1964 xyzbar tables.

(I didn’t plan to show this to you, but prudence dictated that I at least work it out. When it turned out that they had slightly wrong answers… well, I have to show it to you.)

Here’s what we will do. We will find the reflected spectrum, i.e. what is reflected from the object illuminated by D65 rather than by an equal energy illumninant. Then we will apply the xyzbar color matching functions to the reflected spectrum to get XYZ tristimulus values. Then we will get the normalized xyz values.

For more about the color matching functions, our matrices A for each problem, see the first Cohen post, and the second Cohen post.

CIE 1964 at 20 nm

Here is the reflectance spectrum \beta\ , from 380 to 780 at 20 nm intervals. As will be my custom, I am taking a subset of their values — I am not averaging the numbers. That is, their example provides data every 10 nm, but I am using half of that given data. For one thing, as I said, that gives us matrices of manageable size.

\beta = \begin{array}{c} 0.102 \\ 0.348 \\ 0.46 \\ 0.475 \\ 0.462 \\ 0.454 \\ 0.478 \\ 0.564 \\ 0.663 \\ 0.691 \\ 0.693 \\ 0.79 \\ 0.845 \\ 0.853 \\ 0.853 \\ 0.853 \\ 0.852 \\ 0.849 \\ 0.79 \\ 0.712 \\ 0.625\end{array}

Here’s a picture of that reflectance spectrum:

p64201

A word about scaling. It is appropriate that the reflectance spectrum be less than, or on the order of magnitude of, 1. At any given frequency, the amount reflected can range from significantly less to significantly more (since light absorbed at one frequency can be emitted at another) than the incident spectrum — but we dont want to be multiplying by a hundred. That would be bad.

Here is the D65 spectrum, from 380 to 780 nm at 20 nm intervals:

D65 = \begin{array}{c} 50. \\ 82.8 \\ 93.4 \\ 104.9 \\ 117.8 \\ 115.9 \\ 109.4 \\ 104.8 \\ 104.4 \\ 100. \\ 95.8 \\ 90. \\ 87.7 \\ 83.7 \\ 82.2 \\ 78.3 \\ 71.6 \\ 61.6 \\ 75.1 \\ 46.4 \\ 63.4\end{array}

Another word about scaling. The D65 spectrum ranges from 50 to more than 100. I have no idea what units it is measured in — but the reflected spectrum will be in the same units, whatever they are.

Here is a picture of that illumninant spectrum:

p64202

As I said, the scale is 100 times that of the reflectance spectrum. This doesn’t matter for the computations. If, however, I want to show both of these on the same axes, I will need to scale these values — but for the graph only, not for the computations.

Here are the 1964 xyzbar tables from 380 to 780 nm, at 20 nm intervals. This is the A matrix for this problem.

A = \left(\begin{array}{ccc} 0.0002 & 0. & 0.0007 \\ 0.0191 & 0.002 & 0.086 \\ 0.2045 & 0.0214 & 0.9725 \\ 0.3837 & 0.0621 & 1.9673 \\ 0.3023 & 0.1282 & 1.7454 \\ 0.0805 & 0.2536 & 0.7721 \\ 0.0038 & 0.4608 & 0.2185 \\ 0.1177 & 0.7618 & 0.0607 \\ 0.3768 & 0.962 & 0.0137 \\ 0.7052 & 0.9973 & 0. \\ 1.0142 & 0.8689 & 0. \\ 1.124 & 0.6583 & 0. \\ 0.8563 & 0.3981 & 0. \\ 0.4316 & 0.1798 & 0. \\ 0.1526 & 0.0603 & 0. \\ 0.0409 & 0.0159 & 0. \\ 0.0096 & 0.0037 & 0. \\ 0.0022 & 0.0008 & 0. \\ 0.0005 & 0.0002 & 0. \\ 0.0001 & 0. & 0. \\ 0. & 0. & 0.\end{array}\right)

And here is a picture of the color matching functions:

p64203

Those are our ingredients: the reflectance spectrum, the illiminant spectrum, and the color matching functions.

Now, let’s get the reflected spectrum. We multiply the values pointwise… i.e. at every wavelength, what is actually reflected is the product of the reflectance spectrum and the illuminant spectrum.

(Perhaps I should be calling this the perceived spectrum, since it is what our standard observer is seeing. I should not call it the incident spectrum, because that would be a sensible alternative to “the illimination spectrum”, in this case D65. Incident illumination on an object gives us a reflected spectrum, which our standard observer perceives.)

The result of multiplying the reflectance spectrum \beta and the illuminant spectrum D65 is

S = \begin{array}{c} 5.1 \\ 28.8144 \\ 42.964 \\ 49.8275 \\ 54.4236 \\ 52.6186 \\ 52.2932 \\ 59.1072 \\ 69.2172 \\ 69.1 \\ 66.3894 \\ 71.1 \\ 74.1065 \\ 71.3961 \\ 70.1166 \\ 66.7899 \\ 61.0032 \\ 52.2984 \\ 59.329 \\ 33.0368 \\ 39.625\end{array}

and the 3 spectra look like this (after I divide the D65 and S by 100, otherwise the reflectance spectrum is effectively the x-axis):

p64204

Let’s be clear: the reflectance spectrum is in green, the illuminant spectrum is in blue, and the product, at each wavelength, is the reflected spectrum, in black.

Before we deal with the reflected spectrum, however, I want to deal with the illumination itself. If we apply the color functions to the D65 spectrum, that is we compute

A^T\ D65\ ,

we get

{554.645, 582.565, 631.147}.

Are those our tristimulus values? No way. These numbers depend on the scale of the spectrum, and the number of components of the vectors. Somehow we have to normalize them, and we will do that for the total energy received.

As an aside, which I will try to remember to repeat in subsequent posts…. We have spectra and color-matching functions which live in vector spaces. The result of applying color matching functions to spectra is a set of 3 scalars, i.e. 3 real numbers. But our vector spaces are not bounded spaces, so those 3 numbers are not, in principle, bounded. They are simply the components of a spectrum wrt a basis. Edit: oops. Careful! They are components of the fundamental of the spectrum. End edit. OTOH, we have color spaces, such as RGB, in which the RGB values are bounded (whether between 0 and 1, or 0 and 100, or 0 and 255, they are limited).

We need to move from the unbounded results of vector operators to the bounded values of color spaces.

But how do we do that?

Well. Recall that ybar, the second column of A, is the photopic response (the total response of the cones) of the standard observer. This is incredibly useful: the second column of AT tells us the total energy perceived. If we were doing calculus, we would compute the areas under all three curves. But we’re doing linear algebra, and the appropriate number is the middle one.

We get XYZ values for the illumnination perceived by the observer by dividing those 3 numbers by the middle number, 582.565:

XYZ = {0.952075, 1., 1.08339}.

Let’s keep going with this. How do we get xyz coordinates from XYZ?

Normalize the sum of the XYZ to 1. The sum is 3.03547 … so we get that x,y,z are

xyz = {0.31365, 0.329438, 0.356911}.

Since all three values are close to 1/3, this is a plausible number for illumination.

Now we can do the same thing with the reflected spectrum. That is, we apply the matrix A^T to the reflected spectrum S:

{386.796, 381.417, 293.87}.

We are not going to divide by this middle number — this is the reflected spectrum, not the illimination spectrum. We need to divide by the same number as before, i.e. by the photopic response to the illumination, not the photopic response to the reflection. That number was 582.565, and now we divide our 3 values by that to get X,Y,Z…

XYZ = {0.663953, 0.654719, 0.504441}.

And, as before, we compute xyz values, too: we normalze the XYZ so that their sum is 1. Since the sum is 1.82311 … we get

xyz = {0.364186,0.359122,0.276692}.

This isn’t all that far from the illuminant.

p64205

Those dots are the outline of the 1964 chromaticity diagram. I got it the same way as the 1931, just with different data. Get the xyz tables — the ones where the rows add up to 1 — and just plot y versus x.

The red, green, and blue dots show the pure primaries for the 1964 standard. The white dot is (x,y) for D65, and the yellow dot is (x,y) for the reflected sprectrum. (Oh, I don’t think the reflection is yellow, but it’s rather closer to the light source than to much else.)

CIE 1964 at 10 nm

Now let me show you the pictures, and the answers, for 10 nm intervals. In fact, let me juxtapose the new pictures (on the right) with the old (on the left):

Here are the old and new reflectance spectra:

p64101

Here are the D65 spectra:

p64102

And here are the color matching functions (xyzbar tables) from 380 to 780 nm The data on the right is the A matrix for this problem.
p64103

We have the same ingredients: the reflectance spectrum, the illiminant spectrum, and the color matching functions.

Now, let’s get the reflected spectrum. We multiply the values pointwise… i.e. at every wavelength, what is actually reflected is the product of the reflectance spectrum and the illuminant spectrum.

The result of multiplying \beta and D65 together looks like this (after I divide the D65 and S by 100, otherwise the reflectance spectrum is effectively the x-axis):

p64104

As before, the reflectance spectrum is in green, the illuminant spectrum is in blue, and the product, at each wavelength, is the reflected spectrum, in black.

First, we apply the color functions to the D65 spectrum, that is we compute

A^T D65\ ,

and we get

{1102.11, 1162.02, 1247.78}.

Here we see quite clearly that these numbers depend on the scale of the spectrum and the number of values: they are about double what the previous values were. Oh, of course: our vectors are twice as long. (Even though the XYZ and xyz values will be comparable, these initial 3 values will not always be comparable. As I said, they depend on the dimension of our spectra and on the scaling of the illuminant spectrum.)

Once again, we need to scale those numbers, dividing each of them by the middle number, which is the total perceived illumination.

So we get XYZ values for the illumninant by dividing those 3 numbers by the middle number, 1162.02:

XYZ = {0.948445, 1., 1.0738}.

How do we get xy coordinates? Normalize the sum to 1. The sum is 3.02225 … so we get

xyz = {0.313821, 0.330879, 0.3553}.

As before, this is a plausible number for illumination.

Now we do the same thing with the reflected spectrum. That is, we apply the matrix A^T to the reflected spectrum S:

{769.263, 761.189, 581.056}.

Once again, we divide by the same number as before, i.e. by the photopic response to the illumination, not to the reflection. (That number was 1162.02.)

Now we divide our 3 values by that to get X,Y,Z…

XYZ = {0.662005, 0.655057, 0.50004}.

And, as before, we compute xy values, too: we normalze the XYZ so that their sum is 1. Since the sum is 1.8171 … we get

xyz = {0.364319, 0.360495, 0.275185}.

p64105

You can’t tell much from these two pictures; the final picture in this post will plot (almost) all of the xy answers. The numbers using 20 nm intervals were not much different: for the XYZ values of the reflected spectrum, we had:

XYZ = {0.663953, 0.654719, 0.504441}.

The corresponding answers using 10 nm intervals were:

XYZ = {0.662005, 0.655057, 0.50004}.

Pretty darn close. 10 nm or 20 nm, both using CIE 1964, not a whole lot of difference.

Now, what were their answers? They got sums of

{758.6, 760.8, 580.8}

… while I had

{769.263, 761.189, 581.056}.

We do not agree. What’s going on? I found that there is a problem in their work at 640 nm.

For a wavelength of 640 nm, we agree on the values of all the ingredients. We agree that the reflectance spectrum value is 0.853 … the D65 illuminant spectrum value is 83.7 … the values of that row of A are {0.4316, 0.1798, 0.}.

The pointwise products of these numbers are…

{30.8146, 12.837, 0.}

but what they show is

30.2 12.6 0.

Two typos. I have no idea why. And they appear to be the only typos pertaining to individual wavelengths. Of course, it is possible that the error is in the reflectance spectrum instead, but I have no other data-source to compare that to. In any event, it is clear that their entries (30.2 and 12.6) are not in fact the product of the inputs they show.

There’s another problem. I pasted the column that leads to the X-column-sum into a spreadsheet and computed the sum. For the data they have shown, the first sum is really 768.6 rather than 758.6 . That’s the third typo.

(In addition, they round off at each wavelength, before adding the columns: this leads to some round-off error. That’s why we differ on the sum that leads to Z; no typos there, they just rounded before adding.)

But, where they show X,Y,Z as (change of scale, decimals versus percentages; ignore it) …

XYZ = {65.3, 65.5, 50.}

the correct answers are

XYZ = {66.2005, 65.5057, 50.004}

and even using 20 nm intervals instead of 10, I got

XYZ = {66.3953, 65.4719, 50.4441}.

For x,y, they show

xy = {0.3612, 0.3623}

but correct (at 10 nm) is

xy = {0.364319, 0.360495}.

Just working at half the resolution (20 nm), I got

xy = {0.364186, 0.359122}.

I daresay their mistakes are worse than my cutting the resolution in half.

CIE 1931 10 nm

To switch from 1964 at 10 nm to 1931 at 10 nm requires only that I change the A matrix. (I have everything else in memory at 10 nm intervals. The reflectance, reflected, and D65 spectra do not change.)

Let me show 1931 on the right, for comparison with the 1964 color matching functions on the left. Note that the vertical scales are different.

p31101

Although they have not changed, I show again the reflectance spectrum and the illuminant spectrum for this case:

p31102

Go ahead, curse me out. After showing different cases side-by-side, I have just switched and shown side-by-side graphs for two different things for one and the same case.

Those are our ingredients: the reflectance spectrum, the illiminant spectrum, and the color matching functions. Only the color functions have changed, from 1964 to 1931.

Now, the pointwise product of the illuminant and reflectance spectra will be exactly the same reflected spectrum.

p31103

Let’s be clear: the reflectance spectrum is in green, the illumninant spectrum is in blue, and the product, at each wavelength, is the reflected spectrum — in black.

As before, we apply the color matching functions to the D65 spectrum, that is we compute

A^T\  D65\ ,

getting

{1004.46, 1056.89, 1150.02}.

These numbers are comparable to the previous case: we are using vectors of the same dimension.

Once again, we need to scale those numbers, dividing each of them by the middle number, which is the total perceived illumination.

So we get XYZ values or the illumninant by dividing those 3 numbers by the middle number:

XYZ = {0.950398, 1., 1.08812}.

To get xy coordinates, we normalize the sum to 1. The sum is 3.03852 … so we get

xyz = {0.312783, 0.329108, 0.358109}.

Now we do the same thing with the reflected spectrum. That is, we apply the matrix A^T to the reflected spectrum S:

{707.724, 707.564, 537.101}.

Once again, we divide by the same number as before, i.e. by the photopic response to the illumination, not to the reflection. That number was 1056.89 .

So we divide our 3 values by that to get X,Y,Z…

XYZ = {0.669631, 0.669479, 0.508191}.

And, as before, we compute xy values, too: we normalze the XYZ so that their sum is 1. Since the sum is 1.8473 … we get xyz as

xyz = {0.362491, 0.362409, 0.275099}.

Again, this isn’t all that far from the illuminant. We’re getting very similar answers.

p31104

The CIE boundaries look pretty similar, but the triangles — the gamut of the 3 primaries — are different. And recall that the line x + y = 1 is tangent to the right-side boundary; furthermore, for the 1931 CIE, one side of the triangle is very, very close to the tangent line.

CIE 1931 20 nm

Having shown you pictures with the 1931 CIE and 10 nm intervals, let me close by showing you the numerical data, as well as pictures, for the 1931 CIE 20 nm solution.

To switch to 20 nm intervals, I need to rebuild the color matching functions. Here’s the data:

A = \left(\begin{array}{ccc} 0.0014 & 0. & 0.0065 \\ 0.0143 & 0.0004 & 0.0679 \\ 0.1344 & 0.004 & 0.6456 \\ 0.3483 & 0.023 & 1.7471 \\ 0.2908 & 0.06 & 1.6692 \\ 0.0956 & 0.139 & 0.813 \\ 0.0049 & 0.323 & 0.272 \\ 0.0633 & 0.71 & 0.0782 \\ 0.2904 & 0.954 & 0.0203 \\ 0.5945 & 0.995 & 0.0039 \\ 0.9163 & 0.87 & 0.0017 \\ 1.0622 & 0.631 & 0.0008 \\ 0.8544 & 0.381 & 0.0002 \\ 0.4479 & 0.175 & 0. \\ 0.1649 & 0.061 & 0. \\ 0.0468 & 0.017 & 0. \\ 0.0114 & 0.0041 & 0. \\ 0.0029 & 0.001 & 0. \\ 0.0007 & 0.0003 & 0. \\ 0.0002 & 0.0001 & 0. \\ 0. & 0. & 0.\end{array}\right)

and here are the graphs (20 nm and 10 nm side by side):

p31201

I need to restore the reflectance spectrum \beta and the illuminant spectrum. As you recall, they were

\beta = \begin{array}{c} 0.102 \\ 0.348 \\ 0.46 \\ 0.475 \\ 0.462 \\ 0.454 \\ 0.478 \\ 0.564 \\ 0.663 \\ 0.691 \\ 0.693 \\ 0.79 \\ 0.845 \\ 0.853 \\ 0.853 \\ 0.853 \\ 0.852 \\ 0.849 \\ 0.79 \\ 0.712 \\ 0.625\end{array}

and

D65 = \begin{array}{c} 50. \\ 82.8 \\ 93.4 \\ 104.9 \\ 117.8 \\ 115.9 \\ 109.4 \\ 104.8 \\ 104.4 \\ 100. \\ 95.8 \\ 90. \\ 87.7 \\ 83.7 \\ 82.2 \\ 78.3 \\ 71.6 \\ 61.6 \\ 75.1 \\ 46.4 \\ 63.4\end{array}

p31202

The reflected spectrum was the pointwise product of those, namely

S = \begin{array}{c} 5.1 \\ 28.8144 \\ 42.964 \\ 49.8275 \\ 54.4236 \\ 52.6186 \\ 52.2932 \\ 59.1072 \\ 69.2172 \\ 69.1 \\ 66.3894 \\ 71.1 \\ 74.1065 \\ 71.3961 \\ 70.1166 \\ 66.7899 \\ 61.0032 \\ 52.2984 \\ 59.329 \\ 33.0368 \\ 39.625\end{array}

Those are our ingredients: the reflectance spectrum, the illuminant spectrum, and the color matching functions. Only the color functions have changed, from 1964 to 1931.

The product of illumninant spectrum and reflectance spectrum is the same as shown before for 20 nm intervals:

p64204

As before, the reflectance spectrum is in green, the illumninant spectrum is in blue, and the product, at each wavelength, is the reflected spectrum — in black.

As before, we apply the color matching functions to the D65 spectrum, that is we compute

A^T\ D65\ ,

getting

{506.693, 529.764, 581.089}.

These numbers are smaller than the previous case: we are using vectors of half the dimension.

Once again, we need to scale those numbers, dividing each of them by their middle number, which is the total perceived illumination.

So we get XYZ values or the illumninant by dividing those 3 numbers by the middle number, getting:

XYZ = {0.956451, 1., 1.09688}.

We get xy coordinates by normalizing the sum to 1. The sum is 3.05333 … so we get

xyz = {0.313248, 0.327511, 0.359241}.

Now we do the same thing with the reflected spectrum. That is, we apply the matrix A^T to the reflected spectrum S, getting:

{356.817,354.641,271.109}.

Once again, we divide by the same number as before, i.e. by the photopic response to the illumination, not to the reflection. That number was 529.764, so we get XYZ are

XYZ = {0.673539, 0.669432, 0.511754}.

And, as before, we normalze the XYZ so that their sum is 1. Since the sum is 1.85472 … we get

xyz = {0.363148 ,0.360933, 0.275919}.

p31203

OK, let’s compare.

Here are the unscaled sums (including their incorrect answers). Reading down, these are my 1964 at 20, then at 10, their answers, my 1931 at 10, then at 20.

\begin{array}{ccc} X & Y & Z \\ 386.796 & 381.417 & 293.87 \\ 769.263 & 761.189 & 581.056 \\ 758.6 & 760.8 & 580.8 \\ 707.724 & 707.564 & 537.101 \\ 356.817 & 354.641 & 271.109\end{array}

and the XYZ values scaled by the photopic response…

\begin{array}{ccc} X & Y & Z \\ 0.663953 & 0.654719 & 0.504441 \\ 0.662005 & 0.655057 & 0.50004 \\ 0.653 & 0.655 & 0.5 \\ 0.669631 & 0.669479 & 0.508191 \\ 0.673539 & 0.669432 & 0.511754\end{array}

and the xyz values

\begin{array}{ccc} x & y & z \\ 0.364186 & 0.359122 & 0.276692 \\ 0.364319 & 0.360495 & 0.275185 \\ 0.3612 & 0.3623 & 0.2765 \\ 0.362491 & 0.362409 & 0.275099 \\ 0.363148 & 0.360933 & 0.275919\end{array}

There’s not a whole lot of difference there, whether we use 1931 or 1964, or 10 nm or 20 nm, or make a few mistakes!

To look at it, here are my 4 computed xy, “white points”, for the illuminant, and the 5 xy points for the reflections.

p sum

(I didn’t grab their white point — the x,y coordinates — for D65, so only my 4 computed values are shown.)

The most important single computation in all this was to scale the values by the middle sum, i.e. by the total energy perceived by the observer. To be specific, we applied the ybar function to the illuminant spectrum.

I know, it looks just like the dot product of two vectors. But Cohen taught us that the color functions are really linear functionals. We want to visualize that computation as the matrix product of a row (the ybar) and a column (the D65 spectrum).

Of course, only such a trick as that division by the middle number could distract us from the marvelous computation of 3 numbers in the first place. We get from a spectrum to 3 scalars by applying 3 color matching functions (the transpose of our A matrix) to the reflected spectrum, which is also the spectrum incident on the eyes of the standard observer.

The first 3 numbers we get are, as we saw before and will see again, the (potentially unbounded) components of the incident spectrum wrt the dual basis spectra. The second set of 3 numbers are tristimulus values in XYZ space. The third set of numbers were xyz coordinates for the chromaticity chart.

The first 3 numbers depend on the dimensionality of our spectra, how many samples we have. In fact, we compute two such sets: one for the illuminant and one for the reflection. The middle number of the 3 for the illuminant sets the scale for white = (1,1,1).

The second set, XYZ, is found by dividing the first 3 numbers by that middle number. That is true whether we have XYZ for the illuminant or for the reflection: divide either by the total energy of the illuminant.

The third set, xyz, is found by dividing the XYZ numbers by their sum — instead of by their middle, or anybody’s middle.

I must emphasize that we have seen 4 different choices for the A matrix, and those were all “xyz bar” values! “The” A matrix is hardly unique.

More to come. You better believe it.

Posted in color. Tags: . Leave a Comment »

Color: the 1931 CIE color-matching functions and chromaticity chart

Introduction

The CIE chromaticy chart is one of the things we are headed for. Here is a black-and-white drawing of its boundary. I’ll show you later how I got it.

CIE bdy

You should be able to find color versions of the CIE chromaticity chart (1931) all over the Internet. Here’s one that shows some curves of constant hue. I’ll point out now that these curves are not straight lines. I will also use that image to decide that something is a greenish-yellow or yellow-green.

Wikipedia seems to have a lot of information, such as this, for example.

What my drawing shows is that I can reproduce the analytical content of it. Black and white, but no color.

How did I do that?

the color-matching functions and chromaticity coordinates

I have four tables of data, from Wyszecki and Stiles, for which see the bibliography; and I will abbreviate them “w&s” in this post.

“rgb” are chromaticity coordinates (row sums = 1), “rgb bar” are color-matching functions. The full heading on the combined “rgb” and “rgb bar” tables is “Chromaticity Coordinates and Color-matching Functions of the CIE 1931 Standard Colorimetric System, with respect to real primary stimuli:
R at \lambda_R = 700.0\ nm;
G at \lambda_G = 546.1\ nm;
B at \lambda_B = 435.8\ nm”

We also have “xyz” chromaticity coordinates (row sums = 1), and “xyz bar” color-matching functions.

All 4 tables are at 5 nm intervals from 380 to 780.

Well, these 4 tables come in a variety of sets. I chose to use the 1931 tables at 5 nm intervals.

They have been interpolated to 1 nm intervals. They have been corrected by several authors in different ways. They were re-done in 1964 by the CIE to use a 10° field instead of the 2° field of the 1931 standard.

My choice of the uncorrected 5 nm 1931 standard is not universal — but it is an extremely common choice, and an awful lot of the work done since 1931 has used it.

At my level of usage, it probably doesn’t matter a whole lot. In fact, a whole lot of other things will turn out to be more important.

Since I’ve had more than a little time to get used to the 4 different tables, and you may not have, let me list them:

  • rgb bar
  • rgb
  • xyz bar
  • xyz

My source for the tables, as I said, was w&s. I am not going to reproduce them — as far as I can see the CIE charges for some of them — but I wouldn’t be surprised if you can find them on the Internet.

In principle, three of the tables are secondary. Their source is the table of rgb bar. First, let me show you what they look like.

Here is a picture of the rbar, gbar, bbar data. The x-axis runs through wavelengths from 380-780, the points are 5 nm apart.

rgb bar plot

It is clear that rbar is significantly negative over part of its range (440-545 nm); it is true, but not clear, that gbar is very slightly negative over 380-435, and bbar is very slightly negative over 550-655 — and it’s shown as 0 for longer wavelengths.

In fact, at any given wavelength, exactly one of the three is negative; nevertheless, only the red rbar gets seriously negative.

What do these negative numbers mean?

I suspect that the simple description of the color matching experiment is too simple experimentally, but I also believe it suffices in principle. Imagine that you are looking at a split screen, at two halves of a colored disk. The left half of the disk is a monochromatic color. (Think of a tunable laser — but of course they didn’t have lasers in the 1920s when the experiments were done!) The right half of the disk is controlled by the viewer, who has three dials and may choose mixtures of red, green, and blue in order to make the two halves of the disk match.

It turns out that we can’t do that. Over most of the range, we have to change the color of the left half of the disk as well as the right half. We can’t match a wavelength of 380 nm (the violet end of the visble spectrum) with red, green, and blue. But if we add a tiny of green to the violet, we can match that combination with red and blue.

We record a negative number for green at that wavelength.

Until the blue goes to zero thru negative values, exactly one of the 3 numbers is negative. At long wavelengths, when blue is zero, we just use red and green.

So much for the negative values.

Oh, having said that I will not reproduce the table, I will present — so you can do these calculations yourself — one quarter of it. Here is the rgb bar table at 20-nm intervals.

rgb bar ~ \left(\begin{array}{cccc} 380 & 0.00003 & -0.00001 & 0.00117 \\ 400 & 0.0003 & -0.00014 & 0.01214 \\ 420 & 0.00211 & -0.0011 & 0.11541 \\ 440 & -0.00261 & 0.00149 & 0.31228 \\ 460 & -0.02608 & 0.01485 & 0.29821 \\ 480 & -0.04939 & 0.03914 & 0.14494 \\ 500 & -0.07173 & 0.08536 & 0.04776 \\ 520 & -0.09264 & 0.17468 & 0.01221 \\ 540 & -0.03152 & 0.21466 & 0.00146 \\ 560 & 0.0906 & 0.19702 & -0.0013 \\ 580 & 0.24526 & 0.1361 & -0.00108 \\ 600 & 0.34429 & 0.06246 & -0.00049 \\ 620 & 0.29708 & 0.01828 & -0.00015 \\ 640 & 0.15968 & 0.00334 & -0.00003 \\ 660 & 0.05932 & 0.00037 & 0 \\ 680 & 0.01687 & 0.00003 & 0 \\ 700 & 0.0041 & 0 & 0 \\ 720 & 0.00105 & 0 & 0 \\ 740 & 0.00025 & 0 & 0 \\ 760 & 0.00006 & 0 & 0 \\ 780 & 0 & 0 & 0\end{array}\right)

The second table, rgb, is derived from the first in principle — but not in practice!

In principle, take the rgb bar table and recsale it by setting every row sum to 1.

In parctice, we don’t have enough digits to do that. They provide a table. It is called the rgb (chromaticity coordinates).

Here is what the rgb table looks like at 5 nm intervals.
rgb plot

And here is a subset of the rgb table, at 20-nm intervals.

rgb ~ \left(\begin{array}{cccc} 380 & 0.0272 & -0.0115 & 0.9843 \\ 400 & 0.0247 & -0.0112 & 0.9865 \\ 420 & 0.0181 & -0.0094 & 0.9913 \\ 440 & -0.0084 & 0.0048 & 1.0036 \\ 460 & -0.0909 & 0.0517 & 1.0392 \\ 480 & -0.3667 & 0.2906 & 1.0761 \\ 500 & -1.1685 & 1.3905 & 0.778 \\ 520 & -0.983 & 1.8534 & 0.1296 \\ 540 & -0.1707 & 1.1628 & 0.0079 \\ 560 & 0.3164 & 0.6881 & -0.0045 \\ 580 & 0.6449 & 0.3579 & -0.0028 \\ 600 & 0.8475 & 0.1537 & -0.0012 \\ 620 & 0.9425 & 0.058 & -0.0005 \\ 640 & 0.9797 & 0.0205 & -0.0002 \\ 660 & 0.994 & 0.0061 & -0.0001 \\ 680 & 0.9984 & 0.0016 & 0. \\ 700 & 1. & 0. & 0. \\ 720 & 1. & 0. & 0. \\ 740 & 1. & 0. & 0. \\ 760 & 1. & 0. & 0. \\ 780 & 1. & 0. & 0.\end{array}\right)

Let me show you an example of why we need their table, instead of trying to generate it ourselves. At 380 nm, the rgb bar table has values…

\{0.00003,-0.00001,0.00117\}

which sum to 0.00119 .

If I divide that row by its sum, and round, I get…

\{0.0252,-0.0084,0.9832\}

but the corresponding row of the rgb table is…

\{0.0272,-0.0115,0.9843\}\ .

Close, but not the same.

That’s two of the four tables. As it happens, they’re the two we may be able to dispense with in practice — but in principle, rgb bar is the starting point.

Let me emphasize that these values depend on a specific choice of red, green, and blue.

We can fix that.

The CIE defined a specific change-of-basis. The tri-stimulus coordinates (rbar, gbar, bbar) or (R,G,B) are a vector in R^3; they defined a vector denoted either (xbar, ybar, zbar) or (X,Y,Z).

Oh, although the table is denoted rgb bar, specific coordinates wrt it are usually denoted RGB; similarly, specific coordinates wrt the xyz bar table are usually denoted upper-case XYZ.

Bear in mind that these RGB are not the same as coordinates in RGB-color space. (The latter is bounded by 0 and 1; the former is not!) We’ll see more about this down the road.

Let me show it to you:

xyz bar

And let me show you the cut-down data, every 20-nm; this is a subset of the “xyz bar” table:

xyz bar ~ \left(\begin{array}{cccc} 380 & 0.0014 & 0. & 0.0065 \\ 400 & 0.0143 & 0.0004 & 0.0679 \\ 420 & 0.1344 & 0.004 & 0.6456 \\ 440 & 0.3483 & 0.023 & 1.7471 \\ 460 & 0.2908 & 0.06 & 1.6692 \\ 480 & 0.0956 & 0.139 & 0.813 \\ 500 & 0.0049 & 0.323 & 0.272 \\ 520 & 0.0633 & 0.71 & 0.0782 \\ 540 & 0.2904 & 0.954 & 0.0203 \\ 560 & 0.5945 & 0.995 & 0.0039 \\ 580 & 0.9163 & 0.87 & 0.0017 \\ 600 & 1.0622 & 0.631 & 0.0008 \\ 620 & 0.8544 & 0.381 & 0.0002 \\ 640 & 0.4479 & 0.175 & 0. \\ 660 & 0.1649 & 0.061 & 0. \\ 680 & 0.0468 & 0.017 & 0. \\ 700 & 0.0114 & 0.0041 & 0. \\ 720 & 0.0029 & 0.001 & 0. \\ 740 & 0.0007 & 0.0003 & 0. \\ 760 & 0.0002 & 0.0001 & 0. \\ 780 & 0. & 0. & 0.\end{array}\right)

Here is the change-of-basis matrix:

T = \left(\begin{array}{ccc} 2.76888 & 1.75175 & 1.13016 \\ 1 & 4.5907 & 0.0601 \\ 0 & 0.05651 & 5.59427\end{array}\right)

They tell us we can check it, for example, by applying their matrix T to RGB coordinates, such as for example at 500 nm…

\{-0.07173,0.08536,0.04776\}

Apply T, and round off, getting

\{0.0049,0.323,0.272\}\

… and compare that to the XYZ coordinates at 500 nm:

\{0.0049,0.323,0.272\}\ .

The same.

Still… raise your hands and step away from the keyboard.

What we just did was apply their matrix T to RGB components to get XYZ components. Perfectly natural. But we know that the transition matrix delivers old components when it is applied to new ones. If we want to call that given matrix T the transition matrix, then we must think of XYZ as the old components.

In practice, this really just means that we better know which way the transformation goes. They gave us the transition matrix for old components XYZ and new components RGB — even though they derived the XYZ basis from the RGB basis.

OK? If not, please be patient.

Now just where did that change-of-basis transformation come from? Malacara has a readable discussion. The most important point, to my mind, is that the Y column vector, the second column, is chosen to be the “phototopic response” (1924) of the human eye, that is, daylight vision, that is, color vision. There’s not much point in my digging out that data and plotting it: it is exactly the green curve in the drawing of the xyz bar table.

Second, all the values X,Y,Z are non-negative. Third, they also set the column sums almost equal to the value for the Y column. There was another criterion, but I’ll show it to you after I show you the fourth table.

Table 4 comes from table 3 as table 2 comes from table 1: get table 4 by setting the row sums of table 3 to 1, just as we got table 2 by setting row sums of table 1 to 1. All in principle, of course, or having more digits available than were printed.

Here is what the xyz table looks like:

xyz plot

and here is the cut-down data:

 xyz ~ \left(\begin{array}{cccc} 380 & 0.1741 & 0.005 & 0.8209 \\ 400 & 0.1733 & 0.0048 & 0.8219 \\ 420 & 0.1714 & 0.0051 & 0.8235 \\ 440 & 0.1644 & 0.0109 & 0.8247 \\ 460 & 0.144 & 0.0297 & 0.8263 \\ 480 & 0.0913 & 0.1327 & 0.776 \\ 500 & 0.0082 & 0.5384 & 0.4534 \\ 520 & 0.0743 & 0.8338 & 0.0919 \\ 540 & 0.2296 & 0.7543 & 0.0161 \\ 560 & 0.3731 & 0.6245 & 0.0024 \\ 580 & 0.5125 & 0.4866 & 0.0009 \\ 600 & 0.627 & 0.3725 & 0.0005 \\ 620 & 0.6915 & 0.3083 & 0.0002 \\ 640 & 0.719 & 0.2809 & 0.0001 \\ 660 & 0.73 & 0.27 & 0. \\ 680 & 0.7334 & 0.2666 & 0. \\ 700 & 0.7347 & 0.2653 & 0. \\ 720 & 0.7347 & 0.2653 & 0. \\ 740 & 0.7347 & 0.2653 & 0. \\ 760 & 0.7347 & 0.2653 & 0. \\ 780 & 0.7347 & 0.2653 & 0.\end{array}\right)

Just for the fun of it, check the column sums of XYZ. I said they were almost equal. I get

\{21.3713,21.3714,21.3715\}\ .

I do not know if that is deliberate, or whether its round-off and the real tables have more digits. (I don’t really care. But I couldn’t very well say they were equal….)

The CIE chromaticity chart

Ok, here’s the next major idea. Since x+y+z = 1, this table is essentially 2D, and we can plot y vs x.

That is exactly the picture I started this post with, the boundary of the chromaticity chart.

That’s all it is: take the first two columns of the xyz table, read each row as (x,y) coordinates, and plot each row.

The fourth criterion was to make the line x+y = 1 tangent to the curve. (In fact, the earlier criterion about X,Y,Z being non-negative was implemented by setting the y-axis tangent to the boundary.

Let me show you:

CIE with tangent

So, 4 tables, one 2D chart. We could also draw a 2D chart of g vs r, or b vs r, from the rgb table, since its rows each sum to 1.

What we have lost, in that chart, is the intensity. To recover XYZ from xy, we would need, for example, xyY; x,y and any one of X, Y, or Z would suffice.

Let me start using that chart right now. We have RGB basis vectors (1,0,0), (0,1,0), and (0,0,1}. What are their (x,y) coordinates?

There are a few ways to do this. Given the scale at which we are working, the easiest is to interpolate the x,y coordinates from the xyz table at wavelengths 435.8, 546.1, and 700 (that last one is a table value).

I get x,y coordinates of

R = {0.7347,0.2653}

G = {0.2737,0.717411}

B = {0.166541,0.00892768}

Let’ s plot them on the chromaticity chart… and let’s draw lines connecting them.

CIE with orig RGB

Think back to simplices. All of the points inside (and on) that triangle can be written as linear combinations of the coordinates of the 3 vertices; most importantly, the coefficients of those linear combinations are between 0 and 1, inclusive. That is, when the coefficients are in the closed interval [0,1], the combination is inside or on the triangle.

That restriction to [0,1] is what restricts us to the inside (and on) the triangle. That triangle is said to describe the gamut of these three R,G,B vertices.

The points outside the triangle have either negative coefficients (or perhaps coefficients larger than 1; I’m not sure). This is why we have negative values inside the primary table, rgb bar: all the points on the curved boundary are outside, or just on, the rgb bar triangle.

That triangle is said to to show the gamut of those choices for R,G,B primaries. One of the most common uses of the chromaticity chart is to compare gamuts.

From Foley, van Dam et al., I get the following x,y coordinates for the “standard NTSC RGB phosphor” (the color TV standard):

R = {0.67,0.33},

G = {0.21,0.71},

B = {0.14,0.08}

Let’s add them to the CIE, triangle in red:
CIE with NTSC

A TV with these phosphors can reproduce colors in the red triangle.

Another set of coordinates from Foley & van Dam et al., for “short-persistence phosphors” are:

R = {0.61,0.35},

G = {0.29,0.59},

B = {0.15,0.063}

Let’s add them to the drawing, with their triangle in green:

CIE with short

Whatever these are, they can only produce the comparatively limited set of colors bounded by the green triangle.

To reproduce the entire CIE chart, within a triangle — i.e. with linear combinations having components in [0,1], we would need to have B at x,y = (0,0), R at x,y = (1,0), and G at x,y = (1,0).

Easy enough to draw, but don’t ask me how to get such sources physically.
CIE with biggest

A different form of the same principle — linear combinations of vectors — leads to the idea of mixing two colors. That is, instead of looking at all that can be produced by a particular set of RGB phosphors, we take two points and consider the line joining them.

A particularly useful choice is when one of the points is defined as “white”. Here are the x,y coordinates of a white known as D65, a standard form of “daylight”; here also are x,y coordinates of a form of green (from Andrew Glassner’s “Principles of Digital Image Synthesis”, volume 2, Appendix G; Morgan kaufmann, 1995), known as “foliage” on the MacBeth Color Checker.

p65 = {0.31271,0.32902}

pf = {0.4002,0.3504}

I want the equation of the line thru those points, and considering triangles, I’m going to write the equation as

line[t_] := t pf + (1 – t) p65.

Like our discussion of triangles, if t is restricted to [0,1], then we get the line segment bounded by the two points. Only if we let t range outside [0,1] do we get line segments that extend beyond the two points.

For what it’s worth, the numerical equation of the line is

line[t] = \{0.08749 t+0.31271,0.02138 t+0.32902\}\ .

Here are the two points and the line segment joining them:
CIE with 2 pts

But I’d rather do something else: extend the line to the boundary, and consider the green point to be a combination of the white point and the boundary point.
CIE with 3 pts

According to my poster of the 1931 CIE, a wavelength of 570 is on the border between greenish-yellow and yellowish-green. Go to the first link in this post. So our “foliage” at the red point is a mixture of white and a wavelength very close to 570 nm. (Or, Malacara has a version of this drawing.)

(I meant to show the two points as black and red in both drawings, but I forgot. I didn’t want to imply that the dot was the “right” color.)

One final point. We could go from HSB to RGB (for some standard set of RGB) and compute curves of constant hue. They are not straight lines. Rather than prove this by computation, let me just point you at the very first link again: the boundaries which it shows are not straight lines, and those boundaries certainly ought to be curves of constant hue, namely the boundary hue. (It ought to be easy enough to show this, but not now.)

It is crucial that the XYZ values are independent of any particular RGB values. Of course, they originally came from the spectrallly pure ones — but once we have the XYZ basis, we can go to any other RGB basis we choose.

Having seen other gamuts, we should expect that people get XYZ values from a spectrum. I’ll show you that next.

Color: re-doing Cohen’s example

Cohen’s example again

Let me now show you how I would do Cohen’s example. (His computations, pretty much, were the previous post.) I cannot over-emphasize that he deserves a lot of credit for getting the mathematics right, even if he didn’t name it correctly or do it beautifully.

I start with the A matrix:

A = \left(\begin{array}{ccc} 0.121934 & 0.0339986 & 0.219431 \\ 0.105204 & -0.0428189 & 0.139047 \\ 0.20452 & -0.36676 & -0.108477 \\ -0.449607 & 0.101795 & 0.292825 \\ -0.161109 & -0.0945546 & -0.0522694 \\ -0.554469 & 0.305203 & 0.229803 \\ -0.542936 & 0.493257 & 0.288467\end{array}\right)

This, as I have said, is a toy example of real color matching functions. Each column corresponds to one of the tri-stimulus values; each row represents a wavelength.

My summary of Cohen began with a different matrix — the E matrix — only because it was easier to enter into the computer. A is primary.

The next issue we have to cope with is that we really care about the transpose A' = A^T\ rather than about A. A’ will be applied to a spectrum, and will deliver three tri-stimulus values for it. I’m not convinced that any notational choice I make now will be without problems.

(As I said in the previous post, I use A’ for the transpose A^T\ whenever it is convenient.)

I am going to let

B = A^T\ .

B represents the linear operator I wish to study. Thus, B is

B = \left(\begin{array}{ccccccc} 0.121934 & 0.105204 & 0.20452 & -0.449607 &   -0.161109 & -0.554469 & -0.542936 \\ 0.0339986 & -0.0428189 & -0.36676 & 0.101795 &   -0.0945546 & 0.305203 & 0.493257 \\ 0.219431 & 0.139047 & -0.108477 & 0.292825 &   -0.0522694 & 0.229803 & 0.288467\end{array}\right)

It maps R^7\ to R^3\ . At most, it can be of rank 3; and then it would have a four dimensional null space.

What shall we do with it? All together now, class: find its singular value decomposition (SVD).

B = u\ w\ v'\ ,

where u and v are orthogonal (hence square) matrices, and w is as close to diagonal as it can be, given that it is the same shape as B.

The Mathematica command is

{u,w,v}=SingularValueDecomposition[B];

where the final semi-colon suppresses printing.

It is convenient to look first at the singular values themselves, the nonzero “diagonal” elements of the matrix w:

\{1.21434,0.354727,0.307628\}\ .

That all three are “significantly” larger than zero tells us that the matrix B (and therefore its transpose, A) is, indeed, of rank 3.

In case this is new to you, the w matrix is

w = \left(\begin{array}{ccccccc} 1.21434 & 0. & 0. & 0. & 0. & 0. & 0. \\ 0. & 0.354727 & 0. & 0. & 0. & 0. & 0. \\ 0. & 0. & 0.307628 & 0. & 0. & 0. & 0.\end{array}\right)

(And if this is new to you, check the tag cloud and the categories. There’s a lot examples of the Singular Value Decomposition. I could almost rename this blog “Applications of the SVD”.)

What about v?

v = \left(\begin{array}{ccccccc} 0.00783002 & 0.594295 & -0.455574 & 0.145535 &   0.183004 & 0.43387 & 0.443047 \\ -0.0405824 & 0.32624 & -0.416473 & -0.400691 &   0.350383 & -0.293626 & -0.590706 \\ -0.321854 & -0.360317 & -0.448269 & -0.580591 &   -0.296401 & -0.0331037 & 0.373633 \\ 0.416686 & -0.241837 & -0.606932 & 0.490082 &   -0.256011 & -0.298065 & -0.0704081 \\ 0.042639 & -0.520505 & -0.0951579 & 0.073689 &   0.821135 & 0.0465338 & 0.19064 \\ 0.551142 & -0.204115 & -0.0485665 & -0.317617 &   -0.104278 & 0.672485 & -0.297034 \\ 0.644592 & 0.198695 & 0.195648 & -0.367 & 0.0721767   & -0.425676 & 0.430864\end{array}\right)

It is 7×7. We know from the discussion of the four fundamental subspaces (look at the blue text) that the 4 rightmost columns of v are an orthonormal basis for the null space of B, and the 3 leftmost columns are an orthonormal basis for the pre-image of the range of B.

That is, writing a partitioned matrix [v] = [v1 v2], we have

v1 = \left(\begin{array}{ccc} 0.00783002 & 0.594295 & -0.455574 \\ -0.0405824 & 0.32624 & -0.416473 \\ -0.321854 & -0.360317 & -0.448269 \\ 0.416686 & -0.241837 & -0.606932 \\ 0.042639 & -0.520505 & -0.0951579 \\ 0.551142 & -0.204115 & -0.0485665 \\ 0.644592 & 0.198695 & 0.195648\end{array}\right)

v2 = \left(\begin{array}{cccc} 0.145535 & 0.183004 & 0.43387 & 0.443047 \\ -0.400691 & 0.350383 & -0.293626 & -0.590706 \\ -0.580591 & -0.296401 & -0.0331037 & 0.373633 \\ 0.490082 & -0.256011 & -0.298065 & -0.0704081 \\ 0.073689 & 0.821135 & 0.0465338 & 0.19064 \\ -0.317617 & -0.104278 & 0.672485 & -0.297034 \\ -0.367 & 0.0721767 & -0.425676 & 0.430864\end{array}\right)

We should confirm that B.v2 = 0, i.e. every column of v2 is in the null space of B:

B\ v2 = \left(\begin{array}{cccc} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0\end{array}\right)

Once having a basis, v1, for the pre-image of the range, we know that we can construct a projection operator as v1 v1′:

P = v1\ v1' = \left(\begin{array}{ccccccc} 0.560795 & 0.383299 & -0.012435 & 0.136042 &   -0.265648 & -0.0948635 & 0.0339986 \\ 0.383299 & 0.281529 & 0.0822036 & 0.156964 &   -0.131909 & -0.0687305 & -0.0428189 \\ -0.012435 & 0.0822036 & 0.434363 & 0.225095 &   0.216479 & -0.0820701 & -0.36676 \\ 0.136042 & 0.156964 & 0.225095 & 0.600478 & 0.201399   & 0.308492 & 0.101795 \\ -0.265648 & -0.131909 & 0.216479 & 0.201399 &   0.281798 & 0.134365 & -0.0945546 \\ -0.0948635 & -0.0687305 & -0.0820701 & 0.308492 &   0.134365 & 0.34778 & 0.305203 \\ 0.0339986 & -0.0428189 & -0.36676 & 0.101795 &   -0.0945546 & 0.305203 & 0.493257\end{array}\right)

Because I didn’t call it R, I can recall Cohen’s R matrix, computed using a pseudo-inverse. I assure you that it is the same, we have P = R.

On the other side, since B is of full rank, its range is equal to its codomain, both being R^3\ , so the entire matrix u is an orthonormal basis for the range of B.

u = \left(\begin{array}{ccc} -0.756721 & 0.651151 & 0.0580932 \\ 0.530816 & 0.560136 & 0.635989 \\ 0.381585 & 0.512103 & -0.769509\end{array}\right)

Now the thing to do is, as before, split some given spectrum into its fundamental and its residual (metameric black). We only really have one good test, an equal energy spectrum.

We get the fundamental of our test spectrum…

e = \{1,1,1,1,1,1,1\}

…by applying our projection operator to it:

f = P\ e = \{0.741189,0.660536,0.496875,1.73027,0.34193,0.850176,0.43012\}

and that is exactly what we got before (come on, P = R; we have the same projection operator.)

Then we get the residual n as the difference…

n = e - f = \{0.258811,0.339464,0.503125,-0.730266,0.65807,0.149824,0.56988\}

and that, too, is the same as before.

As a refesher, let’s compute the components of the fundamental wrt the v1 basis. Since v1 is orthonormal, we may find the components of the fundamental by taking dot products, the fundamental dotted into each basis vector in v1. But that’s just

f\ v1 = \{1.30045,-0.207545,-1.87532\}

which says that f can be written as the sum

\{0.741,0.66,0.497,1.73,0.341,0.85,0.43\}

of the following three multiples of the basis vectors– I’m going to round off for clarity –

\{0.01,-0.053,-0.419,0.542,0.055,0.717,0.838\}

\{-0.123,-0.068,0.075,0.05,0.108,0.042,-0.041\}

\{0.854,0.781,0.841,1.138,0.178,0.091,-0.367\}.

The sum does indeed (almost) match our fundamental. (The discrepancy comes from my rounding off.)

\{0.741,0.661,0.497,1.73,0.342,0.85,0.43\}\ .

The last thing we need to get is the dual basis E for the rows of B. I just go look it up in the previous post…

E = A\ (A'A)^{-1} = B'\ (BB')^{-1}\ :

E = \left(\begin{array}{ccc} 1. & 0 & 2. \\ 0.5455 & -0.3636 & 1.5 \\ -0.5455 & -1.6364 & 0.5 \\ -0.8182 & -1.4545 & 1.3 \\ -1. & -1. & -0.5 \\ -0.7273 & -0.1818 & 0 \\ 0 & 1. & 0\end{array}\right)

Check that B E = I; it does.

And why do we care about the dual basis, i.e. the columns of E? Well, if we apply B to our test spectrum, we get

B\ e = \{-1.27646,0.43012,1.00883\}\ .

What are those three numbers? They are components of our test spectrum wrt the dual basis E!

That is, the sum of the following three vectors (which are multiples of the basis vectors in E)…

\{-1.276,-0.696,0.696,1.044,1.276,0.928,0\}

\{0,-0.156,-0.704,-0.626,-0.43,-0.078,0.43\}

\{2.018,1.513,0.504,1.311,-0.504,0,0\}

is

\{0.742,0.661,0.496,1.729,0.342,0.85,0.43\}

… which, once again, differs from our fundamental

f = \{0.741,0.661,0.497,1.73,0.342,0.85,0.43\}

only because of the rounding used to make the numbers readable.

To say that again: the dual basis E gives us a geometric interpretation of the tri-stimulus values we computed by applying B (i.e. A’) to a spectrum. The tri-stimulus values are components of the spectrum (equivalently, of its fundamental) wrt the E basis.

So. What did Cohen and I have that was the same? The matrices A and E, and the projection operator R. But note that where he computed R as

R = E’A,

and that’s short for R = A\ (A'A)^{-1}\ A'\ ,

I would use

R = v1\ v1^T\ .

I’ll admit that’s not much of a savings if we have both A and E.)

What did we have that was related? Where he had two orthonormal bases F1 and F2, I have one, v1. All three of these bases span the same subspace of R^7\ .

Digression: ways to show that F1, F2, and v1 span the same subspace of R7

One way to show that is to show that the columns of F1 and F2 are preserved by the projection operator cobstructed from v1. We have

F1 = \left(\begin{array}{ccc} 0.513425 & -0.372389 & 0.398141 \\ 0.280073 & -0.373412 & 0.252292 \\ -0.280073 & -0.563189 & -0.196825 \\ -0.420084 & -0.376455 & 0.53131 \\ -0.513425 & -0.0959116 & -0.0948392 \\ -0.373414 & 0.185702 & 0.416961 \\ 0. & 0.468301 & 0.523404\end{array}\right)

and we compute

R\ F1 = \left(\begin{array}{ccc} 0.513425 & -0.372389 & 0.398141 \\ 0.280073 & -0.373412 & 0.252292 \\ -0.280073 & -0.563189 & -0.196825 \\ -0.420084 & -0.376455 & 0.53131 \\ -0.513425 & -0.0959116 & -0.0948392 \\ -0.373414 & 0.185702 & 0.416961 \\ 0. & 0.468301 & 0.523404\end{array}\right)

And yes, their difference is the zero matrix.

Similarly for F2.

Alternatively, F1 and F2 should generate the same projection operator as v1 did.

F1\ F1^T = \left(\begin{array}{ccccccc} 0.560795 & 0.383299 & -0.012435 & 0.136042 &   -0.265648 & -0.0948635 & 0.0339986 \\ 0.383299 & 0.281529 & 0.0822036 & 0.156964 &   -0.131909 & -0.0687305 & -0.0428189 \\ -0.012435 & 0.0822036 & 0.434363 & 0.225095 &   0.216479 & -0.0820701 & -0.36676 \\ 0.136042 & 0.156964 & 0.225095 & 0.600478 & 0.201399   & 0.308492 & 0.101795 \\ -0.265648 & -0.131909 & 0.216479 & 0.201399 &   0.281798 & 0.134365 & -0.0945546 \\ -0.0948635 & -0.0687305 & -0.0820701 & 0.308492 &   0.134365 & 0.34778 & 0.305203 \\ 0.0339986 & -0.0428189 & -0.36676 & 0.101795 &   -0.0945546 & 0.305203 & 0.493257\end{array}\right)

I assure you that is the same as P and R. And similarly for F2.

Alternatively, again, we could simply write the columns of F1 (then F2) as a linear combination of the columns of v1, and solve for the components.

Alternatively, yet again, we could try to find a transition matrix relating F1 and v1, then F2 and v2. That’s awkward because they’re not square matrices. Guess what? Use a pseudo-inverse! (Very likely the next math post.)

I have no idea what is special about F1 and F2. They are orthonormal bases, like v1, for the pre-image of the range of B. Since they span the same space, they are related by rotations (and possibly a reflection).

end digression, return to main thread

Finally, what did I do that Cohen did not? The singular value decomposition, which got me a basis, v2, for the null space of B — which he did not have. And I daresay that v1 was easier to get than either F1 or F2.

I think having v2 is incredibly useful. I get a basis for metameric blacks.

(Of course, Cohen had computed the matrices Ma and Me, and Ga and Ge, but I don’t need them because I wasn’t computing F1 and F2.)

Let me summarize that.

Cohen

  • A and E;
  • Ma = A’A, Me = E’E, where Me = Ma^-1;
  • Ga and Ge, Cholesky decompositions of Ma and Me;
  • orthonormal bases F1 = A Ge and F2 = E Ga;
  • the projection operator R = E'\ A (= A\ (A'A)^{-1}\ A')\ .

Rip

  • A;
  • A’ = B = u w v’;
  • [v] = [v1] [v2] where both v1 and v2 are orthonormal bases;
  • the projection operator R = v1 v1′;
  • the dual basis E.

(Strictly speaking, A is the dual basis to E, but the relationship is symmetric. Nevertheless, the rows of A’ live in the dual space V*, while the columns of E live in the original space V.)

I dispensed with everything leading to F1 and F2. I use v1 in place of F1 and/or F2. I get a basis, v2, for the null space.

I think the key point is using the projection operator to split a spectrum into fundamental and null vectors. It’s a unique orthogonal decomposition into a vector which has the same tri-stimulus values as the original spectrum, and a vector in the null space (metameric black, with tri-stimulus values = {0,0,0}.)

(The decomposition is unique to A, but A itself is not unique. More to follow.)

Cohen: “Visual Color and Color Mixture”

Introduction

Oct 10 edit: the third heading has been changed to “Computing E from A”

I want to work an example from Cohen, “Visual Color and Color Mixture” (see the bibliography, and this “books added” post). I am not, however, going to do it exactly the way he did. Nevertheless, I will show you everything he calculated.

Because I want to get everything of his into one post, I will break this into small sections. I expect that my next post will show you what I would have done instead.

All of his matrices can be found on p. 70 of his book.

What we have here is an extremely small example to illustrate “color matching functions” applied to light spectra, resulting in three real numbers which we call R,G,B or X,Y,Z. This is a prelude to using real color matching functions on real spectra. I will refer to the 3 numbers I get during the course of this example as “R,G,B”, in quotes because this is a toy example, and because down the road I’ll be computing “XYZ tristimulus values” in preference to RGB.

The A and E matrices: computing A from E

He has two matrices denoted A and E. They are closely related, and the typing is simpler if I start with the E matrix:

E = \left(\begin{array}{ccc} 1 & 0 & 2 \\ 0.5455 & -0.3636 & 1.5 \\ -0.5455 & -1.6364 & 0.5 \\ -0.8182 & -1.4545 & 1.3 \\ -1 & -1 & -0.5 \\ -0.7273 & -0.1818 & 0 \\ 0 & 1 & 0\end{array}\right)

Let me emphasize that the A matrix is primary, and we should have computed the E matrix from it.

Let me answer a question at the outset. What is the point of the A matrix? Applied to a spectrum — careful! it’s actually the transpose A’ that is applied to a spectrum — it generates R,G,B or X,Y,Z values. This will get us from light spectra to colors! And the matrix E contains spectra; more to follow about that, I assure you.

Given E, he would compute the A matrix as

A = E\ (E'E)^{-1}\

where I will frequently use A’ for the transpose A^T\ , simply because it is sometimes typographically more convenient — and it is an extremely common notation. But note that I will write A^{-T}\ for the inverse transpose. In general, I will use whichever notation is more convenient for any particular expression, and I may even mix them shamelessly.

We get (with 4 places, as he showed it)

A = \left(\begin{array}{ccc} 0.1219 & 0.034 & 0.2194 \\ 0.1052 & -0.0428 & 0.139 \\ 0.2045 & -0.3668 & -0.1085 \\ -0.4496 & 0.1018 & 0.2928 \\ -0.1611 & -0.0946 & -0.0523 \\ -0.5545 & 0.3052 & 0.2298 \\ -0.5429 & 0.4933 & 0.2885\end{array}\right)

Hey what? What did he accomplish with that? First, we may say that he found an A such that both

A’E = I,

and E’A = I,

i.e.

A^T\ E = E^T\ A = \left(\begin{array}{ccc} 1. & 0 & 0 \\ 0 & 1. & 0 \\ 0 & 0 & 1.\end{array}\right)

A straightforward way of stating those relationships is: A’ is a left inverse of E; A is a right inverse of E’. We should also observe that by transposing

A’ E = I

we get

E’ A = I,

i.e. the second of those relationships. They are equivalent.

Now, we computed A from E as

A = E (E'E)^{-1}\ ,

Please note that the existence of (E'E)^{-1}\ depends on E having more rows than columns, and on E being of full rank. The matrix E is of rank 3, and the 3×3 matrix E’E is invertible. The 7×7 matrix EE’ is not, and can never be (not with E having 3 columns). And, of course, the matrices A and E themselves cannot possibly be invertible, since they are not square.

The A and E matrices: computing E from A

Had we been given the matrix A, we might guess that we could have computed E from A by some similar formula. Let’s try the obvious,

A\ (A'A)^{-1}\

and see what we get. We have

A' = (E'E)^{-T}\ E'\

A'A = (E'E)^{-T}E'\ E (E'E)^{-1} = (E'E)^{-T}\

(A'A)^{-1} = (E'E)^T

A\ (A'A)^{-1} = E\ (E'E)^{-1}(E'E)^T\ .

That doesn’t look very good — but wait a minute. We have spent a lot of time computing “dispersion matrices”, things of the form E’E — and they are symmetric. That is,

(E'E)^T = E'E\ .

So we continue, getting

A\ (A'A)^{-1} = E\ (E'E)^{-1}(E'E) = E\ .

That’s what we were hoping for, that E is computed from A exactly the same way as A was computed from E. Given the matrix A we would compute the matrix E as

E = A\ (A'A)^{-1}\ .

By the way, we saw, in that process, that

A'A = (E'E)^{-T} = (E'E)^{-1}\ .

This matters later when we define Ma and Me. That latest equation will tell us that Ma and Me are inverses.

The “real” relationship between A and E: dual (reciprocal) bases

All of Cohen’s work in this example treats A and E as matrices. There is another interpretation, a crucial interpretation. I will return to this in another post, but I have to tell you what he “really” did.

We have seen this computation before. Some of you may be sick of it. Well, it isn’t quite what we have seen before.

What we have seen before is a square matrix — specifically a transition matrix defining a basis — and we have wanted to compute the reciprocal basis. I even emphasized that I do not remember the answer, but I remember how to get it. If P is the given transition matrix, and R is the transition matrix for the reciprocal basis, then I want to compute all the dot products of the columns of R with the columns of P, and I want the answers to be an identity matrix. That is, I want an identity matrix to be the matrix product of rows of R’ with columns of P,

R’P = I

so R is the inverse transpose of P:

R = P^{-T}\ .

Wait just a minute. For a reciprocal basis, we just wrote

R’P = I.

For A and E, we had

E’A = I.

Guess what? Each column of E is a reciprocal basis vector for the columns of A. What we have are three columns of A, 7-dimensional but spanning a 3-D subspace. And the three columns of E are a reciprocal basis.

For reasons that will be far more obvious later, here is a place where I would choose to call the columns of E a dual basis rather than a reciprocal basis. Computationally — in finite dimensional spaces — there is no difference, but conceptually it will help a lot even there. In fact, I would say that we need to distinquish them conceptually precisely because we can’t tell the difference numerically.

It also turns out that it’s better to think of the columns of E (column vectors) as the original basis, and the rows of A’ (row vectors) as the dual basis. It might be best to think of the pair, columns of E and row of A’, as a set of dual bases, plural. Each row of A’ represents a so-called linear functional, a linear operator mapping a spectrum (a column vector) to a real number (the suffix -al in functional says the output is 1D). If this is new to you, relax. We’ll see it again, with real matrices A and E and real spectra.

I have barely discussed the dual space (that’s where dual vectors live, while reciprocal vectors live in the same space as the original basis) in these posts. We will get to it. If you know about it, good: for color theory, think dual basis rather than reciprocal basis. If you don’t know about the dual space, it’s okay to think reciprocal basis, for a while.

Okay, I’ll tell you that if you’ve ever seen differential forms, you’ve seen dual bases:

Picture 46

I should remark that Cohen never uses either term, reciprocal basis or dual basis. He knows that the relationship between A and E is crucial, he just doesn’t use the mathematical name for the relationship.

Now is a good time to remark, as I have before for other authors, that Cohen was almost certainly breaking new ground. I will not fault him for not saying it “correctly”; he deserves far too much credit for getting the mathematics right in the first place.

So, we still have exactly what we started with, the A and E matrices. In fact, we start with either one of them and get the other.

His dispersion matrices Me and Ma, and their Cholesky (LU) decompositions

I am going to show you the rest of what he did. We will later do almost all of this differently, but I had to know what he was doing. And if you are reading Cohen, then I have to show you a better way to do what he did. He computed (and named) the two intermediate dispersion matrices:

Me = E’E

Ma = A’A

and then he focuses on the inverses Me^{-1}\ and Ma^{-1}\ . Oh, he also knows that Me and Ma are themselves inverses, as I noted earlier when we worked out E in terms of A:

Me^{-1} = E'E^{-1} = A'A = Ma\ .

(That is, there are only two distinct matrices among these 4 names.)

Anyway, here are Me, and its inverse…

Me = \left(\begin{array}{ccc} 3.7936 & 3.0166 & 1.9818 \\ 3.0166 & 6.9586 & -2.7544 \\ 1.9818 & -2.7544 & 8.44\end{array}\right)

Me^{-1} = \left(\begin{array}{ccc} 0.8981 & -0.5429 & -0.3881 \\ -0.5429 & 0.4933 & 0.2885 \\ -0.3881 & 0.2885 & 0.3038\end{array}\right)

and, for example, Ma…

Ma = A^T A= \left(\begin{array}{ccc} 0.8981 & -0.5429 & -0.3881 \\ -0.5429 & 0.4933 & 0.2885 \\ -0.3881 & 0.2885 & 0.3038\end{array}\right)

which is, as it should be, Me^{-1}\ .

Although he did not describe it this way, he next computed two matrices by doing Cholesky decompositions of Ma and Me. I think it is safe to describe the Cholseky decomposition as a special case of the LU decomposition. It is the result of an LU decomposition applied to a positive define symmetric (Hermitian in general) matrix, and the resulting L and U matrices are transposes, so there’s really only one of L and U to be found.

If you are following along with Cohen, let me warn you that I am going to change notation. Why? Because there’s a little magic going on here and a slightly different notation will focus on it. Besides, his notation seems silly. He has A and E, ends up with F1 and F2 — and along the way he uses G and Gbar.

I’ll keep his F1 and F2, since they are among his final results, but I am going to use Ga and Ge for the intermediate G’s. Trust me: you want me to do that.

Let me just do them. For Ma, we get…

Ga = \left(\begin{array}{ccc} 0.9477 & 0 & 0 \\ -0.5729 & 0.4062 & 0 \\ -0.4095 & 0.1326 & 0.3442\end{array}\right)

What he has found is a lower triangular “square root” of Ma. That is, we have Ga such that

Ga Ga’ = Ma.

(This is his Gbar.)

We do it for Me, too…

Ge = \left(\begin{array}{ccc} 1.9477 & 0 & 0 \\ 1.5488 & 2.1354 & 0 \\ 1.0175 & -2.0279 & 1.8144\end{array}\right)

As before, we have gotten Ge such that

Ge Ge’ = Me.

Note that I have used Ga and Ge to denote what we took the Cholesky decomposition of: Ga came from A’A, and Ge came from E’E, which is the inverse of A’A.

His orthonormal bases F1 and F2

He then defines

F1 = A Ge

F2 = E Ga.

That is what I want you to see: he uses Ge with A, and Ga with E. This guarantees that F1 and F2 are orthonormal matrices. (See below.) And if I view Ge and Ga as transition matrices, then F1 is apparently a basis for the column space of A, F2 for the column space of E. I have to confess that I don’t really care about F1 and F2 — I can do far better.

But, here are Cohen’s F1 and F2:

F1 = \left(\begin{array}{ccc} 0.5134 & -0.3724 & 0.3981 \\ 0.2801 & -0.3734 & 0.2523 \\ -0.2801 & -0.5632 & -0.1968 \\ -0.4201 & -0.3765 & 0.5313 \\ -0.5134 & -0.0959 & -0.0948 \\ -0.3734 & 0.1857 & 0.417 \\ 0 & 0.4683 & 0.5234\end{array}\right)

F2 = \left(\begin{array}{ccc} 0.1287 & 0.2652 & 0.6884 \\ 0.111 & 0.0512 & 0.5163 \\ 0.2158 & -0.5985 & 0.1721 \\ -0.4744 & -0.4185 & 0.4475 \\ -0.17 & -0.4725 & -0.1721 \\ -0.5851 & -0.0739 & 0 \\ -0.5729 & 0.4062 & 0\end{array}\right)

Okay, just why did F1 and F2 turn out to be orthonormal? That is, they each satisfy F’ F = I:

 F1^TF1 = F2^T F2 = \left(\begin{array}{ccc} 1. & 0 & 0 \\ 0 & 1. & 0 \\ 0 & 0 & 1.\end{array}\right)

Let’s see why. Let’s compute F1′ F1. We have

F1 = A Ge.

Then

F1′ F1 = Ge’ A’A Ge = Ge’ Ma Ge

= Ge' Me^{-1} Ge = Ge' (Ge Ge')^{-1} Ge

= Ge^T (Ge^{-T} Ge^{-1}) Ge = I\ I = I\ .

It works out because Ma and Me are inverses, and Ge is used with A to get F1. A similar calculation works for F2.

I want to point out that F1 and F2 are not dual to each other; we have, for example, F1′ F2 != I:

F1^T F2 = \left(\begin{array}{ccc} 0.541775 & 0.764073 & 0.350247 \\ -0.392952 & 0.598605 & -0.698041 \\ -0.743014 & 0.240551 & 0.624553\end{array}\right)

Of course not. Just as an orthonormal basis is its own reciprocal basis, so it “is” its own dual basis, at least in a finite dimensional space. I waffle and use quoted “is”, because the two bases which are dual to each other live in different vector spaces — but if one is orthonormal, the dual has the same numerical components. We will see this explicitly. In this case, it means that F1 “is” its own dual basis, and F2 “is” its own dual basis.

Let me say that again, approximately but simply: F1 is dual to itself, and F2 is dual to itself; they are not dual to each other.

But today we know how to get such bases without the Cholesky decomposition. We also know that we can use the same orthonormal basis for both of those spaces.

I will do that in the next post.

Using the A matrix: applied to the columns of E

I ask again, what is the point of the A matrix? Applied to a spectrum — careful! it’s actually the transpose A’ that is applied to a spectrum — it generates “R,G,B” values. Of course, this A matrix is a toy, but better to start with 3×7 than 3×81 or even larger. We’ll get there.

There are two interesting possible spectra to which we might apply A’, even from a toy matrix A. The obvious possibility, to me at least, it to apply A’ to E. (That is, we are applying A to 3 spectra at once.) Now, we already know the answer, because E was chosen as the dual basis. That is, we have already seen that

A’E = I.

That is, if we let the columns of E be E1, E2, E3, then we have

A’ E1 = (1,0,0)

A’ E2 = (0,1,0)

A’ E3 = (0,0,1).

In words, E1 is a spectrum which leads pure red; E2 generates pure green; and E3 generates pure blue. More precisely, E1 (etc.) is a spectrum which generates our chosen red. I’ll make more sense of that when we have real A matrices. (Oh, yes, there are more than one.)

Using the A matrix: applied to an equal energy (constant) spectrum

The second possibility is an equal energy spectrum:

ee= \{1,1,1,1,1,1,1\}

A\ ee = \{-1.27646,0.43012,1.00883\}

Let’s not worry too much about that negative value; what it means is that we cannot actually match the equal energy spectrum (with whatever our toy “R,G,B” light sources are); instead, we have to add red light to the equal energy spectrum, and match that resulting mixture.

Perhaps now is a good time to say that the key point here is that we get 3 values; our perception of color vision is specified by 3 numbers. In practice, “CIE tristimulus values X,Y,Z” will be our objective.

The projection matrix R

There is one final crucial question. A’ maps from 7D to 3D. It is of full rank 3, so it has a nullspace of dimension 4. (If you need, now would be a good time time to look at fundamental subspaces.)

There are a whole lot of spectra out there whose R,G,B values are {0,0,0}. More importantly, any given spectrum can be split into two vectors, one in the nullspace of A’ and the other in the preimage of its range (i.e. in the range of A).

How would we find that decomposition?

Any direct sum decomposition has projection operators associated with it, so let’s find the projection operator onto the preimage of the range.

Malinowski did that. I did it differently. Cohen does it the same way Malinowski did, but it looks more compact because he defines both E and A. Cohen writes

R = E A’.

If we use

E = A\ (A'A)^{-1}\

to expand Cohen’s equation, we get

R = A\ (A'A)^{-1}\ A^T\ .

We have also seen that this is the “hat matrix”, which projects an observation onto the subspace spanned by the least-squares fit. That is, we have

yhat = X\ \beta\ ,

and

\beta = (X'X)^{-1}\ X'\ y\ ,

so

yhat = X\ (X'X)^{-1}X'\ y\

and if we define the hat matrix H by

H = X\ (X'X)^{-1}\ X'\ ,

we have

yhat = H y.

The key is that H is to X as R is to A.

Just for completeness, I compute R as Cohen did:

R = E^TA = \left(\begin{array}{ccccccc} 0.5608 & 0.3833 & -0.0124 & 0.136 & -0.2656 & -0.0949 & 0.034   \\ 0.3833 & 0.2815 & 0.0822 & 0.157 & -0.1319 & -0.0687 & -0.0428   \\ -0.0124 & 0.0822 & 0.4344 & 0.2251 & 0.2165 & -0.0821 &   -0.3668 \\ 0.136 & 0.157 & 0.2251 & 0.6005 & 0.2014 & 0.3085 & 0.1018 \\ -0.2656 & -0.1319 & 0.2165 & 0.2014 & 0.2818 & 0.1344 &   -0.0946 \\ -0.0949 & -0.0687 & -0.0821 & 0.3085 & 0.1344 & 0.3478 &   0.3052 \\ 0.034 & -0.0428 & -0.3668 & 0.1018 & -0.0946 & 0.3052 & 0.4933\end{array}\right)

Since we know that projection operators are idempotent, we can check that R^2 = R\ . (It is.)

The projection of the columns of E

Having gotten the projection operator R, let’s use it. First, apply it to E, the dual basis to the rows of A’.

A^T E= \left(\begin{array}{ccc} 1. & 0 & 2. \\ 0.5455 & -0.3636 & 1.5 \\ -0.5455 & -1.6364 & 0.5 \\ -0.8182 & -1.4545 & 1.3 \\ -1. & -1. & -0.5 \\ -0.7273 & -0.1818 & 0 \\ 0 & 1. & 0\end{array}\right)

I hope it is not a surprise that we just got E back: the projection of E is E.

But what does that mean? No part of any column of E is in the nullspace of A’.

The projection of the equal energy spectrum

Recall the equal energy spectrum:

ee = \{1,\ 1,\ 1,\ 1,\ 1,\ 1,\ 1\}\

I will denote the projection as f, for fundamental.

f = R\ ee = \{0.741189,0.660536,0.496875,1.73027,0.34193,0.850176,0.43012\}

Well, that has a nontrivial component in the nullspace, call it n. (We know that, as soon as we see f != ee.) We have

ee = f + n,

where n is simply the difference, n = ee – f, namely

n = \{0.258811,\ 0.339464,\ 0.503125,\ -0.730266,\ 0.65807,\ 0.149824,\ 0.56988 \}

Direct computation confirms that n is in the nullspace of A’:

A^T\ n = \{0,\ 0,\ 0\}

Hang on just one second. What would you call a color whose “R,G,B” values were all zero?

I would call it black.

Because n itself, however, is a nontrivial spectrum — in the sense that there’s light there at each of those 7 frequencies! — let us refer to that spectrum as metameric black. (Two different spectra are said to be metamers or metameric if they appear to be the same color under the same illumination.)

It’s black to our eyes. (Although I could guess that if it were sufficiently intense, we might find ourselves with sore eyes from the radiation that is hitting them.)

I would emphasize that n depends more on A’, in a sense, than on the original equal energy signal ee. Of course, it came from ee and its fundamental, but once I have this n, I could add it to any other spectrum (of length 7!) without affecting the observed color of that spectrum. (In fact, it will depend on the illumination, too, but I’ll show you that. For now, pretend the illuminant is equal energy, too.)

The point is that in the real world, with real spectra — from pine needles or from some Munsell color chip or anything — every “metameric black” we compute could be added to any other spectrum we have, without affecting the color we perceive.

This, if northing else, reminds us that color is our perception of a light spectrum. And there are physical spectra in the visble region that appear black to us.

We can confirm that the fundamental f has precisely the same “R,G,B” values as the equal energy signal, by computing A’ f

A^T\ f = \{-1.27646,\ 0.43012,\ 1.00883\}

Recall, or recompute A’ ee

A^T\ ee = \{-1.27646,\ 0.43012,\ 1.00883\}

The same. We have indeed split ee into two parts

ee = f + n,

one of which (n) contributes nothing to the final R,G,B values, the other of which (f) contributes everything. This is why f is called the fundamental.

And, our observation that the columns of E are projected onto themselves says that the columns of E are also fundamental spectra.

And they are a basis, so every fundamental — everything we see — could be written as a linear combination of the columns of E. They’re not an orthonormal basis, so we would have to use the dual basis to compute coefficients of the linear combinations. Well, that’s exactly what we accomplish by applying A’ (the dual basis) to the fundamental.

Those 3 numbers (A’f = A’ ee) we get describe the fundamental f in terms of the basis vectors E1, E2, E3:

f = R E1 + G E2 + B E3.

They are simply the components of f wrt the basis E1, E2, E3.

They are not, however, the coefficients of the original signal ee. That lies not entirely in the space spanned by E — that’s what the metameric black n is, the part of the original signal that is not in the space spanned by E.

(They also are not necessarily the RGB values we would use in RGB color space. After all, the components of f could be any real numbers, while RGB color space has restricted values, whether 0 to 1, 0 to 100, or 0 to 255. Don’t worry, we’ll get there.)

Summary

Cohen has a matrix A of “color matching functions”; it has 3 columns and lots more rows.

We construct a dual basis E via

E = A\ (A'A)^{-1}\ .

Since “dual basis” is a dual relationship, we could have — and did! — compute A from a given E:

A = E\ (E'E)^{-1}\

We construct two dispersion matrices

Me = E’E,

Ma = A’A,

and we saw that they were inverses.

We did Cholesky decompositions (LU decompositions of positive definite symmetric matrices) of Ma and Me, getting

Ga Ga’ = Ma,

Ge Ge’ = Me.

Then we used Ge and Ga to construct two orthonormal bases F1 and F2

F1 = A Ge,

F2 = E Ga.

Finally, Cohen computed the projection operator R onto the non-nullspace part of the domain of A’ (the preimage of the range of A’, equivalently the image of A).

Most importantly, at the end, we saw that an equal-energy (constant) spectrum could be split into two parts (f and n), one of which (f) contained all the “R,G,B” nformation, the other of which (n) was metameric black, invisible.

Next, I will show what I would do differently.

Posted in color. Tags: , . Leave a Comment »

Fourier Series and Fourier Transform

I want to show you something that I think is really neat. (Whether you think it is neat is your decision, not mine. I don’t like it when people tell me what I will think about something they’re about to show me.)

I do not know why this calculation is not readily available. All I know is that I had to search high and low to find an example with sufficient detail that I could duplicate it. In fact, it is difficult to find a text which actually states the result at all, never mind one which works it out for a particular case.

  • I intend to take a function which is zero outside the interval [-1/2, 1/2]…
  • I will compute several terms in the associated Fourier series…
  • I will compute its Fourier transform…
  • and I will hit you between the eyes with the relationship between them.

I won’t even keep it a secret:

  • the coefficients in the Fourier series are samples of the Fourier transform.
  • alternatively, the Fourier transform of the Fourier series is discrete samples of the original continuous Fourier transform.

I think that’s marvelous. But even knowing that it’s true, I don’t see it stated very often.

There are several things I will not do in this post.

  • I will use but not discuss Fourier series or the Fourier transform;
  • I will work the example for a very specific set of parameters;
  • that is, I will not yet try changing the interval to, say, [0, 2 \pi\ ];
  • I won’t even prove the relationship;
  • I will not show you the relationship to the discrete Fourier transform (i.e. the FFT);

I should say that the proof really just amounts to showing that the Fourier transform of the cosine exists and is a pair of delta functions.

It may be that the FFT is so important that most people do not care about the relationship between a Fourier series and a Fourier transform. Well, I do.

Having told you all the things I will not do in this post, let me emphasize that I think this example — or, frankly, any similar example — is marvelous and worthwhile. There are a lot of things that have to be done right to get the details to come out right. And given one good example, I can make a whole lot of sense out of theory.

This example is based on one in Bracewell (Bracewell, Ronald N.; The Fourier Transformation and Its Applications. McGraw Hill, 2000 (3rd ed). ISBN 0 07 303938 1), around p. 244, slightly modified. It is a most excellent book for insight; the mathematics is there, and right, but in the background. The text is marred by typos, but is extraordinary despite them.

I need some pictures. To get them, I need four things. I start with a Fourier transform, namely the square of a sinc function, which itself is some form of sin(x) / x:

F(s) = \frac{1}{5} \text{sinc}\left(\frac{\pi\ s}{5}\right)^2

Picture 45

We find the inverse Fourier transform:

f(t) = \frac{5}{2} t   \text{sgn}\left(t-\frac{1}{5}\right)-\frac{1}{2}   \text{sgn}\left(t-\frac{1}{5}\right)-5 t   \text{sgn}(t)+\frac{5}{2} t   \text{sgn}\left(t+\frac{1}{5}\right)+\frac{1}{2}   \text{sgn}\left(t+\frac{1}{5}\right)

Picture 46

Hey, it’s our favorite counterexample for wavelets: the linear spline scaling function. Ok, it’s skinnier, not of width 2, but it is a triangle. This one does not even fill up the interval [-1/2, 1/2].

Next, I need the Fourier Series of that triangle. Well, I don’t really want an infinite series; I just want a finite approximation. Let’s get 7 terms icluding the constant:

0.350056 \cos (2 \pi  t)+0.229115 \cos (4 \pi    t)+0.101829 \cos (6 \pi  t)+0.0218785 \cos (8 \pi    t)+0.00972378 \cos (12 \pi  t)+0.2

Mathematica complains about convergence, but it turns out, it’s good enough. We see that there is no term in  Cos[10 \pi t]\ , so it looks like we only got 6 terms.

Picture 47

I have shown both the Fourier Series in red and the original single triangle in blue. Strictly speaking, and we need to be strict on this point, that is not the Fourier series of the triangle, but of its periodic extension. We might call it the Fourier series associated with the single triangular pulse.

Finally, just for the drawings, I want to sample the Fourier Transform at the integers:

\left(\begin{array}{cc} -6. & 0.00486189 \\ -5. & 0. \\ -4. & 0.0109393 \\ -3. & 0.0509144 \\ -2. & 0.114557 \\ -1. & 0.175028 \\ 0. & 0.2 \\ 1. & 0.175028 \\ 2. & 0.114557 \\ 3. & 0.0509144 \\ 4. & 0.0109393 \\ 5. & 0. \\ 6. & 0.00486189\end{array}\right)\ .

Picture 48

Let me put it all together.

Picture 49

The top left is a single triangular pulse in the time domain, of height 1, centered at the origin, of width 0.4. We might say it is band-limited in time, or that it is of compact support.

The top right is the Fourier transform of that triangular pulse. It is continuous, and not of compact support.

The bottom left shows the triangular pulse and (the first 7 terms of) its fourier series. Although the triangular pulse is not periodic, the Fourier series is.

The bottom right shows two things. The continuous Fourier transform and red dots which are, by definition, samples of that transform.

Drum roll, please. They are also the coefficients of the fourier series.

How can that be? For crying out loud, there are 13 samples but 7 coefficients! Worse, although the constant term in the Fourier series is .2, and that is equal to the sample F[0], the first cosine term has coefficient 0.350056 while the sample F[1] = 0.175028 .

Hey, if we multiply the sample value by 2, we get: 0.350056 . Which is our leading cosine coefficient.

Conceptually, some of our samples are for negative frequencies, but we wrote the Fourier series with positive frequencies only. If we use complex exponentials instead of cosines, then we get negative frequencies too, and the series coefficients will be cut in half (except the constant term).

Alternatively (but equivalently), since Cos[-2 \pi t] = Cos[2 \pi t]\ , an individual series term such as

0.350056 \cos (2 \pi  t)\

could be wriiten as

0.175028 Cos[2 \pi t] + 0.175028 Cos[-2 \pi t]\ ,

and there are our sample values equal to 0.175028 at -1 and 1.

Let me take a longer look.

We started with the Fourier transform:

F(s) = \frac{1}{5} \text{sinc}\left(\frac{\pi    s}{5}\right)^2\ .

If we remember that the sinc function is the Fourier transform of a rectangle, then the sinc^2 function is the Fourier transform of the convolution of two rectangles.

But the convolution of two rectangles is a triangle. That the inverse Fourier transform is a triangle should seem more than plausible.

Specifically, it turns out to be:

f(t) = \frac{5}{2} t\    \text{sgn}\left(t-\frac{1}{5}\right)-\frac{1}{2}   \text{sgn}\left(t-\frac{1}{5}\right)-5 t\   \text{sgn}(t)+\frac{5}{2} t\    \text{sgn}\left(t+\frac{1}{5}\right)+\frac{1}{2}   \text{sgn}\left(t+\frac{1}{5}\right)\ .

I can’t say I like the way Mathematica writes it, but the graph showed that we do in fact have a single triangular pulse.

Then I asked Mathematica for the first 7 terms of the Fourier series of the triangle:

fs(t) = 0.350056 \cos (2 \pi  t)+0.229115 \cos (4 \pi    t)+0.101829 \cos (6 \pi  t)+0.0218785 \cos (8 \pi    t)+0.00972378 \cos (12 \pi  t)+0.2\ .

As I said, it turns out there are in fact only 6 terms: we are missing one in Cos[10 \pi t]\ . (Strictly speaking, my command probably asked Mathematica for the first 6 non-zero terms. Whatever.)

Finally, I computed and plotted samples of the fourier transform. What were the samples? The plotted points were

\left(\begin{array}{cc} -6. & 0.00486189 \\ -5. & 0. \\ -4. & 0.0109393 \\ -3. & 0.0509144 \\ -2. & 0.114557 \\ -1. & 0.175028 \\ 0. & 0.2 \\ 1. & 0.175028 \\ 2. & 0.114557 \\ 3. & 0.0509144 \\ 4. & 0.0109393 \\ 5. & 0. \\ 6. & 0.00486189\end{array}\right)\

so the y-values were the right column of that array.

Let me explicitly convert the fourier series to complex exponentials, using

(e^{i t} + e^{i t})/2 = Cos (t)\

fs(t) = 0.2+0.175028 e^{-2 i \pi  t}+0.175028 e^{2 i \pi    t}+0.114557 e^{-4 i \pi  t}+0.114557 e^{4 i \pi    t}+0.0509144 e^{-6 i \pi  t}+0.0509144 e^{6 i \pi    t}+0.0109393 e^{-8 i \pi  t}+0.0109393 e^{8 i \pi    t}+0.00486189 e^{-12 i \pi  t}+0.00486189 e^{12 i   \pi  t}\ .

It takes me a little work to get Mathematica to display those coefficients in order (i.e. the coefficient of E^{-12 i \pi t} \ followed by the coefficient of E^{-8 i \pi t}\ , and so on). but when I have them, the ordered list is:

{0.00486189, 0, 0.0109393, 0.0509144, 0.114557, 0.175028, 0.2,
0.175028, 0.114557, 0.0509144, 0.0109393, 0, 0.00486189}

so there we have the 13 fourier series coefficients for the first several powers of e^{2 i n \pi t}\ , where “several” is n = -6 to 6.

I observe that the missing term Cos[10 \pi t]\ in the Fourier series corresponds to a sample value of… 0.

Hit that drum again, maestro, while I repeat myself. Those 13 coefficients are exactly the 13 samples of the fourier transform.

I am not quite done. What we have is a fascinating observation. The Fourier series coefficients are samples of the Fourier transform.

But why?

Consider that we have two different functions of time: the triangle and the (truncated) Fourier series.

Maybe we should find the Fourier transform of the Fourier series instead of the Fourier transform of the single triangle.

Recall the truncated fourier series

fs(t) = 0.350056 \cos (2 \pi  t)+0.229115 \cos (4 \pi    t)+0.101829 \cos (6 \pi  t)+0.0218785 \cos (8 \pi    t)+0.00972378 \cos (12 \pi  t)+0.2\ .

Ask for its fourier transform and get

FS(s) = 0.00486189\ \delta (s-6)+0.0109393\ \delta (s-4)+0.0509144\ \delta (s-3)+0.114557\ \delta (s-2)+0.175028\ \delta (s-1)+ 0.2\ \delta   (s)+0.175028\ \delta (s+1)+0.114557\ \delta (s+2)+0.0509144\ \delta (s+3)+0.0109393\ \delta(s+4)+0.00486189\ \delta (s+6)\ .

We see two things. One, Dirac deltas at the integers. This tells us that the Fourier transform of the Fourier series is discrete not continuous. Two, the coefficients in front of the Dirac deltas are precisely the coefficients in the Fourier series. This is why the coefficients of the Fourier series match the samples — because the samples are the Fourier transform of the Fourier series.

Blow your trumpet, Gabriel:

  • we have a continuous Fourier transform for the triangular pulse,
  • we have a discrete Fourier transform for the (periodic) Fourier series associated with the triangular pulse,
  • and that discrete Fourier transform consists of samples of the continuous Fourier transform.

I’m not sure that a drawing of the Fourier transform of the Fourier series will be convincing. After all, the easiest way to draw it is to plot the samples of the continuous Fourier transform! Those red dots on the picture, by themselves, are the Fourier transform of the Fourier series.

Maybe it’s okay to ask and emphasize: why are the two Fourier transforms not the same? Because we’ve taken Fourier transforms of two different functions. The (infinite) Fourier series is not equal to the triangular pulse (nor is the finite approximation); the series and the finite approximation correspond to a periodic extension of the triangular pulse, and an approximation of that periodic extension. In our first math class about Fourier series we ignored that difference, and focused on one period of the periodic extension; the Fourier transform shows us that we sometimes need to honor the difference between a non-periodic function and a periodic extension of it.

That it works out exactly depends on the Fourier transform of a cosine being precisely a (pair of) Dirac delta(s). If there’s a proportionality constant there, then we’ll be off by that constant factor. Similarly, that the samples were at the integers just required that we be taking the fourier transform of, e.g. Cos[2 \pi t]\ rather than of Cos[t].

This may be why this example is so hard to find. You will discover that most of your books use a different form of the Fourier transform (different choices of the FourierParameters).

Let me state my Fourier transform pair explicitly. I used the Mathematica option FourierParameters -> \{0, -2 \pi\}\ . Then the Fourier transform F(s) of a function f(t) is

Picture 52

and the inverse Fourier transform recovers f(t) from F(s):

Picture 53

Picture 54

Oh, there’s one other convention to be aware of, the definition of the sinc function.

Mathematica is using

sinc(x) = (sin x)/x,

and Bracewell is using

sinc(x) = (sin\ (\pi\ x))/(\pi\ x)\ .

Where Bracewell wrote (for the transform of a triangle of half-width 1/10)

\frac{1}{10}\ sinc^2(s/10)\ ,

I would have needed to write

1/10 sinc^2((\pi s)/10) = 1/10 sin^2((\pi s)/10) / (\pi s/10)^2\ .

(In fact, I arranged to use a triangle of half-width 1/5, instead.)

Let me close with a Mathematica detail. If you attempt to confirm either of those transform integrals directly, you should define a function using “set-delayed”, i.e. “:=”, instead of “=”. There’s a parameter inside each of those integrals, and you want to evaluate the integral first, then the parameter. That is, for example, if we write the inverse transform as… then we get…

Picture 50

and in particular, f1[0] = 1/2, which is wrong.

Instead,

Picture 51

Further, if you want to plot that inverse transform, you will probably find that you will need to compute points explicitly and plot them; the “//Evaluate” that would speed up the computation within the plot command appears to override the “set-delayed”.

It would probably be very instructive for me to look at all this for a triangular pulse of width greater than 1. Maybe I will. But this post was delayed while I fought with the explicit integrals, and I’ve returned to color theory….

Oh, I should at least point out that this example gives us a very useful insight:
because we recognize the Fourier series coefficients as samples of the Fourier transform, we know a heck of a lot about the sizes of the Fourier series coefficients, even without computing them. In particular, looking at the continuous transform at integers 7 and 8, we see that the next coefficient in the series is larger than the last one we computed, and the one after that is about the same size as the last one computed, and all the ones after that are very small. We know this without computing them, just by looking at the graph. (OK, it helps that we understand that the graph does get smaller and smaller as the x^2 — make that s^2 — in the denominator gets larger.)

Linear Algebra: the four fundamental subspaces

Edit 18 Sep 2009. Purely cosmetic. I have highlighted in blue the result describing the spans of the parts of the u and v matrices. I have found that I still need to refer to it rather than reconstruct it when I want it. In other words, it’s what I want to pick out quickly from this post.

Given a linear operator A, we are told that there are four fundamental subspaces associated with it. They are

  • the nullspace of A
  • the column space of A
  • the nullspace of A^T\ (i.e. of the transpose of A)
  • the column space of A^T\ .

Okay, they are easy enough to list.

Next, we need to realize that the column space of A is the range of A, and the column space of A^T\ is the range of A^T\ . That’s what we really care about — the ranges.

But why is A^T\ involved? In particular, why do we care about the nullspace and the range of A^T\ ?

Let me show you what they “really” are. (That is, this is how I understand them.)

I’m going to use an exercise from Strang’s “Linear Algebra and Its Applications”; see the bibliography.

This is exercise 2.4.2.. “Find the dimension and construct a basis for the four subspaces associated with…”

A = \left(\begin{array}{cccc} 0 & 1 & 4 & 0 \\ 0 & 2 & 8 & 0\end{array}\right)

(Of course, since the second row is twice the first, we already know that the matrix is of rank 1.)

Now, Strang presents the four fundamental subspaces by using the LDU (LU) decomposition; I am going to use — what else? — the Singular Value Decomposition (SVD).

Here is the Mathematica command…

Picture 42

We know all about the SVD. In particular, we know that the SVD decomposes the matrix A as

A = u\ w\ v^T\ ,

where u and v are orthogonal matrices; and w is the same shape as A and, loosely speaking, as diagonal as it can be.

Here is the w matrix…

w = \left(\begin{array}{cccc} \sqrt{85} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0\end{array}\right)

That we have only one nonzero value tells us, if we didn’t already know it, that the matrix A is of rank 1. ( This is a representation of the A matrix with respect to different bases, specified by u and v. As such, it is the same shape and of the same rank. Since w is of rank 1, so is A.)

Here is the v matrix…

v = \left(\begin{array}{cccc} 0 & 0 & 0 & 1 \\ \frac{1}{\sqrt{17}} & 0 & -\frac{4}{\sqrt{17}} & 0 \\ \frac{4}{\sqrt{17}} & 0 & \frac{1}{\sqrt{17}} & 0 \\ 0 & 1 & 0 & 0\end{array}\right)

The columns of v are an orthonormal basis for the domain of A. Let’s apply A to it, that is to each of these basis vectors.

A\ v = \left(\begin{array}{cccc} \sqrt{17} & 0 & 0 & 0 \\ 2 \sqrt{17} & 0 & 0 & 0\end{array}\right)

Well! The three rightmost columns of v were mapped to zero by A; we conclude that they are a basis for the nullspace of A.

What about the leftmost (remaining) column of v? Its image, by definition, is in the range of A. So the first column of v is a basis for the pre-image of the range of A.

(Once we know what A does to a basis for its domain, we know what it does to any vector in its domain.)

That’s the pre-image of the range. What about the range of A? Well, let’s look at the matrix u:

u = \left(\begin{array}{cc} \frac{1}{\sqrt{5}} & -\frac{2}{\sqrt{5}} \\ \frac{2}{\sqrt{5}} & \frac{1}{\sqrt{5}}\end{array}\right)

We see that the first column of u is proportional to the image of the first column of v. They span the same space — which is the range of A. Look at the algebra instead of the numbers:

A\ v = u\ w\ v^T\ v = u\ w\ ,

and the almost-diagonal form of w says that the columns of u w (equivalently, the columns of A v) are multiples of the columns of u. OK, some of them are zero multiples, but the real point is that the nonzero columns of A v are multiples of corresponding columns of u.

What about the other column of u? The matrix A maps from R^4 to R^2; we say that its codomain is R^2. And its codomain is split into two subspaces, each 1-dimensional:

  • the range of A
  • the part of the codomain that is not the range (“the part we cannot reach with A”).

The rightmost column of u is a basis for the rest of the codomain.

Okay…. We seem to have four fundamental subspaces, and we have orthonormal bases for them, which means we have orthogonal direct sums:

  • domain of A = nullspace of A \oplus\ preimage of the range of A
  • codomain of A = range of A \oplus\ the part we cannot reach with A.

The columns of v split naturally into bases for the nullspace and for the pre-image of the range of A; the columns of u split naturally into bases for the range of A and for the part we cannot reach with A. That is,

  • nullspace of A is spanned by rightmost columns of v
  • pre-image of range of A is spanned by leftmost columns of v
  • range of A is spanned by leftmost columns of u
  • the part we cannot reach is spanned by rightmost columns of u

But what about the range and nullspace of A^T\ ? Now would be a good time to guess that

  • the preimage of the range of A is the range of A^T\
  • the part we cannot reach with A is the nullspace of A^T\ ,

and that’s why A^T\ shows up: it provides more convenient (and in a sense, symmetric) names for two of the four spaces.

Maybe we should write the SVD of A^T\ ? From

A = u\ w\ v^T\ ,

we get

A^T = v\ w^T\ u^T\ .

The roles of u and v have been swapped. (The point is that there are no new bases involved, still just the columns of u and v.)

Then our description of the spaces and bases of A translates to the following description of the spaces and bases of A^T\ :

  • nullspace of A^T\ is spanned by rightmost columns of u
  • pre-image of range of A^T\ is spanned by leftmost columns of u
  • range of A^T\ is spanned by leftmost columns of v
  • the part we cannot reach with A^T\ is spanned by rightmost columns of v.

Our guess was right. The bases tell us that we have two names for the spaces spanned by each of the 4 partial bases.

  • the rightmost columns of v span the nullspace of A, and equivalently the part we cannot reach with A^T\
  • the leftmost columns of v span the pre-image of the range of A, and equivalently the range of A^T\
  • the rightmost columns of u span the nullspace of A^T\ , and equivalently the part we cannot reach with A
  • the leftmost columns of u span the pre-image of the range of A^T\ , and equivalently the range of A.

Therefore, instead of naming the orthogonal direct sums as

  • domain of A = nullspace of A \oplus\ preimage of the range of A
  • codomain of A = range of A \oplus\ the part we cannot reach with A

we call them

  • domain of A = nullspace of A \oplus\ range of A^T\
  • codomain of A = range of A \oplus\ the nullspace of A^T\

(If you think about it, those are the only two combinations of “range” and “nullspace” that make sense, i.e. that are subspaces of the domain and codomain respectively.)

BTW, we should actually solve the exercise (!) and write down the bases.

Recall v.

v = \left(\begin{array}{cccc} 0 & 0 & 0 & 1 \\ \frac{1}{\sqrt{17}} & 0 & -\frac{4}{\sqrt{17}} & 0 \\ \frac{4}{\sqrt{17}} & 0 & \frac{1}{\sqrt{17}} & 0 \\ 0 & 1 & 0 & 0\end{array}\right)

For the nullspace of A, the 3 rightmost columns of v:

v2 = \left(\begin{array}{ccc} 0 & 0 & 1 \\ 0 & -\frac{4}{\sqrt{17}} & 0 \\ 0 & \frac{1}{\sqrt{17}} & 0 \\ 1 & 0 & 0\end{array}\right)

Strang’s basis is equivalent:

\left(\begin{array}{ccc} 1 & 0 & 0 \\ 0 & -4 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1\end{array}\right)

Check mine by applying A to each column…

A\ v2 = \left(\begin{array}{ccc} 0 & 0 & 0 \\ 0 & 0 & 0\end{array}\right)

For the nullspace of A^T\ , recall u:

u = \left(\begin{array}{cc} \frac{1}{\sqrt{5}} & -\frac{2}{\sqrt{5}} \\ \frac{2}{\sqrt{5}} & \frac{1}{\sqrt{5}}\end{array}\right)

then the rightmost columns of u are:

u2 = \left(\begin{array}{c} -\frac{2}{\sqrt{5}} \\ \frac{1}{\sqrt{5}}\end{array}\right)

Strang’s basis is equivalent: \{-2,1\}\ .

Check mine by applying A^T\ to it:

A^T\ u2 = \left(\begin{array}{c} 0 \\ 0 \\ 0 \\ 0\end{array}\right)

For the range of A, the first column of u:

u1 = \left(\begin{array}{c} \frac{1}{\sqrt{5}} \\ \frac{2}{\sqrt{5}}\end{array}\right)

Strang’s basis is equivalent: \{1,2\}\ .

For the range of A^T\ , the leftmost column of v:

v1 = \left(\begin{array}{c} 0 \\ \frac{1}{\sqrt{17}} \\ \frac{4}{\sqrt{17}} \\ 0\end{array}\right)

Again, his basis is equivalent: \{0,1,4,0\}\ .

I should point out that Strang’s bases for the domain and codomain (found from the LDU decomposition) are orthogonal but not orthonormal. Mine are orthonormal.

I should also point out that I really, really like orthonormal bases. Sometimes there’s a good reason not to use them, but barring such a reason – orthonormal. And we will be using this material for light spectra.

So, as I see it, we have two principles here.

First, the 4 fundamental subspaces “really” are

  • an orthogonal decomposition of the domain of A:
    • nullspace of A
    • preimage of the range of A
  • an orthogonal decomposition of the codomain of A:
    • the range of A
    • the subspace we cannot reach with A

Second, we can describe two of those subspaces in terms of A^T\ ,

  • the range of A^T\ = preimage of the range of A
  • the nullspace of A^T\ = the subspace we cannot reach with A.

Thus, more compactly, we have the 4 fundamental subspaces as

  • an orthogonal decomposition of the domain of A:
    • nullspace of A
    • range of A^T\
  • an orthogonal decomposition of the codomain of A:
    • the range of A
    • nullspace of A^T\

As Paul Harvey used to say, “And now you know the rest of the story.”

Transpose & Adjoint: review & example

introduction

Let me pick up an old problem. The mathematics comes from the two posts “transpose matrix and adjoint operator” part 1 & part 2. The problem itself comes from the schur’s lemma post.

Upon further reflection, I am going to change the problem a little bit. Do not expect to see the same answers as before.

I am also going to work it twice, assuming that we are given different information as our starting point, but I’ll do it for the very same problem.

As I have said, the appropriate question for an introduction to ABO blood groups is: Can your mother donate blood to you? Until you can answer that question, you’re missing something about blood groups.

This example is not that good, but it does tie together the following concepts and computations:

  • the transition matrix;
  • the attitude matrix;
  • the change of basis formula;
  • the reciprocal basis;
  • the transpose of a matrix;
  • the adjoint of a linear operator;
  • the matrix of the adjoint operator.

This example is intended as a guide to and a review of my relevant posts, and will not repeat very much of the explanations in them.

Well, I have found that the explanations are growing.

What is the problem? Given a matrix and a non-orthonormal basis, find the matrix of the adjoint operator with respect to the non-orthonormal basis.

What is the key? We only know one way to find the matrix of the adjoint operator: take the transpose of some matrix.

Here is the general case: given a matrix X with respect to a basis E, we can transpose it, getting X^T\ . But that is the matrix of the adjoint with respect to the dual basis F. To get the matrix with respect to the basis E, we must do a change-of-basis to get from X^T\ with respect to to F to, say, Y^T\ with respect to E.

There is, however, a very special case: if E is an orthonormal basis, then it is its own reciprocal basis, F = E, and we were done as soon as we computed the transpose, becuse X^T\ is the matrix of the adjoint with respect to E.

The general problem, then, is: given a matrix, find the matrix of its adjoint with respect to a non-orthonormal basis. We will consider two possibiities: first, that the original basis is orthonormal; second, that it is not.

I would like to make one final point before we dive in. There is a fundamental truth here, regardless of the precise definition of the adjoint operator. The fact is that two matrices A and A^T\ represent two different but related linear operators (with respect to an orthonormal basis); but if we transform those matrices to a non-orthonormal basis, the two results will not be the transposes of each other.

In other words, the relationship between two linear operators which is represented by a matrix and its transpose, holds only in an orthonormal basis.

first form of the question

First, let us suppose we are given a matrix A with respect to an orthonormal basis; and a description of a non-orthonormal basis.

Here is the given matrix A:

A = \left(\begin{array}{cc} 1 & 1 \\ 0 & 2\end{array}\right)

(This much is the same as before; this is the same matrix as in the previous posts.)

Here is the given non-orthonormal basis \{f_1, f_2\}\ specified by an attitude matrix…

att = \left(\begin{array}{cc} 1 & -1 \\ 1 & 0\end{array}\right)

Find the matrix, which I will call C^T\ , of the adjoint operator with respect to the non-orthonormal basis.

Easy enough, once you’re used to it: we have A with respect to an orthonormal basis, so we form its transpose A^T\ . This is the matrix of the adjoint with respect to the orthonormal basis. Now we just need to transform A^T\ to the non-orthonormal basis.

We want the transition matrix P, which is the transpose of the attitude matrix: P = {att}^T\ .

P = \left(\begin{array}{cc} 1 & 1 \\ -1 & 0\end{array}\right)

One form of the change-of-basis formula tells us how to find the matrix B with respect to the new basis corresponding to A with respect to the original:

B = P^{-1} A P\ ,

so we get

B = \left(\begin{array}{cc} 2 & 0 \\ -2 & 1\end{array}\right)

That is, A and B are matrix representations, with respect to two bases, of a single linear operator. We don’t actually need the matrix B, but it serves for displaying the general change-of-basis formula for from A to B. And we will want to contrast it with C^T\ .

We have been asked to find the matrix C^T\ corresponding to A^T\ with respect to a new basis. But that uses the change-of-basis formula, with A^T\ and C^T\ in place of A and B respectively:

C^T = P^{-1} A^T P\ ,

so we get

C^T = \left(\begin{array}{cc} 1 & -1 \\ 0 & 2\end{array}\right)

What we see (as we should expect) is that B and C^T\ are not the transposes of each other.

We have, then, 4 matrices. A and A^T\ represent a linear operator and its adjoint with respect to an orthornormal basis. B and C^T\ represent the same linear operator and its adjoint with respect to the non-orthonormal basis \{f_1, f_2\}\ .

(In principle, of course, we only needed three of those matrices: A, A^T\ , and C^T\ ; but it may be informative to compute B.)

second form of the question
first of three solutions

We are given the matrix B, representing a linear operator with respect to a non-orthonormal basis…

B = \left(\begin{array}{cc} 2 & 0 \\ -2 & 1\end{array}\right)

and we were given the attitude matrix describing that non-orthonormal basis…

att = \left(\begin{array}{cc} 1 & -1 \\ 1 & 0\end{array}\right)

We want to find the matrix C^T\ of the adjoint operator with respect to the non-orthonormal basis.

The quickest way to do it is to transform B to the orthonormal basis, which gives us A. We still use the transition matrix P = {att}^T \

P = \left(\begin{array}{cc} 1 & 1 \\ -1 & 0\end{array}\right)

and the change of basis formula (but note that we’re going in the other direction, to the orthonormal basis instead of from it)…

A = P B P^{-1}\ .

so we get

A = \left(\begin{array}{cc} 1 & 1 \\ 0 & 2\end{array}\right)

We have just reduced this question to the previous problem: now we can get C^T\ by transforming A^T\ to the non-orthonormal basis…

C^T = P^{-1} A^T P\ .

We’re done. We could stop here. Maybe we should. But it’s nice to know more than one way to do something, because then we can check our work.

second of three solutions

Alternatively, we could use the reciprocal basis.

This gets a little messy only because we’re introducing a 3rd basis, so we get 2 more matrices.

We transpose B, and we get the matrix of the adjoint operator with respect to the reciprocal basis:

B^T = \left(\begin{array}{cc} 2 & -2 \\ 0 & 1\end{array}\right)\ .

But we’ve only just begun. We don’t want the adjoint with respect to the reciprocal basis; we want it with respect to the given non-orthonormal basis. It’s just that, ultimately, we have to take the transpose with respect to some basis; it’s the only way to compute the matrix of the adjoint. Then the question

how do we transform from the reciprocal basis (where we have B^T\ ) to the non-orthonormal basis where we want C^T\ ?

becomes

what is the transition matrix from the reciprocal basis to the basis?

And that leads to the question: exactly what is our reciprocal basis?

I could simply say (!) that the attitude matrix for the reciprocal basis is the inverse of the transition matrix P for the non-orthonormal basis; that is, the transition matrix Q for the reciprocal basis is the inverse transpose of P:

Q = P^{-T}\ ,

Q = \left(\begin{array}{cc} 0 & 1 \\ -1 & 1\end{array}\right)

But I think it’s only fair to admit that I don’t remember the description of the attitude matrix for the reciprocal basis. What I have in my head is a matrix equation, which says that the product of the attitude matrix Q for the reciprocal basis with the transition matrix P for the given basis, is the identity:

Q^T P = I\ .

(And that matrix equation simply says that the dot products of the reciprocal basis vectors – the rows of Q^T\ – with the given non-orthonormal basis vectors – the columns of P – are either 0 or 1.)

Then I solve for Q:

Q = P^{-T}\ .

Of course Q is what I said it was, but the point is that I remember the derivation rather than the answer. Fortunately it’s a very short and easy derivation, so I see the answer almost immediately….

Having Q, we can transform B^T\ to (not from) the orthonormal basis…

D^T = Q B^T Q^{-1}\ ,

so we get

D^T = \left(\begin{array}{cc} 1 & 0 \\ 1 & 2\end{array}\right)

and then we could transform D^T\ to the non-orthonormal basis:

C^T = P^{-1} D^T P\ ,

C^T = \left(\begin{array}{cc} 1 & -1 \\ 0 & 2\end{array}\right)

That is, as it should be, the same answer.

third of three solutions

And the second solution shows us another way to have done it. From

C^T = P^{-1} D^T P\

and

D^T = Q B^T Q^{-1}\

we get

C^T = P^{-1} Q B^T Q^{-1} P\ ,

which we write as

C^T	= R^{-1} B^T R\ ,

with R defined as

R = Q^{-1} P\ .

By writing out the algebra first, we can dispense with explicit the computation and transformation of D^T\ .

In other words, the transition matrix R from the reciprocal basis to the non-orthonormal basis is given by R = Q^{-1} P\ . Equivalently, the transition matrix from the non-orthonormal basis to its reciprocal basis is given by P^{-1} Q\ .

(But even if we use R, we are implicitly going through the original orthonormal basis.)

Correction to Happenings Aug 29

For any of you who get an RSS feed, the first publication of the “happenings” post did not actually include two links, which have just been added.