**Typographic edits, 4 Jan 2010.**

I struggled to make sense of pages 94-101 of Cohen’s “Visual Color and Color Mixture”.

I understand them now. Mostly.

The challenge was to derive a drawing of three basis vectors (Figure 20, p. 95), shown here :

He calls these three vectors both “an intriguing F matrix” and “This canonical orthonormal configuration….” **But as far as I can tell, he never said where he got this particular one.** He made it sound important, so I had to derive it. Figuring it out, as usual, was very educational.

What follows is a detective story.

I eventually decided that the most likely candidate was a “transformation matrix” on p. 101. There are five transformation matrices on that page, but the key is that exactly one of them is associated with an equal-energy illuminant — and that’s related to “EE” in the figure.

Let us proceed somewhat as I did, but without all the wrong turns and blind alleys. If you want a chance to wander those byways, then close this page, open a copy of Cohen, and go to work. OTOH, I’m about to take you down one, just one, wrong turn. Hey, it’s interesting and valuable.

First, let’s get that particular transition matrix of his. He denotes it but I will just refer to it as “TE” (outside of LaTeX).

He refers to the five T’s as “direct transformation matrices from the 1964 CIE X,Y,Z color-matching functions to orthonormal F matrices.” TE is specifically related to the equal-energy illumninant; the other four are related — in the same but as yet unexplained fashion — to the standard illuminants A, B, C, and D_65. (I have done the work to confirm that.)

OK. TE relates an A matrix to an F matrix. But how? “Transformation matrix” just doesn’t tell me enough.

A has 3 columns, so we could compute A TE, i.e. post-multiply A by TE. Since , , and are all the same size, we could use any of them in place of TE.

Well, A TE doesn’t look right — I dismissed it immediately when I first saw it — but eventually I saw the light. A TE does not produce an orthonormal matrix — but it’s only off by a factor of (and some change).

Let’s compute A TE and call it F….

Oh, first I had better get the A matrix.

Cohen uses the 1964 CIE tables for this, so I take the A matrix to be the 1964 XYZ (i.e. xyz bar) tables at 5 nm intervals. Not at 20 nm. My only target is a drawing, and I want a lot of detail. I plan to repeat part of these calculations, and to do some alternatives, at 20 nm in the next post. But for this one, I need detail.

I have found a set of these tables online. Let me show you, for checking purposes, a tiny subset of the XYZ tables as I store them:

My A matrix has 3 columns and 61 rows. The six rows of A corresponding to that part of the XYZ64 table are:

**Mathematica aside.**

There, I’ve finally shown you how I’m getting different A matrices whenever I change things. The table XYZ64 — like my other three tables — has wavelength for the first argument, and the Interpolation command lets me select any subsets, or even to interpolate. (It always gets the actual points right. In fact, I am usually simply getting subsets of the data, not truly interpolating. If I ever ask for 1 nm intervals, then I would be interpolating from the 5 nm table.)

For example, the following command gives me the entries for an A matrix from 400 to 500 in 20 nm intervals (and displays it nicely):

Table[Interpolation[XYZ64,i],{i,400,500,20}]//MatrixForm

So, once having set up my XYZ64 table, and the others, I can get any plausible A matrix I want.

**end Mathematica aside**.

For this problem, my A matrix is the entire 1964 CIE XYZ table at 5 nm intervals. Now we compute

F = A TE

and plot the results. Let me put the two drawings side-by-side (mine on the left):

That black curve could be his “EE” on a different scale, but my other two are utterly different from “I” and “II”.

Let me show you the first six rows of F:

**Cohen said that F would be orthonormal. It is not.** We would conclude that F was orthonormal if F’F = I. So we compute F’F:

Round it off:

I have no idea why his F — equivalently, his TE — is too large by . (The off-diagonal terms are almost zero, so these vectors are almost orthogonal. But their squared lengths are 2, so their lengths are . We could have fixed that by dividing TE by .)

Since the book was published posthumously, perhaps the editor was editing without computing. Perhaps he took what he found in Cohen’s notes or drafts and hoped that everything was consistent.

I assure you that the other possibilities (, , and ) do not come close to delivering an orthonormal F. Feel free to try them.

We conclude that there is a minor error in TE; it turns out that the other four transformation matrices on p. 100 also need to be divided by . Whatever he had in mind, he was consistent. Under the circumstances — posthumous publication, and the ease of computing unit vectors — this is rather unimportant — conceptually and computationally — once we know it.

**More importantly, I think, we also conclude that the F produced by TE is not the F matrix in Figure 20.** They seem to have — and actually do have — proportional first columns, but that’s all.

That is a little more troubling. I could have searched for a derivation of TE; instead, I searched for a derivation of Figure 20. I don’t care to do both. Let me be explicit: I don’t know how he got his TE matrix, nor any of the other 4 transformation matrices on p. 100. I do know how he got Figure 20.

It will be more than enough work to get Figure 20. I don’ t know where TE came from. I have some ideas, and I expect that you will too by the time we finish, but I’ m done with TE. Maybe I shouldn’t be, but enough is enough.

I should emphasize, however, that I am keeping the first column of F computed from TE, at least for a little while.

Oh, let me do one last thing with TE before I abandon it.

We know the relationship between A and TE: F = A TE. How would we express that in my usual terminology? Using T for TE, we have

F = A T

and we transpose, getting

F’ = T’ A’.

That says that each column of F’ is obtained by applying the matrix T’ to the corresponding column of A’. That tells us two or three things.

**First and second, T’ is the transition matrix, and therefore his “transformation matrix” TE is actually an attitude matrix; third, he takes A as the new components and F as the old.**

Just so I know.

Now let’s find a way to match Figure 20.

First, where did the first column — the black curve or “EE” — come from? (Yes, we showed that F produced from TE has a first column which is proportional to the “EE” curve. But where did this common vector come from?)

He tells us that it is proportional to the fundamental of the illuminant.

Ok. To get the fundamental, we need the projection operator. To get that, we need the SVD (Singular Value Decomposition) of the color-matching functions, i.e. of the matrix .

A’ = u w v’.

More to the point, we need the first three columns of the matrix v. (We need three columns because A’, like A, is of rank 3.)

Let me show you the first six (of 61) rows of v1:

Those 3 column vectors are an orthonormal basis for the preimage of the range of A’ (“the fundamental color space”). We form a projection operator by

R = v1 v1′

Now to get compute the fundamental of an equal-energy illuminant ee, we need to define the illuminant (I used all 100’s):

ee = {100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100}

Its fundamental f is the projection under R:

f = R ee =

{5.53525, 12.6614, 24.9595, 41.9659, 61.9674, 81.6212, 98.8882, 114.647, 125.821, 130.294, 129.298, 124.753, 116.918, 107.167, 95.2578, 80.7352, 68.1596, 60.0607, 54.8153, 53.7962, 55.5973, 59.7538, 65.4578, 72.9488, 81.1471, 88.5547, 95.6709, 102.8, 109.309, 114.255, 118.44, 122.816, 126.542, 129.052, 130.251, 129.621, 127.996, 126.529, 123.888, 119.146, 112.759, 105.059, 96.1651, 86.301, 76.0275, 65.9399, 55.8152, 45.5545, 36.3286, 28.6452, 22.185, 16.823, 12.5322, 9.19186, 6.64563, 4.72835, 3.33391, 2.32954, 1.61867, 1.12825, 0.779473}

Let me graph a scaled version of f along with the first column of F. (Scaled? I have to divide f by 470. Trust me.)

As I have done before, the black dots are one vector — in this case, the fundamental f — and the red curve is the other vector, in this case the first column of F.

We see that they are the same. Where we used to know that his TE gave us a column of F proportional to the EE curve in Figure 20, now we know what the curve is. Now I can dispense with F computed from TE, because I can use the fundamental f of the equal-energy illuminant instead.

The horizontal axis is off — I didn’t bother showing the wavelength.

That scaling by 470? Ha! The 1964 XYZ table at 1 nm intervals is over the range 360 to 830 nm. **And 830-360 = 470.**

(Probably “the length of the interval of integration”. Not that an integral is appropriate, in my opinion. And Cohen has pointedly been doing linear algebra, not calculus. But the literature is littered with integrals, even though the data is discrete. This is the mathematics I am skeptical of in all such works: they use calculus for their theory but linear algebra for their computations. (The example from Wyszecki & Stiles was closer to linear algebra, and the example they did two pages later is clearly not done by integration — no, I didn’t work publish that example.) I suspect it’s because a long time ago we pretty much restricted ourselves to continuous systems…. But I keep coming back to the fact that their data — the color matching functions — are discrete, and linear algebra is the appropriate tool. That’s my opinion.)

That’s where the 470 seems to be from, but I don’t really care. It would have made more sense for him to have used 400-700 instead of 360-830, and that’s one reason I don’t care about the scaling. Further, and more importantly, we will ultimately make a unit vector out of the fundamental f. Its scaling along the way just isn’t all that important.

And I didn’t scale the first column of F, so the in F is consistent with the 470 in f.

Sheesh! And I thought Fourier transforms had too many different scalings!

What interests me is that the first column of F is proprtional to f. And both appear to be proportional to the curve “EE” in figure 20. This is progress.

Now let’s get the dual basis E’. (I think I like idea of calling the rows of E’ a dual basis for the columns of A… and the columns of E a reciprocal basis for the columns of A… and, finally, the columns of E are a dual basis for the rows of A’. Row vectors and column vectors live in vector spaces which are dual to each other.)

Oh, say it compactly! **The columns of E are simultaneously a reciprocal basis for the columns of A and a dual basis for the rows of A’.** Numerically the same, conceptually different.

I recall the required computation as

E’A = I,

,

where doesn’t exist, but led me to the appropriate pseudo-inverse.

Like everything else in this example, E is too big to display, but here are its first six rows:

And here is a graph of the three rows of E’ (equivalently, the three columns of E):

NOTE that these are the “CIE 1964 XYZ fundamentals” on p. 163 of Cohen. Some of those resemble Figure 20… wait, E is not orthonormal, so none of them should be curves in Figure 20 (which are orthogonal, although not unit, vectors!) The resemblance is just that: not much more than vague similarity.

OK, let’s try getting an orthonormal basis, somehow.

I don’t know about you, but I want something I understand. Let’s apply Gram-Schmidt to a non-orthonormal basis. (This is almost certainly not what Cohen did. I’ll show you what I think he did in the next post. We’ll get the same answer.)

The first two non-orthonormal bases I started with were

f, E1, E2

and

f, E1, E3.

In retrospect, my third choice

f, E2, E3

seems more obvious. **OTOH, more obvious or not, what matters is that it does, in fact, generate Figure 20!**

As is my custom, I am going to describe Gram-Schmidt for this 3D case. The general method just extends it from 3D to n-dimensions.

We have our first vector f. Normalize it, by

.

This gives us our first unit vector, f1, for the new basis.

f1 = {0.00827079, 0.0189187, 0.0372946, 0.0627057, 0.092592, 0.121959, 0.147759, 0.171307, 0.188002, 0.194686, 0.193198, 0.186406, 0.174699, 0.16013, 0.142335, 0.120635, 0.101844, 0.0897429, 0.0819053, 0.0803825, 0.0830737, 0.0892844, 0.0978074, 0.109, 0.12125, 0.132319, 0.142952, 0.153603, 0.163331, 0.170721, 0.176973, 0.183512, 0.189079, 0.192829, 0.194622, 0.19368, 0.191252, 0.18906, 0.185114, 0.178029, 0.168484, 0.15698, 0.14369, 0.128951, 0.113601, 0.0985278, 0.0833993, 0.0680677, 0.0542824, 0.0428017, 0.033149, 0.025137, 0.0187257, 0.0137345, 0.00992992, 0.00706512, 0.00498154, 0.00348081, 0.00241862, 0.00168584, 0.00116469}

Normalize E2 and E3, as we did f, just to start with vectors of the same size. Call the unit vectors e2 and e3.

e2 = {-0.00326355, -0.00739089, -0.0142972, -0.0235132, -0.0336395, -0.0422167, -0.0480022, -0.0514717, -0.0506656, -0.0450514, -0.0346783, -0.0212322, -0.00302856, 0.0178248, 0.0440051, 0.0714143, 0.0954016, 0.120945, 0.141852, 0.165312, 0.189969, 0.213125, 0.235048, 0.255043, 0.270026, 0.276282, 0.276115, 0.273313, 0.264794, 0.24833, 0.226531, 0.201183, 0.171297, 0.136324, 0.0977252, 0.0577811, 0.0185254, -0.0185732, -0.0524476, -0.0805823, -0.102238, -0.116778, -0.124173, -0.124757, -0.11956, -0.109979, -0.0974696, -0.0829623, -0.0686652, -0.0558823, -0.0443658, -0.0340783, -0.0255961, -0.0189148, -0.013768, -0.00982428, -0.00696574, -0.0048783, -0.00340378, -0.00233611, -0.0016479}

e3 = {0.0134212, 0.0307963, 0.0609534, 0.103042, 0.152976, 0.202366, 0.245956, 0.285754, 0.313861, 0.325009, 0.321797, 0.308947, 0.28685, 0.259425, 0.22519, 0.182631, 0.143994, 0.114635, 0.0919794, 0.0761547, 0.0649611, 0.0570429, 0.0503958, 0.0457693, 0.041089, 0.0352787, 0.0289087, 0.0223688, 0.0152318, 0.00702492, -0.00182114, -0.0115187, -0.0217216, -0.0324157, -0.0433804, -0.0537281, -0.0633608, -0.0725008, -0.0801518, -0.0851463, -0.0875296, -0.0873093, -0.08454, -0.0794315, -0.0725582, -0.0646114, -0.0558614, -0.0465045, -0.0377562, -0.0302361, -0.0237077, -0.0180942, -0.0135352, -0.00996527, -0.00722962, -0.00515145, -0.0036426, -0.00254819, -0.00177437, -0.00122704, -0.000856806}

Now apply Gram-Schmidt to the pair f1 and e2. The vector e2 is not orthogonal to f1, but it can be written as the sum of something parallel to f1 plus something perpendicular to f1. We will find the parallel part using the dot product, and get the perpendicular part by subtraction.

The component of e2 wrt f1 is… (since f1 is a unit vector), so we make a vector of that length, parallel to f1, and subtract it from e2. The result of that vector subtraction is perpendicular to f1. We normalize it and call it f2.

.

Then

f1, f2

is an orthonormal basis that spans the same subspace as

f1, e2.

f2 = {-0.00662133, -0.0150664, -0.029409, -0.0488852, -0.0710293, -0.0913202, -0.107276, -0.119901, -0.125361, -0.121889, -0.110234, -0.0932842, -0.0693866, -0.041572, -0.00683967, 0.0306849, 0.063449, 0.0953437, 0.120668, 0.14633, 0.171675, 0.194084, 0.214299, 0.231442, 0.242825, 0.245323, 0.241118, 0.234089, 0.221296, 0.200893, 0.175217, 0.145636, 0.111572, 0.072756, 0.0308042, -0.0115508, -0.052607, -0.0914459, -0.126173, -0.153574, -0.173117, -0.184308, -0.187183, -0.182227, -0.170858, -0.154906, -0.135801, -0.114483, -0.0939761, -0.0759605, -0.0599912, -0.0459572, -0.0344598, -0.0254257, -0.0184818, -0.0131801, -0.00933457, -0.00653426, -0.00455539, -0.00313629, -0.00220307}

Now what do we do with e3? We want to break off the piece which lies in the plane spanned by f1 and f2. That’s just the vector

.

When we subtract that from e3, what remains is perpendicular to the plane spanned by f1 and f2; we will normalize it to length 1 and then call it f3. (It’s crucial that f1 and f2 are an orthonormal basis for that plane; otherwise we would need an explicit reciprocal basis to find components of e3 wrt f1 and f2.)

f3 = {0.0108058, 0.0248259, 0.0492157, 0.0833774, 0.124048, 0.164396, 0.200091, 0.232718, 0.255782, 0.264988, 0.262322, 0.251597, 0.233086, 0.2101, 0.18122, 0.145023, 0.111701, 0.085438, 0.064457, 0.0480153, 0.034675, 0.0232252, 0.0120345, 0.00174899, -0.00919785, -0.0209678, -0.0332539, -0.0457821, -0.0586168, -0.071629, -0.0848832, -0.0992835, -0.113845, -0.128148, -0.141854, -0.153559, -0.163729, -0.173431, -0.180567, -0.183118, -0.181455, -0.175821, -0.166361, -0.153478, -0.138238, -0.121869, -0.10453, -0.0863848, -0.0696761, -0.0554858, -0.0433135, -0.0329815, -0.0246352, -0.0181132, -0.0131248, -0.00934719, -0.00660276, -0.00461709, -0.00321259, -0.00222783, -0.00154979}

(Maybe I shouldn’t have printed all those, but if you have the 1964 xyz bar tables and you’re trying to verify my calculations….)

So, it’s time to look at what we got. Here are f1, f2, f3. They are an orthonormal basis, and they span the same space as E1, E2, E3 — the fundamental color space, the pre-image of the range of A’.

The brown looks like Cohen’s “I”. But the mauve curve is off. Still….

**Do I need the negative of the mauve curve? **

**Yes! **

And multiply by 44 ! (No, I have no idea why! I just kept changing the multiplier until I matched his scale on Figure 20.)

Having found them, make them pretty — i.e. set the horizontal scale.

Here, let me put them side-by-side. They look very close, but not exact. Nevertheless, I am content: his numbers tend to be off a little, so what’s the point of trying to match them perfectly?

So. Start with the non-orthonormal basis f, E2, E3 and apply Gram-Schmidt to get the orthonormal basis f1, f2, f3. Then take the negative of f2.

I must emphasize that his scale (i.e. multiply by 44) does not represent orthonormal vectors. Pairwise orthogonal, yes — **but the f1, f2, and f3 I derived were specifically unit vectors**. The scale on his figure requires that I multiply by 44 (or something very close to it). Once we have unit vectors, we can scale them any way that is convenient. Here it was convenient to match his — frankly, incomprehensible — vertical scale.

Oh, let’s close with one last calculation, to compute the “transformation matrix” for Figure 20. We have found the F matrix for Figure 20, and we have the A matrix. What is the transition matrix T between them? (Actually, we know we want the attitude matrix instead.) Recall that he takes A as the new, and F as the old, components.

So, as usual, we compute each column of F’ (old) by applying T to each column of A’ (new). That is,

F’ = T A’

F = A T’

and we write the false but suggestive

which gives us the true

using the appropriate pseudo-inverse.

So we compute and display T’, and remember that, as the transpose of a transition matrix, it is an attitude matrix…

We can compare that to the transformation matrix TE we tried at the beginning:

As we would expect — since the F produced by TE did not match Figure 20 — these are different.

**Real Short Summary**

I matched his Figure 20 by applying the Gram-Schmidt orthogonalization procedure to the vectors f, E2, E3… where f was the fundamental of an equal-energy illuminant, and E2 and E3 were two of the basis vectors dual to the rows of A’.

**Summary**

We really started with two separate items: a “transformation matrix” TE — one of five on p. 100 — and Figure 20 on p. 95 of Cohen. The transformation matrix TE turned out to be an attitude matrix describing a new A matrix in terms of an old F matrix. (Yes, that’s backwards, but so long as we knew it, it didn’t matter. This is why I carefully distinguish attitude and transition matrices; so long as I know what I’ve been handed, I can use it properly and confidently.)

We saw that F = A TE reproduced only the “EE” curve of his Figure 20. And his F matrix failed to be orthonormal by a factor of .

We then saw that the “EE” curve was (proportional to) the fundamental f of an equal-energy illuminant. In order to find the fundamental of a spectrum, we had to find a projection operator R onto the pre-image of the range of A^T. The fundamental f is then the result of applying the projection operator R to a constant vector (“equal energy”).

By trial and error, I found that the F matrix for Figure 20 could be (very nearly) reproduced by applying the Gram-Schmidt orthogonalization procedure to the vectors f, E2, E3, where E1, E2, E3 are a basis dual to the rows of A’. We found the dual basis as the rows of and then transposed E’ to get E, a matrix of 3 columns.

In the next post, I will repeat these calculations quickly using 20 nm intervals, and I will show you a few alternatives. In particular, I will show you that his method (a Cholesky decomposition) gives the same answer — even to having to switch the sign of the second column!

## Leave a Reply