PCA / FA example 2: Jolliffe. Discussion 1: notation & vocabulary

Let’s do a little housekeeping. First off, Jolliffe’s notation. Recall Harman’s:
Z = A F
where Z was the original data, variables in rows; A was a transition matrix, namely an eigenvector matrix with each column weighted by the square root of its eigenvalue; and F was the new (transformed) variables wrt the basis specified by A.
In that case, each column of Z is an old vector equal to the application of A to the new vector in the corresponding column of F. This is fairly natural. If we let lower case z and f be corresponding columns of Z and F, then
z = A f
We call A a transition matrix because it says that the old vectors z are the images under A of the new vectors f. Finally, the columns of A are the (old) components of the new basis vectors; the transpose of A is called the attitude matrix of the new basis vectors.
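Here is a small numerical sketch of Harman's model, in case seeing the matrices helps. The data is synthetic (random numbers, not from any book), and the variable names and the use of NumPy are my own assumptions; it only illustrates that each old column z is A applied to the corresponding new column f.

```python
import numpy as np

# Synthetic, purely illustrative data: 50 observations of 3 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3)) @ rng.normal(size=(3, 3))
X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardize each variable

Z = X.T                                  # Harman: variables in rows

# Correlation matrix and its eigen-decomposition, largest eigenvalue first.
R = Z @ Z.T / (Z.shape[1] - 1)
eigvals, P = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, P = eigvals[order], P[:, order]

# A: eigenvector matrix with column k weighted by sqrt(eigenvalue k).
A = P * np.sqrt(eigvals)

# F: components of the data wrt the new basis, obtained by solving Z = A F.
F = np.linalg.solve(A, Z)

harman_ok = np.allclose(Z, A @ F)                  # Z = A F
column_ok = np.allclose(Z[:, 0], A @ F[:, 0])      # z = A f for one observation
print(harman_ok, column_ok)
```

The columns of A hold the old components of the new basis vectors, which is why applying A to a new vector f gives back the old vector z.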
Let me emphasize that when I say “new data” I mean the data transformed using the transition matrix. Each observation is a vector, and that vector has components wrt the original basis and the new basis (the eigenvector basis). When I say “new data” I mean components wrt the new basis.
That was Harman.
We decided that we needed the orthogonal eigenvector matrix P, and I decided that I needed the design matrix X, which has the old data with variables in columns. Thus,
Z = X^T.
Then we replaced
Z = A F
by
Z = P F
and then
Z^T = F^T P^T
X = F^T P^T
F^T = X P.
F^T, of course, is the same size as X; it has variables in columns.
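The only step in that chain that needs a reason is the last one. Written out, assuming only that P is orthogonal (so P^T P = I), it goes like this:

```latex
\begin{aligned}
Z &= P\,F \\
Z^T &= F^T P^T && \text{(transpose both sides)} \\
X &= F^T P^T && \text{(since } Z = X^T \text{)} \\
X P &= F^T P^T P = F^T && \text{(right-multiply by } P \text{, using } P^T P = I \text{)}
\end{aligned}
```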
That was rip.
Jolliffe writes his model as
Z = X A
His X is the same as mine: a design matrix of old data with variables in columns. His A is an orthogonal eigenvector matrix, so I would denote it by P. His Z is not the old data, but the new data. It corresponds to F^T. In any case, I can’t use Z for it: I still need Z = X^T, so I will use Y for the new data. I would write Jolliffe’s model as
Y = X P
We know that it is equivalent to Harman’s model, and it is a perfectly sensible vector-matrix equation. It’s a good thing.
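A quick numerical check of that equivalence, as a sketch only: the data is synthetic, and the names (Y, F, F_harman) are my own labels for the quantities discussed here. It checks that Y = X P is the transpose of the F in Z = P F, and that Harman's weighted version differs from it only by a rescaling of each new variable.

```python
import numpy as np

# Synthetic, purely illustrative data: 40 observations of 4 variables.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 4))
X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # design matrix, variables in columns
Z = X.T                                            # same data, variables in rows

R = np.corrcoef(X, rowvar=False)
eigvals, P = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, P = eigvals[order], P[:, order]           # orthogonal eigenvector matrix

Y = X @ P            # Jolliffe's model, written with P and Y
F = P.T @ Z          # from Z = P F with P orthogonal

same_new_data = np.allclose(Y, F.T)       # Y corresponds to F^T
recovers_old = np.allclose(X, Y @ P.T)    # X = F^T P^T

# Harman's weighted transition matrix: the new data only gets rescaled.
A = P * np.sqrt(eigvals)
F_harman = np.linalg.solve(A, Z)
rescaled = np.allclose(F_harman, (Y / np.sqrt(eigvals)).T)
print(same_new_data, recovers_old, rescaled)
```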
It’s a little hard to get too excited by this, because Jolliffe has no data in his book. Oh, yes, he cites literature where the data can be found. In that sense, he has lots of examples. Unfortunately, only 3 are self-contained: there are only 3 correlation matrices in the book. At least we can start from them and confirm his analyses.
But without data, he can never compute his Z, my Y. And there are plots for which we need data. We’ll just have to see them elsewhere.
The second piece of housekeeping is vocabulary.
On p. 4, Jolliffe defines principal components: “… the kth PC is given by z_k = \alpha_k x, where \alpha_k is an eigenvector … corresponding to its kth largest eigenvalue \lambda_k”. That is, the new data Z contains PCs and A contains eigenvectors; X is the old data.
On p. 6, he says, “sometimes the vectors \alpha_k are referred to as ‘PCs’ … it is preferable to … refer to \alpha_k as the vector of coefficients or loadings for the kth PC.” He’s saying that sometimes the eigenvectors are called PCs, but they should be called the coefficients or loadings, while the transformed data Z are the principal components.
He got this table of 4 eigenvectors, and we confirmed it:
\left(\begin{array}{cccc} 0.2&-0.4&0.4&0.6\\ 0.4&-0.2&0.2&0.\\ 0.4&0.&0.2&-0.2\\ 0.4&0.4&-0.2&0.2\\ -0.4&-0.4&0.&-0.2\\ -0.4&0.4&-0.2&0.6\\ -0.2&0.6&0.4&-0.2\\ -0.2&0.2&0.8&0.\end{array}\right)
He says, “Each of the first four PCs for the correlation matrix has moderate-sized coefficients for several of the variables….”
My first reaction to that statement was: he’s talking about the eigenvectors. Why is he calling them PCs? He explicitly said the eigenvectors are not the PCs.
Did you catch that distinction? He didn’t say the PCs had moderate-sized components, but moderate-sized coefficients.
My second reaction was: damn, the components of the eigenvectors are the coefficients of the PCs. Rephrase that: the components of one vector are called the coefficients of another vector.
No wonder some people think the eigenvectors are the PCs!
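To make the two objects concrete, here is a sketch with synthetic data (random numbers; all names are my own). I am reading Jolliffe's z_k as the inner product of the eigenvector \alpha_k with an observation x. The eigenvector has one entry per variable and supplies the coefficients; the PC has one entry per observation and is the result of using them.

```python
import numpy as np

# Synthetic, purely illustrative data: 30 observations of 4 variables, centered.
rng = np.random.default_rng(2)
X = rng.normal(size=(30, 4))
X = X - X.mean(axis=0)

S = np.cov(X, rowvar=False)
eigvals, P = np.linalg.eigh(S)
P = P[:, np.argsort(eigvals)[::-1]]      # eigenvectors, largest eigenvalue first

k = 0
alpha_k = P[:, k]                        # coefficients (loadings): one per variable
x = X[5]                                 # one observation: one value per variable

# The PC value for this observation is a linear combination of its variables,
# and the components of the eigenvector are the coefficients of that combination.
z_k = sum(a * v for a, v in zip(alpha_k, x))

Y = X @ P
pc_k = Y[:, k]                           # the kth PC: one value per observation

matches = bool(np.isclose(z_k, Y[5, k]))
print(alpha_k.shape, pc_k.shape, matches)
```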
My third reaction is: keep your eyes on the linear algebra, while using whatever terminology is customary in your field. Don’t expect that a different field uses the same terminology. (If you must not call them eigenvectors, use the synonym loadings.)
Frankly, I can’t see that even “factor analysis” or “principal component analysis” are uniquely defined; as far as I can tell, with one exception, they are synonyms. (The exception is one model, which we will see later, which is always called factor analysis and never called principal components.)
If I learn otherwise as I go through the various books, I’ll let you know.
