Recall Harman’s or Bartholomew’s model
Z = A F
with Z = X^T, X standardized, and A a √λ-weighted eigenvector matrix, with eigenvalues λ from the correlation matrix.
We saw how to compute the scores in the case that A was invertible (here). If, however, any eigenvalues are zero then A will have that many columns of zeroes and will not be invertible.
What to do?
One possibility – shown in at least one of the references, and, quite honestly, one of the first things I considered – is to use a particular example of a pseudo-inverse. I must tell you up front that this is not what I would recommend, but since you will see it out there, you should see why I don’t recommend it.
(Answer: it works, it gets the same answer, but computing the pseudo-inverse explicitly is unnecessary. In fact, it’s unnecessary even if we don’t have the Singular Value Decomposition (SVD) available to us.)
Suppose we are given raw data which has, in fact, constant row sums (in this case, 1):
A couple of those fractions do not display well on my screen, so let me convert to decimals:
PCA à la Harman or Bartholomew et al.
Now, we do a PCA using Harman’s or Bartholomew’s model – but my way, not theirs. Standardize the input:
The X matrix is “the data”.
Get the eigenvalues of the correlation matrix:
Yes, one of them is identically zero. We learned here that this happens when we start with constant nonzero row sums and compute the correlation matrix.
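Here is a small numpy sketch of that claim. The data below is a random stand-in (the sizes and seed are mine, for illustration only), with each row forced to sum to 1; the smallest eigenvalue of the resulting correlation matrix comes out zero to machine precision:

```python
import numpy as np

# Random stand-in data: 5 observations of 4 variables, rows forced to sum to 1.
rng = np.random.default_rng(0)
raw = rng.random((5, 4))
raw /= raw.sum(axis=1, keepdims=True)

# Standardize each column (mean 0, sample standard deviation 1).
X = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)

# Correlation matrix and its eigenvalues (ascending).
corr = X.T @ X / (len(X) - 1)
evals = np.linalg.eigvalsh(corr)
print(evals)  # the smallest is zero to machine precision
```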
Get the diagonal matrix √Λ1 of the square roots of the eigenvalues:
Get the SVD of X… and look at w:
Get the √λ-weighted matrix A, as A1 = v1 √Λ1. (The only reason I’m using “1” is so that in subsequent algebra and computations I can use A, v, etc. consistently for cut-down matrices.)
Compute F^T as I would usually, as columns of √(N-1) u1, in this case 2 u1, and only the first 3 columns:
Let’s check it by computing the product F^T A^T:
and that is, indeed, X:
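The whole chain can be sketched in numpy (with random stand-in data, since the post’s own table is specific to the example): build the cut-down A and F^T from the SVD, then confirm that F^T A^T reproduces X:

```python
import numpy as np

# Random stand-in for the post's data: constant row sums, then standardized.
rng = np.random.default_rng(1)
raw = rng.random((5, 4))
raw /= raw.sum(axis=1, keepdims=True)
X = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)
N = len(X)

# SVD of X; keep only the columns belonging to nonzero singular values.
u, w, vt = np.linalg.svd(X, full_matrices=False)
r = int(np.sum(w > 1e-10))
u, w, v = u[:, :r], w[:r], vt[:r].T

lam = w**2 / (N - 1)             # nonzero eigenvalues of the correlation matrix
A = v * np.sqrt(lam)             # cut-down A = v sqrt(Lambda)
Ft = np.sqrt(N - 1) * u          # scores: F^T = sqrt(N-1) u

# The check: F^T A^T reproduces the standardized data X.
print(np.allclose(Ft @ A.T, X))  # True
```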
But what if the SVD is not available to us, and we don’t have u?
Let us recall the derivation in the case that A was invertible (here it was called A1; this is old stuff, so try not to worry about all the missing “1”s): we could also have computed F^T by projecting X onto the reciprocal basis A^-T:

F^T = X A^-T

but we can also derive that formula without ever knowing about the reciprocal basis: from Z = A F we get F = A^-1 Z = A^-1 X^T, and transposing gives F^T = X A^-T.
In that case, we could also have used A = v √Λ to simplify things further, and get

F^T = X v Λ^(-1/2)

(because v is orthogonal, so A^-T = (v √Λ)^-T = v Λ^(-1/2)).
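For the invertible case, the three routes to F^T can be compared directly. In this numpy sketch (generic random full-rank data, names mine), projection onto the reciprocal basis, the simplified formula, and the SVD all agree:

```python
import numpy as np

# Generic full-rank data (no constant row sums), so that A is invertible.
rng = np.random.default_rng(2)
raw = rng.random((6, 3))
X = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)
N = len(X)

u, w, vt = np.linalg.svd(X, full_matrices=False)
v = vt.T
lam = w**2 / (N - 1)
A = v * np.sqrt(lam)             # A = v sqrt(Lambda), here square and invertible

Ft1 = X @ np.linalg.inv(A).T     # projection onto the reciprocal basis A^-T
Ft2 = (X @ v) / np.sqrt(lam)     # F^T = X v Lambda^(-1/2)
Ft3 = np.sqrt(N - 1) * u         # F^T = sqrt(N-1) u, straight from the SVD

print(np.allclose(Ft1, Ft2), np.allclose(Ft2, Ft3))  # True True
```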
That was fine (and it’s what Bartholomew had us do), but our A is not invertible.
So let’s work with the cut-down matrices. To be specific, we start with the two nonzero columns of A1:
This cut-down A has, in a sense, eliminated the problem: although the new A can’t be inverted because it’s not square, it is of full rank. Even better, it’s A1 with one column removed: in a very real sense, it’s the same linear operator.
Any time I have a formula which involves a matrix inverse, and I want to generalize it to a case where I have a rectangular matrix – hence, no inverse – I investigate whether a pseudo-inverse will work.
In this case, A is of rank 2, so the two square matrices A^T A and A A^T are both of rank 2. But A is 5×2, so A^T A is 2×2 and A A^T is 5×5; hence A^T A is invertible, even though A1 wasn’t. There’s no point in considering A A^T, because it’s not invertible.
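That can be checked in numpy (with a random full-column-rank 5×2 matrix as a stand-in for the cut-down A): (A^T A)^-1 A^T is exactly the Moore–Penrose pseudo-inverse, while A A^T is rank-deficient:

```python
import numpy as np

# A rectangular matrix of full column rank, like the cut-down A (5x2, rank 2).
rng = np.random.default_rng(3)
A = rng.random((5, 2))

pinv = np.linalg.inv(A.T @ A) @ A.T             # (A^T A)^-1 A^T, a 2x5 matrix

# It agrees with the Moore-Penrose pseudo-inverse...
print(np.allclose(pinv, np.linalg.pinv(A)))     # True
# ...while A A^T (5x5) has only rank 2, so it is not invertible.
print(np.linalg.matrix_rank(A @ A.T))           # 2
```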
So, for the rectangular A – because what I dropped was a column of zeroes – we still have
Z = A F
(That is crucial. We didn’t change the data Z, because we didn’t lose any nonzero singular values.)
Premultiply by A^T:

A^T Z = A^T A F

now premultiply by (A^T A)^-1, which we know exists:

F = (A^T A)^-1 A^T Z
It may not be pretty, but it will work.
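Indeed it works. A quick numpy sketch (sizes mine, chosen to match the 5×2 cut-down A; the entries are random stand-ins): with Z = A F and A of full column rank, the recipe recovers F exactly:

```python
import numpy as np

# Z = A F with a rectangular, full-column-rank A: the recipe recovers F exactly.
rng = np.random.default_rng(4)
A = rng.random((5, 2))            # stand-in for the cut-down A
F = rng.random((2, 7))            # stand-in scores
Z = A @ F

F_recovered = np.linalg.inv(A.T @ A) @ A.T @ Z
print(np.allclose(F_recovered, F))  # True
```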
I should remind us that this is exactly what goes on in regression (OLS) here: the “normal equations” for the regression model

y = X β

are

X^T X β = X^T y, i.e. β = (X^T X)^-1 X^T y.
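As a reminder of the parallel, a minimal OLS sketch in numpy (random data, for illustration only): the normal-equations solution matches the library least-squares routine:

```python
import numpy as np

# The normal equations in action: beta = (X^T X)^-1 X^T y matches lstsq.
rng = np.random.default_rng(5)
X = rng.random((20, 3))
y = rng.random(20)

beta_normal = np.linalg.inv(X.T @ X) @ X.T @ y
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(beta_normal, beta_lstsq))  # True
```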
Let me also point out that if A is invertible, then the inverse (A^T A)^-1 expands as

(A^T A)^-1 = A^-1 A^-T.

That is, the equation written with the pseudo-inverse,

F = (A^T A)^-1 A^T Z,

contains our earlier special case

F = A^-1 Z

when A is invertible, because (A^T A)^-1 A^T = A^-1 A^-T A^T = A^-1. That derivation, incidentally, reminds me that although I think of (A^T A)^-1 as the pseudo-inverse, it’s really the product (A^T A)^-1 A^T which is the pseudo-inverse, since it’s that product which collapsed to A^-1.
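A small numpy check of that collapse, for a generic (random, hence invertible) square A:

```python
import numpy as np

# For a square invertible A, the pseudo-inverse collapses to the plain inverse.
rng = np.random.default_rng(6)
A = rng.random((4, 4))            # generically invertible

# (A^T A)^-1 expands as A^-1 A^-T ...
lhs = np.linalg.inv(A.T @ A)
rhs = np.linalg.inv(A) @ np.linalg.inv(A.T)
print(np.allclose(lhs, rhs))                        # True
# ... so (A^T A)^-1 A^T is just A^-1.
print(np.allclose(lhs @ A.T, np.linalg.inv(A)))     # True
```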
It gets better, in this case. The pseudo-inverse (A^T A)^-1 A^T is far prettier than it looks – but let’s see the computed matrix before we show the pretty form.
Here is A^T A:

Even better, it’s Λ, where Λ is the cut-down Λ1:

That is, it’s the diagonal matrix of the eigenvalues λ instead of their square roots √λ. Not only is it diagonal, but its elements are almost already computed. And in fact the inverse Λ^-1 is almost immediate:
Then a quick check shows that A = v √Λ, just as A1 = v1 √Λ1:
(We also need a cut-down version v of v1 (its first two columns), as well as the cut-down √Λ of √Λ1 (2×2), which we got earlier.)
(That is, whether we start with A1 and cut it down to A, or compute the cut-down A from v and √Λ, we get the same thing.)
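Here is a numpy sketch of that prettiness (random stand-in data with constant row sums, names mine): because the columns of the cut-down v are orthonormal, A^T A collapses to the cut-down Λ:

```python
import numpy as np

# With A = v sqrt(Lambda) and orthonormal columns v, A^T A is just Lambda.
rng = np.random.default_rng(7)
raw = rng.random((5, 4))
raw /= raw.sum(axis=1, keepdims=True)          # constant row sums
X = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)
N = len(X)

u, w, vt = np.linalg.svd(X, full_matrices=False)
r = int(np.sum(w > 1e-10))
v = vt[:r].T                                   # cut-down v
lam = w[:r]**2 / (N - 1)                       # cut-down eigenvalues
A = v * np.sqrt(lam)                           # cut-down A

print(np.allclose(A.T @ A, np.diag(lam)))      # True: A^T A is the cut-down Lambda
```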
Putting it all together, from

F = (A^T A)^-1 A^T Z

we get, as before,

F = Λ^(-1/2) v^T Z

– equivalently, F^T = X v Λ^(-1/2).
That is, if A is invertible, we can compute F^T as any one of

- X A^-T
- X v Λ^(-1/2)
- √(N-1) u (the first “few” columns).
If A is not invertible but the cut-down A is of full rank (so v and √Λ are also cut down as above), we can compute F^T as any of

- F = (A^T A)^-1 A^T Z, and take the transpose
- X v Λ^(-1/2)
- √(N-1) u (the first “few” columns).
The only difference between A not invertible and A invertible is F = (A^T A)^-1 A^T Z in place of F^T = X A^-T. (Yes, one of those uses F, the other F^T.)
But as I said at the beginning, I didn’t go thru all that in order to encourage you to compute the pseudo-inverse (A^T A)^-1 A^T, but unfortunately you may very well see it in texts.
When you do, realize that it’s true but utterly unnecessary for computing F.
(There’s no reason to tell you which of the references show it.)
I would always compute F^T as √(N-1) u. If the SVD makes you uncomfortable, or isn’t available (I weep for you), then compute it as X V Λ^(-1/2), using V from the eigendecomposition of the correlation matrix.
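The two recommended computations can be compared directly. In this numpy sketch (random stand-in data), the eigendecomposition route and the SVD route agree, up to the usual sign ambiguity in eigenvectors:

```python
import numpy as np

# Computing F^T two ways: from the eigendecomposition of the correlation
# matrix (X V Lambda^(-1/2)) and from the SVD (sqrt(N-1) u).
rng = np.random.default_rng(8)
raw = rng.random((5, 4))
raw /= raw.sum(axis=1, keepdims=True)
X = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)
N = len(X)
corr = X.T @ X / (N - 1)

lam, V = np.linalg.eigh(corr)                  # eigh returns ascending order
order = np.argsort(lam)[::-1]                  # sort descending, to match the SVD
lam, V = lam[order], V[:, order]
r = int(np.sum(lam > 1e-10))
lam, V = lam[:r], V[:, :r]                     # cut down to nonzero eigenvalues
Ft_eig = (X @ V) / np.sqrt(lam)                # F^T = X V Lambda^(-1/2)

u, w, vt = np.linalg.svd(X, full_matrices=False)
Ft_svd = np.sqrt(N - 1) * u[:, :r]             # F^T = sqrt(N-1) u

# Columns agree up to sign (eigenvectors are only determined up to sign).
print(np.allclose(np.abs(Ft_eig), np.abs(Ft_svd)))  # True
```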
Even though the pseudo-inverse isn’t all that useful in this case, it’s a good thing to know about in general.
Oh, and don’t even dream that it’s unnecessary for regression. For PCA / FA, it’s

A^T A = Λ,

which makes it unnecessary to actually compute the pseudo-inverse (A^T A)^-1 A^T. For regression, X^T X isn’t at all likely to be diagonal.