PCA / FA Basilevsky: with data

Introduction

I am looking into Basilevsky because he did something I didn’t understand: he normalized the rows of the Ac matrix (I denoted the result Ar). We discussed that, and we illustrated the computations, in the previous two posts. But we did those computations without having any data. I want to take a closer look, with data.

In contrast to As and Ac, which are eigenvector matrices, his Ar matrix is not. Nevertheless, as I said, his Ar is not without some redeeming value. In fact, all three of the A matrices have the same redeeming value.

I will show, first by direct computation and then by proof, that each of these A matrices is the cross-covariance between the X data and the Z data.

(That doesn’t mean I want to use his Ar matrix; all it means is that I learned something new about an A matrix.)

Recall the raw data of example 9. We’re about to go get the A matrices and the new data Z for both standardized and centered data.

X = \left(\begin{array}{lll} 2.09653 & -0.793484 & -7.33899 \\ -1.75252 & 13.0576 & 0.103549 \\ 3.63702 & 29.0064 & 8.52945 \\ 0.0338101 & 46.912 & 19.8517 \\ 5.91502 & 70.9696 & 36.0372\end{array}\right)

Yes, we’ve seen all these forms of the data before, but I want this post to be self-contained.

Now, I’m going to switch to Basilevsky’s notation, using Z for F^T. If you will, we’re mixing Harman’s model

Z = A\ F\ \text{or}\ X = F^T\ A^T

where Z = X^T, and F^T is the new data (“scores”) WRT the transition matrix (“loadings”) A; and Jolliffe’s model

Z = X\ V\ \text{or}\ X = Z\ V^{-1}

where Z is the new data (“scores”)….

We start with Harman’s model

Z = A\ F\ ,

eliminate Z (using Z = X^T):

X^T = A\ F\ ,

and we know that F^T is the new data corresponding to X WRT the transition matrix A .

Now we reintroduce Z, redefining it as Jolliffe uses it, for the new data; that is, we let

Z = F^T

and get

X^T = A\ Z^T\ .

If I had to do it all over again, I would probably not have used Basilevsky’s notation. This is the notation I had for the previous two posts, but I’ll admit, it’s confusing.

But for this post, as for the previous two, that is our model.

First, let’s go get the standardized and the centered data (“X”), the corresponding A matrices, and the corresponding new data (“Z”).

Basilevsky Standardized

If you are comfortable with these calculations – which we’ve seen before for this very data – you should move right along. If, on the other hand, you need a refresher, here it is; the calculations themselves were explained in detail in the earlier posts.

I standardize the data and call it Xs:

Xs = \left(\begin{array}{lll} 0.0368717 & -1.15632 & -1.09998 \\ -1.24681 & -0.665379 & -0.663951 \\ 0.550632 & -0.100095 & -0.170316 \\ -0.651057 & 0.534548 & 0.493006 \\ 1.31036 & 1.38724 & 1.44124\end{array}\right)
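In case it helps, here is a minimal NumPy sketch of that standardization (the variable names are mine, not Basilevsky’s):

```python
import numpy as np

# raw data of example 9 (rows are observations, columns are variables)
X = np.array([
    [ 2.09653,   -0.793484, -7.33899 ],
    [-1.75252,   13.0576,    0.103549],
    [ 3.63702,   29.0064,    8.52945 ],
    [ 0.0338101, 46.912,    19.8517  ],
    [ 5.91502,   70.9696,   36.0372  ],
])

# standardize: subtract the column means and divide by the column standard
# deviations (sample convention, divisor N-1)
Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
print(Xs)
```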

Get the eigenvalues of the correlation matrix:

\lambda s = \{2.43279,\ 0.565781,\ 0.00143288\}

Get the SVD (Singular Value Decomposition) of Xs:

Xs = us\ ws\ vs^T.

ws = \left(\begin{array}{lll} 3.11948 & 0. & 0. \\ 0. & 1.50437 & 0. \\ 0. & 0. & 0.0757068 \\ 0. & 0. & 0. \\ 0. & 0. & 0.\end{array}\right)

Form the diagonal matrix of \sqrt{\text{eigenvalues}}:

\Lambda s = \left(\begin{array}{lll} 1.55974 & 0. & 0. \\ 0. & 0.752184 & 0. \\ 0. & 0. & 0.0378534\end{array}\right)

Form the weighted eigenvector matrix A = v\ \Lambda

As = vs\ \Lambda s = \left(\begin{array}{lll} -0.752315 & 0.658804 & -0.000578824 \\ -0.963827 & -0.265197 & -0.0266011 \\ -0.968424 & -0.247849 & 0.0269245\end{array}\right)

Get the scores F^T as the first three columns of 2u (where 2 = Sqrt[N-1], N = 5). This is as close to the u matrix itself as I need to get. But instead of calling these F^T, we are calling them Z:

Zs = \left(\begin{array}{lll} 0.88458 & 1.06679 & 0.782819 \\ 0.913474 & -0.849064 & 0.380336 \\ -0.0628237 & 0.762691 & -1.56451 \\ -0.206697 & -1.22463 & -0.396946 \\ -1.52853 & 0.244206 & 0.798301\end{array}\right)

Check it by confirming that we have

X^T = A\ Z^T \text{or}\ X = Z\ A^T\ .

(We do.)
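Continuing the sketch, here is the rest of the standardized computation in NumPy. The columns of u and v (and hence of As and Zs) are only determined up to sign, so individual columns may come out with signs opposite to the matrices displayed above; the checks pass either way.

```python
N = Xs.shape[0]

# SVD of the standardized data: Xs = us @ diag(ws) @ vs.T
us, ws, vsT = np.linalg.svd(Xs, full_matrices=False)
vs = vsT.T

# eigenvalues of the correlation matrix: squared singular values over N-1
lam_s = ws**2 / (N - 1)

# diagonal matrix of the square roots of the eigenvalues
Lam_s = np.diag(np.sqrt(lam_s))

# weighted eigenvector matrix ("loadings"): As = vs Lambda_s
As = vs @ Lam_s

# scores: Zs = sqrt(N-1) us, i.e. 2 u for N = 5
Zs = np.sqrt(N - 1) * us

# confirm the model X = Z A^T
print(np.allclose(Xs, Zs @ As.T))   # True
```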

Basilevsky Centered

Recall the raw data:

X = \left(\begin{array}{lll} 2.09653 & -0.793484 & -7.33899 \\ -1.75252 & 13.0576 & 0.103549 \\ 3.63702 & 29.0064 & 8.52945 \\ 0.0338101 & 46.912 & 19.8517 \\ 5.91502 & 70.9696 & 36.0372\end{array}\right)

I will center the data and call it Xc. Get the column means…

\{1.98597,31.8304,11.4366\}

and subtract them from each column to get Xc:

Xc = \left(\begin{array}{lll} 0.110558 & -32.6239 & -18.7756 \\ -3.73849 & -18.7728 & -11.333 \\ 1.65105 & -2.82404 & -2.90714 \\ -1.95216 & 15.0815 & 8.41516 \\ 3.92905 & 39.1392 & 24.6006\end{array}\right)

Proceeding as before, we get the SVD (Singular Value Decomposition) and display w…

Xc = uc\ wc\ vc^T

wc = \left(\begin{array}{lll} 66.0141 & 0. & 0. \\ 0. & 5.01572 & 0. \\ 0. & 0. & 1.54942 \\ 0. & 0. & 0. \\ 0. & 0. & 0.\end{array}\right)

Get the eigenvalues of the covariance matrix (and that’s the only change we make after switching from standardized to centered data):

\lambda c = \{1089.47,6.28937,0.600172\}

Get the diagonal matrix of \sqrt{\text{eigenvalues}} of the covariance matrix…

\Lambda c = \left(\begin{array}{lll} 33.0071 & 0. & 0. \\ 0. & 2.50786 & 0. \\ 0. & 0. & 0.774708\end{array}\right)

Get the weighted eigenvector matrix A = v\ \Lambda

Ac = vc\ \Lambda c = \left(\begin{array}{lll} -1.67235 & 2.48708 & -0.0914469 \\ -28.2097 & -0.261331 & -0.394039 \\ -17.0553 & 0.188375 & 0.660714\end{array}\right)

Get the scores F^T as the first three columns of 2u (where 2 = Sqrt[N-1], N = 5), except that we are calling them Z, as Basilevsky does.

Zc = \left(\begin{array}{lll} 1.13849 & 0.83693 & 0.73261 \\ 0.669241 & -1.03777 & 0.41852 \\ 0.116099 & 0.683164 & -1.59786 \\ -0.519249 & -1.14658 & -0.340197 \\ -1.40458 & 0.664253 & 0.786923\end{array}\right)

Check it by confirming that we have

X^T = A\ Z^T \text{or}\ X = Z\ A^T\ .

(We do.)
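The centered case is the same sketch with the covariance matrix in place of the correlation matrix; a minimal version, continuing from the code above:

```python
# center the raw data
Xc = X - X.mean(axis=0)

# SVD, eigenvalues of the covariance matrix, loadings, and scores
uc, wc, vcT = np.linalg.svd(Xc, full_matrices=False)
lam_c = wc**2 / (N - 1)
Ac = vcT.T @ np.diag(np.sqrt(lam_c))
Zc = np.sqrt(N - 1) * uc

# confirm the model X = Z A^T for the centered data
print(np.allclose(Xc, Zc @ Ac.T))   # True
```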

Key: the A matrices are cross-covariances

We have As and Ac:

As = \left(\begin{array}{lll} -0.752315 & 0.658804 & -0.000578824 \\ -0.963827 & -0.265197 & -0.0266011 \\ -0.968424 & -0.247849 & 0.0269245\end{array}\right)

Ac = \left(\begin{array}{lll} -1.67235 & 2.48708 & -0.0914469 \\ -28.2097 & -0.261331 & -0.394039 \\ -17.0553 & 0.188375 & 0.660714\end{array}\right)

We have new data Zs and old Xs:

Zs = \left(\begin{array}{lll} 0.88458 & 1.06679 & 0.782819 \\ 0.913474 & -0.849064 & 0.380336 \\ -0.0628237 & 0.762691 & -1.56451 \\ -0.206697 & -1.22463 & -0.396946 \\ -1.52853 & 0.244206 & 0.798301\end{array}\right)

and old:

Xs = \left(\begin{array}{lll} 0.0368717 & -1.15632 & -1.09998 \\ -1.24681 & -0.665379 & -0.663951 \\ 0.550632 & -0.100095 & -0.170316 \\ -0.651057 & 0.534548 & 0.493006 \\ 1.31036 & 1.38724 & 1.44124\end{array}\right)

Now, what is the cross-covariance between Xs and Zs? I need either

\frac{X^T\ Z}{N-1}

or

\frac{Z^T\ X}{N-1}\ ,

and N-1 = 4. Note that they are the transposes of each other. We compute the former:

\frac{Xs^T\ Zs}{4} = \left(\begin{array}{lll} -0.752315 & 0.658804 & -0.000578824 \\ -0.963827 & -0.265197 & -0.0266011 \\ -0.968424 & -0.247849 & 0.0269245\end{array}\right)

It is not symmetric, but then it needn’t be: it is a cross-covariance between two different sets of variables, not the covariance matrix of a single set.

Now recall As:

As = \left(\begin{array}{lll} -0.752315 & 0.658804 & -0.000578824 \\ -0.963827 & -0.265197 & -0.0266011 \\ -0.968424 & -0.247849 & 0.0269245\end{array}\right)

They are the same.

For centered data:

\frac{Xc^T\ Zc}{4} = \left(\begin{array}{lll} -1.67235 & 2.48708 & -0.0914469 \\ -28.2097 & -0.261331 & -0.394039 \\ -17.0553 & 0.188375 & 0.660714\end{array}\right)

and we recall

Ac = \left(\begin{array}{lll} -1.67235 & 2.48708 & -0.0914469 \\ -28.2097 & -0.261331 & -0.394039 \\ -17.0553 & 0.188375 & 0.660714\end{array}\right)

Again, they are the same.
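Both of those comparisons are one-liners, continuing from the earlier sketch:

```python
# cross-covariance of the old data with the new data, standardized case
print(np.allclose(Xs.T @ Zs / (N - 1), As))   # True

# and the centered case
print(np.allclose(Xc.T @ Zc / (N - 1), Ac))   # True
```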

Now, get Ar, by normalizing the rows of Ac:

Ar = \left(\begin{array}{lll} -0.557739 & 0.829456 & -0.030498 \\ -0.99986 & -0.00926256 & -0.0139662 \\ -0.99919 & 0.011036 & 0.0387082\end{array}\right)

and compute the cross-covariance between Xs and Zc:

\frac{Xs^T\ Zc}{4} = \left(\begin{array}{lll} -0.557739 & 0.829456 & -0.030498 \\ -0.99986 & -0.00926256 & -0.0139662 \\ -0.99919 & 0.011036 & 0.0387082\end{array}\right)

Once again, they are the same. We now have good reason to believe that

\text{loadings}\ A = \frac{X^T\ Z}{N-1} = \text{covariance}(X, Z)\ ,

where Z is the new data WRT A (i.e. the scores, as most people define them).

That’s what I set out to show.

We have seen it for As, relating standardized data Xs to the scores Zs; for Ac, relating centered data Xc to the scores Zc; and finally for Ar, relating standardized data Xs to the scores Zc.
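And the corresponding check for Ar, continuing from the same sketch. Normalizing the rows of Ac divides each row by its Euclidean length, which is the standard deviation of the corresponding X variable, since Ac Ac^T is the covariance matrix:

```python
# normalize each row of Ac to unit length; the row norms are the standard
# deviations of X, because Ac @ Ac.T is the covariance matrix
Ar = Ac / np.linalg.norm(Ac, axis=1, keepdims=True)

# cross-covariance of the standardized data with the centered scores
print(np.allclose(Xs.T @ Zc / (N - 1), Ar))   # True
```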

Proving it

Let’s take a look at the proofs. The question in the back of my mind is: when is a transition matrix also a cross-covariance matrix? (I will confess that I do not have a universal answer; it appears that I can generalize from the Ar matrix somewhat.)

That is, given old data X and new data Z WRT transition matrix T,

X^T = T\ Z^T\ \text{or}\ X = Z\ T^T\ ,

when can we conclude that T is the cross-covariance of X and Z?

For either standardized data or centered data X, one argument works. We have

A = V \Lambda\ ,

where \Lambda is the diagonal matrix of \sqrt{\text{eigenvalues}} and V is an orthogonal eigenvector matrix of the covariance matrix c; that is, we have the eigendecomposition

\Lambda^2 = V^T\ c\ V = V^T \frac{X^T\ X}{N-1}\ V

\left(\text{or}\  \frac{1}{N-1}\ X^T\ X = V \Lambda^2\ V^T\right)\ .

We have the decomposition of X:

X = Z\ A^T\ \left(\text{or}\ X^T = A\ Z^T\right)\ .

For simplicity, I’m going to assume that A is invertible (i.e. that \Lambda is invertible; i.e. that all the eigenvalues are nonzero): then

Z = X\ A^{-T} \left(\text{or}\ Z^T = A^{-1}X^T\right)

and I note that

A^{-T} = V^{-T}\ \Lambda^{-T} = V\ \Lambda^{-T} = V\ \Lambda^{-1}\ .

Let me name those three equations: we have an eigendecomposition of the covariance matrix, the decomposition of X, and the definition of A.

Now we compute the cross-covariance. There may be simpler ways, and there are certainly other paths to the same answer, but this works for me. As I said, I’m going to assume that A is invertible, because for one thing we can cope if it is not; and for another, if it is not, we would have fewer Z variables than X variables.

\frac{1}{N-1}\ X^T\ Z

= \frac{1}{N-1}\ X^T\ \left(X\ A^{-T}\right) (the decomposition of X)

= c\ A^{-T} (definition of the covariance matrix)

= \left(V\ \Lambda^2\ V^T\right)\ \left(V\ \Lambda^{-1}\right) (the eigendecomposition of c and definition of A)

= \left(V\ \Lambda^2\right)\ \Lambda^{-1} = V\  \Lambda (simplify)

= A (definition of A)

QED.
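The chain of equalities can also be checked numerically; a quick sketch for the centered case, again continuing from the earlier code:

```python
# covariance matrix of the centered data
c = Xc.T @ Xc / (N - 1)

# c A^{-T} = V Lambda^2 V^T V Lambda^{-1} = V Lambda = A
print(np.allclose(c @ np.linalg.inv(Ac).T, Ac))   # True
```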

The eigendecomposition is crucial. Without it, we are left hanging at

c\ A^{-T}\ ,

with no way to reduce it to A.

Now let us try to prove it for Ar. It’s a surprisingly simple consequence of what we just did. We need only one new equation, the relationship between standardized and centered data:

Xs = Xc\ \sigma^{-1}\ ,

which says that we get standardized data from centered data by dividing each column by its standard deviation; and we recall that scaling columns can be accomplished by post-multiplication by a diagonal matrix. (After all, that’s how we get the A matrix from the V matrix.)

What we actually need is the transpose:

Xs^T = \sigma^{-1}\ Xc^T\ .

Let us compute the cross-covariance of Xs and Zc:

\frac{ Xs^T\ Zc}{N-1}

= \frac{\sigma^{-1}\ Xc^T\ Zc}{N-1} (Xs in terms of Xc)

= \sigma^{-1}\ Ac (by the previous result for Ac!)

= Ar (by definition of Ar)

QED.

The key, then, was that Ar was to Ac as Xs was to Xc (OK, not exactly: we scale the rows of one and the columns of the other). I don’t know of any other suitable relationships, but if we found one, we could apply it to Xc and Ac in the same way.

Let me be more specific. We never really used the fact that \sigma was a diagonal matrix of standard deviations. If we had constructed completely different data Xd by scaling the columns of centered data Xc, using an arbitrary diagonal matrix \delta^{-1}\ , so that

Xd = Xc\ \delta^{-1}\ ,

and if we had defined a new A matrix by scaling the rows of Ac using the same diagonal matrix:

Ad = \delta^{-1}\ Ac\ ,

then we would find that Ad was the cross-covariance of Xd and Zc.
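A quick numerical check of that claim, with an arbitrary (made-up) diagonal matrix standing in for \delta:

```python
# an arbitrary positive diagonal scaling (made-up values, for illustration only)
delta_inv = np.diag([1/2.0, 1/5.0, 1/0.5])

Xd = Xc @ delta_inv        # scale the columns of the centered data
Ad = delta_inv @ Ac        # scale the rows of Ac by the same diagonal

# Ad is the cross-covariance of Xd and Zc
print(np.allclose(Xd.T @ Zc / (N - 1), Ad))   # True
```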

I need to look at this some more.
