PCA / FA example 4: davis. Davis & harman 1.

Let us now recall what we saw harman do. Does he have anything to add to what we saw davis do? if we’re building a toolbox for PCA / FA, we need to see if harman gave us any additional tools.

here’s what harman did or got us to do:

  • a plot of old variables in terms of the first two new ones
  • redistribute the variance of the original variables
  • his model, written Z = A F, was (old data, variables in rows) = (some eigenvector matrix) x (new data)

We should remind ourselves, however, that harman used an eigendecomposition of the correlation matrix; and the eigenvectors of the correlation matrix are not linearly related to the eigenvectors of the covariance matrix (or, equivalently, of X^T\ X). having said that, i’m going to go ahead and apply harman’s tools to the davis results. i will work from the correlation matrix when we get to jolliffe.
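As a rough numerical illustration of that remark, here is a minimal sketch in Python with NumPy (not the tool used for the numbers in this post). The matrix X is an assumption: it is reconstructed as A^Q v^T from the A^Q and v matrices displayed in this post, rounded to integers, and is not quoted from davis. The sketch only shows that, for this one data set, the leading eigenvectors of the covariance and correlation matrices do not lie along the same line.

```python
import numpy as np

# Assumed centered data (4 observations x 3 variables), reconstructed
# from the A^Q and v matrices shown in this post; not quoted from davis.
X = np.array([[-6.0,  3.0,  3.0],
              [ 2.0,  1.0, -3.0],
              [ 0.0, -1.0,  1.0],
              [ 4.0, -3.0, -1.0]])

n_obs = X.shape[0]
cov = X.T @ X / (n_obs - 1)      # covariance matrix (X is already centered)
sd = np.sqrt(np.diag(cov))       # standard deviations of the 3 variables
corr = cov / np.outer(sd, sd)    # correlation matrix

# eigh returns eigenvalues in ascending order, so the last column
# is the eigenvector belonging to the largest eigenvalue.
_, vec_cov = np.linalg.eigh(cov)
_, vec_corr = np.linalg.eigh(corr)
lead_cov = vec_cov[:, -1]
lead_corr = vec_corr[:, -1]

# Absolute cosine of the angle between the two leading eigenvectors;
# it would equal 1 if they pointed along the same line.
cosine = abs(lead_cov @ lead_corr)
print("leading eigenvector, covariance: ", lead_cov)
print("leading eigenvector, correlation:", lead_corr)
print("abs cosine between them:", cosine)
```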

What exactly was that graph of harman’s? he took the first two columns of the weighted eigenvector matrix, then plotted each row as a point. He had 5 original variables. His plot (not reproduced here) shows that the old 1st and 3rd variables, for example, are defined very similarly in terms of the first two new variables. let me emphasize that this is a description of defining variables, not a display of the resulting data. Simple counting can help with the distinction, e.g. with 3 variables but 4 observations, a plot of variables has 3 points while a plot of data has 4.

from davis we only have 3 variables to start, and our weighted eigenvector matrix has only two nonzero columns; and it’s one form of what we called A^R. we have to be careful, because there are sign differences among the various forms of A^R; not because the definitions are different, but because eigenvectors are not unique. In particular, my u and v from the SVD have some different signs from davis’ U and V from eigendecompositions.
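The sign ambiguity can be checked numerically. What follows is a minimal sketch in Python with NumPy, under the assumption that the centered data X is the matrix reconstructed as A^Q v^T from the A^Q and v displayed in this post (rounded to integers, not quoted from davis). Flipping the sign of a matching pair of columns in u and v leaves X unchanged, but it flips the corresponding column of A^R = v w^T.

```python
import numpy as np

# Assumed centered data, reconstructed from the A^Q and v in this post.
X = np.array([[-6.0,  3.0,  3.0],
              [ 2.0,  1.0, -3.0],
              [ 0.0, -1.0,  1.0],
              [ 4.0, -3.0, -1.0]])

# Full SVD: u is 4x4, s holds the 3 singular values, vt is v^T (3x3).
u, s, vt = np.linalg.svd(X, full_matrices=True)
v = vt.T
w = np.zeros((4, 3))
w[:3, :3] = np.diag(s)           # w has the same shape as X

# Flip the sign of the second column in both u and v.
u_flip = u.copy()
v_flip = v.copy()
u_flip[:, 1] *= -1
v_flip[:, 1] *= -1

# Both factorizations reproduce the same X ...
same_X = np.allclose(u_flip @ w @ v_flip.T, X)

# ... but the weighted eigenvector matrices differ in one column's sign.
AR = v @ w.T
AR_flip = v_flip @ w.T
print("X unchanged:", same_X)
print("A^R:\n", np.round(AR, 5))
print("A^R after the flip:\n", np.round(AR_flip, 5))
```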

most of the time, i will use my SVD results, but right now let’s compare davis with harman. Davis had A^R as…

\left(\begin{array}{ll} 7.48331&0.\\ -3.74166&2.44949\\ -3.74166&-2.44949\end{array}\right)

Davis showed us a plot of these numbers viewed as three points.

So the R-mode loadings A^R tell us how the 3 old variables are described by the new ones; we see that a plot of A^R corresponds to harman’s plot.
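To get the coordinates of those three points from an SVD, something like the following sketch would do (Python with NumPy; X is an assumption, reconstructed as A^Q v^T from the A^Q and v displayed in this post and rounded to integers). Each row of the first two columns of A^R is one old variable; the signs of whole columns may differ from davis' numbers because eigenvectors are only determined up to sign.

```python
import numpy as np

# Assumed centered data, reconstructed from the A^Q and v in this post.
X = np.array([[-6.0,  3.0,  3.0],
              [ 2.0,  1.0, -3.0],
              [ 0.0, -1.0,  1.0],
              [ 4.0, -3.0, -1.0]])

u, s, vt = np.linalg.svd(X, full_matrices=True)
v = vt.T
w = np.zeros((4, 3))
w[:3, :3] = np.diag(s)

AR = v @ w.T                     # R-mode loadings, 3x4
points = AR[:, :2]               # one row (point) per old variable

# davis' A^R as quoted in this post, for comparison up to column signs.
davis_AR = np.array([[ 7.48331,  0.0],
                     [-3.74166,  2.44949],
                     [-3.74166, -2.44949]])

for name, (x, y) in zip(["var 1", "var 2", "var 3"], points):
    print(f"{name}: ({x: .5f}, {y: .5f})")
```

These rows could be handed to any plotting routine to reproduce the kind of picture davis and harman show.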

i want to emphasize that even the orthogonal eigenvector matrix shows us that the data is really only 2D. Don’t think that we had to scale the eigenvectors in order to get 2D.

my orthogonal eigenvector matrix v is

\left(\begin{array}{lll} 0.816497&0&0.57735\\ -0.408248&-0.707107&0.57735\\ -0.408248&0.707107&0.57735\end{array}\right)

but with respect to that basis, the new data is A^Q:

\left(\begin{array}{lll} -7.34847&0.&0.\\ 2.44949&-2.82843&0.\\ 0.&1.41421&0.\\ 4.89898&1.41421&0.\end{array}\right)

i.e. the new data is described by the first two eigenvectors, hence it lies in a plane. but then so does the original data, because the transformation between them is linear (a linear map cannot spread a plane out over all of 3D, although it could collapse it to a line). we could plot the data in 3D and rotate the point of view until we see the plane edge on.
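The claim that the data lies in a plane can be checked without any plotting. Here is a minimal sketch in Python with NumPy; X is an assumption (reconstructed as A^Q v^T from the matrices above and rounded to integers), and v is copied from the display above. The third column of v is the normal to the plane, so every observation should have a zero component along it.

```python
import numpy as np

# Assumed centered data, reconstructed from the A^Q and v shown above.
X = np.array([[-6.0,  3.0,  3.0],
              [ 2.0,  1.0, -3.0],
              [ 0.0, -1.0,  1.0],
              [ 4.0, -3.0, -1.0]])

# Orthogonal eigenvector matrix v as printed in this post.
v = np.array([[ 0.816497,  0.0,      0.57735],
              [-0.408248, -0.707107, 0.57735],
              [-0.408248,  0.707107, 0.57735]])

rank = np.linalg.matrix_rank(X)  # dimension of the space the data spans
normal = v[:, 2]                 # third eigenvector: normal to the plane
off_plane = X @ normal           # component of each observation along it

AQ = X @ v                       # new data with respect to the basis v

print("rank of X:", rank)
print("components along the normal:", np.round(off_plane, 6))
print("A^Q = X v:\n", np.round(AQ, 5))
```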

Do davis and harman’s models resemble each other?

Let’s recall the SVD form of the A’s and S’s.

X = u\ w\ v^T

A^R = v\ w^T.

A^Q = u\ w.

S^R = X\ A^R = u\ w\ w^T= A^Q\ w^T.

S^Q = X^T\ A^Q = v\ w^T\ w = A^R\ w.

Then

X = u\ w\ v^T = A^Q\ v^T = u\ (A^R)^T.

in particular,

X\ v = A^Q and then X^T = Z = v\ (A^Q)^T.

Z = v\ (A^Q)^T is harman’s model (Z = A F, where Z = X^T, A is a weighted eigenvector matrix, F is the new data), while A^Q = X\ v is jolliffe’s (Z = X A, where Z is the new data, X is the original data, and A is the orthogonal eigenvector matrix). i think it would be unfair to say that davis is using either of those models, especially since harman and jolliffe (usually) work from the correlation matrix.
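The SVD identities for the A’s, the S’s and X can be confirmed numerically. This is a minimal sketch in Python with NumPy, assuming the centered data X reconstructed as A^Q v^T from the matrices displayed earlier (rounded to integers); the dictionary `checks` is just a hypothetical convenience for collecting the results.

```python
import numpy as np

# Assumed centered data, reconstructed from the A^Q and v in this post.
X = np.array([[-6.0,  3.0,  3.0],
              [ 2.0,  1.0, -3.0],
              [ 0.0, -1.0,  1.0],
              [ 4.0, -3.0, -1.0]])

u, s, vt = np.linalg.svd(X, full_matrices=True)
v = vt.T
w = np.zeros((4, 3))
w[:3, :3] = np.diag(s)           # singular values on the diagonal of w

AR = v @ w.T                     # R-mode loadings, 3x4
AQ = u @ w                       # Q-mode loadings, 4x3
SR = X @ AR                      # R-mode scores,   4x4
SQ = X.T @ AQ                    # Q-mode scores,   3x3

# Each entry is True when the identity holds to floating-point accuracy.
checks = {
    "X = u w v^T":     np.allclose(X, u @ w @ v.T),
    "S^R = u w w^T":   np.allclose(SR, u @ w @ w.T),
    "S^R = A^Q w^T":   np.allclose(SR, AQ @ w.T),
    "S^Q = v w^T w":   np.allclose(SQ, v @ w.T @ w),
    "S^Q = A^R w":     np.allclose(SQ, AR @ w),
    "X = A^Q v^T":     np.allclose(X, AQ @ v.T),
    "X = u (A^R)^T":   np.allclose(X, u @ AR.T),
    "X^T = v (A^Q)^T": np.allclose(X.T, v @ AQ.T),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```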

i would summarize davis with all four A’s and S’s.

i can’t, however, say too many times that A^R and A^Q can be constructed using u or v:

A^R = X^T\ u = v\ w^T,

and

A^Q = X\ v = u\ w.
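Both routes to each A can be compared side by side. The sketch below is Python with NumPy and again assumes the centered data X reconstructed as A^Q v^T from the matrices displayed earlier (rounded to integers); the variable names are only for illustration.

```python
import numpy as np

# Assumed centered data, reconstructed from the A^Q and v in this post.
X = np.array([[-6.0,  3.0,  3.0],
              [ 2.0,  1.0, -3.0],
              [ 0.0, -1.0,  1.0],
              [ 4.0, -3.0, -1.0]])

u, s, vt = np.linalg.svd(X, full_matrices=True)
v = vt.T
w = np.zeros(X.shape)
w[:3, :3] = np.diag(s)

AR_from_u = X.T @ u              # A^R built from the data and u
AR_from_v = v @ w.T              # A^R built from v and the singular values
AQ_from_v = X @ v                # A^Q built from the data and v
AQ_from_u = u @ w                # A^Q built from u and the singular values

print("A^R agrees:", np.allclose(AR_from_u, AR_from_v))
print("A^Q agrees:", np.allclose(AQ_from_v, AQ_from_u))
```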

i see no grounds for associating A^R with just u or just v. and no grounds for associating either u or v primarily with R-mode or Q-mode; they’re equally important to the SVD.

i would probably focus on v and A^Q: one defines new variables, the other gives me the associated new data.
