PCA / FA Example 4: Davis, and almost everyone else

I would like to revisit the work we did on Davis (example 4). For one thing, I did a lot of calculations with that example, and despite the compare-and-contrast posts towards the end, I fear it may be difficult to sort out what I finally concluded.

In addition, my notation has settled down a bit since then, and I would like to recast the work using my current notation.

The original (“raw”) data for example 4 was the following (p. 502; columns are variables):

X_r = \left(\begin{array}{lll} 4 & 27 & 18 \\ 12 & 25 & 12 \\ 10 & 23 & 16 \\ 14 & 21 & 14\end{array}\right)
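As a quick Mathematica sketch (Xr is just my name for the raw data above), subtracting each column’s mean gives the centered matrix X that the posts below work with:

    Xr = {{4, 27, 18}, {12, 25, 12}, {10, 23, 16}, {14, 21, 14}};
    X = # - Mean[Xr] & /@ Xr    (* subtract the column means {10, 24, 15} from each row *)
    (* {{-6, 3, 3}, {2, 1, -3}, {0, -1, 1}, {4, -3, -1}} *)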
Read the rest of this entry »

PCA / FA Example 4: Davis. Davis & Harman 3.

Hey, since we have the reciprocal basis, we can project onto it to get the components wrt the A^R basis. After all that work to show that the S^R are the components wrt the reciprocal basis, we ought to find the components wrt the A^R basis as well, and now we know that that’s just a projection onto the reciprocal basis: we want to compute X B.

Recall X:

X = \left(\begin{array}{lll} -6 & 3 & 3 \\ 2 & 1 & -3 \\ 0 & -1 & 1 \\ 4 & -3 & -1\end{array}\right)

Recall B:

B = \left(\begin{array}{ll} 0.0890871 & 0. \\ -0.0445435 & -0.204124 \\ -0.0445435 & 0.204124\end{array}\right)

The product is:

X\ B = \left(\begin{array}{ll} -0.801784 & 0 \\ 0.267261 & -0.816497 \\ 0 & 0.408248 \\ 0.534522 & 0.408248\end{array}\right)

What are the column variances?

\{0.333333,0.333333\}
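A minimal Mathematica check of both the product and those variances, assuming X and B have been entered as above:

    XB = X.B;
    XB // MatrixForm    (* matches the product displayed above *)
    Variance[XB]        (* column variances: {0.333333, 0.333333} *)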

Does that surprise you? Read the rest of this entry »

PCA / FA Example 4: Davis. Davis & Harman 2.

What about redistributing the variance?

We learned from Harman that if we use the orthogonal eigenvector matrix v to compute new data, the result redistributes the original variance according to the eigenvalues; and we also learned that if we use the weighted eigenvector matrix instead, we get new variables of unit variance.

(I had thought this required us to start with the correlation matrix; it does not.)

Let’s see this again for Davis’ centered data X:

\left(\begin{array}{lll} -6&3&3\\ 2&1&-3\\ 0&-1&1\\ 4&-3&-1\end{array}\right)

We have variances…

\{\frac{56}{3},\ \frac{20}{3},\ \frac{20}{3}\}

or…

\{18.6667,\ 6.66667,\ 6.66667\}

and the sum of the variances is…

32.
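Here’s a quick Mathematica confirmation of those numbers, plus a sketch of the redistribution claim above, with X as the centered data (this is only a sketch of the idea, not Davis’ own computation):

    Variance[X]                       (* column variances: {56/3, 20/3, 20/3} *)
    Total[Variance[X]]                (* 32, which is also the trace of Covariance[X] *)

    (* redistribute: the new data X.P has the eigenvalues as its column variances *)
    {vals, vecs} = Eigensystem[N[Covariance[X]]];
    P = Transpose[vecs];              (* orthogonal eigenvector matrix, eigenvectors in columns *)
    Variance[X.P]                     (* the eigenvalues vals; they still sum to 32 *)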

Read the rest of this entry »

PCA / FA Example 4: Davis. Davis & Harman 1.

Let us now recall what we saw Harman do. Does he have anything to add to what we saw Davis do? If we’re building a toolbox for PCA / FA, we need to see whether Harman gave us any additional tools.

Here’s what Harman did, or got us to do:

  • a plot of old variables in terms of the first two new ones
  • redistribute the variance of the original variables
  • his model, written Z = A F, was (old data, variables in rows) = (some eigenvector matrix) × (new data)

We should remind ourselves, however, that Harman used an eigendecomposition of the correlation matrix, and the eigenvectors of the correlation matrix are not simply related to the eigenvectors of the covariance matrix (or, equivalently, of X^T\ X). Having said that, I’m going to go ahead and apply Harman’s tools to the Davis results; I will work from the correlation matrix when we get to Jolliffe.

What exactly was that graph of Harman’s? He took the first two columns of the weighted eigenvector matrix, then plotted each row as a point. He had 5 original variables. His plot shows that the old 1st and 3rd variables, for example, are defined very similarly in terms of the first two new variables. Let me emphasize that this is a description of how the old variables are defined, not a display of the resulting data. Simple counting can help with the distinction, e.g. 3 variables but 4 observations.
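To make that concrete, here is a minimal Mathematica sketch of such a plot, assuming A holds a weighted eigenvector matrix with one row per old variable (the variable names here are mine):

    A2 = A[[All, 1 ;; 2]];    (* first two columns: each old variable's loadings on the first two new ones *)
    ListPlot[MapIndexed[Callout[#1, First[#2]] &, A2],    (* label each point with its old-variable number *)
      AxesLabel -> {"new variable 1", "new variable 2"}, PlotRange -> All]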

Read the rest of this entry »

PCA / FA Example 1: Harman. Discussion 2.

Recall that we had discovered that Harman’s new F variables were uncorrelated and standardized: they had mean 0 and variance 1. He had implied, however, that the variances of the new variables would be the eigenvalues of the correlation matrix of Z.

He didn’t get that. Can we?

Instead of his model

Z = A F 

(where Z is the standardized data with variables in rows, A is a weighted eigenvector matrix, and F is the transformed, new, data; we also write X = Z^T for the same data with observations in rows), which implies

F^T = X \ A^{-T}

we take

Z = P Y,

(where P is an orthogonal eigenvector matrix, and Y is the transformed, new, data), i.e.

Z^T = Y^T \ P^T

hence

Y^T = Z^T \ P^{-T} = X P.

(In words, we fix his messy A^{-T} by using an orthogonal eigenvector matrix P, for which P^{-T} = P.)
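Here is a rough Mathematica sketch of that fix; X0 is a hypothetical data matrix with observations in rows (the names X0, vals, vecs are mine, just for illustration):

    mu = Mean[X0]; sd = StandardDeviation[X0];
    X = (# - mu)/sd & /@ X0;                 (* standardize each column: this is Z^T, observations in rows *)
    {vals, vecs} = Eigensystem[N[Correlation[X0]]];
    P = Transpose[vecs];                     (* orthogonal eigenvector matrix, so Transpose[Inverse[P]] == P *)
    XP = X.P;                                (* this is Y^T, the new data with observations in rows *)
    Variance[XP]                             (* the new variables' variances come out equal to vals *)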
Read the rest of this entry »

PCA / FA Example 1: Harman. Discussion 1.



Harman gave us two outputs as a result of his analysis: a table and a picture.

Let’s consider the picture first, which plots the 5 z variables in terms of F1 and F2. It clearly shows z2 and z5 similar, z1 and z3 similar, and z4 roughly in the middle between them.

I don’t know about you, but this surprised me. Should it have? Well, let’s take a look at the correlation matrix, from which we got our results. I’m going to round it off just a little bit, so I can refer to 3-digit numbers instead of 5.
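In Mathematica, that rounding is one line; assuming R holds the correlation matrix:

    Round[R, 0.001] // MatrixForm    (* round every entry to 3 decimal places, for display only *)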
Read the rest of this entry »

PCA / FA Example 1: Harman. What he did.

Less is more. And more is huge. It is easy for me to end up with huge posts to put out here, but I’d rather go with smaller ones.

Let’s get started with PCA / FA, principal components analysis and factor analysis.

In case it matters, I am using Mathematica to do these computations.

Here is an example, the first of several. This comes from Harman’s “Factor Analysis”. In order to emphasize the distinction between PCA and FA, he has one example of principal component analysis, and this is it.

Let me tell you up front what he did (a Mathematica sketch of these steps follows the list):

  • Get some data;
  • Compute its correlation matrix;
  • Find the eigenstructure of the correlation matrix;
  • Weight each eigenvector by the square root of its eigenvalue;
  • Tabulate the results;
  • Plot the original variables in the space of the two largest principal components.
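Here is a minimal Mathematica sketch of those steps; X0 stands for some data matrix with observations in rows (a placeholder name, not Harman’s notation):

    R = Correlation[X0];                 (* correlation matrix of the data *)
    {vals, vecs} = Eigensystem[N[R]];    (* eigenvalues (largest first) and eigenvectors (as rows) *)
    A = Transpose[Sqrt[vals] vecs];      (* weight each eigenvector by the square root of its eigenvalue *)
    A // MatrixForm                      (* tabulate the results *)
    ListPlot[A[[All, 1 ;; 2]],           (* plot the original variables in the plane of the two largest components *)
      AxesLabel -> {"component 1", "component 2"}, PlotRange -> All]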

I also need to say that his conceptual model is written

Z = A F,

and from the dimensions of the matrices it is clear that

A is square, k by k

Z and F are the same shape, with k rows.

We infer from its size that A will be derived from the eigenvector matrix, and that Z is derived from the given data matrix. From the shapes, we conclude that Z has observations in columns, rather than in rows. (If you’re used to econometrics or regression, you expect the transpose, observations in rows.)
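In symbols, writing n for the number of observations (n is my notation here, just to make the shapes explicit):

Z_{(k\times n)} = A_{(k\times k)}\ F_{(k\times n)}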

But this is a fine thing, because we recognize that Z = A F is a change-of-basis equation for corresponding columns of Z and F; A is a transition matrix mapping new components (any one column of F) to old components (a column of Z).

Read the rest of this entry »