PCA / FA. Example 4 ! Davis, and almost everyone else

I would like to revisit the work we did in Davis (example 4). For one thing, I did a lot of calculations with that example, and despite the compare-and-contrast posts towards the end, I fear it may be difficult to sort out what I finally came to.

In addition, my notation has settled down a bit since then, and I would like to recast the work using my current notation.

The original (“raw”) data for example 4 was (p. 502, and columns are variables):

X_r = \left(\begin{array}{lll} 4 & 27 & 18 \\ 12 & 25 & 12 \\ 10 & 23 & 16 \\ 14 & 21 & 14\end{array}\right)
Read the rest of this entry »

PCA / FA example 4: davis. Summary.

Considering how much I learned from Davis’ little example, I almost hate to summarize it. The following just does not do justice to it. We’ll see if the passage of time – coming back to Davis with fresh eyes after doing the next example – improves the summary.

OTOH, an awful lot of what I got out of Davis came from bringing linear algebra to it, and just plain mucking about with the matrices. And I discovered, having picked up the two chemistry books since I drafted this post, that they made a lot more sense than when I first looked at them. I have learned a lot from working with Davis’ example.

Here’s what I have. My own strong preference, but I’d call it a personal choice, is to use the SVD of the data matrix X:

X = u\ w\ v^T

and compute the A’s and S’s (R-mode and Q-mode loadings and scores):

A^R = v\ w^T.

A^Q = u\ w.

S^R = X\ A^R = u\ w\ w^T = A^Q\ w^T.

S^Q = X^T\ A^Q = v\ w^T\ w = A^R\ w.

Those equations are worth having as a conceptual basis even if one chooses to do eigendecompositions of X^T\ X or X\ X^T instead of the SVD.
Read the rest of this entry »

PCA / FA example 4: davis. Davis & Jolliffe.

What tools did jolliffe give us (that i could confirm)?

  1. Z = X A, A orthogonal, X old data variables in columns
  2. serious rounding
  3. plot data wrt first two PCs
  4. how many variables to keep?

but jolliffe would have used the correlation matrix. Before we do anything else, let’s get the correlation matrix for davis’ example. Recall the centered data X:

X = \left(\begin{array}{lll} -6 & 3 & 3 \\ 2 & 1 & -3 \\ 0 & -1 & 1 \\ 4 & -3 & -1\end{array}\right)

i compute its correlation matrix:

\left(\begin{array}{lll} 1. & -0.83666 & -0.83666 \\ -0.83666 & 1. & 0.4 \\ -0.83666 & 0.4 & 1.\end{array}\right)

Now get an eigendecomposition of the correlation matrix; i called the eigenvalues \lambda and the eigenvector matrix A. Here’s A:

A = \left(\begin{array}{lll} -0.645497 & 0 & -0.763763 \\ 0.540062 & -0.707107 & -0.456435 \\ 0.540062 & 0.707107 & -0.456435\end{array}\right)

If we compute A^T\ A and get an identity matrix, then A is orthogonal. (it is.)
Read the rest of this entry »

PCA / FA example 4: davis. Reciprocal basis 4.

(this has nothing to do with the covariance matrix of X; we’re back with A^R for the SVD of X.)

in the course of computing the reciprocal basis for my cut down A^R

\left(\begin{array}{ll} 7.48331 & 0. \\ -3.74166 & -2.44949 \\ -3.74166 & 2.44949\end{array}\right)

i came up with the following matrix:

\beta = \left(\begin{array}{ll} 0.0445435 & 0 \\ -0.0890871 & -0.204124 \\ -0.0890871 & 0.204124\end{array}\right)

now, \beta is very well behaved. Its two columns are orthogonal to each other:
Read the rest of this entry »

PCA / FA example 4: davis. Davis & harman 3.

Hey, since we have the reciprocal basis, we can project onto it to get the components wrt the A^R basis. After all that work to show that the S^R are the components wrt the reciprocal basis, we ought to find the components wrt the A^R basis. And now we know that that’s just a projection onto the reciprocal basis: we want to compute X B.

Recall X:

X = \left(\begin{array}{lll} -6 & 3 & 3 \\ 2 & 1 & -3 \\ 0 & -1 & 1 \\ 4 & -3 & -1\end{array}\right)

Recall B:

B = \left(\begin{array}{ll} 0.0890871 & 0. \\ -0.0445435 & -0.204124 \\ -0.0445435 & 0.204124\end{array}\right)

The product is:

X\ B = \left(\begin{array}{ll} -0.801784 & 0 \\ 0.267261 & -0.816497 \\ 0 & 0.408248 \\ 0.534522 & 0.408248\end{array}\right)

What are the column variances?


Does that surprise you? Read the rest of this entry »

PCA / FA example 4: davis. Reciprocal basis 2 & 3.

The reciprocal basis for A^R using explicit bases.

Here I go again, realizing that I’m being sloppy. i call


the second data vector, but of course, those are the components of the second data vector. a lot of us blur this distinction a lot of the time, between a vector and its components. So long as we work only with components, this isn’t an issue, but we’re about to write vectors. i wouldn’t go so far as to say that the whole point of matrix algebra is to let us blur that distinction, but it’s certainly a major reason why we do matrix algebra. but for now, let’s look at the linear algebra, distinquishing between vectors and their components.

i need a name for the second data vector; let’s call it s. we will write it two ways, with respect to two bases, and show that the two ways are equivalent.
Read the rest of this entry »

PCA / FA example 4: davis. Scores & reciprocal basis 1.

The reciprocal basis for A^R, using matrix multiplication.

i keep emphasizing that A^Q is the new data wrt the orthogonal eigenvector matrix; and i hope i’ve said that the R-mode scores S^R = X\ A^R are not the new data wrt the weighted eigenvector matrix A^R. In a very real sense, the R-mode scores S^R do not correspond to the R-mode loadings A^R. This is important enough that i want to work out the linear algebra. In fact, i’ll work it out thrice. (And i’m going to do it differently from the original demonstration that A^Q is the new data wrt v.)

Of course, people will expect us compute the scores S^R, and I’ll be perfectly happy to. I just won’t let anyone tell me they correspond to the A^R.

First off, we have been viewing the original data as vectors wrt an orthonormal basis. Recall the design matrix:

\left(\begin{array}{lll} -6 & 3 & 3 \\ 2 & 1 & -3 \\ 0 & -1 & 1 \\ 4 & -3 & -1\end{array}\right)

Let’s take just the second observation:

X_2 = (2,\ 1,\ -3)

Those are the components of a vector.
Read the rest of this entry »