## PCA / FA example 4: Davis. Summary.

Considering how much I learned from Davis’ little example, I almost hate to summarize it. The following just does not do justice to it. We’ll see if the passage of time – coming back to Davis with fresh eyes after doing the next example – improves the summary.

OTOH, an awful lot of what I got out of Davis came from bringing linear algebra to it, and just plain mucking about with the matrices. And I discovered, having picked up the two chemistry books since I drafted this post, that they made a lot more sense than when I first looked at them. I have learned a lot from working with Davis’ example.

Here’s what I have. My own strong preference, though I’d call it a personal choice, is to use the SVD of the data matrix X:

$X = u\ w\ v^T$

and compute the A’s and S’s (R-mode and Q-mode loadings and scores):

$A^R = v\ w^T$.

$A^Q = u\ w$.

$S^R = X\ A^R = u\ w\ w^T = A^Q\ w^T$.

$S^Q = X^T\ A^Q = v\ w^T\ w = A^R\ w$.
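Here is a minimal NumPy sketch of those four definitions, just to check that the shapes and identities hold. The data matrix is a small made-up one (an assumption for illustration, not Davis’ data), and the names `AR`, `AQ`, `SR`, `SQ` are my own labels for $A^R$, $A^Q$, $S^R$, $S^Q$. I build w as a full matrix the same shape as X so that the transposes in the equations mean something.

```python
import numpy as np

# Made-up data: 4 observations (rows) by 3 variables (columns).
X = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 4.0],
              [3.0, 2.0, 2.0]])
m, n = X.shape

# Full SVD: u is m x m, vT is n x n, s holds the singular values.
u, s, vT = np.linalg.svd(X, full_matrices=True)
v = vT.T

# w has the same shape as X, with the singular values on its diagonal.
w = np.zeros((m, n))
w[:len(s), :len(s)] = np.diag(s)

AR = v @ w.T    # R-mode loadings, n x m
AQ = u @ w      # Q-mode loadings, m x n
SR = X @ AR     # R-mode scores,   m x m
SQ = X.T @ AQ   # Q-mode scores,   n x n

# The identities from the equations above.
assert np.allclose(X, u @ w @ vT)
assert np.allclose(SR, u @ w @ w.T)
assert np.allclose(SR, AQ @ w.T)
assert np.allclose(SQ, v @ w.T @ w)
assert np.allclose(SQ, AR @ w)
```

Nothing here depends on X being centered; the identities follow from u and v being orthogonal.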

Those equations are worth having as a conceptual basis even if one chooses to do eigendecompositions of $X^T\ X$ or $X\ X^T$ instead of the SVD.
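To see the link between the two routes, here is a small sketch (same made-up matrix, an assumption for illustration) comparing the SVD with eigendecompositions of $X^T\ X$ and $X\ X^T$. The eigenvalues should be the squared singular values, and the eigenvectors of $X^T\ X$ should match the columns of v up to sign. The sign alignment step is my own bookkeeping, since eigenvectors are only determined up to sign.

```python
import numpy as np

X = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 4.0],
              [3.0, 2.0, 2.0]])

u, s, vT = np.linalg.svd(X, full_matrices=False)
v = vT.T

# eigh returns eigenvalues in ascending order; reverse to match the SVD.
evals_r, evecs_r = np.linalg.eigh(X.T @ X)
evals_r, evecs_r = evals_r[::-1], evecs_r[:, ::-1]

# Eigenvalues of X^T X are the squared singular values of X.
assert np.allclose(evals_r, s**2)

# Eigenvectors agree with the columns of v up to a sign per column.
signs = np.sign(np.sum(evecs_r * v, axis=0))
assert np.allclose(evecs_r * signs, v)

# X X^T has the same nonzero eigenvalues, plus a zero (X is 4 x 3).
evals_q = np.linalg.eigvalsh(X @ X.T)[::-1]
assert np.allclose(evals_q[:3], s**2)
assert abs(evals_q[3]) < 1e-8
```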

• I would construct a reciprocal basis for $A^R$.
• And, whenever I construct any basis, I would compute the data wrt that basis.
• And I would probably compute the variances of any new forms of the data.
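As a sketch of that last bullet: if the "new form" of the data is X expressed wrt the orthogonal matrix v, and if X has been centered by columns (an assumption I am adding here so that the variances have a clean form), then the sample variances of the new variables are the squared singular values divided by one less than the number of observations. The matrix is made up for illustration.

```python
import numpy as np

X = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 4.0],
              [3.0, 2.0, 2.0]])
m = X.shape[0]

# Center each column (assumption: variances are taken about the mean).
Xc = X - X.mean(axis=0)

u, s, vT = np.linalg.svd(Xc, full_matrices=False)

# New data: the centered data expressed wrt the columns of v.
new_data = Xc @ vT.T

# Sample variances of the new variables.
variances = new_data.var(axis=0, ddof=1)
assert np.allclose(variances, s**2 / (m - 1))

# The total variance is unchanged by the change of basis.
assert np.isclose(variances.sum(), Xc.var(axis=0, ddof=1).sum())
```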

The Q-mode loadings $A^Q$ are the new data wrt the orthogonal eigenvector matrix v (properly named, the columns of v are the right principal vectors of X, but I think of them as the eigenvectors of $X^T\ X$). The R-mode loadings $A^R$ (cut down a little) can be thought of as a non-orthonormal basis, and the R-mode scores $S^R$ are the new data wrt the reciprocal basis for $A^R$.
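Those statements can be checked numerically. In the sketch below I take "cut down" to mean dropping the zero columns of $A^R$, which is what the economy-size SVD gives; that reading, the made-up matrix, and the name `B` for the reciprocal basis are my assumptions. I compute the reciprocal basis from the pseudoinverse, so that the columns of `B` and the columns of `AR` are biorthogonal.

```python
import numpy as np

X = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 4.0],
              [3.0, 2.0, 2.0]])

# Economy SVD: u is 4 x 3, s has 3 entries, vT is 3 x 3.
u, s, vT = np.linalg.svd(X, full_matrices=False)
v = vT.T

AQ = u * s    # u @ diag(s): cut-down Q-mode loadings
AR = v * s    # v @ diag(s): cut-down R-mode loadings

# A^Q is the data expressed wrt the orthonormal basis v.
assert np.allclose(X @ v, AQ)

# Reciprocal basis B for the columns of AR: B^T AR = I.
B = np.linalg.pinv(AR).T
assert np.allclose(B.T @ AR, np.eye(len(s)))

# S^R holds the components of the data wrt the reciprocal basis B:
# each row of X is rebuilt as a combination of the columns of B.
SR = X @ AR
assert np.allclose(SR @ B.T, X)

# The components of the data wrt the AR basis itself turn out to be u.
C = X @ B
assert np.allclose(C, u)
assert np.allclose(C @ AR.T, X)
```

The pattern is the usual one for a non-orthonormal basis: components wrt a basis are found by dotting with its reciprocal basis, and vice versa.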

A lot of what I did with Davis’ example was look at alternative forms and the duality between R-mode and Q-mode. By using the SVD, I choose one particular form, and I have the duality.

Bear in mind that the SVD does not require that X be centered. But that’s a whole ’nother story.