PCA / FA example 9: centered and raw, 3 models

What follows is simple computation, solely to show us exactly what happens. It continues the work of the previous post, which did my default calculations for standardized data. Here I do the same calculations for centered and raw data.

Centered

The raw data is still

\text{raw} = \left(\begin{array}{lll} 2.09653 & -0.793484 & -7.33899 \\ -1.75252 & 13.0576 & 0.103549 \\ 3.63702 & 29.0064 & 8.52945 \\ 0.0338101 & 46.912 & 19.8517 \\ 5.91502 & 70.9696 & 36.0372\end{array}\right)

I will center the data and call it Xc. Get the column means…

\{1.98597,\ 31.8304,\ 11.4366\}

and subtract them from each column to get Xc:
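
In NumPy, that centering step is a one-line subtraction; here is a minimal sketch (Xc as above, the other names are just for illustration):

    import numpy as np

    # the raw data of example 9, observations in rows
    raw = np.array([[ 2.09653,   -0.793484,  -7.33899 ],
                    [-1.75252,   13.0576,     0.103549],
                    [ 3.63702,   29.0064,     8.52945 ],
                    [ 0.0338101, 46.912,     19.8517  ],
                    [ 5.91502,   70.9696,    36.0372  ]])

    means = raw.mean(axis=0)     # {1.98597, 31.8304, 11.4366}
    Xc = raw - means             # subtract each column mean from its column
    print(Xc.mean(axis=0))       # the columns of Xc now have (numerically) zero means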

PCA / FA example 9: standardized data, 3 models

Introduction

edited 16 Jan 2009: I found a place where I called F^T the loadings instead of the scores. That’s all.

I want to run thru what is admittedly a toy case, but this seems to be where I stand on the computation of PCA / FA.

Recall the raw data of example 9:

\text{raw} = \left(\begin{array}{lll} 2.09653 & -0.793484 & -7.33899 \\ -1.75252 & 13.0576 & 0.103549 \\ 3.63702 & 29.0064 & 8.52945 \\ 0.0338101 & 46.912 & 19.8517 \\ 5.91502 & 70.9696 & 36.0372\end{array}\right)

Get the mean and variance of each column. The means are

\{1.98597,\ 31.8304,\ 11.4366\}

and the variances are

\{8.99072,\ 796.011,\ 291.354\}

Since the means are not zero and the variances are not 1, the raw data will differ from the centered data, and that in turn will differ from the standardized data. Let’s do the standardized data first, because that’s what we’ve been doing most recently.
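
Here is a minimal NumPy sketch of the standardization, assuming the sample (N-1) variance, which is what the numbers above use:

    import numpy as np

    raw = np.array([[ 2.09653,   -0.793484,  -7.33899 ],
                    [-1.75252,   13.0576,     0.103549],
                    [ 3.63702,   29.0064,     8.52945 ],
                    [ 0.0338101, 46.912,     19.8517  ],
                    [ 5.91502,   70.9696,    36.0372  ]])

    means = raw.mean(axis=0)                # {1.98597, 31.8304, 11.4366}
    variances = raw.var(axis=0, ddof=1)     # {8.99072, 796.011, 291.354}, with N-1 in the denominator
    X = (raw - means) / np.sqrt(variances)  # standardized data: each column has mean 0 and variance 1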

Here’s what I’m going to do (a numerical sketch in code follows the list). For a data matrix X

  1. get the SVD, X = u\ w\ v^T
  2. get the eigenvalues \lambda\ \text{of } X^T\ X/\left(N-1\right) (for the standardized and the centered data, that’s the correlation matrix or the covariance matrix, respectively)
  3. form the diagonal matrix \Lambda\ \text{of } \sqrt{\lambda}
  4. form the weighted eigenvector matrix A = v\ \Lambda
  5. form the scores F^T = \sqrt{N-1}\ u
  6. form the new data Y wrt v, Y = u w
  7. form Davis’ loadings A^R = v\ w^T
  8. form Davis’ scores S^R = X\ A^R\ .
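
Here is that sketch in NumPy, for the standardized data, using the thin SVD so that w is the square diagonal matrix of singular values (in the full SVD, w is rectangular and only its leading block is nonzero); the variable names are mine:

    import numpy as np

    raw = np.array([[2.09653, -0.793484, -7.33899],
                    [-1.75252, 13.0576, 0.103549],
                    [3.63702, 29.0064, 8.52945],
                    [0.0338101, 46.912, 19.8517],
                    [5.91502, 70.9696, 36.0372]])
    N = raw.shape[0]
    X = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)       # standardized data

    # 1. the SVD, X = u w v^T
    u, sv, vT = np.linalg.svd(X, full_matrices=False)
    v = vT.T
    w = np.diag(sv)

    # 2. the eigenvalues of X^T X / (N-1) -- here, the correlation matrix
    lam = np.linalg.eigvalsh(X.T @ X / (N - 1))[::-1]            # reversed to descending order

    # 3. the diagonal matrix Lambda of sqrt(eigenvalues)
    Lam = np.diag(np.sqrt(lam))

    # 4. the weighted eigenvector matrix A = v Lambda
    A = v @ Lam

    # 5. the scores F^T = sqrt(N-1) u
    FT = np.sqrt(N - 1) * u

    # 6. the new data wrt v, Y = u w
    Y = u @ w

    # 7. Davis' loadings A^R = v w^T (w is square and diagonal here, so the transpose is harmless)
    AR = v @ w.T

    # 8. Davis' scores S^R = X A^R
    SR = X @ AR

    print(np.allclose(X, FT @ A.T))   # check: X = F^T A^T, i.e. Z = X^T = A F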


PCA / FA Example 9: scores & loadings

I want to look at reconstituting the data. Equivalently, I want to look at setting successive singular values to zero.

This example was actually built on the previous one. Before I set the row sums to 1, I had started with

t1 = \left(\begin{array}{lll} 1 & 1 & -3 \\ -1 & 2 & -2 \\ 1 & 3 & -1 \\ -1 & 4 & 1 \\ 1 & 5 & 4\end{array}\right)

I’m going to continue with Harman’s & Bartholomew’s model: Z = A F, Z = X^T, X is standardized, A is an eigenvector matrix weighted by the square roots of the eigenvalues of the correlation matrix of X.
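
Setting successive singular values to zero is easy to sketch with the SVD. Here is a minimal NumPy version, using the matrix t1 above just to have concrete numbers; the helper reconstitute is mine, not anything from the posts:

    import numpy as np

    t1 = np.array([[ 1, 1, -3],
                   [-1, 2, -2],
                   [ 1, 3, -1],
                   [-1, 4,  1],
                   [ 1, 5,  4]], dtype=float)

    u, sv, vT = np.linalg.svd(t1, full_matrices=False)

    def reconstitute(k):
        """Rebuild t1 after zeroing all but the first k singular values."""
        sv_k = sv.copy()
        sv_k[k:] = 0.0
        return u @ np.diag(sv_k) @ vT

    for k in (3, 2, 1):
        print(k, np.linalg.norm(t1 - reconstitute(k)))   # the error grows as singular values are dropped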

I want data with one eigenvalue so large that we could sensibly retain only that one. Let me show you how I got that.

PCA / FA Example 8: the pseudo-inverse

introduction

Recall Harman’s or Bartholomew’s model

Z = A F

with Z = X^T\ , X standardized, and A a \sqrt{\text{eigenvalue}}\ -weighted eigenvector matrix, with eigenvalues from the correlation matrix.

We saw how to compute the scores F^T in the case that A was invertible (here). If, however, any eigenvalues are zero then A will have that many columns of zeroes and will not be invertible.

What to do?

One possibility – shown in at least one of the references, and, quite honestly, one of the first things I considered – is to use a particular example of a pseudo-inverse. I must tell you up front that this is not what I would recommend, but since you will see it out there, you should see why I don’t recommend it.

(Answer: it works, it gets the same answer, but computing the pseudo-inverse explicitly is unnecessary. In fact, it’s unnecessary even if we don’t have the Singular Value Decomposition (SVD) available to us.)
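
To see that numerically, here is a sketch under assumptions of my own: an artificially rank-deficient data matrix (two columns of the example 9 data plus their sum), so that one eigenvalue is zero and A has a column of zeroes. The pseudo-inverse route and the direct route then agree on the retained components:

    import numpy as np

    # an artificial rank-deficient example: two columns of the example 9 data
    # plus a third column that is just their sum
    c1 = np.array([2.09653, -1.75252, 3.63702, 0.0338101, 5.91502])
    c2 = np.array([-0.793484, 13.0576, 29.0064, 46.912, 70.9696])
    raw = np.column_stack([c1, c2, c1 + c2])
    N = raw.shape[0]

    X = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)   # standardized; rank 2
    Z = X.T                                                  # Z = X^T

    u, sv, vT = np.linalg.svd(X, full_matrices=False)
    v = vT.T
    lam = sv**2 / (N - 1)                    # eigenvalues of the correlation matrix
    lam[lam < 1e-12] = 0.0                   # treat the numerically zero eigenvalue as exactly zero
    A = v @ np.diag(np.sqrt(lam))            # sqrt(eigenvalue)-weighted eigenvectors; one column of zeroes

    F_pinv   = np.linalg.pinv(A) @ Z         # scores via the pseudo-inverse of A
    F_direct = np.sqrt(N - 1) * u.T          # scores without it, as sqrt(N-1) u (transposed)

    # they agree on the two components with nonzero eigenvalues;
    # the pseudo-inverse simply returns zero for the component that was dropped
    print(np.allclose(F_pinv[:2], F_direct[:2]))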

PCA / FA Bartholomew et al.: discussion

This is the 4th post in the Bartholomew et al. sequence in PCA/FA, but it’s an overview of what I did last time. Before we plunge ahead with another set of computations, let me talk about things.

I want to elaborate on the previous post. We discussed the following (a short numerical check of a few of these points follows the list):

  • the choice of data corresponding to an eigendecomposition of the correlation matrix
  • the pesky \sqrt{N-1} that shows up when we relate the \sqrt{\text{eigenvalues}},\ \Lambda, of a correlation matrix to the principal values w of the standardized data
  • the computation of scores as \sqrt{N-1}\ u
  • the computation of scores as F^T\ where X^T = Z = A\ F
  • the computation of scores as projections of the data onto the reciprocal basis
  • different factorings of the data matrix as scores times loadings
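
Here is the promised check, a minimal NumPy sketch on the example 9 data (standardized): the \sqrt{N-1} relation between w and \Lambda, and two factorings of the data into scores times loadings:

    import numpy as np

    raw = np.array([[2.09653, -0.793484, -7.33899],
                    [-1.75252, 13.0576, 0.103549],
                    [3.63702, 29.0064, 8.52945],
                    [0.0338101, 46.912, 19.8517],
                    [5.91502, 70.9696, 36.0372]])
    N = raw.shape[0]
    X = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)   # standardized data

    u, sv, vT = np.linalg.svd(X, full_matrices=False)
    v = vT.T
    lam = np.linalg.eigvalsh(np.corrcoef(raw, rowvar=False))[::-1]   # correlation eigenvalues, descending

    # the pesky sqrt(N-1): principal values w vs sqrt(eigenvalues) Lambda
    print(np.allclose(sv, np.sqrt(N - 1) * np.sqrt(lam)))

    # two factorings of the data as scores times loadings
    A  = v @ np.diag(np.sqrt(lam))      # sqrt(eigenvalue)-weighted eigenvector matrix
    FT = np.sqrt(N - 1) * u             # scores
    print(np.allclose(X, FT @ A.T))     # X = F^T A^T, i.e. Z = X^T = A F

    Y = u @ np.diag(sv)                 # new data wrt v
    print(np.allclose(X, Y @ vT))       # X = (u w) v^T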


PCA / FA Example 7: Bartholomew et al.: the scores

edited 17 Oct 2008 to round off the first two columns of 5 u.

This is the 3rd post about the Bartholomew et al. book.

Introduction

It would be convenient if Bartholomew’s model were one we had seen before.

It is.

We got their scores and loadings just by following their instructions. Although they didn’t use matrix notation, their equations amounted to

loadings = the \sqrt{\text{eigenvalue}}-weighted eigenvector matrix, A = V\ \Lambda

“component score coefficients” = reciprocal basis vectors, cs = V\ \Lambda^{-1}

scores = X cs.

where V is an orthogonal eigenvector matrix of the correlation matrix, \Lambda is the diagonal matrix of (nonzero) \sqrt{\text{eigenvalues}}\ , and X is the standardized data.
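
Here is a minimal NumPy sketch of those three lines, run on the example 9 data just to have concrete numbers, with the eigenvectors taken from the SVD so that the sign conventions line up:

    import numpy as np

    raw = np.array([[2.09653, -0.793484, -7.33899],
                    [-1.75252, 13.0576, 0.103549],
                    [3.63702, 29.0064, 8.52945],
                    [0.0338101, 46.912, 19.8517],
                    [5.91502, 70.9696, 36.0372]])
    N = raw.shape[0]
    X = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)    # standardized data

    u, sv, vT = np.linalg.svd(X, full_matrices=False)
    V   = vT.T                           # an orthogonal eigenvector matrix of the correlation matrix
    Lam = np.diag(sv / np.sqrt(N - 1))   # diagonal matrix of the (nonzero) sqrt(eigenvalues)

    A      = V @ Lam                     # loadings
    cs     = V @ np.linalg.inv(Lam)      # component score coefficients = reciprocal basis vectors
    scores = X @ cs                      # scores = X cs

    print(np.allclose(scores, np.sqrt(N - 1) * u))   # the scores are just sqrt(N-1) u again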

Let’s recall Harman’s model.

PCA / FA Example 7: Bartholomew et al. Calculations

The familiar part

This is the second post about Example 7. We confirm their analysis, but we work with the computed correlation matrix rather than their published rounded-off correlation matrix.

I am pretty well standardizing my notation. V is an orthogonal eigenvector matrix from an eigendecomposition; \lambda is the associated eigenvalues, possibly as a list, possibly as a square diagonal matrix. \Lambda is the square roots of \lambda\ , possibly as a list, possibly as a square matrix, and possibly as a matrix with rows of zeroes appended to make it the same shape as w (below).

Ah, X is a data matrix with observations in rows. Its transpose is Z = X^T\ .

The (full) singular value decomposition (SVD) is the product u\ w\ v^T\ , with u and v orthogonal and w generally rectangular, since it must be the same shape as X.
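
For concreteness, a small shape check of the full SVD in NumPy, borrowing the example 9 raw matrix as a 5×3 data matrix:

    import numpy as np

    X = np.array([[2.09653, -0.793484, -7.33899],
                  [-1.75252, 13.0576, 0.103549],
                  [3.63702, 29.0064, 8.52945],
                  [0.0338101, 46.912, 19.8517],
                  [5.91502, 70.9696, 36.0372]])     # observations in rows

    u, sv, vT = np.linalg.svd(X, full_matrices=True)  # the full SVD
    w = np.zeros_like(X)                              # w is rectangular, the same shape as X
    np.fill_diagonal(w, sv)                           # singular values down the diagonal

    print(u.shape, w.shape, vT.shape)                 # (5, 5) (5, 3) (3, 3)
    print(np.allclose(X, u @ w @ vT))                 # X = u w v^T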

surfaces: visualizing the gluing of them

Quite some time ago, a friend asked me what would happen if we tried to construct a torus by gluing all 4 sides of a sheet of paper together, instead of first one pair then the other. Didn’t the math have to specify first one pair then the other?

One reason I’ve been hesitating over this post is that it doesn’t seem to be “real” mathematics – though any number of people might howl that PCA / FA isn’t “real” mathematics either. This is just a small drawing that I cobbled together to show that the homeomorphism between a circle and a line segment with endpoints identified… well, it doesn’t have to correspond to a physical process. (Why don’t I refer to “the glued line”?)

“… algebra provides rigor while geometry provides intuition.”
from the preface to “A Singular Introduction to Commutative Algebra” by Greuel & Pfister

It helped me to go back and read my original comment when I acknowledged Jim’s question to this post. I see that I did not understand that what matters to the formalism, the algebra, is before and after; what matters to the geometry is between or during. We reconcile them by permitting some things in the geometric visualization that we would not permit in the formal algebra, if the algebra even formalized the process: points passing thru points; or even some tearing, provided that when we reglue it we restore it rather than take the opportunity to change it.

That’s what Bloch says, on p. 57, discussing the physical process of getting from the knotted torus to the regular torus.