Rip | Rip's Applied Mathematics Blog

PCA / FA. A Map to my posts

April 1, 2009 — rip

The Posts

The purpose of this post is to provide guidance to a reader who has just discovered that I have a large pile of posts about principal components / factor analysis. This pile of posts might seem a very jungle, without any map.

Here, have a map.

As I finalize this post, it will be number 52 in PCA / FA. Here’s a list of the 52 posts, including the dates spanned by any group, and the number of posts in that group. (When the picture was taken, I didn’t know when this would be published. In fact, post 51 was scheduled but not yet published. Even more, post 51 did not even exist when the first picture was created.)

[Image: posts-table]
transition/attitude matrices is a post that is sometimes relevant when we discuss “new data” in PCA, but it is not in the PCA / FA category.

“tricky prepro” is short for “tricky preprocessing”, and discusses the combination of constant row sums and covariance or correlation matrix.

Posted in math PCA. Tags: mathematics, PCA FA principal components factor analysis, Rip.

PCA/FA Answers to some Basilevsky questions

March 11, 2009 — rip

Let us look at three of the questions I asked early in February, and answer two of them.

First, what do we know? What have we done?

We assume that we have data X, with the variables in columns, as usual. In fact, we assume that the data is at least centered, and possibly standardized.

We compute the covariance matrix

$c = \frac{X^T X}{N-1}$,

then its eigendecomposition

$c = v\ \Lambda^2\ v^T$,

where $\Lambda^2$ is the diagonal matrix of eigenvalues. We define the $\sqrt{\text{eigenvalue}}$-weighted matrix

$A = v\ \Lambda$.

Finally, we use A as a transition matrix to define new data Z:

$X^T = A\ Z^T$.

We discovered two things. One, the matrix A is the cross covariance between Z and X:

$A = \frac{X^T Z}{N-1}$.

I find this interesting, and I suspect that it would jump off the page at me out of either Harman or Jolliffe; that is, I suspect it is written there but it didn’t register.
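The first finding can be checked numerically. Here is a minimal NumPy sketch; the data are synthetic and the names (`Lam`, `cross_cov`) are mine, not from the original post. It assumes that no eigenvalue is zero, so that A is invertible and Z can be solved for.

```python
import numpy as np

rng = np.random.default_rng(0)           # synthetic data, purely illustrative
X = rng.normal(size=(20, 3))
X = X - X.mean(axis=0)                   # center the columns
N = X.shape[0]

c = X.T @ X / (N - 1)                    # covariance matrix
eigvals, v = np.linalg.eigh(c)           # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]        # reorder to descending
eigvals, v = eigvals[order], v[:, order]

Lam = np.diag(np.sqrt(eigvals))          # Lambda, so that c = v Lambda^2 v^T
A = v @ Lam                              # sqrt(eigenvalue)-weighted eigenvectors

Z = np.linalg.solve(A, X.T).T            # new data Z defined by X^T = A Z^T

cross_cov = X.T @ Z / (N - 1)            # cross covariance between X and Z
print(np.allclose(cross_cov, A))
```

The algebra behind the check is short: Z = X v Λ⁻¹, so XᵀZ/(N-1) = c v Λ⁻¹ = v Λ² Λ⁻¹ = v Λ = A.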

Two, we discovered that we could find a matrix Ar which is the cross covariance between Zc and Xs.

Posted in math PCA. Tags: Basilevsky, mathematics, PCA FA principal components factor analysis, Rip.

PCA / FA Basilevsky: with data

January 21, 2009 — rip

Introduction

I am looking into Basilevsky because he did something I didn’t understand: he normalized the rows of the Ac matrix (I denoted the result Ar). We discussed that, and we illustrated the computations, in the previous two posts. But we did those computations without having any data. I want to take a closer look, with data.

In contrast to As and Ac, which are eigenvector matrices, his Ar matrix is not. Nevertheless, as I said, his Ar is not without some redeeming value. In fact, all three of the A matrices have the same redeeming value.

I will show, first by direct computation and then by proof, that each of these A matrices is the cross covariance between X data and Z data.
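A direct computation of this kind might look like the following NumPy sketch. Several things here are assumptions on my part: the data are synthetic, "normalized the rows" is read as scaling each row of Ac to unit Euclidean length, Ar is paired with standardized X and the Z built from centered X, and the helper names `weighted_eigvecs` and `cross_cov` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic data with unequal column variances (illustrative only).
raw = rng.normal(size=(30, 3)) * np.array([1.0, 5.0, 20.0])

Xc = raw - raw.mean(axis=0)              # centered data
Xs = Xc / raw.std(axis=0, ddof=1)        # standardized data


def weighted_eigvecs(X):
    """Return A = v Lambda for the matrix X^T X / (N-1)."""
    lam, v = np.linalg.eigh(X.T @ X / (len(X) - 1))
    order = np.argsort(lam)[::-1]        # descending eigenvalues
    return v[:, order] * np.sqrt(lam[order])


def cross_cov(X, Z):
    """Cross covariance between the columns of X and the columns of Z."""
    return X.T @ Z / (len(X) - 1)


Ac = weighted_eigvecs(Xc)                # from the covariance matrix
As = weighted_eigvecs(Xs)                # from the correlation matrix
Zc = np.linalg.solve(Ac, Xc.T).T         # Xc^T = Ac Zc^T
Zs = np.linalg.solve(As, Xs.T).T         # Xs^T = As Zs^T

# Assumption: row normalization means unit Euclidean length per row.
Ar = Ac / np.linalg.norm(Ac, axis=1, keepdims=True)

print(np.allclose(Ac, cross_cov(Xc, Zc)))
print(np.allclose(As, cross_cov(Xs, Zs)))
print(np.allclose(Ar, cross_cov(Xs, Zc)))
```

Under that reading, the squared length of row i of Ac is the variance of column i of Xc, so normalizing the rows amounts to dividing each variable by its standard deviation.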

Posted in math PCA. Tags: Basilevsky, mathematics, PCA FA principal components factor analysis, Rip.

PCA / FA example 9: centered and raw, 3 models

October 31, 2008 — rip

What follows is simple computation, solely to show us exactly what happens. It continues the work of the previous post, which did my default calculations for standardized data. Here I do the same calculations for centered and raw data.

Centered

The raw data is still

$\text{raw} = \left(\begin{array}{lll} 2.09653 & -0.793484 & -7.33899 \\ -1.75252 & 13.0576 & 0.103549 \\ 3.63702 & 29.0064 & 8.52945 \\ 0.0338101 & 46.912 & 19.8517 \\ 5.91502 & 70.9696 & 36.0372\end{array}\right)$

I will center the data and call it Xc. Get the column means…

$\{1.98597,\ 31.8304,\ 11.4366\}$

and subtract them from each column to get Xc:
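For anyone who wants to reproduce the centering step, here is a small NumPy sketch using the raw matrix printed above. The variable names are mine.

```python
import numpy as np

# Raw data of example 9, copied from the matrix shown in the post.
raw = np.array([
    [ 2.09653,   -0.793484, -7.33899 ],
    [-1.75252,   13.0576,    0.103549],
    [ 3.63702,   29.0064,    8.52945 ],
    [ 0.0338101, 46.912,    19.8517  ],
    [ 5.91502,   70.9696,   36.0372  ],
])

means = raw.mean(axis=0)                 # column means
Xc = raw - means                         # broadcasting subtracts each column's mean

print(np.round(means, 4))
print(np.round(Xc, 4))
```

Each column of `Xc` then has mean zero, up to rounding.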

Posted in math PCA. Tags: mathematics, Rip.

PCA / FA example 9: standardized data, 3 models

October 27, 2008 — rip

Introduction

edited 16 Jan 2009: I found a place where I called F^T the loadings instead of the scores. That’s all.

I want to run thru what is admittedly a toy case, but this seems to be where I stand on the computation of PCA / FA.

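Recall the raw data of example 9:

$\text{raw} = \left(\begin{array}{lll} 2.09653 & -0.793484 & -7.33899 \\ -1.75252 & 13.0576 & 0.103549 \\ 3.63702 & 29.0064 & 8.52945 \\ 0.0338101 & 46.912 & 19.8517 \\ 5.91502 & 70.9696 & 36.0372\end{array}\right)$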

Get the mean and variance of each column. The means are

$\{1.98597,\ 31.8304,\ 11.4366\}$

and the variances are

$\{8.99072,\ 796.011,\ 291.354\}$

We see that the raw data will differ from the centered data, and that will differ from the standardized data. Let’s do the standardized data first, because that’s what we’ve been doing most recently.

Here’s what I’m going to do. For a data matrix X:

get the SVD, $X = u\ w\ v^T$
get the eigenvalues $\lambda$ of $X^T X/(N-1)$ (in 2 cases, that’s the correlation matrix or the covariance matrix)
form the diagonal matrix $\Lambda$ of $\sqrt{\lambda}$
form the weighted eigenvector matrix $A = v\ \Lambda$
form the ~~loadings~~ scores $F^T= \sqrt{N-1}\ u$
form the new data Y wrt v, Y = u w
form Davis’ loadings $A^R = v\ w^T$
form Davis’ scores $S^R = X\ A^R$.
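The steps in this list can be sketched in NumPy as follows, for the standardized case. This is my own transcription, not the code behind the post: the raw matrix is the example 9 data as printed in the centered-and-raw post, the standardization uses the N-1 variance, the economy-size SVD is used so that w is square, and the variable names (`FT`, `AR`, `SR`) are chosen to echo the symbols.

```python
import numpy as np

# Raw data of example 9.
raw = np.array([
    [ 2.09653,   -0.793484, -7.33899 ],
    [-1.75252,   13.0576,    0.103549],
    [ 3.63702,   29.0064,    8.52945 ],
    [ 0.0338101, 46.912,    19.8517  ],
    [ 5.91502,   70.9696,   36.0372  ],
])
N = raw.shape[0]
X = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)    # standardized data

u, s, vT = np.linalg.svd(X, full_matrices=False)          # X = u w v^T
w = np.diag(s)                                            # singular values on the diagonal
v = vT.T

lam = np.sort(np.linalg.eigvalsh(X.T @ X / (N - 1)))[::-1]  # eigenvalues, descending
Lam = np.diag(np.sqrt(lam))                               # Lambda
A = v @ Lam                                               # weighted eigenvector matrix
FT = np.sqrt(N - 1) * u                                   # scores F^T
Y = u @ w                                                 # new data with respect to v
AR = v @ w.T                                              # Davis' loadings
SR = X @ AR                                               # Davis' scores

print(np.allclose(lam, s**2 / (N - 1)))                   # eigenvalues vs singular values
print(np.allclose(X.T, A @ FT.T))                         # the model Z = A F with Z = X^T
print(np.allclose(Y, X @ v))                              # Y is X projected onto v
```

The three checks tie the pieces together: the eigenvalues are the squared singular values divided by N-1, so Λ√(N-1) = w and A F reproduces Xᵀ.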


Posted in math PCA. Tags: mathematics, Rip.

PCA / FA Example 9: scores & loadings

October 22, 2008 — rip

I want to look at reconstituting the data. Equivalently, I want to look at setting successive singular values to zero.

This example was actually built on the previous one. Before I set the row sums to 1, I had started with

$t1 = \left(\begin{array}{lll} 1 & 1 & -3 \\ -1 & 2 & -2 \\ 1 & 3 & -1 \\ -1 & 4 & 1 \\ 1 & 5 & 4\end{array}\right)$

I’m going to continue with Harman’s & Bartholomew’s model: Z = A F, Z = X^T, X is standardized, A is an eigenvector matrix weighted by the square roots of the eigenvalues of the correlation matrix of X.
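To make "setting successive singular values to zero" concrete, here is a minimal NumPy sketch. It uses t1 only as stand-in data (the actual example 9 data were derived from it), and the helper `reconstitute` is a hypothetical name of mine.

```python
import numpy as np

# t1 from the post, used here purely as stand-in data.
t1 = np.array([
    [ 1, 1, -3],
    [-1, 2, -2],
    [ 1, 3, -1],
    [-1, 4,  1],
    [ 1, 5,  4],
], dtype=float)
X = (t1 - t1.mean(axis=0)) / t1.std(axis=0, ddof=1)       # standardized data

u, s, vT = np.linalg.svd(X, full_matrices=False)          # X = u w v^T


def reconstitute(u, s, vT, k):
    """Rebuild the data keeping the k largest singular values, zeroing the rest."""
    s_k = s.copy()
    s_k[k:] = 0.0
    return (u * s_k) @ vT                                 # same as u @ diag(s_k) @ vT


for k in (3, 2, 1):
    err = np.linalg.norm(X - reconstitute(u, s, vT, k))   # Frobenius norm of what was lost
    print(k, err)
```

In terms of Z = A F, zeroing a singular value is the same as dropping the matching column of A and row of F. The Frobenius error of the reconstituted data is the root sum of squares of the discarded singular values.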

I want data with one eigenvalue so large that we could sensibly retain only that one. Let me show you how I got that.

Posted in math PCA. Tags: mathematics, PCA FA principal components factor analysis, Rip. 2 Comments »

PCA / FA Example 8: the pseudo-inverse

October 19, 2008 — rip

introduction

Recall Harman’s or Bartholomew’s model

Z = A F

with $Z = X^T$, X standardized, and A a $\sqrt{\text{eigenvalue}}$-weighted eigenvector matrix, with eigenvalues from the correlation matrix.

We saw how to compute the scores $F^T$ in the case that A was invertible (here). If, however, any eigenvalues are zero then A will have that many columns of zeroes and will not be invertible.

What to do?

One possibility – shown in at least one of the references, and, quite honestly, one of the first things I considered – is to use a particular example of a pseudo-inverse. I must tell you up front that this is not what I would recommend, but since you will see it out there, you should see why I don’t recommend it.

(Answer: it works, it gets the same answer, but computing the pseudo-inverse explicitly is unnecessary. In fact, it’s unnecessary even if we don’t have the Singular Value Decomposition (SVD) available to us.)
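Here is a hedged NumPy sketch of the comparison. The data are synthetic, with a third column built as the sum of the first two so that one eigenvalue is zero. I am assuming that the alternative to the pseudo-inverse is to read the scores off the SVD as F^T = √(N-1) u; the cutoff `1e-10` for "zero" is also my choice.

```python
import numpy as np

rng = np.random.default_rng(2)
a = rng.normal(size=(6, 2))
# Third column is the sum of the first two, so the data have rank 2.
raw = np.column_stack([a[:, 0], a[:, 1], a[:, 0] + a[:, 1]])
N = raw.shape[0]
X = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)    # standardized data

u, s, vT = np.linalg.svd(X, full_matrices=False)
k = int(np.sum(s > 1e-10 * s[0]))         # number of nonzero singular values
Lam = s / np.sqrt(N - 1)                  # square roots of the eigenvalues
Lam[k:] = 0.0                             # treat rounding-level values as exact zeros
A = vT.T * Lam                            # A = v Lambda; its last column is zero

# Route 1: explicit pseudo-inverse of A.
F_pinv = np.linalg.pinv(A, rcond=1e-10) @ X.T

# Route 2: no pseudo-inverse, just the retained columns of u.
F_svd = np.sqrt(N - 1) * u[:, :k].T

print(k)
print(np.allclose(F_pinv[:k], F_svd))     # same scores for the retained factors
print(np.allclose(A @ F_pinv, X.T))       # the model Z = A F still reproduces the data
```

The row of `F_pinv` that belongs to the zero eigenvalue comes out as zeros, which is harmless because the matching column of A is zero as well.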

Posted in math PCA. Tags: mathematics, PCA FA principal components factor analysis, pseudo-inverse, Rip.

PCA / FA tricky preprocessing

August 16, 2008 — rip

Introduction

I have stumbled across a tricky point in the preprocessing of data. The most relevant post is probably this of April 7. Rather than lecture, let me ask and answer some questions. The fundamental question is:
Can I inadvertently reduce the rank (the dimensionality) of the data matrix?
The answer is yes.
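One way this can happen, sketched in NumPy: data whose rows all have the same sum lose a dimension when the columns are centered. I am assuming this is the constant-row-sum situation the map post mentions for this entry; the data are synthetic, and the tolerance passed to `matrix_rank` is my choice.

```python
import numpy as np

rng = np.random.default_rng(3)
# Synthetic data: 6 observations of 3 variables, every row sums to 1.
raw = rng.dirichlet(alpha=[1.0, 1.0, 1.0], size=6)

Xc = raw - raw.mean(axis=0)              # center the columns

print(np.allclose(raw.sum(axis=1), 1.0)) # constant row sums in the raw data
print(np.allclose(Xc.sum(axis=1), 0.0))  # every centered row now sums to zero
print(np.linalg.matrix_rank(raw, tol=1e-10))
print(np.linalg.matrix_rank(Xc, tol=1e-10))
```

Because each centered row sums to zero, the columns of `Xc` are linearly dependent, so the covariance matrix (and likewise the correlation matrix) has a zero eigenvalue even though the raw matrix had full column rank.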

Posted in math PCA. Tags: mathematics, PCA FA principal components factor analysis, preprocessing, Rip. 2 Comments »
