PCA / FA example 4: Davis. Davis & Harman 3.

Hey, since we have the reciprocal basis, we can project onto it to get the components wrt the A^R basis. After all that work to show that the S^R are the components wrt the reciprocal basis, we really ought to find the components wrt the A^R basis as well. And now we know that's just a projection onto the reciprocal basis: we want to compute X B.

Recall X:

X = \left(\begin{array}{lll} -6 & 3 & 3 \\ 2 & 1 & -3 \\ 0 & -1 & 1 \\ 4 & -3 & -1\end{array}\right)

Recall B:

B = \left(\begin{array}{ll} 0.0890871 & 0. \\ -0.0445435 & -0.204124 \\ -0.0445435 & 0.204124\end{array}\right)

The product is:

X\ B = \left(\begin{array}{ll} -0.801784 & 0 \\ 0.267261 & -0.816497 \\ 0 & 0.408248 \\ 0.534522 & 0.408248\end{array}\right)

What are the column variances?

\{0.333333,0.333333\}

Does that surprise you? The 1/3 comes from our using eigenvalues of X^T\ X instead of eigenvalues of the covariance matrix. The new variables have a common variance, but it isn’t 1. This corresponds to our initial work in Harman, when we discovered that using the weighted eigenvector matrix gave us new variables with variance 1.
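If you’d like to check that numerically, here is a minimal NumPy sketch of this step (the variable names are mine; np.var with ddof=1 is the sample variance, i.e. dividing by N-1):

```python
import numpy as np

# the data X and the reciprocal basis B, as given above
X = np.array([[-6.,  3.,  3.],
              [ 2.,  1., -3.],
              [ 0., -1.,  1.],
              [ 4., -3., -1.]])
B = np.array([[ 0.0890871,  0.      ],
              [-0.0445435, -0.204124],
              [-0.0445435,  0.204124]])

XB = X @ B                          # components of X wrt the A^R basis
print(XB)
print(np.var(XB, axis=0, ddof=1))   # column variances -> approx [0.3333, 0.3333]
```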

At first glance, it seems not worthwhile to work that out using the covariance matrix, but I suppose it might be worthwhile to come at the covariance matrix from the SVD. (Oh, yes! It is!)

Let’s do it.

The covariance matrix of X is \frac{X^T\ X}{N-1}

We should just replace X by X2 = \frac{X}{\sqrt{N-1}} and find the SVD of X2. But there’s a catch.

Do it anyway. Here’s our X2:

X2 = \left(\begin{array}{lll} -3.4641 & 1.73205 & 1.73205 \\ 1.1547 & 0.57735 & -1.73205 \\ 0. & -0.57735 & 0.57735 \\ 2.3094 & -1.73205 & -0.57735\end{array}\right)

Compute the variances…

\{6.22222,2.22222,2.22222\}

and their sum… 10.6667.

Hang on. The data X2 has 1/3 the variance of the original data. We’re trying to find new forms of X (not of X2) with variance 1. But let’s just keep going.
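For what it’s worth, here is that scaling as a NumPy sketch (again, the names are mine); since X is already centered, X2^T\ X2 is exactly the covariance matrix of X:

```python
import numpy as np

X = np.array([[-6.,  3.,  3.],
              [ 2.,  1., -3.],
              [ 0., -1.,  1.],
              [ 4., -3., -1.]])
N = X.shape[0]                       # N = 4 observations

X2 = X / np.sqrt(N - 1)              # X is centered, so X2.T @ X2 = cov(X)
var2 = np.var(X2, axis=0, ddof=1)    # column variances -> [6.2222, 2.2222, 2.2222]
print(var2, var2.sum())              # sum -> 10.6667
```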

Get the SVD of X2:

Here is the new u…

u2 = \left(\begin{array}{llll} -0.801784 & 0 & -0.571429 & 0.174964 \\ 0.267261 & -0.816497 & -0.449762 & -0.24417   \\ 0. & 0.408248 & -0.267261 & -0.872872 \\ 0.534522 & 0.408248 & -0.632262 & 0.384531\end{array}\right)

new v…

v2 = \left(\begin{array}{lll} 0.816497 & 0 & 0.57735 \\ -0.408248 & -0.707107 & 0.57735 \\ -0.408248 & 0.707107 & 0.57735\end{array}\right)

and new w…

w2 = \left(\begin{array}{lll} 5.2915 & 0. & 0. \\ 0. & 2. & 0. \\ 0. & 0. & 0. \\ 0. & 0. & 0.\end{array}\right)
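If you want to reproduce that decomposition in NumPy rather than by hand, here is a sketch. Two caveats: np.linalg.svd returns v already transposed, and the columns of u and v are only determined up to sign, so your u2 and v2 may differ from the ones above by a sign or two.

```python
import numpy as np

X = np.array([[-6.,  3.,  3.],
              [ 2.,  1., -3.],
              [ 0., -1.,  1.],
              [ 4., -3., -1.]])
X2 = X / np.sqrt(X.shape[0] - 1)

# full_matrices=True gives the full 4x4 u2; vt2 is v2 transposed
u2, s2, vt2 = np.linalg.svd(X2, full_matrices=True)
v2 = vt2.T                           # singular values s2 = approx (5.2915, 2, 0)

# rebuild the rectangular 4x3 w2 from the singular values
w2 = np.zeros_like(X2)
w2[:3, :3] = np.diag(s2)
print(u2, v2, w2, sep="\n\n")
```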

Then I compute new A’s and S’s from

A^R = v\ w^T.

A^Q = u\ w.

S^R = X\ A^R.

S^Q = X^T\ A^Q.

(using X2, u2, v2, and w2, of course)
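Continuing the sketch above (with the same caveat about signs, which doesn’t affect the variances), those four matrices and their variance checks look like this:

```python
import numpy as np

X = np.array([[-6.,  3.,  3.],
              [ 2.,  1., -3.],
              [ 0., -1.,  1.],
              [ 4., -3., -1.]])
X2 = X / np.sqrt(X.shape[0] - 1)

u2, s2, vt2 = np.linalg.svd(X2, full_matrices=True)
v2 = vt2.T
w2 = np.zeros_like(X2)
w2[:3, :3] = np.diag(s2)

AR = v2 @ w2.T                       # A^R = v w^T   (3x4)
AQ = u2 @ w2                         # A^Q = u w     (4x3)
SR = X2 @ AR                         # S^R = X A^R   (4x4)
SQ = X2.T @ AQ                       # S^Q = X^T A^Q (3x3)

print(np.var(AQ, axis=0, ddof=1))    # -> approx [9.3333, 1.3333, 0]
print(np.var(SR, axis=0, ddof=1))    # -> approx [261.333, 5.3333, 0, 0]
```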

A^Q are the components of the data X2 wrt the orthogonal eigenvector matrix v2.

A^Q = \left(\begin{array}{lll} -4.24264 & 0 & 0 \\ 1.41421 & -1.63299 & 0 \\ 0. & 0.816497 & 0 \\ 2.82843 & 0.816497 & 0\end{array}\right)

Compute the variances…

\{9.33333,1.33333,0.\}

and their sum is 10.6667. Yes, the new orthogonal eigenvector matrix v2 has redistributed the total variance of X2 among two new variables.

What about the R-mode scores?

S^R = \left(\begin{array}{llll} -22.4499 & 0 & 0 & 0 \\ 7.48331 & -3.26599 & 0 & 0 \\ 0. & 1.63299 & 0 & 0 \\ 14.9666 & 1.63299 & 0 & 0\end{array}\right)

Compute the variances:

\{261.333,5.33333,0.,0.\}

I say again, yikes!

For the record, here are the new

A^R = \left(\begin{array}{llll} 4.32049 & 0 & 0 & 0 \\ -2.16025 & -1.41421 & 0 & 0 \\ -2.16025 & 1.41421 & 0 & 0\end{array}\right)

and

S^Q = \left(\begin{array}{lll} 22.8619 & 0 & 0 \\ -11.431 & -2.82843 & 0 \\ -11.431 & 2.82843 & 0\end{array}\right)

From the orthonormal basis v2, I construct a reciprocal basis for the \sqrt{\text{eigenvalue}}-weighted matrix A^R. I’ll just use the diagonal of w2 (instead of the \sqrt{\text{eigenvalues}}, since I haven’t computed them) to scale two of the vectors. Recall the new w matrix:

w2 = \left(\begin{array}{lll} 5.2915 & 0. & 0. \\ 0. & 2. & 0. \\ 0. & 0. & 0. \\ 0. & 0. & 0.\end{array}\right)

I construct a diagonal matrix using 1/w when possible, 1 otherwise.

\left(\begin{array}{lll} 0.188982 & 0. & 0. \\ 0. & 0.5 & 0. \\ 0. & 0. & 1.\end{array}\right)

Then I premultiply that by the new v2, and keep only the first two columns; the new reciprocal basis is:

B2 = \left(\begin{array}{ll} 0.154303 & 0 \\ -0.0771517 & -0.353553 \\ -0.0771517 & 0.353553\end{array}\right)
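Here is that construction as a NumPy sketch; to keep the signs consistent with the matrices displayed above, I just type in v2 and the diagonal of w2 rather than recompute them:

```python
import numpy as np

# v2 and the diagonal of w2, as displayed above
v2 = np.array([[ 0.816497,  0.      , 0.57735],
               [-0.408248, -0.707107, 0.57735],
               [-0.408248,  0.707107, 0.57735]])
w_diag = [5.2915, 2., 0.]

# diagonal matrix: 1/w when possible, 1 otherwise
D = np.diag([1.0 / w if w != 0 else 1.0 for w in w_diag])

B2 = (v2 @ D)[:, :2]                 # keep only the first two columns
print(B2)
```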

Having the reciprocal basis, we project the data onto it to get the new data wrt the A^R basis. But we project the original data X. That’s the catch: as I said, it’s the variance of X we’re trying to scale to 1, not the variance of X2. We compute X B2:

X\ B2 = \left(\begin{array}{ll} -1.38873 & 0 \\ 0.46291 & -1.41421 \\ 0 & 0.707107 \\ 0.92582 & 0.707107\end{array}\right)

And the variances?

\{1.,1.\}.

Indeed. This corresponds even more closely to what we first saw in Harman. (If instead we project X2 onto B2, we get a common variance of 1/3 again.)
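One last sketch, to verify both claims: project the original X onto B2 and get variances of 1, then project X2 onto B2 and get 1/3 again.

```python
import numpy as np

X = np.array([[-6.,  3.,  3.],
              [ 2.,  1., -3.],
              [ 0., -1.,  1.],
              [ 4., -3., -1.]])
B2 = np.array([[ 0.154303 ,  0.      ],
               [-0.0771517, -0.353553],
               [-0.0771517,  0.353553]])

print(np.var(X @ B2, axis=0, ddof=1))    # -> approx [1., 1.]

X2 = X / np.sqrt(X.shape[0] - 1)
print(np.var(X2 @ B2, axis=0, ddof=1))   # -> approx [0.3333, 0.3333]
```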
