## PCA / FA example 4: davis. Davis & harman 2.

We learned from harman that if we use the orthogonal eigenvector matrix v to compute new data, the result has redistributed the original variance according to the eigenvalues; and we also learned that if we use the weighted eigenvector matrix instead, we get new variables of unit variance.

(i think i thought this required us to start with the correlation matrix; it does not.)

let’s see this again for davis’ centered data X:

$\left(\begin{array}{lll} -6&3&3\\ 2&1&-3\\ 0.&-1&1\\ 4&-3&-1\end{array}\right)$

We have variances…

$\{\frac{56}{3},\ \frac{20}{3},\ \frac{20}{3}\}$

or…

$\{18.6667,\ 6.66667,\ 6.66667\}$

and the sum of the variances is…

32.
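For anyone following along outside of Mathematica, here is a minimal sketch of that variance check. NumPy is my assumption (it is not what produced the numbers above), and `ddof=1` is the choice that gives the N-1 = 3 divisor.

```python
import numpy as np

# davis' centered data X (4 observations, 3 variables)
X = np.array([[-6.0,  3.0,  3.0],
              [ 2.0,  1.0, -3.0],
              [ 0.0, -1.0,  1.0],
              [ 4.0, -3.0, -1.0]])

col_means = X.mean(axis=0)           # all zero, since X is already centered
variances = X.var(axis=0, ddof=1)    # sample variances, divisor N-1 = 3
total = variances.sum()

print(col_means)   # zeros
print(variances)   # approx [18.6667, 6.6667, 6.6667]
print(total)       # approx 32
```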

Now is a good time to recall $w^T\ w$:

$\left(\begin{array}{lll} 84.&0.&0.\\ 0.&12.&0.\\ 0.&0.&0.\end{array}\right)$

The sum of the diagonal elements is 96, and if we divide by 3 (= N-1, for the covariance matrix instead of for $X^T\ X$), gee, we get 32.

stop for a moment. We have two, admittedly proportional, sets of eigenvalues, one set from the covariance matrix, the other from $X^T\ X$. this gives us two different possible sets of weights to apply to the eigenvector matrix v. (recall that v is an eigenvector matrix both for $X^T\ X$ and for the covariance matrix of X, as well as coming out of the SVD of X.)
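The two proportional sets of eigenvalues can be checked numerically. This is a sketch in NumPy (my choice of tool, not the one used for the post); `eigvalsh` returns eigenvalues in ascending order, so they are reversed here, and the zero eigenvalue will come out as a tiny floating-point number rather than an exact 0.

```python
import numpy as np

X = np.array([[-6.0,  3.0,  3.0],
              [ 2.0,  1.0, -3.0],
              [ 0.0, -1.0,  1.0],
              [ 4.0, -3.0, -1.0]])
N = X.shape[0]

XtX = X.T @ X
cov = XtX / (N - 1)                        # covariance matrix, valid because X is centered

eig_xtx = np.linalg.eigvalsh(XtX)[::-1]    # approx [84, 12, 0]
eig_cov = np.linalg.eigvalsh(cov)[::-1]    # approx [28, 4, 0]

# the squared singular values of X are the eigenvalues of X^T X
sing = np.linalg.svd(X, compute_uv=False)

print(eig_xtx, eig_xtx.sum())              # sum approx 96
print(eig_cov, eig_cov.sum())              # sum approx 32
print(sing**2)                             # approx [84, 12, 0]
```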

Let’s stay with what i know: our new data, using the orthogonal eigenvector matrix v, is $A^Q$.

$\left(\begin{array}{lll} -7.34847&0.&0.\\ 2.44949&-2.82843&0.\\ 0.&1.41421&0.\\ 4.89898&1.41421&0.\end{array}\right)$

The variances are…

$\{28.,\ 4.,\ 0.\}$

and we don’t even need to consciously add those: the sum is 32. We have redistributed the original variance among our two new variables.
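Here is a sketch of that computation, taking v from NumPy's SVD (an assumption on my part; any orthonormal eigenvector matrix of $X^T\ X$ would do). The signs of individual columns of v are arbitrary, so the entries of the new data may differ in sign from the matrix shown above, but the variances cannot.

```python
import numpy as np

X = np.array([[-6.0,  3.0,  3.0],
              [ 2.0,  1.0, -3.0],
              [ 0.0, -1.0,  1.0],
              [ 4.0, -3.0, -1.0]])

# SVD: X = u w v^T; NumPy returns v^T, with singular values in descending order
u, s, vt = np.linalg.svd(X)
v = vt.T

AQ = X @ v                          # new data with respect to the orthogonal eigenvector basis
new_var = AQ.var(axis=0, ddof=1)    # approx [28, 4, 0]
old_var = X.var(axis=0, ddof=1)     # approx [18.6667, 6.6667, 6.6667]

print(np.round(AQ, 5))
print(new_var, new_var.sum())       # total approx 32
print(old_var.sum())                # total approx 32
```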

Now, everyone seems to say that the $\sqrt{\text{eigenvalue}}$-weighted eigenvectors have variances equal to the eigenvalues. What those eigenvectors really have is lengths equal to their $\sqrt{\text{eigenvalue}}$.
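The claim about lengths is easy to check. In this sketch I weight the eigenvectors by the covariance-matrix eigenvalues; that is my choice for illustration, and the same check works with the eigenvalues of $X^T\ X$. The name `weighted` is mine, not a standard one.

```python
import numpy as np

X = np.array([[-6.0,  3.0,  3.0],
              [ 2.0,  1.0, -3.0],
              [ 0.0, -1.0,  1.0],
              [ 4.0, -3.0, -1.0]])

cov = np.cov(X, rowvar=False)
lam, v = np.linalg.eigh(cov)       # ascending eigenvalues, orthonormal eigenvectors in columns
lam = np.clip(lam, 0.0, None)      # guard against a tiny negative value in place of the zero

weighted = v * np.sqrt(lam)        # scale column j of v by sqrt(lam[j])

unit_lengths = np.linalg.norm(v, axis=0)         # all 1
lengths = np.linalg.norm(weighted, axis=0)       # sqrt of each eigenvalue

print(unit_lengths)
print(lengths**2)                  # approx [0, 4, 28], the eigenvalues themselves
```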

here indeed be dragons.

as linear algebra is my touchstone in general for all of this, that redistributed variance using the orthogonal eigenvector matrix is my vorpal blade (dragon, jabberwock, no big deal – snicker-snack either way). any different choice of basis must lead to different results.

we looked at that redistribution of variance in harman. We learned then that the correct way to get new data with redistributed variance is to use the orthogonal eigenvector matrix. We just reconfirmed it for davis’ example.

Instead of actually kicking one of the dragons awake, let’s try counting its scales while it sleeps. If the R-mode scores $S^R$ are some form of data (and they are), what are their variances?

my $S^R$ is

$\left(\begin{array}{llll} -67.3498&0.&0.&0.\\ 22.4499&-9.79796&0.&0.\\ 0.&4.89898&0.&0.\\ 44.8999&4.89898&0.&0.\end{array}\right)$

the variances of the first two columns are

$\{2352.,\ 48.\}$
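A sketch that reproduces those two numbers. The construction $S^R = X\ v\ w^T$, with w the 4x3 matrix of singular values, is my inference from the entries shown above (each column of the new data X v scaled by its singular value); it matches them up to column sign, but treat it as an assumption rather than a definition.

```python
import numpy as np

X = np.array([[-6.0,  3.0,  3.0],
              [ 2.0,  1.0, -3.0],
              [ 0.0, -1.0,  1.0],
              [ 4.0, -3.0, -1.0]])

u, s, vt = np.linalg.svd(X)        # full SVD: u is 4x4, vt is 3x3
v = vt.T

w = np.zeros((4, 3))               # same shape as X, singular values on the diagonal
w[:3, :3] = np.diag(s)

SR = X @ v @ w.T                   # assumed form of the R-mode scores, 4x4
sr_var = SR.var(axis=0, ddof=1)

print(np.round(SR, 4))
print(sr_var[:2])                  # approx [2352, 48]
# each is (eigenvalue of X^T X) * (eigenvalue of the covariance matrix): 84*28 and 12*4
```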

yikes! if nothing else, that should convince you that the R-mode scores are weird.

Life is so much simpler if we use the orthogonal eigenvector matrix. The new data is trivial to compute (X v), and we can see that it has redistributed the variance of the original data.

Trying to use any weighted eigenvector matrix is dicey if there is a zero eigenvalue (because our transition matrix is not invertible); and confusing if we used $X^T\ X$ instead of the covariance matrix (because they have different eigenvalues).

i will eventually calculate the data wrt both of the weighted eigenvector matrices. The dragons will be awake and aloft and pretty to look at.