more books

The following books have been added to the bibliography page. They were mentioned in today’s “Happenings”. One is on control theory, two are upper-division physics texts, and one is a popular book about physics.

Carstens, James R.; Automatic Control Systems and Components.
Prentice Hall, 1990. ISBN 0 13 054297 0
[classical controls; 25 feb 2008]
This book caught my eye when I saw that he had transfer functions for specific devices used in control systems; it won my heart when he distinguished between the parameters in his math models and the parameters to be found in catalogs!
This is an introductory and hands-on book. 

Zwiebach, Barton; A First Course in String Theory.
Cambridge University Press, 2004. ISBN 0 521 83143 1.
[string theory; 25 feb 2008]
This is the text for an upper-division course at M.I.T.

Lederman, Leon, with Teresi, Dick; The God Particle.
Bantam Doubleday Dell, 1993. ISBN 0 385 31211 3.
[popular physics, particle physics; 25 feb 2008]
This is a popular book, and I had forgotten just how much fun it was to read. Even if you’ve seen The Standard Model of particles, you may enjoy this book; and if you don’t know the standard model, this is a fine place to start. (The title refers to the Higgs boson; it was that or the god-damned particle, he said.)

Griffiths, David; Introduction to Elementary Particles.
Wiley-VCH, 1987. ISBN 0 471 60386 3.
[elementary particle physics; 25 feb 2008]
An upper-division text. I really like his style, as well as his apparent precision. For an example of style: “In general, when you hear a physicist invoke the uncertainty principle, keep a hand on your wallet.”


What’s new with me?
I’ve been doing special relativity, controls in Carstens, and I just started geometric topology in Bloch. In addition, I’m still playing with normal operators in linear algebra.
Oh, and at bedtime I’m rereading Leon Lederman’s “The God Particle.” I needed to look up a quotation from it, and decided it was worth rereading for the fun of it.
Geometric topology?

PCA / FA example 4: Davis. R-mode FA, eigenvectors

so much for the eigenvalues. now for the eigenvectors.
it is usual in Mathematica that for exact (integer or rational) input, the eigenvector matrix returned is not orthogonal; for real (numeric) input, by contrast, it is. we found that the eigenvector matrix was (but let’s call it something else now):
B = \left(\begin{array}{ccc} -2&0.&1\\ 1&-1&1\\ 1&1&1\end{array}\right)
by computing B^T B, we see that the dot products among all the vectors are:
\left(\begin{array}{ccc} 6&0.&0.\\ 0.&2&0.\\ 0.&0.&3\end{array}\right)
which, as i expected, says the eigenvectors are mutually orthogonal (they have to be, because the matrix is symmetric and the eigenvalues are distinct) but not orthonormal. (In fact, their squared lengths are 6, 2, 3, resp.) so we scale each column of B by its length, getting…
\left(\begin{array}{ccc} -\sqrt{\frac{2}{3}}&0&\frac{1}{\sqrt{3}}\\ \frac{1}{\sqrt{6}}&-\frac{1}{\sqrt{2}}&\frac{1}{\sqrt{3}}\\ \frac{1}{\sqrt{6}}&\frac{1}{\sqrt{2}}&\frac{1}{\sqrt{3}}\end{array}\right)
\left(\begin{array}{ccc} -0.816497&0.&0.57735\\ 0.408248&-0.707107&0.57735\\ 0.408248&0.707107&0.57735\end{array}\right)
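the scaling step can be sketched in numpy (a sketch, not the author’s Mathematica session; variable names are mine):

```python
import numpy as np

# eigenvector matrix as returned: columns are eigenvectors, not unit length
B = np.array([[-2.0,  0.0, 1.0],
              [ 1.0, -1.0, 1.0],
              [ 1.0,  1.0, 1.0]])

# B^T B holds all the dot products among the columns: off-diagonal zeros
# confirm orthogonality, and the diagonal holds the squared lengths 6, 2, 3
gram = B.T @ B
print(gram)

# divide each column by its length to get an orthonormal matrix
V = B / np.sqrt(np.diag(gram))
print(V)

# check: V is now orthogonal, V^T V = I
print(np.allclose(V.T @ V, np.eye(3)))   # True
```

the division works because a length-3 row of column norms broadcasts across the columns of B.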


What’s been happening? Among other things, I have been so busy putting math out here that I’ve completely forgotten to talk about the doing of it.
Of course, doing mathematics is fundamentally different from putting mathematical posts out here. Last Monday I was all set to prepare a set of posts for the coming week, only I got distracted by doing mathematics.
Can you imagine that? Well, math is my fundamental goal after all. Even the blog is secondary.
But the blog is very good for me, and I want to get stuff out here. Unfortunately, it’s a nontrivial task. I need to stop doing mathematics for a while in order to publish stuff. And because it’s a different activity, I need to develop different habits.
For the longest time, I’ve really done two or three different things when I “do math”. The easiest is to curl up in a reading chair with a book and a recorder, and browse and take notes. I do this with new books as they arrive, or with any book I’m about to do some work in. In fact, I try to have the recorder handy whenever I read anything. The point is to record whatever thoughts I have or whatever I see of interest. It’s a comfortable thing to do. (Yes, I usually read first and compute later, though I’ve been known to run to the computer to verify something that just couldn’t wait.)
This doesn’t commit me to doing any real work in a book; I can always come back to the notes later.


PCA / FA example 4: Davis. R-mode FA, eigenvalues

from davis’ “statistics and data analysis in geology”, we take the following extremely simple example (p. 502). his data matrix is
D = \left(\begin{array}{ccc} 4&27&18\\ 12&25&12\\ 10&23&16\\ 14&21&14\end{array}\right)
where each column is a variable. he now centers the data, by subtracting each column mean from the values in the column.
let me do that. i compute the column means…
{10,\ 24,\ 15}
and subtract each one from the appropriate column, getting
X = \left(\begin{array}{ccc} -6&3&3\\ 2&1&-3\\ 0.&-1&1\\ 4&-3&-1\end{array}\right)
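the centering step is a one-liner in numpy (a sketch, not davis’ own computation):

```python
import numpy as np

# Davis' data matrix: each column is a variable
D = np.array([[ 4, 27, 18],
              [12, 25, 12],
              [10, 23, 16],
              [14, 21, 14]], dtype=float)

means = D.mean(axis=0)   # column means: [10, 24, 15]
X = D - means            # centered data: each column now sums to zero

print(means)
print(X)
```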

PCA / FA example 3: Jolliffe. analyzing the covariance matrix

we have seen what jolliffe did with a correlation matrix. now jolliffe presents the eigenstructure of the covariance matrix of his data, rather than of the correlation matrix. in order for us to confirm his work, he must give us some additional information: the standard deviations of each variable. (recall that he did not give us the data.)
we have to figure out how to recover the covariance matrix c from the correlation matrix r, given the standard deviation s_i of each variable.
it’s easy: multiply the (i,j) entry in the correlation matrix r by both s_i and s_j
c_{i j} = r_{i j} \ s_i \ s_j
the diagonal entries r_{i i}, which are 1, become variances c_{i i} = s_i^2, and each off-diagonal correlation r_{i j} becomes a covariance. maybe it would have been more recognizable if i’d written
r_{i j} = \frac{c_{i j}}{\ s_i \ s_j}
which says that we get from covariances to correlations by dividing by the two standard deviations.
here are the standard deviations he gives:
{.371,\ 41.253,\ 1.935,\ .077,\ .071,\ 4.037,\ 2.732,\ .297}
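the recovery step can be sketched in numpy; since jolliffe’s correlation matrix is not reproduced here, the r and s below are made up for illustration:

```python
import numpy as np

# hypothetical 3x3 correlation matrix (not Jolliffe's)
r = np.array([[1.0, 0.5, 0.2],
              [0.5, 1.0, 0.3],
              [0.2, 0.3, 1.0]])
# hypothetical standard deviations
s = np.array([2.0, 0.5, 10.0])

# c_ij = r_ij * s_i * s_j, done all at once with an outer product
c = r * np.outer(s, s)
print(c)

# the diagonal entries are the variances s_i^2
print(np.diag(c))

# and dividing back by the outer product recovers the correlations
print(np.allclose(c / np.outer(s, s), r))   # True
```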

more books for PCA / FA (principal components & factor analysis)

The following books pertaining to PCA / FA (principal components and factor analysis) have been added to the complete bibliography. One is data analysis for geology, another is “factor analysis in chemistry”. The geology book has much more material in it than PCA / FA, while the chemistry book is devoted to the one topic.
I had already listed a more general book about data analysis in chemistry (“brereton”). I don’t know yet if I’ll discuss any material from it. I have no idea why I put it out there ahead of time.
I will be discussing at least one example from the geology book Real Soon Now; and I will certainly get to the “factor analysis in chemistry” book. Both are very important for understanding – as opposed to merely computing – PCA / FA.
In addition, there are two statistics books, both by Ronald Christensen. I bought his “Advanced Linear Modeling” during the annual Springer Verlag “buy ‘em cheap” sale, just because it was cheap. I liked it enough that I hunted down “Plane Answers”, which is effectively volume 1 of this two-volume set. I like his writing style; but the books are at a graduate level; further, like the authors of “Numerical Recipes”, he is opinionated, and I love that.
He has a chapter on PCA / FA in the “second volume”, which is why both books are listed. I do not know yet if I will be using material from him, but I do keep looking back at it.
I did say he was opinionated, right? From the chapter on PCA / FA: 
“The point of this example is to illustrate the type of analysis commonly used in identifying factors. No claim is made that these procedures are reasonable.”


PCA / FA example 2: jolliffe. discussion 3: how many PCs to keep?


from jolliffe’s keeping only 4 eigenvectors, i understand that he’s interested in reducing the dimensionality of his data. in this case, he wants to replace the 8 original variables by some smaller number of new variables. that he has no data, only a correlation matrix, suggests that he’s interested in the definitions of the new variables, as opposed to the numerical values of them.
there are 4 ad hoc rules he will use on the example we’ve worked. he mentions a 5th which i want to try.
from the correlation matrix, we got the following eigenvalues.
{2.79227, \ 1.53162, \ 1.24928, \ 0.778408, \ 0.621567, \ 0.488844, \ 0.435632, \ 0.102376}
we can compute the cumulative % variation. recall the eigenvalues as percentages…
{34.9034, \ 19.1452, \ 15.6161, \ 9.7301, \ 7.76958, \ 6.11054, \ 5.4454, \ 1.2797}
now we want cumulative sums, rounded….
{34.9, \ 54., \ 69.7, \ 79.4, \ 87.2, \ 93.3, \ 98.7, \ 100.}

schur’s lemma: any matrix is unitarily similar to an upper triangular matrix

i bumped into someone last night who asked me about schur’s lemma, something about bringing a matrix to triangular form. i’ve spent so much time looking at diagonalizing things that i didn’t appreciate schur’s lemma, and it deserves to be appreciated.
it says that we can bring any (complex) matrix A to upper triangular form using a unitary similarity transform. in this form, the restriction to “unitary” is a bonus: a perfectly useful but weaker statement is that any matrix is similar to an upper triangular matrix.
now, we’re usually interested in diagonalizing a matrix. when can we go that far?
easy: that upper triangular matrix is in fact diagonal iff the original matrix A is normal; that is, iff A commutes with its conjugate transpose:
A \ A^{\dagger } = A^{\dagger }\ A.

so, any normal matrix can be diagonalized; furthermore, the similarity transform is unitary.
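a small numpy check of the normal case (this verifies the diagonal conclusion rather than computing a full schur decomposition; scipy.linalg.schur would give the triangular form for a general matrix). the example matrix is mine: a rotation, which is normal but not symmetric:

```python
import numpy as np

# a rotation matrix: normal (it commutes with its transpose) but not symmetric
theta = 0.7
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# check normality: A A^H = A^H A
print(np.allclose(A @ A.conj().T, A.conj().T @ A))   # True

# for a normal matrix with distinct eigenvalues, the unit eigenvectors are
# mutually orthogonal, so the eigenvector matrix V is unitary
w, V = np.linalg.eig(A)
print(np.allclose(V.conj().T @ V, np.eye(2)))        # True

# and the "upper triangular" form V^H A V is in fact diagonal
T = V.conj().T @ A @ V
print(np.allclose(T, np.diag(w)))                    # True
```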


PCA / FA example 2: jolliffe. discussion 2: what might we have ended up with?

back to the table. here’s what jolliffe showed for the “principal components based on the correlation matrix…” with a subheading of “coefficients” over the columns of the eigenvectors.
\left(\begin{array}{cccc} 0.2&-0.4&0.4&0.6\\ 0.4&-0.2&0.2&0.\\ 0.4&0.&0.2&-0.2\\ 0.4&0.4&-0.2&0.2\\ -0.4&-0.4&0.&-0.2\\ -0.4&0.4&-0.2&0.6\\ -0.2&0.6&0.4&-0.2\\ -0.2&0.2&0.8&0.\end{array}\right)
under each column, he also showed “percentage of total variation explained”. those numbers were derived from the eigenvalues. we saw this with harman:
  • we have standardized data;
  • we find an orthogonal eigenvector matrix of the correlation matrix;
  • which we use as a change-of-basis to get data wrt new variables;
  • the variances of the new data are given by the eigenvalues of the correlation matrix.
the most important detail is that the eigenvalues are the variances of the new data if and only if the change-of-basis matrix is an orthogonal eigenvector matrix.
and that is what jolliffe has: the full eigenvector matrix P is orthogonal. OTOH, we don’t actually know that the data was standardized, but the derivation made it clear that if we want the transformed data to have variances = eigenvalues, then the original data needs to be standardized.
again, since jolliffe never gives us the data, we can’t very well transform it.
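but we can check the claim on synthetic data; here is a numpy sketch (the data and names are mine, not jolliffe’s):

```python
import numpy as np

# make some correlated synthetic data: 200 observations of 3 variables
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 3))

# standardize: subtract column means, divide by column standard deviations
Z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)

# correlation matrix of the original data = covariance matrix of Z
R = (Z.T @ Z) / (len(Z) - 1)

# orthogonal eigenvector matrix P of R (eigh: R is symmetric)
evals, P = np.linalg.eigh(R)

# change of basis: new variables Y = Z P
Y = Z @ P

# the variances of the new variables are the eigenvalues of R
print(np.allclose(Y.var(axis=0, ddof=1), evals))   # True
```

this works precisely because P is orthogonal: the covariance of Y is P^T R P, which is the diagonal matrix of eigenvalues.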