PCA / FA Brereton Summary

(No, you didn’t miss any previous posts about Brereton’s “Chemometrics”. This is it.)

Having made it through chapter 4 of Brereton, I will revise the bibliographic entry to echo the following sentiment: it might be fair to say that I view this as a workbook; figure the stuff out elsewhere, but play with it here. He has several examples in chapter 4: two case studies, a small worked-out example of cross-validation, and a small example illustrating various forms of preprocessing. He also provides data to illustrate “Procrustes analysis”, but he does not show how to do it.

On my first pass thru here, all I tried was the two big case studies. I could match his first, but it was ugly getting there, and I could not match his second. On my second pass thru here, a week ago, having learned the SVD from Davis and the X = RC factoring from Malinowski, it was clear what Brereton was doing. I tore thru the chapter, and matched all of his results; more importantly, I got them easily and cleanly. (Mostly. His cross-validation example was more like tiptoeing than tearing thru, and there is one figure in the chapter that must come from some data not supplied.)

As I said in the bibliography, the data is available in electronic form once you purchase the book. Under the circumstances, I won’t publish any of his examples.

But let’s talk about them anyway.

PCA / FA tricky preprocessing

Introduction

I have stumbled across a tricky point in the preprocessing of data. The most relevant post is probably this one of April 7. Rather than lecture, let me ask and answer some questions. The fundamental question is:
Can I inadvertently reduce the rank (the dimensionality) of the data matrix?
The answer is yes.
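Here is a minimal sketch of one way it can happen. It assumes the preprocessing step is column centering, and the 2-by-2 matrix X is a hypothetical toy example, not data from any of the books. The helper names rank and center_columns are mine.

```python
from fractions import Fraction


def rank(matrix):
    """Rank by Gaussian elimination, using exact fractions to avoid round-off."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def center_columns(matrix):
    """Subtract each column's mean from that column."""
    n = len(matrix)
    means = [Fraction(sum(col), n) for col in zip(*matrix)]
    return [[Fraction(v) - m for v, m in zip(row, means)] for row in matrix]


# Hypothetical toy data: the 2x2 identity has full rank.
X = [[1, 0], [0, 1]]
Xc = center_columns(X)

print(rank(X), rank(Xc))  # prints: 2 1
```

After centering, every column sums to zero, so the rows of the centered matrix are linearly dependent. Here that costs one dimension.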

Triangulations of Surfaces: minimum number of triangles

Edited 4 Sep; search on “edit”.

Take a cube. If you cut it along a few edges, you could lay it out flat. To restore the cube, we identify certain edges with each other. Similarly for a tetrahedron, or a theoretical soccer ball (with flat faces), or any polyhedron. For studying surfaces by looking at polyhedra (i.e. piecewise linear structure), it is convenient to use only triangular faces (2D simplices) rather than arbitrary polygonal faces. The analog of our cut and flattened cube is called a triangulation. As before, we want to identify certain edges with each other.

In particular, the following

is offered as a triangulation of a torus (with the top & bottom edges identified, and the left & right, as we’ve seen before).

Why are there so many triangles? I have wondered that since the first time I saw it.
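To make the count concrete, here is a rough sketch. It assumes the triangulation in question is the common one: a 3-by-3 grid of squares, each cut along a diagonal, with opposite sides of the big square identified. That layout is my assumption, and the names grid_torus and counts are made up for the illustration.

```python
from itertools import combinations


def grid_torus(n):
    """Triangles of an n-by-n grid with opposite sides identified (indices mod n)."""
    triangles = set()
    for i in range(n):
        for j in range(n):
            a = (i, j)
            b = ((i + 1) % n, j)
            c = ((i + 1) % n, (j + 1) % n)
            d = (i, (j + 1) % n)
            # Each grid square a-b-c-d is split along the diagonal a-c.
            triangles.add(frozenset((a, b, c)))
            triangles.add(frozenset((a, d, c)))
    return triangles


def counts(triangles):
    """Return (vertices, edges, faces), counting identified items once."""
    vertices = set().union(*triangles)
    edges = {frozenset(e) for t in triangles for e in combinations(sorted(t), 2)}
    return len(vertices), len(edges), len(triangles)


V, E, F = counts(grid_torus(3))
print(V, E, F, V - E + F)  # prints: 9 27 18 0
```

Under that assumption there are 9 vertices, 27 edges and 18 triangles, and V - E + F comes out to 0, as it should for a torus.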

books added 9 Aug 2008: Algebraic Topology

introduction

Let me begin by citing a site: here you will find, among other things, a free downloadable version of an algebraic topology book, offered by its author. It looks pretty good.

http://www.math.cornell.edu/~hatcher/

Someday I’d like to write an introduction to topology (a post! not a book!), but trying to do it now is taking me too far out of my comfort zone: I am reasonably familiar with general (also called point-set) topology, but I am rather ignorant of algebraic topology; I am reasonably familiar with differential geometry, but differential topology is a different and unknown beast. I have opinions about how topology hangs together, but when I try to be precise, I find that I’m not sure I can justify my opinions. I’d rather get it more right later than get it wrong now.

Remember that the path from ignorance to knowledge in any subject is not straight and true, but is almost always rather zigzagged. One seems to learn things by a method of successive approximations to the truth.

William S. Massey, Algebraic Topology: An Introduction. p xiii.


PCA / FA Malinowski Summary

Malinowski’s work is considerably different from everything else we’ve seen before.

First of all, he expects that in most cases one will neither standardize nor even center the data X. We can do his computations as an SVD of X, or an eigendecomposition of X^{T}X or of XX^{T} – but because the data isn’t even centered, X^{T}X and XX^{T} are not remotely covariance matrices. For this reason, I assert that preprocessing is a separate issue.
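A small sketch of how far apart the two can be. The 3-by-2 matrix X below is hypothetical toy data (not Malinowski's), and the helper names are mine. It compares the raw cross-product matrix X^{T}X with the one built from column-centered data; the latter, divided by n - 1, would be the sample covariance matrix.

```python
from fractions import Fraction


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def column_means(m):
    return [Fraction(sum(col), len(m)) for col in zip(*m)]


# Hypothetical toy data: 3 observations (rows) of 2 variables (columns).
X = [[10, 1], [11, 3], [12, 2]]
n = len(X)
means = column_means(X)
Xc = [[v - mu for v, mu in zip(row, means)] for row in X]

raw = matmul(transpose(X), X)          # X^T X, uncentered
centered = matmul(transpose(Xc), Xc)   # Xc^T Xc; divide by n - 1 for covariance

print(raw)                                      # [[365, 67], [67, 14]]
print([[int(v) for v in r] for r in centered])  # [[2, 1], [1, 2]]

# The gap is exactly n times the outer product of the column means.
gap = [[n * a * b for b in means] for a in means]
rebuilt = [[c + g for c, g in zip(cr, gr)] for cr, gr in zip(centered, gap)]
print(rebuilt == raw)                           # True
```

In this toy case the uncentered matrix is dominated by the means of the columns rather than by their spread, which is one way to see why it should not be read as a covariance matrix.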

Books Added 1 Aug 2008: Graph Theory

I have had a long-standing question: why is the usual triangulation of a torus so large? It is a square with 18 triangles. Why so many?

While looking through algebraic topology books for more information about simplicial complexes – which is supplementary material for Bloch (“A First Course in Geometric Topology….”) – I found the answer. Well, I found a clue. I found an inequality, which implied that the minimum number of triangles was 14 in any possible triangulation of the torus.
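For what it's worth, here is a sketch of one standard counting argument that lands on 14. I am not claiming it is the inequality from the book. It assumes three things: V - E + F = 0 for a torus, every edge lies on exactly two triangles (so 3F = 2E), and no two edges join the same pair of vertices (so E is at most V(V-1)/2). Together those give E = 3V and F = 2V, and the function name below is made up.

```python
from math import comb


def torus_lower_bound():
    """Smallest V with 3V <= C(V, 2), and the E and F that go with it."""
    v = 1
    # E = 3V edges must fit among the C(V, 2) available vertex pairs.
    while 3 * v > comb(v, 2):
        v += 1
    return v, 3 * v, 2 * v


V, E, F = torus_lower_bound()
print(V, E, F)  # prints: 7 21 14
```

This only shows that fewer than 7 vertices, and so fewer than 14 triangles, is impossible under those assumptions. It does not by itself exhibit a triangulation with 14 triangles.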

What I didn’t find was a proof of that inequality, and I couldn’t work it out for myself.

Eventually, in another algebraic topology book, I found another clue, and it was enough to let me work out the first inequality.

Along the way I found some pretty interesting things, and I want to chatter about them. The result will not really be math, but more in the nature of a travel guide.

Well, no, not even that detailed. More in the nature of some really cool pictures from a foreign country.

The search took me into both algebraic topology and graph theory. Before I put out the posts (there will be at least two), I need to put out some bibliography.