Many years ago I took a bunch of courses in multimedia studies… among them, 3-D animation. My original goal was simple enough: to learn how to use my computer for more than number crunching.

Of course, that couldn’t have been all that long ago… although the program included a course in the history of multimedia. It was hard to believe, going in, that it had any history. Did you know that Richard Wagner (e.g. The Ring Cycle) was the first dramatist to insist that the lights be turned off for his shows? Composer, oh yes. But impresario, too, and the whole show was important to him. (And that’s about all I remember of the history.)

Ironically, because I was in a certificate program, I was able to get a student version of Mathematica – which put me on the road to using my computer for even more number crunching.

For one of my assignments, I needed to take a picture of a table and record a lot of information about the camera – its characteristics, settings, and position. The purpose was to combine the photograph of the real table with an animated ball bouncing on it. That is, the ball would be made to bounce on an animated table – which would be made invisible and rotated to overlay the real table.

Ultimately, however, I decided that I did not have enough information about the camera to determine the geometry… so I simply moved the viewpoint of the animated table until it coincided with the photograph. Once I did that, it was easy enough to add in the animated ball.

And that is why I jumped on 2 books I saw recommended a week and a half ago.

“Multiple View Geometry in Computer Vision” by Richard Hartley & Andrew Zisserman (2nd edition) is the more readable of the 2: it starts by introducing projective geometry – which I’ve seen, no pun intended, but I would welcome the chance to play with it some more. Its other major tool, not surprisingly, is linear algebra. Its major goal is the construction of a 3-D scene from a collection of 2-D images. That is, to assemble objects and assign 3-D coordinates to them.

That’s the book I will be working through 1st. In addition to being more readable, for me, it looks more hands-on, too.

Nevertheless, I’ve already gotten something out of the other book, “An Invitation to 3-D Vision” by Yi Ma, Stefano Soatto, Jana Kosecka & S. Shankar Sastry. Overall, it seems to be written more abstractly… and it seems to come at it from a different direction (okay, from a different point of view… cringe). I look forward to reading it after the other book.

And what did I get out of it? They referred to the Gram-Schmidt orthogonalization algorithm and the QR decomposition of a matrix in the same sentence.

Oh my God.

Just the juxtaposition of the 2 was enough for me to understand: they are the same thing.

The Gram-Schmidt algorithm starts with a set of linearly independent vectors and generates an orthonormal set of vectors that spans the same space.

The QR decomposition writes a square matrix X as

X = Q R,

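where Q is orthogonal and R is upper triangular. If the matrix X is not square (more rows than columns), then in the reduced form of the decomposition Q has the same shape as X and has orthonormal columns, while R is still square and upper triangular.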
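One way to see why the two are the same thing is to write X = Q R out column by column. This is a sketch, not taken from either book: the names x_j, q_i and r_ij (columns of X, columns of Q, entries of R) are notation introduced here for illustration, and it assumes the columns of X are linearly independent, so that every r_jj is nonzero.

```latex
\[
\begin{aligned}
x_j &= \sum_{i=1}^{j} r_{ij}\, q_i ,
  \qquad r_{ij} = q_i \cdot x_j \quad (i \le j), \\[4pt]
q_j &= \frac{1}{r_{jj}} \Bigl( x_j - \sum_{i=1}^{j-1} r_{ij}\, q_i \Bigr),
  \qquad r_{jj} = \Bigl\lVert\, x_j - \sum_{i=1}^{j-1} r_{ij}\, q_i \Bigr\rVert .
\end{aligned}
\]
```

The first line is just X = Q R with R upper triangular: column j of X only involves q_1 through q_j. The second line is the first one solved for q_j, and it is exactly a Gram-Schmidt step: subtract from x_j its projections on the earlier q's, then normalize. Taking r_jj as a positive length is a choice of sign convention; a QR routine is free to flip the sign of a column of Q together with the matching row of R.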

So?

Instead of using Gram-Schmidt to orthogonalize data – to eliminate multicollinearity perhaps – we can use the QR decomposition. Six of one… half dozen of the other.

So?

The matrix R specifies the relationship between the raw data X and the orthonormal data Q. Yes, it’s equivalent to the calculations I illustrated in this post – but we don’t have to do those calculations anymore, because the matrix R gives the same result. If we need it, we’ve already got it as a by-product of the QR decomposition.

So. It’s all easier to obtain than I thought. I need to illustrate this process.
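Here is a minimal sketch of such an illustration, in plain Python with no libraries. Everything in it is an assumption made for the example: the function names `gram_schmidt_qr` and `rebuild_columns` are hypothetical, the data is made up, and it uses the classical Gram-Schmidt recurrence, which is fine for a small demonstration but is not how a numerical library would compute a QR decomposition. It runs Gram-Schmidt, keeps the projection coefficients in a matrix R as it goes, and then rebuilds the raw columns as Q R.

```python
import math


def gram_schmidt_qr(columns):
    """Classical Gram-Schmidt on a list of linearly independent column vectors.

    Returns (q_cols, r): q_cols is a list of orthonormal columns, and r is an
    n x n upper triangular matrix (list of rows) holding the coefficients
    that Gram-Schmidt computes along the way.
    """
    n = len(columns)
    q_cols = []
    r = [[0.0] * n for _ in range(n)]
    for j, x in enumerate(columns):
        residual = list(x)
        for i, q in enumerate(q_cols):
            # Projection coefficient of x_j on the earlier direction q_i.
            r[i][j] = sum(qk * xk for qk, xk in zip(q, x))
            residual = [vk - r[i][j] * qk for vk, qk in zip(residual, q)]
        # What is left after removing the projections; its length is r_jj.
        r[j][j] = math.sqrt(sum(vk * vk for vk in residual))
        q_cols.append([vk / r[j][j] for vk in residual])
    return q_cols, r


def rebuild_columns(q_cols, r):
    """Form the columns of Q R: column j is the sum over i <= j of r[i][j] * q_i."""
    m = len(q_cols[0])
    return [
        [sum(r[i][j] * q_cols[i][k] for i in range(j + 1)) for k in range(m)]
        for j in range(len(q_cols))
    ]


# Made-up raw data: three columns of length four (constant, linear, quadratic).
X_cols = [
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 2.0, 3.0, 4.0],
    [1.0, 4.0, 9.0, 16.0],
]

q_cols, r = gram_schmidt_qr(X_cols)
rebuilt = rebuild_columns(q_cols, r)

print("R (upper triangular):")
for row in r:
    print([round(value, 4) for value in row])

largest_gap = max(
    abs(a - b)
    for col_a, col_b in zip(rebuilt, X_cols)
    for a, b in zip(col_a, col_b)
)
print("largest |X - QR| entry:", largest_gap)
```

The point of the sketch is that R is not a separate computation: every entry is a number Gram-Schmidt has to compute anyway, stored instead of discarded.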

And we see once again why I buy books like a madman. Without a collegial environment… without a chance to sit around and chat about math… I am dependent on my reading for cross-fertilization. Even if I get nothing else out of that second book – not likely, but even if – the book will have been worth it for connecting the QR decomposition to the Gram-Schmidt process.

(No, I’m not going to look through other books of mine to see if I missed it in earlier reading. I see what I’m ready to see. Or, as they say, when the student is ready, the teacher will appear.)
