## Happenings – 2012 Jun 30

In contrast to recent weeks, this has not been a slow week. It has, however, been a week of tunnel vision. I’ve really only done two things; even my alter ego the kid has been short-changed.

I’ve spent a little time reading about probability distributions (t, chi-square, and F) derived from the normal distribution. Unfortunately, as a post for this coming Monday, this material is still at stage II – and I find it hard enough to go from stage III to a published post in one weekend. This should be a short post, however, so maybe I have a shot at getting from stage II all the way to a published post.

(Stage III means that the mathematics is done, but is not yet even the beginning of a lecture. Stage II means I’m still reading and taking notes.)
## Regression 1 – Assumptions and the error sum of squares

There’s one thing I didn’t work out in the previous post: the relationship between the error sum of squares and the variance of the u. We have already computed the variance of the e, that is,

V(e) = E(ee’).

What we want now is the expected value of the error sum of squares:

E(e’e).

(I should perhaps remind us that e is, by convention, a column vector… so its transpose e’ is a row vector… so e’e is a scalar, equal to the dot product of e with itself… while ee’ is a square matrix. Vectors can be pretty handy for this kind of stuff.)
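To make the distinction concrete, here is a minimal sketch in plain Python; the three-element vector `e` is a made-up illustration, not data from any fit. It also checks that the trace of ee’ equals e’e, a fact that tends to be handy when taking expected values.

```python
# Hypothetical residual vector, chosen only for illustration.
e = [1.0, -2.0, 3.0]

# e'e: row vector times column vector gives a single number,
# the sum of the squared entries.
inner = sum(x * x for x in e)

# ee': column vector times row vector gives an n-by-n matrix
# holding every pairwise product e[i] * e[j].
outer = [[a * b for b in e] for a in e]

# The diagonal of ee' holds the squares, so its trace equals e'e.
trace = sum(outer[i][i] for i in range(len(e)))

print("e'e =", inner)
print("ee' =")
for row in outer:
    print("  ", row)
print("trace of ee' =", trace)
```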

The expected value of the sum of squared errors is surprisingly complicated. Well, maybe I should just say it’s different from what we did in the last post… and that’s one reason I moved it to a post of its own.
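For orientation, here is a sketch of the usual textbook route to that expected value; it may not be the presentation the post itself uses. Everything beyond e and u is an assumption made for this sketch: the model is y = Xβ + u, with X an n×k matrix of full column rank, E(u) = 0, and E(uu’) = σ²I.

```latex
\begin{aligned}
M &= I - X(X'X)^{-1}X', \qquad MX = 0, \qquad M' = M, \qquad M^2 = M \\
e &= My = M(X\beta + u) = Mu \\
e'e &= u'M'Mu = u'Mu \\
E(e'e) &= E\big(\operatorname{tr}(u'Mu)\big)
        = E\big(\operatorname{tr}(M\,uu')\big)
        = \operatorname{tr}\big(M\,E(uu')\big)
        = \sigma^2 \operatorname{tr}(M) \\
\operatorname{tr}(M) &= n - \operatorname{tr}\big(X(X'X)^{-1}X'\big)
                     = n - \operatorname{tr}\big((X'X)^{-1}X'X\big)
                     = n - k \\
E(e'e) &= \sigma^2\,(n - k)
\end{aligned}
```

The step that makes this different from computing E(ee’) is the trace: e’e is a scalar, so it equals its own trace, and the trace lets us move u’ around to the other side of M.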
## Happenings – 2012 Jun 23

Yet another slow weekend, last week.

Yes, a technical post went out… yes, my alter ego the kid played with stellar interiors – it’s been a while – and with ARMA (autoregressive moving average) time series modeling – it’s been years. But other than that, all I did was to look at a little control theory.

There was one earthquake (3.0) in my vicinity last Sunday… and 3 of them last Tuesday (2.7 to 2.9).

Oh, the temperature in Livermore last Saturday was about 106°. My brand-new air-conditioner wasn’t perfect… it couldn’t hold the temperature to 74° inside the great room… that reached 86°. While that might be terrible for a central air system, I think it wasn’t too bad for a window-mounted air conditioner.

Finally, I got in another wavelet book: “Discrete Wavelet Transformations” by Patrick Van Fleet (Wiley Interscience). I do, however, have some reservations about the website for the book. It doesn’t look, so far, anywhere near complete… and the book was published in 2008, so there’s been plenty of time to add material.

For example… under “computer labs” – and then under Mathematica – there is a list of 68 items… unfortunately, only 9 of them can actually be downloaded.

I’ll let you know how it goes when I actually try working through the book – oh, I just discovered that the image and sound files are included in the author’s Discrete Wavelets package. That helps a lot.

And with that, let me start doing some mathematics. Or maybe I’ll take a nap.

## introduction

We have spent a lot of time getting least-squares fits to data. We’ve also spent a lot of time looking at numbers which Mathematica can compute after we’ve gotten a fit. We didn’t have to use LinearModelFit… we could have used a simple Fit command instead.

But the Fit command just does the fit… it doesn’t even give us standard errors and t-statistics.

To get those, however, requires that we make some additional assumptions about the fit. And that’s what I want to talk about, in this and – probably – three more posts.
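As a rough illustration of where the extra assumptions enter, here is a sketch in plain Python of a one-variable fit done by hand. The data are made up, and the variable names are mine, not Mathematica’s. The fit itself needs no statistical assumptions; the standard errors assume uncorrelated disturbances with a common variance, and reading the ratios as t-statistics further assumes the disturbances are normal.

```python
import math

# Hypothetical data, made up for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

# The fit itself: pure least squares, no statistical assumptions needed.
slope = sxy / sxx
intercept = ybar - slope * xbar

# Residuals and the error sum of squares.
e = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
sse = sum(ei ** 2 for ei in e)

# From here on we assume uncorrelated disturbances with a common variance.
# Two parameters were fitted, so the divisor is n - 2.
s2 = sse / (n - 2)
se_slope = math.sqrt(s2 / sxx)
se_intercept = math.sqrt(s2 * (1.0 / n + xbar ** 2 / sxx))

# Ratio of each estimate to its standard error (testing against zero).
t_slope = slope / se_slope
t_intercept = intercept / se_intercept

print("slope    ", slope, se_slope, t_slope)
print("intercept", intercept, se_intercept, t_intercept)
```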

Let’s get started.
## Happenings – 2012 Jun 16

It has been another slow week for mathematics… but I have a new air conditioner installed in the great room, which is my study. It will get a trial by fire today: the National Weather Service forecast calls for 101° today.

There was one earthquake in the past week within a couple of hundred miles of me, a 2.7 last Sunday. That makes 3 for June.

Let me show you just how slow the week has been. Here are 2 of the spam comments that were held for moderation. Boy, am I glad I don’t see most of these.

Could not become composed any further effective. Reading this article blog post jogs my memory of my own old place mate! This individual consistently retained chatting concerning this. I will mail this blog post to your pet. Fairly certain he will have a good read. Many thanks for discussing! [Intended for “trig parameters in practice”.]

Mail that blog post to my pet, indeed! You leave my cat out of this. Or, if you must, send it to both of them.

Wonderful post but I was wondering if you could write a litte more on topic about generic medicines? I’d be very grateful if you could elaborate a little bit more. Thank you! [This was intended for “happenings jun 9”.]

Sheesh! Or Furrfu, whichever you prefer.

Some of the ones held for moderation are so realistic that I’m tempted to allow them – until I notice that all of their praise is being applied to either my “about” page or to my “bibliographies” page. At that point, I trash them.

The 1st order of business today is, of course, to write and to put out this post. My 2nd order of business will be to turn loose my alter ego the kid; I think I’ve been shortchanging him.

A technical post is pretty much through stage IV… I think I’ve answered all of my own questions about it, so the mathematics is done. This, as you might guess, is the post I had hoped to put out last Monday… but there were too many loose threads. I would really like to take this through stage V today, so that it will need only final edits Monday evening.

And that will leave me free to do whatever I want tomorrow. Okay, some of that will be more theory of regression, for next week’s post. But I’m hoping that some of it will be wavelets, and control theory, and maybe I’ll return to electric circuits. With any luck, a little work and a little fun, too.

And that’s all the chatter for this morning.

## Happenings – 2012 Jun 9

It’s been a really slow week.

There have been a couple of earthquakes (3.6 and 2.9)… the 1st about 15 miles from San Jose… the 2nd under the ocean about 30 miles west of San Francisco City Hall. This is 2 so far in June. I’m still surprised at how rare magnitude 2.5 events are in my neighborhood, on the order of 10 per month.

I’ve ordered and received 3 more books… about filter banks and/or wavelets. Yes, I’ve been playing with wavelets in Mathematica® version 8 – but I’ve already seen one answer that completely bewilders me. I’m sure one of us did something wrong – but is it me or is it Mathematica? (I actually hope it’s me… because me I can fix, in time.)

The books were…
• Fundamentals of Wavelets… Goswami and Chan, 2nd ed, Wiley 978-0-470-48412-5.
• Digital Signal Processing… Diniz, da Silva, and Netto, 2nd ed, Cambridge 978-0-521-88775-5.
• Multirate Digital Signal Processing, Crochiere and Rabiner, Prentice-Hall 0-13-605162-6.

Oh, I also noticed that the classic “Wavelets and Subband Coding” by Vetterli and Kovacevic is now available from Dover… I had to track down a used copy, a few years ago.

I’ve got a post – the first of two about the assumptions underlying the analysis of a regression fit – nearing the end of stage III… so most of the mathematics is done, but not all of it. There’s just one piece to figure out how to describe. Perhaps surprisingly, it’s the expected value of the sum of squared errors – it’s tricky to compute, and I haven’t decided how to present the things we need for it. I know them – but just how do I want to introduce them?

(Not only do I hate reading solutions that come out of left field, I hate writing them, too.)

I’m beginning to wonder if I need a break. I’m struggling just to get the blog posts out, and I don’t seem to have much energy for any other mathematics. And that feels terrible, it really does.

And with that, let me see what mathematics I can get done today.

## introduction

It was only on this round of regression studies that it really registered with me that the residuals e are correlated with the dependent variable y, but are uncorrelated with the fitted values yhat.

And it was only a couple of weeks ago that the more precise statements registered, and I decided to prove them. In fact, what we have are the following. Suppose we run a regression, with any number of variables, and we end up with some R^2.

• If we fit a line to the residuals e as a function of the fitted values yhat, we will get a slope of zero.
• If we fit a line to the residuals e as a function of the dependent variable y, we will get a slope of 1 – R^2.

We will see that we can rephrase those statements:

• the correlation coefficient between e and yhat is zero;
• the correlation coefficient between e and y is Sqrt[1-R^2].
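Here is a short sketch of why both pairs of statements hold; it is my own outline rather than the proof in the post. It assumes the regression includes a constant term, so that the residuals sum to zero and are orthogonal to the fitted values. Cov and Var denote sample quantities, and SSE and SST are the error and total sums of squares.

```latex
\begin{aligned}
\operatorname{Cov}(e,\hat{y}) &= 0
  &&\text{since } e'\hat{y} = 0 \text{ and } \textstyle\sum_i e_i = 0 \\
\operatorname{Cov}(e,y) &= \operatorname{Cov}(e,\hat{y} + e) = \operatorname{Var}(e)
  &&\text{since } y = \hat{y} + e \\
\text{slope of } e \text{ on } \hat{y}
  &= \frac{\operatorname{Cov}(e,\hat{y})}{\operatorname{Var}(\hat{y})} = 0 \\
\text{slope of } e \text{ on } y
  &= \frac{\operatorname{Cov}(e,y)}{\operatorname{Var}(y)}
   = \frac{\operatorname{Var}(e)}{\operatorname{Var}(y)}
   = \frac{SSE}{SST} = 1 - R^2 \\
\operatorname{corr}(e,y)
  &= \frac{\operatorname{Cov}(e,y)}{\sqrt{\operatorname{Var}(e)\,\operatorname{Var}(y)}}
   = \sqrt{\frac{\operatorname{Var}(e)}{\operatorname{Var}(y)}}
   = \sqrt{1 - R^2}
\end{aligned}
```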

What that means is that if we look at the residuals as a function of y, we should expect to see a significant slope – unless the R^2 is close to 1. If our purpose in drawing the graph is to look for structure in the residuals which might point to problems in our fit, well, we’ll almost always see such structure – a line of slope 1 – R^2 – and it’s meaningless.

Therefore:
Look at e versus yhat, not e versus y.
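A quick numerical check of those claims, as a sketch in plain Python: the data are made up, the helper names (`mean`, `cov`, `slope`, `corr`) are mine, and for brevity the fit has a single regressor plus a constant, although the claims themselves are stated for any number of variables.

```python
import math

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance with divisor n; the divisor cancels in every ratio below."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

def slope(xs, ys):
    """Least-squares slope of ys fitted as a line in xs, with a constant term."""
    return cov(xs, ys) / cov(xs, xs)

def corr(a, b):
    return cov(a, b) / math.sqrt(cov(a, a) * cov(b, b))

# Hypothetical data with deliberate scatter, so R^2 is well below 1.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [3.0, 1.0, 4.0, 6.0, 3.0, 8.0, 5.0, 9.0]

# Fit y = a + b x, then form fitted values and residuals.
b = slope(x, y)
a = mean(y) - b * mean(x)
yhat = [a + b * xi for xi in x]
e = [yi - yh for yi, yh in zip(y, yhat)]

# R^2 = 1 - SSE/SST; the residuals have mean zero, so cov(e, e) is SSE/n.
r2 = 1.0 - cov(e, e) / cov(y, y)

print("R^2               ", r2)
print("slope of e on yhat", slope(yhat, e))
print("slope of e on y   ", slope(y, e), "vs 1 - R^2 =", 1.0 - r2)
print("corr(e, yhat)     ", corr(e, yhat))
print("corr(e, y)        ", corr(e, y), "vs Sqrt[1 - R^2] =", math.sqrt(1.0 - r2))
```

With an R^2 of roughly 0.6 for these numbers, the line through e versus y has a clearly nonzero slope even though the fit has no hidden structure, while e versus yhat is flat up to rounding error.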
