Regression 1 – Multicollinearity in Review

As I draft this, I plan to do four things in this post.

  1. Summarize the methods I’ve used to analyze multicollinearity.
  2. Suggest that multicollinearity is a continuum with no clear-cut boundaries.
  3. Summarize the conventional wisdom on its diagnosis and treatment.
  4. Flag significant points made in my posts.

Let me say up front that there is one more thing I know of that I want to learn about multicollinearity – but it won’t happen this time around. I would like to know what economists did to get around the multicollinearity involved in estimating production functions, such as the Cobb-Douglas.

Using the QR Decomposition to orthogonalize data

This is going to be a very short post, illustrating one idea with one example (yes, one, not five).

It turns out that there is another way to have Mathematica® orthogonalize a matrix: it’s called the QR decomposition. The matrix Q will contain the orthogonalized data… and the matrix R will specify the relationship between the original data and the orthogonalized data.

That means we do not have to do the laborious computations described in the earlier post. To be clear: if we do not care about the relationship between the original data and the orthogonalized data, then I see no advantage, in Mathematica, to using the QR decomposition over the Orthogonalize command.
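To make the roles of Q and R concrete, here is a minimal sketch in Python rather than Mathematica. It is not the post's own computation: the helper name qr_gram_schmidt and the data are made up for illustration, and the decomposition is built by hand with a Gram-Schmidt pass so that nothing beyond the standard library is assumed.

```python
import math

def qr_gram_schmidt(columns):
    """Return (Q, R) for a list of linearly independent column vectors.

    Q is a list of orthonormal columns. R is upper triangular (a list of
    rows) with columns[j] == sum over i of R[i][j] * Q[i].
    """
    n = len(columns)
    Q = []
    R = [[0.0] * n for _ in range(n)]
    for j, col in enumerate(columns):
        v = list(col)
        for i, q in enumerate(Q):
            # Remove the component of the running residual along q.
            R[i][j] = sum(a * b for a, b in zip(q, v))
            v = [a - R[i][j] * b for a, b in zip(v, q)]
        R[j][j] = math.sqrt(sum(a * a for a in v))
        Q.append([a / R[j][j] for a in v])
    return Q, R

# Hypothetical data: a column of 1s and two made-up regressors.
columns = [
    [1.0, 1.0, 1.0, 1.0, 1.0],
    [1.0, 2.0, 4.0, 7.0, 11.0],
    [2.0, 3.0, 7.0, 8.0, 15.0],
]
Q, R = qr_gram_schmidt(columns)

def rebuild(j):
    """Recover original column j from Q and column j of R."""
    rows = len(columns[0])
    return [sum(R[i][j] * Q[i][k] for i in range(len(Q))) for k in range(rows)]
```

Column j of R holds the coefficients that express the j-th original column in terms of the orthogonalized columns, so rebuild(j) should reproduce columns[j] up to rounding. That is the extra information an orthogonalization alone does not hand back.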

Regression 1: ADM polynomials – 2

Let’s look again at a polynomial fit for our small set of annual data. We started this in the previous technical post.

What we used last time was

That is, I had divided the year by 1000… because, as messy as our results were, they would have been a little worse using the years themselves.

But there’s a simple transformation that we ought to try – and it will have a nice side effect.

Just center the data. Start with the years themselves, and subtract the mean:

I’ll observe that if we wanted to work with integers, we could just multiply by 2. In either case, our new x is not a unit vector.

Oh, the nice side effect? Our centered data is orthogonal to a constant vector.
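A quick way to check both remarks is the following Python sketch. The eight years are hypothetical stand-ins (the actual years are not reproduced in this excerpt); only the count of eight observations comes from the posts.

```python
# Hypothetical years: eight consecutive values, chosen only for illustration.
years = list(range(2001, 2009))

mean_year = sum(years) / len(years)      # 2004.5 for these made-up years
x = [t - mean_year for t in years]       # centered data

ones = [1.0] * len(years)
# Orthogonality to the constant vector: the dot product is the sum of x.
dot_with_ones = sum(a * b for a, b in zip(x, ones))

# Doubling turns the half-integers into integers.
x_int = [2 * v for v in x]

# Squared length of x; a unit vector would give exactly 1.
norm_sq = sum(v * v for v in x)
```

With an even number of consecutive years the mean falls on a half-integer, which is why doubling is enough to get integers. The dot product with the column of 1s is just the sum of the centered values, which is zero by construction.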

Let’s see what happens.

Regression 1: Archer Daniels Midland (polynomials) – 1

Now I want to illustrate another problem, this time with the powers of x. The following is the Archer Daniels Midland data from Draper & Smith, p. 463; it may be in a file, but – with only 8 observations – it was easier to type the data in. Heck, I didn’t even look to see whether it was in a file somewhere.

raw data

I have chosen to divide the years by 1000; in the next post I will do something else.

The output of the following command is the given y values… I typed integers and then divided by 100 once rather than type decimal points.


Regression 1: Example 8, Fitting a polynomial

I want to revisit my old 2nd regression example of May 2008. I have more tools available to me today than I did when I first created it – and it was originally done before Regress was replaced by LinearModelFit.

Recap: fitting a quadratic and a cubic

What I had was five observations x, five disturbances u – and an equation defining the true model: y = 2 + x^2 + u. Here they are:

Construct a full data matrix with x, x^2, and y:

Run forward selection… and backward selection…
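The setup in the recap can be sketched in Python as follows. The x and u values here are invented placeholders, not the five observations and disturbances from the original post; only the true model y = 2 + x^2 + u and the column layout (x, x^2, y) are taken from the text.

```python
# Hypothetical observations and disturbances (NOT the post's actual values).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
u = [0.1, -0.2, 0.0, 0.3, -0.1]

# True model from the recap: y = 2 + x^2 + u.
y = [2 + xi ** 2 + ui for xi, ui in zip(x, u)]

# Full data matrix, one row per observation: [x, x^2, y].
data = [[xi, xi ** 2, yi] for xi, yi in zip(x, y)]
```

Each row carries the candidate regressors first and the dependent variable last, so a selection procedure can pick among the first two columns while treating the third as the response.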


Regression 1: eliminating multicollinearity from the Toyota data

We have seen that we can eliminate the multicollinearity from the Hald data if we orthogonalize the design matrix – thereby guaranteeing that the new data vectors will be orthogonal to a column of 1s. That, in turn, centers the new data, so that it is uncorrelated as well as orthogonal.

Doing that to the Toyota data will seem strange… because we have to do it to the dummy variables, too! But it will eliminate the multicollinearity.
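To see why this looks strange for a dummy variable, here is a small Python sketch on made-up numbers. It does not use the Toyota data: the names mileage and dummy are hypothetical, and the Gram-Schmidt helper stands in for whatever orthogonalization routine one prefers.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def gram_schmidt(columns):
    """Orthonormalize the columns in order, returning the new columns."""
    basis = []
    for col in columns:
        v = list(col)
        for q in basis:
            c = dot(q, v)
            v = [a - c * b for a, b in zip(v, q)]
        norm = math.sqrt(dot(v, v))
        basis.append([a / norm for a in v])
    return basis

# Hypothetical design matrix: constant, one continuous regressor, one dummy.
ones = [1.0] * 6
mileage = [3.0, 5.0, 6.0, 8.0, 9.0, 12.0]   # made-up continuous variable
dummy = [0.0, 0.0, 1.0, 1.0, 0.0, 1.0]      # made-up 0/1 indicator

q0, q1, q2 = gram_schmidt([ones, mileage, dummy])

# Orthogonal to the constant column means the new columns sum to zero.
means = [sum(q) / len(q) for q in (q1, q2)]
# For centered columns, a zero dot product is also a zero correlation.
cross = dot(q1, q2)
```

After the pass, q2 is no longer a column of 0s and 1s: it takes several distinct fractional values. That is the price of making the dummy orthogonal to both the constant and the continuous regressor.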

I’m not sure it’s worthwhile to eliminate it… but we can… so let’s do it.

Regression 1: eliminating multicollinearity from the Hald data

I can eliminate the multicollinearity from the Hald dataset. I’ve seen it said that this is impossible. Nevertheless I conjecture that we can always do this – provided the data is not linearly dependent. (I expect orthogonalization to fail precisely when X’X is not invertible, and to be uncertain when X’X is on the edge of being not invertible.)

The challenge of multicollinearity is that it is a continuum, not usually a yes/no condition. Even exact linear dependence – which is yes/no in theory – can be ambiguous on a computer. In theory we either have linear dependence or linear independence. In practice, we may have approximate linear dependence, i.e. multicollinearity – but in theory approximate linear dependence is still linear independence.
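A tiny Python illustration of that ambiguity, using made-up columns in which the third is the sum of the first two: in exact rational arithmetic the dependence is exact, but in floating point the difference does not come out as exactly zero.

```python
from fractions import Fraction

# Made-up columns, written as decimal strings so both views start from
# the same numbers. In exact arithmetic, x3 = x1 + x2.
x1_txt = ["0.1", "0.5", "0.7"]
x2_txt = ["0.2", "0.25", "0.1"]
x3_txt = ["0.3", "0.75", "0.8"]

# Exact view: rational arithmetic sees the dependence as a yes/no fact.
exact_dependent = all(
    Fraction(a) + Fraction(b) == Fraction(c)
    for a, b, c in zip(x1_txt, x2_txt, x3_txt)
)

# Floating-point view: the same test leaves a tiny nonzero residual.
x1 = [float(s) for s in x1_txt]
x2 = [float(s) for s in x2_txt]
x3 = [float(s) for s in x3_txt]
float_residual = [c - (a + b) for a, b, c in zip(x1, x2, x3)]
float_dependent = all(r == 0.0 for r in float_residual)
```

Here exact_dependent is true while float_dependent is false, even though the residuals are far smaller than any plausible measurement error. Software therefore has to decide dependence with a tolerance, and any tolerance is a judgment call.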

But if approximate linear dependence is a continuum then it is also a continuum of linear independence.

So what’s the extreme form of linear independence?


What happens if we orthogonalize our data?

The procedure isn’t complicated: use the Gram-Schmidt algorithm – on the design matrix. Let me emphasize that: use the design matrix, which includes the column of 1s. (We will also, in a separate calculation, see what happens if we do not include the vector of 1s.)
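As a rough sketch of why the column of 1s matters, the following Python runs Gram-Schmidt twice on made-up data (not the Hald data), once with the constant column first and once without it, and then compares the correlation of the resulting columns. The helper names are hypothetical.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def gram_schmidt(columns):
    """Orthonormalize the columns in order, returning the new columns."""
    basis = []
    for col in columns:
        v = list(col)
        for q in basis:
            c = dot(q, v)
            v = [a - c * b for a, b in zip(v, q)]
        norm = math.sqrt(dot(v, v))
        basis.append([a / norm for a in v])
    return basis

def correlation(a, b):
    """Pearson correlation: center both vectors, then normalize."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    ca = [v - ma for v in a]
    cb = [v - mb for v in b]
    return dot(ca, cb) / math.sqrt(dot(ca, ca) * dot(cb, cb))

# Made-up regressors, chosen only for illustration.
ones = [1.0] * 5
x1 = [2.0, 4.0, 5.0, 7.0, 9.0]
x2 = [10.0, 12.0, 11.0, 15.0, 14.0]

# With the column of 1s: the design matrix, as the post recommends.
_, a1, a2 = gram_schmidt([ones, x1, x2])
corr_with_ones = correlation(a1, a2)

# Without the column of 1s: orthogonal, but not centered.
b1, b2 = gram_schmidt([x1, x2])
corr_without_ones = correlation(b1, b2)
```

In both runs the new columns have zero dot product. Only in the first are they also centered, so only there does orthogonal imply uncorrelated. On this particular made-up data the second pair remains strongly correlated, which shows that orthogonality and zero correlation are different conditions unless the data is centered.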

Here we go….