## A Minimization problem

Let me work a minimization problem rather than a maximization. Like the previous problem, this one comes from Loomis & Turban, “Applied Programming for Management”, ISBN 0-03-078240-6 (pp. 97–99).

A customer asks a butcher to grind up several cuts of beef to form a blend of not less than 17.6% protein and 14.8% fat.

What he has available is shown in a table; that table gives us the A matrix and the c vector, while the protein and fat requirements give us the b vector. Let the variables be x1, x2, x3.
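The excerpt doesn’t reproduce the book’s table, but the structure of the problem can be sketched with SciPy’s `linprog`. The protein/fat fractions and costs below are made-up placeholders, not the Loomis & Turban numbers:

```python
# A minimal sketch of the blend problem with scipy's linprog.
# The fractions and costs are hypothetical stand-ins for the book's table.
import numpy as np
from scipy.optimize import linprog

# Rows: protein fraction, fat fraction; columns: the three cuts (x1, x2, x3).
A = np.array([[0.19, 0.16, 0.12],   # hypothetical protein content per cut
              [0.16, 0.25, 0.05]])  # hypothetical fat content per cut
b = np.array([0.176, 0.148])        # required minimums for the blend
c = np.array([1.50, 1.20, 0.90])    # hypothetical cost per pound of each cut

# Minimize cost subject to A x >= b and x1 + x2 + x3 = 1 (fractions of a pound).
res = linprog(c, A_ub=-A, b_ub=-b, A_eq=[[1, 1, 1]], b_eq=[1], bounds=(0, None))
print(res.x, res.fun)
```

The “not less than” requirements become `A x >= b`, which `linprog` wants written as `-A x <= -b`.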

## Linear programming – Getting the final tableau given the answer

(That may sound strange. Bear with me.)

## Introduction

Every once in a while, I pick up and play with linear programming. This post will show you a couple of elementary ways to set up and solve a small linear programming problem….

But that is not the main purpose of this post.

For the record, Mathematica® has some special-purpose commands for solving large linear programming problems. I believe it handles them as data sets in a standardized format. I also believe its smallest example has more than 30 variables. So, if you need to do linear programming professionally, you should look at Mathematica’s linear programming command.

But if, like me, you pick up linear programming as a student who wants to work his way through a textbook, then you need something else.

Working through a textbook will almost certainly involve tabular displays of the initial problem, a sequence of tabular displays of intermediate non-optimal solutions, and a display of the final optimal solution. Each of these tables is typically called by the French name “tableau” (plural “tableaux”).

You need to know how to get the final tableau, given Mathematica’s solution and the initial tableau (i.e. the initial data).
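The standard construction behind this: if B is the matrix of basic columns of the initial tableau at the optimum, then the final tableau is B⁻¹ applied to the initial data, and the bottom row holds the reduced costs. Here is a sketch on a small made-up problem (maximize 3x1 + 2x2 subject to x1 + x2 ≤ 4, x1 + 3x2 ≤ 6), not one of the book’s:

```python
# Reconstruct the final tableau from the initial tableau plus the optimal basis.
import numpy as np

# Initial data in standard form, slacks s1, s2 appended.
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([3.0, 2.0, 0.0, 0.0])

basis = [0, 3]                 # optimal basic variables: x1 and s2
B_inv = np.linalg.inv(A[:, basis])

body = B_inv @ A               # coefficient rows of the final tableau
rhs = B_inv @ b                # final right-hand side: values of the basic variables
reduced = c - c[basis] @ body  # reduced-cost row; all <= 0 at a max optimum

print(rhs)                     # basic solution: x1 = 4, s2 = 2
print(reduced)
```

The point is exactly the one in the post: given the solution (hence the basis) and the initial data, the final tableau falls out of one matrix inversion.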

## Regression 1 – Assumptions and the error sum of squares

There’s one thing I didn’t work out in the previous post: the relationship between the error sum of squares and the variance of the u. We have already computed the variance of the e, that is,

V(ee’).

What we want now is the expected value of the error sum of squares:

E(e’e).

(I should perhaps remind us that e is, by convention, a column vector… so its transpose e’ is a row vector… so e’e is a scalar, equal to the dot product of e with itself… while ee’ is a square matrix. Vectors can be pretty handy for this kind of stuff.)

The expected value of the sum of squared errors is surprisingly complicated. Well, maybe I should just say it’s different from what we did in the last post… and that’s one reason I moved it to a post of its own.
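For OLS with n observations and k regressors, the result the full post presumably arrives at is the standard one, E(e′e) = σ²(n − k). A quick Monte Carlo check (my addition, not from the post):

```python
# Monte Carlo check of E(e'e) = sigma^2 * (n - k) for OLS residuals.
import numpy as np

rng = np.random.default_rng(0)
n, k, sigma = 50, 3, 2.0
X = np.column_stack([np.ones(n), rng.standard_normal((n, k - 1))])
H = X @ np.linalg.inv(X.T @ X) @ X.T   # hat matrix; residuals e = (I - H) u

sums = []
for _ in range(20000):
    u = sigma * rng.standard_normal(n)
    e = u - H @ u                      # residuals from regressing y = Xb + u
    sums.append(e @ e)

print(np.mean(sums))                   # should be close to sigma^2 * (n - k) = 188
```

The division by n − k in the usual estimator s² = e′e/(n − k) is exactly what makes it unbiased for σ².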

## Using the QR Decomposition to orthogonalize data

This is going to be a very short post, illustrating one idea with one example (yes, one, not five).

It turns out that there is another way to have Mathematica® orthogonalize a matrix: it’s called the QR decomposition. The matrix Q will contain the orthogonalized data… and the matrix R will specify the relationship between the original data and the orthogonalized.

That means we do not have to do the laborious computations described in this post. Understand, if we do not care about the relationship between the original data and the orthogonalized data, then I see no advantage in Mathematica to using the QR over using the Orthogonalize command.
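The same idea in numpy, for readers without Mathematica (a sketch; as I recall, Mathematica’s QRDecomposition returns the transpose of Q relative to this convention, so check its documentation):

```python
# QR decomposition: columns of Q are an orthonormalized version of the data
# columns, and R records how to rebuild the original data from them.
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((10, 3))        # a typical tall data matrix

Q, R = np.linalg.qr(X)                  # "reduced" QR: Q is 10x3, R is 3x3

print(np.allclose(Q.T @ Q, np.eye(3)))  # columns of Q are orthonormal: True
print(np.allclose(Q @ R, X))            # X = Q R recovers the data: True
```

R is upper triangular, which is what makes the relationship between old and new variables easy to read off.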

## the relationship between the raw and the orthogonalized data

OK, so we orthogonalized the hald data, including the constant (the column of 1s).

What’s the relationship between the new variables and the old? We might someday get a new observation, and if we were using the fit to the orthogonalized data, we might want to see what it predicts for a new data point.

(In all honesty, I would use the original fit – but I still want to know what the relationship is.)

My notation is a little awkward. I’m going to stay with what is used for this post, in which I first showed how to find….

Let me start fresh. If we have two typical data matrices (i.e. taller than wide), and they are supposed to be the same data, how do we find the relationship?
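One way to answer that question, sketched in Python rather than in the post’s notation: if the two matrices are supposed to be related by Y = X T, then T can be recovered by least squares, T = (X′X)⁻¹X′Y. Using a QR pair as the two “versions” of the data:

```python
# Recover the matrix T relating two data matrices that hold "the same" data.
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((13, 4))   # raw data: 13 observations, 4 variables
Q, R = np.linalg.qr(X)             # orthogonalized version of the same data

# Solve X T = Q in the least-squares sense; with X of full column rank,
# this recovers T exactly (here T happens to be the inverse of R).
T, *_ = np.linalg.lstsq(X, Q, rcond=None)
print(np.allclose(X @ T, Q))       # True
```

A new observation x (a row vector) then maps to the new variables as x @ T.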

## introduction

A long time ago, in a post about PCA (principal component analysis), I said that I did not know what Andrews curves were. (The suggestion was made that Andrews curves might help us decide how many principal components to keep. I did not understand how they were to be computed.)

Now I know. Let me show you. I will compute Andrews curves for what is called “the iris data”… for both the full data set (150 observations) and a reduced data set (30 observations). I will also show you a possible variant.

In addition, we know that there are in fact three kinds of irises in the data – so we can assess how well the Andrews curves did. In practice, of course, we would be trying to figure out how many kinds of observations we have.

The data is here. The paper which explained Andrews curves to me is here. Andrews’ original paper is: Andrews, D., “Plots of High-Dimensional Data”, Biometrics 1972, 28:125–136… but I haven’t found it freely available anywhere online.

In addition, there is a short and sweet webpage by one of the authors of the paper I read.
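The recipe itself is short: a data point x is mapped to the function f_x(t) = x1/√2 + x2 sin(t) + x3 cos(t) + x4 sin(2t) + x5 cos(2t) + …, plotted for t in [−π, π]. A sketch for 4-dimensional points; the sample point below is made up, not taken from the iris data (pandas also ships a ready-made `pandas.plotting.andrews_curves`, if I recall correctly):

```python
# Evaluate the Andrews curve of a data point x at a grid of angles t.
import numpy as np

def andrews_curve(x, t):
    """f_x(t) = x1/sqrt(2) + x2 sin(t) + x3 cos(t) + x4 sin(2t) + ..."""
    terms = [np.full_like(t, x[0] / np.sqrt(2))]
    for i, xi in enumerate(x[1:]):
        k = i // 2 + 1                     # frequencies 1, 1, 2, 2, 3, ...
        f = np.sin if i % 2 == 0 else np.cos
        terms.append(xi * f(k * t))
    return sum(terms)

t = np.linspace(-np.pi, np.pi, 101)
x = np.array([5.1, 3.5, 1.4, 0.2])         # one made-up 4-d observation
curve = andrews_curve(x, t)
print(curve.shape)
```

Plotting one such curve per observation, and coloring by group, is what lets similar observations show up as bundles of similar curves.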

## Norms and Condition Numbers

We have had a few occasions to talk about norms of vectors. I want to take a closer look at them, then at norms of matrices, and then at condition numbers of matrices.

## Vector Norms

Let me jump right in. Here’s a 2-dimensional vector, (3, 4): its Euclidean length is 5 – the length of the hypotenuse of a 3-4-5 right triangle.
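The common norms of that vector, and a condition number, computed with numpy rather than Mathematica (a sketch of the quantities the post goes on to discuss):

```python
# Vector norms of (3, 4), plus a matrix condition number.
import numpy as np

v = np.array([3.0, 4.0])
print(np.linalg.norm(v))          # 2-norm (Euclidean): 5.0, the 3-4-5 hypotenuse
print(np.linalg.norm(v, 1))       # 1-norm: |3| + |4| = 7.0
print(np.linalg.norm(v, np.inf))  # infinity-norm: max(|3|, |4|) = 4.0

# Condition number = ratio of largest to smallest singular value (2-norm).
M = np.array([[1.0, 0.0],
              [0.0, 0.01]])
print(np.linalg.cond(M))          # 1 / 0.01 = 100
```

A large condition number signals that solving linear systems with M amplifies small errors in the data.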