Introduction to CAPM, the Capital Asset Pricing Model


It can be difficult to find a clear statement of what the Capital Asset Pricing Model (henceforth CAPM) is. I’m not trying to do much more than provide that. In particular, I did not find the wiki article to be useful, even after acquiring a couple of recent books on the subject.

I own six references:

  • Sharpe, William F.; Investments, Prentice Hall, 1978; 0-13-504605-X.
  • Reilly, Frank K.; Investments, CBS College Publishing (The Dryden Press), 1980; 0-03-056712-2.
  • Grinold, Richard C. and Kahn, Ronald N.; Active Portfolio Management, McGraw-Hill, 2000; 0-07-024882-6.
  • Roman, Steven; Introduction to the Mathematics of Finance, Springer, 2004; 0-387-21364-3.
  • Benninga, Simon; Financial Modeling, 3rd ed., MIT, 2008; 0-262-02628-7.
  • Ruppert, David; Statistics and Data Analysis for Financial Engineering, Springer, 2011; 978-1-4419-7786-1.

There is more than one version of the CAPM… Roman (p. 62) tells me that “The major factor that turns Markowitz portfolio theory into capital market theory is the inclusion of a riskfree asset in the model…. generally regarded as the contribution of William Sharpe, for which he won the Nobel Prize…. the theory is sometimes referred to as the Sharpe-Lintner-Mossin (SLM) capital asset pricing model.”

Then Benninga (p. 265) told me about “Black’s zero-beta CAPM… in which the role of the risk-free asset is played by a portfolio with a zero beta with respect to the particular envelope portfolio y.” (We’ll come back to this, briefly.)
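
For orientation, here is how the two versions are usually written. This is only a sketch in generic textbook notation, which I am assuming rather than quoting from Roman or Benninga: r_f is the riskfree rate, M is the market portfolio, y is an envelope portfolio, and z is a portfolio with zero beta with respect to y.

```latex
% SLM (Sharpe-Lintner-Mossin) CAPM: the riskfree asset anchors the line
E(r_i) = r_f + \beta_i \,\bigl[ E(r_M) - r_f \bigr],
\qquad
\beta_i = \frac{\operatorname{Cov}(r_i, r_M)}{\operatorname{Var}(r_M)}

% Black's zero-beta CAPM: the zero-beta portfolio z replaces the riskfree asset
E(r_i) = E(r_z) + \beta_i \,\bigl[ E(r_y) - E(r_z) \bigr],
\qquad
\beta_i = \frac{\operatorname{Cov}(r_i, r_y)}{\operatorname{Var}(r_y)},
\qquad
\operatorname{Cov}(r_z, r_y) = 0
```

The two have the same shape; the only change is what plays the role of the intercept.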

Regression 1 – Inapplicability of the Fundamental Theorem


Aug 12. Edit: I’ve added a few remarks in one place. As usual, search on “edit”.

I want to look at the t-statistics for two regressions in particular. I will refresh our memories very soon, but what we had were two regressions that we could not readily decide between. Let’s go back and look at them.

Let me get the Hald data. I set the file path…

I set the usual uninformative names – I wouldn’t dare change them after all the time I’ve spent getting used to them!… and I might as well display the data matrix…


Regression 1 – Normality, and the Chi-square, t, and F distributions


(There are more drawings of the distributions under discussion, but they’re at the end of the post. This one, as you might guess from the “NormalDistribution[0,1]”, is a standard normal.)

Regression 1 – Assumptions and the error sum of squares

There’s one thing I didn’t work out in the previous post: the relationship between the error sum of squares and the variance of the u. We have already computed the variance of the e, that is,


What we want now is the expected value of the error sum of squares:


(I should perhaps remind us that e is, by convention, a column vector… so its transpose e’ is a row vector… so e’e is a scalar, equal to the dot product of e with itself… while ee’ is a square matrix. Vectors can be pretty handy for this kind of stuff.)

The expected value of the sum of squared errors is surprisingly complicated. Well, maybe I should just say it’s different from what we did in the last post… and that’s one reason I moved it to a post of its own.
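
As a reference point, here is a sketch of the standard textbook results in matrix form. The notation is my assumption and may not match the rest of the post: the model is y = Xβ + u with X an n×k matrix of full column rank, E(u) = 0, Var(u) = σ²I, and H = X(X'X)⁻¹X' is the hat matrix.

```latex
% Residuals are a linear function of the disturbances
e = (I - H)\,y = (I - H)\,u

% Variance of the residuals (I - H is symmetric and idempotent)
\operatorname{Var}(e) = E(e e') = \sigma^2 (I - H)

% Expected error sum of squares: e'e is a scalar, so it equals its own trace
E(e'e) = E\bigl[\operatorname{tr}(e e')\bigr]
       = \operatorname{tr}\bigl[E(e e')\bigr]
       = \sigma^2 \operatorname{tr}(I - H)
       = \sigma^2 (n - k)
```

The last step uses tr(H) = k, which is why dividing the error sum of squares by n − k, rather than n, gives an unbiased estimate of σ².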

Regression 1: Assumptions, Expectations, and Variances


We have spent a lot of time getting least-squares fits to data. We’ve also spent a lot of time looking at numbers which Mathematica can compute after we’ve gotten a fit. We didn’t have to use LinearModelFit… we could have used a simple Fit command instead.

But the Fit command just does the fit… it doesn’t even give us standard errors and t-statistics.

Getting those, however, requires that we make some additional assumptions about the fit. And that’s what I want to talk about, in this and – probably – three more posts.
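
To make the distinction concrete, here is a minimal sketch in Python rather than Mathematica (so it is not what Fit or LinearModelFit do internally). It fits a line and then layers the standard errors and t-statistics on top. The data are made up, the function name simple_ols is hypothetical, and the standard-error formulas assume the usual textbook conditions: uncorrelated disturbances with zero mean and a common variance.

```python
import math

def simple_ols(x, y):
    """Least-squares line y = b0 + b1*x, plus standard errors and t-statistics."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

    # The fit itself: this much needs no statistical assumptions.
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar

    # Everything below relies on the assumptions about the disturbances.
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(r * r for r in resid)
    s2 = sse / (n - 2)  # estimated error variance; two parameters were fitted
    se_b1 = math.sqrt(s2 / sxx)
    se_b0 = math.sqrt(s2 * (1.0 / n + xbar ** 2 / sxx))

    return {
        "b0": b0, "b1": b1, "sse": sse,
        "se_b0": se_b0, "se_b1": se_b1,
        "t_b0": b0 / se_b0, "t_b1": b1 / se_b1,
    }

# Made-up illustrative data (not the Hald data).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = simple_ols(x, y)
for name in ("b0", "b1", "se_b0", "se_b1", "t_b0", "t_b1"):
    print(name, fit[name])
```

The coefficients come from the geometry alone; the standard errors and t-statistics only mean something once we commit to assumptions about the disturbances.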

Let’s get started.

Regression 1: Secrets of the Correlation Coefficient


It was only on this round of regression studies that it really registered with me that the residuals e are correlated with the dependent variable y, but are uncorrelated with the fitted values yhat.

And it was only a couple of weeks ago that the more precise statements registered, and I decided to prove them. In fact, what we have are the following. Suppose we run a regression, with any number of variables, and we end up with some R^2.

  • If we fit a line to the residuals e as a function of the fitted values yhat, we will get a slope of zero.
  • If we fit a line to the residuals e as a function of the dependent variable y, we will get a slope of 1 – R^2.

We will see that we can rephrase those statements:

  • the correlation coefficient between e and yhat is zero;
  • the correlation coefficient between e and y is Sqrt[1-R^2].

What that means is that if we look at the residuals as a function of y, we should expect to see a significant slope – unless the R^2 is close to 1. If our purpose in drawing the graph is to look for structure in the residuals which might point to problems in our fit, well, we’ll almost always see such structure (a line of slope 1 - R^2), and it’s meaningless.

Look at e versus yhat, not e versus y.
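
Here is a small numerical check of those four statements. It is a sketch in Python rather than Mathematica, using made-up data and a single predictor to keep it short (the statements themselves hold for any number of variables, as long as the fit includes a constant). The helper names mean, slope and corr are my own.

```python
import math

def mean(v):
    return sum(v) / len(v)

def slope(x, y):
    """Slope of the least-squares line of y on x (with an intercept)."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    return sxy / sxx

def corr(x, y):
    """Ordinary (Pearson) correlation coefficient."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up data with a visibly imperfect fit.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.0, 3.5, 2.0, 5.5, 4.0, 7.0]

# Fit y on x, then form fitted values and residuals.
b1 = slope(x, y)
b0 = mean(y) - b1 * mean(x)
yhat = [b0 + b1 * a for a in x]
e = [b - f for b, f in zip(y, yhat)]

sse = sum(r * r for r in e)
sst = sum((b - mean(y)) ** 2 for b in y)
r2 = 1.0 - sse / sst

slope_e_on_yhat = slope(yhat, e)  # should be 0
slope_e_on_y = slope(y, e)        # should be 1 - R^2
corr_e_yhat = corr(yhat, e)       # should be 0
corr_e_y = corr(y, e)             # should be Sqrt[1 - R^2]

print("R^2 =", r2)
print("slope of e on yhat =", slope_e_on_yhat)
print("slope of e on y =", slope_e_on_y, " vs 1 - R^2 =", 1.0 - r2)
print("corr(e, y) =", corr_e_y, " vs Sqrt[1 - R^2] =", math.sqrt(1.0 - r2))
```

The zeros come out only up to floating-point rounding, so compare with a small tolerance rather than testing for exact equality.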

Regression 1 – Multicollinearity in Review

As I draft this, I plan to do four things in this post.

  1. Summarize the methods I’ve used to analyze multicollinearity.
  2. Suggest that multicollinearity is a continuum with no clear-cut boundaries.
  3. Summarize the conventional wisdom on its diagnosis and treatment.
  4. Flag significant points made in my posts.

Let me say up front that there is one more thing I know of that I want to learn about multicollinearity – but it won’t happen this time around. I would like to know what economists did to get around the multicollinearity involved in estimating production functions, such as the Cobb-Douglas.

Regression 1: ADM polynomials – 3 (Odds and Ends)

Edit Jan 29: a reference to the diary post of Jan 21 has been corrected to refer to Jan 14.

There are several things I want to show you, all related to our orthogonal polynomial fits.

  • Can we fit a 7th degree polynomial to our 8 data points? Yes.
  • We can do it using regression.
  • We can do it using Lagrange Interpolation.
  • Did Draper & Smith use the same orthogonalized data? Yes, but not normalized.
  • How did Draper & Smith get their values? They looked them up.
  • Were their values samples of Lagrange polynomials? No.
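
The first three points can be sketched as follows, in Python rather than Mathematica. Eight points with distinct x values determine exactly one polynomial of degree at most 7, so a regression on the powers 1, x, …, x^7 and Lagrange interpolation must give the same curve, with zero residuals. The y values below are made up (they are not the ADM data), and the function name lagrange_eval is hypothetical; I use exact fractions so that the check is not blurred by rounding.

```python
from fractions import Fraction

def lagrange_eval(xs, ys, t):
    """Evaluate at t the Lagrange interpolating polynomial through (xs[i], ys[i])."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = Fraction(yi)
        for j, xj in enumerate(xs):
            if j != i:
                # Basis factor: equals 1 at t = xi and 0 at t = xj.
                term *= Fraction(t - xj, xi - xj)
        total += term
    return total

# Eight half-integral x values, and made-up y values (NOT the ADM data).
xs = [Fraction(k, 2) for k in (-7, -5, -3, -1, 1, 3, 5, 7)]
ys = [3, 1, 4, 1, 5, 9, 2, 6]

# The degree-7 interpolant reproduces every data point exactly.
reproduced = [lagrange_eval(xs, ys, x) for x in xs]
print(reproduced == [Fraction(y) for y in ys])

# Between the data points it is free to wander, as high-degree fits do.
print(lagrange_eval(xs, ys, Fraction(0)))
```

An exact fit like this leaves no degrees of freedom for the error, which is why it is a curiosity rather than a model.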

The bottom line is that starting with half-integral values of x, all I need is the Orthogonalize command, to apply Gram-Schmidt to the powers of x. I did that here. I don’t need to look up a set of equations or a pre-computed table of orthogonal vectors. Furthermore, I can handle arbitrary data that is not equally spaced.
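
For anyone without Mathematica, here is a sketch of the same idea in Python. It is a stand-in for Orthogonalize, not a description of how that command is implemented: a plain Gram-Schmidt pass over the powers of x, evaluated at the data points. The function names dot and gram_schmidt are my own, and I stop at the cubic just to keep the output small.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent vectors, in order."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            # Remove the component along each earlier basis vector.
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w])
    return basis

# Half-integral x values for 8 equally spaced observations.
x = [-3.5, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5]

# Columns 1, x, x^2, x^3 evaluated at the data points.
powers = [[xi ** k for xi in x] for k in range(4)]
q = gram_schmidt(powers)

# The Gram matrix of the result should be the identity.
gram = [[dot(qi, qj) for qj in q] for qi in q]
for row in gram:
    print([round(g, 12) for g in row])
```

Nothing in gram_schmidt depends on the spacing of x, so the same call works for unequally spaced data, provided the x values are distinct enough that the powers stay linearly independent.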


Regression 1: ADM polynomials – 2

Let’s look again at a polynomial fit for our small set of annual data. We started this in the previous technical post.

What we used last time was

That is, I had divided the year by 1000… because, as messy as our results were, they would have been a little worse using the years themselves.

But there’s a simple transformation that we ought to try – and it will have a nice side effect.

Just center the data. Start with the years themselves, and subtract the mean:

I’ll observe that if we wanted to work with integers, we could just multiply by 2. In either case, our new x is not a unit vector.

Oh, the nice side effect? Our centered data is orthogonal to a constant vector.

Let’s see what happens.
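
A minimal sketch of the centering step, in Python rather than Mathematica. The years are an assumption on my part: I use eight consecutive years with a made-up starting year, which is enough to show the half-integral values, the multiply-by-2 remark, and the orthogonality to a constant vector.

```python
# Hypothetical years: eight consecutive values; the starting year is made up.
years = [1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993]

mean_year = sum(years) / len(years)
centered = [yr - mean_year for yr in years]  # half-integral values
doubled = [int(2 * c) for c in centered]     # odd integers, if we prefer integers

ones = [1.0] * len(years)                    # the constant vector

# Orthogonal to the constant vector: the dot product is the sum, which is 0.
dot_with_ones = sum(c * o for c, o in zip(centered, ones))

# Not a unit vector: the squared length is far from 1.
squared_length = sum(c * c for c in centered)

print(centered)
print(doubled)
print(dot_with_ones, squared_length)
```

Centering makes x orthogonal to the constant column in any data set, equally spaced or not; the half-integral values are just a consequence of having an even number of consecutive years.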

Regression 1: Archer Daniels Midland (polynomials) – 1

Now I want to illustrate another problem, this time with the powers of x. The following comes from Draper & Smith, p. 463, Archer Daniels Midland data; it may be in a file, but – with only 8 observations – it was easier to type the data in. Heck, I didn’t even look to see if it was all in some file somewhere.

raw data

I have chosen to divide the years by 1000; in the next post I will do something else.

The output of the following command is the given y values… I typed integers and then divided by 100 once rather than type decimal points.
