Wavelets: semi-orthogonal, from linear splines

Edit: 13 July, just one remark added. see “edit” below.

How are things going? As I said yesterday in “happenings”, the simplest answer is: right now I’m just thrilled when I can reproduce a drawing in a book, even if I don’t understand why the method works or where the function came from.

This is post that I ended up with when I started out to write a “happenings” post yesterday. I promised you a picture. In fact, I’ll give you a mother wavelet for the linear spline scaling function. (That link occurs a few more times. What can I say? That’s what I’m building on, in more than one way.)

You may recall that I have shown you two ways to approximate a scaling function. The one I do not understand applies convolution and downsampling repeatedly. The one I do understand is the dyadic expansion, and I’ve been using it ever since I worked it out, in preference to the other method.

Well. I have now seen a scaling function which is infinite at all the dyadic points, so the dyadic expansion hasn’t got a prayer of working! But the convolution and downsampling algorithm gives me a drawing which seems to match the book.

I do not yet understand how that scaling function was determined, but at least I have a picture. (Not the picture you’ll see later in this post. Different function, different picture; soon but not now.)

I also have a picture of the corresponding mother wavelet. But this scaling function and mother wavelet are half of a biorthogonal set of four functions — and I don’t yet understand how to get them. (Biorthogonal means that I have a non-orthonormal basis and its reciprocal basis.) Worse, although I’ve seen pictures of the other pair of functions, I don’t know enough about them to even draw them!

But I expect to. Most of the time. Sometimes in the evening I wonder if I’m in over my head, but the feeling is usually gone when I sit down to do mathematics the next morning.

On the plus side, I have read about these new functions while looking at orthogonality — more precisely, while looking at relaxing the orthogonality assumptions. That is, while looking at biorthogonal and semi-orthogonal wavelets. (Okay, that last is not standard terminology, as far as I know: I think the second case is called “a semi-orthogonal multi-resolution analysis” and “pre-wavelets“. It just seems natural to me to apply the term “semi-orthogonal” to the wavelets, especially when I’m contrasting them with orthogonal or biorthogonal wavelets.)

Here is an overview of what is going on. I’ve said at least some of this in more detail before.

A scaling function \varphi(t) and its integer translates define a space V_0\ . Then the function \varphi(2t) and its integer translates define a space V_1\ . It is a major assumption that V_0 is a subspace of V_1\ (and this, combined with the definition of V_1\ , gives us the dilation equation). Then we define the space W_0\ as the difference between V_0\ and V_1\ (i.e. as the complement of V_0\ in V_1\ ). W_0\ is the space of the mother wavelet. We have

V_1 = V_0 + W_0\ .

If \varphi(t) and its integer translates are an orthonormal basis for V_0\ , then W_0\ is in fact the orthogonal complement of V_0\ ; the direct sum is an orthogonal direct sum

V_1 = V_0 \oplus W_0\

and we may find a function (the mother wavelet) whose integer translates form an orthonormal basis for W_0\ .

(Incidentally, in most of my books, people appear to use \oplus\ for the direct sum whether or not it is orthogonal. That is, you can’t tell usually from the written decomposition whether or not it is orthogonal. You have been warned.)

(Another aside…. To see an example of a non-orthogonal direct sum decomposition, take R^2\ , i.e. the xy-plane, and take the non-orthogonal basis (1,0) and (1,1). Then the x-axis and the line y=x are non-orthogonal 1D subspaces, and their direct sum is the xy-plane. One possible orthogonal direct sum, in contrast, is the x-axis and the y-axis.)

But what if \varphi(t) and its integer translates are a basis for V_0\ , but not orthonormal? More specifically, what if they are not orthogonal? (If they are orthogonal, simply changing their sizes will make them orthonormal.)

You should be thinking of linear splines, those triangles that form a basis for V_0\ but not an orthogonal one.

Well, we can still find an orthogonal direct sum, and we can even find a mother wavelet whose integer translates form a basis for W_0\ .

But just as the basis for V_0\ is not orthogonal, neither is the basis for W_0\ . Still, because the direct sum is orthogonal, every element of W_0\ is orthogonal to every element of V_0\ . They’re just not orthogonal to thir own integer translates: \psi(t)\ may very well be orthogonal to many or most of its integer translates, but it is not orthogonal to all of them.

You know what? Pictures are good, so here are two pictures.

The linear spline scaling function, as we have seen, has three h coefficients, namely:

\left\{\frac{1}{2 \sqrt{2}},\frac{1}{\sqrt{2}},\frac{1}{2 \sqrt{2}}\right\}\

… and it looks like this:

july 12 scaling

Now, if we take as our mother wavelet a function whose g coefficients are (think of me as Moses waving stone tablets in your face saying, “God gave me these.”) …

\left\{\frac{1}{24 \sqrt{2}},-\frac{1}{4 \sqrt{2}},\frac{5}{12   \sqrt{2}},-\frac{1}{4 \sqrt{2}},\frac{1}{24 \sqrt{2}}\right\}\

…then we get:

july 12 mother

The most significant feature may very well be that this mother wavelet has 5 coefficients while its scaling function has 3 coefficients.

The second most important feature may be that this mother wavelet is a spline, like the scaling function. Some people know a lot about splines, and some of them would rather have splines than orthonormal bases.

That mother wavelet is orthogonal to all of the integer translates of the linear spline. It is not orthogonal to its own translates by \pm1\ or \pm2\ , just as the linear spline is not orthogonal to its own translates by \pm1\ .

This is the decomposition which is called a “semi-orthogonal multi-resolution analysis”, and that mother wavelet is called a “pre-wavelet”. So far, Strang & Nguyen (Strang, Gilbert; Nguyen, Truong.Wavelets and Filter Banks.Wellesley-Cambridge Press, 1997 (revised edition).ISBN 0 9614088 7 1) is the only book in which I have found this. Oh, this is not an example of biorthogonal wavelets; for that, I need a second scaling function and two mother wavelets.

(edit: 13 July. I should remark that we’re not out of the woods in the semi-orthogonal case. Because our bases for V_0 and W_0 are not orthogonal, it seems to me that we will still need to find their reciprocal bases, for computing components of vectors.)

This is the function whose existence I believed in but couldn’t find. Well, now I’ve been given it but I don’t understand where those five coefficients came from.

Yet.

As a reminder, the mother wavelet was computed from its g coefficients, using the analog of the dilation equation:

july 12 defn mother

Please note that the scaling function was computed, as before, using the dyadic expansion; therefore the mother wavelet is computed only at dyadic points. In particular, neither of these is the terrible function I referred to, but have not yet shown you, which is infinite at all the dyadic points, and therefore must be computed by a different method.

Oh, a point is “dyadic” if it’s an integer divided by a power of 2; that is, all those points at which I have been computing scaling functions.

Let me show you some quick calculations of integrals. As usual, I am going to calculate the finite sum of areas of rectangles:

\sum_i { f(x_i)\ dx}\

with dx = 1/128, and f(x) = a product of two functions. Just for simplicity, I let x range from -6 to 6, because some of my functions are shifted. NOTE that the shift is given by k, and that value is set in the command.

First, let’s approximate \int\ \varphi(x)\ \psi(x)\ dx\ . It should be 0, because the mother wavelet is orthogonal to the scaling function:

july 12 orth 1

Just to be clear: that calculation had f(x) = \varphi(x)\ \psi(x)\ .

The mother wavelet is also orthogonal to translates of the scaling function, so we expect \int\ \varphi(x-1)\ \psi(x)\ dx = 0\ . The approximating sum is…

july 12 orth 2

and for \int\ \varphi(x-2)\ \psi(x)\ dx = 0\ , I get:

july 12 orth 3

Those three were very good (and identical!) approximations to 0. If we now approximate \int\ \varphi(x-3)\ \psi(x)\ dx = 0\ , we will get exactly zero, because the functions do not overlap at all; we have shifted the scaling function quite far enough.

july 12 orth 4

Now look at the inner product of two translates of wavelets. Just as the scaling function is not orthogonal to overlapping translates, the mother wavelet is not orthogonal to its overlapping translates.

Here, as one example, is (an approximation of) \int\ \psi(x-1)\ \psi(x)\ dx\

july 12 not orth

We’ve seen numbers that approximate of zero, and that isn’t close.

What about scale? That still works: a wavelet at one scale is orthogonal to wavelets at a different scale. Here, as one example, is the approximation of \int\ \psi(2x-1)\ \psi(x)\ dx\ .

july 12 orth 5

I expect that the next wavelets post will be a clean summary of multi-resolution analysis for orthogonal wavelets. I’ve been flailing a bit, and I think I owe you some clear-cut statements of what we can get.

I was all set to put out a post about consequences of orthogonality, a couple of weeks ago, until I realized that the two ideas (orthogonality of subspaces V,W and orthogonality of functions) needed to be distinquished at a deeper level. Not just conceptually, as I had, but in terms of their consequences.

This example, which has orthogonal subspaces V,W but has neither orthogonal translates of scaling functions nor orthogonal translates of wavelets, makes me very happy. I know we can break the link between the two ideas of orthogonality, and in fact I now have both the scaling function and the mother wavelet in my hands, so I can compute to my heart’s content and see what is no longer true in this case.

(There’s nothing special about linear splines. We can do this for higher order splines, too. Well, okay, I still have to figure out how to derive the g coefficients for the mother wavelet. And I’m pretty sure there were other possible solutions for the linear spline, with more than 5 coefficients, so there are probably an infinite number of solutions for any spline.)

I am now willing to return to the standard presentation of multi-resolution analysis of orthogonal wavelets, which assumes that both forms of orthogonality hold. I now have sufficient context for that standard presentation. That is enough to let me move on; I just needed to know their place in the universe. And, as I say, after all the confusion, I owe you an unambiguous summary of orthogonal wavelets.

Happenings Jul 11

I sat down to write a happenings post this morning, and found myself looking at a wavelets post when I was finished with it.

Since the first draft turned into a very nice technical post, let me try this again. When I began, what else did I think I might write about?

I certainly do not mind that I ended up with a post different from what I intended. One of the secrets to writing is: don’t edit yourself while it’s happening. Of course, drafts need to be edited — but that is a different process. By giving myself free rein, I am often surprised at how well things turn out.

There is also something to be said for a piece of technical writing which flows, as opposed to one where I have to look up a technical detail every five minutes. If it flows, it makes some sense in my head. If I’m looking up lots and lots of details, I can’t really be said to know it.

My major focus since the last happenings post has been wavelets. I am struggling.

How are things going? The simplest answer is: I’m happy to just reproduce a drawing in a book, even if I don’t understand why the method works or where the function came from.

The next wavelet post — the one I just wrote — will show you a wonderful function that came from I don’t know where.

Sometimes I get so confounded frustrated that I bail on mathematics. Then sometimes I pick up that waste-of-time computer game and conquer the galaxy; if my frustration level isn’t that bad, I pick up a foreign language.

Sometimes, fortunately, I just go look at other mathematics. But that works best in the morning, when the day before didn’t go too well.

I picked up modules. We can think of them as “vector spaces” where the scalars come from a ring rather than from a field. One of the major properties we lose is the very idea of “dimension” (so they certainly aren’t actually vector spaces). Anyway, I happen never to have studied them. And, as it happens, I didn’t get much further than choosing a book to study them out of. I am being distracted by other mathematics.

I also picked up fiber bundles. They generalize the tangent bundle, which is the set of all tangent spaces to a manifold. What you should visualize is a circle with a whole lot of tangent lines drawn on it. And if you draw enough of them, you can dispense with the circle itself, which suggests that we can study a manifold by studying associated fiber bundles.

In contrast to wavelets, where I have to go through all 20 of my books on a regular basis looking for specific topics, for fiber bundles I grabbed about 10 of my 140 or so differential geometry books, and whittled that down to three. I may have missed a couple of excellent texts, but these three look to suffice.

I wasn’t looking for it, but one of them had a very nice discussion of immersions, submersions, embeddings, and related things. These distinctions are crucial when we study submanifolds. That’s another one of those things I never had to learn. So I’ve been distracted from fiber bundles and am looking at these.

I had gone through an undergraduate book about field extensions, which culminated in proofs that the three classical construction problems (square the circle, trisect an arbitrary angle, and double the cube) simply cannot be done.

Interestingly, this specific book — far more than most — lends itself to the following: I should try to briefly summarize the book. For most books, one can barely hope to briefly summarize one chapter. I still hope to write such a summary for the construction problems, but I have found one elsewhere, so it is no longer necessary that I write one myself. But it would be such a good thing for me to do, for my own understanding…. (And yes, I have in mind something a bit more detailed than “Don’t do it! I already tried it. It doesn’t work!”)

I had gotten to that book from finite fields. I tried to pick up finite fields again, but got distracted by the other thing that field extensions do for us: they show us that we cannot solve the general polynomial equation, in radicals like the quadratic equation, for a degree greater than four.

I had also picked up geometric algebra, which combines an inner product with an “outer product” (e.g. the vector cross product); and Lie algebra, which is related, and which led to the eightfold way and the classification of elementary particles.

I didn’t get very far. It’s as though what I know about them is archived on a backup hard drive, and all the work I had done on them looks very mysterious when I pick it up again. I’m not talking about mysterious textbooks — I’m talking about mysterious Mathematica notebooks that I myself wrote!

Well, it could be a very good thing, looking at what I did, trying to figure out what I knew back then. Here, instead of summarizing a book, I think I want to find a great example, one that embodies, as much as possible, the knowledge I want to recover.

Hmm. On second, thought, perhaps I should call that a “thorough example”. I have something different in mind for a “great example”.

As I said two weeks ago, I am always thrilled to find “a great question”, one that focuses one’s attention on new material. The three classical construction problems, while great, suffer from needing an entire book for their answer. Still, they help a lot in keeping me motivated.

But not everything great is a question. I know of at least one “great answer”.

Why should I study matrix algebra? Because the general solution to the problem of doing a least squares fit can only reasonably be expressed using matrices.

(Look, I can think of a lot of beautiful things that linear algebra and matrices give me. But if a student, or non-mathematician, is at the stage of wondering, “Why should we study this?”, I’m perfectly willing to show them something they can’t do without matrices. But the simple truth is, beauty aside, I really like being able to say, “Because without it, you can’t do this thing here.)

And maybe not everything great is a question or an answer. There ought to be “great examples”, too. One could argue that every counterexample is great, but I’m looking for more. Or less. “Counterexamples” are very important, probably deserving of more emphasis than they get, but I want counterexamples which are special.

OK, here’s a possibility, one that I’ve mentioned briefly before. I think that any discussion of “unique factorization domains” should begin by exhibiting a number system for which unique factorization fails: working in the set of numbers

a + b\ \sqrt{-5}\ , with a and b integers,

we have

21 = 3*7 = (1 + 2\ \sqrt{-5})\ (1 - 2\ \sqrt{-5})\

which would be no big deal (after all, 12 can be factored in more than one way, 3*4 and 2*6), except that in both factorizations of 21, the factors are irreducible. Not one of 3, 7, (1 \pm 2\ \sqrt{-5})\ can be factored.

(If unique factorization holds, then “prime” is equivalent to “irreducible”; one could argue that it is the failure of that equivalence that allows unique factorization to fail.)

Well, maybe that’s a great example, not just a run-of-the-mill counterexample; but let me keep thinking about it.

And I should look again to see if it is in fact a thorough example: is there anything in the study of non-UFDs that is not in that example?

I may have just convinced myself that I’m interested in finding thorough examples. Can an example count as “great” if it’s not “thorough”?

Let me start looking at my examples with a more critical eye….

Happenings Jun 27

A friend of mine sent me a link (to a link) last week. The second link is an essay on the teaching of mathematics. It breaks my heart, both for its truth and its beauty.

I think the author is dead on about what’s wrong in the teaching of mathematics — and not just in K-12. He is also absolutely on the mark when he describes what mathematics means to me.

It is very well written, and I think it is very worthwhile reading.

There was an interesting question and answer out on the sci.math newsgroup (and probably sci.physics). The question was posed in John Baez’ June 20 “This Week’s Finds in Mathematical Physics (Week 276)” and answered in the very next post in the thread, by Phillip Helbig.

Here is the question. What would happen to the earth if Betelgeuse went supernova?

We can get a quick answer far more easily than I thought — and apparently, more easily than the original poster realized. We don’t have to compute the energy released; all we need is absolute and apparent stellar magnitudes, at least for starters. (Eventually we might want to know more, but the first answer gives us a good idea of what would be going on.)

In the end, I am reminded of just how large interstellar distances are.

Now, I am not going to work out the answer for you. But my reason may not be what you expect. It isn’t that someone else worked it out, and it isn’t even that “you should do this yourself”. In fact, when I worked it out I got a slightly different answer from the one on the newsgroup.

No. I think this qualifies as a great pedagogical question.

This is an absolutely wonderful question to have in mind when you go look up stellar magnitudes. Whether you are cracking open an old astronomy textbook to look at something now forgotten, or whether stellar magnitudes are completely new to you — the question of how bright a supernova would be in our sky could go a long way in organizing the material as you reread or read it.

So I’m not going to work out the answer, because I don’t want to deprive you of the experience of reading about stellar magnitudes with that question in mind.

I do not know very many questions like that. I latch on to them when I find them, but in my experience they are few and far between.

The very first such question that I remember is about human blood types. Somewhere, sometime, someone showed me an introduction to the ABO human blood groups — specifically an introduction which said that when you had finished the chapter you would know three of these, four of those, three of something else, and so on.

I think that is ridiculous, to enumerate the number of alternatives or examples of something one would learn.

If you want to go read about the ABO blood groups, then let me ask you a question.

Can your mother donate blood to you?

If anyone were guaranteed to be able to donate blood to you, you would think your mother could.

To answer that question, you need to understand certain things. The question itself provides a wonderful guide to study.

I would welcome additional questions like those two.

Now let me find out what mathematics my inner child wants to do, before I get back to wavelets, and the minor mathematical distractions I’m using for now when the wavelets get confusing or discouraging.

Wavelet properties: orthogonality & counterexample

orthogonality

Edit 28 Jun: I believe that once I know that \psi(t)\ is a wavelet (i.e. an element of W_0\ ), then so are all of its integer translates. This remark occurs once later, as well as here.

Let me speak in general terms first.

There seem to be two properties subsumed under the rubric “orthogonality”.
Edit June 29. “Seem” is the correct word. I am not ready to make this precise, but these two apparently different properties are very close, and possibly the same. I think I can say that if the scaling function and its translates are not orthogonal, then W_0 is not an orthogonal complement to V_0\ . To put that another way, we can still define a direct sum decomposition

V_1 = V_0 + W_0

but V_0\ and W_0 are not orthogonal subspaces. And so, when I convinced myself that “if \psi(t)\ is a wavelet (i.e. an element of W_0\ ), then so are all of its integer translates”, I was right… because I was assuming W_0\ to be orthogonal to V_0\ , and that’s one heck of an assumption.

I am looking into non-orthogonal subspaces, and into biorthogonal wavelets.

One is that we have, for example, the orthogonal direct sum decompositions

V_1 = V_0 \oplus W_0\

V_2 = V_1 \oplus W_1\

V_2 = V_0 \oplus W_0 \oplus W_1\ .

We are using the L2 inner product, \int f(t) g(t) \, dt\ . The functions f and g are orthogonal if \int f(t) g(t) \, dt = 0\ .

The first decomposition tells us that all the functions in V_0\ are orthogonal to all the functions in W_0\ . The second decomposition tells us the same thing about V_1\ and W_1\ : all the functions in V_1\ are orthogonal to all the functions in W_1\ .

The third decomposition tells us more: all the functions in V_0\ are orthogonal to all the functions in W_1\ ; and all the functions in W_0\ are othogonal to all the functions in W_1\ .

In particular, the scaling function \varphi and its integer translates which defined V_0\ are orthogonal to all the wavelets in W_0\ and all the wavelets in W_1\ .

In orther words, for the scaling function \varphi we should have

\int \varphi(t)\ \phi(t) \, dt\ = 0\ .

\int \varphi(t-k)\ \phi(t) \, dt\ = 0\ .

where \phi\ is any function (any wavelet) in any space W_i\ . (Yes, in particular this is true if \phi is the mother wavelet \psi\ ; we must hesitate to talk about the arbitrary translates or the integer translates of \psi\ until we know that they are in W_0\ .) Edit 28 Jun: I believe that once I know that \psi(t)\ is a wavelet (i.e. an element of W_0\ ), then so are all of its integer translates. We can stop hesitating.

Similarly, we would have, for example,

\int \varphi(2t)\ \phi(t) \, dt\ = 0\ .

where \phi\ is now restricted to any function (i.e. any wavelet) in W_i\ except W_0\ . Why? \varphi(2t) \in V_1\ , which is orthogonal to W_1 \text{ or higher}\ but not necessarily to W_0\ .

Let me point out that one of the challenges here is that we don’t know much about the spaces W_0\ ; in particular, we don’t know yet if \psi(t) \in W_0\ implies that \psi(t-1) \in W_0\ .

The second property is that we may have that the scaling function and its integer translates are orthogonal:

\int \varphi(t)\ \varphi(t-k) \, dt\ = P\ \delta(k) \ .

where \delta(k) \ is the Kronecker delta, \delta(0) = 1\ and \delta(k) = 0\ for all k ≠ 0.

There I do not need the distribution, the “Dirac delta function”, whose defining property is the inner product \int \delta(t)\ f(t) \, dt\  = f(0)\ . I just want to write one equation instead of these two:

\int \varphi(t)\ \varphi(t) \, dt\ = P\ .

\int \varphi(t)\ \varphi(t-k) \, dt\ = 0 \ , for k ≠ 0.

However we write it, this is a property of major importance. We can deduce a whole lot more about wavelets in general, about a suitable mother wavelet, and about integer translations of wavelets. In particular, if the scaling function is orthogonal to its integer translates, then integer translates of the mother wavelet are also orthogonal to each other, and we speak of orthogonal wavelets.

This property does hold, of course, for the Haar system and for all the Daubechies scaling functions. (Technically, that’s redundant: we will eventually show that the Haar system is in fact D2, the Daubechies wavelets defined by 2 nonzero h’s.)

The reason I emphasize that there are two ideas of orthogonality here –orthogonal direct sums and integer translates of the scaling function — is that I sometimes get the impression – and it may be a mistaken impression – that people are using the later to prove consequences of the former. But the orthogonal direct sums are more general.

Recall that we can — and have — used non-orthonormal (no, I didn’t say non-orthogonal) bases in finite dimensional vector spaces. These lead us quickly to the definition of a reciprocal basis: we take dot products of a vector with the reciprocal basis to compute the basis-coefficients of that vector. (As I post this, at least, “reciprocal basis” is in the tag cloud, so you can look there.)

Well, we can do the same thing with wavelets. If we have a basis of wavelets but it is not orthonormal, then we need to construct the reciprocal basis. The pair, basis of wavelets and reciprocal basis, is called a bi-orthogonal system. Just a different name for something we’ve seen before.

(If we have an orthogonal basis which is not orthonormal, the reciprocal basis differs from the original only by scale factors on each vector but the directions are the same; this means we just end up with scale factors associated with the dot-products. I will show you this, below. But what it means is that, in practice, a merely orthogonal basis is almost as good as an orthonormal one.)

I am looking forward to playing with a bi-orthogonal system, so that I can look for the properties which depend on the ortogonal direct sum decomposition rather than on the orthogonality of integer translates of the scaling function.

If the integer translates are orthogonal but not orthonormal, we can fix that just by dividing \varphi\ by its norm \sqrt{P}\ , where P is defined by

\int \varphi(t)\ \varphi(t) \, dt\ = P\ .

digression on Fourier

Suppose we have a function which is precisely a finite Fourier series, say,

f(x) = 3 \cos (2 x)+2 \sin (x)\

That seems unambiguous and straightforward: our f(x) can be written exactly using sin(x) and cos(x). But, with the L2 inner product on {[0,\ 2\pi]}\

\int_0^{2 \pi } f(t)\ g(t) \, dt\

and the associated norm

||g|| = \sqrt{\int_0^{2 \pi } g(t)\ g(t) \, dt}\

we have that the set {1, cos[nx], sin[nx]} is orthogonal but not orthonormal. Each of sin(nx) and cos(nx) has norm \sqrt{\pi}\ (because the integral of the square is \pi\ )…

trig norms

and the constant function 1 (= cos (0x), if you will) has norm \sqrt{2}\ \sqrt{\pi}\ .

const norm

The orthonormal basis is constructed by dividing each of these functions by its norm; the reciprocal basis is constructed by dividing each of these functions by the square of its norm; equivalently, the reciprocal basis is constructed by dividing each of the orthonormal functions by the norm of the original function.

However we carry it out, the end result is that the set

\{\frac{1}{\sqrt{2\ \pi}}, \frac{cos[nx]}{\sqrt{\pi}}, \frac{sin[nx]}{\sqrt{\pi}}\}\

is an orthonormal basis; and the set

\{\frac{1}{2\ \pi}, \frac{cos[nx]}{\pi}, \frac{sin[nx]}{\pi}\}\

is the basis reciprocal to \{1, cos[nx], sin[nx]\}\ .

In this form…

f(x) = 3 \cos (2 x)+2 \sin (x)\

f(x) is given in terms of the orthogonal but non-orthonormal basis. We can confirm that its components {3, 2} wrt that basis can be computed as the dot products of f with the reciprocal basis. Only two terms are nonzero:

get coeffs

We are more likely to have simply been told that the coefficients are to be found using

\frac{1}{\pi}\ \int_0^{2 \pi } sin(x)\ f(x) \, dt\

and

\frac{1}{\pi}\ \int_0^{2 \pi } cos(2x)\ f(x) \, dt\ .

Nothing about a reciprocal basis, just a recipe with a god-given factor of \frac{1}{\pi}\ in front of the integrals. Conceptually, those factors belong inside the integrals, under the trig functions. But it’s the same calculation.

If, on the other hand, we had constructed the orthonormal basis \{\frac{1}{\sqrt{2\ \pi}}, \frac{cos[nx]}{\sqrt{\pi}}, \frac{sin[nx]}{\sqrt{\pi}}\}\ , then the coefficients wrt the orthonormal basis are

get 2nd coeffs

which may look weird until I remind you that we are implicitly writing

f(x) = (3\ \sqrt{\pi})\  \frac{\cos (2 x)}{\sqrt{\pi}}+(2\ \sqrt{\pi})\  \frac{\sin (x)}{\sqrt{\pi}}\

which is exactly our function f.

The point I am belaboring – perhaps excessively – is that most of us have been coping with an orthogonal but non-orthonormal basis most of our lives, and constructing the reciprocal basis in such a case amounts to just sticking a scaling factor in front of the dot product calculation for the components. We get the components without explicitly desribing the reciprocal basis.

We do much the same for wavelets and the scaling functions: so long as they are orthogonal to their integer translates, we cope with their norms not being 1.

a counterexample

Since I haven’t yet itemized the consequences of orthogonality under integer translation, I won’t show you that the following scaling function violates them. But I want to put this example out here, because it does satisfy all 6 consequences of the dilation equation; but it is not orthogonal under integer translation, and it will not satisfy the consequences of orthogonal integer translates.

And yet, because we have clearly defined spaces V_0\ , V_1\ , etc., we should have orthogonal direct sum decompositions.

Here it is. Just a triangle. You can also call it a linear spline. (And higher-order splines are also counter-examples.)

triangle

So, we can define the space V_0\ as the span of \varphi(t)\ and its integer translates \varphi(t-k)\ .

NOTE that because the triangle \varphi(t)\ is positive on (0,2), it cannot be orthogonal to two of its integer translates, namely \varphi(t-1)\ and \varphi(t+1)\ . They overlap and there are no negative values to cancel things out.

That is, it is false that \varphi(t)\ is orthogonal to all of its integer translates. All but two is two too few.

For example, we can look at \varphi(t)\ (black) and \varphi(t-1)\ (red) and their intersection (yellow)…

intersection

That the area under the intersection is positive tells us that these functions are not orthogonal.. The inner product, however, is not the area. We can estimate the inner product by a sum…

approx ip

I should draw the product \varphi(t)\ \varphi(t-1)\

exact ip pic

but of course we can compute the inner product exactly, too. Since the functions are linear, their product is quadratic (where its not zero) and the integral is cubic. In fact, it’s

exact ip calc

Anyway, we take that triangular scaling function and all its integer translates to be V_0\ .

We can define V_1\ as the span of the set \{\varphi(2t-k)\}\ . Those are just narrower triangles of height 1, e.g \{\varphi(2t)\}\ looks like….

v1 triangle

The really big question is: is V_0\ a subspace of V_1\ , V_0 \subset V_1\ ?

The answer is yes. (Otherwise this wouldn’t be a counterexample!) We have that

\varphi(t) = \frac{1}{2}\ \varphi(2t) +  \varphi(2t-1) +  \frac{1}{2}\ \varphi(2t-2)\ .

I can show it to you (the command shows that I am adding three functions in V_1 to get my scaling function in V_0\ :

sum of triangles

Now, that’s our dilation equation, except that we don’t have the factor of \sqrt{2}\ . To put that another way, if the h’s are

\{1/2,1,1/2\}\

then their sum is 2.

To put that a third way, right now we have A = 1.

No big deal; to change the sum we multiply and divide by \sqrt{2}\ . This gives us our usual dilation equation

\varphi(t) = \sum_{n} h(n)\ \sqrt{2}\ \varphi(2t-n)\ ,

with new h’s

\{\frac{1}{2 \sqrt{2}},\frac{1}{\sqrt{2}},\frac{1}{2 \sqrt{2}}\}\ .

Now, we have 6 properties to check…

  • The sum of the h’s = 2/A.
  • \varphi(t) = \sum_n {h(n)\ A\ \ \varphi(2\ t - n)}\ .
  • E = \int\ \varphi(t)\ dt \ .
  • The sum of the even h’s = the sum of the odd h’s.
  • \sum_k { \varphi(\frac{k}{2^j})} = 2^j \text{(for E = 1)}
  • \sum_k { \varphi(t+k)} = 1 \text{(for E = 1)}\

And I have been consistently referring to even h’s, for example, when I really mean h’s with even index, h(2n).

We have (one) set A = \sqrt{2}\ and (two) the sum of the h’s is \sqrt{2}\ .

There is only one odd h, h[1], and it’s \frac{1}{\sqrt{2}}\ , and each of the two even h’s is half that value, so we do have (three) the sum of the even h’s equals the sum of the odd h’s. Now is a good time to remark that the number of h’s is not even.

What about E = \int\ \varphi(t)\ dt \ ?

Our scaling function is a triangle of base 2 and height 1, so (four) its area is 1, so E = 1.

That triangle is continuous, so we should have (five) the generalized partition of unity

\sum_k { \varphi(t+k)} = 1 \text{(for E = 1)}\

Let’s try a few. Here are t = 1/2, 1/4, -3/8, 31/16. This sample is not a proof, but it’s all I personally need for now.

generalized partition

And since E = 1 we should have (six) the dyadic sum

\sum_k { \varphi(\frac{k}{2^j})} = 2^j\ .

Let’s try the first few.

partition

Remarks

Those linear splines and their translates do provide us with spaces V_0\ , V_1\ , and so on. They satisfy the 6 properties I expect them to.

But I have no idea, so far, how to find a function in W_0\ . There are two obvious candidates — orthogonal to the scaling function — but they fail to be orthogonal to all of its integer translates.

This is not a big deal. We’re going to warp those splines to hell and back, I think. Or maybe it will be the scaling function. I’m not sure yet, but something will be transmogrified.

There is a way to get an orthonormal basis from the set of splines: the result is called Battle-Lemarie wavelets. In contrast to everything we’ve seen, however, they do not have finite support (equivalently, they do not have a finite number of h’s) — but they are useful nevertheless.

So, we will see splines again.

Wavelet Properties: one more from the dilation equation

Introduction

I want to add one more property to the previous post. This is long enough that I do not want to insert it as an edit. It’s amazing that I ever considered even for a moment to insert it as an edit!

Just as we deduce what is called a partition of unity \sum_k { \varphi(k)} = 1 from the dyadic sum

\sum_k { \varphi(\frac{k}{2^j})} = 2^j

(by setting j = 0)

we can get a more general equation, if \varphi is continuous. We can show that

\sum_k { \varphi(t-k)} = 1\ ,

where the sum, as before, is over all integer k. (The proof uses the previous dyadic sum and continuity to say something about non-integer values.) This may be called a generalized partition of unity.

That’s how Burrus et al. write it, but I find it useful to change the sign of k, and write

\sum_k { \varphi(t+k)} = 1\ ,

I may sometimes need negative values of k, but I like having t and k going in the same direction: increasing either of them is moving to the right on the number line.

For the Haar system, since the support is [0,1), the sum always reduces to one nonzero term, k = 0. For k ≠ 0, the value t+k is always outside the support interval. And since \varphi(t) = 1 for t \in [0,1)\ , we understand that the equation is true.

Just in case I’ve never said it, or you don’t recognize it, the support of a function f is the part of its domain, the set of x, for which f(x) is nonzero. In particular, to say that a function defined on the real numbers has compact support means that it is zero outside of a compact (i.e., closed and bounded) interval. To say that the support of a function is the interval [0,3) means that the function is zero everywhere else outside that interval.

For the Daubechies D4, with support [0,3), the sum always reduces to two nonzero terms, but not necessarily the same 2. We always have a term for k = 0. For t \in [0,1)\ we have k = 1,2. Fort \in [1,2)\ , we can have k = \pm 1\ . For t \in [2,3)\ , we have k = -2,-1.

Computationally speaking, I might as well just write the sum for slightly more k’s than I need, because the computer will take care of the zero terms. But I better not omit k’s for which \varphi(t+k) \ne 0\ .

Let me be explict about that. If the sum fails to be 1, it will be because we used too few values of k; we omitted a nonzero term. I will illustrate these calculations for the Daubechies D4.

I do not know if this condition is more useful directly or in reverse. That is, in reverse, if we find a scaling function for which the sum really isn’t 1, \sum_k { \varphi(t+k)} \ne 1\ , then the scaling function \varphi should not be continuous. Unless there are other hypotheses involved — and, I have to remind you, at my present level of understanding, there could be!

You might want to remember that, in general, \varphi is a hypothetical solution to the dilation equation with some h’s. In particular, \varphi may not even exist. And if it does exist, it may not be integrable. Hell, it may not even be a function, as we usually think of them, but a distribution (like the Dirac delta “function”).

That said, we are in the position of having entire families of scaling functions available to us. For them, we don’t need to worry about conditions which guarantee existence or integrability or continuity: we have the silly things right in our hands and can play with them, to verify whether properties hold or not.

To return to the D4 example, it’s easy enough to pick a few values and confirm that the sum is 1. That’s not a proof that the sum is always 1, but it was enough to send me to Daubechies herself (Daubechies, Ingrid; Ten Lectures on Wavelets. Society for Industrial & Applied Mathematics, 1992; ISBN 0 89871 274 2) to confirm there that D4 is continuous. (All the Dn are continuous.)

Having added this property, my checklist of 5 items grows to 6 items:

  • The sum of the h’s = 2/A.
  • \varphi(t) = \sum_n {h(n)\ A\ \ \varphi(2\ t - n)}\ .
  • E = \int\ \varphi(t)\ dt \ .
  • The sum of the even h’s = the sum of the odd h’s.
  • \sum_k { \varphi(\frac{k}{2^j})} = 2^j \text{(for E = 1)}
  • \sum_k { \varphi(t+k)} = 1 \text{(for E = 1)}\

Example: Daubechies D4

Let us look at the generalized partition of unity,

\sum_k { \varphi(t+k)} = 1\ ,

specifically for the Daubechies D4. The previous post verified the orginal five properties; we need to do this one.

First, let’s look at t = 1/2.

Recall the D4 scaling function. You might also recall that we compute it at rational points whose denominator is a power of 2.

D4 may 9 4

Now I tell Mathematica® to list the values of \{ \varphi(t+k)\} with t = 1/2, and k an integer from -3 to 3. That’s more than enough values of k).

Here’s the list, and the sum of the entries:

gen part 1

(Yes, \varphi(3/2) = 0\ .)

Let’s think about that. We’re really trying to evaluate \varphi at all the half-integer points on the real line. That infinite sum is really one representative of an equivalence class: we have, for example,

\sum_k { \varphi(1/2+k)} = \sum_k { \varphi(-1/2+k)} = \sum_k { \varphi(3/2+k)}\

etc. If I decide, for example, to check the sum for t = 13/2, but I still use the same limits for k…

gen part 2

the problem is not that the infinite series fails to be 1, but that I have chosen the wrong limits for k. The generalized partition of unity is a doubly-infinite series, in principle, but reduces to a finite sum if the scaling function has compact support (i.e. is zero outside a finite interval). I just didn’t get all the nonzero terms. My bad. Scaling function good.

Another way to say all that is, without loss of generality we can choose t in the interval [0,1) for checking a selection of values. That infinite series will include values in [1,2) and [2,3) (i.e. the other intervals where D4 is nonzero).

So let’s look at a few. I have chosen to let k range over -3 to 5 simply because it provides a reassuring number of zeroes on either end of the output list.

The following picture shows the results for t = 1/4, 3/8, 1/128, and 100/128.

gen part 3

summary

The following six properties are consequences of the dilation equation, although we are used to having A = \sqrt{2}\ and two require E = 1. That means that some of them can be used to quickly determine that A ≠ \sqrt{2} or that E ≠ 1. Alternatively, we could work out the generalized forms for E ≠ 1.

In particular, these properties do not require that integer translates be orthogonal. But we have yet to see an example of a scaling function that is not orthogonal to its integer translates. (Yes, of course, I have one just waiting for us.)

  • The sum of the h’s = 2/A.
  • \varphi(t) = \sum_n {h(n)\ A\ \ \varphi(2\ t - n)}\ .
  • E = \int\ \varphi(t)\ dt \ .
  • The sum of the even h’s = the sum of the odd h’s.
  • \sum_k { \varphi(\frac{k}{2^j})} = 2^j \text{(for E = 1)}
  • \sum_k { \varphi(t+k)} = 1 \text{(for E = 1)}\

Wavelet properties: consequences of the dilation equation

discussion
edited 8 Jun to cross out the last line and correct it. see edit.
edited 13 Jun to correct the last line again. see edit. Sorry about that, but I shot from the hip and hit myself in the foot.

We have seen in the previous post that the idea of a set of scaling functions \{\varphi(t-k)\} spanning a space V_0\ , and such that the set \{\varphi(2t-k)\} span a space V_1\ , gives rise to a dilation equation, which we have been writing as

\varphi(t) = \sum_n {h(n)\ \sqrt{2}\ \ \varphi(2\ t - n)}.

(We also get more spaces V_j\ for both positive and negative integers j.)

We have also seen that if we impose an inner product, then wavelets live in the orthogonal complements W_j\ :

V_{j+1} =: V_j \oplus W_j\ .

I should probably remark, first, that a solution \varphi(t)\ of the dilation equation is determined only to within a constant factor. We describe that, however, by describing its integral (effectively, its average) rather than, for example, its maximum value. We write

\int\ \varphi(t)\ dt = E\ ,

where E is a nonzero constant which we often choose to be 1: E = 1.

To put that another way: if there is a solution, it is not unique.

Two examples we’ve seen repeatendly are Haar and Daubechies. The Haar scaling fuction is a unit step function on the unit interval, so its integral is 1. Similarly, the integrals of the Daubechies scaling functions are 1, although I have not shown that.

There are 3 consequences of the dilation equation (mostly with other conditions added).

The first consequence of the dilation equation is that it dictates the sum of the h’s. For the form we have usually been using, namely

\varphi(t) = \sum_n {h(n)\ \sqrt{2}\ \ \varphi(2\ t - n)}

we must have

\sum_n\ h(n) = \sqrt{2}\

More generally, for an arbitrary normalizing factor A in the dilation equation

\varphi(t) = \sum_n {h(n)\ A\ \varphi(2\ t - n)}

we get

\sum_n\ h(n) = \frac{2}{A}\

Let me remark that the sum of the h’s is independent of the integral of \varphi(t)\ , i.e. independent of E.

I should also mention that I had originally written that it was independent of the “normalization,” but I want to reserve that term for the integral of the square of \varphi(t)\ . I have found it all too easy to get confused between

\int\ \varphi(t)\ dt\

and

\int\ \varphi(t)\ \varphi(t)\ dt\ .

Further discussion and the derivation of this is at the end of the post. That will make it clear that the sum of the h’s does not depend on the value of the integral (but it does require that the integral exist and be nonzero).

The second consquence of the dilation equation, with the additional requirement that the integral E = 1,

E = \int\ \varphi(t)\ dt = 1\

is that

\sum_k { \varphi(\frac{k}{2^j})} = 2^j

for integers k and j.

Note especially the special case for j = 0,

\sum_k { \varphi(k))} = 1

That is, that the sum of the values of \varphi(t)\ at the integers is 1.

That is why Burrus et al. normalized the eigenvectors to a sum of 1, those eigenvectors that we used for the values of \varphi(t)\ at the integers, in order to initialize our recursions for computing \varphi(t)\ .

The third consequence of the dilation equation, under at least two different sets of additional assumptions which I’m skipping right over – but not an assumption that \varphi(t) is orthogonal to its integer translates – is that

the sum of the even-numbered h’s is equal to the sum of the odd-numbered h’s:

\sum_n\ h(2n) = \sum_n\ h(2n+1)\ .

This result is also implied by an orthogonality condition, but it does not require it. And this is why I state it although I’m more than a little vague – I’m not sure I could be any more vague if I tried – on the conditions it assumes: this result does not require orthogonality.

My reading suggests that this result will make a lot more sense in either the frequency domain, or from the filter-bank point of view. I suppose I should have said that what I’m doing is the “multi-resiolution analysis approach”, by focusing on the V and W spaces.

I had originally intended to list the assumptions required, until I saw a simpler but alternative set of assumptions. To heck with them all, at this stage of my learning.

At this point, I view this consequence as an interesting result to be looked for, and if it ever fails to hold, then I will investigate and see why it failed.

Anyway, there are 5 things to note or to check.

  • The sum of the h’s = 2/A.
  • The scaling factor A in the dilation equation.
  • The integral E of the scaling function (is E=1?).
  • The sum of the even h’s = the sum of the odd h’s.
  • \sum_k { \varphi(\frac{k}{2^j})} = 2^j

I did not number those because I will check them in whatever order is convenient, but I will count them as I check them.

Yes, I spoke of three consequences and then I listed five items. The extra two are the values of A and E, parameters rather than consequences.

By the way, the sum of the h’s might be a good first thing to compute, since it gives us the value of A in the dilation equation. But sometimes the dilation equation comes first; and sometimes it’s just not clear which (the h’s or the equation) comes first.

Let’s look at these.

example: Haar

We take the scaling function \varphi(t)\ to be the unit-height step function nonzero on the half-open interval [0,1).

One result, its integral is 1.

Result two, if we write the dilation equation as we have usually done it…

\varphi(t) = \sum_n {h(n)\ \sqrt{2}\ \ \varphi(2\ t - n)}

then h(0) = h(1) = \frac{1}{\sqrt{2}}\ (i.e. result three, the sum of even h’s (h(0)) is, trivially, equal to the sum of odd h’s (h(1)), and (result four) the sum of the h’s is \frac{2}{\sqrt{2}} = \sqrt{2}\ .

Five, because the integral is 1, we expect that

\sum_k { \varphi(\frac{k}{2^j})} = 2^j

and that is true, too.

Let me be explicit about result five. For j = 0, we have \varphi(0) = 1 = 2^0\ .

For j = 1, we have

\varphi(0) + \varphi(1/2) = 1 + 1 = 2 = 2^1\ .

For j = 2, we have

\varphi(0) + \varphi(1/4)+ \varphi(1/2)+ \varphi(3/4) = 1 + 1 + 1 + 1= 4 = 2^2\ .

I hope the pattern is clear: we end up adding more 1’s, in just the right number to give us another power of 2..

example: Daubechies D4

One result, the dilation equation is still

\varphi(t) = \sum_n {h(n)\ \sqrt{2}\ \ \varphi(2\ t - n)}

(This is, after all, how we computed the D4 scaling function.)

The h’s are (remember that the first one is h(0), an even coefficient):

\left\{\frac{1+\sqrt{3}}{4   \sqrt{2}},\frac{3+\sqrt{3}}{4   \sqrt{2}},\frac{3-\sqrt{3}}{4   \sqrt{2}},\frac{1-\sqrt{3}}{4 \sqrt{2}}\right\}\

Result two, the sum of the h’s is

\frac{1-\sqrt{3}}{4 \sqrt{2}}+\frac{3-\sqrt{3}}{4   \sqrt{2}}+\frac{1+\sqrt{3}}{4   \sqrt{2}}+\frac{3+\sqrt{3}}{4 \sqrt{2}}\

which does, indeed, simplify to \sqrt{2}\ .

Three, the sum of the even coefficients is (sorry, the order changed)…

\frac{3-\sqrt{3}}{4 \sqrt{2}}+\frac{1+\sqrt{3}}{4   \sqrt{2}}\

which simplifies to \frac{1}{\sqrt{2}}\ .

Since that’s half the total (\sqrt{2}\ ), we can conclude that the sum of the odd coefficients is also \frac{1}{\sqrt{2}}\ . Or we can just compute the sum of the odd coefficients…

\frac{1-\sqrt{3}}{4 \sqrt{2}}+\frac{3+\sqrt{3}}{4   \sqrt{2}}\

and it is, as it must be, \frac{1}{\sqrt{2}}\ .

Now, I can’t actually compute the integral of the D4 scaling function. I can estimate it as closely as I like, but all I have is its values at as many points as I like. But that is all I need! I can, in other words, compute things that look like finite Riemann sums — the areas of thin rectangles — but I can’t actually compute the integral except as a limit.

But the equation

\sum_k { \varphi(\frac{k}{2^j})} = 2^j

holds, and says (four) that every one of those approximations will have the same value, namely 1. Taking a limit doesn’t get any easier than the limit of a constant.

I conclude that the integral is 1 (and that’s the fifth result).

Let me be explicit.

The sum of the values of the D4 scaling function at the integers was chosen to be 1. Let’s look again at the values on the integers…

picture-37

Those 4 values are… \{0.,1.36603,-0.366025,0.\}\

and their sum is 1, and the width of each interval between them is \Delta\ t = 1\ , so the area of 4 rectangles \sum_n {f(t)\ \Delta\ t}\ ) would also be 1.

At the half integers (and integers)….

picture-38

The function values are…

\begin{array}{c} 0. \\ 0.933013 \\ 1.36603 \\ 0 \\ -0.366025 \\ 0.0669873 \\ 0.\end{array}

Now the sum is 2, but the width of each rectangle is \Delta\ t = \frac{1}{2}\ , so the total area is, again, 1.

At all the quarter integers (including halves and integers)…

picture-39

The function values are…

\begin{array}{c} 0. \\ 0.63726 \\ 0.933013 \\ 1.10377 \\ 1.36603 \\ 0.341506 \\ 0 \\ -1355.76 \\ -0.366025 \\ 0.0212341 \\ 0.0669873 \\ -0.0122595 \\ 0.\end{array}

The sum is 4, but each rectangle is of width 1/4, so the total area is still 1.

And so on.

So the equation

\sum_k { \varphi(\frac{k}{2^j})} = 2^j

would seem to give us that the area of every dyadic approximation of a wavelet is equal to 1, and that — in the limit — gives us the integral. The powers of 2 on the RHS exactly offset the diminishing widths of the rectangles in the finite sums used to compute areas of the step functions.

digression on h’s

Let us return to the Haar system. When I showed you the V and W spaces, I wrote the dilation equation as

\varphi(t) = \varphi(2t) + \varphi(2t-1).

This implies that we have two nonzero c’s:

c(0) = c(1) = 1,

(and the sum of the coefficients is clearly 2).

But we are used to writing the dilation equation as

\varphi(t) = \sum_{n} h(n)\ \sqrt{2}\ \varphi(2t-n)\ ,

which says

c(n) = h(n)\ \sqrt{2},

so that we wrote the Haar system with

h(0) = h(1) = \frac{1}{\sqrt{2}},

(and the sum of the coefficients is \sqrt{2}).

Another form of the equation is common:

\varphi(t) = \sum_{n} h(n)\ 2\ \varphi(2t-n),

which says that

c(n) = 2 h(n)

so that the Haar system would have

h(0) = h(1) = \frac{1}{2},

(and the sum of the coefficients is 1).

Many books point out the existence of different scalings. Strang & Nguyen (Strang, Gilbert; Nguyen, Truong.Wavelets and Filter Banks.Wellesley-Cambridge Press, 1997 (revised edition).ISBN 0 9614088 7 1), however, are more explicit (p. 23):

  • if you are working primarily with the dilation equation, set the sum of coefficients to 2 “… to preserve area.”
  • if you are worling with a single filter, set the sum of coefficients to 1. “That preserves the zero frequency DC term….”
  • if you are workig with a filter bank, set the sum of coefficients to \sqrt{2} “… to account for the downsampling step.

As I said when we looked at Nievergelt’s example, setting the area to 1 (sum = 2) is different from making an orthonormal basis out of the wavelets (sum = \sqrt{2}\ ).

This is the key: if someone hands me a set of filter coefficients for a scaling function – I’m going to add them all up, and see whether the sum is 1, \sqrt{2}\ , or 2.

(Yes, in principle, other normalizations are possible, too; but these are the three I expect to find.)

Burrus et al. are using a sum of \sqrt{2}.

Let’s take a closer look at the scaling factor in the dilation equation.

the sum of the h’s

Suppose we write the dilation equation as

\varphi(t) = \sum_{n} h(n)\ A\ \varphi(2t-n)

for some constant A. We can determine the sum of the h’s.

How? Integrate. (So the integral needs to exist. Easy enough to say if we know \varphi(t)\ , but remember that, in principle, we’re talking about something whose existence is not assured.) Anyway, we integrate:

\int \varphi(t)\ dt = \int (\sum_{n} h(n)\ A\ \varphi(2t-n))\ dt\

Now, if we can interchange the integral and the summation (which may not be a finite sum!), then

\int \varphi(t)\ dt = \sum_{n} h(n)\ A\ \int(\varphi(2t-n))\ dt

Now do a change-of variable y = 2t-n, dy = 2dt. We get

\int \varphi(t)\ dt = \sum_{n} h(n)\ \frac{A}{2} \int(\varphi(y))\ dy

Now, if the integral \varphi(t)\ dt is nonzero, we can divide by it, getting

1 = \frac{A}{2}\ \sum_{n} h(n)

and then

\sum_{n} h(n) = \frac{2}{A}\ .

I’ll remark that we just divided both sides of the equation by \int\ \varphi(t)\ dt = E\ , which is why the sum of the h’s is independent of E.

For Burrus, et al., A = \sqrt{2}

so we have that the sum of the h’s is \frac{2}{\sqrt{2}} = \sqrt{2}\ .

The sum of the h’s is directly related to the factor A in the dilation equation, so just keep your eyes open for it.

Let me also point out that the normalization could be considered part of the definition of the functions which span the space V_j\ . That is, instead of taking it to be the space spanned by the set of translated and scaled functions \{\varphi(t-k)\}\ , we might introduce additional scaling, say, \{\sqrt{2}\ \varphi(t-k)\}\ . We still get the same function (vector) space – all we’ve done is change the sizes of the basis functions.

The real issue may very well be, what’s in your wallet?® — what is your software doing?

What’s next? I will add the requirement that the scaling function be orthogonal to all of its translates. (We already have that the mother wavelet is orthogonal to the scaling function, and that wavelets in W_j\ are orthogonal to functions in V_j\ . Edit: Now we’re going to require that the basis for V_j\ be orthogonal to the basis for V_{j+1}\ . No, we won’t require that – we will get it as a result of requiring that integer translates of the scaling function be orthogonal, i.e. imposing a condition in V_0\ ).

edit, added:
The basis for V_j\ will not be orthogonal to the basis for V_{j+1}\ . We will neither require it, nor obtain it as a result. I will have more to say about this.

Happenings Jun 6

I’m pretty much working on only two things: wavelets and finite fields. In fact, right now I’m looking at field extensions, things like Q(\sqrt{2}) = \{a + b\ \sqrt{2} | \text{a,b rational}\}\ . (Just can’t seem to shake off those \sqrt{2}\text{'s}\ ).

I even asked a careless question last week out on sci.math. I won’t call it a stupid question — I insist that there are none — but I did forget one of the hypotheses. The topic was “Visualizing Finite Fields” in sci.math. My post was on May 29, and a reply pointing out my carelessness appeared several hours later, in the evening.

And what was it I had done wrong? The polynomial x^2 +1 is irreducible over the 3-element field GF(3) = {-1,0,1} (arithmetic mod 2), so adding its roots \pm\ i = \pm\sqrt{-1} generates the field GF(9). (It’s a 2nd degree polynomial, so we get a field extension of degree 2, so the extension field is of order 2^2 3^2). And what could be easier than adding the imaginary unit?

So why not add \sqrt{-1} to the 2-element field GF(2) = {0,1} to get the 4-element field GF(4)?

Because the polynomial x^2 + 1 is not irreducible over GF(2). In GF(2), 2 = 0, so our polynomial factors as (x+1)(x+1) = x^2 + 2x + 1 = x^2 + 1. (Alternatively, in GF(2), 1 and -1 are the same element: 1 = -1, so x^2 + 1 = x^2 -1 = (x+1)(x-1).

So, GF(4) requires that we find a different polynomial, one that is irreducible, to get the roots of.

(So there. I have now talked about a foolish post of mine to USENET, not just about the expert ones!)

For wavelets, I have begun writing code to compute wavelet coefficients. For the Haar system this is very easy: life gets very, very simple because there are only two filter coefficients h. In fact, finding code for the Haar system is also very easy; we don’t have to write anything.

Moving further, however, requires more work; and people seem unwilling to publish working code for anything beyond the Haar system.

That’s okay. I need to understand this, so I probably ought to be writing my own code.

Ugh! I don’t view coding as mathematics per se; it just lets me get answers. Maybe the way to phrase that is: writing accurate pseudo-code might be mathematics; getting code that will (compile and) run is an often unavoidable chore.

Wavelet Properties: the dilation equation.

Introduction

What do I propose to show you? (Certainly not all in one post.)

My understanding is neither complete nor rigorous. But wavelets and scaling functions, and their coefficients g and h, respectively, have a lot of properties. I want to sort them out.

I’m not trying for rigor. (Heresy!) I’m laying things out on a table so I can begin to relate them to each other.

The properties which I want to show you seem to fall into 4 categories.

  1. Where do the dilation equation and wavelets come from?
  2. What can we deduce from the dilation equation?
  3. What can we deduce from the requirement that the scaling function and its integer translates be orthogonal?
  4. A few things that I really, really don’t understand yet.

There is a fair bit of repetition in here. In particular, it seemed worthwhile to repeat things within the examples.

Where do the dilation equation and wavelets come from?

Discussion

Suppose we have a function \varphi(t)… even more, suppose we have a collection of its “integer translates” \{\varphi(t-k)\}\ , where k may be any integer.

That is, for example, take \varphi(t) to be any of the scaling functions we have seen in the following two posts: the Haar, or the Daubechies D4 or the D6 or other scaling functions we drew here.

Define the space V_0 to be that space spanned by the collection of integer translates \{\varphi(t-k)\}\ . In general, I may not be able to describe that space in any other way; whatever the \{\varphi(t-k)\}\ span, that is V_0\ .

I also haven’t said that the \{\varphi(t-k)\}\ are linearly independent; right now, I only care what they span, not that they necessarily be a basis for it.

Now consider the collection of scaled and translated functions \{\varphi(2t-k)\}\ , and let V_1 be the space they span.

For the Haar system, we know that V_0 \subset V_1\ . Any step function with jumps at the integers may be written as a (rather special) linear combination of step functions with jumps at the half-integers.

In general, however, this is a stringent requirement. Suppose it holds, for whatever hypothetical function \varphi(t) we have. That V_0 is a subspace of V_1\ , with V_1 spanned by the set \{\varphi(2t-k)\}\ tells us that any function in V_0 may be written as a linear combination (possibly infinite!) of the \{\varphi(2t-k)\}\ . In particular, the scaling function \varphi(t) itself may be wriiten as a linear combination of the \{\varphi(2t-k)\}\ .

But that gives us a dilation equation

\varphi(t) = \sum_{n} c(n)\ \varphi(2t-n)\

for some coefficients c(n). Conversely, if \varphi(t) satisfies the dilation equation, then I guess we ought to have V_0 \subset V_1\ .

The dilation equation effectively comes from the requirement that V_0 \subset V_1\ .

Now we add a little more structure, an inner product.

If V_0 is a proper subspace of V_1\ , and if we introduce an inner product, then we may split V_1 into V_0 and its orthogonal complement; that is, we define the orthogonal complement W_0 as:

V_1 = V_0\ \oplus W_0\ .

For these purposes, the customary inner product of two functions f, g is the integral of their product:

\int f(t) g(t) \, dt\ .

The definition of W_0 gives us a couple of immediate consequences. For one thing, every function \psi \in W_0 is orthogonal to the scaling function \varphi(t) \in V_0\ and is also orthogonal to every translate, \varphi(t-k\ .)

For another thing, W_0 \subset V_1\ , so every function \psi \in W_0 is also in V_1\ , and therefore can be written in terms of the set \{\varphi(2t-k)\}\ . That is, we have

\psi(t) = \sum_{n} d(n)\ \varphi(2t-n)\

for some coefficients d(n). This is how and why we could compute the mother wavelet from the scaling function! (Oh, yes, the mother wavelet is a function in W_0\ .)

So the mother wavelet, as far as we know at this point, could be any function in W_0\ , and it satisfies an equation like the dilation equation. Note, as we saw when we were computing, that we have \psi on the LHS but \varphi on the RHS.

And we continue, defining W_1\

V_2 =: V_1 \oplus W_1

and W_j\

V_{j+1} =: V_j \oplus W_j\ .

Now, for example, take V_2 in terms of V_1 and W_1 and then write V_1 in terms of V_0 \text{ and } W_0\ :

V_2 =: V_1 \oplus W_1 = V_0 \oplus W_0 \oplus W_1\ .

More generally,

V_{j+1} =: V_j \oplus W_j =  V_0 \oplus W_0 \oplus W_1 \oplus ... \oplus W_j\ .

We can write a function in V_{j+1} using a single scaling function in V_0\ , and wavelets from the W_0 \text{,...\ } W_j\ . (Oh, yes, the wavelets derived from the scaling function \varphi(t) \in V_0 are functions in the W_j\ spaces.)

We will see, in the example below, that it makes sense to go the other way, too. Suppose that for some reason – I’ll give you one, soon – we want our scaling function to be a step function of width 4 instead of width 1. (So I am speaking of the Haar system in particular, but it can make sense to go the other way for any wavelet system.)

Where does that live? Well, step functions of width 2 should live in V_{-1}\ . They are related to step functions of width 1 in the same way step functions of width 1 are related to step functions of width 1/2.

So step functions of width 4 should live in V_{-2}\ . And we should have

V_{-1} = V_{-2} \oplus W_{-2}\ ,

V_{0} = V_{-1} \oplus W_{-1} = V_{-2} \oplus W_{-2} \oplus W_{-1}\ .

Example: Haar

For the Haar system, here’s the scaling function \varphi(t) and one possible translate, \varphi(t+1)\ , on [-1,3]…

haar t and trans

As I said earlier, define the space V_0 to be that space spanned by the collection of integer translates \{\varphi(t-k)\}\ . We can describe V_0\ : the space of step functions with jumps at the integers.

Now we consider the collection of scaled and translated functions \{\varphi(2t-k)\}\ . Here’s \varphi(2t) for the Haar… and for one of its translates, in this case, \varphi(2t-1)\ .

haar 2t and trans

They are nonzero only over a half-unit interval. That is significant for their inner products with themselves (their norms).

Define the space V_1 as that space spanned by the collection \{\varphi(2t-k)\}\ . For the Haar system, we can describe V_1 as the space of step functions with jumps at the half-integers.

Note that if we add \varphi(2t) and \varphi(2t-1)\ , we get the unit step function on [0,1) — that is, we get \varphi(t)\ :

\varphi(t) =  \varphi(2t) +  \varphi(2t-1)\ .

That’s a form of the dilation equation!

(It’s not the way we’ve been writing it – we’re missing \sqrt{2}\ – but this is a dilation equation, and we can – and will shortly ! – insert the \sqrt{2}\ easily enough.)

That says that V_0 \subset V_1\ : any piecewise step function whose jumps are at integers can be written as (an admittedly special) piecewise step function whose jumps are at half-integers.

For the Haar functions \varphi(t), this is a property we observe. We generalize it and say that we want it to be a property of any scaling function; we require the space V_0 (spanned by \{\varphi(t-k)\} ) to be a subspace of V_1 (spanned by \{\varphi(2t-k)\}):

Assume

  • the set \{\varphi(t-k)\} spans V_0 by definition of V_0
  • the set \{\varphi(2t-k)\} spans V_1 by definition of V_1
  • V_0 \subset V_1

Then any function in V_1 can be written, by definition, as a linear combination of the \varphi(2t-k)… hence any function in V_0 \subset V_1 can be written as a linear combination of the \varphi(2t-k)… hence the particular function \varphi(t) \in V_0 can be written as a linear combination of the \varphi(2t-k).

That is, we have a dilation equation

\varphi(t) = \sum_{n} c(n) \varphi(2t-n)

essentially because we require V_0 \subset V_1.

The D4 and D6 scaling functions are solutions to the dilation equation for particular coefficients. Maybe now is a good time to emphasize that the h’s and the scaling function \varphi(t) are hidden in the mist, as it were. We can apparently write a dilation equation for any set (possibly infinite) of h’s, and we have no idea if there’s a solution at all. And if there is a solution, is it unique, is it continuous, is it differentiable, is it integrable, is it square integrable? This is why we look for properties of the h’s and the solution.

Example: Nievergelt

The text (Nievergelt, Yves. Wavelets Made Easy. Birkhäuser, 2001 (2nd printing with corrections), ISBN 0 8176 4061 4.) opens with a simple example.

We have 4 data points, and I want to illustrate these V and W spaces. The specific data is

\{5,1,2,8\}\ .

Now imagine that instead of 4 points I have a step function f(t) defined on the quarter integers in [0,1). I have V_0 \text{ thru } V_2\ (integers, halves, quarters), so I write

V_2 = V_1 \oplus W_1 = V_0 \oplus W_0 \oplus W_1\ .

It appears that I want the scaling function \varphi(t)\ , the mother wavelet \psi(t)\ , and two wavelets \psi(2t)\ and \psi(2t-1)\ . That is, we are going to write our step function f(t) as the combination

f(t) = a\ \varphi(t) + b\ \psi(t) + c\ \psi(2t) + d\ \psi(2t-1)\ ,

where a, b, c, d are called the wavelet coefficients. They are components of a vector – in a function space – where the basis vectors are functions. (Yes, those 4 functions are a basis for V_2\ \cap [0,1)\ .

Here’s those four functions all together.

spaces grid

I should probably remind us all that the Haar mother wavelet \psi(t) satisfies the equation

\psi(t) = \varphi(2t) - \varphi(2t-1)\ ,

and that’s exactly how I compute it.

(”Reverse the h’s and alternate the signs.”)

Let me point out that I could also write f(t) \in V_2\ , and I could graph the step function f(t) using these 4 basis functions (all scaling, no wavelets; all \varphi \text{ no } \psi\ ):

f(t)=5\ \varphi (4 t)+8\ \varphi (4 t-3)+2\ \varphi (4t-2)+\varphi (4 t-1)\ .

I can write it, but it’s nowhere near as useful as the wavelet decomposition. (On the other hand, it’s probably the most convenient way to graph f(t)!) So, here’s the graph, constructed exactly that way.

spaces new f

Now I am going to give you Nievergelt’s wavelet decomposition. There are some awkward points, at this stage, so I’m just going to hand you the answer for now.

I should do it using an orthonormal basis; he did not. At this stage, I should compute the “wavelet coefficients” by taking dot products, and for that an orthonormal basis is extremely useful. (If the basis is not orthonormal, then we need a reciprocal basis in order to compute components using the inner product.) In practice, we would use a different algorithm, one which does not require taking inner products, hence does not require that the basis vectors be normalized.

As it happens, the scaling function and the mother wavelet are both normalized: the integrals of their squares are 1, so their norms — the square roots — are 1.

\int \varphi(t)\  \varphi(t) \, dt\ = 1.

\int \psi(t)\  \psi(t) \, dt\ = 1.

(Their areas are also 1, but that’s not what we’re computing here.)

The two daughter wavelets \psi(2t)\ and \psi(2t-1)\ , however, are not normalized. It’s true that their squares, at each point, are either 1 or 0, but they are nonzero over half-intervals, so the integrals are 1/2 instead of 1. We need to multiply each of these functions by \sqrt{2} to double the heights of their squares – if we want the daughter wavelets to be part of an orthonormal basis.

That is, for example,

\int \varphi(2t)\  \varphi(2t) \, dt\ = \frac{1}{2}.

so

\int (\sqrt{2}\ \varphi(2t)\  ) (\sqrt{2}\ \varphi(2t)\  ) \, dt\ = 1\ .

(I told you I’d put that \sqrt{2} in there. And if we wanted unit area instead of unit norm, we would have multiplied by 2, instead of by \sqrt{2}\ . We’ll see this again.)

But I’m going to hold off on these calculations, until we’ve looked at some more properties. I simply tell you that the numerical result is that the components for this expansion are

\{a\ ,b\ ,c\ ,d\} = \{4,-1,2,-3\}\

… and that f(t) is

f(t) = 4\ \varphi(t) - 1\ \psi(t) + 2\ \psi(2t) - 3\ \psi(2t-1)\ .

(That’s easy enough to check algebraically: just use the recursion equation for the \psi \text{'s}, and then use the dilation on the resulting \varphi \text{'s}\ . But if you can compute all those functions, it is easier to just graph that equation and check it visually.)

Let me lead up to that in stages. Here is the least-detailed term: f(t) \approx 4\ \varphi(t)

haar V0

It tells us that the average of the data is 4; alternatively, if we smooth the data by its average, that is what we get. Yet again, if we approximate our function f(t) \in V_2 by a particular function in V_0\ , that is the approximation.

Now add the next term. If we approximate our function f(t) \in V_2 by our functions in V_1 =  V_0 \oplus W_0 \ , namely as

f(t) \approx 4\ \varphi(t) - \psi(t)\ ,

here’s what we get:

haar W0

It can be interpreted as: the average of the first two observations is 3 and the average of the last two is 5.

Finally, our given function f(t) \in V_2 can be written exactly as

f(t) = 4\ \varphi(t) - 1\ \psi(t) + 2\ \psi(2t) - 3\ \psi(2t-1)\ ,

which is:

haar W1 W2

Now, one more thing. I didn’t have to replace the data by a function in V_2\ ; I could have decided that f(t) was a step function defined (that is, nonzero) on the half-open interval [1, 5) with jumps on the integers \{1,2,3,4,5\}\ . Then my scaling function needs to be of width 4, i.e. in V_{-2}\ , my mother wavelet is in W_{-2}\ , and the two finest-scale wavelets are in W_{-1}\ :

V_{0} = V_{-2} \oplus W_{-2} \oplus W_{-1}\ .

Let me be very clear. I could have gone so far as to replace my 4 points by the step function

f(t)=5\ \varphi (t-1)+8\ \varphi (t-4)+2\ \varphi (t-3)+\varphi (t-2)\ .

spaces wide shift

That is, we can handle a change in scale and a shift: I can use integer times and I can even start time at t=1, if I choose. There’s no reason to use the shift, although it’s important to understand that we could; I’m going to forget the shift, and revert to just a change of scale,

f(t)=5\ \varphi (t)+8\ \varphi (t-3)+2\ \varphi (t-2)+\varphi (t-1)\ ,

i.e.

spaces wide no shift

The wavelet expansion should use

V_{0} = V_{-2} \oplus W_{-2} \oplus W_{-1}

and instead of

f(t) = 4\ \varphi(t) - 1\ \psi(t) + 2\ \psi(2t) - 3\ \psi(2t-1)\ ,

should be

f(t) = 4\ \varphi(t/4) - 1\ \psi(t/4) + 2\ \psi(t/2) - 3\ \psi(t/2-1)\

(with the very same coefficients, just applied to differently-scaled functions).

Here’s what it looks like:

spaces wide fit

Summary

From our chosen scaling function \varphi(t)\ – assuming it exists, i.e. assuming that we chose the h’s and found a scaling function as a solution of the dilation equation! – we get a nested sequence of V spaces. They are defined by integer translates and scaled versions of \varphi(t)\ . The key requirement is that the existence of nested V spaces corresponds to a solution of the dilation equation.

By introducing an inner product (dot product), we get the W spaces, each as the orthogonal complement of a V space. In particular, since

V_1 = V_0 \oplus W_0\ ,

we have that \psi(t) \in W_0 is orthogonal to \varphi(t) \in V_0\ ; and since that subspace decomposition implies

W_0 \subset V_1\ ,

we have that \psi(t) also satisfies something like the dilation equation.

\psi(t) = \sum_{n} d(n)\ \varphi(2t-n)\ .

Next, I expect, we will look at what the dilation equation tells us about the h’s in it. (Or the c’s, whatever we call these coefficients and however we scale them.)

Happenings May 23

As with so many things, the post explaining where those miscellaneous facts about wavelets come from is getting longer and more complicated. I do, however, know how to fix that: break it into pieces.

Last weekend was “Seminar Day” at Caltech; that is our reunion weekend. Other schools may have parties — we have lectures.

I drove down to Pasadena Friday afternoon, arriving at about 8 PM. Lectures ran Saturday from 9 AM to 5 PM, with a long break for lunch and a “general session” for honoring alumni and for someone to speak on a more general topic.

I habitually use that general session to hit the bookstore. Alas, it looks like this was my last serious visit to it. The bookstore is going out of business — as a bookstore. It will continue to sell clothing, coffee cups, and supplies, but it will make official what is at present de facto: the students buy their textbooks from Amazon.

It occurs to me that the loss of the technical bookstore just might cut back on my showing up for Seminar Day.

As it happens, I did not make it down to Caltech last year, and so I did not make my customary purchase of many interesting books. I want to joke that they went out of business because I missed a year of purchases.

Which lectures did I attend? Quantum mechanics, the Phoenix lander on Mars, human genetics, the large hadron collider, and quantum information science.

The quantum mechanics was fascinating. Something about replacing a small quantum mechanical system by an infinite limit with the same statistics (so I think they’re going from one system to an ensemble of identical systems); then treat the infinite ensemble classically! The key phrase is “Ring Polymer Molecular Dynamics”.

The Mars lander and the collider were interesting and straightforward. The genetics went over my head.

The quantum information science started well, and then lost me. (Well, I knew what he was saying but I couldn’t relate it to anything else I knew.) But I might take another look at Merman (see the bibliography and this old post). And I might go look up “anyons”, something about the “fractional quantum Hall Effect”.

Then I drove back Saturday evening and night as soon as the last lecture ended. (OK, I grabbed a cup of coffee first.) I wanted Sunday for myself. I did some mathematics, but not a lot. I had jury duty on Monday.

I made it onto a jury. The judge’s head swiveled sharply at one point. One of the routine questions they ask is, “What do you do for a living?” When I replied, “computer models of power plants”, the judge asked quickly, “What does that mean?”

The end of my answer was to say that I was doing the equivalent of balancing a checkbook, making sure that what went into the plant was equal to what came out of it. Maybe I’m learning something from Charlie Epps on Numbers!

The trial ended unexpectedly Tuesday morning, so I was free to return to work.

One of the nine books I picked up this year was about modular forms (Kilford, L.J.P.; Modular Forms: A classical and computational introduction. Imperial College Press, 2008. ISBN 1 84816 213 8.). One could phrase Andrew Wiles’ proof of Fermat’s last theorem as: every relevant elliptic curve is modular. No, I will not elaborate; but modular forms are what are related to elliptic curves.

Having mentioned Z(\sqrt{-5}) as an integral domain where unique factorization does not hold, let me show you a number system where Fermat’s last theorem is false.

The historical overview that opens the book includes the following counterexample for n = 3 in Z(\sqrt{2})\ . We have

\left(18-17\ \sqrt{2}\right)^3+\left(18+17\ \sqrt{2}\right)^3 = 42^3\ .

That is, we have

a^3+b^3=c^3\ .

Of course, those numbers are not integers so this is not a counterexample for the usual Fermat’s last theorem; but I like knowing that it is not true in all number systems.

As for the routine of my blogging, I expect to continue with wavelets; and, more and more, I am being distracted by regression analysis. One of the nine books I bought was about regression diagnostics, and I have already ordered and received the previous book by that author. In fact, I’ve already read — not worked — that previous book, as well as they one I bought Saturday. Now I’m waiting for a third, the follow-up, book. (I have the impression they’re printing it for me.)

… I’ve been doing wavelets this morning, and I guess I’ll take a break to put this out. (I drafted it yesterday.)

N = 6 scaling functions and mother wavelets

introduction

The impetus for this post is figure 1.4 on page 6 of Burrus et al. (Since it’s been a while, that’s Burrus, C. Sidney; Gopinath, Ramesh A.; Guo, Haitao. Introduction to Wavelets and Wavelet Transforms, A Primer. Prentice Hall, 1998.ISBN 0 13 489600 9.)

They offered the figure as an illustration of four different scaling functions; in addition, these scaling functions were parameterized by two angles.

Now I know how to produce the drawings of those scaling functions — and I also know how to produce drawings of the corresponding mother wavelets, which they did not show on page 6. It also turns out that they have interchanged the legends for figures (a) and (b) — and when we can produce the drawings ourselves, that mistake becomes almost irrelevant.

This post also serves to reiterate the calculation sequence for plotting a scaling function and its corresponding mother wavelet.

setup

It turns out that there are equations which must be satisfied by any “reasonable” filter coefficients h. I expect to show these to you in the next post. I have already shown you a description of these solutions of these equations when we have 4 nonzero coefficients. And I have shown you how to compute scaling functions and mother wavelets for N = 4 (D4 only) and for N = 2 (D2 = Haar).

Let me emphasize that what I understand is how to get the equations which the filter coefficients h must satisfy. What I do not yet understand is how to parameterize the solutions in terms of angles.

Let me now hand you a description of the solutions when we have 6 nonzero coefficients. The four examples I am about to work out all come from specific choices of these solutions. That is, for every one of the following examples, N = 6.

The following 4 scaling functions match figure 1.4, although (a) and (b) are swapped. (The legend describes the values “a” and “b” in the equations; the legend for figure a belongs with figure b.)

In addition, they have drawings of the Daubechies D6 (p. 82) and Coiflet C6 (p. 94) mother wavelets, so I have confirmed 6 of the following 8 graphs.

From p. 66 of Burrus et al., we get the following (using H instead of h just because it lets me preserve the equations in H while setting values for h.)

Let me tell you before I start writing that the last two equations are actually

H(4) = -H(0)-H(2)+\frac{1}{\sqrt{2}}

H(5) = -H(1)-H(3)+\frac{1}{\sqrt{2}}

That is, they say that the sum of the even coefficients is equal to the sum of the odd coefficients is equal to half the sum of all the coefficients = \frac{1}{2} \sqrt{2}= \frac{1}{\sqrt{2}}\ . They will look much more complicated than that when I substitute the first four equations.

H(0) = \frac{(\cos (a)+\sin (a)+1) (-\cos (b)-\sin (b)+1)+2 \cos (a) \sin (b)}{4 \sqrt{2}}

H(1) = \frac{(-\cos (a)+\sin (a)+1) (\cos (b)-\sin (b)+1)-2 \cos (a) \sin (b)}{4 \sqrt{2}}

H(2) = \frac{\cos (a-b)+\sin (a-b)+1}{2 \sqrt{2}}

H(3) = \frac{\cos (a-b)-\sin (a-b)+1}{2 \sqrt{2}}

H(4) = -\frac{\cos (a-b)+\sin (a-b)+1}{2 \sqrt{2}}-\frac{(\cos (a)+\sin (a)+1) (-\cos (b)-\sin (b)+1)+2 \cos (a)   \sin (b)}{4 \sqrt{2}}+\frac{1}{\sqrt{2}}

H(5) = -\frac{\cos (a-b)-\sin (a-b)+1}{2 \sqrt{2}}-\frac{(-\cos (a)+\sin (a)+1) (\cos (b)-\sin (b)+1)-2 \cos (a)   \sin (b)}{4 \sqrt{2}}+\frac{1}{\sqrt{2}}

legend 1.14 a (figure b)

The first choice (the legend for figure 1.14 a) for the angles a and b (in radians) is

\{a\to 1.3598,b\to -0.782106\}

(I knew when I saw those that the figure was wrong; by that time I recognized the D6 h’s.)

The resulting h’s are…

\{0.332671,0.806892,0.459878,-0.135011,-0.0854413,0.0352263\}

Now we form the m0 matrix (in stages for clarity)…

 \left(\begin{array}{llllll} \text{h0} & 0 & 0 & 0 & 0 & 0 \\ \text{h2} & \text{h1} & \text{h0} & 0 & 0 & 0 \\ \text{h4} & \text{h3} & \text{h2} & \text{h1} & \text{h0} & 0 \\ 0 & \text{h5} & \text{h4} & \text{h3} & \text{h2} & \text{h1} \\ 0 & 0 & 0 & \text{h5} & \text{h4} & \text{h3} \\ 0 & 0 & 0 & 0 & 0 & \text{h5}\end{array}\right)

Then we multiply by Sqrt[2]…

M_0 = \left(\begin{array}{llllll} \sqrt{2}\ \text{h0} & 0 & 0 & 0 & 0 & 0 \\ \sqrt{2}\ \text{h2} & \sqrt{2}\ \text{h1} & \sqrt{2}\ \text{h0} & 0 & 0 & 0   \\ \sqrt{2}\ \text{h4} & \sqrt{2}\ \text{h3} & \sqrt{2}\ \text{h2} & \sqrt{2}\   \text{h1} & \sqrt{2}\ \text{h0} & 0 \\ 0 & \sqrt{2}\ \text{h5} & \sqrt{2}\ \text{h4} & \sqrt{2}\ \text{h3} &   \sqrt{2}\ \text{h2} & \sqrt{2}\ \text{h1} \\ 0 & 0 & 0 & \sqrt{2}\ \text{h5} & \sqrt{2}\ \text{h4} & \sqrt{2}\ \text{h3}   \\ 0 & 0 & 0 & 0 & 0 & \sqrt{2}\ \text{h5}\end{array}\right)

Note that since all 4 examples have N = 6, that is the m0 matrix in principle for each of these examples. I won’t show it again.

Now we set the values…
M_0 = \left(\begin{array}{llllll} 0.470467 & 0. & 0. & 0. & 0. & 0. \\ 0.650365 & 1.14112 & 0.470467 & 0. & 0. & 0. \\ -0.120832 & -0.190934 & 0.650365 & 1.14112 & 0.470467 & 0. \\ 0. & 0.0498175 & -0.120832 & -0.190934 & 0.650365 & 1.14112 \\ 0. & 0. & 0. & 0.0498175 & -0.120832 & -0.190934 \\ 0. & 0. & 0. & 0. & 0. & 0.0498175\end{array}\right)

We find the eigenvector of eigenvalue 1. Mathematica returned a unit vector, but I want one whose components add up to 1. I will simply divide the unit vector by the sum of its components.

Here’s the unit vector…

\{0.,-0.955434,0.286583,-0.0707606,-0.00314509,0.\}

The sum of its components is -0.742756, so I get

V = \{0.,1.28634,-0.385837,0.0952675,0.00423435,0.\}

This function should be defined on [0,5]. I use that vector for the values of the scaling function at the integers.

Picture 34

Working, I graphed this immediately, but for the post, I ought to put the two drawings — scaling function and mother wavelet — in close proximity. This one turns out to be the Daubechies D6 scaling function.

To get the D6 mother wavelet, we take our filter coefficients h…

h = \{0.332671,0.806892,0.459878,-0.135011,-0.0854413,0.0352263\}

… we reverse them, and alternate the signs. The resulting filter coefficients are often denoted by h1 or by g. I’ll use g. The resulting g coefficients are what I expect (the sign of g[0] = h[5] has not been changed), but their Table 6.2 on p. 79 has the negatives of these numbers. And yet, I match their picture on p. 82.

(That’s why there’s a \pm in front of the recipe. Sorry, but you’ll have to look for it in the post.)

g = \{0.0352263,0.0854413,-0.135011,-0.459878,0.806892,-0.332671\}

Define the mother wavelet…

Picture 35

and plot it…
(These are Daubechies D6 mother wavelet and scaling function, respectively.)

Picture 36

Beautiful. That matches their picture on p. 82.

And here’s the D6 scaling function whose plot I delayed showing you…

Picture 37

and that matches (b), rather than (a), of figure 1.4.

legend 1.14 b (figure a)

Let’s make a different choice for a and b. I believe this is the “Coiflet” of order 6.

\{a\to 1.1468,b\to 0.42403\}

The resulting h’s are:

h = \{-0.0727362,0.337915,0.852573,0.384847,-0.0727302,-0.0156552\}

Now we form the m0 matrix exactly as before, and set the new values…

 M_0 = \left(\begin{array}{llllll} -0.102864 & 0. & 0. & 0. & 0. & 0. \\ 1.20572 & 0.477884 & -0.102864 & 0. & 0. & 0. \\ -0.102856 & 0.544256 & 1.20572 & 0.477884 & -0.102864 & 0. \\ 0. & -0.0221398 & -0.102856 & 0.544256 & 1.20572 & 0.477884 \\ 0. & 0. & 0. & -0.0221398 & -0.102856 & 0.544256 \\ 0. & 0. & 0. & 0. & 0. & -0.0221398\end{array}\right)

We find the eigenvector of eigenvalue 1.

\{0.,-0.189494,0.961829,-0.197385,0.00396248,0.\}

That’s orthonormal, so I find the sum of its components (0.578913) and divide by it, getting

V = \{0.,-0.327328,1.66144,-0.340958,0.0068447,0.\}

Set those to be the values of the scaling function at the integers, and define the recursion…

Picture 38

Again, I will delay the plot until I have the mother wavelet.

To get the mother wavelet, we start with the h’s…

h = \{-0.0727362,0.337915,0.852573,0.384847,-0.0727302,-0.0156552\}

(Those agree with p 92)

We reverse them, and alternate the signs. (This time, I need to use their signs! And they have got g[0] = -h[5].)

 g = \{0.0156552,-0.0727302,-0.384847,0.852573,-0.337915,-0.0727362\}

Picture 39

compute and plot…
(These are the Coiflet C6 mother wavelet and scaling function in order.)

Picture 40

That matches their p. 94 drawing of the C6 mother wavelet.

And here’s the C6 scaling function…

Picture 41

and that matches their figure 1.14 (a).

figure 14.1 c

I do not know if the following pair has a name. We take

\left\{a\to \frac{23 \pi }{60},b\to -\frac{\pi }{12}\right\}

The resulting h’s are…

h = \{0.0858766,0.652297,0.742126,0.0388932,-0.120896,0.0159163\}

Now we form the m0 matrix, and set the values.

 M_0 = \left(\begin{array}{llllll} 0.121448 & 0. & 0. & 0. & 0. & 0. \\ 1.04953 & 0.922488 & 0.121448 & 0. & 0. & 0. \\ -0.170973 & 0.0550033 & 1.04953 & 0.922488 & 0.121448 & 0. \\ 0. & 0.022509 & -0.170973 & 0.0550033 & 1.04953 & 0.922488 \\ 0. & 0. & 0. & 0.022509 & -0.170973 & 0.0550033 \\ 0. & 0. & 0. & 0. & 0. & 0.022509\end{array}\right)

Now we find the eigenvector of eigenvalue 1:

\{0.,-0.840331,-0.536329,0.0786992,0.00151279,0.\}

The sum of its components is -1.29645, so we divide by it, to get a vector whose components add up to 1:

V = \{0.,0.648179,0.413691,-0.0607037,-0.00116688,0.\}

That gives us the initial values, and we define the recursion…

Picture 42

Recall the h’s…

h =\{0.0858766,0.652297,0.742126,0.0388932,-0.120896,0.0159163\}

We reverse them, and alternate the signs.

g = \{-0.0159163,-0.120896,-0.0388932,0.742126,-0.652297,0.0858766\}

Redefine the mother wavelet…

Picture 43

and plot it…

Picture 44

… followed by the scaling function:

Picture 45

The last case for their Figure 1.4.

Here are the angles a and b…

\left\{a\to \frac{3 \pi }{4},b\to \frac{2 \pi }{15}\right\}

Here are the resulting h’s…

h = \{-0.158303,0.744755,0.556922,-0.103219,0.308488,0.0655711\}

We form the M0 matrix and set its values…

M_0 = \left(\begin{array}{llllll} -0.223874 & 0. & 0. & 0. & 0. & 0. \\ 0.787606 & 1.05324 & -0.223874 & 0. & 0. & 0. \\ 0.436267 & -0.145974 & 0.787606 & 1.05324 & -0.223874 & 0. \\ 0. & 0.0927315 & 0.436267 & -0.145974 & 0.787606 & 1.05324 \\ 0. & 0. & 0. & 0.0927315 & 0.436267 & -0.145974 \\ 0. & 0. & 0. & 0. & 0. & 0.0927315\end{array}\right)

Once again we find the eigenvector of eigenvalue 1. (If you’re using Mathematica, be careful. It’s probably the 2nd one rather than the 1st, in this case.) The normalized eigenvector is…

\{0.,0.955662,0.22728,0.184742,0.0303893,0.\}

The sum of its components is 1.39807, so we divide through and get our initial values for the scaling function…

V = \{0.,0.683556,0.162567,0.13214,0.0217365,0.\}

So we define the scaling function

Picture 46

Here are the h’s…

h = \{-0.158303,0.744755,0.556922,-0.103219,0.308488,0.0655711\}

We reverse them, and alternate the signs to get a set of g’s…

g = \{-0.0655711,0.308488,0.103219,0.556922,-0.744755,-0.158303\}

Define the mother wavelet…

Picture 47

Compute and plot it…

Picture 48

And here’s the scaling function (which we defined first, as usual)…

Picture 49

Closing

So. We’ve had some practice computing the scaling function and the mother wavelet from the h’s. We know that the h’s are not arbitrary, but are given by solutions to some set of equations, and I’ve shown you the general solutions for N = 4 and N = 6.

Now I really owe you an explanation of where the heck these computations come from. But at least you’ll know why I’m writing equations when you see them.

Next, I think.