Wavelets: Consequences of Orthogonality and Review II

digression: eigenvalue 1

Before I proceed with consequences of orthogonality, I need to mention an omission. For one thing, I have gotten so caught up in the properties I’ve been looking at, that I have forgotten one of the crucial ones we used earlier. For another thing, the consequence which I have forgotten is still a little strange to me.

The consequence (or consequences)?

  • That the dilation equation could be written as an eigenvalue equation,
  • that the existence of a scaling function seems to be equivalent to an eigenvalue = 1,
  • and that the values of the scaling function at the integers are given by the corresponding eigenvector.

This has been crucial to some of our work: the recursion which I use for computing the scaling function is initialized by setting the values of the scaling function at the integers — that is, by finding the eigenvector with eigenvalue 1. Recursion — especially when combined with a lookup table — is very easy and very powerful; but we absolutely had to have initial values, and that eigenvector provided them. (I did this for the Daubechies D4, and for four wavelet systems with 6 nonzero h’s.)
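
Here is a minimal sketch of that initialization for the D4 coefficients, assuming A = \sqrt{2} and a scaling function supported on [0, 3]. Evaluating the dilation equation at the integers gives \varphi(i) = \sum_j \sqrt{2}\ h(2i-j)\ \varphi(j), which is the eigenvalue-1 equation. The function names are hypothetical, and power iteration is just one convenient way to get the eigenvector (it relies on 1 being the largest eigenvalue in magnitude, which holds for D4); it is not necessarily how the values were computed in the earlier posts.

```python
import math

# Daubechies D4 coefficients, normalized so that sum(h) = sqrt(2) (A = sqrt(2)).
s3 = math.sqrt(3.0)
h = [(1 + s3) / (4 * math.sqrt(2)), (3 + s3) / (4 * math.sqrt(2)),
     (3 - s3) / (4 * math.sqrt(2)), (1 - s3) / (4 * math.sqrt(2))]

def dilation_matrix(h, A=math.sqrt(2)):
    """Matrix M with M[i][j] = A * h(2i - j), so that phi_int = M * phi_int."""
    n = len(h)
    return [[A * h[2 * i - j] if 0 <= 2 * i - j < n else 0.0
             for j in range(n)] for i in range(n)]

def integer_values(h, iterations=200):
    """Power iteration for the eigenvalue-1 eigenvector, scaled to sum to 1.

    Assumption: 1 is the dominant eigenvalue of the dilation matrix.
    """
    M = dilation_matrix(h)
    n = len(h)
    v = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        v = [x / total for x in v]
    return v

phi_int = integer_values(h)   # approximates phi(0), phi(1), phi(2), phi(3)
```

The result should approximate \varphi(1) = (1+\sqrt{3})/2 and \varphi(2) = (1-\sqrt{3})/2, with \varphi(0) = \varphi(3) = 0. Scaling the eigenvector so that its entries sum to 1 corresponds to choosing E = 1.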

The strangeness?

If I had never seen it fail, it wouldn’t seem strange. I expect that I will show you an example some day. Furthermore, understanding this seems to require that I move into the frequency domain. And I’m not ready to do that yet. (Actually, I’ve begun doing it, but unfortunately I’m still at the stage where some of my calculations result in what I know to be wrong answers. Details, details. When I get this all ironed out, it should be a great example…. And during the time it has taken to finish another draft of this post, I seem to have it. Yes!)

This is probably a good time to remind you that I am not yet past the stage of laying things out so that I recognize them when I see them in my reading. Oh, they make a lot more sense than they did when I started, but my understanding is still rather fragmented.

The eigenvalue properties are still mysterious, as is the assertion that the sum of the odd h’s is equal to the sum of the even h’s.

Now, let’s return to consequences of orthogonality and orthonormality.

Okay, we have one powerful theorem and one example (linear splines) which the theorem does not cover. Let’s look at what they have in common.

For both of them, we have five of six conditions in common… nested spaces V_j\ which are closed under translation and which have a scaling property. For the theorem, we have the sixth condition — that we have an orthonormal basis for the space V_0\ . For the example — the linear splines — we have a basis for V_0 but it is not even orthogonal.

They both, however, have an orthogonal direct sum decomposition, which we describe by saying that W_j\ is the orthogonal complement of V_j\ in V_{j+1}\ ; for example:

V_1 = V_0 \oplus W_0\ .

Continuing on, we write such things as

V_2 = V_1 \oplus W_1 = V_0 \oplus W_0 \oplus W_1\

V_3 = V_2 \oplus W_2 = V_1 \oplus W_1 \oplus W_2 = V_0 \oplus W_0 \oplus W_1 \oplus W_2\ .

The decomposition

V_3 = V_1 \oplus W_1 \oplus W_2\

is interesting because it shows us that there is nothing special about V_0\ . Furthermore, we could go in the other direction: there is nothing special about non-negative indices.

V_0 = V_{-1} \oplus W_{-1} = V_{-2} \oplus W_{-2} \oplus W_{-1}\ .

The first important thing to realize is that any two W spaces are orthogonal to each other. This means that wavelets are orthogonal across scales.

The challenge is knowing if a W space is closed under translation and has the scaling property. The orthogonal direct sum decomposition guarantees that every function in one W space is orthogonal to every function in every other W space. For our linear spline example, where I can exhibit the mother wavelet, it seems clear — and dangerous as that assumption may be, I’m going to make it — it seems clear that W_0 is closed under translation and that every W space has the scaling property.

Our theorem guarantees both properties if we have an orthonormal basis for V_0\ ; but our theorem does not apply to the linear splines.

The second important thing to realize is that the space V_0\ is orthogonal to W_0\ , and in general V_j\ is orthogonal to W_j\ . But more, the nesting of spaces tells us that V_j\ is orthogonal to W_k\ whenever k is greater than or equal to j. This means that scaling functions are orthogonal to some wavelets but not necessarily all.

To be specific, the direct sum decomposition

V_3 = V_0 \oplus W_0 \oplus W_1 \oplus W_2\

tells us that V_0\ is orthogonal to W_k\ for k = 0, 1, and 2; and the generalization to arbitrary j > 0…

V_j = V_0 \oplus W_0 \oplus ... \oplus W_{j-1}\

tells us that in fact V_0\ is orthogonal to W_k\ for all k greater than or equal to 0. And we know there’s nothing special about the 0 subscript: for any m < j we may write

V_j = V_m \oplus W_m \oplus ... \oplus W_{j-1}\ .

So. An orthogonal direct sum decomposition tells us a lot about wavelets and scaling functions across scales.
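
To make the decomposition V_3 = V_0 \oplus W_0 \oplus W_1 \oplus W_2 concrete, here is a small sketch that assumes the Haar system and treats 8 numbers as the coefficients of a function in V_3 with respect to an orthonormal basis. Repeated averaging and differencing splits off W_2, then W_1, then W_0. The function names and the sample data are hypothetical. Because the pieces are mutually orthogonal, the squared norms should simply add up.

```python
import math

def haar_step(v):
    """Split coefficients in V_{j+1} into a coarse part (V_j) and a detail part (W_j)."""
    r = 1 / math.sqrt(2)
    coarse = [r * (v[2 * i] + v[2 * i + 1]) for i in range(len(v) // 2)]
    detail = [r * (v[2 * i] - v[2 * i + 1]) for i in range(len(v) // 2)]
    return coarse, detail

def haar_decompose(v):
    """Return the V_0 part and the detail parts for W_0, W_1, ... (coarsest first)."""
    details = []
    while len(v) > 1:
        v, d = haar_step(v)
        details.insert(0, d)
    return v, details

signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]   # 8 coefficients, i.e. V_3
coarse, details = haar_decompose(signal)

# Orthogonality of V_0, W_0, W_1, W_2 shows up as additivity of squared norms.
energy_in = sum(x * x for x in signal)
energy_out = sum(x * x for x in coarse) + sum(x * x for d in details for x in d)
```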

Let me remind us that our powerful theorem told us more: we have orthonormal bases for both V_0\ and W_0\ — and hence orthonormal bases for every V_j\ and W_j\ .

Now, what are the additional consequences specifically of having an orthonormal basis for V_0\ ?

One of them is developed in the proof of the theorem (for which see Daubechies “Ten Lectures on Wavelets”), wherein we explicitly constructed the mother wavelet. We have that the mother wavelet \psi satisfies the equation

\psi(t) = \sum_{n} g(n)\ \sqrt{2}\ \varphi(2t-n)\ .

and that \psi \in W_0\ if and only if the g’s are given by a magic recipe:

g(n) = \pm (-1)^n h(M-n)\ , for M any odd integer.

The -n in M-n says that we reverse the h’s; the (-1)^n\ says that we alternate the signs; the \pm\ says that we may choose to change the first sign or the second; the M says that we may shift the h’s; but in fact we may only shift by an odd integer. Since zero is even, we must shift by at least 1. Oh, we also have that the number of (non-zero) g’s is the same as the number of (non-zero) h’s.
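
A minimal sketch of the recipe, applied to the D4 h’s with A = \sqrt{2}. The function name and the dictionary return type are my own choices for illustration; the dictionary keys are the indices n on which g(n) is nonzero, which makes the effect of the shift M visible.

```python
import math

def magic_recipe(h, M, sign=1):
    """Return {n: g(n)} with g(n) = sign * (-1)^n * h(M - n); h is indexed from 0."""
    if M % 2 == 0:
        raise ValueError("M must be odd")
    N = len(h)
    g = {}
    for n in range(M - N + 1, M + 1):          # exactly the n with 0 <= M - n <= N - 1
        alternating = 1 if n % 2 == 0 else -1
        g[n] = sign * alternating * h[M - n]
    return g

s3 = math.sqrt(3.0)
h = [(1 + s3) / (4 * math.sqrt(2)), (3 + s3) / (4 * math.sqrt(2)),
     (3 - s3) / (4 * math.sqrt(2)), (1 - s3) / (4 * math.sqrt(2))]

g = magic_recipe(h, M=3)            # g(0..3) = h3, -h2, h1, -h0
g_shifted = magic_recipe(h, M=1)    # the same values, on indices -2..1
g_sum = sum(g.values())             # should vanish up to rounding
```

Choosing M = 3 keeps the g’s on the same indices as the four h’s; any other odd M just slides them along.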

The h’s, you recall, come from the dilation equation for the scaling function \varphi\ ,

\varphi(t) = \sum_n {h(n)\ \sqrt{2}\ \ \varphi(2\ t - n)}

Our example of the linear splines, whose translates are not all orthogonal, shows us that the magic recipe does not work if the translates are not orthogonal. To be specific, our h’s were

\left\{\frac{1}{2 \sqrt{2}},\frac{1}{\sqrt{2}},\frac{1}{2 \sqrt{2}}\right\}\

and our g’s were

\left\{\frac{1}{24},-\frac{1}{4},\frac{5}{12},-\frac{1}{4},\frac{1}{24}\right\}\ .

This example also shows us that the number of g’s can be different from the number of h’s.

By contrast, the Haar, the D4, and the D6 etc. scaling functions and wavelets satisfy our powerful theorem, and their mother wavelets were given by that magic formula for the g’s.

Two other results which I discussed in the previous post were:

  • the magic recipe guarantees that the sum of the g’s is zero;
  • and if the sum of the g’s is zero, we expect the integral of the mother wavelet to be zero.

For the Haar and Daubechies D4, D6 etc., both of those conditions hold. For the linear spline example, however, only the second condition holds. The magic recipe implies that the integral of the mother wavelet is zero, but the integral can be zero even if we do not use the magic recipe.

Next, there is an extremely important consequence of having an orthonormal basis for V_0\ . We can derive the following quadratic condition:

\sum_n h(n) h(n-2k) = \delta(k)\ ,

where \delta(0) = 1\ and \delta(k) = 0\ for k ≠ 0.

Perhaps surprisingly, it is independent of the norm (\sqrt{P}\ ) of the scaling function, but it does depend on A (our customary \sqrt{2}\ ) in the dilation equation. You should recall that I use P to denote the value of the integral of the scaling function squared:

P := \int \varphi(x)^2 \, dx\ .

I should point out that any consequence of orthonormality which is independent of P is, in fact, a consequence of mere orthogonality. (By orthonormality, or orthogonality, in this context I mean that the integer translates of the scaling function are orthonormal, or orthogonal.)

The quadratic condition has one immediate corollary, for k = 0: the sum of the squared h’s is 1,

\sum_n h(n) h(n) = 1\ .

Some quick algebra indicates that the general formula would be

\sum_n h(n) h(n-2k) = \frac{2}{A^2} \delta(k)\ .

I’ll show you later just how important that corollary is.
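
As a quick numerical check of the quadratic condition and its general form, here is a sketch using the D4 h’s. The helper name is hypothetical. The second coefficient list is the same filter rescaled for A = 2 (so that the h’s sum to 2/A = 1), which is an assumption I make only to exercise the 2/A^2 factor.

```python
import math

def double_shift(h, k):
    """Compute sum_n h(n) h(n - 2k) for a finite list h indexed from 0."""
    N = len(h)
    return sum(h[n] * h[n - 2 * k] for n in range(N) if 0 <= n - 2 * k < N)

s3 = math.sqrt(3.0)
d4 = [(1 + s3) / (4 * math.sqrt(2)), (3 + s3) / (4 * math.sqrt(2)),
      (3 - s3) / (4 * math.sqrt(2)), (1 - s3) / (4 * math.sqrt(2))]   # A = sqrt(2)
d4_A2 = [x / math.sqrt(2) for x in d4]                                # A = 2

for A, h in ((math.sqrt(2), d4), (2.0, d4_A2)):
    for k in (-1, 0, 1, 2):
        expected = 2 / A ** 2 if k == 0 else 0.0
        print(f"A={A:.4f} k={k:+d} sum={double_shift(h, k):+.12f} expected={expected:.4f}")
```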

There is another crucial consequence of orthonormality. I believe it is a corollary of the quadratic condition, but I have usually seen it associated with the frequency domain.

I believe the quadratic condition implies that the number of non-zero h’s is even (if it is finite). In fact, having decided that, I find it in Strang & Nguyen (p. 148), which changes my belief to virtual certainty. Good. Incidentally, they call it “double shift orthogonality”.

Here’s how I got it.

Whenever I write it out for an odd number of h’s, I can show that the product of the first and last must be zero — so if one of them is zero, we just ended up with an even number of nonzero h’s. To be specific, if we assume that we have 3 h’s, h0, h1, h2, then we get the condition

h0 h2 = 0,

which requires that either h0 = 0 or h2 = 0, and then there are only 2 nonzero h’s.

This is why we have D2 (i.e. Haar), D4, D6, etc. but we do not have, for example, D3. That is, there is no orthonormal Daubechies wavelet family with three nonzero h’s. (The notations are not universal; Daubechies herself wrote D2 for what I am used to seeing as D4 today.)

Of course, the linear splines have an odd number of h’s — but the point is, they are not orthogonal — they’re allowed to have an odd number because the quadratic condition does not hold.
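
The counting argument can be sketched by listing which products survive in the quadratic condition at the largest shift. The helper names are hypothetical. For an odd number of h’s a single product of the first and last coefficient is left over, while for an even number there are two products that can cancel each other.

```python
import math

def shift_terms(N, k):
    """Index pairs (n, n - 2k) appearing in sum_n h(n) h(n - 2k) for h(0..N-1)."""
    return [(n, n - 2 * k) for n in range(N) if 0 <= n - 2 * k < N]

def largest_shift_terms(N):
    """The largest shift k that still has any terms, and those terms."""
    k = (N - 1) // 2
    return k, shift_terms(N, k)

for N in (3, 4, 5):
    print(N, largest_shift_terms(N))

# The linear spline h's from this post: the lone product h(2) h(0) is not zero,
# so the quadratic condition fails for them, as expected.
A = math.sqrt(2)
spline_h = [1 / (2 * A), 1 / A, 1 / (2 * A)]
lone_product = spline_h[2] * spline_h[0]
```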

I do not know how useful the following are, but let me note that we have a quadratic condition relating h’s and g’s:

\sum_n h(n) g(n-2k) = 0\ .

And we have one for just the g’s, too; I think that for general A, it looks just like the condition on the h’s:

\sum_n g(n) g(n-2k) = \frac{2}{A^2} \delta(k)\ .

(If we are using the magic recipe to get the g’s from the h’s, do we need either of those conditions? I suppose we could use them to catch a typo in a given set of g’s or h’s….)

There is one last consequence I wish to show you. Again, I am not sure how useful this is, but it is unusual enough to be worth noticing, and we may very well find a use for it. This consequence is that

(\int \varphi(x) \, dx) ^2 = \int   (\varphi(x)^2)\, dx\ ,

which I would write as

E^2 = P,

E = \sqrt{P} = || \varphi ||\ .

That is, the L^2 norm \sqrt{P} of the scaling function is equal to its integral E. This is unusual. It requires both orthogonality and the partition of unity, and might also require — as usual — an interchange of integration and an infinite series. Let me show you.

If the translates of the scaling function are orthogonal, then we have

\int \varphi(x) \varphi(x-k) \, dx = P \delta(k)\ .

(That P is my notation, not that of Burrus et al. As I said before, you have every right to curse me out if you are reading Burrus et al. I am sorry that I did not understand their notation, but I’m not going to change mine now.)

Take the sum — or infinite series — over all k, and get

\sum_k \int \varphi(x) \varphi(x-k) \, dx = P\ .

Interchange, assuming we may, and write

\int \varphi(x) \sum_k \varphi(x-k) \, dx = P\ .

But one of the consequences of the dilation equation was a so-called partition of unity — the sum of all integer translates of the scaling function is equal to the integral E of the scaling function…

\sum_k \varphi(x-k) = E\

so by replacing the summation we get

E \int  \varphi(x)\,  dx = P\ ,

and then by replacing the integral we get

E^2 = P,

QED.
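
Here is a small numerical illustration of E^2 = P, under assumptions I should state loudly: I use a box function of height 3 on [0, 1) as a scaling function with orthogonal (but not orthonormal) translates, and the hat function on [0, 2] as the linear spline, whose translates are not orthogonal. The integration helper is a plain midpoint rule and all names are hypothetical.

```python
def integrate(f, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    dx = (b - a) / steps
    return sum(f(a + (i + 0.5) * dx) for i in range(steps)) * dx

def box(t, height=3.0):
    """A scaled Haar scaling function: orthogonal translates, norm not equal to 1."""
    return height if 0.0 <= t < 1.0 else 0.0

def hat(t):
    """Linear spline scaling function: a triangle on [0, 2] with peak 1 at t = 1."""
    return max(0.0, 1.0 - abs(t - 1.0))

def E_and_P(phi, a, b):
    """Return (integral of phi, integral of phi squared) over [a, b]."""
    E = integrate(phi, a, b)
    P = integrate(lambda t: phi(t) ** 2, a, b)
    return E, P

E_box, P_box = E_and_P(box, 0.0, 1.0)   # expect E = 3 and P = 9, so E^2 = P
E_hat, P_hat = E_and_P(hat, 0.0, 2.0)   # expect E = 1 and P = 2/3, so E^2 != P
```

The hat function satisfies the partition of unity but not orthogonality, and E^2 = P fails for it, which is consistent with the derivation needing both.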

Summary

So. Here is what we have. Six “consequences” of the dilation equation:

  • The sum of the h’s = 2/A.
  • \varphi(t) = \sum_n {h(n)\ A\ \ \varphi(2\ t - n)}\ .
  • E = \int\ \varphi(t)\ dt \ .
  • The sum of the even h’s = the sum of the odd h’s.
  • \sum_k { \varphi(\frac{k}{2^j})} =E\  2^j
  • \sum_k { \varphi(t+k)} = E \

A wavelet equation like the dilation equation serves to define coefficients g(n).

  • \psi(t) = \sum_{n} g(n)\ A\ \varphi(2t-n)\ .

We have the utterly crucial property that if the translates of the scaling function are orthonormal, then the g’s are given by a magic recipe:

  • g(n) = \pm (-1)^n h(M-n)\ , for M any odd integer.

We have two observations about the sum of the g’s.

  • the magic recipe guarantees that the sum of the g’s is zero;
  • and if the sum of the g’s is zero, we expect the integral of the mother wavelet to be zero.

If the translates of the scaling function are merely orthogonal, so that the squared-norm is not 1…

P := \int \varphi(x)^2 \, dx ={ || \varphi ||}^2 \ne 1\ ,

then we get the following six — or seven — results:

  • the quadratic condition \sum_n h(n) h(n-2k) =\frac{2}{A^2} \delta(k)\ ;
  • its corollary \sum_n h(n) h(n) = \frac{2}{A^2}\ ;
  • its second corollary: the number of nonzero h’s is even, if finite;
  • its relatives \sum_n h(n) g(n-2k) = 0\ and \sum_n g(n) g(n-2k) = \frac{2}{A^2} \delta(k)\ ;
  • the L^2\ norm of the scaling function is equal to its integral: E = \sqrt{P} = || \varphi ||\ ;
  • its corollary E = 1 if and only if P = 1.

Now you might understand why I confused E and P: if the translates of the scaling functions are at least orthogonal, then either P and E are both equal to one, or neither is equal to one. If the translates of the scaling functions are orthonormal, then E = P = 1 — they look like the same thing. And I mixed them up when I first started working through Burrus et al. because on the first pass I was focused on the case E = P without realizing it. My bad.

I don’t know about you, but I need to be careful. If we have orthogonality but not orthonormality, then we need both E and P. If we do not have even orthogonality, we may not care about P but we will still need E.

To put it another way, any consequence of orthogonality which is independent of P is also independent of E.

Two examples

Let me close by giving you two examples of how important the quadratic corollary is.

For the first example, let us look for a scaling function with two coefficients h0 and h1, whose integer translates are orthonormal. Write the dilation equation as usual with A = \sqrt{2}\ , which gives us the sum of the h’s (a linear condition):

\sum_n h(n) = \sqrt{2}\ .

Then write the corollary of the quadratic condition also for A = \sqrt{2}\ :

\sum_n h(n) h(n) = 2/2 = 1\ .

That is, we have two equations in two unknowns,

\begin{array}{l} \text{h0}+\text{h1}=\sqrt{2} \\ \text{h0}^2+\text{h1}^2=1\end{array}

and their solution is unique: h0 = h1 = 1/\sqrt{2}\ . The Haar system. Don’t bother looking for another orthonormal set.

So. From the sum of the h’s and the quadratic condition, we can derive the Haar scaling function and show that it is the only one which leads to an orthonormal basis with two h’s.
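
The uniqueness can be checked by eliminating h1 and looking at the discriminant of the resulting quadratic in h0. This is only a sketch; the function name is hypothetical, and I pass A^2 rather than A so that the discriminant can be computed in exact rational arithmetic.

```python
from fractions import Fraction
import math

def two_coefficient_case(A_squared):
    """Eliminate h1 = 2/A - h0 from h0^2 + h1^2 = 2/A^2 and inspect the quadratic.

    The quadratic in h0 is 2 h0^2 - 2 s h0 + (s^2 - q) = 0 with s = 2/A and
    q = 2/A^2. Only s^2 and q enter the discriminant, so it is exact here.
    """
    s_squared = Fraction(4) / A_squared
    q = Fraction(2) / A_squared
    discriminant = 4 * s_squared - 8 * (s_squared - q)
    h0 = 1 / math.sqrt(A_squared)      # the double root s/2 = 1/A
    return discriminant, h0

discriminant, h0 = two_coefficient_case(2)   # A = sqrt(2)
```

A zero discriminant means a double root, so h0 = h1 = 1/A is the only real solution; for A = \sqrt{2} that is the Haar pair.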

For the second example, let us look for a scaling function with 4 coefficients, whose integer translates form an orthonormal basis for V_0\ . Sound familiar?

Again, we have the linear condition on the sum of the h’s…

h0 + h1 + h2 + h3 = \sqrt{2}\ ;

and the quadratic corollary…

h0^2 + h1^2 + h2^2 + h3^2 = 1\ .

But we also have the general quadratic condition itself, for N=4,

h(0) h(-2 k)+h(1) h(1-2 k)+h(2) h(2-2 k)+h(3) h(3-2 k)=\delta(k)\

and then for k = 1 it reduces to…

h(0) h(2)+h(1) h(3)=0\ .

(For k = -1 we get the same equation, for k = 0 we get the quadratic corollary above, and for all other values of k we get 0 = 0.)

We have three equations in four unknowns:

\begin{array}{l} h(0)+h(1)+h(2)+h(3)=\sqrt{2} \\ h(0)^2+h(1)^2+h(2)^2+h(3)^2=1 \\ h(0) h(2)+h(1) h(3)=0\end{array}

We do not get a unique solution but one of the solutions is, indeed, the Daubechies D4 scaling function.

I could write out the family of solutions in terms of h[0] as a parameter, but they are not simple. I’m not going to write them out.

I’m just going to recall the family of solutions as given by Burrus et al (and which I’ve already shown you),

h(0)=\frac{-\cos (a)+\sin (a)+1}{2 \sqrt{2}}

h(1)=\frac{\cos (a)+\sin (a)+1}{2 \sqrt{2}}

h(2)=\frac{+\cos (a)-\sin (a)+1}{2 \sqrt{2}}

h(3)=\frac{-\cos (a)-\sin (a)+1}{2 \sqrt{2}}

(Of course, we could actually verify that these solutions do satisfy our original 3 equations. I’m pretty sure I did that for myself. In fact, I think that’s how I found the mistake in the original post!)

The Daubechies h’s are found by setting a = \pi/3\ .
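
That verification is easy to sketch numerically. The function names below are hypothetical; the formulas are the four displayed above, and the three residuals are the linear condition, the quadratic corollary, and the k = 1 quadratic condition.

```python
import math

def length4_family(a):
    """The four h's of the one-parameter family, for angle a (radians)."""
    c, s, d = math.cos(a), math.sin(a), 2 * math.sqrt(2)
    return [(1 - c + s) / d, (1 + c + s) / d, (1 + c - s) / d, (1 - c - s) / d]

def residuals(h):
    """How far h is from satisfying each of the three equations (0 means exact)."""
    return (sum(h) - math.sqrt(2),
            sum(x * x for x in h) - 1.0,
            h[0] * h[2] + h[1] * h[3])

angles = [0.0, 0.3, 1.0, math.pi / 3, math.pi / 2, 2.5, -1.2]
worst = max(abs(r) for a in angles for r in residuals(length4_family(a)))

# a = pi/3 should reproduce the D4 coefficients.
s3 = math.sqrt(3.0)
d4 = [(1 + s3) / (4 * math.sqrt(2)), (3 + s3) / (4 * math.sqrt(2)),
      (3 - s3) / (4 * math.sqrt(2)), (1 - s3) / (4 * math.sqrt(2))]
d4_gap = max(abs(x - y) for x, y in zip(length4_family(math.pi / 3), d4))
```

As a side check, a = \pi/2 collapses the same formulas to (1/\sqrt{2}, 1/\sqrt{2}, 0, 0), i.e. the Haar coefficients.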

I still haven’t shown why the Daubechies are special, but we’re closer: now, at least, we know that they are one of an infinite number of solutions which satisfy 3 linear and quadratic conditions on the 4 h’s.

I also haven’t shown how we would get the solutions in terms of the angle a. Well, I don’t know yet. There is a chance that another derivation of the Daubechies D4 will proceed via trig polynomials and give us this compact family of solutions. There is also a chance that getting the solutions in terms of an angle is simply black magic.

I’ll try to find out, as we proceed.

Wavelets: Review I and Going Forward a Little

Let us recall what we have.

We have a collection of nested spaces…

\dotsm\ V_{-3} \subset\  V_{-2} \subset\  V_{-1} \subset\  V_{0} \subset\  V_{1} \subset\  V_{2}\ \dotsm\ \

… whose intersection is the trivial space and whose union is all square integrable functions on the real line:

\cap\ V_i = \{0\}\ and \cup\ V_i = L^2(R)\ .

We assume that the space V_0\ is translation invariant and has the scaling property:

f(x) \in V_0\ \text{ if and only if } f(x-k) \in V_0\ for all integers k;

f(x) \in V_j \text{ if and only if } f(2^{-j} x) \in V_0\ .

Finally, the only real theorem I have shown you says that if we also have an orthonormal basis for V_0\ , then we can get an orthonormal basis for L^2(R)\ :

\psi_{j,k} (x) := 2^{j/2} \psi(2^j\ x - k)\

The proof is constructive. In addition to giving us that orthonormal basis, it gives us spaces W_j\ , each of which is the orthogonal complement of V_j\ in V_{j+1}\ . Furthermore, for fixed j, we have an orthonormal basis for W_j\ :

\psi_{j,k}(x)\ will be an orthonormal basis of W_j\ .

We have asserted two kinds of orthogonality:

  • that we have orthonormal bases (for V_0\ and for L^2(R)\ );
  • that we have orthogonal direct sums.

Can we relax these two conditions separately? Yes.

I have shown you an explicit example — for the linear spline (semi-orthogonal wavelets — Strang & Nguyen call them that on the inside front cover, so I will too; they are also called pre-wavelets) — where the bases for V_j\ and W_j\ were not each orthogonal, but the spaces themselves were orthogonal.

I did not provide any existence theorem for that example, nor did I even tell you how I found the mother wavelet for that example. We’ll get there. But the example itself shows that we can have an orthogonal direct sum without having orthonormal bases for V_j\ and W_j\ .

Let me emphasize that, at present, I have no theorem associated with the mother wavelet for the linear splines.

  • The mother wavelet is supposed to be orthogonal to all the translates of the linear spline scaling function, and computation supports that.
  • I infer that all the translates of the mother wavelet are also orthogonal to all the translates of the scaling function.
  • Therefore all the translates of the mother wavelet are in W_0 (because it’s the orthogonal complement of V_0\ ).
  • I say it seems clear that the same is true for scaled versions of the mother wavelet, so, for example, \psi(2t-k) \in W_1\ .

But I don’t actually know some of the most important things I need:

  • do the W_j\ spaces have the scaling property: f(x) \in W_j \text{ if and only if }f(2^{-j} x) \in W_0\ ? (I believe it for the \psi\ , but is it true for the spaces?)
  • do the \psi(t-k) \in W_0\ span W_0\ ?
  • are they a basis for W_0\ ?

(Remember that W_j\ is defined as the orthogonal complement of V_j\ , while the V_j\ are defined as the spans of the scaled and translated scaling functions. While we have finally gotten our hands on elements of W_j\ , we don’t know very much about them. I do expect (but do not know) that once I can explain where the g coefficients came from, we’ll know that we do have bases and the scaling property.)

With the caveat that there are some crucial concerns about the wavelets for the linear splines, let’s just keep using the linear splines as an example of a non-orthonormal basis. (They are, but are the wavelets?)

So, let me recast the dilation equation and its consequences in this framework. For a few moments let us forget that we have an orthonormal basis for V_0\ .

Suppose we do have a basis, which may not be orthonormal. The translates of the linear splines are such a basis – for the space they span.

If we have a scaling function whose integer translates are a basis for V_0\ , then I am quite sure (I didn’t say I had proven) that the translated and scaled functions are a basis for V_1\ . Maybe I should say it the other way, that if the translated and scaled functions are a basis for V_1\ , then we get the dilation equation:

\varphi(t) = \sum_n {h(n)\ A\ \ \varphi(2\ t - n)}.

That just says that the scaling function is an element of V_1\ and may be written in terms of this basis for V_1\ . Of course, this equation serves as the definition of the h’s.

From the dilation equation, we concluded that the sum of the h’s = 2/A.

We also decided that the scaling function, as a solution of the dilation equation, was determined only to within a multiple. One way to characterize those various multiples is by the integral of the function:

E = \int\ \varphi(t)\ dt \ .

I believe that people have shown that the sum of the even h’s is equal to the sum of the odd h’s. I do not yet know when this is true or how to prove it, so I treat it as a condition to be checked whenever I have a collection of h’s in my hand.

In particular, we saw that it was true for the linear splines (our example of a non-orthonormal basis) even though there were an odd number, specifically 3, of nonzero h’s.

If the scaling function has been normalized so that its integral is one (E=1), then we had two further consequences:

  • \sum_k { \varphi(\frac{k}{2^j})} = 2^j \text{(for E = 1)}
  • \sum_k { \varphi(t+k)} = 1 \text{(for E = 1)}\

With the passage of time, I have decided (Hey, sometimes I’m slow! More like, overwhelmed.) that the general forms of those two equations are:

  • \sum_k { \varphi(\frac{k}{2^j})} = 2^j\ E\
  • \sum_k { \varphi(t+k)} = E\

We can confirm those by working through the algebra in the general case; we can also just realize that going from E = 1 to E ≠ 1 effectively replaces the scaling function \varphi\ by E\ \varphi\ . They appear to be independent of A.

(I have totally messed up the notation vis a vis Burrus et al., but I’m not going to change it now. If you are reading Burrus et al., you have every right to curse me out; my E is his A_0\ , and his E is the integral of the square of the scaling function. I will remain consistent with the notation I have adopted, even though it is not that of Burrus et al. When I write “E”, it’s the integral of the scaling function, not of its square.)

These, then, are the six consequences I listed earlier. Actually, two of them really serve only to identify the constants A and E; but if someone just hands me a scaling function, I will compute A and E in order to see what conventions they’re using.

Let me write all 6 consequences together:

  • The sum of the h’s = 2/A.
  • \varphi(t) = \sum_n {h(n)\ A\ \ \varphi(2\ t - n)}\ .
  • E = \int\ \varphi(t)\ dt \ .
  • The sum of the even h’s = the sum of the odd h’s.
  • \sum_k { \varphi(\frac{k}{2^j})} = 2^j\ E\
  • \sum_k { \varphi(t+k)} = E\

I was very happy to have an example of a non-orthogonal basis – the linear splines – for which I have already verified those properties.
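
For the record, here is a sketch of that verification for the linear splines. I am assuming the scaling function is the hat function on [0, 2] with peak value 1 (so E = 1), paired with the three h’s quoted for the linear splines and A = \sqrt{2}; the helper names and the sample grid are hypothetical.

```python
import math

A = math.sqrt(2)
h = [1 / (2 * A), 1 / A, 1 / (2 * A)]     # linear spline coefficients
E = 1.0                                    # integral of the hat function below

def hat(t):
    """Linear spline scaling function: a triangle on [0, 2] with peak 1 at t = 1."""
    return max(0.0, 1.0 - abs(t - 1.0))

def dilation_rhs(t):
    """Right-hand side of the dilation equation: sum_n h(n) A phi(2t - n)."""
    return sum(h[n] * A * hat(2 * t - n) for n in range(len(h)))

def translate_sum(t):
    """sum_k phi(t + k); the range covers every nonzero term for t in [-0.5, 2.5]."""
    return sum(hat(t + k) for k in range(-3, 4))

def dyadic_sum(j):
    """sum_k phi(k / 2^j) over the support [0, 2]."""
    return sum(hat(k / 2 ** j) for k in range(0, 2 ** (j + 1) + 1))

samples = [i / 16 for i in range(-8, 41)]          # t from -0.5 to 2.5
checks = {
    "sum of h's = 2/A": abs(sum(h) - 2 / A) < 1e-12,
    "even sum = odd sum": abs((h[0] + h[2]) - h[1]) < 1e-12,
    "dilation equation": all(abs(hat(t) - dilation_rhs(t)) < 1e-12 for t in samples),
    "partition of unity": all(abs(translate_sum(t) - E) < 1e-12 for t in samples),
    "dyadic sums = 2^j E": all(abs(dyadic_sum(j) - E * 2 ** j) < 1e-12 for j in range(5)),
}
for name, ok in checks.items():
    print(name, ok)
```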

So much for consequences of the dilation equation. What about the mother wavelet?

If we have singled out a particular function in W_0\ , hence in V_1\ , then we get an equation, similar to the dilation equation for the scaling function:

\psi(t) = \sum_n {g(n)\ A\ \ \varphi(2\ t - n)}\ .

(I can’t imagine that that particular function is anything but the mother wavelet, and then this equation says that the coefficients called g are the ones associated with the mother wavelet.)

There is one particular consequence of having an orthonormal basis for V_0\ — the recipe for construction of the orthonormal basis of wavelets leads to the property that the sum of the g’s is zero.

That, in turn, implies that the integral of the mother wavelet — hence the integral of every wavelet — is zero.

But the proof that the integral of the mother wavelet is zero depends only on two properties:

  • that the sum of the g’s is zero;
  • that we may interchange a possibly infinite series and an integral.

The point I am trying to make is that:
if the sum of the g’s is zero — whether our bases are orthonormal or not — then we should expect the integral of the mother wavelet to be zero. If it is not, then there must have been a problem interchanging the integral and an infinite series (and I might be able to provide such an example).

To put that another way, I want to compute the sum of the g’s in any case.

And to be quite explicit, the proof goes by integrating both sides of the wavelet equation. From

\psi(t) = \sum_n {g(n)\ A\ \ \varphi(2\ t - n)}\ .

we get

\int \psi(t) \, dt = \int (\sum_n {g(n)\ A\ \ \varphi(2\ t - n)}) \, dt\ .

If we can interchange the integral and the summation (which may be an infinite series rather than a finite sum), then we can extract the sum of the g’s:

\int \psi(t) \, dt = \sum_n {g(n)\ \int A\ \varphi(2\ t - n) \, dt}\ ,

and so long as the integral on the RHS is finite, we have that the integral of the mother wavelet is proportional to the sum of the g’s.

So if the sum of the g’s is zero, then the integral of the mother wavelet is zero, unless something went wrong interchanging the limit operations.

As I said before, if the sum of the g’s is zero, expect that the integral of the mother wavelet is zero – and if it isn’t, then realize that we’re looking at a pathological case.

To summarize, I have discussed only two properties of the mother wavelet: its sort-of dilation equation and that if the g’s add up to 0, then the integral of the mother wavelet ought to be zero:

  • \psi(t) = \sum_n {g(n)\ A\ \ \varphi(2\ t - n)}\ .
  • \sum_n {g(n)} = 0 \text{ implies } \int \psi(t) \, dt = 0
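
The proportionality can be made explicit: substituting u = 2t - n shows that each integral on the right is A E / 2, so the integral of \psi\ is (A E / 2) times the sum of the g’s. Here is a sketch using the Haar box function, where the sum is finite and the interchange is harmless. The helper names are hypothetical, and the second coefficient list is a deliberately "wrong" choice whose sum is not zero.

```python
import math

A = math.sqrt(2)

def box(t):
    """Haar scaling function: 1 on [0, 1), 0 elsewhere, so E = 1."""
    return 1.0 if 0.0 <= t < 1.0 else 0.0

def integrate(f, a, b, steps=1000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    dx = (b - a) / steps
    return sum(f(a + (i + 0.5) * dx) for i in range(steps)) * dx

def from_coefficients(g):
    """Return the function t -> sum_n g(n) A box(2t - n), with g indexed from 0."""
    return lambda t: sum(g[n] * A * box(2 * t - n) for n in range(len(g)))

E = integrate(box, 0.0, 1.0)
g_haar = [1 / A, -1 / A]        # sum is zero: the Haar mother wavelet
g_other = [1 / A, 1 / A]        # sum is sqrt(2): this just rebuilds the box

integral_haar = integrate(from_coefficients(g_haar), 0.0, 1.0)
integral_other = integrate(from_coefficients(g_other), 0.0, 1.0)
predicted_other = (A * E / 2) * sum(g_other)    # (A E / 2) * (sum of the g's)
```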

I have not checked the sum of the g’s for the mother wavelet for our only non-orthonormal basis, the linear splines.

Why wait?

In the semi-orthogonal post, I handed you the g’s for the mother wavelet orthogonal to the translates of the linear spline scaling function:

g = \left\{\frac{1}{24 \sqrt{2}},-\frac{1}{4 \sqrt{2}},\frac{5}{12 \sqrt{2}},-\frac{1}{4 \sqrt{2}},\frac{1}{24 \sqrt{2}}\right\}\

They do indeed add up to zero. I expect that the integral of the mother wavelet is zero, and we can probably prove it by writing it in terms of the scaling function. Not now.
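
Both claims can be checked with a short sketch. The common factor 1/\sqrt{2} is pulled out so that the sum can be done in exact rational arithmetic. For the integral I assume the scaling function is the hat function on [0, 2] and that the g’s are indexed from n = 0; the index alignment is my assumption, but a shift of the indices only translates \psi\ and does not change its integral. The helper names are hypothetical.

```python
from fractions import Fraction

# The g's above, with the common factor 1/sqrt(2) factored out.
g_scaled = [Fraction(1, 24), Fraction(-1, 4), Fraction(5, 12),
            Fraction(-1, 4), Fraction(1, 24)]
exact_sum = sum(g_scaled)                  # exact rational arithmetic

def hat(t):
    """Linear spline scaling function: a triangle on [0, 2] with peak 1 at t = 1."""
    return max(0.0, 1.0 - abs(t - 1.0))

def psi(t):
    """sum_n g(n) A hat(2t - n) with A = sqrt(2), so that A g(n) = g_scaled[n]."""
    return sum(float(c) * hat(2 * t - n) for n, c in enumerate(g_scaled))

def integrate(f, a, b, steps=6000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    dx = (b - a) / steps
    return sum(f(a + (i + 0.5) * dx) for i in range(steps)) * dx

integral_psi = integrate(psi, 0.0, 3.0)    # psi is supported on [0, 3] here
```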

I should point out that even though we know so little about this mother wavelet, we know that it is in V_1\ because we were given the g’s that describe it wrt our basis \{\varphi(2t-k)\}\ of V_1\ ! And I was assured that it was, in fact, in W_0\ .

I had intended to march right on into the consequences of our two kinds of orthogonality, but this post is probably long enough, and this is a good stopping point.

Wavelets: Multiresolution Analysis (MRA)

By and large I try not to flee into cold, formal mathematics, not here. You can find all too many books that will give you just the mathematics (and some of them are invaluable references). On the other hand, sometimes I am way too vague. Let me try to give you a clear statement of what I’m going on about when I say “multiresolution analysis” (MRA).

(Don’t misunderstand me. When I’m studying something, I’m sitting there with collections of definitions, theorems, proofs, and examples – trying to make sense of them. I just think that the collection itself is not a substitute for understanding.)

There’s a lot to be said for having a clearly stated theorem. Working through this has had a large impact on my own grasp of the properties of wavelets.

The most lively summary of multiresolution analysis can be found in the first couple of pages of chapter 5 of Daubechies’ “Ten Lectures on Wavelets”. The most lively introduction can be found on pages 10-16.

I have more than a few, close to several, books that seem to present the following concepts in the same way. First, they use multi-resolution analysis to describe orthonormal wavelets; then they use filter banks to describe biorthogonal wavelets; finally, they explain how biorthogonal wavelets could be described by a modified multi-resolution analysis.

No one (of these authors at my fingertips) starts with a modified multiresolution analysis, instead of ending with it. And so I’m not going to either. Although I have not shown you biorthogonal wavelets yet, I have shown you “pre-wavelets” (which I would prefer to call semi-orthogonal wavelets), so we will have something in hand while we look at the standard multi-resolution analysis leading to orthonormal wavelets.

In other words, although I will start the same way these authors do, I will be waving a generalization in our faces, probably with the very next wavelet post.

I want to summarize Daubechies’ introduction to multi-resolution analysis (pp. 129-130). If you own a copy of her book, you should open it to chapter 5 and read the master. If you don’t own a copy of her book, and you are seriously interested in wavelets, then — at the very least — you should find a colleague or a library that owns a copy. But you don’t have to do it right away.

Here’s my summary of the beginning of her chapter 5.

And if something comes out wrong here, you should probably (ahem, almost certainly) assume that she got it right and I got it wrong. Oh, I hope it turns out that I’ve clearly distinguished my comments from her summary. Let’s go.

She says, “Multiresolution analysis provides a natural framework for the understanding of wavelet bases, and for the construction of new examples.”

So how does she describe multiresolution analysis? It is a collection of spaces V_j\ which satisfy 6 conditions (whose equation numbers from her text I have included).

First (5.1.1), let’s suppose we have an infinite ladder of closed spaces, one inside another:

\dotsm\ V_{-3} \subset\  V_{-2} \subset\  V_{-1} \subset\  V_{0} \subset\  V_{1} \subset\  V_{2}\ \dotsm\ \

(That is not her notation. She uses the negatives of those indices for the ordering of the subspaces, but I will continue to follow Burrus et al. She said, “I choose here the same nesting order (the more negative the index, the larger the space) as in the ladder of Sobolev spaces. This is also the order that follows naturally from the notation of non-orthogonal wavelets…. It is nonstandard, however: Meyer (1990) uses the reverse ordering, more in accordance with established practices in harmonic analysis.”)

Let me emphasize: I have displayed, and Burrus and I are using, an order in which the more positive the index, the larger the space. Where we have, in particular,

 V_{0} \subset\  V_{1}

she writes  V_{0} \subset\  V_{-1}\ .

Let me also emphasize that the notation is not standardized, and in any given book, you need to see which of the two conventions they are using.

Second and third, let’s suppose that the union of these spaces is all of L^2(R)\ , the space of (equivalence classes of Lebesgue) square-integrable real-valued functions on the real line R, and their intersection is the {0}-space; that is, (5.1.2),

\cup\ V_i = L^2(R)\

and (5.1.3)

\cap\ V_i = \{0\}\ .

Burrus has those conditions; I just didn’t mention them before. One of our goals is to construct a basis for all of L^2(R)\ .

Fourth (5.1.4), “… the multi-resolution aspect is a consequence of the additional requirement…”

f(x) \in V_j \text{ if and only if } f(2^{-j} x) \in V_0\ .

(I did have to change that a little bit because I switched from her index convention on the V_j\ .)

Furthermore, I phrased that the other way, and so did Burrus:

f(x) \in V_0 \text{ if and only if }f(2^{j} x) \in V_j\ .

I can’t see that it matters, given that the conditions are “if and only if”.

Fifth (5.1.5), we will also require that the space V_0\ be invariant under translation:

f(x) \in V_0 \text{ if and only if } f(x-k) \in V_0\ for all integers k.

Sixth (5.1.6) and finally, we will assume that there exists a scaling function \varphi\ (she uses the symbol \phi\ ) such that its integer translates are an orthonormal basis for V_0\ .

NOTE that the scaling function \varphi\ is merely one example, a special case, of the functions f \in V_0\ . She, like most of my references, is distinguishing the scaling and translation properties of the spaces (with elements f) from the scaling and translation properties of the basis \{\varphi(x-k)\}\ .

This was not what Burrus, and I, did. I followed him in starting with the scaling function \varphi\ , defining V_0\ as its space of translates, V_1 as the space of scaled translates, and then requiring that V_0 \subset V_1\ . Instead of talking about arbitrary elements f in V_0\ , we spoke specifically about the basis generated by integer translates of \varphi\ .

This is a very natural thing to do, since we have the Haar and Daubechies orthonormal wavelets in our hands…. More precisely, since we have the orthonormal scaling functions in our hands, it is natural to use them to define the spaces V_j\ .

The general multiresolution analysis is stated, by contrast, as though we define the spaces V_j\ and then find an orthonormal basis for V_0\ . Well, that’s what Daubechies did, to get her wavelets.

(In fact, I avoided talking about whether the \varphi\ (x-k) were a basis. I haven’t even defined what I mean by a basis! Let me back up and see what Daubechies said earlier. Hmm. Not much at all. It appears to me that she takes for granted the notion of a basis in a Hilbert space. Let me do the same for a while. And I’ll comment at the end.)

Back to Daubechies.

“The basic tenet of multiresolution analysis is that whenever a collection of closed subspaces satisfies [those 6 conditions], then there exists an orthonormal wavelet basis… of L^2(R)\ …”

\psi_{j,k} (x) := 2^{j/2} \psi(2^j\ x - k)\

such that (5.1.7, projection formula)

P_{j+1} f = P_j f +  \sum_{k \in Z} \langle f, \psi_{j,k} \rangle \  \psi_{j,k}\ ,

where P_j is the orthogonal projection operator onto V_j\ . She continues: (5.1.2) ensures that \lim_{j \to \infty} P_j f = f \text{ for all } f \in L^2(R)\ . In my words, functions in our V_j\ spaces are better approximations of functions in L^2(R)\ as we go to higher indices.
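To make the projection formula concrete, here is a minimal Python sketch for the Haar system, in our index convention. The setup is my own assumption, not Daubechies': I restrict to [0, 1), treat f as piecewise constant on 2^J cells, and the helper names `project` and `haar_psi_jk` are made up for illustration. For Haar, P_j is just block averaging, so both sides of the formula can be compared directly.

```python
import numpy as np

def project(samples, j, J):
    """Haar projection P_j of a function given on 2**J cells of [0, 1):
    replace each block of 2**(J - j) cell values by the block mean."""
    block = 2 ** (J - j)
    means = samples.reshape(-1, block).mean(axis=1)
    return np.repeat(means, block)

def haar_psi_jk(j, k, J):
    """Values of psi_{j,k}(x) = 2**(j/2) psi(2**j x - k) at the cell midpoints."""
    x = (np.arange(2 ** J) + 0.5) / 2 ** J
    u = 2 ** j * x - k
    up = np.where((0 <= u) & (u < 0.5), 1.0, 0.0)
    down = np.where((0.5 <= u) & (u < 1.0), 1.0, 0.0)
    return 2 ** (j / 2) * (up - down)

J, j = 8, 3                                   # fine grid level and the level tested
dx = 1.0 / 2 ** J
x = (np.arange(2 ** J) + 0.5) * dx
f = np.sin(2 * np.pi * x) + x ** 2            # arbitrary test function

# Sum over k of <f, psi_{j,k}> psi_{j,k}, with the inner product as a cell sum.
detail = np.zeros_like(f)
for k in range(2 ** j):
    psi = haar_psi_jk(j, k, J)
    detail += np.sum(f * psi) * dx * psi

lhs = project(f, j + 1, J)
rhs = project(f, j, J) + detail
max_err = np.max(np.abs(lhs - rhs))
print(max_err)                                # should be at rounding-error level
```

The difference between the two sides should be rounding error only, because on each level-j block the wavelet term is exactly half the difference of the two half-block means, with opposite signs on the two halves.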

This is it in a nutshell. The key theorem is that those 6 assumptions get us an orthonormal basis \psi_{j,k} (x) \text{ for } L^2(R)\ .

Along the way, we get several other results. First, I should emphasize that we just said that all the wavelets are mutually orthogonal, across scale and under translation.

For another thing, the proof is by construction, so we actually know what the \psi\ are. They live in (our familiar) W_j\ spaces, each of which is the orthogonal complement of V_j\ in V_{j+1}\ :

V_{j+1} = V_j \oplus W_j\ .

(She will obtain our general formula for the g coefficients of \psi from the h coefficients of \varphi\ .)

She points out that the W_j\ spaces inherit the scaling property:

f(x) \in W_j\text{ if and only if }f(2^{-j} x) \in W_0\ .

Equivalently (it seems to me), f(x) \in W_0\text{ if and only if }f(2^{j} x) \in W_j\ .

She also points out that (5.1.7, that projection formula) tells us that for each fixed j, the \psi_{j,k}(x)\ will be an orthonormal basis of W_j\ ; even better, once we know the \psi_{0,k}(x)\ are an orthonormal basis of W_0\ , then for each j, the \psi_{j,k}(x)\ will be an orthonormal basis of W_j\ .

Hence it suffices to get one specific \psi\ , the mother wavelet.

She tells us something similar for the V_j\ spaces: from the basis for V_0\ , we get a basis for every V_j\ . If we define and denote

\varphi_{j,n} (x) := 2^{j/2} \varphi(2^j\ x - n)\ ,

for integers j,n

then \varphi_{j,n}(x)\ is an orthonormal basis for V_j\ .

(We check that. For j = 0, we have

\varphi_{0,n}(x) = \varphi(x-n)\ ,

which are translates of the scaling function

and

\varphi_{0,0}(x) = \varphi(x)\

is the scaling function itself. So this formula does work for V_0\ . As does the formula for the \psi_{j,k}(x)\ .)

We get all the V,W orthogonal decompositions…

W_j \perp W_k\ if j ≠ k.

V_j = V_J \oplus \bigoplus_{k=0}^{j-J-1}\ W_{J+k}\ , for J < j.

(I had to change that from what she had because of the different sign convention on indices. I hope I got it right, but we’ve already seen it in special cases.)
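As a quick check on the indices (my own worked instance, in our sign convention), take J = 0 and j = 2. The sum then runs over k = 0, 1, and the general pattern is that the W's run from W_J up to W_{j-1}:

```latex
V_2 = V_0 \oplus W_0 \oplus W_1 ,
\qquad
V_j = V_J \oplus W_J \oplus W_{J+1} \oplus \dotsm \oplus W_{j-1} \quad (J < j).
```

With j = J + 1 this collapses to V_{J+1} = V_J \oplus W_J, the defining relation for the W spaces.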

L^2(R) = \oplus_{j \in Z}\ W_j\ .

(That last one better be true, since it corresponds to the \psi_{j,k}(x)\ being a basis for L^2(R)\ !)

To construct \psi\ she derives a few more properties, which I will omit, since we’ll see some of them soon enough (and others pertain to the frequency domain, which we’ll see down the road). Perhaps it is more important that she obtains \psi\ as the inverse Fourier transform of another function. As a result, the \psi\ may be distributions instead of functions.

The properties I wanted to make sure we saw here are:

  • the \psi_{j,k}\ are an orthonormal basis for L^2(R)\ ;
  • for fixed j, the \psi_{j,k}\ are an orthonormal basis for W_j\ ;
  • the W_j\ spaces have the scaling property: f(x) \in W_j \text{ if and only if }f(2^{-j} x) \in W_0\ ;
  • for fixed j, the \varphi_{j,k}\ are an orthonormal basis for V_j\ .

    Projections and direct sums

    At some point when I was looking at this — I don’t remember whether it was her projection operator or someone else’s — I finally remembered that direct sum decompositions correspond precisely to projection operators. That is, if we have a (possibly non-orthogonal) direct sum of subspaces,

    V = U + W,

    then the linear operator P which sends each v = u + w (with u in U and w in W) to its component u is a projection onto U, and I – P is the projection onto W. And we have a very convenient test: a linear operator P is a projection if and only if it is idempotent:

    P^2 = P\ .

    Furthermore, the direct sum is orthogonal

    V = U \oplus W\

    if and only if the projection P is self-adjoint as well as idempotent.

    How could I have forgotten that projections and direct sums go together? Well, I’ll try not to forget again. If one is on stage, then so is the other; one may be in the background, and the other in the foreground, but they are both on stage.

    I might as well make a few more remarks. In finite dimensional spaces, it’s straightforward to get a projection onto a subspace. If we have U \subset V\ , then we start with a basis for U, extend it to a basis for V, and then define the projection operator by its action on that basis for V: if e is a basis vector in U, then Pe = e; if e is a basis vector not in U, then Pe = 0.
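That finite-dimensional recipe is easy to try out. Here is a minimal sketch in Python with NumPy; the choice of R^2, with U the x-axis and the complementary subspace the line y = x, is my own illustrative assumption. It builds the projection from its action on a basis and then applies the idempotent and self-adjoint tests.

```python
import numpy as np

# Columns are the basis vectors: e1 = (1, 0) spans U, e2 = (1, 1) spans W.
B = np.array([[1.0, 1.0],
              [0.0, 1.0]])

# Define P by its action on the basis (P e1 = e1, P e2 = 0),
# then change back to standard coordinates.
P_oblique = B @ np.diag([1.0, 0.0]) @ np.linalg.inv(B)

# For comparison: the orthogonal projection onto the x-axis,
# whose complementary subspace is the y-axis.
P_orth = np.array([[1.0, 0.0],
                   [0.0, 0.0]])

for name, P in (("oblique", P_oblique), ("orthogonal", P_orth)):
    idempotent = np.allclose(P @ P, P)
    self_adjoint = np.allclose(P, P.T)
    print(name, "idempotent:", idempotent, "self-adjoint:", self_adjoint)
```

Both operators should pass the idempotent test, but only the second should be symmetric, which matches the statement that the direct sum is orthogonal exactly when the projection is self-adjoint.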

    In infinite dimensions we have more work to do. In fact, my recollection is that John von Neumann created the axiomatic theory of infinite dimensional complete inner product (i.e. Hilbert) spaces by working with the properties of orthogonal projections.

    closed nested spaces

    Why did Daubechies assume that the V_i were closed? Well, let me just quote the appendix of Halmos’s Finite Dimensional Vector Spaces: “In the discussion of manifolds, functionals, and transformations the situation becomes uncomfortable if we do not make a concession to the topology of Hilbert space. Good generalizations of all our statements for the finite dimensional case can be proved if we consider closed linear manifolds, continuous linear functionals, and bounded linear transformations. (In a finite dimensional space every linear manifold is closed, every linear functional is continuous, and every linear transformation is bounded.)”

    In other words, she made the standard assumption of closed subspaces for working in infinite dimensions.

    The Lebesgue integral and the space L^2

    If you have never seen the Lebesgue integral, think of L^2(R)\ as a space of functions whose squares are Riemann integrable (it isn’t); the subtlety here is that we really consider two functions to be the same if they differ only on a set of measure zero — think, “at a finite or countable number of points”. We’re not just using functions so well behaved that their squares have a Riemann integral, but if you’ve never seen this, let it go for now.

    Actually, before we let it go, there’s another way to think of the Lebesgue integral…. I’m certain of this for compact intervals [a,b], so that’s how I’ll state it. Even though we’re interested in L^2(R)\ , the following property of a compact interval may be helpful. The space of continuous functions on [a,b] with inner product the Riemann integral is not complete: not all Cauchy sequences converge to a limit in the space. But it’s a metric space, so we can complete it – and the result is L^2[a,b]\ . That is, L^2[a,b]\ is to C^2[a,b]\ as R is to Q, where by C^2[a,b]\ I mean the continuous functions with that integral norm (not the twice-differentiable functions).

    You might counter with, “But C[a,b] is complete!”. Yes, with a different norm. It’s actually C^\infty[a,b]\ that is complete, where by analogy with L^\infty\ I mean the continuous functions with the max norm (not the smooth functions): convergence in the max norm is uniform convergence, and a uniform limit of continuous functions is continuous.

    Let me be precise. We take the set X of continuous functions {x, y, …} defined on a compact interval J = [a,b]. Then we consider two different metrics. One is the sup (or max) metric:

    d(x,y) = max_{t \in J} |x(t) - y(t)|\ .

    Then the resulting metric space, usually denoted C[a,b], is complete: the limit of any Cauchy sequence exists in the space. Furthermore, convergence in that space is uniform convergence.

    But let’s take a different metric.

    d(x,y) = \left(\int_a^b (x(t)-y(t))^2 \, dt\right)^{\frac{1}{2}}\ .

    Then the resulting metric space is not complete; but its completion is precisely L^2[a,b]\ . That’s one answer to the question, “Why should I care about the Lebesgue integral?”
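Here is a small numerical illustration of the difference between the two metrics, as a sketch under my own choice of example: the interval [-1, 1] and the continuous ramps clip(n t, -1, 1), which steepen toward a discontinuous step. In the integral metric the ramps get closer and closer together (a Cauchy sequence whose limit is not continuous); in the sup metric they never do.

```python
import numpy as np

t = np.linspace(-1.0, 1.0, 200001)
dt = t[1] - t[0]

def ramp(n):
    """Continuous function on [-1, 1] that steepens toward sign(t) as n grows."""
    return np.clip(n * t, -1.0, 1.0)

def sup_dist(x, y):
    """Sup (max) metric, approximated on the grid."""
    return np.max(np.abs(x - y))

def l2_dist(x, y):
    """Integral metric, approximated by a Riemann sum on the grid."""
    return np.sqrt(np.sum((x - y) ** 2) * dt)

for n in (10, 100, 1000):
    a, b = ramp(n), ramp(2 * n)
    print(n, sup_dist(a, b), l2_dist(a, b))
```

The sup distance between ramp(n) and ramp(2n) should stay near 1/2 for every n, while the integral distance should shrink roughly like 1/sqrt(6n).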

    A basis for an infinite dimensional space

    The issue is that if a vector is written as an infinite series of coefficients times basis vectors, then we must worry about whether that infinite series converges to a unique answer independent of the order of the basis vectors, as well as about convergence for any one particular order…. I’ve actually just taken a refreshing walk thru a few books, but even a summary of what’s involved when we talk about a basis would take a full post in its own right.

    If you’re curious, however, we’re talking about Schauder bases rather than Hamel, and we’re concerned about unconditional versus conditional bases, and Riesz bases, and (gulp) frames (which are spanning sets that need not be linearly independent). And just for fun: every linear space has a Hamel basis (if you believe in the Axiom of Choice, i.e. Zorn’s Lemma), but not every separable normed linear space has a Schauder basis. And, since the proof of existence of a Hamel basis requires the Axiom of Choice, we have a non-constructive proof: there is a basis, but can you find it? By contrast, I think we can find a Schauder basis if one exists at all.

    wavelets without MRA?

    Let me quote Daubechies. From p. 136: “Even though every orthonormal wavelet basis of practical interest, known to this date [1992], is associated with a multiresolution analysis, it is possible to construct ‘pathological’ \psi such that the [\psi_{j,k} (x) := 2^{j/2} \psi(2^j\ x - k)\ ] constitute an orthonormal basis for L^2(R) but are not derivable from a multiresolution analysis.” (I altered the signs on her right-hand-side indices.) She provides an example.

    In other words, MRA is a very convenient framework from which to get wavelets, but not every wavelet comes from an MRA. Still, every wavelet of practical interest did. (I don’t know if that is still true, more than 15 years later. Let’s keep our eyes open.)

    Oh, one of her endnotes says that if the mother wavelet has compact support, then we can find an MRA which leads to it.

    Next, we will return to consequences of orthogonality, now that I have more context and understanding.

    Wavelets: semi-orthogonal, from linear splines

    Edit: 13 July, just one remark added. see “edit” below.

    How are things going? As I said yesterday in “happenings”, the simplest answer is: right now I’m just thrilled when I can reproduce a drawing in a book, even if I don’t understand why the method works or where the function came from.

    This is the post that I ended up with when I started out to write a “happenings” post yesterday. I promised you a picture. In fact, I’ll give you a mother wavelet for the linear spline scaling function. (That link occurs a few more times. What can I say? That’s what I’m building on, in more than one way.)

    You may recall that I have shown you two ways to approximate a scaling function. The one I do not understand applies convolution and downsampling repeatedly. The one I do understand is the dyadic expansion, and I’ve been using it ever since I worked it out, in preference to the other method.

    Well. I have now seen a scaling function which is infinite at all the dyadic points, so the dyadic expansion hasn’t got a prayer of working! But the convolution and downsampling algorithm gives me a drawing which seems to match the book.

    I do not yet understand how that scaling function was determined, but at least I have a picture. (Not the picture you’ll see later in this post. Different function, different picture; soon but not now.)

    I also have a picture of the corresponding mother wavelet. But this scaling function and mother wavelet are half of a biorthogonal set of four functions — and I don’t yet understand how to get them. (Biorthogonal means that I have a non-orthonormal basis and its reciprocal basis.) Worse, although I’ve seen pictures of the other pair of functions, I don’t know enough about them to even draw them!

    But I expect to. Most of the time. Sometimes in the evening I wonder if I’m in over my head, but the feeling is usually gone when I sit down to do mathematics the next morning.

    On the plus side, I have read about these new functions while looking at orthogonality — more precisely, while looking at relaxing the orthogonality assumptions. That is, while looking at biorthogonal and semi-orthogonal wavelets. (Okay, that last is not standard terminology, as far as I know: I think the second case is called “a semi-orthogonal multi-resolution analysis” and “pre-wavelets“. It just seems natural to me to apply the term “semi-orthogonal” to the wavelets, especially when I’m contrasting them with orthogonal or biorthogonal wavelets.)

    Here is an overview of what is going on. I’ve said at least some of this in more detail before.

    A scaling function \varphi(t) and its integer translates define a space V_0\ . Then the function \varphi(2t) and its integer translates define a space V_1\ . It is a major assumption that V_0 is a subspace of V_1\ (and this, combined with the definition of V_1\ , gives us the dilation equation). Then we define the space W_0\ as the difference between V_0\ and V_1\ (i.e. as the complement of V_0\ in V_1\ ). W_0\ is the space of the mother wavelet. We have

    V_1 = V_0 + W_0\ .

    If \varphi(t) and its integer translates are an orthonormal basis for V_0\ , then W_0\ is in fact the orthogonal complement of V_0\ ; the direct sum is an orthogonal direct sum

    V_1 = V_0 \oplus W_0\

    and we may find a function (the mother wavelet) whose integer translates form an orthonormal basis for W_0\ .

    (Incidentally, in most of my books, people appear to use \oplus\ for the direct sum whether or not it is orthogonal. That is, you can’t tell usually from the written decomposition whether or not it is orthogonal. You have been warned.)

    (Another aside…. To see an example of a non-orthogonal direct sum decomposition, take R^2\ , i.e. the xy-plane, and take the non-orthogonal basis (1,0) and (1,1). Then the x-axis and the line y=x are non-orthogonal 1D subspaces, and their direct sum is the xy-plane. One possible orthogonal direct sum, in contrast, is the x-axis and the y-axis.)

    But what if \varphi(t) and its integer translates are a basis for V_0\ , but not orthonormal? More specifically, what if they are not orthogonal? (If they are orthogonal, simply changing their sizes will make them orthonormal.)

    You should be thinking of linear splines, those triangles that form a basis for V_0\ but not an orthogonal one.

    Well, we can still find an orthogonal direct sum, and we can even find a mother wavelet whose integer translates form a basis for W_0\ .

    But just as the basis for V_0\ is not orthogonal, neither is the basis for W_0\ . Still, because the direct sum is orthogonal, every element of W_0\ is orthogonal to every element of V_0\ . They’re just not orthogonal to their own integer translates: \psi(t)\ may very well be orthogonal to many or most of its integer translates, but it is not orthogonal to all of them.

    You know what? Pictures are good, so here are two pictures.

    The linear spline scaling function, as we have seen, has three h coefficients, namely:

    \left\{\frac{1}{2 \sqrt{2}},\frac{1}{\sqrt{2}},\frac{1}{2 \sqrt{2}}\right\}\

    … and it looks like this:

    july 12 scaling
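As a sanity check on those three h coefficients, here is a minimal Python sketch (not the code behind the picture). It assumes the linear spline is the hat function supported on [0, 2] with peak 1 at t = 1, that the h's are indexed n = 0, 1, 2, and that the dilation equation carries the \sqrt{2} normalization, which is consistent with the h's summing to \sqrt{2}.

```python
import numpy as np

def hat(t):
    """Linear spline on [0, 2]: rises from 0 to 1 on [0, 1], falls back on [1, 2]."""
    t = np.asarray(t, dtype=float)
    return np.maximum(0.0, 1.0 - np.abs(t - 1.0))

# h = {1/(2 sqrt 2), 1/sqrt 2, 1/(2 sqrt 2)}
h = np.array([1.0, 2.0, 1.0]) / (2.0 * np.sqrt(2.0))

# Dyadic grid on [-1, 3] with spacing 1/128.
t = np.arange(-128, 3 * 128 + 1) / 128.0

# Right-hand side of the dilation equation: sum over n of h(n) sqrt(2) phi(2t - n).
rhs = sum(h[n] * np.sqrt(2.0) * hat(2 * t - n) for n in range(3))

dilation_error = np.max(np.abs(hat(t) - rhs))
print(dilation_error)     # should be at rounding-error level
```

If the assumptions hold, the hat function reproduces itself from its three half-width copies, with weights 1/2, 1, 1/2.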

    Now, if we take as our mother wavelet a function whose g coefficients are (think of me as Moses waving stone tablets in your face saying, “God gave me these.”) …

    \left\{\frac{1}{24 \sqrt{2}},-\frac{1}{4 \sqrt{2}},\frac{5}{12   \sqrt{2}},-\frac{1}{4 \sqrt{2}},\frac{1}{24 \sqrt{2}}\right\}\

    …then we get:

    july 12 mother

    The most significant feature may very well be that this mother wavelet has 5 coefficients while its scaling function has 3 coefficients.

    The second most important feature may be that this mother wavelet is a spline, like the scaling function. Some people know a lot about splines, and some of them would rather have splines than orthonormal bases.

    That mother wavelet is orthogonal to all of the integer translates of the linear spline. It is not orthogonal to its own translates by \pm1\ or \pm2\ , just as the linear spline is not orthogonal to its own translates by \pm1\ .

    This is the decomposition which is called a “semi-orthogonal multi-resolution analysis”, and that mother wavelet is called a “pre-wavelet”. So far, Strang & Nguyen (Strang, Gilbert; Nguyen, Truong. Wavelets and Filter Banks. Wellesley-Cambridge Press, 1997 (revised edition). ISBN 0 9614088 7 1) is the only book in which I have found this. Oh, this is not an example of biorthogonal wavelets; for that, I need a second scaling function and two mother wavelets.

    (edit: 13 July. I should remark that we’re not out of the woods in the semi-orthogonal case. Because our bases for V_0 and W_0 are not orthogonal, it seems to me that we will still need to find their reciprocal bases, for computing components of vectors.)

    This is the function whose existence I believed in but couldn’t find. Well, now I’ve been given it but I don’t understand where those five coefficients came from.

    Yet.

    As a reminder, the mother wavelet was computed from its g coefficients, using the analog of the dilation equation:

    july 12 defn mother
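Written out, and assuming the same \sqrt{2} normalization as in the dilation equation (the h's above sum to \sqrt{2}), that analog would read as follows, with n running over the five g coefficients (indexing them 0 through 4 is my assumption):

```latex
\psi(t) = \sum_{n} g(n)\ \sqrt{2}\ \varphi(2t - n)
```

With the linear spline supported on [0, 2], that indexing would put the support of \psi on [0, 3].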

    Please note that the scaling function was computed, as before, using the dyadic expansion; therefore the mother wavelet is computed only at dyadic points. In particular, neither of these is the terrible function I referred to, but have not yet shown you, which is infinite at all the dyadic points, and therefore must be computed by a different method.

    Oh, a point is “dyadic” if it’s an integer divided by a power of 2; that is, all those points at which I have been computing scaling functions.

    Let me show you some quick calculations of integrals. As usual, I am going to calculate the finite sum of areas of rectangles:

    \sum_i { f(x_i)\ dx}\

    with dx = 1/128, and f(x) = a product of two functions. Just for simplicity, I let x range from -6 to 6, because some of my functions are shifted. NOTE that the shift is given by k, and that value is set in the command.

    First, let’s approximate \int\ \varphi(x)\ \psi(x)\ dx\ . It should be 0, because the mother wavelet is orthogonal to the scaling function:

    july 12 orth 1

    Just to be clear: that calculation had f(x) = \varphi(x)\ \psi(x)\ .

    The mother wavelet is also orthogonal to translates of the scaling function, so we expect \int\ \varphi(x-1)\ \psi(x)\ dx = 0\ . The approximating sum is…

    july 12 orth 2

    and for \int\ \varphi(x-2)\ \psi(x)\ dx = 0\ , I get:

    july 12 orth 3

    Those three were very good (and identical!) approximations to 0. If we now approximate \int\ \varphi(x-3)\ \psi(x)\ dx = 0\ , we will get exactly zero, because the functions do not overlap at all; we have shifted the scaling function quite far enough.

    july 12 orth 4

    Now look at the inner product of two translates of wavelets. Just as the scaling function is not orthogonal to overlapping translates, the mother wavelet is not orthogonal to its overlapping translates.

    Here, as one example, is (an approximation of) \int\ \psi(x-1)\ \psi(x)\ dx\

    july 12 not orth

    We’ve seen numbers that approximate zero, and that isn’t close.

    What about scale? That still works: a wavelet at one scale is orthogonal to wavelets at a different scale. Here, as one example, is the approximation of \int\ \psi(2x-1)\ \psi(x)\ dx\ .

    july 12 orth 5
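The Riemann sums in this section can be sketched in Python as follows. This is a reconstruction under stated assumptions, not the commands in the screenshots. The spline is taken as the hat function on [0, 2], the g's are indexed n = 0 through 4, the wavelet is built with the \sqrt{2} normalization, and the functions are evaluated in closed form rather than by dyadic expansion. The grid matches the text: dx = 1/128, x from -6 to 6.

```python
import numpy as np

def hat(t):
    """Linear spline scaling function, supported on [0, 2]."""
    t = np.asarray(t, dtype=float)
    return np.maximum(0.0, 1.0 - np.abs(t - 1.0))

g = np.array([1 / 24, -1 / 4, 5 / 12, -1 / 4, 1 / 24]) / np.sqrt(2.0)

def psi(t):
    """Mother wavelet: sum over n of g(n) sqrt(2) phi(2t - n), n = 0..4 (assumed)."""
    t = np.asarray(t, dtype=float)
    return sum(g[n] * np.sqrt(2.0) * hat(2 * t - n) for n in range(5))

dx = 1 / 128
x = np.arange(-6 * 128, 6 * 128 + 1) * dx

def riemann(values):
    """Finite sum of rectangle areas, sum of f(x_i) dx."""
    return float(np.sum(values) * dx)

# Scaling-function translates against the wavelet, shifts k = 0, 1, 2, 3.
phi_psi = [riemann(hat(x - k) * psi(x)) for k in range(4)]

# Wavelet against its own translate by 1 (not expected to vanish).
psi_psi_shift = riemann(psi(x - 1) * psi(x))

# Wavelet against a wavelet at the next finer scale.
psi_psi_scale = riemann(psi(2 * x - 1) * psi(x))

print(phi_psi, psi_psi_shift, psi_psi_scale)
```

Under these assumptions the first three entries of `phi_psi` should be tiny but not exactly zero, the k = 3 entry should be exactly zero because the supports do not overlap, `psi_psi_shift` should be clearly nonzero, and `psi_psi_scale` should again be tiny.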

    I expect that the next wavelets post will be a clean summary of multi-resolution analysis for orthogonal wavelets. I’ve been flailing a bit, and I think I owe you some clear-cut statements of what we can get.

    I was all set to put out a post about consequences of orthogonality, a couple of weeks ago, until I realized that the two ideas (orthogonality of subspaces V,W and orthogonality of functions) needed to be distinguished at a deeper level. Not just conceptually, as I had, but in terms of their consequences.

    This example, which has orthogonal subspaces V,W but has neither orthogonal translates of scaling functions nor orthogonal translates of wavelets, makes me very happy. I know we can break the link between the two ideas of orthogonality, and in fact I now have both the scaling function and the mother wavelet in my hands, so I can compute to my heart’s content and see what is no longer true in this case.

    (There’s nothing special about linear splines. We can do this for higher order splines, too. Well, okay, I still have to figure out how to derive the g coefficients for the mother wavelet. And I’m pretty sure there were other possible solutions for the linear spline, with more than 5 coefficients, so there are probably an infinite number of solutions for any spline.)

    I am now willing to return to the standard presentation of multi-resolution analysis of orthogonal wavelets, which assumes that both forms of orthogonality hold. I now have sufficient context for that standard presentation. That is enough to let me move on; I just needed to know their place in the universe. And, as I say, after all the confusion, I owe you an unambiguous summary of orthogonal wavelets.

    Wavelet properties: orthogonality & counterexample

    orthogonality

    Edit 3 Aug. It is embarrassing to be so wrong in my guesses. Well, I understand more today than I did yesterday.

    The two forms of orthogonality are not the same. We saw in the subsequent post about semi-orthogonality that we could have an orthogonal direct sum V_0 \oplus W_0 while having non-orthonormal bases in V_0 and W_0\ . Conversely, if we had a direct sum which was not orthogonal, we could still choose orthonormal bases in the two spaces. Okay, let me rephrase that: if we have bases in V_0 and W_0\ at all, I know we can make them orthonormal.

    I am no longer sure about the following guess about integer translates of elements of W_0\ . And I am completely wrong about the conjecture in red.

    Edit 28 Jun: I believe that once I know that \psi(t)\ is a wavelet (i.e. an element of W_0\ ), then so are all of its integer translates. This remark occurs once later, as well as here.

    Let me speak in general terms first.

    There seem to be two properties subsumed under the rubric “orthogonality”.
    Edit June 29. “Seem” is the correct word. I am not ready to make this precise, but these two apparently different properties are very close, and possibly the same. I think I can say that if the scaling function and its translates are not orthogonal, then W_0 is not an orthogonal complement to V_0\ . To put that another way, we can still define a direct sum decomposition

    V_1 = V_0 + W_0

    but V_0\ and W_0 are not orthogonal subspaces. And so, when I convinced myself that “if \psi(t)\ is a wavelet (i.e. an element of W_0\ ), then so are all of its integer translates”, I was right… because I was assuming W_0\ to be orthogonal to V_0\ , and that’s one heck of an assumption.

    I am looking into non-orthogonal subspaces, and into biorthogonal wavelets.

    One is that we have, for example, the orthogonal direct sum decompositions

    V_1 = V_0 \oplus W_0\

    V_2 = V_1 \oplus W_1\

    V_2 = V_0 \oplus W_0 \oplus W_1\ .

    We are using the L2 inner product, \int f(t) g(t) \, dt\ . The functions f and g are orthogonal if \int f(t) g(t) \, dt = 0\ .

    The first decomposition tells us that all the functions in V_0\ are orthogonal to all the functions in W_0\ . The second decomposition tells us the same thing about V_1\ and W_1\ : all the functions in V_1\ are orthogonal to all the functions in W_1\ .

    The third decomposition tells us more: all the functions in V_0\ are orthogonal to all the functions in W_1\ ; and all the functions in W_0\ are orthogonal to all the functions in W_1\ .

    In particular, the scaling function \varphi and its integer translates which defined V_0\ are orthogonal to all the wavelets in W_0\ and all the wavelets in W_1\ .

    In other words, for the scaling function \varphi we should have

    \int \varphi(t)\ \phi(t) \, dt\ = 0\ .

    \int \varphi(t-k)\ \phi(t) \, dt\ = 0\ .

    where \phi\ is any function (any wavelet) in any space W_i\ . (Yes, in particular this is true if \phi is the mother wavelet \psi\ ; we must hesitate to talk about the arbitrary translates or the integer translates of \psi\ until we know that they are in W_0\ .) Edit 28 Jun: I believe that once I know that \psi(t)\ is a wavelet (i.e. an element of W_0\ ), then so are all of its integer translates. We can stop hesitating.

    Similarly, we would have, for example,

    \int \varphi(2t)\ \phi(t) \, dt\ = 0\ .

    where \phi\ is now restricted to any function (i.e. any wavelet) in W_i\ except W_0\ . Why? \varphi(2t) \in V_1\ , which is orthogonal to W_1 \text{ or higher}\ but not necessarily to W_0\ .

    Let me point out that one of the challenges here is that we don’t know much about the spaces W_0\ ; in particular, we don’t know yet if \psi(t) \in W_0\ implies that \psi(t-1) \in W_0\ .

    The second property is that we may have that the scaling function and its integer translates are orthogonal:

    \int \varphi(t)\ \varphi(t-k) \, dt\ = P\ \delta(k) \ .

    where \delta(k) \ is the Kronecker delta, \delta(0) = 1\ and \delta(k) = 0\ for all k ≠ 0.

    There I do not need the distribution, the “Dirac delta function”, whose defining property is the inner product \int \delta(t)\ f(t) \, dt\  = f(0)\ . I just want to write one equation instead of these two:

    \int \varphi(t)\ \varphi(t) \, dt\ = P\ .

    \int \varphi(t)\ \varphi(t-k) \, dt\ = 0 \ , for k ≠ 0.

    However we write it, this is a property of major importance. We can deduce a whole lot more about wavelets in general, about a suitable mother wavelet, and about integer translations of wavelets. In particular, if the scaling function is orthogonal to its integer translates, then integer translates of the mother wavelet are also orthogonal to each other, and we speak of orthogonal wavelets.

    This property does hold, of course, for the Haar system and for all the Daubechies scaling functions. (Technically, that’s redundant: we will eventually show that the Haar system is in fact D2, the Daubechies wavelets defined by 2 nonzero h’s.)
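To see the second property hold in one case and fail in another, here is a minimal Python sketch comparing the Haar scaling function with the linear spline. Both closed forms are my assumptions (the box on [0, 1) and the hat on [0, 2]), and the integrals are approximated by midpoint sums.

```python
import numpy as np

dx = 1 / 1024
x = (np.arange(-4 * 1024, 4 * 1024) + 0.5) * dx     # cell midpoints on [-4, 4]

def haar_phi(t):
    """Haar scaling function: the box on [0, 1)."""
    return np.where((0 <= t) & (t < 1), 1.0, 0.0)

def hat(t):
    """Linear spline scaling function: the hat on [0, 2]."""
    return np.maximum(0.0, 1.0 - np.abs(t - 1.0))

def translate_products(phi, shifts=(0, 1, 2)):
    """Approximate the integrals of phi(t) phi(t - k) dt for each shift k."""
    return [float(np.sum(phi(x) * phi(x - k)) * dx) for k in shifts]

haar_vals = translate_products(haar_phi)
hat_vals = translate_products(hat)
print(haar_vals)    # of the form P delta(k), with P = 1
print(hat_vals)     # the k = 1 entry does not vanish
```

For Haar the values should come out as P \delta(k) with P = 1. For the hat, the exact integrals are 2/3 for k = 0 and 1/6 for k = 1, so the integer translates are not orthogonal.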

    The reason I emphasize that there are two ideas of orthogonality here (orthogonal direct sums, and orthogonality of the integer translates of the scaling function) is that I sometimes get the impression, and it may be a mistaken impression, that people are using the latter to prove consequences of the former. But the orthogonal direct sums are more general.

    Recall that we can — and have — used non-orthonormal (no, I didn’t say non-orthogonal) bases in finite dimensional vector spaces. These lead us quickly to the definition of a reciprocal basis: we take dot products of a vector with the reciprocal basis to compute the basis-coefficients of that vector. (As I post this, at least, “reciprocal basis” is in the tag cloud, so you can look there.)

    Well, we can do the same thing with wavelets. If we have a basis of wavelets but it is not orthonormal, then we need to construct the reciprocal basis. The pair, basis of wavelets and reciprocal basis, is called a bi-orthogonal system. Just a different name for something we’ve seen before.

    (If we have an orthogonal basis which is not orthonormal, the reciprocal basis differs from the original only by scale factors on each vector but the directions are the same; this means we just end up with scale factors associated with the dot-products. I will show you this, below. But what it means is that, in practice, a merely orthogonal basis is almost as good as an orthonormal one.)
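For a finite-dimensional reminder of how the reciprocal basis works, here is a minimal NumPy sketch. The two bases are my own illustrative choices: a non-orthogonal one in R^2, and an orthogonal but non-orthonormal one. The reciprocal vectors are the rows of the inverse of the matrix whose columns are the basis vectors.

```python
import numpy as np

# Non-orthogonal basis: columns b1 = (1, 0), b2 = (1, 1).
B = np.array([[1.0, 1.0],
              [0.0, 1.0]])
R = np.linalg.inv(B).T            # columns r1, r2 satisfy r_i . b_j = delta_ij

v = np.array([3.0, 2.0])
coeffs = R.T @ v                  # dot products of v with r1 and r2
reconstructed = B @ coeffs        # coeffs[0] * b1 + coeffs[1] * b2

# Orthogonal but not orthonormal basis: columns (2, 0) and (0, 3).
D = np.array([[2.0, 0.0],
              [0.0, 3.0]])
R_D = np.linalg.inv(D).T
norms_squared = np.sum(D ** 2, axis=0)
scaled = D / norms_squared        # each basis vector divided by its squared norm

print(coeffs, reconstructed)
print(np.allclose(R_D, scaled))
```

In the orthogonal case the reciprocal vectors are the original vectors divided by the squares of their norms, so the directions are unchanged and only scale factors appear.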

    I am looking forward to playing with a bi-orthogonal system, so that I can look for the properties which depend on the orthogonal direct sum decomposition rather than on the orthogonality of integer translates of the scaling function.

    If the integer translates are orthogonal but not orthonormal, we can fix that just by dividing \varphi\ by its norm \sqrt{P}\ , where P is defined by

    \int \varphi(t)\ \varphi(t) \, dt\ = P\ .

    digression on Fourier

    Suppose we have a function which is precisely a finite Fourier series, say,

    f(x) = 3 \cos (2 x)+2 \sin (x)\

    That seems unambiguous and straightforward: our f(x) can be written exactly using sin(x) and cos(x). But, with the L2 inner product on {[0,\ 2\pi]}\

    \int_0^{2 \pi } f(t)\ g(t) \, dt\

    and the associated norm

    ||g|| = \sqrt{\int_0^{2 \pi } g(t)\ g(t) \, dt}\

    we have that the set {1, cos[nx], sin[nx]} is orthogonal but not orthonormal. Each of sin(nx) and cos(nx) has norm \sqrt{\pi}\ (because the integral of the square is \pi\ )…

    trig norms

    and the constant function 1 (= cos (0x), if you will) has norm \sqrt{2}\ \sqrt{\pi}\ .

    const norm

    The orthonormal basis is constructed by dividing each of these functions by its norm; the reciprocal basis is constructed by dividing each of these functions by the square of its norm; equivalently, the reciprocal basis is constructed by dividing each of the orthonormal functions by the norm of the original function.

    However we carry it out, the end result is that the set

    \{\frac{1}{\sqrt{2\ \pi}}, \frac{cos[nx]}{\sqrt{\pi}}, \frac{sin[nx]}{\sqrt{\pi}}\}\

    is an orthonormal basis; and the set

    \{\frac{1}{2\ \pi}, \frac{cos[nx]}{\pi}, \frac{sin[nx]}{\pi}\}\

    is the basis reciprocal to \{1, cos[nx], sin[nx]\}\ .

    In this form…

    f(x) = 3 \cos (2 x)+2 \sin (x)\

    f(x) is given in terms of the orthogonal but non-orthonormal basis. We can confirm that its components {3, 2} wrt that basis can be computed as the dot products of f with the reciprocal basis. Only two terms are nonzero:

    get coeffs

    We are more likely to have simply been told that the coefficients are to be found using

    \frac{1}{\pi}\ \int_0^{2 \pi } sin(x)\ f(x) \, dx\

    and

    \frac{1}{\pi}\ \int_0^{2 \pi } cos(2x)\ f(x) \, dx\ .

    Nothing about a reciprocal basis, just a recipe with a god-given factor of \frac{1}{\pi}\ in front of the integrals. Conceptually, those factors belong inside the integrals, under the trig functions. But it’s the same calculation.

    If, on the other hand, we had constructed the orthonormal basis \{\frac{1}{\sqrt{2\ \pi}}, \frac{cos[nx]}{\sqrt{\pi}}, \frac{sin[nx]}{\sqrt{\pi}}\}\ , then the coefficients wrt the orthonormal basis are

    get 2nd coeffs

    which may look weird until I remind you that we are implicitly writing

    f(x) = (3\ \sqrt{\pi})\  \frac{\cos (2 x)}{\sqrt{\pi}}+(2\ \sqrt{\pi})\  \frac{\sin (x)}{\sqrt{\pi}}\

    which is exactly our function f.
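
    Both sets of coefficients can be reproduced numerically. What follows is a minimal sketch in Python, not the Mathematica behind the original screenshots; the helper names `inner` and `cos2`, and the choice of a midpoint rule with 2000 panels, are assumptions made purely for illustration.

```python
import math

def f(x):
    # The example function from the text.
    return 3 * math.cos(2 * x) + 2 * math.sin(x)

def cos2(x):
    return math.cos(2 * x)

def inner(g, h, n=2000):
    # L2 inner product on [0, 2*pi], approximated by the midpoint rule.
    dx = 2 * math.pi / n
    return sum(g((i + 0.5) * dx) * h((i + 0.5) * dx) for i in range(n)) * dx

# Norms of the orthogonal (but not orthonormal) basis functions.
norm_cos2 = math.sqrt(inner(cos2, cos2))
norm_sin1 = math.sqrt(inner(math.sin, math.sin))

# Components wrt the orthogonal basis: dot f with the reciprocal basis,
# i.e. each basis function divided by the square of its norm.
coef_cos2 = inner(f, lambda x: cos2(x) / norm_cos2**2)
coef_sin1 = inner(f, lambda x: math.sin(x) / norm_sin1**2)

# Components wrt the orthonormal basis: divide by the norm instead.
on_cos2 = inner(f, lambda x: cos2(x) / norm_cos2)
on_sin1 = inner(f, lambda x: math.sin(x) / norm_sin1)

print(coef_cos2, coef_sin1)
print(on_cos2, on_sin1)
```

    The first pair should come out close to 3 and 2, and the second pair close to 3\sqrt{\pi} and 2\sqrt{\pi}; the only difference between the two calculations is whether we divide by the norm or by its square.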

    The point I am belaboring – perhaps excessively – is that most of us have been coping with an orthogonal but non-orthonormal basis most of our lives, and constructing the reciprocal basis in such a case amounts to just sticking a scaling factor in front of the dot product calculation for the components. We get the components without explicitly describing the reciprocal basis.

    We do much the same for wavelets and the scaling functions: so long as they are orthogonal to their integer translates, we cope with their norms not being 1.

    a counterexample

    Since I haven’t yet itemized the consequences of orthogonality under integer translation, I won’t show you that the following scaling function violates them. But I want to put this example out here, because it does satisfy all 6 consequences of the dilation equation; but it is not orthogonal under integer translation, and it will not satisfy the consequences of orthogonal integer translates.

    And yet, because we have clearly defined spaces V_0\ , V_1\ , etc., we should have orthogonal direct sum decompositions.

    Here it is. Just a triangle. You can also call it a linear spline. (And higher-order splines are also counter-examples.)

    triangle

    So, we can define the space V_0\ as the span of \varphi(t)\ and its integer translates \varphi(t-k)\ .

    NOTE that because the triangle \varphi(t)\ is positive on (0,2), it cannot be orthogonal to two of its integer translates, namely \varphi(t-1)\ and \varphi(t+1)\ . They overlap and there are no negative values to cancel things out.

    That is, it is false that \varphi(t)\ is orthogonal to all of its integer translates. All but two is two too few.

    For example, we can look at \varphi(t)\ (black) and \varphi(t-1)\ (red) and their intersection (yellow)…

    intersection

    That the area under the intersection is positive tells us that these functions are not orthogonal. The inner product, however, is not the area. We can estimate the inner product by a sum…

    approx ip

    I should draw the product \varphi(t)\ \varphi(t-1)\

    exact ip pic

    but of course we can compute the inner product exactly, too. Since the functions are linear, their product is quadratic (where it’s not zero) and the integral is cubic. In fact, it’s

    exact ip calc
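
    The exact value can be recomputed without the picture. This is a small sketch in Python using exact rational arithmetic; the names `phi`, `simpson` and `inner` are illustrative, and Simpson's rule is used only because it happens to be exact for a piecewise-quadratic product when applied one unit interval at a time.

```python
from fractions import Fraction

def phi(t):
    # Triangle (linear spline) scaling function: support [0, 2], peak 1 at t = 1.
    if 0 <= t <= 1:
        return t
    if 1 < t <= 2:
        return 2 - t
    return Fraction(0)

def simpson(g, a, b):
    # Simpson's rule on a single interval; exact for polynomials up to degree 3.
    m = (a + b) / 2
    return (b - a) / 6 * (g(a) + 4 * g(m) + g(b))

def inner(shift):
    # Inner product of phi(t) and phi(t - shift) for an integer shift.
    # On each unit interval both factors are single linear pieces,
    # so their product is one quadratic and Simpson's rule is exact.
    def product(t):
        return phi(t) * phi(t - shift)
    return sum(simpson(product, Fraction(k), Fraction(k + 1)) for k in range(-3, 5))

print(inner(0), inner(1), inner(2))
```

    This gives 1/6 for the inner product of \varphi(t)\ with \varphi(t-1)\ , against 2/3 for \varphi(t)\ with itself and 0 for a shift of 2. So the nearest neighbours really do fail to be orthogonal, and only the nearest neighbours.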

    Anyway, we take that triangular scaling function and all its integer translates to be V_0\ .

    We can define V_1\ as the span of the set \{\varphi(2t-k)\}\ . Those are just narrower triangles of height 1, e.g. \{\varphi(2t)\}\ looks like….

    v1 triangle

    The really big question is: is V_0\ a subspace of V_1\ , V_0 \subset V_1\ ?

    The answer is yes. (Otherwise this wouldn’t be a counterexample!) We have that

    \varphi(t) = \frac{1}{2}\ \varphi(2t) +  \varphi(2t-1) +  \frac{1}{2}\ \varphi(2t-2)\ .

    I can show it to you (the command shows that I am adding three functions in V_1 to get my scaling function in V_0\ ):

    sum of triangles
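
    The same identity can be checked point by point. Here is a minimal sketch in Python with exact fractions; the function names and the sample grid (multiples of 1/16 from -1/2 to 5/2) are assumptions for illustration only.

```python
from fractions import Fraction

def phi(t):
    # Triangle scaling function: support [0, 2], peak 1 at t = 1.
    if 0 <= t <= 1:
        return t
    if 1 < t <= 2:
        return 2 - t
    return Fraction(0)

def rhs(t):
    # Right-hand side of the dilation equation with coefficients {1/2, 1, 1/2}.
    return phi(2 * t) / 2 + phi(2 * t - 1) + phi(2 * t - 2) / 2

# Sample points t = -1/2, -7/16, ..., 5/2, covering the support and a margin.
samples = [Fraction(k, 16) for k in range(-8, 41)]
mismatches = [t for t in samples if phi(t) != rhs(t)]
print(mismatches)
```

    An empty list means the two sides agree at every sample. Since both sides are piecewise linear with corners only at multiples of 1/2, agreement on this finer grid is in fact enough to settle the matter everywhere.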

    Now, that’s our dilation equation, except that we don’t have the factor of \sqrt{2}\ . To put that another way, if the h’s are

    \{1/2,1,1/2\}\

    then their sum is 2.

    To put that a third way, right now we have A = 1.

    No big deal; to change the sum we multiply and divide by \sqrt{2}\ . This gives us our usual dilation equation

    \varphi(t) = \sum_{n} h(n)\ \sqrt{2}\ \varphi(2t-n)\ ,

    with new h’s

    \{\frac{1}{2 \sqrt{2}},\frac{1}{\sqrt{2}},\frac{1}{2 \sqrt{2}}\}\ .

    Now, we have 6 properties to check…

    • The sum of the h’s = 2/A.
    • \varphi(t) = \sum_n {h(n)\ A\ \ \varphi(2\ t - n)}\ .
    • E = \int\ \varphi(t)\ dt \ .
    • The sum of the even h’s = the sum of the odd h’s.
    • \sum_k { \varphi(\frac{k}{2^j})} = 2^j \text{(for E = 1)}
    • \sum_k { \varphi(t+k)} = 1 \text{(for E = 1)}\

    And I have been consistently referring to even h’s, for example, when I really mean h’s with even index, h(2n).

    We have (one) set A = \sqrt{2}\ and (two) the sum of the h’s is \sqrt{2}\ .

    There is only one odd h, h[1], and it’s \frac{1}{\sqrt{2}}\ , and each of the two even h’s is half that value, so we do have (three) the sum of the even h’s equals the sum of the odd h’s. Now is a good time to remark that the number of h’s is not even.

    What about E = \int\ \varphi(t)\ dt \ ?

    Our scaling function is a triangle of base 2 and height 1, so (four) its area is 1, so E = 1.

    That triangle is continuous, so we should have (five) the generalized partition of unity

    \sum_k { \varphi(t+k)} = 1 \text{(for E = 1)}\

    Let’s try a few. Here are t = 1/2, 1/4, -3/8, 31/16. This sample is not a proof, but it’s all I personally need for now.

    generalized partition
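
    For anyone who wants to repeat that check, here is a minimal sketch in Python with exact fractions. The name `partition_sum` and the range of k (from -5 to 5) are illustrative assumptions; the range only needs to be wide enough to catch every nonzero term for the chosen t.

```python
from fractions import Fraction

def phi(t):
    # Triangle scaling function: support [0, 2], peak 1 at t = 1.
    if 0 <= t <= 1:
        return t
    if 1 < t <= 2:
        return 2 - t
    return Fraction(0)

def partition_sum(t, kmin=-5, kmax=5):
    # Finite stand-in for the doubly-infinite sum over k of phi(t + k).
    return sum(phi(t + k) for k in range(kmin, kmax + 1))

points = [Fraction(1, 2), Fraction(1, 4), Fraction(-3, 8), Fraction(31, 16)]
for t in points:
    print(t, partition_sum(t))
```

    Each of the four sums comes out to exactly 1.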

    And since E = 1 we should have (six) the dyadic sum

    \sum_k { \varphi(\frac{k}{2^j})} = 2^j\ .

    Let’s try the first few.

    partition
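
    The dyadic sums are just as easy to reproduce. This is a minimal sketch in Python with exact fractions; the name `dyadic_sum` is illustrative, and the range of k is chosen to cover the support [0, 2] of the triangle.

```python
from fractions import Fraction

def phi(t):
    # Triangle scaling function: support [0, 2], peak 1 at t = 1.
    if 0 <= t <= 1:
        return t
    if 1 < t <= 2:
        return 2 - t
    return Fraction(0)

def dyadic_sum(j):
    # Sum of phi(k / 2**j) over every k for which k / 2**j lies in [0, 2].
    return sum(phi(Fraction(k, 2**j)) for k in range(2 * 2**j + 1))

for j in range(6):
    print(j, dyadic_sum(j), 2**j)
```

    The two columns agree for j = 0 through 5.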

    Remarks

    Those linear splines and their translates do provide us with spaces V_0\ , V_1\ , and so on. They satisfy the 6 properties I expect them to.

    But I have no idea, so far, how to find a function in W_0\ . There are two obvious candidates — orthogonal to the scaling function — but they fail to be orthogonal to all of its integer translates.

    This is not a big deal. We’re going to warp those splines to hell and back, I think. Or maybe it will be the scaling function. I’m not sure yet, but something will be transmogrified.

    There is a way to get an orthonormal basis from the set of splines: the result is called Battle-Lemarie wavelets. In contrast to everything we’ve seen, however, they do not have finite support (equivalently, they do not have a finite number of h’s) — but they are useful nevertheless.

    So, we will see splines again.

    Wavelet Properties: one more from the dilation equation

    Introduction

    I want to add one more property to the previous post. This is long enough that I do not want to insert it as an edit. It’s amazing that I ever considered even for a moment to insert it as an edit!

    Just as we deduce what is called a partition of unity \sum_k { \varphi(k)} = 1 from the dyadic sum

    \sum_k { \varphi(\frac{k}{2^j})} = 2^j

    (by setting j = 0)

    we can get a more general equation, if \varphi is continuous. We can show that

    \sum_k { \varphi(t-k)} = 1\ ,

    where the sum, as before, is over all integer k. (The proof uses the previous dyadic sum and continuity to say something about non-integer values.) This may be called a generalized partition of unity.

    That’s how Burrus et al. write it, but I find it useful to change the sign of k, and write

    \sum_k { \varphi(t+k)} = 1\ ,

    I may sometimes need negative values of k, but I like having t and k going in the same direction: increasing either of them is moving to the right on the number line.

    For the Haar system, since the support is [0,1), the sum always reduces to one nonzero term, k = 0. For k ≠ 0, the value t+k is always outside the support interval. And since \varphi(t) = 1 for t \in [0,1)\ , we understand that the equation is true.

    Just in case I’ve never said it, or you don’t recognize it, the support of a function f is the part of its domain, the set of x, for which f(x) is nonzero. In particular, to say that a function defined on the real numbers has compact support means that it is zero outside of a compact (i.e., closed and bounded) interval. To say that the support of a function is the interval [0,3) means that the function is zero everywhere outside that interval.

    For the Daubechies D4, with support [0,3), the sum always reduces to at most three nonzero terms: the term for k = 0, which we always have, plus two others, but not necessarily the same two. For t \in [0,1)\ the others are k = 1,2. For t \in [1,2)\ , they are k = \pm 1\ . For t \in [2,3)\ , they are k = -2,-1.

    Computationally speaking, I might as well just write the sum for slightly more k’s than I need, because the computer will take care of the zero terms. But I better not omit k’s for which \varphi(t+k) \ne 0\ .

    Let me be explicit about that. If the sum fails to be 1, it will be because we used too few values of k; we omitted a nonzero term. I will illustrate these calculations for the Daubechies D4.

    I do not know if this condition is more useful directly or in reverse. That is, in reverse, if we find a scaling function for which the sum really isn’t 1, \sum_k { \varphi(t+k)} \ne 1\ , then the scaling function \varphi should not be continuous. Unless there are other hypotheses involved — and, I have to remind you, at my present level of understanding, there could be!

    You might want to remember that, in general, \varphi is a hypothetical solution to the dilation equation with some h’s. In particular, \varphi may not even exist. And if it does exist, it may not be integrable. Hell, it may not even be a function, as we usually think of them, but a distribution (like the Dirac delta “function”).

    That said, we are in the position of having entire families of scaling functions available to us. For them, we don’t need to worry about conditions which guarantee existence or integrability or continuity: we have the silly things right in our hands and can play with them, to verify whether properties hold or not.

    To return to the D4 example, it’s easy enough to pick a few values and confirm that the sum is 1. That’s not a proof that the sum is always 1, but it was enough to send me to Daubechies herself (Daubechies, Ingrid; Ten Lectures on Wavelets. Society for Industrial & Applied Mathematics, 1992; ISBN 0 89871 274 2) to confirm there that D4 is continuous. (All the Dn are continuous.)

    Having added this property, my checklist of 5 items grows to 6 items:

    • The sum of the h’s = 2/A.
    • \varphi(t) = \sum_n {h(n)\ A\ \ \varphi(2\ t - n)}\ .
    • E = \int\ \varphi(t)\ dt \ .
    • The sum of the even h’s = the sum of the odd h’s.
    • \sum_k { \varphi(\frac{k}{2^j})} = 2^j \text{(for E = 1)}
    • \sum_k { \varphi(t+k)} = 1 \text{(for E = 1)}\

    Example: Daubechies D4

    Let us look at the generalized partition of unity,

    \sum_k { \varphi(t+k)} = 1\ ,

    specifically for the Daubechies D4. The previous post verified the original five properties; we need to do this one.

    First, let’s look at t = 1/2.

    Recall the D4 scaling function. You might also recall that we compute it at rational points whose denominator is a power of 2.

    D4 may 9 4

    Now I tell Mathematica® to list the values of \{ \varphi(t+k)\} with t = 1/2, and k an integer from -3 to 3. (That’s more than enough values of k.)

    Here’s the list, and the sum of the entries:

    gen part 1

    (Yes, \varphi(3/2) = 0\ .)

    Let’s think about that. We’re really trying to evaluate \varphi at all the half-integer points on the real line. That infinite sum is really one representative of an equivalence class: we have, for example,

    \sum_k { \varphi(1/2+k)} = \sum_k { \varphi(-1/2+k)} = \sum_k { \varphi(3/2+k)}\

    etc. If I decide, for example, to check the sum for t = 13/2, but I still use the same limits for k…

    gen part 2

    the problem is not that the infinite series fails to be 1, but that I have chosen the wrong limits for k. The generalized partition of unity is a doubly-infinite series, in principle, but reduces to a finite sum if the scaling function has compact support (i.e. is zero outside a finite interval). I just didn’t get all the nonzero terms. My bad. Scaling function good.

    Another way to say all that is, without loss of generality we can choose t in the interval [0,1) for checking a selection of values. That infinite series will include values in [1,2) and [2,3) (i.e. the other intervals where D4 is nonzero).

    So let’s look at a few. I have chosen to let k range over -3 to 5 simply because it provides a reassuring number of zeroes on either end of the output list.

    The following picture shows the results for t = 1/4, 3/8, 1/128, and 100/128.

    gen part 3
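
    The same check can be sketched in Python instead of Mathematica. The assumptions here: the D4 coefficients are the ones quoted in these posts (multiplied by \sqrt{2}), the values at the integers are (1 \pm \sqrt{3})/2, and the function is evaluated only at dyadic rationals by recursing on the dilation equation. The names `C`, `AT_INTEGERS`, `phi` and `partition_sum` are illustrative.

```python
import math
from fractions import Fraction
from functools import lru_cache

SQRT3 = math.sqrt(3)
# c(n) = sqrt(2) * h(n) for the D4 coefficients.
C = [(1 + SQRT3) / 4, (3 + SQRT3) / 4, (3 - SQRT3) / 4, (1 - SQRT3) / 4]
# Values at the integers inside the support; they sum to 1.
AT_INTEGERS = {1: (1 + SQRT3) / 2, 2: (1 - SQRT3) / 2}

@lru_cache(maxsize=None)
def phi(t):
    # D4 scaling function at a dyadic rational t, which must be a Fraction
    # whose denominator is a power of 2 (otherwise the recursion never ends).
    if t <= 0 or t >= 3:
        return 0.0
    if t.denominator == 1:
        return AT_INTEGERS[t.numerator]
    # Dilation equation: phi(t) = sum over n of c(n) * phi(2t - n).
    return sum(c * phi(2 * t - n) for n, c in enumerate(C))

def partition_sum(t, kmin=-3, kmax=5):
    # Finite stand-in for the doubly-infinite sum over k of phi(t + k).
    return sum(phi(t + k) for k in range(kmin, kmax + 1))

points = [Fraction(1, 2), Fraction(1, 4), Fraction(3, 8),
          Fraction(1, 128), Fraction(100, 128)]
for t in points:
    print(t, partition_sum(t))
```

    Every sum should come out equal to 1 up to floating-point rounding. With k running from -3 to 5, this is only trustworthy for t between 0 and 1 or so; for something like t = 13/2 the limits would have to move, exactly as described above.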

    summary

    The following six properties are consequences of the dilation equation, although we are used to having A = \sqrt{2}\ and two require E = 1. That means that some of them can be used to quickly determine that A ≠ \sqrt{2} or that E ≠ 1. Alternatively, we could work out the generalized forms for E ≠ 1.

    In particular, these properties do not require that integer translates be orthogonal. But we have yet to see an example of a scaling function that is not orthogonal to its integer translates. (Yes, of course, I have one just waiting for us.)

    • The sum of the h’s = 2/A.
    • \varphi(t) = \sum_n {h(n)\ A\ \ \varphi(2\ t - n)}\ .
    • E = \int\ \varphi(t)\ dt \ .
    • The sum of the even h’s = the sum of the odd h’s.
    • \sum_k { \varphi(\frac{k}{2^j})} = 2^j \text{(for E = 1)}
    • \sum_k { \varphi(t+k)} = 1 \text{(for E = 1)}\

    Wavelet properties: consequences of the dilation equation

    discussion
    edited 8 Jun to cross out the last line and correct it. see edit.
    edited 13 Jun to correct the last line again. see edit. Sorry about that, but I shot from the hip and hit myself in the foot.

    We have seen in the previous post that the idea of a set of scaling functions \{\varphi(t-k)\} spanning a space V_0\ , and such that the set \{\varphi(2t-k)\} span a space V_1\ , gives rise to a dilation equation, which we have been writing as

    \varphi(t) = \sum_n {h(n)\ \sqrt{2}\ \ \varphi(2\ t - n)}.

    (We also get more spaces V_j\ for both positive and negative integers j.)

    We have also seen that if we impose an inner product, then wavelets live in the orthogonal complements W_j\ :

    V_{j+1} =: V_j \oplus W_j\ .

    I should probably remark, first, that a solution \varphi(t)\ of the dilation equation is determined only to within a constant factor. We describe that, however, by describing its integral (effectively, its average) rather than, for example, its maximum value. We write

    \int\ \varphi(t)\ dt = E\ ,

    where E is a nonzero constant which we often choose to be 1: E = 1.

    To put that another way: if there is a solution, it is not unique.

    Two examples we’ve seen repeatedly are Haar and Daubechies. The Haar scaling function is a unit step function on the unit interval, so its integral is 1. Similarly, the integrals of the Daubechies scaling functions are 1, although I have not shown that.

    There are 3 consequences of the dilation equation (mostly with other conditions added).

    The first consequence of the dilation equation is that it dictates the sum of the h’s. For the form we have usually been using, namely

    \varphi(t) = \sum_n {h(n)\ \sqrt{2}\ \ \varphi(2\ t - n)}

    we must have

    \sum_n\ h(n) = \sqrt{2}\

    More generally, for an arbitrary normalizing factor A in the dilation equation

    \varphi(t) = \sum_n {h(n)\ A\ \varphi(2\ t - n)}

    we get

    \sum_n\ h(n) = \frac{2}{A}\

    Let me remark that the sum of the h’s is independent of the integral of \varphi(t)\ , i.e. independent of E.

    I should also mention that I had originally written that it was independent of the “normalization,” but I want to reserve that term for the integral of the square of \varphi(t)\ . I have found it all too easy to get confused between

    \int\ \varphi(t)\ dt\

    and

    \int\ \varphi(t)\ \varphi(t)\ dt\ .

    Further discussion and the derivation of this is at the end of the post. That will make it clear that the sum of the h’s does not depend on the value of the integral (but it does require that the integral exist and be nonzero).

    The second consequence of the dilation equation, with the additional requirement that the integral E = 1,

    E = \int\ \varphi(t)\ dt = 1\

    is that

    \sum_k { \varphi(\frac{k}{2^j})} = 2^j

    for integers k and j.

    Note especially the special case for j = 0,

    \sum_k { \varphi(k)} = 1

    That is, that the sum of the values of \varphi(t)\ at the integers is 1.

    That is why Burrus et al. normalized the eigenvectors to a sum of 1, those eigenvectors that we used for the values of \varphi(t)\ at the integers, in order to initialize our recursions for computing \varphi(t)\ .

    The third consequence of the dilation equation, under at least two different sets of additional assumptions which I’m skipping right over – but not an assumption that \varphi(t) is orthogonal to its integer translates – is that

    the sum of the even-numbered h’s is equal to the sum of the odd-numbered h’s:

    \sum_n\ h(2n) = \sum_n\ h(2n+1)\ .

    This result is also implied by an orthogonality condition, but it does not require it. And this is why I state it although I’m more than a little vague – I’m not sure I could be any more vague if I tried – on the conditions it assumes: this result does not require orthogonality.

    My reading suggests that this result will make a lot more sense in either the frequency domain, or from the filter-bank point of view. I suppose I should have said that what I’m doing is the “multi-resolution analysis approach”, by focusing on the V and W spaces.

    I had originally intended to list the assumptions required, until I saw a simpler but alternative set of assumptions. To heck with them all, at this stage of my learning.

    At this point, I view this consequence as an interesting result to be looked for, and if it ever fails to hold, then I will investigate and see why it failed.

    Anyway, there are 5 things to note or to check.

    • The sum of the h’s = 2/A.
    • The scaling factor A in the dilation equation.
    • The integral E of the scaling function (is E=1?).
    • The sum of the even h’s = the sum of the odd h’s.
    • \sum_k { \varphi(\frac{k}{2^j})} = 2^j

    I did not number those because I will check them in whatever order is convenient, but I will count them as I check them.

    Yes, I spoke of three consequences and then I listed five items. The extra two are the values of A and E, parameters rather than consequences.

    By the way, the sum of the h’s might be a good first thing to compute, since it gives us the value of A in the dilation equation. But sometimes the dilation equation comes first; and sometimes it’s just not clear which (the h’s or the equation) comes first.

    Let’s look at these.

    example: Haar

    We take the scaling function \varphi(t)\ to be the unit-height step function nonzero on the half-open interval [0,1).

    One result, its integral is 1.

    Result two, if we write the dilation equation as we have usually done it…

    \varphi(t) = \sum_n {h(n)\ \sqrt{2}\ \ \varphi(2\ t - n)}

    then h(0) = h(1) = \frac{1}{\sqrt{2}}\ ; i.e. (result three) the sum of even h’s (h(0)) is, trivially, equal to the sum of odd h’s (h(1)), and (result four) the sum of the h’s is \frac{2}{\sqrt{2}} = \sqrt{2}\ .

    Five, because the integral is 1, we expect that

    \sum_k { \varphi(\frac{k}{2^j})} = 2^j

    and that is true, too.

    Let me be explicit about result five. For j = 0, we have \varphi(0) = 1 = 2^0\ .

    For j = 1, we have

    \varphi(0) + \varphi(1/2) = 1 + 1 = 2 = 2^1\ .

    For j = 2, we have

    \varphi(0) + \varphi(1/4)+ \varphi(1/2)+ \varphi(3/4) = 1 + 1 + 1 + 1= 4 = 2^2\ .

    I hope the pattern is clear: we end up adding more 1’s, in just the right number to give us another power of 2.

    example: Daubechies D4

    One result, the dilation equation is still

    \varphi(t) = \sum_n {h(n)\ \sqrt{2}\ \ \varphi(2\ t - n)}

    (This is, after all, how we computed the D4 scaling function.)

    The h’s are (remember that the first one is h(0), an even coefficient):

    \left\{\frac{1+\sqrt{3}}{4   \sqrt{2}},\frac{3+\sqrt{3}}{4   \sqrt{2}},\frac{3-\sqrt{3}}{4   \sqrt{2}},\frac{1-\sqrt{3}}{4 \sqrt{2}}\right\}\

    Result two, the sum of the h’s is

    \frac{1-\sqrt{3}}{4 \sqrt{2}}+\frac{3-\sqrt{3}}{4   \sqrt{2}}+\frac{1+\sqrt{3}}{4   \sqrt{2}}+\frac{3+\sqrt{3}}{4 \sqrt{2}}\

    which does, indeed, simplify to \sqrt{2}\ .

    Three, the sum of the even coefficients is (sorry, the order changed)…

    \frac{3-\sqrt{3}}{4 \sqrt{2}}+\frac{1+\sqrt{3}}{4   \sqrt{2}}\

    which simplifies to \frac{1}{\sqrt{2}}\ .

    Since that’s half the total (\sqrt{2}\ ), we can conclude that the sum of the odd coefficients is also \frac{1}{\sqrt{2}}\ . Or we can just compute the sum of the odd coefficients…

    \frac{1-\sqrt{3}}{4 \sqrt{2}}+\frac{3+\sqrt{3}}{4   \sqrt{2}}\

    and it is, as it must be, \frac{1}{\sqrt{2}}\ .
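
    These three sums are quick to confirm in floating point. A minimal sketch in Python; the variable names are illustrative, and the coefficients are the four D4 values listed above.

```python
import math

s2, s3 = math.sqrt(2), math.sqrt(3)
# D4 coefficients h(0), h(1), h(2), h(3).
h = [(1 + s3) / (4 * s2), (3 + s3) / (4 * s2),
     (3 - s3) / (4 * s2), (1 - s3) / (4 * s2)]

total = sum(h)       # expected to match sqrt(2)
even = sum(h[0::2])  # h(0) + h(2)
odd = sum(h[1::2])   # h(1) + h(3)

print(total, s2)
print(even, odd, 1 / s2)
```

    Up to rounding, the total matches \sqrt{2}\ and the even and odd sums each match \frac{1}{\sqrt{2}}\ .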

    Now, I can’t actually compute the integral of the D4 scaling function. I can estimate it as closely as I like, but all I have is its values at as many points as I like. But that is all I need! I can, in other words, compute things that look like finite Riemann sums — the areas of thin rectangles — but I can’t actually compute the integral except as a limit.

    But the equation

    \sum_k { \varphi(\frac{k}{2^j})} = 2^j

    holds, and says (four) that every one of those approximations will have the same value, namely 1. Taking a limit doesn’t get any easier than the limit of a constant.

    I conclude that the integral is 1 (and that’s the fifth result).

    Let me be explicit.

    The sum of the values of the D4 scaling function at the integers was chosen to be 1. Let’s look again at the values on the integers…

    picture-37

    Those 4 values are… \{0.,1.36603,-0.366025,0.\}\

    and their sum is 1, and the width of each interval between them is \Delta\ t = 1\ , so the area of 4 rectangles ( \sum_n {f(t)\ \Delta\ t}\ ) would also be 1.

    At the half integers (and integers)….

    picture-38

    The function values are…

    \begin{array}{c} 0. \\ 0.933013 \\ 1.36603 \\ 0 \\ -0.366025 \\ 0.0669873 \\ 0.\end{array}

    Now the sum is 2, but the width of each rectangle is \Delta\ t = \frac{1}{2}\ , so the total area is, again, 1.

    At all the quarter integers (including halves and integers)…

    picture-39

    The function values are…

    \begin{array}{c} 0. \\ 0.63726 \\ 0.933013 \\ 1.10377 \\ 1.36603 \\ 0.341506 \\ 0 \\ -0.0915064 \\ -0.366025 \\ 0.0212341 \\ 0.0669873 \\ -0.0122595 \\ 0.\end{array}

    The sum is 4, but each rectangle is of width 1/4, so the total area is still 1.

    And so on.

    So the equation

    \sum_k { \varphi(\frac{k}{2^j})} = 2^j

    would seem to give us that the area of every dyadic approximation of the scaling function is equal to 1, and that, in the limit, gives us the integral. The powers of 2 on the RHS exactly offset the diminishing widths of the rectangles in the finite sums used to compute areas of the step functions.
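
    Those Riemann sums can be regenerated from the dilation equation alone. What follows is a minimal sketch in Python under the same assumptions as the D4 recursion used in these posts: values at the integers (1 \pm \sqrt{3})/2, coefficients \sqrt{2}\ h(n), and evaluation at dyadic rationals only. The names `phi` and `dyadic_sum` are illustrative.

```python
import math
from fractions import Fraction
from functools import lru_cache

SQRT3 = math.sqrt(3)
# c(n) = sqrt(2) * h(n) for the D4 coefficients.
C = [(1 + SQRT3) / 4, (3 + SQRT3) / 4, (3 - SQRT3) / 4, (1 - SQRT3) / 4]
AT_INTEGERS = {1: (1 + SQRT3) / 2, 2: (1 - SQRT3) / 2}

@lru_cache(maxsize=None)
def phi(t):
    # D4 scaling function at a dyadic rational t (a Fraction with a
    # power-of-2 denominator), computed by recursing on the dilation equation.
    if t <= 0 or t >= 3:
        return 0.0
    if t.denominator == 1:
        return AT_INTEGERS[t.numerator]
    return sum(c * phi(2 * t - n) for n, c in enumerate(C))

def dyadic_sum(j):
    # Sum of phi(k / 2**j) over the support [0, 3].
    return sum(phi(Fraction(k, 2**j)) for k in range(3 * 2**j + 1))

# Values at the quarter integers 0, 1/4, ..., 3.
print([round(phi(Fraction(k, 4)), 6) for k in range(13)])

for j in range(6):
    total = dyadic_sum(j)
    # Each rectangle has width 1 / 2**j, so the area is total / 2**j.
    print(j, total, total / 2**j)
```

    Up to rounding, the sums are 1, 2, 4, 8, … and the area column stays at 1 for every j.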

    digression on h’s

    Let us return to the Haar system. When I showed you the V and W spaces, I wrote the dilation equation as

    \varphi(t) = \varphi(2t) + \varphi(2t-1).

    This implies that we have two nonzero c’s:

    c(0) = c(1) = 1,

    (and the sum of the coefficients is clearly 2).

    But we are used to writing the dilation equation as

    \varphi(t) = \sum_{n} h(n)\ \sqrt{2}\ \varphi(2t-n)\ ,

    which says

    c(n) = h(n)\ \sqrt{2},

    so that we wrote the Haar system with

    h(0) = h(1) = \frac{1}{\sqrt{2}},

    (and the sum of the coefficients is \sqrt{2}).

    Another form of the equation is common:

    \varphi(t) = \sum_{n} h(n)\ 2\ \varphi(2t-n),

    which says that

    c(n) = 2 h(n)

    so that the Haar system would have

    h(0) = h(1) = \frac{1}{2},

    (and the sum of the coefficients is 1).

    Many books point out the existence of different scalings. Strang & Nguyen (Strang, Gilbert; Nguyen, Truong. Wavelets and Filter Banks. Wellesley-Cambridge Press, 1997 (revised edition). ISBN 0 9614088 7 1), however, are more explicit (p. 23):

    • if you are working primarily with the dilation equation, set the sum of coefficients to 2 “… to preserve area.”
    • if you are working with a single filter, set the sum of coefficients to 1. “That preserves the zero frequency DC term….”
    • if you are working with a filter bank, set the sum of coefficients to \sqrt{2} “… to account for the downsampling step.”

    As I said when we looked at Nievergelt’s example, setting the area to 1 (sum = 2) is different from making an orthonormal basis out of the wavelets (sum = \sqrt{2}\ ).

    This is the key: if someone hands me a set of filter coefficients for a scaling function – I’m going to add them all up, and see whether the sum is 1, \sqrt{2}\ , or 2.

    (Yes, in principle, other normalizations are possible, too; but these are the three I expect to find.)
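
    That habit of adding up the coefficients is easy to turn into a tiny helper. This is a minimal sketch in Python; the function name `classify`, the labels, and the tolerance are assumptions for illustration, and the factor A is recovered from the relation sum = 2/A.

```python
import math

# The three conventions discussed above, keyed by the expected sum.
CONVENTIONS = {
    "dilation equation (sum = 2, A = 1)": 2.0,
    "filter bank (sum = sqrt(2), A = sqrt(2))": math.sqrt(2),
    "single filter (sum = 1, A = 2)": 1.0,
}

def classify(coeffs, tol=1e-9):
    # Add up the coefficients and report which convention they match.
    total = sum(coeffs)
    for name, target in CONVENTIONS.items():
        if abs(total - target) < tol:
            return name
    if total == 0:
        return "sum is zero: no factor A can satisfy sum = 2/A"
    return "other normalization (A = %g)" % (2 / total)

# The Haar coefficients in each of the three forms.
print(classify([1, 1]))
print(classify([1 / math.sqrt(2), 1 / math.sqrt(2)]))
print(classify([0.5, 0.5]))
```

    The three Haar lists land in the three different conventions, in the order in which they appear above.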

    Burrus et al. are using a sum of \sqrt{2}.

    Let’s take a closer look at the scaling factor in the dilation equation.

    the sum of the h’s

    Suppose we write the dilation equation as

    \varphi(t) = \sum_{n} h(n)\ A\ \varphi(2t-n)

    for some constant A. We can determine the sum of the h’s.

    How? Integrate. (So the integral needs to exist. Easy enough to say if we know \varphi(t)\ , but remember that, in principle, we’re talking about something whose existence is not assured.) Anyway, we integrate:

    \int \varphi(t)\ dt = \int (\sum_{n} h(n)\ A\ \varphi(2t-n))\ dt\

    Now, if we can interchange the integral and the summation (which may not be a finite sum!), then

    \int \varphi(t)\ dt = \sum_{n} h(n)\ A\ \int(\varphi(2t-n))\ dt

    Now do a change of variable y = 2t-n, dy = 2dt. We get

    \int \varphi(t)\ dt = \sum_{n} h(n)\ \frac{A}{2} \int(\varphi(y))\ dy

    Now, if the integral \int \varphi(t)\ dt is nonzero, we can divide by it, getting

    1 = \frac{A}{2}\ \sum_{n} h(n)

    and then

    \sum_{n} h(n) = \frac{2}{A}\ .

    I’ll remark that we just divided both sides of the equation by \int\ \varphi(t)\ dt = E\ , which is why the sum of the h’s is independent of E.

    For Burrus, et al., A = \sqrt{2}

    so we have that the sum of the h’s is \frac{2}{\sqrt{2}} = \sqrt{2}\ .

    The sum of the h’s is directly related to the factor A in the dilation equation, so just keep your eyes open for it.

    Let me also point out that the normalization could be considered part of the definition of the functions which span the space V_j\ . That is, instead of taking it to be the space spanned by the set of translated and scaled functions \{\varphi(t-k)\}\ , we might introduce additional scaling, say, \{\sqrt{2}\ \varphi(t-k)\}\ . We still get the same function (vector) space – all we’ve done is change the sizes of the basis functions.

    The real issue may very well be, what’s in your wallet?® — what is your software doing?

    What’s next? I will add the requirement that the scaling function be orthogonal to all of its translates. (We already have that the mother wavelet is orthogonal to the scaling function, and that wavelets in W_j\ are orthogonal to functions in V_j\ . Edit: Now we’re going to require that the basis for V_j\ be orthogonal to the basis for V_{j+1}\ . No, we won’t require that – we will get it as a result of requiring that integer translates of the scaling function be orthogonal, i.e. imposing a condition in V_0\ ).

    edit, added:
    The basis for V_j\ will not be orthogonal to the basis for V_{j+1}\ . We will neither require it, nor obtain it as a result. I will have more to say about this.

    Wavelet Properties: the dilation equation.

    Introduction

    What do I propose to show you? (Certainly not all in one post.)

    My understanding is neither complete nor rigorous. But wavelets and scaling functions, and their coefficients g and h, respectively, have a lot of properties. I want to sort them out.

    I’m not trying for rigor. (Heresy!) I’m laying things out on a table so I can begin to relate them to each other.

    The properties which I want to show you seem to fall into 4 categories.

    1. Where do the dilation equation and wavelets come from?
    2. What can we deduce from the dilation equation?
    3. What can we deduce from the requirement that the scaling function and its integer translates be orthogonal?
    4. A few things that I really, really don’t understand yet.

    There is a fair bit of repetition in here. In particular, it seemed worthwhile to repeat things within the examples.

    Where do the dilation equation and wavelets come from?

    Discussion

    Suppose we have a function \varphi(t)… even more, suppose we have a collection of its “integer translates” \{\varphi(t-k)\}\ , where k may be any integer.

    That is, for example, take \varphi(t) to be any of the scaling functions we have seen in the following two posts: the Haar, or the Daubechies D4 or the D6 or other scaling functions we drew here.

    Define the space V_0 to be that space spanned by the collection of integer translates \{\varphi(t-k)\}\ . In general, I may not be able to describe that space in any other way; whatever the \{\varphi(t-k)\}\ span, that is V_0\ .

    I also haven’t said that the \{\varphi(t-k)\}\ are linearly independent; right now, I only care what they span, not that they necessarily be a basis for it.

    Now consider the collection of scaled and translated functions \{\varphi(2t-k)\}\ , and let V_1 be the space they span.

    For the Haar system, we know that V_0 \subset V_1\ . Any step function with jumps at the integers may be written as a (rather special) linear combination of step functions with jumps at the half-integers.

    In general, however, this is a stringent requirement. Suppose it holds, for whatever hypothetical function \varphi(t) we have. That V_0 is a subspace of V_1\ , with V_1 spanned by the set \{\varphi(2t-k)\}\ tells us that any function in V_0 may be written as a linear combination (possibly infinite!) of the \{\varphi(2t-k)\}\ . In particular, the scaling function \varphi(t) itself may be written as a linear combination of the \{\varphi(2t-k)\}\ .

    But that gives us a dilation equation

    \varphi(t) = \sum_{n} c(n)\ \varphi(2t-n)\

    for some coefficients c(n). Conversely, if \varphi(t) satisfies the dilation equation, then I guess we ought to have V_0 \subset V_1\ .

    The dilation equation effectively comes from the requirement that V_0 \subset V_1\ .

    Now we add a little more structure, an inner product.

    If V_0 is a proper subspace of V_1\ , and if we introduce an inner product, then we may split V_1 into V_0 and its orthogonal complement; that is, we define the orthogonal complement W_0 as:

    V_1 = V_0\ \oplus W_0\ .

    For these purposes, the customary inner product of two functions f, g is the integral of their product:

    \int f(t) g(t) \, dt\ .

    The definition of W_0 gives us a couple of immediate consequences. For one thing, every function \psi \in W_0 is orthogonal to the scaling function \varphi(t) \in V_0\ and is also orthogonal to every translate, \varphi(t-k)\ .

    For another thing, W_0 \subset V_1\ , so every function \psi \in W_0 is also in V_1\ , and therefore can be written in terms of the set \{\varphi(2t-k)\}\ . That is, we have

    \psi(t) = \sum_{n} d(n)\ \varphi(2t-n)\

    for some coefficients d(n). This is how and why we could compute the mother wavelet from the scaling function! (Oh, yes, the mother wavelet is a function in W_0\ .)

    So the mother wavelet, as far as we know at this point, could be any function in W_0\ , and it satisfies an equation like the dilation equation. Note, as we saw when we were computing, that we have \psi on the LHS but \varphi on the RHS.

    And we continue, defining W_1\

    V_2 =: V_1 \oplus W_1

    and W_j\

    V_{j+1} =: V_j \oplus W_j\ .

    Now, for example, take V_2 in terms of V_1 and W_1 and then write V_1 in terms of V_0 \text{ and } W_0\ :

    V_2 =: V_1 \oplus W_1 = V_0 \oplus W_0 \oplus W_1\ .

    More generally,

    V_{j+1} =: V_j \oplus W_j =  V_0 \oplus W_0 \oplus W_1 \oplus ... \oplus W_j\ .

    We can write a function in V_{j+1} using a single scaling function in V_0\ , and wavelets from the W_0, \ldots, W_j\ . (Oh, yes, the wavelets derived from the scaling function \varphi(t) \in V_0 are functions in the W_j\ spaces.)

    We will see, in the example below, that it makes sense to go the other way, too. Suppose that for some reason – I’ll give you one, soon – we want our scaling function to be a step function of width 4 instead of width 1. (So I am speaking of the Haar system in particular, but it can make sense to go the other way for any wavelet system.)

    Where does that live? Well, step functions of width 2 should live in V_{-1}\ . They are related to step functions of width 1 in the same way step functions of width 1 are related to step functions of width 1/2.

    So step functions of width 4 should live in V_{-2}\ . And we should have

    V_{-1} = V_{-2} \oplus W_{-2}\ ,

    V_{0} = V_{-1} \oplus W_{-1} = V_{-2} \oplus W_{-2} \oplus W_{-1}\ .

    Example: Haar

    For the Haar system, here’s the scaling function \varphi(t) and one possible translate, \varphi(t+1)\ , on [-1,3]…

    haar t and trans

    As I said earlier, define the space V_0 to be that space spanned by the collection of integer translates \{\varphi(t-k)\}\ . We can describe V_0\ : the space of step functions with jumps at the integers.

    Now we consider the collection of scaled and translated functions \{\varphi(2t-k)\}\ . Here’s \varphi(2t) for the Haar… and for one of its translates, in this case, \varphi(2t-1)\ .

    haar 2t and trans

    They are nonzero only over a half-unit interval. That is significant for their inner products with themselves (their norms).

    Define the space V_1 as that space spanned by the collection \{\varphi(2t-k)\}\ . For the Haar system, we can describe V_1 as the space of step functions with jumps at the half-integers.

    Note that if we add \varphi(2t) and \varphi(2t-1)\ , we get the unit step function on [0,1) — that is, we get \varphi(t)\ :

    \varphi(t) =  \varphi(2t) +  \varphi(2t-1)\ .

    That’s a form of the dilation equation!

    (It’s not the way we’ve been writing it – we’re missing \sqrt{2}\ – but this is a dilation equation, and we can – and will shortly ! – insert the \sqrt{2}\ easily enough.)

    That says that V_0 \subset V_1\ : any piecewise step function whose jumps are at integers can be written as (an admittedly special) piecewise step function whose jumps are at half-integers.
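
    The Haar dilation equation above is easy to check numerically. The following is a minimal sketch in Python (not the Mathematica used for this post); the function name phi and the sample grid are choices made here for illustration only.

```python
from fractions import Fraction

def phi(t):
    """Haar scaling function: 1 on the half-open interval [0, 1), 0 elsewhere."""
    return 1 if 0 <= t < 1 else 0

# Sample at multiples of 1/8 on [-1, 2), using exact rational arithmetic
# so that the interval endpoints are tested without rounding.
samples = [Fraction(k, 8) for k in range(-8, 16)]

# phi(t) = phi(2t) + phi(2t - 1) should hold at every sample point.
dilation_holds = all(phi(t) == phi(2 * t) + phi(2 * t - 1) for t in samples)
print(dilation_holds)
```

    On [0, 1/2) only the \varphi(2t) term is nonzero, on [1/2, 1) only the \varphi(2t-1) term is, and outside [0,1) both vanish, which is all the check is confirming.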

    For the Haar functions \varphi(t), this is a property we observe. We generalize it and say that we want it to be a property of any scaling function; we require the space V_0 (spanned by \{\varphi(t-k)\} ) to be a subspace of V_1 (spanned by \{\varphi(2t-k)\}):

    Assume

    • the set \{\varphi(t-k)\} spans V_0 by definition of V_0
    • the set \{\varphi(2t-k)\} spans V_1 by definition of V_1
    • V_0 \subset V_1

    Then any function in V_1 can be written, by definition, as a linear combination of the \varphi(2t-k)… hence any function in V_0 \subset V_1 can be written as a linear combination of the \varphi(2t-k)… hence the particular function \varphi(t) \in V_0 can be written as a linear combination of the \varphi(2t-k).

    That is, we have a dilation equation

    \varphi(t) = \sum_{n} c(n) \varphi(2t-n)

    essentially because we require V_0 \subset V_1.

    The D4 and D6 scaling functions are solutions to the dilation equation for particular coefficients. Maybe now is a good time to emphasize that the h’s and the scaling function \varphi(t) are hidden in the mist, as it were. We can apparently write a dilation equation for any set (possibly infinite) of h’s, and we have no idea if there’s a solution at all. And if there is a solution, is it unique, is it continuous, is it differentiable, is it integrable, is it square integrable? This is why we look for properties of the h’s and the solution.

    Example: Nievergelt

    The text (Nievergelt, Yves. Wavelets Made Easy. Birkhäuser, 2001 (2nd printing with corrections), ISBN 0 8176 4061 4.) opens with a simple example.

    We have 4 data points, and I want to illustrate these V and W spaces. The specific data is

    \{5,1,2,8\}\ .

    Now imagine that instead of 4 points I have a step function f(t) defined on the quarter integers in [0,1). I have V_0 \text{ thru } V_2\ (integers, halves, quarters), so I write

    V_2 = V_1 \oplus W_1 = V_0 \oplus W_0 \oplus W_1\ .

    It appears that I want the scaling function \varphi(t)\ , the mother wavelet \psi(t)\ , and two wavelets \psi(2t)\ and \psi(2t-1)\ . That is, we are going to write our step function f(t) as the combination

    f(t) = a\ \varphi(t) + b\ \psi(t) + c\ \psi(2t) + d\ \psi(2t-1)\ ,

    where a, b, c, d are called the wavelet coefficients. They are components of a vector – in a function space – where the basis vectors are functions. (Yes, those 4 functions are a basis for the functions in V_2 that vanish outside [0,1)\ .)

    Here are those four functions all together.

    spaces grid

    I should probably remind us all that the Haar mother wavelet \psi(t) satisfies the equation

    \psi(t) = \varphi(2t) - \varphi(2t-1)\ ,

    and that’s exactly how I compute it.

    (“Reverse the h’s and alternate the signs.”)

    Let me point out that I could also write f(t) \in V_2\ , and I could graph the step function f(t) using these 4 basis functions (all scaling, no wavelets; all \varphi \text{ no } \psi\ ):

    f(t)=5\ \varphi (4 t)+\varphi (4 t-1)+2\ \varphi (4t-2)+8\ \varphi (4 t-3)\ .

    I can write it, but it’s nowhere near as useful as the wavelet decomposition. (On the other hand, it’s probably the most convenient way to graph f(t)!) So, here’s the graph, constructed exactly that way.

    spaces new f

    Now I am going to give you Nievergelt’s wavelet decomposition. There are some awkward points, at this stage, so I’m just going to hand you the answer for now.

    I should do it using an orthonormal basis; he did not. At this stage, I should compute the “wavelet coefficients” by taking dot products, and for that an orthonormal basis is extremely useful. (If the basis is not orthonormal, then we need a reciprocal basis in order to compute components using the inner product.) In practice, we would use a different algorithm, one which does not require taking inner products, hence does not require that the basis vectors be normalized.

    As it happens, the scaling function and the mother wavelet are both normalized: the integrals of their squares are 1, so their norms — the square roots — are 1.

    \int \varphi(t)\  \varphi(t) \, dt\ = 1.

    \int \psi(t)\  \psi(t) \, dt\ = 1.

    (Their areas are also 1, but that’s not what we’re computing here.)

    The two daughter wavelets \psi(2t)\ and \psi(2t-1)\ , however, are not normalized. It’s true that their squares, at each point, are either 1 or 0, but they are nonzero over half-intervals, so the integrals are 1/2 instead of 1. We need to multiply each of these functions by \sqrt{2} to double the heights of their squares – if we want the daughter wavelets to be part of an orthonormal basis.

    That is, for example,

    \int \varphi(2t)\  \varphi(2t) \, dt\ = \frac{1}{2}.

    so

    \int (\sqrt{2}\ \varphi(2t)\  ) (\sqrt{2}\ \varphi(2t)\  ) \, dt\ = 1\ .

    (I told you I’d put that \sqrt{2} in there. And if we wanted unit area instead of unit norm, we would have multiplied by 2, instead of by \sqrt{2}\ . We’ll see this again.)
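
    Those norms can be checked with exact arithmetic. This is only a sketch, under the assumption (made here for convenience) that each function is stored as its four constant values on the quarter-intervals of [0,1); the names inner and norms_squared are illustrative, not from the post.

```python
from fractions import Fraction

# Each function is constant on the quarter-intervals of [0, 1),
# so it can be stored as its four values there.
phi      = [1, 1, 1, 1]      # phi(t)
psi      = [1, 1, -1, -1]    # psi(t) = phi(2t) - phi(2t-1)
psi_2t   = [1, -1, 0, 0]     # psi(2t)
psi_2t_1 = [0, 0, 1, -1]     # psi(2t-1)
basis = [phi, psi, psi_2t, psi_2t_1]

def inner(f, g):
    """Integral of f*g over [0, 1): each quarter-interval has width 1/4."""
    return sum(Fraction(a * b, 4) for a, b in zip(f, g))

# Squared norms: 1, 1, 1/2, 1/2.
norms_squared = [inner(f, f) for f in basis]
print(norms_squared)

# Multiplying a function by sqrt(2) doubles its squared norm,
# so the two daughter wavelets end up with squared norm 1.
rescaled = [2 * n for n in norms_squared[2:]]

# The four functions are already mutually orthogonal.
orthogonal = all(inner(basis[i], basis[j]) == 0
                 for i in range(4) for j in range(i + 1, 4))
```

    So the basis is orthogonal as it stands, and only the two daughter wavelets need the factor of \sqrt{2} to make it orthonormal.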

    But I’m going to hold off on these calculations, until we’ve looked at some more properties. I simply tell you that the numerical result is that the components for this expansion are

    \{a\ ,b\ ,c\ ,d\} = \{4,-1,2,-3\}\

    … and that f(t) is

    f(t) = 4\ \varphi(t) - 1\ \psi(t) + 2\ \psi(2t) - 3\ \psi(2t-1)\ .

    (That’s easy enough to check algebraically: just use the recursion equation for the \psi \text{'s}, and then use the dilation on the resulting \varphi \text{'s}\ . But if you can compute all those functions, it is easier to just graph that equation and check it visually.)
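
    For anyone who would rather evaluate than graph, here is a small sketch that evaluates the expansion at the midpoint of each quarter-interval. The function names and the choice of midpoints are assumptions made for this illustration.

```python
from fractions import Fraction

def phi(t):
    """Haar scaling function: 1 on [0, 1), 0 elsewhere."""
    return 1 if 0 <= t < 1 else 0

def psi(t):
    """Haar mother wavelet, computed from phi: psi(t) = phi(2t) - phi(2t-1)."""
    return phi(2 * t) - phi(2 * t - 1)

def f(t):
    """The wavelet expansion with coefficients {4, -1, 2, -3}."""
    return 4 * phi(t) - 1 * psi(t) + 2 * psi(2 * t) - 3 * psi(2 * t - 1)

# One sample per quarter-interval of [0, 1): t = 1/8, 3/8, 5/8, 7/8.
midpoints = [Fraction(2 * k + 1, 8) for k in range(4)]
values = [f(t) for t in midpoints]
print(values)  # [5, 1, 2, 8]
```

    The four values are the original data, in order.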

    Let me lead up to that in stages. Here is the least-detailed term: f(t) \approx 4\ \varphi(t)

    haar V0

    It tells us that the average of the data is 4; alternatively, if we smooth the data by its average, that is what we get. Yet again, if we approximate our function f(t) \in V_2 by a particular function in V_0\ , that is the approximation.

    Now add the next term. If we approximate our function f(t) \in V_2 by our functions in V_1 =  V_0 \oplus W_0 \ , namely as

    f(t) \approx 4\ \varphi(t) - \psi(t)\ ,

    here’s what we get:

    haar W0

    It can be interpreted as: the average of the first two observations is 3 and the average of the last two is 5.

    Finally, our given function f(t) \in V_2 can be written exactly as

    f(t) = 4\ \varphi(t) - 1\ \psi(t) + 2\ \psi(2t) - 3\ \psi(2t-1)\ ,

    which is:

    haar W1 W2

    Now, one more thing. I didn’t have to replace the data by a function in V_2\ ; I could have decided that f(t) was a step function defined (that is, nonzero) on the half-open interval [1, 5) with jumps on the integers \{1,2,3,4,5\}\ . Then my scaling function needs to be of width 4, i.e. in V_{-2}\ , my mother wavelet is in W_{-2}\ , and the two finest-scale wavelets are in W_{-1}\ :

    V_{0} = V_{-2} \oplus W_{-2} \oplus W_{-1}\ .

    Let me be very clear. I could have gone so far as to replace my 4 points by the step function

    f(t)=5\ \varphi (t-1)+\varphi (t-2)+2\ \varphi (t-3)+8\ \varphi (t-4)\ .

    spaces wide shift

    That is, we can handle a change in scale and a shift: I can use integer times and I can even start time at t=1, if I choose. There’s no reason to use the shift, although it’s important to understand that we could; I’m going to forget the shift, and revert to just a change of scale,

    f(t)=5\ \varphi (t)+\varphi (t-1)+2\ \varphi (t-2)+8\ \varphi (t-3)\ ,

    i.e.

    spaces wide no shift

    The wavelet expansion should use

    V_{0} = V_{-2} \oplus W_{-2} \oplus W_{-1}

    and instead of

    f(t) = 4\ \varphi(t) - 1\ \psi(t) + 2\ \psi(2t) - 3\ \psi(2t-1)\ ,

    should be

    f(t) = 4\ \varphi(t/4) - 1\ \psi(t/4) + 2\ \psi(t/2) - 3\ \psi(t/2-1)\

    (with the very same coefficients, just applied to differently-scaled functions).
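
    The claim that only the scale changes can be sketched by putting the width in as a parameter. The parameter name width and the helper expansion are hypothetical, introduced here just to compare the two cases; width = 1 gives the expansion on [0,1) and width = 4 gives the one on [0,4).

```python
from fractions import Fraction

def phi(t):
    """Haar scaling function: 1 on [0, 1), 0 elsewhere."""
    return 1 if 0 <= t < 1 else 0

def psi(t):
    """Haar mother wavelet: psi(t) = phi(2t) - phi(2t-1)."""
    return phi(2 * t) - phi(2 * t - 1)

def expansion(t, width):
    """Same coefficients {4, -1, 2, -3}, applied to functions stretched by `width`."""
    u = Fraction(t) / width          # rescaled time
    return 4 * phi(u) - psi(u) + 2 * psi(2 * u) - 3 * psi(2 * u - 1)

# Width 1: sample the middle of each quarter-interval of [0, 1).
narrow = [expansion(Fraction(2 * k + 1, 8), 1) for k in range(4)]

# Width 4: sample the middle of each unit interval of [0, 4).
# Here psi(2u) is psi(t/2) and psi(2u - 1) is psi(t/2 - 1).
wide = [expansion(Fraction(2 * k + 1, 2), 4) for k in range(4)]

print(narrow, wide)
```

    Both lists come out as the original data, which is the sense in which the coefficients do not care about the time scale.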

    Here’s what it looks like:

    spaces wide fit

    Summary

    From our chosen scaling function \varphi(t)\ – assuming it exists, i.e. assuming that we chose the h’s and found a scaling function as a solution of the dilation equation! – we get a nested sequence of V spaces. They are defined by integer translates and scaled versions of \varphi(t)\ . The key point is that the nesting V_0 \subset V_1 is what gives us the dilation equation: \varphi(t) \in V_0 must then be a linear combination of the \varphi(2t-k)\ .

    By introducing an inner product (dot product), we get the W spaces, each as the orthogonal complement of a V space. In particular, since

    V_1 = V_0 \oplus W_0\ ,

    we have that \psi(t) \in W_0 is orthogonal to \varphi(t) \in V_0\ ; and since that subspace decomposition implies

    W_0 \subset V_1\ ,

    we have that \psi(t) also satisfies something like the dilation equation.

    \psi(t) = \sum_{n} d(n)\ \varphi(2t-n)\ .

    Next, I expect, we will look at what the dilation equation tells us about the h’s in it. (Or the c’s, whatever we call these coefficients and however we scale them.)

    N = 6 scaling functions and mother wavelets

    introduction

    The impetus for this post is figure 1.4 on page 6 of Burrus et al. (Since it’s been a while, that’s Burrus, C. Sidney; Gopinath, Ramesh A.; Guo, Haitao. Introduction to Wavelets and Wavelet Transforms, A Primer. Prentice Hall, 1998. ISBN 0 13 489600 9.)

    They offered the figure as an illustration of four different scaling functions; in addition, these scaling functions were parameterized by two angles.

    Now I know how to produce the drawings of those scaling functions — and I also know how to produce drawings of the corresponding mother wavelets, which they did not show on page 6. It also turns out that they have interchanged the legends for figures (a) and (b) — and when we can produce the drawings ourselves, that mistake becomes almost irrelevant.

    This post also serves to reiterate the calculation sequence for plotting a scaling function and its corresponding mother wavelet.

    setup

    It turns out that there are equations which must be satisfied by any “reasonable” filter coefficients h. I expect to show these to you in the next post. I have already shown you a description of the solutions of these equations when we have 4 nonzero coefficients. And I have shown you how to compute scaling functions and mother wavelets for N = 4 (D4 only) and for N = 2 (D2 = Haar).

    Let me emphasize that what I understand is how to get the equations which the filter coefficients h must satisfy. What I do not yet understand is how to parameterize the solutions in terms of angles.

    Let me now hand you a description of the solutions when we have 6 nonzero coefficients. The four examples I am about to work out all come from specific choices of these solutions. That is, for every one of the following examples, N = 6.

    The following 4 scaling functions match figure 1.4, although (a) and (b) are swapped. (The legend describes the values “a” and “b” in the equations; the legend for figure a belongs with figure b.)

    In addition, they have drawings of the Daubechies D6 (p. 82) and Coiflet C6 (p. 94) mother wavelets, so I have confirmed 6 of the following 8 graphs.

    From p. 66 of Burrus et al., we get the following (using H instead of h just because it lets me preserve the equations in H while setting values for h.)

    Let me tell you before I start writing that the last two equations are actually

    H(4) = -H(0)-H(2)+\frac{1}{\sqrt{2}}

    H(5) = -H(1)-H(3)+\frac{1}{\sqrt{2}}

    That is, they say that the sum of the even coefficients is equal to the sum of the odd coefficients is equal to half the sum of all the coefficients = \frac{1}{2} \sqrt{2}= \frac{1}{\sqrt{2}}\ . They will look much more complicated than that when I substitute the first four equations.

    H(0) = \frac{(\cos (a)+\sin (a)+1) (-\cos (b)-\sin (b)+1)+2 \cos (a) \sin (b)}{4 \sqrt{2}}

    H(1) = \frac{(-\cos (a)+\sin (a)+1) (\cos (b)-\sin (b)+1)-2 \cos (a) \sin (b)}{4 \sqrt{2}}

    H(2) = \frac{\cos (a-b)+\sin (a-b)+1}{2 \sqrt{2}}

    H(3) = \frac{\cos (a-b)-\sin (a-b)+1}{2 \sqrt{2}}

    H(4) = -\frac{\cos (a-b)+\sin (a-b)+1}{2 \sqrt{2}}-\frac{(\cos (a)+\sin (a)+1) (-\cos (b)-\sin (b)+1)+2 \cos (a)   \sin (b)}{4 \sqrt{2}}+\frac{1}{\sqrt{2}}

    H(5) = -\frac{\cos (a-b)-\sin (a-b)+1}{2 \sqrt{2}}-\frac{(-\cos (a)+\sin (a)+1) (\cos (b)-\sin (b)+1)-2 \cos (a)   \sin (b)}{4 \sqrt{2}}+\frac{1}{\sqrt{2}}

    legend 1.4 a (figure b)

    The first choice (the legend for figure 1.4 a) for the angles a and b (in radians) is

    \{a\to 1.3598,b\to -0.782106\}

    (I knew when I saw those that the figure was wrong; by that time I recognized the D6 h’s.)

    The resulting h’s are…

    \{0.332671,0.806892,0.459878,-0.135011,-0.0854413,0.0352263\}
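
    The six formulas for H(0) through H(5) translate directly into code. This is a sketch in Python rather than the Mathematica behind the post; the function name h_from_angles is an assumption, and H(4), H(5) are computed from the two sum rules stated above rather than from their expanded forms.

```python
import math

def h_from_angles(a, b):
    """Six filter coefficients from the two angles a, b (in radians)."""
    s2 = math.sqrt(2)
    ca, sa = math.cos(a), math.sin(a)
    cb, sb = math.cos(b), math.sin(b)
    h0 = ((1 + ca + sa) * (1 - cb - sb) + 2 * sb * ca) / (4 * s2)
    h1 = ((1 - ca + sa) * (1 + cb - sb) - 2 * sb * ca) / (4 * s2)
    h2 = (1 + math.cos(a - b) + math.sin(a - b)) / (2 * s2)
    h3 = (1 + math.cos(a - b) - math.sin(a - b)) / (2 * s2)
    # Sum rules: the even h's and the odd h's each add up to 1/sqrt(2).
    h4 = 1 / s2 - h0 - h2
    h5 = 1 / s2 - h1 - h3
    return [h0, h1, h2, h3, h4, h5]

# The angles quoted in the legend for figure 1.4 (a).
h = h_from_angles(1.3598, -0.782106)
print([round(x, 6) for x in h])
```

    Because the angles are only given to a handful of digits, the result should be expected to agree with the list above to roughly four decimal places, not exactly.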

    Now we form the m0 matrix (in stages for clarity)…

     \left(\begin{array}{llllll} \text{h0} & 0 & 0 & 0 & 0 & 0 \\ \text{h2} & \text{h1} & \text{h0} & 0 & 0 & 0 \\ \text{h4} & \text{h3} & \text{h2} & \text{h1} & \text{h0} & 0 \\ 0 & \text{h5} & \text{h4} & \text{h3} & \text{h2} & \text{h1} \\ 0 & 0 & 0 & \text{h5} & \text{h4} & \text{h3} \\ 0 & 0 & 0 & 0 & 0 & \text{h5}\end{array}\right)

    Then we multiply by Sqrt[2]…

    M_0 = \left(\begin{array}{llllll} \sqrt{2}\ \text{h0} & 0 & 0 & 0 & 0 & 0 \\ \sqrt{2}\ \text{h2} & \sqrt{2}\ \text{h1} & \sqrt{2}\ \text{h0} & 0 & 0 & 0   \\ \sqrt{2}\ \text{h4} & \sqrt{2}\ \text{h3} & \sqrt{2}\ \text{h2} & \sqrt{2}\   \text{h1} & \sqrt{2}\ \text{h0} & 0 \\ 0 & \sqrt{2}\ \text{h5} & \sqrt{2}\ \text{h4} & \sqrt{2}\ \text{h3} &   \sqrt{2}\ \text{h2} & \sqrt{2}\ \text{h1} \\ 0 & 0 & 0 & \sqrt{2}\ \text{h5} & \sqrt{2}\ \text{h4} & \sqrt{2}\ \text{h3}   \\ 0 & 0 & 0 & 0 & 0 & \sqrt{2}\ \text{h5}\end{array}\right)

    Note that since all 4 examples have N = 6, that is the m0 matrix in principle for each of these examples. I won’t show it again.

    Now we set the values…
    M_0 = \left(\begin{array}{llllll} 0.470467 & 0. & 0. & 0. & 0. & 0. \\ 0.650365 & 1.14112 & 0.470467 & 0. & 0. & 0. \\ -0.120832 & -0.190934 & 0.650365 & 1.14112 & 0.470467 & 0. \\ 0. & 0.0498175 & -0.120832 & -0.190934 & 0.650365 & 1.14112 \\ 0. & 0. & 0. & 0.0498175 & -0.120832 & -0.190934 \\ 0. & 0. & 0. & 0. & 0. & 0.0498175\end{array}\right)
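
    The pattern of the matrix is that the entry in row i and column j (both counted from 0) is \sqrt{2}\ h(2i-j), with h taken to be 0 outside its range; that is what evaluating the dilation equation at the integers produces. A minimal sketch, with the function name build_m0 chosen here for illustration:

```python
import math

def build_m0(h):
    """M0[i][j] = sqrt(2) * h(2i - j), treating h(k) as 0 when k is out of range."""
    n = len(h)

    def coeff(k):
        return h[k] if 0 <= k < n else 0.0

    return [[math.sqrt(2) * coeff(2 * i - j) for j in range(n)]
            for i in range(n)]

# The D6 coefficients listed above.
h = [0.332671, 0.806892, 0.459878, -0.135011, -0.0854413, 0.0352263]
m0 = build_m0(h)
for row in m0:
    print([round(x, 6) for x in row])

# Each column holds either all the even h's or all the odd h's,
# so every column of M0 should add up to (approximately) 1.
column_sums = [sum(m0[i][j] for i in range(len(h))) for j in range(len(h))]
```

    The column sums being 1 is just the sum rule for the even and odd coefficients again, and it is one way to see why an eigenvalue of 1 is plausible.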

    We find the eigenvector of eigenvalue 1. Mathematica returned a unit vector, but I want one whose components add up to 1. I will simply divide the unit vector by the sum of its components.

    Here’s the unit vector…

    \{0.,-0.955434,0.286583,-0.0707606,-0.00314509,0.\}

    The sum of its components is -0.742756, so I get

    V = \{0.,1.28634,-0.385837,0.0952675,0.00423435,0.\}
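
    The rescaling step is a one-liner. This sketch simply starts from the unit vector as printed above (so it inherits its rounding); the variable names are illustrative.

```python
# Unit-length eigenvector as printed above (eigenvalue 1).
unit = [0.0, -0.955434, 0.286583, -0.0707606, -0.00314509, 0.0]

# Divide by the sum of the components so that the new components add up to 1.
total = sum(unit)
V = [x / total for x in unit]

print(round(total, 6))
print([round(x, 6) for x in V])
```

    Note that the sum is negative here, so dividing by it also flips the overall sign of the eigenvector.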

    This function should be defined on [0,5]. I use that vector for the values of the scaling function at the integers.

    Picture 34

    Working, I graphed this immediately, but for the post, I ought to put the two drawings — scaling function and mother wavelet — in close proximity. This one turns out to be the Daubechies D6 scaling function.

    To get the D6 mother wavelet, we take our filter coefficients h…

    h = \{0.332671,0.806892,0.459878,-0.135011,-0.0854413,0.0352263\}

    … we reverse them, and alternate the signs. The resulting filter coefficients are often denoted by h1 or by g. I’ll use g. The resulting g coefficients are what I expect (the sign of g[0] = h[5] has not been changed), but their Table 6.2 on p. 79 has the negatives of these numbers. And yet, I match their picture on p. 82.

    (That’s why there’s a \pm in front of the recipe. Sorry, but you’ll have to look for it in the post.)

    g = \{0.0352263,0.0854413,-0.135011,-0.459878,0.806892,-0.332671\}

    Define the mother wavelet…

    Picture 35

    and plot it…
    (These are Daubechies D6 mother wavelet and scaling function, respectively.)

    Picture 36

    Beautiful. That matches their picture on p. 82.

    And here’s the D6 scaling function whose plot I delayed showing you…

    Picture 37

    and that matches (b), rather than (a), of figure 1.4.

    legend 1.4 b (figure a)

    Let’s make a different choice for a and b. I believe this is the “Coiflet” of order 6.

    \{a\to 1.1468,b\to 0.42403\}

    The resulting h’s are:

    h = \{-0.0727362,0.337915,0.852573,0.384847,-0.0727302,-0.0156552\}

    Now we form the m0 matrix exactly as before, and set the new values…

     M_0 = \left(\begin{array}{llllll} -0.102864 & 0. & 0. & 0. & 0. & 0. \\ 1.20572 & 0.477884 & -0.102864 & 0. & 0. & 0. \\ -0.102856 & 0.544256 & 1.20572 & 0.477884 & -0.102864 & 0. \\ 0. & -0.0221398 & -0.102856 & 0.544256 & 1.20572 & 0.477884 \\ 0. & 0. & 0. & -0.0221398 & -0.102856 & 0.544256 \\ 0. & 0. & 0. & 0. & 0. & -0.0221398\end{array}\right)

    We find the eigenvector of eigenvalue 1.

    \{0.,-0.189494,0.961829,-0.197385,0.00396248,0.\}

    That’s a unit vector, so I find the sum of its components (0.578913) and divide by it, getting

    V = \{0.,-0.327328,1.66144,-0.340958,0.0068447,0.\}

    Set those to be the values of the scaling function at the integers, and define the recursion…

    Picture 38

    Again, I will delay the plot until I have the mother wavelet.

    To get the mother wavelet, we start with the h’s…

    h = \{-0.0727362,0.337915,0.852573,0.384847,-0.0727302,-0.0156552\}

    (Those agree with p. 92.)

    We reverse them, and alternate the signs. (This time, I need to use their signs! And they have got g[0] = -h[5].)

     g = \{0.0156552,-0.0727302,-0.384847,0.852573,-0.337915,-0.0727362\}

    Picture 39

    compute and plot…
    (These are the Coiflet C6 mother wavelet and scaling function in order.)

    Picture 40

    That matches their p. 94 drawing of the C6 mother wavelet.

    And here’s the C6 scaling function…

    Picture 41

    and that matches their figure 1.4 (a).

    figure 1.4 c

    I do not know if the following pair has a name. We take

    \left\{a\to \frac{23 \pi }{60},b\to -\frac{\pi }{12}\right\}

    The resulting h’s are…

    h = \{0.0858766,0.652297,0.742126,0.0388932,-0.120896,0.0159163\}

    Now we form the m0 matrix, and set the values.

     M_0 = \left(\begin{array}{llllll} 0.121448 & 0. & 0. & 0. & 0. & 0. \\ 1.04953 & 0.922488 & 0.121448 & 0. & 0. & 0. \\ -0.170973 & 0.0550033 & 1.04953 & 0.922488 & 0.121448 & 0. \\ 0. & 0.022509 & -0.170973 & 0.0550033 & 1.04953 & 0.922488 \\ 0. & 0. & 0. & 0.022509 & -0.170973 & 0.0550033 \\ 0. & 0. & 0. & 0. & 0. & 0.022509\end{array}\right)

    Now we find the eigenvector of eigenvalue 1:

    \{0.,-0.840331,-0.536329,0.0786992,0.00151279,0.\}

    The sum of its components is -1.29645, so we divide by it, to get a vector whose components add up to 1:

    V = \{0.,0.648179,0.413691,-0.0607037,-0.00116688,0.\}

    That gives us the initial values, and we define the recursion…

    Picture 42

    Recall the h’s…

    h =\{0.0858766,0.652297,0.742126,0.0388932,-0.120896,0.0159163\}

    We reverse them, and alternate the signs.

    g = \{-0.0159163,-0.120896,-0.0388932,0.742126,-0.652297,0.0858766\}

    Redefine the mother wavelet…

    Picture 43

    and plot it…

    Picture 44

    … followed by the scaling function:

    Picture 45

    The last case for their Figure 1.4.

    Here are the angles a and b…

    \left\{a\to \frac{3 \pi }{4},b\to \frac{2 \pi }{15}\right\}

    Here are the resulting h’s…

    h = \{-0.158303,0.744755,0.556922,-0.103219,0.308488,0.0655711\}

    We form the M0 matrix and set its values…

    M_0 = \left(\begin{array}{llllll} -0.223874 & 0. & 0. & 0. & 0. & 0. \\ 0.787606 & 1.05324 & -0.223874 & 0. & 0. & 0. \\ 0.436267 & -0.145974 & 0.787606 & 1.05324 & -0.223874 & 0. \\ 0. & 0.0927315 & 0.436267 & -0.145974 & 0.787606 & 1.05324 \\ 0. & 0. & 0. & 0.0927315 & 0.436267 & -0.145974 \\ 0. & 0. & 0. & 0. & 0. & 0.0927315\end{array}\right)

    Once again we find the eigenvector of eigenvalue 1. (If you’re using Mathematica, be careful. It’s probably the 2nd one rather than the 1st, in this case.) The normalized eigenvector is…

    \{0.,0.955662,0.22728,0.184742,0.0303893,0.\}

    The sum of its components is 1.39807, so we divide through and get our initial values for the scaling function…

    V = \{0.,0.683556,0.162567,0.13214,0.0217365,0.\}

    So we define the scaling function

    Picture 46

    Here are the h’s…

    h = \{-0.158303,0.744755,0.556922,-0.103219,0.308488,0.0655711\}

    We reverse them, and alternate the signs to get a set of g’s…

    g = \{-0.0655711,0.308488,0.103219,0.556922,-0.744755,-0.158303\}

    Define the mother wavelet…

    Picture 47

    Compute and plot it…

    Picture 48

    And here’s the scaling function (which we defined first, as usual)…

    Picture 49

    Closing

    So. We’ve had some practice computing the scaling function and the mother wavelet from the h’s. We know that the h’s are not arbitrary, but are given by solutions to some set of equations, and I’ve shown you the general solutions for N = 4 and N = 6.

    Now I really owe you an explanation of where the heck these computations come from. But at least you’ll know why I’m writing equations when you see them.

    Next, I think.

    Mother wavelet from scaling function: D4 and Haar

    D4 dyadic wavelet

    We can compute the mother wavelet from the scaling function. Let me show you how to do this for Daubechies’s D4.

    First, we need the scaling function.

    the D4 scaling function again, quickly

    This is review. There is a matrix M0 which has an eigenvector with eigenvalue 1, and that eigenvector gives me the values of the scaling function \varphi at the integers. Once I have initial values, I can compute \varphi by recursion (and because of the specific form of the recursion, only at points whose denominator is a power of 2).

    I start by constructing the matrix M0 in principle…

    M_0 = \left(\begin{array}{llll} \sqrt{2}\  \text{h0} & 0 & 0 & 0 \\ \sqrt{2}\  \text{h2} & \sqrt{2}\  \text{h1} & \sqrt{2}\  \text{h0}   & 0 \\ 0 & \sqrt{2}\  \text{h3} & \sqrt{2}\  \text{h2} & \sqrt{2}\    \text{h1} \\ 0 & 0 & 0 & \sqrt{2}\  \text{h3}\end{array}\right)

    (This comes from evaluating the dilation equation at the integers, then writing the resulting system of equations as a matrix-vector equation. It turns out to be M_0\ \varphi = \varphi\ . In fact, I construct M0 from its pattern, but I know how to derive – hence check – it.)

    Then I set the values of the filter coefficients h…

    \left\{\text{h0}\to \frac{1+\sqrt{3}}{4 \sqrt{2}\ },\text{h1}\to   \frac{3+\sqrt{3}}{4 \sqrt{2}\ },\text{h2}\to   \frac{3-\sqrt{3}}{4 \sqrt{2}\ },\text{h3}\to   \frac{1-\sqrt{3}}{4 \sqrt{2}\ }\right\}

    The matrix M0 becomes…

    M_0 = \left(\begin{array}{llll} 0.683013 & 0. & 0. & 0. \\ 0.316987 & 1.18301 & 0.683013 & 0. \\ 0. & -0.183013 & 0.316987 & 1.18301 \\ 0. & 0. & 0. & -0.183013\end{array}\right)

    Now I look for an eigenvector having eigenvalue 1. Here are the eigenvalues (first row) and eigenvectors (subsequent rows)…

    \begin{array}{c} \{1.,0.683013,0.5,-0.183013\} \\ \left(\begin{array}{llll} 0. & -0.965926 & 0.258819 & 0. \\ 0.408248 & -0.816497 & 0.408248 & 0. \\ 0. & 0.707107 & -0.707107 & 0. \\ 0. & 0.408248 & -0.816497 & 0.408248\end{array}\right)\end{array}

    We see, as we saw before, that we want the first eigenvector from that list. Mathematica has given me a vector of length 1; I want a vector – I’ll explain later – whose components sum to 1.

    V = \{0.,1.36603,-0.366025,0.\}
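
    If you do not have an eigenvector routine handy, the same vector can be found by solving a linear system. This is a sketch of a swapped-in technique, not what was done for this post: it assumes eigenvalue 1 is simple, takes the equations (M_0 - I)v = 0, and replaces the last one by "the components sum to 1". All function names are hypothetical.

```python
import math

def build_m0(h):
    """M0[i][j] = sqrt(2) * h(2i - j), with h(k) = 0 outside its range."""
    n = len(h)
    return [[math.sqrt(2) * (h[2 * i - j] if 0 <= 2 * i - j < n else 0.0)
             for j in range(n)] for i in range(n)]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def integer_values(h):
    """Values of the scaling function at the integers, normalized to sum to 1."""
    n = len(h)
    m0 = build_m0(h)
    A = [[m0[i][j] - (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    A[-1] = [1.0] * n                 # replace one equation by sum(v) = 1
    return solve(A, [0.0] * (n - 1) + [1.0])

r3, r2 = math.sqrt(3), math.sqrt(2)
h_d4 = [(1 + r3) / (4 * r2), (3 + r3) / (4 * r2),
        (3 - r3) / (4 * r2), (1 - r3) / (4 * r2)]
V = integer_values(h_d4)
print([round(x, 6) for x in V])
```

    Replacing an equation is safe here because the columns of M0 each sum to 1, so the rows of M_0 - I add up to zero and any single one of them is redundant. For D4 the result agrees with the vector V above.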

    I take those components as the initial values of the scaling function, i.e. its values at the integers. Then I write a recursive function definition, which includes the requirement that the function is zero outside the interval [0,3].

    Frankly, the easiest way to show it to you is a drawing, especially of what Mathematica knows after I make the definition; read the last line.

    D4 may 9 1

    In order to make a point, I am not going to evaluate that function anywhere else. I have all I need for now.
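
    For readers without Mathematica, here is a rough Python equivalent of such a recursive definition. It is a sketch under stated assumptions: the argument must be a Fraction whose denominator is a power of 2 (otherwise the recursion never reaches an integer), the cache stands in for the lookup table, and the closed forms (1 \pm \sqrt{3})/2 for the values at 1 and 2 are the exact versions of the decimals in V above.

```python
import math
from fractions import Fraction
from functools import lru_cache

R2, R3 = math.sqrt(2), math.sqrt(3)

# D4 filter coefficients h0..h3.
H = [(1 + R3) / (4 * R2), (3 + R3) / (4 * R2),
     (3 - R3) / (4 * R2), (1 - R3) / (4 * R2)]

# Initial values: phi at the integers 0, 1, 2, 3 (components sum to 1).
AT_INTEGERS = [0.0, (1 + R3) / 2, (1 - R3) / 2, 0.0]

@lru_cache(maxsize=None)
def phi(t):
    """D4 scaling function at a dyadic rational t (a Fraction)."""
    if t < 0 or t > 3:
        return 0.0                       # zero outside [0, 3]
    if t.denominator == 1:
        return AT_INTEGERS[int(t)]       # known values at the integers
    # Dilation equation: phi(t) = sum_n sqrt(2) h(n) phi(2t - n).
    # Doubling t halves its denominator, so this terminates.
    return sum(R2 * H[n] * phi(2 * t - n) for n in range(4))

half_integers = [phi(Fraction(k, 2)) for k in (1, 3, 5)]
print([round(x, 6) for x in half_integers])
```

    Each call only ever asks for values at points with a smaller power of 2 in the denominator, which is why the integer values are all the initialization the recursion needs.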

    the D4 mother wavelet

    Let me hand you another recipe. The “mother wavelet” \psi satisfies an equation very much like the dilation equation. We have

    \psi(t) = \sum_{n}{g(n)\ \sqrt{2}\ \ \varphi(2\ t - n)}

    We do not have to specify that the function is zero outside [0,3]; all we have to do is feed it \varphi\ .

    It’s important that it is not a recursion for \psi\ : the LHS is \psi(t) but the RHS is still the scaling function \varphi(2t-n)\ . And the coefficients have changed. They are often denoted by h1 or by g. I think I’ll use g.

    The g(n) are related to the h(n), but not uniquely. In general,

    g(n) = \pm (-1)^n\ h(L-1-n)\ , for L an even integer (I’m pretty sure it has to be even).

    An extremely convenient choice is L = N, the number of nonzero h’s, in which case we get

    g(n) = \pm (-1)^n\ h(N-1-n)\ ,

    which simply says reverse the h’s and alternate the signs.

    That is, for our case, N = 4, so we’re looking at h(3-n), and then

    g(0) = h(3)
    g(1) = -h(2)
    g(2) = h(1)
    g(3) = -h(0)

    (or the negative of all those).
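
    The recipe with L = N can be sketched as a short function. The name wavelet_filter and the sign argument are assumptions made here; sign=+1 gives the list above and sign=-1 gives its negative.

```python
def wavelet_filter(h, sign=1):
    """g(n) = sign * (-1)^n * h(N-1-n): reverse the h's and alternate the signs."""
    n = len(h)
    return [sign * (-1) ** k * h[n - 1 - k] for k in range(n)]

# D4 coefficients to six digits.
h = [0.482963, 0.836516, 0.224144, -0.12941]

g_plus = wavelet_filter(h)             # g(0) = h(3), g(1) = -h(2), ...
g_minus = wavelet_filter(h, sign=-1)   # the other sign convention
print(g_plus)
print(g_minus)

# With either sign, g is orthogonal to h as a vector.
dot = sum(a * b for a, b in zip(h, g_plus))
```

    Either sign convention gives a valid mother wavelet; the two choices differ only by flipping the graph upside down.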

    Here are our h’s again…

    \left\{\frac{1+\sqrt{3}}{4 \sqrt{2}\ },\frac{3+\sqrt{3}}{4   \sqrt{2}\ },\frac{3-\sqrt{3}}{4 \sqrt{2}\ },\frac{1-\sqrt{3}}{4   \sqrt{2}\ }\right\}

    or

    \{0.482963,0.836516,0.224144,-0.12941\}

    We reverse them, and alternate the signs, in order to get the g’s.

    \left\{\frac{1-\sqrt{3}}{4 \sqrt{2}\ },-\frac{3-\sqrt{3}}{4   \sqrt{2}\ },\frac{3+\sqrt{3}}{4   \sqrt{2}\ },-\frac{1+\sqrt{3}}{4 \sqrt{2}\ }\right\}

    or

    \{-0.12941,-0.224144,0.836516,-0.482963\}

    Oddly enough, Burrus et al. showed the opposite of those signs. That doesn’t agree with their own equation, and it would lead to the negative of the following graph – but my picture matches their picture!

    Now we define \psi in terms of \varphi(2t-n)\

    D4 may 9 2

    It’s time to just plot the D4 mother wavelet.

    D4 may 9 3

    and that looks good.

    NOTE that it will compute whatever it needs of \varphi\ , so in a sense it is recursive. This is the point I wanted to make: I did not have to compute \varphi for it, it could do it itself.

    Just to put them near each other, here’s a picture of the scaling function:

    D4 may 9 4

    the Haar wavelet system

    Let me show you all of that again for a much simpler case. The Haar wavelets, as they are now called, were invented long before we knew they were a wavelet system. The scaling function is a step function, defined to be 1 on the half-open interval [0,1) and zero elsewhere. I will eventually show you – I hope – that it could be considered the D2 scaling function also.

    Oh, here I go again using a cannon when a slingshot would more than suffice. There is no reason to use recursion to graph this scaling function, nor even to graph this mother wavelet. But it’s nice to know that the algorithm works in the simplest case.

    Haar (D2) scaling function

    We start with the M0 matrix in principle, for two nonzero h’s…

    M_0 = \left(\begin{array}{ll} \sqrt{2}\  \text{h0} & 0 \\ 0 & \sqrt{2}\  \text{h1}\end{array}\right)

    I simply tell you that the h’s have these (two identical) values…

    \left\{\text{h0}\to \frac{1}{\sqrt{2}\ },\text{h1}\to   \frac{1}{\sqrt{2}\ }\right\}

    Then the matrix M0 is…

    M_0 = \left(\begin{array}{ll} 1. & 0. \\ 0. & 1.\end{array}\right)

    Well, well. By inspection – it’s diagonal! – there are two eigenvalues equal to 1, and the eigenvectors are the columns of that matrix. We might as well choose the eigenvector

    V = \{1,0\}

    The sum of the components is already 1, so we don’t need to scale it to get the initial values of the scaling function.

    We define the scaling function \varphi as usual…

    D4 may 9 5

    Let’s plot it.

    D4 may 9 6

    A unit step function on the half-open unit interval. I’m rather pleased to see that.

    D2 wavelet

    To get the g’s for the mother wavelet, we reverse the h’s (oh, they’re the same!) and alternate the signs…

    \left\{\frac{1}{\sqrt{2}\ },-\frac{1}{\sqrt{2}\ }\right\}

    We write the definition… ask what we know… and then we plot the mother wavelet…

    D4 may 9 7

    Yes! That’s exactly what I wanted to see. And if you’ve ever seen the Haar (D2) mother wavelet, it’s what you wanted to see, too.
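
    The general recipe \psi(t) = \sum_n g(n)\ \sqrt{2}\ \varphi(2t-n) can be sketched so that it accepts any scaling function. This is an illustration only: make_psi and haar_phi are names invented here, and the Haar \varphi is written in closed form rather than by recursion.

```python
import math

def make_psi(phi, g):
    """Build psi(t) = sum_n g(n) * sqrt(2) * phi(2t - n) from a given phi."""
    def psi(t):
        return sum(g_n * math.sqrt(2) * phi(2 * t - n)
                   for n, g_n in enumerate(g))
    return psi

def haar_phi(t):
    """Haar (D2) scaling function: 1 on [0, 1), 0 elsewhere."""
    return 1.0 if 0 <= t < 1 else 0.0

# Reverse the h's and alternate the signs: {1/sqrt(2), -1/sqrt(2)}.
g = [1 / math.sqrt(2), -1 / math.sqrt(2)]
psi = make_psi(haar_phi, g)

samples = [psi(t) for t in (-0.5, 0.25, 0.75, 1.5)]
print([round(x, 6) for x in samples])
```

    The wavelet is +1 on the first half of the unit interval, -1 on the second half, and 0 elsewhere; nothing had to be said about where \psi vanishes, because \varphi already knows.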

    I think the next post will show you a few more scaling functions and mother wavelets, for six nonzero h’s.