Elementary Number Theory: the algebra of arithmetic functions

intro & Euler’s \varphi and…

I am going to look at several arithmetic functions, and we will see how Mathematica knows them. We will see the convolution product of arithmetic functions for dealing with particularly appropriate summations, the Möbius Inversion Formula, and we will see a conceptual basis for the Möbius \mu\ function, which basis is part of a useful more conceptual way to deal with all these summations.

If you get bogged down along the way, skip to the final section entitled “Way too Easy!”

In fact, I have filed this post under “abstract algebra” because we’ll be doing just a little algebra to get a good grip on these functions. And yet that little will go a long, long way.

A function \alpha\ defined on the positive integers – which we would usually call a sequence! – is said to be multiplicative if

\alpha(m\ n) = \alpha(m)\ \alpha(n)\

whenever m and n are relatively prime. (That is, whenever the greatest common divisor (GCD) of m and n is 1: we often use (m,n) for the GCD and write that condition as (m,n) = 1.) Be careful: we require that the function of a product be the product of the functions only when the integers are relatively prime, not for all pairs of integers m and n.

In fact, I’ll even warn you more specifically: since p and p^2 are not relatively prime, we cannot compute \alpha(p^2) \text{ as } \alpha(p)^2\ even for a prime p.

Now let me be a little more precise: the definition of a function requires that we specify domain and codomain. In the context of number theory, a function from the positive integers to the complex numbers is called either an arithmetic function or a number-theoretic function.

From now on, let m, n, k, whatever be a positive integer – if it’s an integer in this post, then it’s positive. Euler’s totient function, also called Euler’s phi function, \varphi(m)\ is the number of (positive) integers less than or equal to m that are relatively prime to m.

(I can’t say “less than”, because although (m,m) = m, that’s 1 for m = 1… and I need to include that case. To put that another way, “less than” would be fine… except for m = 1.)

Mathematica® knows this function, as EulerPhi. Here are its values for m = 1, .. 20:

Note that Mathematica has \varphi(1) = 1\ … as it should, since (1,1) = 1.

You may know another arithmetic function: the number of positive primes less than or equal to a positive integer. Mathematicians use \pi(n)\ for this function, but Mathematica knows this it as PrimePi. For example, we have

That is, there are three primes less than or equal to 6, namely 2,3 5. Here are its values for m = 1, .., 20:

Note that

is 1, counting 2 itself.

Let’s take a closer look at multiplicative functions. By definition, the term multiplicative applies only to arithmetic functions, so if I say \alpha\ is multiplicative then \alpha\ is implicitly an arithmetic function.

Here’s one key fact. If a function \alpha\ is multiplicative, then either \alpha(1) = 0\ or \alpha(1) = 1\ , since

\alpha(1) = \alpha(1*1) = \alpha(1) \alpha(1)\

i.e. x = x^2\ : solutions are x = 0 and x = 1. Furthermore, if \alpha\ is multiplicative and \alpha(1) = 0\ , then \alpha(n) = 0\ for all n: the function is identically zero. Let’s see that. If p is prime, then \alpha(p)\ is zero:

\alpha(p) = \alpha(1*p) = \alpha(1)*\alpha(p) = 0 * \alpha(p) = 0\ .

(edited 10/22/12) By the same argument \alpha(p^n) = 0\ , and \alpha(p_1\ p_2 ... p_k) = 0\

and then any integer n, being a product of powers of primes, is also zero:

\alpha(n) = 0\ .

This means that you may see the hypothesis “If \alpha\ is multiplicative and not identically zero” in texts; we might just as well say “\alpha\ is multiplicative and non-trivial”.

Although we applied this technique in a trivial case, we will see that it works more generally: if a function is multiplicative, we find its values on primes, then on powers of primes, and thence on all integers.

Now is a good time to recall that PrimePi(1) = 0. Ah ha! I infer that PrimePi is not multiplicative. (If it were, it would have to be identically zero.)

Let us return to Euler’s \varphi\ function. We have they key fact that \varphi\ is multiplicative. And because it’s multiplicative, if we can find its values for powers of primes, then we can find its values for any integer n.

First, if p is prime, then \varphi(p) = p-1\ .

(because 1, 2, … , p-1 are all relatively prime to p.)

Then. if p is prime and k is a positive integer,

\varphi(p^k) = p^{k-1} (p-1) = p^k (1-1/p)\ .

(Here, the key is that p is not relatively prime to p^2, etc.)

For k =1 that reduces to our previous result, \varphi(p) = p - 1\ . I’ll go ahead and state the final result, just for completeness:

if n is a product of k distinct prime numbers p_i\ to powers k^i\ , then

\varphi(n) = n (1-1/p_1)(1-1/p_2)...(1-1/p_k)\ .

(The powers k^i\ are implicit in the “n” that multiplies the expression.)

To illustrate that \varphi\ is multiplicative, we compute \varphi(7^2 13^4 19)\ directly:

But we may also compute it by using powers of primes (since the powers are relatively prime):

We may also confirm the explicit formula for each power of a prime:

One reason for mentioning PrimePi is that

PrimePi is not multiplicative.

For example, take the two primes 2 and 3. PrimePi of 6 is 3, but PrimePi of 3 and of 2 are 2 and 1, so PrimePi of the product is not equal to the product of the PrimePi’s:

sum over divisors

It is some interest to consider the sum of all the numbers \alpha(d)\ as d ranges over all the positive divisors of n. We write this as

\sum_{ d|n} \alpha(d)

It is a useful fact that whether we sum \alpha(d) \text{ or } \alpha(n/d)\ , we get the same answer:

\sum_{ d|n} \alpha(d) =  \sum_{ d|n} \alpha(n/d)

because as d ranges over the divisors of n, so does n/d – just in reverse order.

Let me also point out that the sum \sum_{ d|n} \ always includes d=1 and d=n.

We have an interesting application to Euler’s \varphi\ function:
if n is a positive integer, then

\sum_{ d|n} \varphi(d) = n

Let me see if I can find a nice proof of that.

Mathematica knows this sum as DivisorSum. That last theorem is:

That is, DivisorSum is \sum_{ d|n} \ , and the second entry is \varphi\ written as a pure function.

Let’s define infinitely many more multiplicative functions. If n is a positive integer, we define

\tau(n)\ = the number of positive divisors of n

\sigma(n\ ) = the sum of the positive divisors of n.

In symbols, we have defined

\tau(n) = \sum_{ d|n} 1

\sigma(n) = \sum_{ d|n} d

Perhaps it will not seem too much of a stretch to define the sum of the kth powers of divisors of n:

\sigma_k(n) = \sum_{ d|n} d^k

Then what we called \sigma(n)\ could be written \sigma_1(n)\ , and \tau(n)\ is \sigma_0(n)\ . You will see both notations. More importantly, if you are using Mathematica, it knows this infinite collection of functions as \sigma_k(n)\ , and calls them DivisorSigma:

As I implied before I defined them, it is true that the functions \tau\ and \sigma\ (and \sigma_k\ , in general) are multiplicative.

Just as for \varphi\ , we can evaluate these functions if we know their values for prime powers… so we need to find those.

Here’s the answers for a power k of a single prime p:

if p is prime then

\sigma(p^k) = (p^{k-1} - 1) / (p-1)\

\tau(p^k) = k + 1\ .

We can ask Mathematica to check these formulas, for example, for p = 2.

\tau\ matches – that is, for k = 1, .. 20, p^k = k+1:

… and so does \sigma\ :

Shockley has a few examples; let me check them. Here are \tau\ and \sigma\ of 10:

Here are \tau\ and \sigma\ of 1,274,000:

Here are \tau\ and \sigma\ of 76:

Good. Mathematica agrees with my text.

Möbius Mu

(Yes, the same name as in a Möbius strip.) There is a fascinating arithmetic function written \mu\ and called Möbius Mu (written by Mathematica, as often happens, with oe instead of ö):

The most common definition – well, the definition in Mathematica, in MathWorld, in Hardy & Wright, and in Rose, and in Ireland & Rosen (see the end of the post) – is this:

\mu(n)\ is

  • +1 if n = 1
  • +1 if n is the product of an even number of distinct primes
  • –1 if n is the product of an odd number of distinct primes
  • 0 if n has a multiple prime factor (equivalently, if n is not squarefree)

and that’s what we see in the previous table.

I hate that definition. Yes, it’s simple enough – but where the hell did it come from? I would overcome my aversion if there were no choice, but there is a beautiful alternative. (I’m not saying you will find it beautiful, just that I did. I myself get annoyed when people tell me that I will like something.)

The \mu\ function arises when we answer the following question. I will eventually write this in a much, much simpler manner, but for now…

If we have arithmetic functions \alpha\ and \beta\ related by

\alpha(n) = \sum_{ d|n} \beta(d)

can we invert that equation and “solve for” \beta\ ?

Yes we can… the answer is called the Möbius Inversion Formula… and it can be written

\beta(n) = \sum_{ d|n} \alpha(d)\ \mu(n/d)

So that’s what the \mu\ function is for. Still, this is looking messy, and doesn’t give us any idea how \mu\ was obtained.

So let’s use some modern algebra to take a higher-level look at this. We can do so much better.

Let \alpha\ and \beta\ be arithmetic functions. We define the convolution product of \alpha\ and \beta\ as

(\alpha \circ \beta)(n) = \sum_{ d|n} \alpha(d)\ \beta(n/d)

I claim that the convolution product is a binary operation… that is, the product \alpha \circ \beta\ of two arithmetic functions is again an arithmetic function. Furthermore – being sums of products of complex numbers – the convolution product is associative.

More interestingly, it is commutative. I’m not going to put many proofs into this post, but let me show you this one. We write the definition of \alpha \circ \beta\

(\alpha \circ \beta)(n) = \sum_{ d|n} \alpha(d)\ \beta(n/d)

Let D = n/d, so d = n/D…

(\alpha \circ \beta)(n) = \sum_{ D|n} \alpha(n/D)\ \beta(D)

But each product term is commutative (they’re complex numbers, in general)…

(\alpha \circ \beta)(n) = \sum_{ D|n} \beta(D)\ \alpha(n/D) = (\beta \circ \alpha)(n)

(The proof of associativity is messier, but similar. Ultimately they boil down to commutativity and/or associativity of complex multiplication.)

Most interestingly, if \alpha \text{ and } \beta\ are multiplicative, then so is their convolution product\alpha \circ \beta\ .

Finally, Mathematica knows this product as the Dirichlet convolution:

My text has an example asking for (\varphi\circ\tau) (n)\ , for n= 1,2,3,4. Let’s see what Mathematica gets. In general,

Interesting. In our convolution notation, that says that \varphi \circ \tau = \sigma = \sigma_1\ . Let’s return to this toward the end of the post.

In the meantime, let’s compute those specific values:

Those are the answers in the book.

And here’s \sigma\ :

They do seem to be equal.

Now hold on a second. We have an associative binary operation defined on the set of arithmetic functions. I don’t suppose we have an identity, too?

Sure we do. (That’s the real reason I wrote this post.) The function \epsilon\ defined by

\epsilon(1) = 1\

\epsilon(n) = 0\ for all n > 1

acts as an identity for the convolution product. Let me emphasize that \epsilon(n) = 0 except for n = 1.

Let me confirm that it is an identity. Our general formula

(\alpha \circ \beta)(n) = \sum_{ d|n} \alpha(d)\ \beta(n/d)


(\epsilon \circ \beta)(n) = \sum_{ d|n} \epsilon(d)\ \beta(n/d) = \epsilon(1)\ \beta(n/1) = \beta(n)

Well, if we have an identity, might we have inverses? They should be defined in the obvious way: \beta\ is called the inverse of \alpha\ (under convolution multiplication) if

\alpha \circ \beta = \epsilon\ .

We don’t have a group yet, but the same proof applies here to show that if an inverse exists, it is unique. And it would be both a left and right inverse – if only because this product is commutative.

We can be even more precise. We can identify arithmetic functions – all, not just multiplicative ones – which have inverses:

the arithmetic function \alpha\ has an inverse if and only if \alpha(1) \ne 0\ .

That in turn is enough to let me assert than any multiplicative function which is not identically zero (\alpha(1) = 0)\ , has an inverse and the inverse is also multiplicative.

So: the set of non-zero multiplicative arithmetic functions is an abelian group under convolution.

This tells us – if we didn’t already know it – that PrimePi is not invertible. Hell, if it were, then its inverse would give us the kth prime for any k. We could simply enumerate the primes!

Let us return to the inversion problem. If we define

\nu(n) = 1\ for each positive integer n


\alpha(n) = \sum_{ d|n} \beta(d)

could be wrewritten as

\alpha = \nu \circ \beta\

and we should then have

\beta = \nu^{-1} \circ \alpha\ .

Since \nu(1) = 1 \ne 0\ , \nu\ has an inverse. And it’s clear that \nu\ is multiplicative… and therefore its inverse is multiplicative.

So, why don’t we just give a name to \nu^{-1}\ ? Guess what? That’s the definition of \mu\ that I really, really like:

\mu = \nu^{-1}\ .

That is, the equation

\alpha = \nu \circ \beta\

has inverse

\beta = \mu \circ \alpha\ ,

which in more gory detail is

\beta(n) =  (\mu \circ \alpha)(n)  = \sum_{ d|n} \alpha(d)\ \mu(n/d)

Yes, that is the Möbius Inversion Formula again.

Now, it takes about half a page to derive the properties of \mu\ . But first I need an explicit formula for the inverse of \alpha\ . It will be convenient to call it \beta\ , so \beta = \alpha^{-1}\ .

It’s clear that \beta(1) = 1/\alpha(1)\ . Right? We have

(\beta \circ \alpha) (1) = \beta(1) \alpha(1) = \epsilon(1) = 1, \text{ so } \beta(1) = 1/\alpha(1)\ .

For n = 2, first recall that \epsilon(2) = 0\ , and then we have

(\beta \circ \alpha) (2) = 0     = \sum_{ d|2} \beta(d)\ \alpha(2/d)

= \beta(1) \alpha(2) + \beta(2) \alpha(1) = 0\

so solve for \beta(2)\ :

\beta(2) = -1/ \alpha(1)\ ( \beta(1) \alpha(2) )\ .

I’ll write that more generally shortly. But we know \beta(1)\ , and we just got \beta(2)\ .


(\beta \circ \alpha) (3) = 0     = \sum_{ d|3} \beta(d)\ \alpha(3/d)

= \beta(1) \alpha(3) + \beta(3) \alpha(1) = 0\

so solve for \beta(3)\ :

\beta(3) = -1/\alpha(1)\ ( \beta(1) \alpha(3) )\ .

Yet again…

(\beta \circ \alpha) (4) = 0     = \sum_{ d|4} \beta(d)\ \alpha(4/d)

= \beta(1) \alpha(4) + \beta(2) \alpha(2) + \beta(4) \alpha(1) = 0\


\beta(4) = -1/\alpha(1)\ ( \beta(1) \alpha(4) +  \beta(2) \alpha(2) )\ .

Now it’s time to solve for \beta(n)\ in general. What we’ve been doing is pulling one term, specifically d=n, out of the sum… so let’s write that in general by adding the constraint d < n to the summation sign:

(\beta \circ \alpha) (n) = 0 \

= \sum_{ d|n} \beta(d)\ \alpha(n/d) = \beta(n)\ \alpha(1) +  \sum_{ d|n, d < n} \beta(d)\ \alpha(n/d)

Once again, solve for \beta(n)\ , n > 1; here’s our general formula for \beta = \alpha^{-1}\ :

\beta(n) = -1/\alpha(1)  \sum_{ d|n, d < n} \beta(d)\ \alpha(n/d)

For completeness, recall that

\beta(1) = 1/\alpha(1)\ .

For the fun of it, let's check that against the identity. Let \beta = \epsilon^{-1}\ … then

\beta(1) = 1/\epsilon(1) = 1\

\beta(2) = -1/\epsilon(1)  \sum_{ d|n, d < n} \beta(d)\ \epsilon(n/d) = \beta(1)\ \epsilon(2) = 0 = \epsilon(2)

(etc.) so the identity is its own inverse.

Now, can we use this inverse formula to work out \mu\ ? Of course we can!

it is immediate that \mu(1) = 1\ , because it's multiplicative, because it's the inverse of the multiplicative function \nu\ .

The explicit formula for inverse gives us

\mu(n) = -1/\mu(1)  \sum_{ d|n, d < n} \mu(d)\ \nu(n/d) = -\sum_{ d|n, d < n} \mu(d)\ ,

because \mu(1) = 1 \text{ and }\nu(n/d) = 1\ for all n and for all d.

So if n is a prime p, then the sum collapses to d =1, so

\mu(p) = - \mu(1) = -1\ .

Since \mu\ is multiplicative, \mu(p_1 p_2 ... p_k) = (-1)^k\ , for distinct primes p_i\ . In particular, \mu\ is +1 for an even number of distinct primes, -1 for an odd number of distinct primes.

For a repeated prime p, the explicit inverse gives us

\mu(p^2) = -(\mu(1) + \mu(p)) = -(1 - 1) = 0\ .

For a cubed prime, we get

\mu(p^3) = (\mu(1) + \mu(p) + \mu(p^2)) = -(1 - 1 + 0) = 0\


So now we know why \mu\ is what it is. I strongly prefer thinking of it as

\mu = \nu^{-1}\ ,

with explicit values computed using our general formula for \alpha^{-1}\ .

Way too Easy!

Let me show you other uses for the abstract approach. So far we have

the Euler phi: \varphi\
the number of divsiors: \tau = \sigma_0\
the sum of the divisors: \sigma = \sigma_1\
the sum of the kth powers of the divisors: \sigma_k\
the identity: \epsilon = 1, 0, 0, 0, 0\ , ….
the constant: \nu = 1, 1, 1, 1\ , ….
the inverse of \nu: \mu = \nu^{-1}\ .

Let's add one more:

\iota(n) = n\ .

Well, can we rewrite some things? For starters,

\tau(n) = \sum_{ d|n} 1 translates to \tau = \nu \circ \nu\ , and

\sigma(n) = \sum_{ d|n} d translates to \sigma = \iota \circ \nu\

Next, that unproved assertion

\sum_{ d|n} \varphi(d) = n\ translates to \nu \circ \varphi = \iota\ . (Which also gives us \varphi = \mu \circ \iota\ . Which we'll see again in a moment.)

Shockley asks us to prove a few things.

First, show that

\sum_{ d|n} \mu(d) = 0\ for n > 1. But that translates to (\nu \circ \mu)(n) = 0\ for n > 1… which is true because \mu = \nu^{-1}\ and \epsilon(n) = 0\ for n > 1. This question is nothing more than

\nu \circ \mu = \nu \circ \nu^{-1} = \epsilon\ .

Incidentally, Mathematica knew this. It catches the fact that the sum is 1 for n = 1:

I would hate to try proving that from the usual definition!

Second, show that

\sum_{ d|n} d\ \mu(d) = \varphi(n)

Well, that translates to \varphi = \iota \circ \mu\ , which we saw just a moment ago. I would be crying over this if all I had was the usual definition of \mu\ .

Third, prove that observation we made earlier that \varphi \circ \tau = \sigma\ .

Well, we have \tau = \nu \circ \nu\ and \varphi = \mu \circ \iota\ , so

\varphi \circ \tau =  \mu \circ \iota \circ \nu \circ \nu =  \iota \circ \mu \circ \nu \circ \nu = \iota \circ \epsilon \circ \nu = \iota \circ \nu = \sigma\ .

(Associativity lets me drop parentheses, commutativity puts \mu\ next to a \nu\ , \mu \circ \nu = \epsilon\ , and the identity drops out.)

As if this ease of manipulation weren’t enough…. The set of arithmetic functions, with pointwise addition and convolution multiplication is a ring… it's even an integral domain. Recalling our picture from this post

… I just have to ask: is it a UFD?

Yes, it is. (That is a free download.)

And I am utterly fascinated that the set of arithmetic functions is a UFD.

Oh, what are the units? Every invertible function… that is, every arithmetic function \alpha \text{ with } \alpha(1) \ne 0\ .

OK, what are the primes and the irreducibles?

I don't know….

My main resource for this post has been James E. Shockley, “Introduction to Number Theory”, Holt Rinehart Winston, 1967, 0-03-059760-9.

My other books, mostly heavy guns, do not have this approach to the Möbius function, they just define it by its values….

Hardy & Wright, “An Introduction to the Theory of Numbers”, Oxford Science Publications, 5th ed. 1979, 0-19-853171-0 is a famous book… I also own Rose, “A Course in Number Theory”, Oxford Science Publications, 2nd ed. 1994, 0-19-852376-9. The other prominent text is Ireland & Rosen, “A Classical Introduction to Modern Number Theory”, Springer-Verlag, 1982, 0-387-90625-8.

6 Responses to “Elementary Number Theory: the algebra of arithmetic functions”

  1. Andrew Au Says:

    Irreducibles and primes are the same thing in a UFD.

  2. bercier Says:

    DO YOU KNOW cedric villany
    thank you for your theoreme vivant
    very nice

    theoreme 3n+1

    impair number n
    pair number 3n+1

    divide by 2

    number x take n big 2x>=3n x>=1.5n

    probability to have pair .5 impair .5 x>n x=1.5n x increase

    now probability to have pair .5*.5=.25 x.5
    total probability to have pair <=1
    mean probability .75

    probability to have pair/impair is 3/2

    so n gives 1 always

    for n little it's very easy to verify

    if you find a big number N impair such as X=1.5N+.5 is impair…….


    and your calculator diverge…….

    and canot calculate and divide by 2…….

  3. Patsy Says:

    I do believe all the ideas you have introduced to your post. They are really convincing and will certainly work. Nonetheless, the posts are too brief for beginners. Could you please lengthen them a little from subsequent time? Thank you for the post.

    • bercier Says:

      thank you very much for your mail.
      I don’t understand well english…….
      do you know cedric villany?
      do you know well color matching functions?
      nice to meet you
      dear Patsy
      i ask you to know if the mail is for me?
      I dont understand well this blog!!!
      Rip is a great mathématicien!!!!a man?
      and you?
      who are you?

  4. Horst H. von Brand Says:

    Much simpler proof that f multiplicative, f(0) = 0 gives f(n) = 0 for all n: (n, 1) = 1; f(n) = f(n * 1) 0 f(n) * f(1) = 0. No need to mess with primes.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: