2 Jul 2011

The Kempner Series - A modified harmonic series

The harmonic series is considered a classic textbook example of a series that seems to converge but is actually divergent. For those that need a reminder, the harmonic series is: \[\sum_{n=1}^{\infty}\frac{1}{n}=1+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}\cdots.\]
Now here's an interesting question:

How can we modify the harmonic series to make it convergent?

Disregarding trivial answers such as "remove all the terms", this question is actually pretty hard to answer. Say we removed every second term starting from the first (i.e. 1, 1/3, 1/5, etc). Would that work?
\[\sum_{\substack{n=1 \\ n\ is\ even}}^{\infty}\frac{1}{n}=\frac{1}{2}+\frac{1}{4}+\frac{1}{6}+\frac{1}{8}\cdots=\frac{1}{2}\left(1+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}\cdots\right).\]
From above, it seems like it would not work since it just produces half of the original series (which is not a very well defined quantity since the original series was divergent anyway).

Another way to modify the harmonic series

Instead of removing every second term, let us remove every term with a 9 in it (i.e. remove 1/9, 1/19, 1/29, etc). This series is called the Kempner series: \[1+\frac{1}{2}+\frac{1}{3}+\cdots+\frac{1}{8}+\frac{1}{10}+\cdots.\]Although removing every term with a 9 doesn't seem like a lot, this bizarrely modified harmonic series is actually convergent (it is also convergent if you choose to remove any other digit/string of digits such as "0", "42", or "8675309")!

We can now state the formal question as:

Consider the series: \[a_{n}=\begin{cases}1/n & if\  n\  does\  not\  contain\  the\  digit\ 9 \\ 0 & otherwise\end{cases}\] Is $\sum a_n$ convergent?

Instead of starting the proof right away, let's look at why this modification to the harmonic series might make it convergent.

The plots below read 1 whenever an integer contains the digit 9, and 0 otherwise.
Upper left: Plot from 1 - 9, Upper right: Plot from 10 - 99, Lower left: Plot from 100 - 999, Lower Right: Plot from 1000 - 9999.

As you can see, the plots have a nice "fractal" quality to them, in that each plot looks like it makes up parts of the next plot. Although the actual number of terms removed by the modification seems fewer than doing something like removing every second term, we can see that there are always points in the series where a long stretch of consecutive terms are removed (in fact, name a number and we can always find a point in the series where the number of consecutive terms removed is greater than that number). As the number of digits in the terms get large, the number of terms we remove increase drastically. As an example, most integers of say 100 digits will have a 9 in them, leading to about 99.99997% of the 100 digit terms being deleted. It is the quirky effect of removing consecutive terms as well as the fractal quality of the series that will allow us to see and prove its convergence.

Just exactly how many terms are removed though?

To do this systematically, we'll exploit the pattern we saw in the plots above. That is, since the plots seem to exhibit fractal qualities in the ranges ([1,9], [10,99], [100, 999] ...) that they are plotted, we'll look at the number of terms removed inside these ranges.

For the interval [1,9], only 1 term is removed. Note that this range starts at 1 because the harmonic series starts at n=1.

For the interval [10, 99], we have 1 term removed from each of 10 - 19, 20 - 29, 30 - 39, ... , 80 - 89. Then we have 10 terms removed from 90 - 99. In total, we have 8 terms removed from 10 - 89, and a further 10 terms removed from 90 - 99. This makes for 8*1 + 10 = 18 terms removed.

For the interval [100, 999], there is one term removed from 100 - 109, 18 terms removed from 110 - 199, 1 term removed from 200 - 219, 18 terms removed from 220 - 299, and so on, with the last being 18 terms removed from 800 - 899 terms. Finally, we have 100 terms removed from [900, 999]. In total this is 8*1 + 8*18 + 10² = 252 terms removed.

Perhaps you are seeing a pattern here. Let's try to make out what that pattern is. Instead of looking at the number of terms removed, let's look at the number of terms we are keeping.

For the interval [1,9], we keep 9 - 1 = 8 terms.
For the interval [10, 99], we keep 90 - 18 = 72 terms.
For the interval [100, 999], we keep 900 - 252 = 648 terms.
The pattern is 8, 72, 648. Dividing by 8, we have the pattern 1, 9, 81. We can see that the next number in the sequence will be 81*9 = 729, making the number of terms kept 8*729 = 5832. So the number of terms kept in a given interval $[10^n,10^{n+1}-1]$ is $8\cdot 9^n$. We will assume this knowledge (though it can be proven by standard methods of mathematical induction) for the proof below.

Before we start the proof, we now can see why our modification to the series is so effective. Consider all integers containing 100 digits. There are $9\cdot 10^{99}$ of them. The number of terms that are kept is $8\cdot 9^{99}$. The fraction of terms that are kept is thus \[\frac{8\cdot 9^{99}}{9\cdot 10^{99}}\approx0.0000262335.\] We took out a lot of terms!

The proof
Consider the series: \[a_{n}=\begin{cases}1/n & if\  n\  does\  not\  contain\  the\  digit\ 9 \\ 0 & otherwise\end{cases}\] Is $\sum a_n$ convergent?

We know from the section above that the number of terms that are not 0 in $a_n$ on an interval  $[10^n,10^{n+1}-1]$ is $8\cdot 9^n$. Moreover, the largest term of the harmonic series in each interval $[10^n,10^{n+1}-1]$ is $\frac{1}{10^n}$.

Therefore, we can provide an upper bound for $\sum a_{n}$:
\[\sum a_{n}<\sum8\cdot9^{n-1}\cdot\frac{1}{10^{n-1}}=\sum8\cdot\left(\frac{9}{10}\right)^{n-1}.\] The right hand side is a convergent geometric series with sum \[8\sum_{n=1}^{\infty}\left(\frac{9}{10}\right)^{n-1}=80.\] Thus, $\sum a_n$ is convergent by comparison.


Extra tidbits

Although we have proven the convergence of this series when the digit 9 removed, the series is convergent if you remove any digit or string of digits. For example, when all numbers with "42" is removed from the harmonic series, the sum of the series is about 230. When all 9's are removed, the sum of the series is about 23.

Although the Kempner series converges, the convergence rate is extremely slow. After summing about 10 million terms, I am still 9 off from 23. It is known that even after summing 1027 terms, the sum is still about 1 off from 23. Can you think of a way to make the series sum faster?

1. The source of the question is from an old professor of mine, whose page can be found here (under teaching and Science One).
2. R. Baillie, Sums of Reciprocals of Integers Missing a Given Digit, American Mathematical Monthly 86, 372-374.
3. T. Schmelzer and R. Baillie, Summing a Curious, Slowly Convergent SeriesAmerican Mathematical Monthly 115, 525-540.


  1. This is quite interesting

    Nice blog you got here by the way

    keep it up

  2. Very nicely explained!

  3. I feel very stupid now. Seems interesting.

  4. @Anonymous
    Thank you very much!
    Suggestions on making the blog better are always welcomed (same to all the anons above)!

  5. Very nice...
    I checked out this post because I once saw an article posted in a Chinese journal that proves the Harmonic series is both divergent and convergent. iirc, it uses the "remove all 9" argument too. But the author showed the sum of terms containing at least a 9 is convergent(which is horribly wrong...).

  6. @Chao Xu
    Ah. What journal is this? I might take a look if I have time.

    Also, sorry to digress on this reply but I'd like to explain to future readers as to why the modified harmonic series containing only terms with 9's is not convergent (as you've said).

    If the series containing only 9's (i.e. the part we removed from the harmonic series) is convergent, then that means the sum of all the terms we removed from the harmonic series is finite. However, if we only removed a finite portion from the infinite sum of the unmodified harmonic series, we could not have possibly made the series convergent as we proved. Hence, the series containing only terms with 9's is divergent and must sum to infinity.

  7. @Paul Liu

    Here is the link to the paper
    Of course the journal isn't a well established one. It's just a journal for one university.
    I assume you can read Chinese...

    It is very LOL worthy.

    Almost as funny as the medical doctor who rediscovered Riemann sum, published in a medical journal, and manage to get 70+ citations...

  8. @Chao Xu
    Thanks! I'll take a look in the morning. It'll give me a good chance to take the rust off my chinese reading skills too.

  9. now you ask for suggestions, what about increase the number of post per page?

    at the moment it's not a big deal, but in some time you will have (i hope) a decent amount of posts, and it's quite slow trying for anyone who tries to find something they want to read

  10. @Anonymous
    Hmm. I thought it would make my page too long to load due to the amount of LaTeX but I think I will increase the number of posts per page now.

    Thanks, anonymous helper!

  11. Thomas Schmelzer29 August 2011 at 15:49

    I am delighted you find this interesting :-) Your professor isn't that old ;-) I think he might have been in the seminar I gave at Oxford discussing this problem. I highly recommend you get a copy of Alex's Adventures in Numberland by Alex Bellos. He is discussing the same problem (and many other even more interesting ones). Your plots are great!

    Enjoy your studies

  12. @Thomas Schmelzer

    Haha yes! He probably was at that seminar. I know he was at Oxford just a couple years ago.
    Thanks for the book recommendation and thank you very much for the encouragement! It's kind words from people such as yourself that inspire me to write these expositions on math.


  13. Here's another variation on the problem, from

    The sum of the series 1/9 + 1/19 + 1/29 + 1/39 + 1/49 + ... where the denominators have exactly one 9, is about 23.04428708074784831968

  14. @Anonymous

    That was a pretty interesting read! The code they provide allow you to find the sum of any Kempner-like series you wish. I am particularly curious about one thing though. All of these sums are particularly not likely to be rational. Could we find a Kempner-like series that converges to a rational sum?

  15. This was a pretty interesting read, I'd never heard of this before. Another obvious way to make the series converge is to raise the denominator of each term of the harmonic series to a power greater than or equal to 2.

    1. Yes! The powers of two thing is actually pretty useful in testing for convergence or divergence. An elementary proof of the harmonic series divergence can be found by combining terms in groups of powers of two.

      What I mean is: 1 + 1/2 + 1/3 + 1/4 + 1/5 + 1/6 + 1/7 + ... = 1 + 1/2 + (1/3 + 1/4) + (1/5 + 1/6 + 1/7 + 1/8) + ... >= 1 + 1/2 + (1/4 + 1/4) + (1/8 + 1/8 + 1/8 + 1/8) + ... >= 1 + 1/2 + 1/2 + 1/2 + ...
      which diverges quite obviously.

      Note that the above is just a special case of the Cauchy Condensation Test, which makes even better use of powers of two.


  16. Very nicely done , will help me a lot for my upcoming seminar! Thanks

  17. youre a great help to current Science One-ers :)

  18. This comment has been removed by the author.

  19. i am wondering how to speed up the calculation...since brute force summing for 10^27 terms is not possible on my pc