September 2004
—————————
In my September diary I posed the following brain-teaser.
Consider the powers of two: 2, 4, 8, 16, 32, 64, 128, … Their right-most digits follow a simple repetitive pattern: 2, 4, 8, 6, 2, 4, 8, 6, 2, 4, 8, 6, … What about their left-most digits, though? Here are the first few dozen. (Read right to left, line by line. I've just jammed the digits together, leaving out the courtesy commas, to save space.)
24813612512481361251248136125There is a pattern there, but it keeps breaking down. When you get up into really big powers of 2, in fact, it breaks down altogether. For the ten powers of 2 from the 30th to the 39th, for instance, we have lead digits 1, 2, 4, 8, 1, 3, 6, 1, 2, 5. For the ten powers from the billionth to the 1,000,000,009th, by contrast, we have lead digits 4, 9, 1, 3, 7, 1, 2, 5, 1, 2. Every digit (except, of course, zero) shows up, though.
124813612512481371251249137125
124913712512491371361249137136
124913713612512481361251248136
125124813612512481361251248137 ……
I am going to refer to this infinite sequence of digits as "the sequence." Now I am going to ask the following two questions: In the first N digits of the sequence, how many occurrences of 3 shall I find? And how many occurrences of 4?
Call the first number U(N), the second V(N). The first few values of U(N) and V(N), for N from 1 to 15, look like this, as you can easily verify:
U(N) = 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, …Show that for sufficiently large N, U(N) will always be bigger than V(N). Find the limit of the ratio U(N) to V(N) as N tends to infinity.
V(N) = 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, …
—————————
Solution
For anyone who had an old-fashioned math education, here is a welcome opportunity to
review what you learned about
logarithms to base 10. The logarithm of X to base B is, of course, just the power you need to raise
B to in order to get
X. Thus the logarithm of 457 to base 10 is 2.65991620006985022235 …, because
102.65991620006985022235 … is
equal to 457. The logarithm of a whole number is always going to be an irrational number like that one: that is, a
number that is neither whole nor
fractional, and whose decimal expression goes on forever without repeating itself. When I say "logarithm" in
what follows, I shall mean
"logarithm to base 10."
If you had one of those traditional educations, you'll recall the the whole-number part of a logarithm (the part to the
left of the decimal point) is
called the characteristic; the remainder — the part to the right of the decimal point —
is called the
mantissa. I shall use these handy terms in what follows.
The characteristic of the base-10 logarithm of X tells you how many digits X has to the left of its
decimal point.
In the example I gave, the characteristic is 2, so there will be 3 digits to the left of the decimal point in 457.
(It's always one more than
the characteristic.) Sure enough, 457 has three digits to the left of the decimal point. The mantissa tells you what
those digits actually are.
Since 10 to the power of 0.65991620006985022235 … (notice I dropped the characteristic) is precisely 4.57,
the digits in X are 4,
5, and 7, in that order.
The only other thing you need here is a theorem due to (I think) Hermann Weyl, which says the following interesting
thing.
Theorem Suppose α is any irrational number. Form the infinite sequence
α, 2α,
3α,
4α, 5α, … multiplying α by each positive whole number in turn.
A moment's thought will convince you that
every one of these numbers is also irrational. Ignore their characteristics and just consider their mantissas.
The mantissas are uniformly
distributed between 0 and 1.
I had better explain "uniformly distributed." It just means that these numbers are evenly spread between 0
and 1. If you took the first
million (say) and histogrammed them, your histogram would just be a rectangle — like the ones
here for frac(e n),
frac(γ n), and so on.
(Though look at the fidgety behavior of frac(π n)!) To be a bit more precise: if a and
b are numbers between 0
and
1, with a < b, the probability of any number picked at random from that sequence of mantissas
being between a and
b is just (b −a).
(There is actually an even more precise definition of "uniformly distributed" than that, a watertight
mathematical definition, but it
involves integrals and stuff, and I won't burden you with it here. You can find it, if you really want to, in any good
book of advanced analysis or
number theory: see, e.g., p. 67 of Tenenbaum & Mendès France's The Prime Numbers and their
Distribution.)
OK; armed with all that, let's tackle the problem. What's the lead digit of 2N? If we take its
logarithm to base 10, the
mantissa of that logarithm will tell us. Now, the logarithm to base 10 of 2N is just N
times the logarithm of 2. The
logarithm of 2 to base 10 is 0.30102999566398 …; and this is, of course, an irrational number. You see
how Weyl's theorem is going to
kick in here.
In fact, the lead digit of 2N will be a 1 if the mantissa of the logarithm to base 10 lies between
the logarithm of 1 (inclusive)
and the logarithm of 2 (exclusive). The logarithm of 220, for example, has mantissa
0.02059991327962 … (because 20 times
0.30102999566398 … equals 6.02059991327962 …), and this does indeed lie between the logarithm
of 1 (which is zero) and the
logarithm of 2 (which is 0.30102999566398 …). So 220 begins with a 1. Yes it does:
220 = 1,048,576.
Likewise, 2N begins with a 3 if the mantissa of its logarithm lies between the logarithm of 3
(inclusive) and the logarithm of 4
(exclusive). And 2N begins with a 4 if the mantissa of its logarithm lies between the logarithm of
4 (inclusive) and the
logarithm of 5 (exclusive). The probabilities of those events are, by Weyl's theorem, log104 minus
log103, and
log105 minus log104, respectively.
So in the first N digits of the sequence, you will find about
(log104 − log103) N
occurrences of 3, and about (log105 − log104) N occurrences of
4. Numerically, these come out to about 0.124938736608 N and 0.096910013008 N,
respectively. The first is indeed
bigger than the second, and the ratio is 1.289224226994 ….
Note the connection this puzzle has with (a) the good old
slide rule, and (b) Benford's Law.