In my September diary I posed the following brain-teaser.
Consider the powers of two: 2, 4, 8, 16, 32, 64, 128, … Their right-most digits follow a simple repetitive pattern: 2, 4, 8, 6, 2, 4, 8, 6, 2, 4, 8, 6, … What about their left-most digits, though? Here are the first few dozen. (Read right to left, line by line. I've just jammed the digits together, leaving out the courtesy commas, to save space.)
24813612512481361251248136125There is a pattern there, but it keeps breaking down. When you get up into really big powers of 2, in fact, it breaks down altogether. For the ten powers of 2 from the 30th to the 39th, for instance, we have lead digits 1, 2, 4, 8, 1, 3, 6, 1, 2, 5. For the ten powers from the billionth to the 1,000,000,009th, by contrast, we have lead digits 4, 9, 1, 3, 7, 1, 2, 5, 1, 2. Every digit (except, of course, zero) shows up, though.
I am going to refer to this infinite sequence of digits as "the sequence." Now I am going to ask the following two questions: In the first N digits of the sequence, how many occurrences of 3 shall I find? And how many occurrences of 4?
Call the first number U(N), the second V(N). The first few values of U(N) and V(N), for N from 1 to 15, look like this, as you can easily verify:
U(N) = 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, …Show that for sufficiently large N, U(N) will always be bigger than V(N). Find the limit of the ratio U(N) to V(N) as N tends to infinity.
V(N) = 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, …
For anyone who had an old-fashioned math education, here is a welcome opportunity to
review what you learned about
logarithms to base 10. The logarithm of X to base B is, of course, just the power you need to raise
B to in order to get
X. Thus the logarithm of 457 to base 10 is 2.65991620006985022235 …, because
102.65991620006985022235 … is
equal to 457. The logarithm of a whole number is always going to be an irrational number like that one: that is, a
number that is neither whole nor
fractional, and whose decimal expression goes on forever without repeating itself. When I say "logarithm" in
what follows, I shall mean
"logarithm to base 10."
If you had one of those traditional educations, you'll recall the the whole-number part of a logarithm (the part to the left of the decimal point) is called the characteristic; the remainder — the part to the right of the decimal point — is called the mantissa. I shall use these handy terms in what follows.
The characteristic of the base-10 logarithm of X tells you how many digits X has to the left of its decimal point. In the example I gave, the characteristic is 2, so there will be 3 digits to the left of the decimal point in 457. (It's always one more than the characteristic.) Sure enough, 457 has three digits to the left of the decimal point. The mantissa tells you what those digits actually are. Since 10 to the power of 0.65991620006985022235 … (notice I dropped the characteristic) is precisely 4.57, the digits in X are 4, 5, and 7, in that order.
The only other thing you need here is a theorem due to (I think) Hermann Weyl, which says the following interesting thing.
Theorem Suppose α is any irrational number. Form the infinite sequence α, 2α, 3α, 4α, 5α, … multiplying α by each positive whole number in turn. A moment's thought will convince you that every one of these numbers is also irrational. Ignore their characteristics and just consider their mantissas. The mantissas are uniformly distributed between 0 and 1.
I had better explain "uniformly distributed." It just means that these numbers are evenly spread between 0 and 1. If you took the first million (say) and histogrammed them, your histogram would just be a rectangle — like the ones here for frac(e n), frac(γ n), and so on. (Though look at the fidgety behavior of frac(π n)!) To be a bit more precise: if a and b are numbers between 0 and 1, with a < b, the probability of any number picked at random from that sequence of mantissas being between a and b is just (b −a).
(There is actually an even more precise definition of "uniformly distributed" than that, a watertight mathematical definition, but it involves integrals and stuff, and I won't burden you with it here. You can find it, if you really want to, in any good book of advanced analysis or number theory: see, e.g., p. 67 of Tenenbaum & Mendès France's The Prime Numbers and their Distribution.)
OK; armed with all that, let's tackle the problem. What's the lead digit of 2N? If we take its logarithm to base 10, the mantissa of that logarithm will tell us. Now, the logarithm to base 10 of 2N is just N times the logarithm of 2. The logarithm of 2 to base 10 is 0.30102999566398 …; and this is, of course, an irrational number. You see how Weyl's theorem is going to kick in here.
In fact, the lead digit of 2N will be a 1 if the mantissa of the logarithm to base 10 lies between the logarithm of 1 (inclusive) and the logarithm of 2 (exclusive). The logarithm of 220, for example, has mantissa 0.02059991327962 … (because 20 times 0.30102999566398 … equals 6.02059991327962 …), and this does indeed lie between the logarithm of 1 (which is zero) and the logarithm of 2 (which is 0.30102999566398 …). So 220 begins with a 1. Yes it does: 220 = 1,048,576.
Likewise, 2N begins with a 3 if the mantissa of its logarithm lies between the logarithm of 3 (inclusive) and the logarithm of 4 (exclusive). And 2N begins with a 4 if the mantissa of its logarithm lies between the logarithm of 4 (inclusive) and the logarithm of 5 (exclusive). The probabilities of those events are, by Weyl's theorem, log104 minus log103, and log105 minus log104, respectively.
So in the first N digits of the sequence, you will find about (log104 − log103) N occurrences of 3, and about (log105 − log104) N occurrences of 4. Numerically, these come out to about 0.124938736608 N and 0.096910013008 N, respectively. The first is indeed bigger than the second, and the ratio is 1.289224226994 ….
Note the connection this puzzle has with (a) the good old slide rule, and (b) Benford's Law.