In my June diary I posed the following brain-teaser:
In any given year, the weather station in New York City's Central Park observes a certain total rainfall. Assume that one year's total rainfall is unrelated to any other year's — in mathematical jargon, that total annual rainfall is "an independent random variable." Define a "record year" to be a year in which the rainfall exceeds that of any preceding year for which measurements were kept. Given that the Central Park measurements began in 1835, by which date would you expect to have clocked up 20 record years? (Clue: over the 160-year period up to 1994, there were six record years.)
First, let's look at the situation after just one year of record-keeping. How many record years are there at this point? Why, one! Since there are no previous records, the first year is certain to be a record year.
Now consider the situation after two years of record-keeping. Either the second year surpassed the first, or else it didn't. Since the two totals are independent random variables of the same kind, neither year is more likely than the other to be the wetter one, so the chance that it did is 50-50, i.e. one-half. Thus, after two years' record-keeping, we would have either one record year (probability 1/2), or two record years (probability 1/2). Another way to say this is that the "expected" number of record years is 1 + 1/2, or 1.5.
(You can also argue that second year this way: Since the chance that the second year will be best of the two is 50-50, there are just two equally probable possibilities, R-R and R-X, with "R" being a record year and "X" being a non-record year. The two equal probabilities include three record years, so the "expected" number of record years is 3/2.)
After three years of record-keeping, what have we got? There is one chance in three that the third year is the record for the three years. If the first two years were both records, we can think of the three configurations R-R-R, R-R-X, and R-R-X as being all equally probable. Same for the case where the second year wasn't a record: then the equally-probable configurations are R-X-R, R-X-X, and R-X-X. That's all the possibilities, six in all, with a total of eleven record years. "Expected" number of record years: 11/6, which is to say, 1 + 1/2 + 1/3.
If you pursue this logic for a fourth year, you get an "expected" number of record years equal to 25/12, which is 1 + 1/2 + 1/3 + 1/4. So it goes: after N years of record-keeping, the "expected" number of record years will be:
1 + 1/2 + 1/3 + 1/4 + 1/5 + … + 1/N
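The pattern can be checked by brute force for small N. The sketch below is my own illustration, not part of the original argument: it assumes no two years tie and that every ordering of the N rainfall totals is equally likely, then counts record years over all orderings. The function names are made up for the example.

```python
from fractions import Fraction
from itertools import permutations


def expected_records(n):
    """Exact expected number of record years in n years, by enumeration.

    Assumption: every ordering of the n rainfall totals is equally likely
    and there are no ties.
    """
    total_records = 0
    orderings = 0
    for ranking in permutations(range(n)):
        best_so_far = -1
        for rainfall_rank in ranking:
            if rainfall_rank > best_so_far:  # wetter than every earlier year
                total_records += 1
                best_so_far = rainfall_rank
        orderings += 1
    return Fraction(total_records, orderings)


def harmonic(n):
    """1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


# Print N, the enumerated expectation, and the harmonic sum side by side.
for n in range(1, 7):
    print(n, expected_records(n), harmonic(n))
```

For N = 2, 3 and 4 the enumeration gives 3/2, 11/6 and 25/12, the same fractions worked out by hand above.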
Check: after 160 years of record keeping, we are told there were 6 record years.
1 + 1/2 + 1/3 + … + 1/160 is equal to 5.65551122493974187… A pretty good match between "expected" and "actual."
Jargon. The sum 1 + 1/2 + 1/3 + 1/4 + 1/5 + … + 1/N is important in math, important enough to have a name and a symbol of its own. The name is "the N-th harmonic number." The symbol is H_N (a capital H with subscript N).
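For modest N the harmonic number can simply be added up. Here is a minimal sketch in ordinary floating point (the function name is my own) that reproduces the 160-year check to about a dozen decimal places:

```python
import math


def harmonic_float(n):
    """N-th harmonic number in double precision.

    math.fsum keeps the rounding error of the running total small.
    """
    return math.fsum(1.0 / k for k in range(1, n + 1))


h160 = harmonic_float(160)
print(h160)  # close to 5.6555112249..., the figure for 160 years
```

Summing term by term is fine for 160 years but hopeless for hundreds of millions, which is why an approximation is needed for the real question.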
So now the problem resolves to this one: how big does N have to be before H_N exceeds 20?
To get an answer, you really have to know the following fact.
Key fact. For large numbers N, an excellent approximation to H_N is log N + γ.
Here "log" means the natural log, often written "ln," while γ means Euler's constant, equal to 0.5772156649015328606065…
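To get a feel for how good the approximation is, here is a rough numerical comparison. It is a sketch of my own: the constant name and the sample values of N are chosen just for illustration, and γ is typed in to double precision.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, rounded to a double


def harmonic_float(n):
    """N-th harmonic number by direct summation."""
    return math.fsum(1.0 / k for k in range(1, n + 1))


def gap(n):
    """How far log N + gamma falls short of the true harmonic number."""
    return harmonic_float(n) - (math.log(n) + EULER_GAMMA)


# Compare the shortfall with 1/(2N) for a few sample sizes.
for n in (10, 160, 10_000, 1_000_000):
    print(n, gap(n), 1 / (2 * n))
```

The shortfall is positive and a little under 1/(2N), so it shrinks steadily as N grows. At the sizes this problem needs, it is a few parts in a billion.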
So to get H_N up above 20, I need log N + γ to get above 20. In other words, I need log N to get above 19.422784335098467139393487… This happens around N = e^19.422784335098467139393487…, which is 272,400,600.0594077768…
In fact (I am going to cheat here, using Mathematica to get more precise results):
H_272400599 = 19.9999999979463783225916261697…
H_272400600 = 20.0000000016174421895581453330…
Since record-keeping began in 1835, which counts as year 1, year number 272,400,600 falls 272,400,599 years later. So we can expect to have logged 20 record years by around a.d. 272,402,434.
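For readers without Mathematica, the final step can be sketched in a few lines. This is my own illustration, not the method used above. It assumes double precision is good enough (the margin here is about 2 parts in a billion, far wider than the rounding error) and adds the standard correction terms 1/(2N) and 1/(12N²) to log N + γ, so that neighbouring values of N can be told apart. All names are made up for the example.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, rounded to a double
FIRST_YEAR = 1835                 # first year of Central Park records
TARGET = 20                       # expected number of record years wanted


def harmonic_approx(n):
    """Asymptotic estimate of H_N: log N + gamma + 1/(2N) - 1/(12 N^2)."""
    return math.log(n) + EULER_GAMMA + 1 / (2 * n) - 1 / (12 * n * n)


# Start just below the first estimate e^(20 - gamma) and step upward
# until the estimated harmonic number passes the target.
n = math.floor(math.exp(TARGET - EULER_GAMMA)) - 2
while harmonic_approx(n) <= TARGET:
    n += 1

year = FIRST_YEAR + n - 1  # 1835 itself is year number 1
print(n, year)
```

Under those assumptions the loop stops at N = 272,400,600, which corresponds to a.d. 272,402,434 and agrees with the figures above.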