June 2003
—————————
In my June diary I posed the following brain-teaser:
In any given year, the weather station in New York City's Central Park observes a certain total rainfall. Assume that one year's total rainfall is unrelated to any other year's — in mathematical jargon, that total annual rainfall is "an independent random variable." Define a "record year" to be a year in which the rainfall exceeds that of any preceding year for which measurements were kept. Given that the Central Park measurements began in 1835, by which date would you expect to have clocked up 20 record years? (Clue: over the 160-year period up to 1994, there were six record years.)
—————————
Solution
First, let's look at the situation after just one year of record-keeping. How many record years are there at this point? Why, one! Since there are no previous records, the first year is certain to be a record year.
Now consider the situation after two years of record-keeping. Either the second year surpassed the first, or else it didn't. Since the two totals are independent random variables of the same kind, neither year is favored, so the chance that the second beat the first is 50-50, i.e. one-half. Thus, after two years' record-keeping, we would have either one record year (probability 1/2) or two record years (probability 1/2). An equivalent way to say this is that the "expected" number of record years is 1 × 1/2 + 2 × 1/2, that is 1 + 1/2, or 1.5.
(You can also argue the second year this way: since the chance that the second year will be the wetter of the two is 50-50, there are just two equally probable possibilities, R-R and R-X, with "R" being a record year and "X" being a non-record year. The two equally probable possibilities contain three record years between them, so the "expected" number of record years is 3/2.)
After three years of record-keeping, what have we got? There is one chance in three that the third year is the record for the three years. If the first two years were both records, we can think of the three configurations R-R-R, R-R-X, and R-R-X as being all equally probable. Same for the case where the second year wasn't a record: then the equally probable configurations are R-X-R, R-X-X, and R-X-X. That's all the possibilities, six in all, with a total of eleven record years. "Expected" number of record years: 11/6, which is to say, 1 + 1/2 + 1/3.
If you pursue this logic for a fourth year, you get an "expected" number of record years equal to 25/12, which is 1 + 1/2 + 1/3 + 1/4. So it goes: after N years of record-keeping, the "expected" number of record years will be:
1 + 1/2 + 1/3 + 1/4 + 1/5 + … + 1/N
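The pattern can be checked by brute force for small N. The sketch below is an illustration, not part of the original argument. It assumes that independent rainfall totals with no ties make every relative ordering of the N years equally likely, so it enumerates all orderings and averages the number of record years. The function names (count_records, expected_records, harmonic) are made up for this example.

```python
from fractions import Fraction
from itertools import permutations


def count_records(seq):
    """Count entries that exceed every earlier entry (the first always counts)."""
    records = 0
    best = None
    for x in seq:
        if best is None or x > best:
            records += 1
            best = x
    return records


def expected_records(n):
    """Average number of records over all n! equally likely orderings (assumption)."""
    perms = list(permutations(range(n)))
    total = sum(count_records(p) for p in perms)
    return Fraction(total, len(perms))


def harmonic(n):
    """Exact value of 1 + 1/2 + ... + 1/n as a fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, expected_records(n), harmonic(n))
```

For N = 3 and N = 4 both columns come out as 11/6 and 25/12, the values found by hand above. The enumeration grows as N!, so it is only practical for small N.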
Check: after 160 years of record-keeping, we are told there were 6 record years. Well, 1 + 1/2 + 1/3 + … + 1/160 is equal to 5.65551122493974187… A pretty good match between "expected" and "actual."
Jargon. The sum 1 + 1/2 + 1/3 + 1/4 + 1/5 + … + 1/N is important in math, important enough to have a name and a symbol of its own. The name is "the N-th harmonic number." The symbol is H_N.
So now the problem resolves to this one: how big does N have to be before H_N exceeds 20?
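For modest N the harmonic number can simply be summed term by term, which makes the 160-year check easy to reproduce. This is a minimal sketch in double-precision floating point; the function name harmonic is chosen for this example.

```python
import math


def harmonic(n):
    """Sum 1 + 1/2 + ... + 1/n in floating point; fsum limits rounding error."""
    return math.fsum(1.0 / k for k in range(1, n + 1))


if __name__ == "__main__":
    # Expected number of record years after 160 years of record-keeping.
    print(harmonic(160))
```

Direct summation is fine for N = 160, but it takes N steps, so it becomes slow for the hundreds of millions of terms needed to reach 20.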
—————————
To get an answer, you really have to know the following fact.
Key fact. For large N, an excellent approximation to H_N is log N + γ. Here "log" means the natural log, often written "ln," while γ means Euler's constant, equal to 0.577215664901532860606512…
So to get H_N up above 20, I need log N + γ to get above 20. In other words, I need log N to get above 19.422784335098467139393487… This happens around N = e^19.422784335098467139393487…, which is 272,400,600.0594077768…
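The arithmetic in this step is easy to reproduce. The sketch below solves log N + γ = 20 for N in ordinary double precision, so only the leading digits are trustworthy. The constant and function names are chosen for this example.

```python
import math

# Euler's constant, rounded to double precision.
EULER_GAMMA = 0.5772156649015329


def approx_threshold(target):
    """Solve log N + gamma = target for a real-valued N."""
    return math.exp(target - EULER_GAMMA)


if __name__ == "__main__":
    # Roughly 272,400,600.06 for a target of 20.
    print(approx_threshold(20))
```

Because exponentiation magnifies rounding error in the exponent, the result is good to about a millionth here, not to the full string of digits quoted in the text.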
In fact (I am going to cheat here, using Mathematica to get more precise results):
H_272400599 = 19.9999999979463783225916261697…
H_272400600 = 20.0000000016174421895581453330…
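Readers without Mathematica can get a rough confirmation of the crossing point. The sketch below is not the method used above. It swaps in a slightly refined approximation, H_N ≈ log N + γ + 1/(2N) − 1/(12N²), which comes from the standard asymptotic expansion of the harmonic numbers, and evaluates it in double precision. The names are hypothetical, and the result agrees with the Mathematica figures only to about a dozen decimal places.

```python
import math

# Euler's constant, rounded to double precision.
EULER_GAMMA = 0.5772156649015329


def harmonic_estimate(n):
    """Approximate H_n by log n + gamma + 1/(2n) - 1/(12 n^2) (asymptotic expansion)."""
    return math.log(n) + EULER_GAMMA + 1.0 / (2 * n) - 1.0 / (12 * n * n)


def first_year_of_records(n, first_year=1835):
    """Calendar year of the n-th year of record-keeping, counting first_year as year 1."""
    return first_year + n - 1


if __name__ == "__main__":
    for n in (272400599, 272400600):
        print(n, harmonic_estimate(n), harmonic_estimate(n) > 20)
    print(first_year_of_records(272400600))
```

The margin on either side of 20 is about two parts in a billion, comfortably larger than double-precision rounding error, so the estimate is enough to see that the crossing happens at N = 272,400,600.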
Record-keeping began in 1835, so the N-th year of records is calendar year 1834 + N. We can therefore expect to have logged 20 record years by around A.D. 272,402,434 (that is, 1834 + 272,400,600).