Forecasting Mistakes: Dow Jones Barrier

kqr, published 2023-11-05

I made a pretty embarrassing mistake in my initial forecast for the Metaculus question on whether the Dow Jones index will close above 35,000 before August.

It did, as you can tell from the plot, but both I and the community were slow to react to it. I’m going to focus on July 13, for no good reason other than that both I and the community had just adjusted upward, but not nearly enough.

On that date, with 18 days to go until August, the closing value was about 34,400, and it had to reach 35,000 at least once during those 18 days for the question to resolve positively. On this day, the community prediction was 37 %, and I submitted a prediction of 43 %. Both of these are, I believe, mistakes.

Expectations can be misleading

The distribution of daily movements between closing values has bias \(\mu=9.6\) and standard deviation \(\sigma=266\). Considering 18 days is a relatively short period, we can think of this as an unbiased random walk, because any movement that happens in that time is dominated by variance, not bias. (Specifically, the coefficient of variation at an 18-day horizon is 6.5, meaning the standard deviation is over six times larger than the bias.)
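As a quick check of that ratio, here is a minimal R sketch using the figures from the text. The variable names are made up for illustration.

```r
mu <- 9.6      # mean daily movement (bias), from the text
sigma <- 266   # standard deviation of daily movements, from the text
days <- 18     # horizon in days, from the text

# Drift grows linearly with time; spread grows with the square root of time.
drift <- mu * days
spread <- sigma * sqrt(days)

cv <- spread / drift
round(cv, 1)   # 6.5
```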

One question we can ask is, “How long, on average, will it take until the closing value goes either to 0, or above 35,000?”

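There’s a convenient rule of thumb for this. For an unbiased random walk that starts at \(y\), between a lower barrier \(c_1\) and an upper barrier \(c_2\), the expected number of steps until it hits either barrier is

\[\frac{(y-c_1)(c_2-y)}{\sigma^2}\]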

In our case, this evaluates to approximately

\[\frac{34400 \times 600}{266^2} \approx 292\]

It will take, on average, 292 days for the index to either exceed 35,000, or go down to 0. It is very unlikely that it will go to zero, so we can safely assume that the 292 days are the average time it takes until it exceeds 35,000.
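The arithmetic above can be reproduced in a few lines of R. This is only a sketch of the calculation with the numbers from the text; the variable names are my own.

```r
y <- 34400     # closing value on July 13
c1 <- 0        # lower barrier
c2 <- 35000    # upper barrier
sigma <- 266   # standard deviation of daily movements

# Distance to the lower barrier times distance to the upper barrier,
# divided by the variance of a single step.
expected_days <- (y - c1) * (c2 - y) / sigma^2
round(expected_days)   # 292
```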

If the average time it takes to go above 35,000 is 292 days, then clearly the probability that it does so in the next 18 days must be somewhat small, right? That was my judgment, but it’s wrong. The average is misleading.

Here’s one thing that can happen, but which is extremely rare: the random walk takes a dive down to 20,000 and then stays there and wanders around in that neighbourhood for a really long time, until it finally meanders up again to 35,000. This has a small probability of happening, but if it happens, it takes so long to get back up to 35,000 that this rare event brings up the average time by a lot.

In other words, the expectation is dominated by rare events. In almost every case, the random walk will go above 35,000 fairly quickly, but in some cases it will stay below it for a very long time, and the average of all these possibilities is 292 days.
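A small simulation can make the gap between the typical and the average crossing time visible. The sketch below rests on assumptions that are not in the original analysis: daily movements are drawn from a normal distribution with no bias, the lower barrier at 0 is ignored, and each walk is cut off after an arbitrary 2,000 days to keep the run short. All names are hypothetical.

```r
set.seed(42)

sigma <- 266       # standard deviation of daily movements, from the text
gap <- 600         # distance from 34,400 up to the 35,000 barrier
n_walks <- 2000    # number of simulated walks (arbitrary choice)
max_days <- 2000   # cut-off per walk (arbitrary choice)

# For each walk, record the first day the cumulative movement exceeds the gap,
# or Inf if that never happens before the cut-off.
hit_day <- replicate(n_walks, {
  path <- cumsum(rnorm(max_days, mean = 0, sd = sigma))
  crossed <- which(path > gap)
  if (length(crossed) > 0) crossed[1] else Inf
})

median_day <- median(hit_day)                    # typical crossing time
capped_mean <- mean(pmin(hit_day, max_days))     # understates the true mean
share_unfinished <- mean(is.infinite(hit_day))   # walks that never crossed

c(median = median_day, capped_mean = capped_mean, unfinished = share_unfinished)
```

Under these assumptions the median should come out at a small fraction of the mean, even though the cut-off pulls the mean down. The few walks that wander off for a long time are what drag the average up.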

Upper bounds for ending up over a barrier

A better start might be to ask what the probability is that the closing price will have moved more than 600 units at the end of the 18 days.

In other words, what’s the probability that the Dow Jones closes above 35,000 on July 31 itself? This is useful information because the probability that it has closed above 35,000 on some day up to and including July 31 must be at least as high as the probability that it is above 35,000 on July 31 specifically.

A possibly-inappropriate normality assumption

We could do this by assuming that the change in closing value between July 13 and July 31 is normally distributed, \(N(9.6 \times 18, 266 \times \sqrt{18})\), and then going backwards through z-scores to probabilities. Using this model, we’ll get a 35 % probability of being above 35,000 on July 31.
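In R, the z-score lookup is a single call to `pnorm`. This is a sketch of the calculation described above, with variable names chosen for illustration.

```r
mu <- 9.6      # mean daily movement, from the text
sigma <- 266   # standard deviation of daily movements, from the text
days <- 18     # days until the end of July
gap <- 600     # distance from 34,400 up to 35,000

# Upper tail of the normal model for the 18-day change in closing value.
p_above <- pnorm(gap, mean = mu * days, sd = sigma * sqrt(days),
                 lower.tail = FALSE)
round(p_above, 2)   # 0.35
```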

However, we should be hesitant to assume a normal distribution for financial data, so there’s another option giving a theoretical lower bound.

A theoretical lower bound

The probability that an unbiased random walk is within \(\pm C\) units from its starting point in either direction after \(t\) units of time is at most

\[\frac{0.8 C}{\sigma \sqrt{t}}\]

Using our figures, the probability that the Dow Jones is within 600 units of its starting point after 18 days is at most

\[\frac{0.8 \times 600}{266 \sqrt{18}} \approx 0.43\]

If we flip this, we get a lower bound for the probability that the random walk is outside of 600 units after 18 days, which is 57 %. But we care only if it is outside 600 units in the upward direction, so we take half of that, or 29 %.
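The same steps, written out in R as a sketch (the variable names are hypothetical):

```r
sigma <- 266        # standard deviation of daily movements, from the text
days <- 18          # days until the end of July
barrier_gap <- 600  # distance from 34,400 up to 35,000

# Upper bound on the probability of staying within 600 units of the start.
p_within_max <- 0.8 * barrier_gap / (sigma * sqrt(days))

# Flip it to bound the probability of ending outside that band,
# then keep only the upward half.
p_up_min <- (1 - p_within_max) / 2

round(c(within_max = p_within_max, up_min = p_up_min), 2)   # 0.43 and 0.29
```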

Under the normality hypothesis, which accounts also for the drift of the random walk, we draw the conclusion that the Dow Jones is exceeding 35,000 on July 31 with a 35 % probability. The theoretical lower bound of this probability for an unbiased random walk is 29 %.

Let’s use that 29 % number to be safe. Since this is our probability that the Dow Jones is currently exceeding 35,000 on July 31, our forecast that it has exceeded it at some point up to and including July 31 should clearly be higher than 29 %.

But how much higher?

Unbiased random walks crossing barriers

Here’s a fun fact about random walks: of all the unbiased random walks that pass through a point \((t_b, y)\) on the plane, half will end up above \(y\) and half will end up below \(y\) at any later time \(t > t_b\). There’s an elegant visual reflection argument for this, but I’m not the right person to explain it.

The consequences for our forecasting question are particularly neat: The probability that a random walk has exceeded a barrier at \(y=C\) at some point before time \(t\) is twice the probability that the random walk exceeds \(y=C\) at time \(t\).

In other words, if \(q_C(t)\) is the probability that a random walk has exceeded the barrier at \(C\) before time \(t\), then

\[q_C(t) = 2P(Y(t) > C)\]

And we have \(P(Y(t) > C)\)! It is that 29 % figure we had! So we believe the probability that the Dow Jones exceeds 35,000 at some point before August 1 is 58 %.
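For readers who want the step behind that factor of two, here is a sketch of the usual reflection argument. It assumes an unbiased walk with symmetric steps and treats the path as continuous, so that the walk sits exactly at \(C\) when it first reaches the barrier. The symbol \(M(t)\) is introduced here for the highest value the walk has reached up to time \(t\).

\[\begin{aligned}
q_C(t) = P(M(t) > C) &= P(M(t) > C,\ Y(t) > C) + P(M(t) > C,\ Y(t) < C) \\
&= 2P(M(t) > C,\ Y(t) > C) \\
&= 2P(Y(t) > C)
\end{aligned}\]

The second line uses the symmetry of the walk after it first touches the barrier, and the third uses the fact that a walk ending above \(C\) must have crossed it. With daily steps the walk overshoots the barrier rather than landing exactly on it, so the identity holds only approximately.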

That would have been a good forecast. Unfortunately, I didn’t think about the problem correctly at the time.

Resampling methods confirm theory

In fact, we wouldn’t have to dig into deep theory to arrive at 50-something percent. If the variable ytd contains the daily movements year to date on July 13, then the R expression

# Share of 500 resampled 18-day paths that rise more than 600 points at any close.
mean(replicate(
  500,
  any(cumsum(sample(ytd, 18, replace = TRUE)) > 600)
))

would yield something near 56 %.
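For anyone who wants to run this end to end, here is a self-contained sketch. The closing values in it are synthetic, generated to have roughly the mean and spread quoted earlier. They are not the real Dow Jones series, and `closes` and `p_cross` are hypothetical names.

```r
set.seed(1)

# SYNTHETIC stand-in for real data: 130 made-up closing values whose daily
# movements have roughly the mean and spread quoted in the text.
closes <- 33000 + cumsum(rnorm(130, mean = 9.6, sd = 266))

# Daily movements are the differences between consecutive closing values.
ytd <- diff(closes)

# Share of resampled 18-day paths that rise more than 600 points at any close.
p_cross <- mean(replicate(
  10000,
  any(cumsum(sample(ytd, 18, replace = TRUE)) > 600)
))
p_cross
```

Because the input is synthetic, the result will not reproduce the 56 % figure exactly. With a real price history, `closes` would hold the actual closing values up to July 13.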

I don’t know how the community got it wrong

This last fact confuses me. I wanted to practise my mental forecasting abilities so I deliberately didn’t resample. But some of the community members wrote comments about how their computer simulations suggested around 30–40 % at the time. My brief simulation above gives 56 %, which I believe is much closer to the optimal forecast.

Either their models were more complicated than they needed to be, or used parameters that were ill-fitted to the data, or I’m missing something.

Two Wrongs