Wide Awake Developers

« Booklist | Main | Why Do Enterprise Applications Suck? »

Tracking and Trouble

Pick something in your world and start measuring it.  Your measurements will surely change a little from day to day. Track those changes over a few months, and you might have a chart something like this.

First 100 samples

Now that you've got some data assembled, you can start analyzing it. The average over this sample is 59.5. It's got a variance of 17, which is about 28% of the mean. You can look for trends. For example, we seem to see an upswing for the first few months, then a pullback starting around 90 days into the cycle. In addition, it looks like there is a pretty regular oscillation superimposed on the main trend, so you might be looking at some kind of weekly pattern as well.

The next few months of data should make the patterns clearer.

First 200 samples.

Indeed, from this chart, it looks pretty clear that the pullback around 100 days was the early indicator of a flattening in the overall growth trend from the first few months. Now, the weekly oscillations are pretty much the only movement, with just minor wobbles around a ceiling.

I'll fast forward and show the full chart, spanning 1000 samples (over three years' worth of daily measurements.)

Full chart of 100 samples

Now we can see that the ceiling established at 65 held against upward pressure until about 250 days in, when it finally gave way and we reached a new support at about 80. That support lasted for another year, when we started to see some gradual downward pressure resulting in a pullback to the mid-70s.

You've probably realized by now that I'm playing a bit of a game with you. These charts aren't from any stock market or weather data. In fact, they're completely random. I started with a base value of 55 and added a little random value each "day".

When you see the final chart, it's easy to see it as the result of a random number generator.  If you were to live this chart, day by day, however, it's exceedingly hard not to impose some kind of meaning or interpretation on it. The tough part is that you actually can see some patterns in the data.  I didn't force the weekly oscillations into the random number function, they just appeared in the graph. We are all exceptional good at pattern detection and matching. We're so good, in fact, that we find patterns all over the place. When we are confronted with obvious patterns, we tend to believe that they're real or that they emerge from some underlying, meaningful structure. But sometimes, they're really just nothing more than randomness.

Nassim Nicholas Taleb is today's guru of randomness, but Benoit Mandelbrot wrote about it earlier in the decade, and Benjamin Graham wrote about this problem back in the 1920's. I suspect someone has sounded this warning every decade since statistics were invented. Graham, Mandelbrot, and Taleb all tell us that, if we set out to find patterns in historical data, we will always find them. Whether those patterns have any intrinsic meaning is another question entirely. Unless we discover that there are real forces and dynamics that underlie the data, we risk fooling ourselves again and again.

We can't abandon the idea of prediction, though. Randomness is real, and we have a tendency to be fooled by it. Still, even in the face of those facts, we really do have to make predictions and forecasts. Fortunately, there are about a dozen really effective ways to deal with the fundamental uncertainty of the future. I'll spend a few posts exploring these different ways to deal with the uncertainty of the future.

Post a comment

(If you haven't left a comment here before, you may need to be approved by the site owner before your comment will appear. Until then, it won't appear on the entry. Thanks for waiting.)