Wide Awake Developers


Fast Iteration versus Elegant Design

I love the way that proggit bubbles stuff around. Today, for a while at least, the top link is to a story from Salon in May of 2000 about Bill and Lynne Jolitz, the creators of 386BSD.

[An aside: I'm not sure exactly when I became enough of a graybeard to remember as current events things which are now discussed as history. It's really disturbing that an article from almost a decade ago talks about events seven years earlier than that, and I remember them happening! To me, the real graybeards are the guys that created UNIX and C to begin with. Me? I'm part of the second or third UNIX generation, at best. Sigh...]

Anyway, Bill and Lynne Jolitz created the first free, open-source UNIX that ran on x86 chips. Coherent was around before that, and I think SCO UNIX was available for x86 at the same time. SCO wasn't evil then, just expensive. In those days, you had to lay down some serious jing to get UNIX on your PC. Minix was available for free, but Tanenbaum held firm that Minix should teach principles rather than be a production OS, so he favored pedagogical value over functionality. Consequently, Minix wasn't a full UNIX implementation. (At least, it wasn't at that time. It might be now.)

Just contemplate the hubris of two programmers deciding that they would create their own operating system: one that would be UNIX, but would fix the flaws, hacks, and workarounds that had built up over more than a decade. Not only that, but they would give it away for the cost of the floppies! And not only that, but they would build it for a processor that serious UNIX people sneered at. Most impressive of all, they succeeded. 386BSD was a technically superior, well-architected version of UNIX for commodity hardware. The Jolitzes extrapolated Intel's growth curve and rapid product cycles and saw that x86 processors would advance far faster than the technically superior RISC chips.

At various times, I ran Minix, 386BSD, and SCO UNIX on my PC well before I even heard of Linux. Each of them had the field before Linus even made his 0.1 release.

So why is Linux everywhere, and we only hear about 386BSD in historical contexts? There is exactly one answer, and it's what Eric Raymond was really talking about in The Cathedral and the Bazaar. TCatB has been seen mostly as an argument for open-source versus commercial software, but what Raymond saw was that the real competition comes down to an open contribution model versus a closed one. Linus' promiscuous contribution policy simply let Linux out-evolve 386BSD. More contributors meant more drivers, more bug fixes, more enhancements... more ideas, ultimately. Two people, no matter how talented, cannot outcode thousands of Linux contributors. The best programmers are 10 times more productive than the average, and I would rate Bill and Lynne among the very best. But, as of last April, the Linux Foundation reported that more than 3,600 people had contributed to the kernel alone.

Iteration is one of the fundamental dynamics. Iteration facilitates adaptation, and adaptation wins competition. History is littered with the carcasses of "superior" contenders that simply didn't adapt as fast as their victorious challengers.

Tracking and Trouble

Pick something in your world and start measuring it.  Your measurements will surely change a little from day to day. Track those changes over a few months, and you might have a chart something like this.

First 100 samples

Now that you've got some data assembled, you can start analyzing it. The average over this sample is 59.5. It's got a variance of 17, which is about 28% of the mean. You can look for trends. For example, we seem to see an upswing for the first few months, then a pullback starting around 90 days into the cycle. In addition, it looks like there is a pretty regular oscillation superimposed on the main trend, so you might be looking at some kind of weekly pattern as well.
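If you want to run the same arithmetic yourself, a minimal Python sketch like this will do it. (Assumptions on my part: the measurements are sitting in a plain Python list, population statistics are close enough, and the summarize name is just something I made up for illustration.)

    import statistics

    def summarize(samples):
        """Summarize a list of daily measurements (hypothetical helper)."""
        mean = statistics.mean(samples)
        variance = statistics.pvariance(samples)      # population variance
        return mean, variance, variance / mean * 100  # variance as a percentage of the mean

    # Usage (with your own list of measurements):
    # mean, variance, pct_of_mean = summarize(first_100_samples)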

The next few months of data should make the patterns clearer.

First 200 samples.

Indeed, from this chart, it looks pretty clear that the pullback around 100 days was the early indicator of a flattening in the overall growth trend from the first few months. Now, the weekly oscillations are pretty much the only movement, with just minor wobbles around a ceiling.

I'll fast-forward and show the full chart, spanning 1000 samples (over three years' worth of daily measurements).

Full chart of 1000 samples

Now we can see that the ceiling established at 65 held against upward pressure until about 250 days in, when it finally gave way and we reached a new support at about 80. That support held for another year, until we started to see some gradual downward pressure resulting in a pullback to the mid-70s.

You've probably realized by now that I'm playing a bit of a game with you. These charts aren't from any stock market or weather data. In fact, they're completely random. I started with a base value of 55 and added a little random value each "day".
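For the curious, here's roughly how a series like that can be produced. This is a sketch of the idea, not the exact code I used: the step size, the seed, and the plotting library are all arbitrary choices.

    import random
    import matplotlib.pyplot as plt

    random.seed(1)     # any seed; it just makes the run repeatable
    value = 55.0       # the base value
    samples = []
    for day in range(1000):
        value += random.uniform(-1.0, 1.0)   # add a little random value each "day"
        samples.append(value)

    plt.plot(samples)
    plt.xlabel("day")
    plt.ylabel("measurement")
    plt.show()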

When you see the final chart all at once, it's easy to see it as the result of a random number generator. If you were to live this chart day by day, however, it would be exceedingly hard not to impose some kind of meaning or interpretation on it. The tough part is that you actually can see some patterns in the data. I didn't force the weekly oscillations into the random number function; they just appeared in the graph. We are all exceptionally good at pattern detection and matching. We're so good, in fact, that we find patterns all over the place. When we are confronted with obvious patterns, we tend to believe that they're real or that they emerge from some underlying, meaningful structure. But sometimes, they're really nothing more than randomness.

Nassim Nicholas Taleb is today's guru of randomness, but Benoit Mandelbrot wrote about it earlier in the decade, and Benjamin Graham wrote about this problem back in the 1920s. I suspect someone has sounded this warning every decade since statistics were invented. Graham, Mandelbrot, and Taleb all tell us that, if we set out to find patterns in historical data, we will always find them. Whether those patterns have any intrinsic meaning is another question entirely. Unless we discover that there are real forces and dynamics that underlie the data, we risk fooling ourselves again and again.

We can't abandon the idea of prediction, though. Randomness is real, and we have a tendency to be fooled by it. Still, even in the face of those facts, we really do have to make predictions and forecasts. Fortunately, there are about a dozen really effective ways to deal with the fundamental uncertainty of the future. I'll spend a few posts exploring them.