Wide Awake Developers


Multiplier Effects

Here's one way to think about the ethics of software, in terms of multipliers. Think back to the last major email virus, or when the movie "The Two Towers" was released. No doubt, you heard or read a story about how much lost productivity this bane would cause. There is always some analyst willing to publish some outrageous estimate of damages due to these intrusions into the work life. I remember hearing about the millions of dollars supposedly lost to the economy when Star Wars Episode I was released.

(By the way, I have to take a minute to disassemble this kind of analysis. Stick with me, this won't take long.

If you take 1.5 seconds to delete the virus, it costs nothing. It has an absolutely immeasurable impact on your day. It won't even affect your productivity. You will probably spend more time than that discussing sports scores, going to the bathroom, chatting with a client, or any of the hundreds of other things human beings do during a day. It's literally lost in the noise. Nevertheless, some analyst who likes big numbers will take that 1.5 seconds and multiply it by the millions of other users and their 1.5 seconds, then multiply that by the "national average salary" or some such number.

So, even though it takes you longer to blow your nose than to delete the virus email, somehow it still ends up "costing the economy" 5x10^6 USD in "lost productivity". The underlying assumptions here are so flawed that the result cannot be taken seriously. Nevertheless, this kind of analysis will be dragged out every time there's a news story--or better yet, a trial--about an email worm.)
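The arithmetic behind these headline numbers is easy to reconstruct. Here is a sketch in which every figure--user count, wage, timing--is invented purely for illustration:

```python
# Reconstructing the analyst's "lost productivity" math.
# Every number here is invented for illustration.

seconds_per_user = 1.5           # time to delete one virus email
affected_users = 10_000_000      # assumed number of affected users
hourly_wage = 30.0               # assumed "national average" wage, USD/hour

lost_hours = seconds_per_user * affected_users / 3600
headline_damages = lost_hours * hourly_wage

print(f"aggregate 'lost' hours: {lost_hours:,.0f}")
print(f"'cost to the economy': ${headline_damages:,.0f}")

# The 1.5 seconds is unmeasurable for any individual -- it vanishes into
# the noise of a normal day -- yet multiplying it out yields an
# impressive-sounding dollar figure.
```

The flaw, of course, is that time lost in increments too small to notice was never convertible into salary dollars in the first place.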

The real moral of this story isn't about innumeracy in the press, or spotlight seekers exploiting said innumeracy. It's about multipliers, and the very real effect they can have.

Suppose you have a decision to make about a particular feature. You can do it the easy way in about a day, or the hard way in about a week. (Hypothetical.) Which way should you do it? Suppose that the easy way makes four new fields required, whereas doing it the hard way makes the program smart enough to handle incomplete data. Which way should you do it?

Required fields seem innocuous, but they are always an imposition on the user. They require the user to gather more information before starting their jobs. This in turn often means they have to keep their data on Post-It notes until they are ready to enter it, resulting in lost data, delays, and general frustration.
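As a sketch of what "the hard way" might look like: a record type that tolerates incomplete data instead of demanding it all up front. The type and field names here are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical record: only one field is truly required to start work;
# the rest can be filled in when the user actually has the information.
@dataclass
class CustomerRecord:
    name: str                      # genuinely required
    phone: Optional[str] = None    # may arrive later
    address: Optional[str] = None
    email: Optional[str] = None

    def missing_fields(self) -> list:
        """Report what's incomplete instead of refusing to save."""
        return [f for f in ("phone", "address", "email")
                if getattr(self, f) is None]

record = CustomerRecord(name="Jane Doe", phone="555-0100")
print(record.missing_fields())     # ['address', 'email']
```

The program stays smart enough to nag about the gaps later, and the user never has to park data on a Post-It note.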

Let's consider an analogy. Suppose I'm putting a sign up on my building. Is it OK to mount the sign six feet up on the wall, so that pedestrians have to duck or go around it? It's much easier for me to hang the sign if I don't have to set up a ladder and scaffold. It's only a minor annoyance to the pedestrians. It's not like it would block the sidewalk or anything. All they have to do is duck. So, I get to save an hour installing the sign, at the expense of taking two seconds away from every pedestrian passing my store. Over the long run, all of those two-second diversions are going to add up to many, many times more than the hour that I saved.
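The break-even arithmetic is worth making explicit:

```python
# How many two-second ducks does it take to eat up my one saved hour?
seconds_saved = 3600        # the hour I save by not setting up a ladder
seconds_per_duck = 2        # cost imposed on each passing pedestrian

break_even = seconds_saved // seconds_per_duck
print(break_even)           # 1800 -- after that many pedestrians, the sign is a net loss
```

On any busy sidewalk that threshold is crossed quickly, and every pedestrian after it deepens the loss.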

It's not ethical to worsen the lives of others, even a small bit, just to make things easy for yourself. Successful software is measured in millions of people. Every requirements decision you make is an imposition of your will on your users' lives, even if it is a tiny one. Always be mindful of the impact your decisions--even small ones--have on those people. You should be willing to bear large burdens to ease the burden on those people, even if your impact on any given individual is minuscule.

The Paradox of Honor

You can use a person's honor against him only if he values honor. Only the honest man is threatened by the pointed finger; the liar is unaffected by that kind of accusation. I think it is because there is no such thing as "dishonesty". There is only honesty or its lack. Not a thing and its opposite, but a thing and its absence. One or zero, not one or minus one. One who lacks a thing cannot be threatened with the prospect of its loss.


Needles, Haystacks

So, this may seem a little off-topic, but it comes round in the end. Really, it does.

I've been aggravated with the way members of the fourth estate have been treating the supposed "information" that various TLAs had before the September 11 attacks. (That used to be my birthday, by the way. I've since decided to change it.) We hear that four or five good bits of information scattered across the hundreds of FBI, CIA, NSA, NRO, IRS, DEA, INS, or IMF offices "clearly indicate" that terrorists were planning to fly planes into buildings. Maybe so. Still, it doesn't take a doctorate in complexity theory to figure out that you could probably find just as much data to support any conclusion you want. I'm willing to bet that if the same amount of collective effort were invested, we could prove that the U. S. Government has evidence that Saddam Hussein and aliens from Saturn are going to land in Red Square to re-establish the Soviet Union and launch missiles at Guam.

You see, if you already have the conclusion in hand, you can sift through mountain ranges of data to find those bits that best support your conclusion. That's just hindsight. It's only good for gossipy hens clucking over the backyard fence, network news anchors, and not-so-subtle innuendos by Congresscritters.

The trouble is, it doesn't work in reverse. How many documents does just the FBI produce every day? 10,000? 50,000? How would anyone find exactly those five or six documents that really matter and ignore all of the chaff? That's the job of analysis, and it's damn hard. A priori, you could only put these documents together and form a conclusion through sheer dumb luck. No matter how many analysts the agencies hire, they will always be crushed by the tsunami of data.

Now, I'm not trying to make excuses for the alphabet soup gang. I think they need to reconsider some of their basic operations. I'll leave questions about separating counter-intelligence from law enforcement to others. I want to think about harnessing randomness. You see, government agencies are, by their very nature, bureaucratic entities. Bureaucracies thrive on command-and-control structures. I think it comes from protecting their budgets. Orders flow down the hierarchy, information flows up. Somewhere, at the top, an omniscient being directs the whole shebang. A command-and-control structure hates nothing more than randomness. Randomness is noise in the system, evidence of inadequate procedures. A properly structured bureaucracy has a big, fat binder that defines who talks to whom, and when, and under what circumstances.

Such a structure is perfectly optimized to ignore things. Why? Because each level in the chain of command has to summarize, categorize, and condense information for its immediate superior. Information is lost at every exchange. Worse yet, the chance for somebody to see a pattern is minimized. The problem is this whole idea that information flows toward a converging point. Whether that point is the head of the agency, the POTUS, or an army of analysts in Foggy Bottom, they cannot assimilate everything. There isn't even any way to build information systems to support the mass of data produced every day, let alone correlate reports over time.

So, how do Dan Rather and his cohorts find these things and put them together? Decentralization. There are hordes of pit-bull journalists just waiting for the scandal that will catapult them onto CNN. ("Eat your heart out, Wolf, I found the smoking gun first!")

Just imagine if every document produced by the Minneapolis field office of the FBI were sent to every other FBI agent and office in the country. A vast torrent of data flowing constantly around the nation. Suppose that an agent filing a report about suspicious flight school activity could correlate that with other reports about students at other flight schools. He might dig a little deeper and find some additional reports about increased training activity, or a cluster of expired visas that overlap with the students in the schools. In short, it would be a lot easier to correlate those random bits of data to make the connections. Humans are amazing at detecting patterns, but they have to see the data first!

This is what we should focus on. Not on rebuilding the $6 Billion Bureaucracy, but on finding ways to make available all of the data collected today. (Notice that I haven't said anything that requires weakening our 4th or 5th Amendment rights. This can all be done under laws that existed before 9/11.) Well, we certainly have a model for a global, decentralized document repository that will let you search, index, and correlate all of its contents. We even have technologies that can induce membership in a set. I'd love to see what Google Sets would do with the 19 hijackers' names, after you have it index the entire contents of the FBI, CIA, and INS databases. Who would it nominate for membership in that set?

Basically, the recipe is this: move away from ill-conceived ideas about creating a "global clearinghouse" for intelligence reports. Decentralize it. Follow the model of the Internet, Gnutella, and Google. Maximize the chances for field agents and analysts to be exposed to that last, vital bit of data that makes a pattern come clear. Then, when an agent perceives a pattern, make damn sure the command-and-control structure is ready to respond.


Debating "Web Services"

There is a huge and contentious debate under way right now related to "Web services". A sizable contingent of the W3C and various XML pioneers are challenging the value of SOAP, WSDL, and other "Web service" technology.

This is a nuanced discussion with many different positions being taken by the opponents. Some are critical of the W3C's participation in something viewed as a "pay to play" maneuver from Microsoft and IBM. Others are pointing out serious flaws in SOAP itself. To me, the most interesting challenge comes from the W3C's Technical Architecture Group (TAG). This is the group tasked with defining what the web is and is not. Several members of the TAG, including the president of the Apache Foundation, are arguing that "Web services" as defined by SOAP, fundamentally are not "the web". ("The web" being defined crudely as "things are named via URIs" and "every time I ask for the same URI, I get the same results". My definition, not theirs.) With a "Web service", a URI doesn't name a thing, it names a process. What I get when I ask for a URI is no longer dependent solely on the state of the thing itself. Instead, what I get depends on my path through the application.
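The distinction the TAG is drawing can be caricatured in a few lines. The endpoints and data here are invented, and a "Web service's" conversational state is reduced to a bare counter, but the shape of the complaint survives:

```python
# "The web": a URI names a thing, and what you get back depends only on
# the state of that thing.
resources = {"/orders/42": {"status": "shipped"}}

def web_get(uri):
    return resources[uri]            # same URI, same result, every time

# "Web service" style: a single URI names a process, and the reply
# depends on where you are in the conversation.
workflow_state = {"step": 0}

def service_post(uri="/service"):
    workflow_state["step"] += 1
    return f"step {workflow_state['step']} of the workflow"

assert web_get("/orders/42") == web_get("/orders/42")
print(service_post())    # step 1 of the workflow
print(service_post())    # step 2 of the workflow -- same URI, different answer
```

Under this caricature, the second style breaks the property that a URI denotes a stable, shareable, cacheable thing.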

I'd encourage you all to sample this debate, as summarized by Simon St. Laurent (one of the original XML designers).


Ethical decisions in software development

Ethical decisions in software development do not arise only when we are talking about malware or copyright infringement.

If my programs are successful, then they impact the lives of thousands or millions of people. That impact can be positive or negative. The program can make their lives better or worse--even if just in minute proportions.

Every time I make a decision about how a program behaves, I am really deciding what my users can and cannot do. If I make an input required, I am forcing them to abide by my rules. (Hopefully, it is a rule they expressed first, at least.) Conversely, if I allow partial entry, then I am granting some license. They can get away with less rigorous work.

That makes every programming decision an ethical decision.


Lately, I have been struggling

Lately, I have been struggling to find the meaning in my work. I suppose that's not surprising. I am a human being--a mortal creature. My age will soon flip a decimal digit. (I decline to specify which.) These can certainly cause one to spend time reflecting on one's legacy. They can also cause one to buy a flaming red sports car. I may explore that option later.


I also work in a field of incredible transience. Two hundred years from now, no cathedral will bear my mark. No train depot of my design will grace the National Register of Historic Places. No literary critics will deconstruct the significance of my characters' middle initials. In truth, the shelf life of my work compares poorly to that of a gallon of milk.

I am a programmer.

I and my comrades can usually be found behind our glowing screens, working hour after hour to bring some other person's vision to life. We who grapple with chaos and ether and mud expend our spirit, energy, life, time, soul, and qi in the name of creation. We work long after the managers have left. We learn the janitors' names. I have often gazed out my window to the neon street below, full of the theater signs, restaurants, and wandering crowds seeking to be entertained. I have wondered what kind of life I should have led to be in that crowd instead of watching it. I've wondered how I could rejoin that human mass. I think I'd have to change careers.

I cannot deny, however, that my work brings me deep--if ephemeral--satisfaction. The harsh joy of self-sacrifice combines with the exultant delight of success when a project comes together. When I finally get my programs to work, it's a kind of magic, dense and layered. At one level, the thought that my work will be useful to someone--that it will make dozens, hundreds, maybe millions of people more individually powerful--is heady and exciting.

At another level, I have a fierce pride that my software works at all. Knowing that my creation is strong enough, powerful enough to survive the threat of millions of users doing their damnedest to destroy it. Despite the teeming millions trying to prove that there is no such thing as "foolproof", my software keeps working. "Robust", we call it. "Resilient". "Come on", it says, "bring it on."

Deeper still, I take a craftsman's pride in a job well done. Like a mason or a carpenter, I know what is under the surface. I know how well it is put together. I know what skill went into its construction. No one else may see this, but I know.