Wide Awake Developers

MLP

Here’s a good roundup of recent traffic regarding REST.


Here’s My Number One Frustration

Here’s my number one frustration with the state of the industry today. I am a professional. I regard my work as a craft to be studied and learned. Yet, in most domains, there is no benefit to developing a high level of skill. You end up surrounded by people who don’t understand a word you say, can’t work at that level, and don’t really give a damn. They’ll get the same rewards and go home happy at 5:00 every day. It’s like, once you achieve a base level of mediocrity, there’s no benefit for further personal development. In fact, there’s a distinct disadvantage, in that you end up pulling ridiculous hours to clean up their garbage.

Bah, there I go being bitter again. Maybe I just need to work in some other domain–one where skills count for something, and being good at your job is a benefit, not a hindrance. I’m sick of writing Address classes, anyway.


Multiplier Effects

Here’s another way to think about the ethics of software, in terms of multipliers. Think back to the last major virus scare, or when Star Wars Episode II was released. Some "analyst"–who probably found his certificate in a box of Cracker Jack–published some ridiculous estimate of damages.

BTW, I have to take a minute to disassemble this kind of analysis. Stick with me, it won’t take long.

If you take 1.5 seconds to delete the virus, it costs nothing. It’s an absolutely immeasurable impact to your day. It won’t even affect your productivity. You will probably spend more time than that discussing sports scores, going to the bathroom, chatting with a client, or any of the hundreds of other things human beings do during a day. It’s literally lost in the noise. Nevertheless, some peabrain analyst who likes big numbers will take that 1.5 seconds and multiply it by the millions of other users and their 1.5 seconds, then multiply that by the "national average salary" or some such number.

So, even though it takes you longer to blow your nose than to delete the virus email, somehow it still ends up "costing the economy" 5x10^6 USD in "lost productivity". The underlying assumptions here are so thoroughly rotten that the result cannot be anything but a joke. Sure as hell though, you’ll see this analysis dragged out every time there’s a news story–or better yet, a trial–about an email worm.

The real moral of this story isn’t about innumeracy in the press, or spotlight seekers exploiting innumeracy. It’s about multipliers.

Suppose you have a decision to make about a particular feature. You can do it the easy way in about a week, or the hard way in about a month. (Hypothetical.) Which way should you do it? Suppose that the easy way makes the user click an extra button, whereas doing it the hard way makes the program a bit smarter and saves the user one click. Just one click. Which way should you do it?

Let’s consider an analogy. Suppose I’m putting a sign up on my building. Is it OK to mount the sign six feet up on the wall, so that pedestrians have to duck or go around it? It’s much easier for me to hang the sign if I don’t have to set up a ladder and scaffold. It’s only a minor annoyance to the pedestrians. It’s not like it would block the sidewalk or anything. All they have to do is duck. (We’ll just ignore the fact that pissing off all your potential customers is not a good business strategy.)

It’s not ethical to worsen the lives of others, even a small bit, just to make things easy for yourself. These days, successful software is measured in millions of users, of people. Always be mindful of the impact your decisions–even small ones–have on those people. Accept large burdens to ease the burden on those people, even if your impact on any given individual is minuscule. The cumulative good you do that way will always overwhelm the individual costs you pay.
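To get a feel for the size of the multiplier, here is a rough sketch. Every number in it is hypothetical: the one-second cost of a click, the 100 uses per user, the million users, and the 40-hour working week are all assumptions chosen for illustration, not measurements. It shows scale, not a dollar figure.

```python
def aggregate_hours(seconds_per_use: float, uses_per_user: int, users: int) -> float:
    """Total user time, in hours, consumed by one small cost repeated at scale."""
    return seconds_per_use * uses_per_user * users / 3600


# Hypothetical figures: one extra click costs 1 second, each user hits the
# feature 100 times, and the program has 1,000,000 users.
user_hours = aggregate_hours(seconds_per_use=1.0, uses_per_user=100, users=1_000_000)

# The hard way costs roughly three extra weeks of developer time (one month
# versus one week), taken here as 3 * 40 working hours.
developer_hours = 3 * 40
```

Under these assumptions the users collectively spend tens of thousands of hours on that one click, against 120 hours for the developer. Change the inputs and the ratio moves, but with millions of users it rarely flips.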

REST and Change in APIs

In case it didn’t come through, I’m intrigued by REST, because it seems more fluid than the WS-* specifications. I can do an HTTP request in about 5 lines of socket code in any modern language, from any client device.
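As a rough illustration of how little code that takes, here is a minimal sketch in Python. The function names `build_request` and `http_get` are made up for this example, and it assumes a plain HTTP/1.0 server on port 80 with no TLS, redirects, or chunked encoding.

```python
import socket


def build_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.0 GET request as raw bytes."""
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")


def http_get(host: str, path: str = "/", port: int = 80) -> bytes:
    """Send the request over a plain TCP socket and return the raw response."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_request(host, path))
        chunks = []
        # An HTTP/1.0 server closes the connection once the response is complete,
        # so reading until recv() returns empty bytes collects the whole reply.
        while chunk := sock.recv(4096):
            chunks.append(chunk)
    return b"".join(chunks)
```

Calling `http_get("example.com")` would hand back the status line, headers, and body as one bytes object. Parsing it is left to the caller.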

The WS-splat crowd seem to be building YABS (yet another brittle standard). Riddle me this: what use is a service description in a standardized form if there is only one implementor of that service? WSDL only attains full value when there are standards built on top of WSDL. Just like XML, WSDL is a meta-standard. It is a standard for specifying other standards. Collected and diverse industry behemoths and leviathans make the rules for that playground.

I see two equally likely outcomes for any given service definition:

  • A defining body will standardize the interface for a particular web service. This will take far too long.
  • A dominant company in a star-like topology with its customers and suppliers (think Wal-Mart) will impose an interface that its business partners must use.

Once such interfaces are defined, how easily might they be changed? I mean the WSDL (or other) definition of the service itself. Can anyone say CORBAservices? You’d better define your services right the first time, because there appears to be substantial friction opposing change.

How does REST avoid this issue? By eliminating layers. If I support a URI naming scheme like http://company.com/groupName/divisionName/departmentName/purchaseOrders/poNumber as a RESTful way to access purchase orders, and I find that we need to change it to /purchaseOrders/departmentNumber/poNumber, then both forms can co-exist. The alternative change in SOAP/WSDL-land would either modify the original endpoint (an incompatible change!) or would define a new service to support the new mode of lookup. (I suppose other hacks are available, too. Service.getPurchaseOrder2() or Service.getPurchaseOrderNew() for example.)
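A minimal sketch of the two URI forms living side by side, assuming a hypothetical in-memory store and a regex-based route table. `PURCHASE_ORDERS`, `ROUTES`, and `resolve` are names invented for this example, not part of any framework.

```python
import re

# Hypothetical in-memory store keyed by PO number.
PURCHASE_ORDERS = {"1001": {"department": "42", "total": 250.0}}

# Both URI shapes map to the same lookup; the old one is never retired.
ROUTES = [
    # Original scheme: /groupName/divisionName/departmentName/purchaseOrders/poNumber
    re.compile(r"^/[^/]+/[^/]+/[^/]+/purchaseOrders/(?P<po>[^/]+)$"),
    # Newer scheme: /purchaseOrders/departmentNumber/poNumber
    re.compile(r"^/purchaseOrders/[^/]+/(?P<po>[^/]+)$"),
]


def resolve(path: str):
    """Return the purchase order named by either URI form, or None."""
    for pattern in ROUTES:
        match = pattern.match(path)
        if match:
            return PURCHASE_ORDERS.get(match.group("po"))
    return None
```

Here `resolve("/acme/retail/shoes/purchaseOrders/1001")` and `resolve("/purchaseOrders/42/1001")` name the same resource, so existing clients keep working while new ones adopt the shorter form.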

Of course, neither of these service architectures is implemented widely enough to really evaluate which one will be more accepting of change. I can tell you, though, that one of the huge CORBA-killers was the slow pace and resistance to change in the CORBAservices.

Debating “Web Services”

There is a huge and contentious debate under way right now related to "Web services". A sizable contingent of the W3C and various XML pioneers are challenging the value of SOAP, WSDL, and other "Web service" technology.

This is a nuanced discussion with many different positions being taken by the opponents. Some are critical of the W3C’s participation in something viewed as a "pay to play" maneuver from Microsoft and IBM. Others are pointing out serious flaws in SOAP itself. To me, the most interesting challenge comes from the W3C’s Technical Architecture Group (TAG). This is the group tasked with defining what the web is and is not. Several members of the TAG, including the president of the Apache Foundation, are arguing that "Web services", as defined by SOAP, fundamentally are not "the web". ("The web" being defined crudely as "things are named via URIs" and "every time I ask for the same URI, I get the same results". My definition, not theirs.) With a "Web service", a URI doesn’t name a thing, it names a process. What I get when I ask for a URI is no longer dependent solely on the state of the thing itself. Instead, what I get depends on my path through the application.

I’d encourage you all to sample this debate, as summarized by Simon St. Laurent (one of the original XML designers).


Decoupling

For the ultimate in temporal, architectural, language and spatial decoupling, try two of my favorite fluid technologies: publish-subscribe messaging and tuple-spaces.
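To make the tuple-space idea concrete, here is a toy in-memory sketch. The `TupleSpace` class and its method names are assumptions for illustration. Unlike a real Linda-style space, these reads return None instead of blocking, and `take` stands in for Linda’s `in`, which is a reserved word in Python.

```python
import threading


class TupleSpace:
    """A tiny in-memory tuple space; None in a pattern matches any value."""

    def __init__(self):
        self._tuples = []
        self._lock = threading.Lock()

    def out(self, *values):
        """Write a tuple into the space; the writer knows nothing about readers."""
        with self._lock:
            self._tuples.append(tuple(values))

    def _find(self, pattern):
        # Linear scan for the first tuple of the same length that matches.
        for candidate in self._tuples:
            if len(candidate) == len(pattern) and all(
                want is None or want == got
                for want, got in zip(pattern, candidate)
            ):
                return candidate
        return None

    def rd(self, *pattern):
        """Return a matching tuple without removing it, or None."""
        with self._lock:
            return self._find(pattern)

    def take(self, *pattern):
        """Remove and return a matching tuple, or None."""
        with self._lock:
            found = self._find(pattern)
            if found is not None:
                self._tuples.remove(found)
            return found
```

A producer can call `space.out("order", 1001, "open")` and exit. A consumer written later, in another thread, can call `space.take("order", None, "open")` without ever knowing who wrote the tuple or when.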


Prison of Our Own Making

We who build worlds dwell in a dank and dismal prison of our own construction, though not our design. Why so difficult? Where is the green grass? Where is the sunshine?


Ethical Decisions in Software Development

Ethical decisions in software development do not only arise when we are talking about malware or copyright infringement.

If my programs are successful, then they impact the lives of thousands or millions of people. That impact can be positive or negative. The program can make their lives better or worse–even if just in minute proportions.

Every time I make a decision about how a program behaves, I am really deciding what my users can and cannot do. If I make an input required, I am forcing them to abide by my rules. (Hopefully, it is a rule they expressed first, at least.) Conversely, if I allow partial entry, then I am allowing some licentiousness. They can get away with less rigorous work.

That makes every programming decision an ethical decision.


Designing for Emergent Behavior

Lately, I’ve been grooving on emergent behavior. This fuzzy term comes from the equally fuzzy field of complexity studies. Mix complex rules together with non-linear effects (like humans) and you are likely to observe emergent behavior.

Recent example: web browser security holes. Any program inherently constitutes a complex system. Add in some dynamic reprogramming, downloadable code, system-level scripting, and millions upon millions of users and you’ve got a perfect petri dish. Sit back and watch the show. Unpredictable behavior will surely result.

In fact, "emergent" sometimes gets used as a synonym for "unpredictable". By and large, I believe that’s true. In traditional systems design, "unpredictable" definitely equals "sloppy". Command-and-control, baby. Emergent behavior is what happens when your program goes off the rails.

The thing is, emergent behavior is where all the really interesting things happen. Predictable programs are boring. Big batch runs are predictable.

But, you have to consider the complete system. In a big batch run, the system is linear: inputs, transformation, outputs. No feedback. No humans. When you include humans in your view of the system, all these messy feedback loops start to appear. It gets even worse when you have multiple humans connected via the programs. Feedback loops that stretch from one person, through at least two programs, out to another person and back.

Any system that involves humans will exhibit emergent behaviors – and this is a very good thing.

Are "designed" behavior and "emergent" behavior inherently incompatible? I don’t think so. I think it may be possible to design for emergent behavior. I mean that certain designs will encourage some kinds of emergent behavior, whereas other designs encourage other kinds of emergent behavior. We can study the behaviors produced by various systems and designs to build a compendium of factors that are likely to facilitate one class of behavior or another.

For example: In every corporation, I see large volumes of data stored and shared in two different formats. The nature of the two systems encourages very different behaviors.

First we have relational databases. These tend to be large, expensive systems. As a result, they are centralized to one degree or another. The nature of relational algebra is that of a static schema. Therefore, changes are rigidly controlled. Centralized, rigidly controlled assets require guardians (DBAs) and gatekeepers (data modelers). Because the schema is well-defined and changes slowly, the database gains a degree of transparency. Applications are integrated through their databases. Generic tools for backup, reporting, extraction, and modeling become possible. The data can be accessed from a variety of applications in a relatively generic fashion.

The other data storage tool I see used widely is the spreadsheet. I almost never see a spreadsheet used to calculate numbers. Instead, most are used as a schema-less data storage tool. Often created directly by the business analysts, these spreadsheets are very conducive to change. Sharing is as simple as sending the file through email. Of course, this leads to version conflicts and concurrent update issues that have to be settled by hand (usually by printing a timestamp on the hardcopies!). There is not a central definition of the data structure. Indeed, neither the data nor the structures from spreadsheets can be reused. A spreadsheet makes the 2-dimensional structure of a table obvious, but it makes relationships difficult, if not impossible, to represent. Ergo, spreadsheet users don’t do relationships. Access to the spreadsheets is always mediated by a single application.
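The difference in how the two tools treat relationships can be sketched in a few lines. The table names, columns, and sample rows are hypothetical, and SQLite stands in for the large centralized database only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE purchase_order (
        id INTEGER PRIMARY KEY,
        department_id INTEGER NOT NULL REFERENCES department(id),
        total REAL NOT NULL
    );
    INSERT INTO department VALUES (1, 'Shoes');
    INSERT INTO purchase_order VALUES (1001, 1, 250.0);
    """
)

# Relational form: the relationship is declared once and any tool can join on it.
joined = conn.execute(
    "SELECT d.name, p.total FROM purchase_order p "
    "JOIN department d ON d.id = p.department_id"
).fetchall()

# Spreadsheet form: one flat grid; the department is just repeated text per row.
sheet = [
    ["PO", "Department", "Total"],
    [1001, "Shoes", 250.0],
]
```

The schema costs more up front and is harder to change, but the link between orders and departments is explicit and queryable. In the grid, that link exists only in the head of whoever typed "Shoes".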

So, two different systems. Both store structured (or at least semi-structured) data. The nature of each produces very different emergent behaviors. In one case, we find the evolution of acolytes of the RDBMS. In the other case, we find that a numeric analysis tool is being used for widespread data storage and sharing.

Given enough examples, enough time, and enough study, can we not learn to extrapolate from the essential nature of our designs to the most probable emergent behaviors? Perhaps even to select the emergent behaviors that we desire first and, starting from those, decide what essential nature our designs must embody to most likely encourage those behaviors?