Wide Awake Developers

Three Vendors Worth Evaluating

| Comments

Several vendors are sponsoring QCon. (One can only wonder what the registration fees would be if they didn’t.) Of these, I think three have products worth immediate evaluation.

Semmle

In the category of "really cool, but would I pay for it?" is Semmle. Their flagship product, SemmleCode, lets you treat your codebase as a database against which you can run queries. SemmleCode groks the static structure of your code, including relationships and dependencies. Along the way, it calculates pretty much every OO metric yet invented. It also looks at the source repository.

What can you do with it? Well, you can create a query that shows you all the cyclic dependencies in your code. The results can be rendered as a tree with explanations, a graph, or a chart. Or, you can chart your distribution of cyclomatic complexity scores over time. You can look for the classes or packages most likely to create a ripple effect.

Semmle ships with a sample project: the open-source drawing framework JHotDraw. In a stunning coincidence, I’m a contributor to JHotDraw. I wrote the glue code that uses Batik to export a drawing as SVG. So I can say with confidence, that when Semmle showed all kinds of cyclic dependencies in the exporters, it’s absolutely correct. Every one of the queries I saw run against JHotDraw confirmed my own experience with that codebase. Where Semmle indicated difficulty, I had difficulty. Where Semmle showed JHotDraw had good structure, it was easy to modify and extend.

There are an enormous number of things you could do with this, but one thing they currently lack is build-time automation. Semmle integrates with Eclipse, but not ANT or Maven. I’m told that’s coming in a future release.

3Tera

Virtualization is a hot topic. VMWare has the market lead in this space, but I’m very impressed with 3Tera’s AppLogic.

AppLogic takes virtualization up a level.  It lets you visually construct an entire infrastructure, from load balancers to databases, app servers, proxies, mail exchangers, and everything. These are components they keep in a library, just like transistors and chips in a circuit design program.

Once you’ve defined your infrastructure, a single button click will deploy the whole thing into the grid OS. And there’s the rub. AppLogic doesn’t work with just any old software and it won’t work on top of an existing "traditional" infrastructure.

As a comparison, HP’s SmartFrog just runs an agent on a bunch of Windows, Linux, or HP-UX servers. A management server sends instructions to the agents about how to deploy and configure the necessary software. So SmartFrog could be layered on top of an existing traditional infrastructure.

Not so with AppLogic. You build a grid specifically to support this deployment style. That makes it possible to completely virtualize load balancers and firewalls along with servers. Of course, it also means complete and total lock-in to 3tera.

Still, for someone like a managed hosting provider, 3tera offers the fastest, most complete definition and provisioning system I’ve seen.

GigaSpaces

What can I say about GigaSpaces? Anyone who’s heard me speak knows that I adore tuple-spaces. GigaSpaces is a tuple-space in the same way that Tibco is a pub-sub messaging system. That is to say, the foundation is a tuple-space, but they’ve added high-level capabilities based on their core transport mechanism.

So, they now have a distributed caching system.  (They call it an "in-memory data grid". Um, OK.) There’s a database gateway, so your front end can put a tuple into memory (fast) while a back-end process takes the tuple and writes it into the database.

Just this week, they announced that their entire stack is free for startups. (Interesting twist: most companies offer the free stuff to open-source projects.) They’ll only start charging you money when you get over $5M in revenue. 

I love the technology. I love the architecture.

Catching Up Through the Day

| Comments

One of the great things about virtual infrastructure is that you can treat it as a service. I use Yahoo’s shared hosting service for this blog. That gives me benefits: low cost and very quick setup. On the down side, I can’t log in as root. So when Yahoo has a problem, I have a problem.

Yesterday, there was something wrong with Yahoo’s install of Movable Type. As a result, I couldn’t post my "five things". I’ll be catching up today, as time permits.

My butt is planted in one track all day today, "Architectures You’ve Always Wondered About." We’ll be hearing about the architecture that runs Second Life, Yahoo, eBay, LinkedIn, and Orbitz. I may need a catheter and an IV.

Architecting for Latency

| Comments

Dan Pritchett, Technical Fellow at eBay, spoke about "Architecting for Latency". His aim was not to talk about minimizing latency, as you might expect, but rather to architect as though you believe latency is unavoidable and real.

We all know the effect latency can have on performance. That’s the first-level effect. If you consider synchronous systems—such as SOAs or replicated DR systems—then latency has a dramatic effect on scalability as well. Whenever a synchronous call reaches across a long wire, the latency gets added directly to the processing time.

For example, if client A calls service B, then A’s processing time will be at least the sum of B’s processing time, plus the latency between A and B. (Yes, it seems obvious when you state it like that, but many architectures still act as though latency is zero.)

Furthermore, latency over IP networks is fundamentally variable. That means A’s performance is unpredictable, and can never be made predictable.

Latency also introduces semantic problems. A replicated database will always have some discrepancy with the master database. A functionally or horizontally partitioned system will either allow discrepancies or must serialize traffic and give up scaling. You can imagine that eBay is much more interested in scaling than serializing traffic.

For example, when a new item is posted to eBay, it does not immediately show up in the search results. The ItemNode service posts a message that eventually causes the item to show up in search results. Admittedly, this is kept to a very short period of time, but still, the item will reach different data centers at different times. So, the search service inside the nearest data center will get the item before the search service inside the farthest. I suspect many eBay users would be shocked, and probably outraged, to hear that shoppers see different search results depending on where they are.

Now, the search service is designed to get consistent within a limited amount of time—for that item. With a constant flow of items, being posted from all over the country, you can imagine that there is a continuous variance among the search services. Like the quantum foam, however, this is near-impossible to observe. One user cannot see it, because a user gets pinned to a single data center. It would take multiple users, searching in the right category, with synchronized clocks, taking snapshots of the results to even observe that the discrepancies happen. And even then, they would only have a chance of seeing it, not a certainty. 

Another example. Dan talked about payment transfer from one user to another. In the traditional model, that would look something like this.

Synchronous write to both databases

You can think of the two databases as being either shards that contain different users or tables that record different parts of the transaction.

This is a design that pretends latency doesn’t exist. In other words, it subscribes to Fallacy #2 of the Fallacies of Distributed Computing. Performance and scalability will suffer here.

(Frankly, it has an availability problem, too, because the availability of the payment service Pr(Payment) will now be Pr(Database 1) * Pr(Network 1) * Pr(Database 2) * Pr(Network 2). In other words, the availability of the payment service is coupled to the availability of Database 1, Database 2, and the two networks connecting Payment to the two databases.)

Instead, Dan recommends a design more like this:

Payment service with back end reconciliation

In this case, the payment service can set an expectation with the user that the money will be credited within some number of minutes. The availability and performance of the payment service is now independent from that of Database 2 and the reconciliation process. Reconciliation happens in the back end. It can be message-driven, batch-driven, or whatever. The main point is that it is decoupled in time and space from the payment service. Now the databases can exist in the same data center or on separate sides of the globe. Either way, the performance, availability, and scalability characteristics of the payment service don’t change. That’s architecting for latency.

Instead of ACID semantics, we think about BASE semantics. (A cute retronym for Basically Available Soft-state Eventually-consistent.) 

Now, many analysts, developers, and business users will object to the loss of global consistency. We heard a spirited debate last night between Dan and Nati Shalom, founder and CTO of GigaSpaces about that very subject.

I have two arguments of my own to support architecting for latency.

First, any page you show to a user represents a point-in-time snapshot. It can always be inaccurate even by the time you finish generating the page. Think about a commerce site saying "Ships in 2 - 3 days". That’s true at the instant when the data is fetched. Ten milliseconds later, it might not be true. By the time you finish generating the page and the user’s browser finishes rendering it (and fetching the 29 JavaScript files needed for the whizzy AJAX interface) the data is already a few seconds old. So global consistency is kind of useless in that case, isn’t it? Besides, I can guarantee there’s already a large amount of latency in the data path from inventory tracking to the availability field in the commerce database anyway.

Second, the cost of global consistency is global serialization. If you assume a certain volume of traffic you must support, the cost of a globally consistent solution will be a multiple of the cost of a latency-tolerant solution. That’s because global consistency can only be achieved by making a single master system of record. When you try to reach large scales, that master system of record is going to be hideously expensive.

Latency is simply an immutable fact of the universe. If we architect for it, we can use it to our advantage. If we ignore it, we will instead be its victims.

For more of Dan’s thinking about latency, see his article on InfoQ.com

SOA Without the Edifice

| Comments

Sometimes the best interactions at a conference aren’t the talks, they’re shouting. An crowded bar with an over-amped DJ may seem like an unlikely place for a discussion on SOA. Even so, when it’s Jim Webber, ThoughtWorks’ SOA practice lead doing the shouting, it works. Given that Jim’s topic is "Guerilla SOA", shouting is probably more appropriate than the hushed and reverential cathedral whispers that usually attend SOA discussions.

Jim’s attitude is that SOA projects tend to attract two things: Taj Mahal architects and parasitic vendors. (My words, not Jim’s.) The combined efforts of these two groups results in monumentally expensive edifices that don’t deliver value. Worse still, these efforts consume work and attention that could go to building services aligned with the real business processes, not some idealized vision of what the processes ought to be.

Jim says that services should be aligned with business processes. When the business process changes, change the service. (To me, this automatically implies that the service cannot be owned by some enterprise governance council.) When you retire the business process, simply retire the service.

These sound like such common sense, that it’s hard to imagine they could be controversial.

I’ll be in the front row for Jim’s talk later today.

 

Cameron Purdy: 10 Ways to Botch Enterprise Java Scalability and Reliability

| Comments

Here at QCon, well-known Java developer Cameron Purdy gave a fun talk called "10 Ways to Botch Enterprise Java Scalability and Reliability".  (He also gave this talk at JavaOne.)

While I could quibble with Cameron’s counting—there were actually more like 16 points thanks to some numerical overloading—I liked his content.  He echoes many of the antipatterns from Release It.   In particular, he talks about the problem I call "Unbounded Result Sets".  That is, whether using an ORM tool or straight database queries, you can always get back more than you expect. 

Sometimes, you get back way, way more than you expect. I once saw a small messaging table, that normally held ten or twenty rows, grow to over ten million rows.  The application servers never contemplated there could be so many messages.  Each one would attempt to fetch the entire contents of the table and turn them into objects.  So, each app server would run out of memory and crash.  That rolled back the transaction, allowing the next app server to impale itself on the same table.

Unbounded Result Sets don’t just happen from "SELECT * FROM FOO;", though.  Think about an ORM handling the parent-child relationship for you.  Simply calling something like customer.getOrders() will return every order for that customer.  By writing that call, you implicitly assume that the set of orders for a customer will always be small.  Maybe.  Maybe not.  How about blogUser.getPosts()?  Or tickerSymbol.getTrades()?

Unbounded Result Sets also happen with web services and SOAs.  A seemingly innocuous request for information could create an overwhelming deluge—an avalanche of XML that will bury your system.  At the least, reading the results can take a long time.  In the worst case, you will run out of memory and crash.

The fundamental flaw with an Unbounded Result Set is that you are trusting someone else not to harm you, either a data producer or a remote web service. 

Take charge of your own safety! 

Be defensive!

Don’t get hurt again in another dysfunctional relationship!

Three Programming Language Problems Solved Forever

| Comments

It’s often been the case that a difficult problem can be made easier by transforming it into a different representation.  Nowhere is that more true than in mathematics and the pseudo-mathematical realm of programming languages.

For example, LISP, Python, and Ruby all offer beautiful and concise constructs for operating on lists of things.  In each of them, you can make a function which iterates across a list, performing some operation on each element, and returning the resulting list.  C, C++, and Java do not offer any similar construct.  In each of these languages, iterating a list is a control-flow structure that requires multiple lines to express.  More significantly, the function expression of list comprehension can be composed. That is, you can embed a list comprehension structure inside of another function call or list operation.  In reading Programming Collective Intelligence, which uses Python as its implementation language, I’ve been amazed at how eloquent complex operations can be, especially when I mentally transliterate the same code into Java.

In the evening keynote at QCon, Richard Gabriel covered 50 language topics, with a 50 word statement about each—along with a blend of music, art, and poetry. (If you’ve never seen Richard perform at a conference, it’s quite an experience.)  His presentation "50 in 50" also covered 50 years of programming and introduced languages as diverse as COBOL, SNOBOL, Piet, LISP, Perl, C, Algol, APL, IPL, Befunge, and HQ9+.

HQ9+ particularly caught my attention.  It takes the question of "simplifying the representation of problems" to the utmost extreme.

HQ9+ has a simple grammar.  There are 4 operations, each represented by a single character.

’+’ increments the register.

‘H’ prints every languages natal example, "Hello, world!" 

‘Q’ makes every program into a quine.  It causes the interpreter to print the program text.  Quines are notoriously difficult assignments for second-year CS students.

‘9’ causes the interpreter to print the lyrics to the song "99 Bottles of Beer on the Wall."  This qualifies HQ9+ as a  real programming language, suitable for inclusion in the ultimate list of languages.

These three operators solve for some very commonly expressed problems.  In a certain sense, they are the ultimate solution to those problem.  They cannot be reduced any further… you can’t get shorter than one character.

Of course, in an audience of programmers, HQ9+ always gets a laugh.  In fact, it was created specifically to make programmers laugh.  And, in fact, it’s a kind of meta-level humor. It’s not the programs that are funny, but the design of the language itself… an inside joke from one programmer to the rest of us.

Eric Evans: Strategic Design

| Comments

Eric Evans, author of Domain-Driven Design and founder of Domain Language, embodies the philosophical side of programming.

He gave a wonderful talk on "Strategic Design".  During this talk, he stated a number of maxims that are worth pondering.

"Not all of a large system will be well designed."

"There are always multiple models."

"The diagram is not the model, but it is an expression of part of the model."

These are not principles to be followed, Evans says. Rather, these are fundamental laws of the universe. We must accept them and act accordingly, because disregarding them ends in tears.

Much of this material comes from Part 4 of Domain-Driven Design.  Evans laconically labeled this, "The part no one ever gets to."  Guilty.  But when I get back home to my library, I will make another go of it.

Evans also discusses the relative size of code, amount of time spent, and value of the three fundamental portions of a system: the core domain, supporting subdomains, and generic subdomains.

Generic subdomains are horizontal. You might find these in any system in any company in the world.

Supporting subdomains are business-specific, but not of value to this particular system. That is, they are necessary cost, but do not provide value.

The core domain is the reason for the system. It is the business-specific functionality that makes this system worth building.

Now, in a typical development process (and especially a rewrite project), where does the team’s time go? Most of it will go to the largest bulk: the generic subdomains. This is the stuff that has to exist, but it adds no value and is not specific to the company’s business. The next largest fraction goes to the supporting subdomains. Finally, the smallest portion of time—and usually the last portion of time—goes to the core domain.

That means the very last thing delivered is the reason for the system’s existance in the first place.  Ouch. 

Kent Beck’s Keynote: “Trends in Agile Development”

| Comments

Kent Beck spoke with his characteristic mix of humor, intelligence, and empathy.  Throughout his career, Kent has brought a consistently humanistic view of development.  That is, software is written by humans–emotional, fallible, creative, and messy–for other humans.  Any attempt to treat development as robotic will end in tears.

During his keynote, Kent talked about engaging people through appreciative inquiry.  This is a learnable technique, based in human psychology, that helps focus on positive attributes.  It counters the negaitivity that so many developers and engineers are prone to.  (My take: we spend a lot of time, necessarily, focusing on how things can go wrong.  Whether by nature or by experience, that leads us to a pessimistic view of the world.)

Appreciative inquiry begins by asking, "What do we do well?"  Even if all you can say is that the garbage cans get emptied every night, that’s at least something that works well.  Build from there.

He specifically recommended The Thin Book of Appreciative Inquiry, which I’ve already ordered.

I should also note that Kent has a new book out, called Implementation Patterns, which he described as being about, "Communicating with other people, through code."

From QCon San Francisco

| Comments

I’m at QCon San Francisco this week.  (An aside: after being a speaker at No Fluff, Just Stuff, it’s interesting to be the audience again.  As usual, on returning from travels in a different domain, one has a new perspective on familiar scenes.) This conference targets senior developers, architects, and project managers.  One of the very appealing things is the track on "Architectures you’ve always wondered about".  This coveres high-volume architectures for sites such as LinkedIn and eBay as well as other networked applications like Second Life.  These applications live and work in thin air, where traffic levels far outstrip most sites in the world.  Performance and scalability are two of my personal themes, so I’m very interested in learning from these pioneers about what happens when you’ve blown past the limits of traditional 3-tier, app-server centered architecture.

Through the remainder of the week, I’ll be blogging five ideas, insights, or experiences from each day of the conference.