Wide Awake Developers

OTUG Tonight


This evening, I’m speaking at OTUG. The topic is "Clouds, Grids, and Fog".

There’s no denying that "cloud" has become a huge buzzword. It’s a crossover trend, too. It’s not just the CIO who is interested in cloud computing. It’s the CFO and the CMO, too. (Not to mention the CSO, if there is one.)  Underneath the buzz, though, there is something real and valuable.

I will talk about the driving trends that are leading us toward cloud computing and how it differs from grids and software-as-a-service. I’ll also talk at length about the architectural implications and effects of running your software on a cloud.

If you live in the Twin Cities, I hope to see you there.

Attack of Self-Denial, 2008 Style


"Good marketing can kill your site at any time."

–Paul Lord, 2006

I just learned of another attack of self-denial from this past week.

Many retailers are suffering this year, particularly in the brick-and-mortar space. I have heard from several, though, who say that their online performance is not suffering as much as the physical stores are. In some cases, where the brand is strong and the products are not fungible, the online channel is showing year-over-year growth.

One retailer I know was running strong, with the site near its capacity. They fit the bill for an online success in 2008. They have great name recognition, a very strong, global brand, and their customers love their products. This past week, their marketing group decided to "take it to the next level."

They blasted an email campaign to four million customers.  It had a good offer, no qualifier, and a very short expiration time—one day only.  A short expiration like that creates a sense of urgency.  Good marketing reaches people and induces them to act, and in that respect, the email worked. Unfortunately, when that means millions of users hitting your site, you may run into trouble.

Traffic flooded the site and knocked it offline. It took more than six hours to get everything functioning again.

Instead of getting an extra bump in sales, they lost six hours of holiday-season revenue. As a rule of thumb, you should assume that a peak hour of holiday sales counts for six hours of off-season sales.

There are other technological solutions to help with this kind of traffic flood. For instance, the UX group can create a static landing page for the offer. Then marketing links to that static page in their email blast. Ops can push that static page out into their cache servers, or even into their CDN’s edge network. This requires some preparation for each offer, and it takes some extra preparation before the first such offer, but it’s very effective. The static page absorbs the bulk of the traffic, so only customers who really want to buy get passed into the dynamic site.

Marketing can also send the email out in waves, so people receive it at different times. That spreads the traffic spike out over a few hours. (Though this works less well when the waves go out overnight, because customers will all see the email within a couple of hours of each other in the morning.)

In really extreme cases, a portion of capacity can be carved out and devoted to handling promotional traffic. That way, if the promotion goes nuclear, at least the rest of the site is still online. Obviously, this would be more appropriate for a long-running promotion than a one-day event.

Of course, it should be obvious that all of these technological solutions depend on good communication.

At a surface level, it’s easy to say that this happened because marketing had no idea how close to the edge the site was already running. That’s true. It’s also true, however, that operations previously had no idea what the capacity was. If marketing called and asked, "Can we support 4 million extra visits?" the current operations group could have answered "no". Previously, the answer would have been "I don’t know."

So, operations got better, but marketing never got re-educated. Lines of communication were never opened, or re-opened. Better communication would have helped.

In any online business, you must have close communications between marketing, UX, development, and operations. They need to regard themselves as part of one integrated team, rather than four separate teams. I’ve often seen development groups that view operations as a barrier to getting their stuff released. UX and marketing view development as the barrier to getting their ideas implemented, and so on. This dynamic evolves from the "throw it over the wall" approach, and it can only result in finger-pointing and recriminations.

I’d bet there’s a lot of finger-pointing going on in that retailer’s hallways this weekend.

(Human | Pattern) Languages, Part 2


At the conclusion of the modulating bridge, we expect to be in the contrasting key of C minor. Instead, the bridge concludes in the distantly related key of F sharp major… Instead of resolving to the tonic, the cadence concludes with two isolated E pitches. They are completely ambiguous. They could belong to E minor, the tonic for this movement. They could be part of E major, which we’ve just heard peeking out from behind the minor mode curtains. [He] doesn’t resolve them into a definite key until the beginning of the third movement, characteristically labeled a "Scherzo".

In my last post, I lamented the missed opportunity we had to create a true pattern language about software. Perhaps calling it a missed opportunity is too pessimistic. Bear with me on a bit of a tangent. I promise it comes back around in the end.

The example text above is an amalgam of a lecture series I’ve been listening to. I’m a big fan of The Teaching Company and their courses. In particular, I’ve been learning about the meaning and structure of classical, baroque, romantic, and modern music from Professor Robert Greenberg.1 The sample I used here is from a series on Beethoven’s piano sonatas. This isn’t an actual quote, but a condensation of statements from one of the lectures. I’m not going to go into all the music theory behind this, but it is interesting.2

There are two things I want you to observe about the sample text. First, it’s loaded with jargon. It has to be! You’d exhaust the conversational possibilities about the best use of a D-sharp pretty quickly. Instead, you’ll talk about structures, tonalities, relationships between that D-sharp and other pitches. (D-sharp played together with a C? Very different from a quick sequence of D-sharp, E, D-sharp, C.) You can be sure that composers don’t think in terms of individual notes. A D-sharp by itself doesn’t mean anything. It only acquires meaning by its relation to other pitches. Hence all that stuff about keys—tonic, distantly related, contrasting. "Key" is a construct for discussing whole collections of pitches in a kind of shorthand. To a musician, there’s a world of difference between G major and A flat minor, even though the basic pitch (the tonic) is only one half-step apart.

Also notice that the text addresses some structural features. The purpose and structure of a modulating bridge is pretty well understood, at least in certain circles. The notion that you can have an "expected" key certainly implies that there are rules for a sonata. In fact, the term "sonata" itself means some fairly specific things3… although to know whether we’re talking about "a sonata" or "a movement in sonata form" requires some additional context.

In fact, this paragraph is all about context. It exists in the context of late Classical, early Romantic era music, specifically the music of Beethoven. In the Classical era, musical forms—such as sonata form—pretty much dictated the structure of the music. The number of movements, their relationships to each other, their keys, and even their tempos were well understood. A contemporary listener had every reason to expect that a first movement would be fast and bright, and that if the first movement was in C major, then the second, slower movement would be a minuet and trio in G major.

Music and music theory have evolved over the last thousand-odd years. We have a vocabulary—the potentially off-putting jargon of the field. We have nesting, interrelating contexts. Large-scale patterns (a piano sonata) create context for medium-scale patterns (the first movement "allegretto"), which, in turn, create context for the medium- and small-scale patterns (the first theme in the allegretto consists of an ABA’BA phrasing, in which the opening theme sequences a motive upward over octaves). We even have the ability to talk about non sequiturs—like the modulating bridge above—where deliberate violation of the pattern language is done for effect.4

What is all this stuff if it isn’t a pattern language?

We can take a few lessons, then, from the language of music.

The first lesson is this: give it time. Musical language has evolved over a long time. It has grown and been pruned back over centuries. New terms are invented as needed to describe new answers to a context. In turn, these new terms create fresh contexts to be exploited with yet other inventions.

Second, any such language must be able to assimilate change. Nothing is lost, even amidst the most radical revolutions. When the Twentieth Century modernists rejected the tonal system, they could only reject the structures and strictures of that language. They couldn’t destroy the language itself. Phish plays fugues in concert… they just play them with electric guitars instead of harpsichords. There are Baroque orchestras today. They play in the same concert halls as the Pops and Philharmonics. The homophonic texture of plain chant still exists, and so do the once-heretical polyphony and church-sanctioned monophony. Nothing is lost, but new things can be encompassed and incorporated.

And, mainframes still exist with their COBOL programs, together with distributed object systems, message passing, and web services. The Singleton and Visitor patterns will never truly go away, any more than batch programming will disappear.

Third, we must continue to look at the relationships between different parts of our nascent pattern language. Just as individual objects aren’t very interesting, isolated patterns are less interesting than the ways they can interact with each other.

I believe that the true language of software has as much to do with programming languages as the language of music has to do with notes. So, instead of missed opportunity, let us say instead that we are just beginning to discover our true language.


1. Professor Greenberg is a delightful traveling companion. He’s witty, knowledgeable and has a way of teaching complex subjects without ever being condescending. He also sounds remarkably like Penn Jillette.

2. The main reason is that I would surely get it wrong in some details and risk losing the main point of my post here.

3. And here we see yet another of the complexities of language. The word "sonata" refers, at different times, to a three movement concert work, a single movement in a characteristic structure, a four movement concert work, and in Beethoven’s case, to a couple of great fantasias that he declares to be sonatas simply because he says so.

4. For examples ad nauseam, see Richard Wagner and the "abortive gesture".

(Human | Pattern) Languages


We missed the point when we adopted "patterns" in the software world. Instead of an organic whole, we got a bag of tricks.

The commonly accepted definition of a pattern is "a solution to a problem in a context." This is true, but limiting. This definition loses an essential characteristic of patterns: Patterns relate to other patterns.

We talk about the context of a problem. "Context" is a mental shorthand. If we unpack the context, it means many things: constraints, capabilities, style, requirements, and so on. We sometimes mislead ourselves by using the fairly fuzzy, abstract term "context" as a mental handle on a whole variety of very concrete issues. Context includes stated constraints like the functional requirements, along with unstated constraints like, "The computation should complete before the heat death of the universe." It includes other forces like, "This program is written in C#, so the solution to this problem should be in the same language or a closely related one." It should not require a supercooled quantum computer, for example.

Where does the context for a small-scale pattern originate?1 Context does not arise ex nihilo. No, the context for a small-scale pattern is created by larger patterns. Large grained patterns create the fabric of forces that we call the context for smaller patterns. In turn, smaller patterns fit into this fabric and, by their existence, they change it. Thus, the small scale patterns create feedback that can either resolve or exacerbate tensions inherent in the larger patterns.

Solutions that respect their context fit better with the rest of the organic whole. It would be strange to be reading some Java code, built into layered architecture with a relational database for storage, then suddenly find one component that has its own LISP interpreter and some functional code. With all respect to "polyglot programming", there’d better be a strong motivation for such an odd inclusion. It would be a discontinuity… in other words, it doesn’t fit the context I described. That context—the layered architecture, the OO language, relational database—was created by other parts of the system.

If, on the other hand, the system was built as a blackboard architecture, using LISP as glue code over intelligent agents acting asynchronously, then it wouldn’t be at all odd to find some recursive lambda expressions. In that context, they fit naturally and the Java code would be an oddity.

This interrelation across scale knits patterns together into a pattern language. By and large, what we have today is a growing group of proper nouns. Please don’t get me wrong: the nouns themselves have use. It’s very helpful to say "you want a Null Object there," and be understood. That vocabulary, and the compression it provides, is really important.

But we shouldn’t mistake a group of nouns for a real pattern language. A language is more than just its nouns. A language also implies ways of connecting statements sensibly. It has idioms and semantics and semiotics.2 In a language, you can have dialog and argumentation.  Imagine a dialog in patterns as they exist today:

"Pipes and filters."

"Observer?"

"Chain of Responsibility!"

You might be able to make a comedy sketch out of that, but not much more. We cannot construct meaningful dialogs about patterns at all scales.

What we have are fragments of what might become a pattern language. GoF, the PLoPD books, the PoSA books… these are like a few charted territories on an unmapped continent. We don’t yet have the language that would even let us relate these works together, let alone relating them to everything else.

Everything else? Well, yes. By and large, patterns today are an outgrowth of the object-oriented programming community. I contend, however, that "object-oriented" is a pattern! It’s a large-scale pattern that creates really significant context for all the other patterns that can work within it. Solutions that work within the "object-oriented" context make no sense in an actor-oriented context, or a functional context, or a procedural context, and so on. Each of these other large-scale patterns admits different solutions to similar problems: persistence, user interaction, and system integration, to name a few. I can imagine a pattern called "Event Driven" that would work very well with "Object oriented", "Functional", and "Actor Oriented", but somewhat less well with "Procedural programming", and would clash utterly with "Batch Processing". (Though there might be a link between them called "Buffer file" or something like that.)

That’s the piece that we missed. We don’t have a pattern language yet. We’re not even close.


1. By "large" and "small", I don’t mean to imply that patterns simply nest hierarchically. It’s more complex and subtle than that. When we do have a real pattern language, we’ll find that there are medium-grained patterns that work together with several, but not all, of the large ones. Likewise, we’ll find small-scale patterns that make medium sized ones more or less practical. It’s not a decision tree or a heuristic.

2. That’s what keeps "Fill the idea with blue" from being a meaningful sentence. All the words work, and they’re even the right part of speech, yet the sentence as a whole doesn’t fit together.

Connection Pools and Engset


In my last post, I talked about using Erlang models to size the front end of a system. By using some fundamental capacity models that are almost a century old, you can estimate the number of request handling threads you need for a given traffic load and request duration.

Inside the Box

It gets tricky, though, when you start to consider what happens inside the server itself. Processing the request usually involves some kind of database interaction with a connection pool. (There are many ways to avoid database calls, or at least minimize the damage they cause. I’ll address some of these in a future post, but you can also check out Two Ways to Boost Your Flagging Web Site for starters.) Database calls act like a kind of "interior" request that can be considered to have its own probability of queuing.

Exterior call to server becomes an "interior" call to a database.

Because this interior call can block, we have to consider what effects it will have on the duration of the exterior call. In particular, the exterior call must take at least the sum of the blocking time plus the processing time for the interior call.

At this point, we need to make a few assumptions about the connection pool. First, the connection pool is finite. Every connection pool should have a ceiling. If nothing else, the database server can only handle a finite number of connections. Second, I’m going to assume that the pool blocks when exhausted. That is, calling threads that can’t get a connection right away will happily wait forever rather than abandoning the request. This is a simplifying assumption that I need for the math to work out. It’s not a good configuration in practice!
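To make those assumptions concrete, here is a minimal Python sketch of a finite pool that blocks forever when exhausted. It is purely illustrative: the strings stand in for real database connections, and no actual driver is involved.

```python
import queue

class BlockingConnectionPool:
    """A finite pool whose callers block forever when it is exhausted,
    matching the simplifying assumptions above."""

    def __init__(self, size: int):
        self._conns = queue.Queue(maxsize=size)  # the finite ceiling
        for i in range(size):
            self._conns.put(f"conn-{i}")  # stand-ins for real connections

    def acquire(self):
        # block=True with no timeout: wait forever rather than abandon
        return self._conns.get(block=True)

    def release(self, conn) -> None:
        self._conns.put(conn)

pool = BlockingConnectionPool(size=2)
first = pool.acquire()
second = pool.acquire()
# A third acquire() here would block until some thread calls release().
pool.release(first)
third = pool.acquire()  # succeeds immediately after the release
```

In a real deployment you would want a timeout on the `get()` call so that exhaustion fails fast instead of hanging, which is exactly the point about configuration made above.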

With these assumptions in place, I can predict the probability of blocking within the interior call. It’s a formula closely related to the Erlang model from my last post, but with a twist. The Erlang models assume an essentially infinite pool of requestors. For this interior call, though, the pool of requestors is quite finite: it’s the number of request handling threads for the exterior calls. Once all of those threads are busy, there aren’t any left to generate more traffic on the interior call!

The formula to compute the blocking probability with a finite number of sources is the Engset formula. Like the Erlang models, Engset originated in the world of telephony. It’s useful for predicting the outbound capacity needed on a private branch exchange (PBX), because the number of possible callers is known. In our case, the request handling threads are the callers and the connection pool is the PBX.
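For the curious, one common closed form of the Engset formula fits in a few lines of Python. The per-source load of 1.42 erlangs below is my own rough estimate from the example’s numbers (40 threads, 200 ms calls, a million pages an hour); it is an assumption for illustration, not a figure from the original analysis.

```python
from math import comb

def engset(servers: int, a: float, sources: int) -> float:
    """Engset blocking probability: a finite population of `sources`
    callers, each offering `a` erlangs while idle, contending for
    `servers` servers (here: connections in the pool)."""
    numerator = comb(sources - 1, servers) * a ** servers
    denominator = sum(comb(sources - 1, i) * a ** i
                      for i in range(servers + 1))
    return numerator / denominator

# 40 request-handling threads are the finite sources; vary pool size.
PER_SOURCE_LOAD = 1.42  # erlangs; my illustrative estimate
for pool_size in (1, 18, 30, 40):
    print(pool_size, f"{engset(pool_size, PER_SOURCE_LOAD, 40):.5%}")
```

With zero servers every request blocks (probability 1.0), and with as many servers as sources nothing ever blocks, which is a handy sanity check on any implementation.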

Practical Example

Using our 1,000,000 page views per hour from last time, Table 1 shows the Engset table for various numbers of connections in the pool. This assumes that the application server has a maximum of 40 request handling threads. This also supposes that the database processing time uses 200 milliseconds of the 250 milliseconds we measured for the exterior call.

Table 1. Engset blocking probability by connection pool size

N     Engset(N, A, S)
0     100.00000%
1     98.23183%
2     96.37740%
3     94.43061%
4     92.38485%
5     90.23293%
6     87.96709%
7     85.57891%
8     83.05934%
9     80.39867%
10    77.58656%
11    74.61210%
12    71.46397%
13    68.13065%
14    64.60087%
15    60.86421%
16    56.91211%
17    52.73932%
18    48.34604%
19    43.74105%
20    38.94585%
21    34.00023%
22    28.96875%
23    23.94730%
24    19.06718%
25    14.49235%
26    10.40427%
27    6.97050%
28    4.30152%
29    2.41250%
30    1.21368%
31    0.54082%
32    0.21081%
33    0.07093%
34    0.02028%
35    0.00483%
36    0.00093%
37    0.00014%
38    0.00002%
39    0.00000%
40    0.00000%

Notice that when we get to 18 connections in the pool, the probability of blocking drops below 50%.  Also, notice how sharply the probability of blocking drops off around 23 to 31 connections in the pool. This is a decidedly nonlinear effect!

From this table, it’s clear that even though there are 40 request handling threads that could call into this pool, there’s not much point in having more than 30 connections in the pool. At 30 connections, the probability of blocking is already less than 1%, meaning that the queuing time is only going to add a few milliseconds to the average request.

Why do we care? Why not just crank up the connection pool size to 40? After all, if we did, then no request could ever block waiting for a connection. That would minimize latency, wouldn’t it?

Yes, it would, but at a cost. Increasing the number of connections to the database by a third means more memory and CPU time on the database just managing those connections, even if they’re idle. If you’ve got two app servers, then the database probably won’t notice an extra 10 connections. Suppose you scale out at the app tier, though, and you now have 50 or 60 app servers. You’d better believe that the DB will notice an extra 500 to 600 connections. They’ll affect memory needs, CPU utilization, and your ability to fail over correctly when a database node goes down.

Feedback and Coupling

There’s a strong coupling between the total request duration in the interior call and the request duration for the exterior call. If we assume that every request must go through the database call, then the exterior response time must be strictly greater than the interior blocking time plus the interior processing time.

In practice, it actually gets a little worse than that, as this causal loop diagram illustrates.

 Time dependencies between the interior call and the exterior call.

It reads like this: "As the interior call blocking time increases, the exterior call duration increases. As the exterior call duration increases, the interior call blocking time increases." This type of representation helps clarify relations between the different layers. It’s very often the case that you’ll find feedback loops this way. Any time you do find a feedback loop, it means that slowdowns will produce increasing slowdowns. Blocking begets blocking, quickly resulting in a site hang.

Conclusions

Queues are like timing dots. Once you start seeing them, you’ll never be able to stop. You might even start to think that your entire server farm looks like one vast, interconnected set of queues.

That’s because it is.

People use database connection pools because creating new connections is very slow. Tuning your database connection pool size, however, is all about optimizing the cost of queueing against the cost of extra connections. Each connection consumes resources on the database server and in the application server. Striking the right balance starts by identifying the required exterior response time, then sizing the connection pool—or changing the architecture—so the interior blocking time doesn’t break the SLA.

For much, much more on the topic of capacity modeling and analysis, I definitely recommend Neil Gunther’s website, Performance Agora. His books are also a great—and very practical—way to start applying performance and capacity management.

Thread Pools and Erlang Models


Sizing, Danish Style

Folks in telecommunications and operations research have used Erlang models for almost a century. A. K. Erlang, a Danish telephone engineer, developed these models to help plan the capacity of the phone network and predict the grade of service that could be guaranteed, given some basic metrics about call volume and duration. Telephone networks are expensive to deploy, particularly when upgrading your trunk lines involves digging up large portions of rocky Danish ground or running cables under the North Sea.

The Erlang-B formula predicts the probability that an incoming call cannot be serviced, based on the call arrival rate, average call time, and number of lines available.  Erlang-C is similar, but allows for calls to be queued while waiting for service. It predicts the probability that a call will be queued. It can also show when calls will never be serviced, because the rate of arriving calls exceeds the system’s total capacity to serve them.

Erlang models are widely used in telecomm, including GPRS network sizing, trunk line sizing, call center staffing models, and other capacity planning arenas where request arrival is apparently random. In fact, you can use it to predict the capacity and wait time at a restaurant, bank branch, or theme park, too.

It should be pretty obvious that Erlang models are widely applicable in computer performance analysis, too. There’s a rich body of literature on this subject that goes back to the dawn of the mainframe. Erlang models are the foundation of most capacity management groups. I’m not even going to scratch the surface here, except to show how some back-of-the-envelope calculations can help you save millions of dollars.

One Million Page Views

In my case, I wanted to look at thread pool sizing. Suppose you have an even 1,000,000 requests per hour to handle. This implies an arrival rate (or lambda) of 0.27777… requests per millisecond. (Erlang units are dimensionless, but you need to start with the same units of time, whether it’s hours, days, or milliseconds.) I’m going to assume for the moment that the system is pretty fast, so it handles a request in 250 milliseconds, on average.

(Please note that there are many assumptions underneath simple statements like "on average". For the moment, I’ll pretend that request processing time follows a normal distribution, even though any modern system is more likely to be bimodal.)

Table 1 shows a portion of the Erlang-C table for these parameters. Feel free to double-check my work with this spreadsheet or this short C program to compute the Erlang-B and Erlang-C values for various numbers of threads. (Thanks to Kenneth J. Christensen for the original program. I can only claim credit for the extra "for" loop.)
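If you’d rather script the check than open a spreadsheet, the standard Erlang-B recursion and the Erlang-C conversion also fit in a few lines of Python. This is my own sketch, not the C program linked above, though it uses the same textbook formulas and the same parameters (a million requests per hour at 250 ms each):

```python
from math import inf

def erlang_b(servers: int, traffic: float) -> float:
    """Erlang-B blocking probability, via the standard recursion."""
    b = 1.0
    for n in range(1, servers + 1):
        b = (traffic * b) / (n + traffic * b)
    return b

def erlang_c(servers: int, traffic: float) -> float:
    """Erlang-C probability that a request queues. Undefined (inf here)
    when the offered traffic meets or exceeds the server count, because
    the queue then grows without bound."""
    if traffic >= servers:
        return inf
    b = erlang_b(servers, traffic)
    return servers * b / (servers - traffic * (1.0 - b))

lam = 1_000_000 / 3_600_000  # requests per millisecond, ~0.2778
traffic = lam * 250          # offered load in erlangs, ~69.4

# Smallest thread pool with under a 1% chance of queuing.
threads = int(traffic) + 1
while erlang_c(threads, traffic) > 0.01:
    threads += 1
print(threads)  # 91, matching Table 1
```

Rerunning with `traffic = lam * 350` or `traffic = lam * 150` reproduces the other scenarios in this post.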

Table 1. Erlang-C values at 250 ms / request

N     Pr_Queue (Erlang-C)
67    undef
68    undef
69    undef
70    0.921417281
71    0.791698369
72    0.676255938
73    0.574128540
74    0.484342834
75    0.405921606
76    0.337892350
77    0.279296163
78    0.229196685
79    0.186688788
80    0.150906701
81    0.121031288
82    0.096296202
83    0.075992736
84    0.059473196
85    0.046152756
86    0.035509802
87    0.027084849
88    0.020478191
89    0.015346497
90    0.011398581
91    0.008390600
92    0.006120940
93    0.004424999
94    0.003170077
95    0.002250524
96    0.001583268
97    0.001103786
98    0.000762573
99    0.000522098

From Table 1, I can immediately see that anything fewer than 70 threads will never keep up: the queue of unprocessed requests will grow without bound. I need at least 91 threads to get below a 1% chance that a request will be delayed by queueing.

Performance and Capacity

Now, what happens if the average request processing time goes up by 100 milliseconds on those same million requests? Adjusting the parameters, I get Table 2.

Table 2. Erlang-C values at 350 ms / request

N     Pr_Queue (Erlang-C)
96    undef
97    undef
98    0.907100356
99    0.797290966
100   0.697789489
101   0.608014385
102   0.527376532
103   0.455282634
104   0.391138874
105   0.334354749
106   0.284347016
107   0.240543652
108   0.202387733
109   0.169341130
110   0.140887936
111   0.116537521
112   0.095827141
113   0.078324041
114   0.063626999
115   0.051367297
116   0.041209109
117   0.032849334
118   0.026016901
119   0.020471625
120   0.016002658
121   0.012426630
122   0.009585560
123   0.007344611
124   0.005589775
125   0.004225555

Now we need a minimum of 99 threads before we can even expect to keep up and we need 122 threads to get down under that 1% queuing threshold.

On the other hand, what about cutting 100 milliseconds from each request? I’ll let you run the calculator for that, but it looks to me like we need between 42 and 59 threads to meet the same thresholds.

That swing, from 150 to 350 milliseconds per request, makes a huge difference in the number of concurrent threads your system must support to handle a million requests per hour—almost a factor of three. Would you be willing to triple your hardware for the same request volume? Next time anyone says that “CPU is cheap”, fold your arms and tell them “Erlang would not approve.” On the flip side, it might be worth spending some administrator time on performance tuning to bring down your average page latency. Or maybe some programmer time to integrate memcached so every single page doesn’t have to trudge all the way to the database.

Summary and Extension

Obviously, there’s a lot more to performance analysis for web servers than this. Over time, I’ll be mixing more analytic pieces with the pragmatic, hands-on posts that I usually make. It’ll take some time. For one thing, I have to go back and learn about stochastic processes and Markov chains. Pattern recognition and signal processing I’ve got. Advanced probability and statistics I don’t got.

In fact, I’ll offer a free copy of Release It to the first commenter who can show me how to derive an Erlang-like model that accounts for a) garbage collection times (bimodal processing time distribution), b) multiple coupled wait states during processing, c) non-equilibrium system states, and d) processing time that varies as a function of system utilization.

Constraint, Chaos, Collapse


Patrick Mueller has an interesting post about being brainwashed into believing that the outrageous is normal. It’s a good read. (Hat tip to Reddit, whence many good things.) As often happens, I wrote such a long comment to his post that I felt it worthwhile to repost here.

My comment revolves around this chart of the Dow Jones Industrial Average over the last eighty years. (For the record, I’m not disputing anything about the rest of Patrick’s post. In fact, I agree with most of what he says. This chart and my comments aren’t central to his discussion about web development.) Some of you know that I’ve worked in finance before, and most of you know I have an interest in dynamics and complex systems. It’s been an interesting year.

Here’s a snapshot of the chart in question. It’s from Yahoo! Finance, and the image links to the live chart.



Most of the chart looks like an exponential, which suggests the effect of compound growth. In a functioning capital-based system you’d expect exactly that. Capital invested produces more capital. Any time an output is also a required input, you get exponential growth. One of Patrick’s other commenters points out that it looks almost linear when plotted on a logarithmic scale… a dead giveaway of an exponential.

No real system can produce infinite growth. Instead, they always hit a constraint. That could be a physical limitation on the available inputs. It could be a limit on the throughput of the system itself. In a sense, it almost doesn’t matter what the constraint itself happens to be. Rather, you should assume that a constraint exists.

In systems with a chaotic tendency, the system doesn’t slow down at all when approaching the constraint. In fact, it may be increasing at its greatest rate just before the constraint clamps down hardest. In such cases, you’ll either see a catastrophic collapse or a chaotic fluctuation.

I don’t know what the true constraint was in the financial system. Plenty of other people believe they know, and I’m happy to let them believe what they like. Just from looking at the chart, though, you could make a strong case that we really hit the constraint in 1999 and the rest has been chaos since then.

Licensing for Windows on EC2


One thing I noticed when I fired up my first Windows instances on EC2 was that Windows never asked me for a license key.  From examining the registry, it appears that a valid license key is installed at boot time.  On two instances of image ami-b53cd8dc (ec2-public-windows-images/Server2003r2-i386-anon-v1.01 for i386) I got exactly the same key.

Likewise, on two different instances of ami-7b2bcf12 (ec2-public-windows-images/Server2003r2-x86_64-anon-v1.00 or x64), I got the same license key, though not the same key as the i386 image.

This tells me that the license key is probably baked into the image. It’s also possible that these particular license keys are unique to my account. If someone else wants to compare keys, it’d be an interesting experiment.

Either way, the extra 2.5 cents per hour on the small instance must go to Microsoft to pay for license rental.
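That rental adds up. A quick back-of-the-envelope in Python (just arithmetic on the 2.5 cent figure above):

```python
# Windows premium on a small instance, per the pricing difference above.
hourly_premium = 0.025                 # USD per hour

monthly = hourly_premium * 24 * 30     # one instance, running flat out
yearly = hourly_premium * 24 * 365

print(round(monthly, 2))               # 18.0
print(round(yearly, 2))                # 219.0
```

About $18 a month per instance: cheaper than buying a Server 2003 license outright for a short-lived instance, but worth watching if you run a fleet around the clock.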


Windows on EC2, From a Mac

| Comments

It may be a bit perverse, but I wanted to hit a Windows EC2 instance from my Mac. After a little hitch getting started, I got it to work. There are a few quirks about accessing Windows instances, though.

First off, SSH is not enabled by default. You’ll need to use remote desktop to access your instance. Remote desktop uses port 3389, so the first step is to create a new security group for Windows desktop access:

$ ec2-add-group windows -d 'Windows remote desktop access'
GROUP    windows    Windows remote desktop access

Then, allow access to port 3389 from your desired origin. I’m allowing it from anywhere, which isn’t a great idea, but I’m on the road a lot. I never know what the hotel’s network origin will be.

$ ec2-authorize windows -p 3389 -P tcp
GROUP        windows    
PERMISSION        windows    ALLOWS    tcp    3389    3389    FROM    CIDR    0.0.0.0/0

Obviously, you could add that permission to any existing group that you already use.
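As an aside, that 0.0.0.0/0 CIDR block really does mean "from anywhere." A quick sanity check using Python's stdlib ipaddress module (just an illustration, not part of the EC2 tooling; the hotel address is a made-up example):

```python
import ipaddress

# The rule we authorized: tcp/3389 from 0.0.0.0/0
anywhere = ipaddress.ip_network("0.0.0.0/0")

# Whatever address the hotel hands out tonight (hypothetical example)
hotel_ip = ipaddress.ip_address("203.0.113.9")

print(hotel_ip in anywhere)            # True
print(anywhere.num_addresses)          # every IPv4 address
```

A tighter rule would name your office or home network's CIDR instead, which is why opening 3389 to the world isn't a great idea.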

There’s a bit of a song and dance to log in. Where Linux instances typically use SSH with public-key authentication, Windows Server requires a typed password. Amazon has come up with a reasonable, but slightly convoluted, way to extract a randomized password.

You will need to start your instance in the new security group and with a keypair. The docs could be a little clearer here: you provide the name of the keypair as it was registered with EC2, not a file path. The first few times I tried this, I gave it the path of the file containing the keypair, which doesn’t work.

$ ec2-describe-keypairs
KEYPAIR    devkeypair    02:10:65:9e:51:73:7e:93:bd:30:e2:5d:91:03:d5:e1:d4:0e:c0:f4
$ ec2-run-instances ami-782bcf11 -g windows -k devkeypair
RESERVATION    r-82429ceb    001356815600    windows
INSTANCE    i-f172db98    ami-782bcf11            pending    devkeypair    0        m1.small    2008-10-23T20:01:36+0000    us-east-1a            windows

After all that, and waiting through a Windows boot cycle, you can access the Windows desktop through RDP.

What’s that? You don’t have an RDP client, because you’re a Mac user? I like CoRD for that. I also saw a lot of references to rdesktop, which is available through Darwin Ports. (For today, I wasn’t prepared to install Ports just to try out the Windows EC2 instance!)

Extract the public IP address of your instance:

$ ec2-describe-instances
RESERVATION    r-82429ceb    001356815600    windows
INSTANCE    i-f172db98    ami-782bcf11    ec2-75-101-252-238.compute-1.amazonaws.com    domU-12-31-39-02-48-31.compute-1.internal    running    devkeypair    0        m1.small    2008-10-23T20:01:36+0000    us-east-1a        windows

Fire up CoRD and paste the IP address into "Quick Connect".

Well, now what? Obviously, you’ll use "Administrator" as the username, but what’s the password? There’s a new command in the latest release of ec2-api-tools called "ec2-get-password".

$ ec2-get-password i-f172db98 -k keys/devkeypair.pem
edhnsNG1J5

Note that this time, I’m using the path of my keypair file. EC2 uses this to decrypt the password from the instance’s console output. At boot time, Windows prints out the password, encrypted with the public key from the keypair you named when starting the instance.
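The scheme is ordinary public-key encryption. Here's a toy sketch in Python with tiny textbook RSA numbers, purely to illustrate the flow; the real thing uses the full-size RSA key behind your .pem file, with proper padding, and these primes would be laughably insecure:

```python
# Toy RSA with textbook parameters (p=61, q=53, e=17). NOT real crypto.
p, q = 61, 53
n = p * q                              # public modulus
e = 17                                 # public exponent (in the keypair EC2 holds)
d = pow(e, -1, (p - 1) * (q - 1))      # private exponent (in your .pem file)

def boot_time_encrypt(m):
    # What the Windows instance does: encrypt the generated password
    # with the public key and print it to the console output.
    return pow(m, e, n)

def get_password_decrypt(c):
    # What ec2-get-password does locally with your private key file.
    return pow(c, d, n)

blob = boot_time_encrypt(1234)
print(get_password_decrypt(blob))      # 1234
```

The nice property is that the plaintext password never leaves the instance: anyone can read the console output, but only the holder of the private key can recover the password.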

Success at last: fully logged in to my virtual Windows server from my Mac desktop.

Don’t Break My Heart, EC2!

| Comments

I’m a huge booster of AWS and EC2. I have two talks about cloud computing, and one that’s pretty specific to AWS, on the No Fluff, Just Stuff traveling symposium.

With today’s announcement about EC2 coming out of beta, and about Windows support, I wanted to try out a Windows server on EC2.

Heartbreak!

ec2-describe-images -a | grep windows
IMAGE    ami-782bcf11    ec2-public-windows-images/Server2003r2-i386-anon-v1.00.manifest.xml    amazon    available    public        i386    machine        
IMAGE    ami-792bcf10    ec2-public-windows-images/Server2003r2-i386-EntAuth-v1.00.manifest.xml    amazon    available    public        i386    machine        
IMAGE    ami-7b2bcf12    ec2-public-windows-images/Server2003r2-x86_64-anon-v1.00.manifest.xml    amazon    available    public        x86_64    machine        
IMAGE    ami-7a2bcf13    ec2-public-windows-images/Server2003r2-x86_64-EntAuth-v1.00.manifest.xml    amazon    available    public        x86_64    machine        
IMAGE    ami-3934d050    ec2-public-windows-images/SqlSvrExp2003r2-i386-Anon-v1.00.manifest.xml    amazon    available    public        i386    machine        
IMAGE    ami-0f34d066    ec2-public-windows-images/SqlSvrExp2003r2-i386-EntAuth-v1.00.manifest.xml    amazon    available    public        i386    machine        
IMAGE    ami-8135d1e8    ec2-public-windows-images/SqlSvrExp2003r2-x86_64-Anon-v1.00.manifest.xml    amazon    available    public        x86_64    machine        
IMAGE    ami-9835d1f1    ec2-public-windows-images/SqlSvrExp2003r2-x86_64-EntAuth-v1.00.manifest.xml    amazon    available    public        x86_64    machine        
IMAGE    ami-6834d001    ec2-public-windows-images/SqlSvrStd2003r2-x86_64-Anon-v1.00.manifest.xml    amazon    available    public        x86_64    machine        
IMAGE    ami-6b34d002    ec2-public-windows-images/SqlSvrStd2003r2-x86_64-EntAuth-v1.00.manifest.xml    amazon    available    public        x86_64    machine        
IMAGE    ami-cd8b6ea4    khaz_windows2003srvEE/image.manifest.xml    602961847481    available    public        i386    machine        

mtnygard@donk /var/tmp/nms $ ec2-run-instances ami-792bcf10
Server.InsufficientInstanceCapacity: Insufficient capacity.
mtnygard@donk /var/tmp/nms $ ec2-run-instances ami-792bcf10
Server.InsufficientInstanceCapacity: Insufficient capacity.
mtnygard@donk /var/tmp/nms $ ec2-run-instances ami-792bcf10 -z us-east-1a
Server.InsufficientInstanceCapacity: Insufficient capacity.
mtnygard@donk /var/tmp/nms $ ec2-run-instances ami-792bcf10 -z us-east-1b
Server.InsufficientInstanceCapacity: Insufficient capacity.
mtnygard@donk /var/tmp/nms $ ec2-run-instances ami-792bcf10 -z us-east-1c
Server.InsufficientInstanceCapacity: Insufficient capacity.

Ack! Insufficient capacity?! That’s not supposed to happen. Wait a second… let me try my own image.

mtnygard@donk /var/tmp/nms $ ec2-describe-images
IMAGE    ami-8a0beee3    com.michaelnygard/nms-base-v1.manifest.xml    001356815600    available    private        i386    machine        
mtnygard@donk /var/tmp/nms $ ec2-run-instances ami-8a0beee3
RESERVATION    r-0c4a9465    001356815600    default
INSTANCE    i-8e79d0e7    ami-8a0beee3            pending        0        m1.small    2008-10-23T17:25:21+0000    us-east-1c        
mtnygard@donk /var/tmp/nms $ ec2-run-instances ami-792bcf10
Server.InsufficientInstanceCapacity: Insufficient capacity.

Very interesting. It looks like there’s enough capacity to run all the Linux-based images, but not enough for Windows?

Seems like there might be some contractual limit on how many Windows licenses Amazon is allowed to rent out. I would also infer some serious pent-up demand to eat them all up this quickly.

Or maybe it’s just a glitch. We’ll see.

Update [1:15 PM] I was just able to start five instances. Could be fluctuations in demand, or it could be clearing of a glitch. It’s always hard to tell what’s really happening inside the cloud.

Update [2:50 PM] My plaintive post in the AWS forums got a very quick response. The inscrutable wizard JeffW posted “we’re working on it” and “it’s fixed” messages just 3 minutes apart. We’ll probably never know quite what was going on.