Wide Awake Developers


Amazon Blows Away Objections

Amazon must have been burning more midnight oil than usual lately.

Within the last two weeks, they've announced three new features that basically eliminate any remaining objections to their AWS computing platform.

Elastic IP Addresses 

Elastic IP addresses solve a major problem on the front end.  When an EC2 instance boots up, the "cloud" assigns it a random IP address. (Technically, it assigns two: one external and one internal.  For now, I'm only talking about the external IP.) With a random IP address, you're forced to use some kind of dynamic DNS service such as DynDNS. That lets you update your DNS entry to connect your long-lived domain name with the random IP address.

Dynamic DNS services work pretty well, but not universally well.  For one thing, there is a small amount of delay.  Dynamic DNS works by setting a very short time-to-live (TTL) on the DNS entries, which instructs intermediate DNS servers to cache the entry for only a few minutes.  Even when that works, you still have a few minutes of downtime whenever you need to point your DNS name at a new IP address.  And for some parts of the Net, dynamic DNS doesn't work well at all, usually because some ISP ignores the TTL and caches the entries for much longer.
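You can see the mechanism with a quick dig. A dynamically managed record carries a tiny TTL (the record and values below are made up for illustration):

$ dig +noall +answer www.example.com
www.example.com.    60    IN    A    75.101.158.25

That "60" is the TTL in seconds; a well-behaved resolver throws the answer away after a minute and asks again.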

Elastic IP addresses solve this problem. You request an elastic IP address through a Web Services call.  The easiest way is with the command-line API:

$ ec2-allocate-address
ADDRESS    75.101.158.25   

Once the address is allocated, you own it until you release it. At this point, it's attached to your account, not to any running virtual machine. Still, that's good enough to go update your domain registrar with the new address. After you start up an instance, you can attach the address to the machine. If the machine goes down, the address is detached from that instance, but you still "own" it.
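Attaching it is one more call through the same command-line tools. A quick sketch, assuming the address allocated above; the instance ID here is made up:

$ ec2-associate-address -i i-3ea74257 75.101.158.25
ADDRESS    75.101.158.25    i-3ea74257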

So, for a failover scenario, you can reassign the elastic IP address to another machine, leave your DNS settings alone, and all traffic will now come to the new machine.

Now that we've got elastic IPs, there's just one piece missing from a true HA architecture: load distribution. With just one IP address attached to one instance, you've got a single point of failure (SPOF). Right now, there are two viable options to solve that. First, you can allocate multiple elastic IPs and use round-robin DNS for load distribution. Second, you can attach a single elastic IP address to an instance that runs a software load balancer: pound, nginx, or Apache+mod_proxy_balancer. (It wouldn't surprise me to see Amazon announce an option for load-balancing-in-the-cloud soon.) You'd run two of these, with the elastic IP attached to one at any given time. Then, you need a third instance monitoring the other two, ready to flip the IP address over to the standby instance if the active one fails. (There are already some open-source and commercial products to make this easy, but that's the subject for another post.)
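To make that concrete, here's a minimal sketch of what that third, monitoring instance might do. The instance IDs are invented, it assumes the EC2 command-line tools are installed and configured, and a real watchdog would need retries, alerting, and protection against flapping:

#!/bin/sh
# Watchdog sketch: flip the elastic IP to the standby if the active box stops answering.
ELASTIC_IP=75.101.158.25
ACTIVE=i-aaaa1111        # instance currently holding the elastic IP
STANDBY=i-bbbb2222       # warm standby running the same load balancer config

while true; do
  # Probe the active instance through the elastic IP; any HTTP answer counts as alive.
  if ! curl --silent --max-time 5 "http://$ELASTIC_IP/" > /dev/null; then
    # No answer: move the address to the standby and swap roles.
    ec2-disassociate-address "$ELASTIC_IP"
    ec2-associate-address -i "$STANDBY" "$ELASTIC_IP"
    TMP=$ACTIVE; ACTIVE=$STANDBY; STANDBY=$TMP
  fi
  sleep 30
done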

Availability Zones 

The second big gap that Amazon closed recently deals with geography.

In the first rev of EC2, there was absolutely no way to control where your instances were running. In fact, there wasn't any way inside the service to even tell where they were running. (You had to resort to traceroutes or geo-mapping of the IPs.) That's a problem if you need high availability, because you really want your machines in more than one location.

Availability Zones let you specify where your EC2 instances should run. You can get a list of them through the command-line (which, let's recall, is just a wrapper around the web services):

$ ec2-describe-availability-zones
AVAILABILITYZONE    us-east-1a    available
AVAILABILITYZONE    us-east-1b    available
AVAILABILITYZONE    us-east-1c    available

Amazon tells us that each availability zone is built independently of the others. That is, they might be in the same building or separate buildings, but they have their own network egress, power systems, cooling systems, and security. Beyond that, Amazon is pretty opaque about the availability zones. In fact, not every AWS user will see the same availability zones. They're mapped per account, so "us-east-1a" for me might map to a different hardware environment than it does for you.

How do they come into play? Pretty simply, as it turns out. When you start an instance, you can specify which availability zone you want to run it in.
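For example (the AMI and keypair names here are placeholders):

$ ec2-run-instances ami-1a2b3c4d -k my-keypair -z us-east-1a

Leave off the -z flag and the cloud picks a zone for you.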

Combine these two features, and you get a bunch of interesting deployment and management options.

Persistent Storage

Storage has been one of the most perplexing issues with EC2. Simply put, anything you stored to disk while your instance was running would be lost when you restart the instance. Instances always go back to the bundled disk image stored on S3.

Amazon has just announced that they will support persistent storage in the near future. A few lucky users get to try it out now, in its pre-beta incarnation.

With persistent storage, you can allocate space in chunks from 1 GB to 1 TB. That's right, you can make one web service call to allocate a freaking terabyte! Like IP addresses, storage is owned by your account, not by an individual instance. Once you've started up an instance (a MySQL server, say), you attach the storage volume to it. To the virtual machine, the storage looks just like a block device, so you can use it raw or format it with whatever filesystem you want.
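The exact commands aren't public yet, but judging from the pattern of the existing tools, the workflow will presumably look something like this (command names, IDs, and output here are my guesses, not the announced API):

$ ec2-create-volume --size 100 -z us-east-1a
VOLUME    vol-4d826724    100    us-east-1a    creating

$ ec2-attach-volume vol-4d826724 -i i-3ea74257 -d /dev/sdh
ATTACHMENT    vol-4d826724    i-3ea74257    /dev/sdh    attaching

Then, on the instance itself, you'd format and mount it like any other block device:

$ mke2fs -j /dev/sdh
$ mkdir /data && mount /dev/sdh /data

Note the availability zone on the volume: a volume will presumably have to live in the same zone as the instance it attaches to.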

Best of all, because this is basically a virtual SAN, you can do all kinds of SAN tricks, like snapshot copies for backups to S3.
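Again guessing at the eventual tooling, a snapshot ought to be a single call along these lines (snapshot ID invented):

$ ec2-create-snapshot vol-4d826724
SNAPSHOT    snap-6e8d7f05    vol-4d826724    pending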

Persistent storage done this way obviates some of the dodgy workarounds that have sprung up, like FUSE-over-S3 or the S3 storage engine for MySQL.

SimpleDB is still there, and it's still much more scalable than plain old MySQL data storage, but we've got scores of libraries for programming with relational databases, and very few that work with key-value stores. For most companies, and for the foreseeable future, programming to a relational model will be the easiest thing to do. This announcement really lowers the barrier to entry even further.


With these announcements, Amazon has cemented AWS as a viable computing platform for real businesses.

Comments

I recently purchased your Release It! book and really enjoyed it.

My question is: when dealing with a service like AWS, how do you know how to deal with issues like IOExceptions in strange places, vendor APIs that silently block, etc.? Is the AWS documentation detailed enough to determine what to do, is it a trial and error process, is it ...?

For the most part, deploying an application on AWS does not make these problems any worse than any other deployment mechanism.

Your applications will generally not be interacting with AWS itself. Instead, you'll probably run a very familiar three-tier stack of servers: web, app, and database. I can see a few scenarios, though, where you'd write code that talks to AWS directly.

One would be automatic monitoring and dynamic sizing. In this case, I strongly suggest decoupling your primary application from the "exoskeleton" that handles dynamic resource allocation. So, expose latency measurements or demand stats via JMX, log files, or (ugh) SNMP. Then have the monitoring app pull those variables and make decisions about resource allocation. The monitoring app can make direct SOAP calls to AWS, with client stubs generated from AWS' WSDL file.

The second place would be direct manipulation of S3 buckets, SimpleDB tables, or SQS. These are all web services calls, so they'll have the same safety attributes as the HTTP client library you use. For direct calls, I recommend the Jakarta Commons HttpClient library from Apache. It gives you the most control over timeouts and failure modes.

Finally, you might use one of the mushroom crop of language-specific wrapper libraries that have sprung up all over the place. I haven't done a survey of these yet to see how production-ready they all are, but that's a good idea for a paper, if I can find time enough for it.

Cheers,
-Michael Nygard

There is one more potential minor objection. Has Amazon mentioned whether the persistent storage volumes are, at a hardware level, individual drives or mirrored/RAID drives? I would prefer hardware-based support for data redundancy. If I were building a physical rack for a website, I wouldn't dream of storing data on individual drives. Sure, it's possible to replicate data across drives or use software-based RAID, but my personal preference would be to have it happen in hardware. I understand it could cost twice as much as single-drive storage, but I'm okay with that because I'd be allocating twice as many volumes if I had to roll my own solution.

Big deal, this is all stuff Joyent has been doing for a while now.

I believe there is one thing that's incorrect in the post above. If you restart your instance, your storage does not go away. As I understand it, you only lose your storage if you terminate your instance, which means you have effectively deallocated the entire instance (just like canceling your service with a VPS vendor). The other way to lose your storage is if your instance crashes (which means the hard drive crashed on the actual hardware you're running on).

See http://docs.amazonwebservices.com/AWSEC2/2008-02-01/DeveloperGuide/General_Information.html

Instance stores survive reboots. They just die when the instance is terminated, which really is a bad term because you're effectively deleting the instance, but you don't have to do that to reboot it.
