Amazon Web Services (EC2 & S3) – The Future of Data Centre Computing? Part 1

On the back of an article published by CNet news.com on Sun CTO Greg Papadopoulos’s assertion that “the world only needs five computers”, DSI has taken the time to step back and look at what Amazon, surely a strong candidate for one of the five, is offering on the market today in terms of processing power on demand.

[tag]EC2[/tag] (Elastic Compute Cloud) and [tag]S3[/tag] (Simple Storage Service) are two of Amazon’s service offerings as part of its [tag]Amazon Web Services[/tag] suite.

EC2 offers almost unlimited computing power on demand, using predefined Amazon Machine Images (AMIs) as the blueprints for what each on-demand machine runs. Instances launched from these AMIs are controlled via a set of web service calls, which come pre-packaged in a downloadable command line utility or in an EC2 library written in Java, Python or Ruby that you can plug into your own code base.

S3 offers, quite simply, storage, and lots of it. Think of S3 as a huge HashMap, using keys to map to values in a bucket. Each object can be from 1 byte to 5 gigabytes in size. It doesn’t help to think of S3 in terms of a relational database; it is better to think of it as an object dump, one that allows you to access any amount of data from any location, with Amazon’s 99.99% availability design requirement to back all this up.
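
To make the HashMap analogy concrete, here is a minimal in-memory sketch in Python. It does not talk to S3 at all: the `Bucket` class, its `put`/`get`/`keys` methods and the example key are hypothetical names chosen for illustration, and treating 5 gigabytes as 5 GiB is an assumption. The only things taken from the description above are the key-to-value model and the 1 byte to 5 gigabyte size range.

```python
class Bucket:
    """Toy in-memory stand-in for an S3 bucket (hypothetical, not the real API)."""

    MIN_SIZE = 1                 # bytes, per the size range quoted above
    MAX_SIZE = 5 * 1024 ** 3     # 5 gigabytes, taken here as 5 GiB (assumption)

    def __init__(self, name):
        self.name = name
        self._objects = {}       # key -> bytes, exactly like a hash map

    def put(self, key, data):
        size = len(data)
        if not (self.MIN_SIZE <= size <= self.MAX_SIZE):
            raise ValueError("object size %d bytes is outside the allowed range" % size)
        self._objects[key] = bytes(data)

    def get(self, key):
        return self._objects[key]    # raises KeyError for unknown keys

    def keys(self):
        return sorted(self._objects)


bucket = Bucket("example-bucket")
bucket.put("images/petstore.manifest", b"<manifest/>")
print(bucket.get("images/petstore.manifest"))
```

There are no tables, joins or queries here, only whole objects stored and fetched by key, which is why the relational database comparison does not fit.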

Each machine instance ‘predictably provides the equivalent of a system with a 1.7Ghz x86 processor, 1.75GB of RAM, 160GB of local disk, and 250Mb/s of network bandwidth’. OK, it’s not lightning fast, but consider the possibilities of 10 or 20 of these servers with load balancing and clustering configured and you quickly get the idea that EC2 offers a highly scalable, secure (maybe, more on this later) and cost-effective infrastructure on which to host your application.

We ran some performance benchmarks on an EC2 instance using Javolution, a high-performance Java library. [tag]Javolution[/tag] comes packaged with a set of common method calls that may be used to gauge the performance of a JDK, third-party library or hardware configuration. These methods include Quick Sort, String manipulation, Serialization and common Collection operations like put() and add(). The results are encouraging when compared against the same tests run on various platforms including a 2.0GHz dual-core Centrino laptop, a 3.0GHz P4 PC and a 3.2GHz hyper-threaded PC. All tests were performed on Windows XP SP2 running JDK 1.5.0_10.

EC2 Instance Performance

Javolution – Java(TM) Solution for Real-Time and Embedded Systems
Version 4.2.1 (J2SE 1.5+) January 2 2007 (http://javolution.org)

Run benchmark… (shortest execution times are displayed)

//////////////////////////////////
// Package: javolution.context //
//////////////////////////////////

— Concurrent Context —
Quick Sort 10000 elements – Concurrency disabled: 22.40 ms
Quick Sort 10000 elements – Concurrency (1) enabled: 22.46 ms

— Heap versus Stack Allocation (Pool-Context) —
Small object heap creation: 21.00 ns
Small object stack creation: 47.00 ns
char[256] heap creation: 948.0 ns
char[512] heap creation: 1.859 us
char[256] stack creation: 64.00 ns
char[512] stack creation: 65.00 ns

//////////////////////////////
// Package: javolution.text //
//////////////////////////////

— Primitive types formatting —

StringBuffer.append(int): 413.9 ns
TextBuilder.append(int): 215.2 ns
StringBuffer.append(long): 1.608 us
TextBuilder.append(long): 837.4 ns
StringBuffer.append(float): 1.808 us
TextBuilder.append(float): 1.081 us
StringBuffer.append(double): 7.224 us
TextBuilder.append(double): 3.631 us

2.0GHz Dual-Core Centrino Laptop Performance

Javolution – Java(TM) Solution for Real-Time and Embedded Systems
Version 4.2.1 (J2SE 1.5+) January 2 2007 (http://javolution.org)

Run benchmark… (shortest execution times are displayed)

//////////////////////////////////
// Package: javolution.context //
//////////////////////////////////

— Concurrent Context —
Quick Sort 10000 elements – Concurrency disabled: 11.06 ms
Quick Sort 10000 elements – Concurrency (2) enabled: 6.634 ms

— Heap versus Stack Allocation (Pool-Context) —
Small object heap creation: 17.04 ns
Small object stack creation: 38.83 ns
char[256] heap creation: 595.3 ns
char[512] heap creation: 1.039 us
char[256] stack creation: 56.15 ns
char[512] stack creation: 55.87 ns

//////////////////////////////
// Package: javolution.text //
//////////////////////////////

— Primitive types formatting —

StringBuffer.append(int): 251.6 ns
TextBuilder.append(int): 145.5 ns
StringBuffer.append(long): 653.0 ns
TextBuilder.append(long): 324.1 ns
StringBuffer.append(float): 718.1 ns
TextBuilder.append(float): 434.2 ns
StringBuffer.append(double): 3.073 us
TextBuilder.append(double): 1.314 us

— Primitive types parsing —

Integer.parseInt(String): 303.0 ns
TypeFormat.parseInt(String): 136.9 ns
Long.parseLong(String): 801.4 ns
TypeFormat.parseLong(String): 399.2 ns
Float.parseFloat(String): 459.2 ns
TypeFormat.parseFloat(String): 465.3 ns
Double.parseDouble(String): 1.548 us
TypeFormat.parseDouble(String): 850.4 ns
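
To put the two listings side by side, the short Python sketch below divides the EC2 time by the laptop time for a handful of rows. The figures are copied from the output above; the selection of rows, the `results` dictionary and the `slowdown` helper are my own illustrative choices, not part of Javolution.

```python
# Shortest execution times copied from the two listings above,
# converted to nanoseconds so that the units match.
NS, US, MS = 1.0, 1e3, 1e6

results = {
    # test name: (EC2 instance, 2.0GHz dual-core laptop)
    "Quick Sort 10000 elements (concurrency disabled)": (22.40 * MS, 11.06 * MS),
    "char[512] heap creation": (1.859 * US, 1.039 * US),
    "StringBuffer.append(int)": (413.9 * NS, 251.6 * NS),
    "TextBuilder.append(int)": (215.2 * NS, 145.5 * NS),
    "StringBuffer.append(double)": (7.224 * US, 3.073 * US),
    "TextBuilder.append(double)": (3.631 * US, 1.314 * US),
}


def slowdown(ec2_ns, laptop_ns):
    """How many times longer the EC2 instance took than the laptop."""
    return ec2_ns / laptop_ns


for name, (ec2_ns, laptop_ns) in results.items():
    print("%-50s EC2 takes %.2fx the laptop time" % (name, slowdown(ec2_ns, laptop_ns)))
```

For these six rows the EC2 instance takes roughly 1.5 to 2.8 times as long as the laptop on single-threaded work. The laptop’s second core widens the gap on the concurrent Quick Sort, which the sketch leaves out.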

Getting Started

The documentation that Amazon provides for getting started with EC2 and S3 is very good, so I don’t think I need to go over the details here. It basically involves setting up an Amazon Web Services account, registering for S3 and then registering for EC2. It is worth noting that EC2 is in beta, and therefore has limited availability.

Once you have gone through the process of setting up your accounts, getting your secure access keys and downloading the command line utilities, you are ready to set up the HelloWorld of AWS.

It is definitely worth going through the getting started documentation. We were up and running with our own AMI within a couple of hours. The documentation will introduce you to some of the core principles of EC2 at the ground level, including scalability, image bundling and security.

Each EC2 account has its own firewall. By default this firewall completely shuts out the outside world. You need to use the ec2-authorize command to open up the ports on your instances. So in the case of the HelloWorld instance, which listens for HTTP requests on port 80, I had to run ec2-authorize default -p 80. Allowing network access via SSH requires authorizing port 22.
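
The default-deny behaviour described above can be pictured with a small Python model. This is only a sketch of the idea, not the EC2 API: the `SecurityGroup` class and its `authorize` and `allows` methods are hypothetical names, and the model ignores protocols and source addresses.

```python
class SecurityGroup:
    """Toy model of a default-deny firewall group (hypothetical, not the EC2 API)."""

    def __init__(self, name):
        self.name = name
        self._open_ports = set()    # empty set: nothing gets in by default

    def authorize(self, port):
        """Plays the role of running `ec2-authorize <group> -p <port>`."""
        if not 1 <= port <= 65535:
            raise ValueError("invalid TCP port: %r" % port)
        self._open_ports.add(port)

    def allows(self, port):
        return port in self._open_ports


default = SecurityGroup("default")
print(default.allows(80))     # False: everything is closed at first
default.authorize(80)         # HTTP for the HelloWorld instance
default.authorize(22)         # SSH access
print(default.allows(80))     # True
print(default.allows(8080))   # False: never authorized
```

The point of the model is that every open port is an explicit decision; anything you forget to authorize stays closed.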

Try it out! You will see for yourself how straightforward it is.

What Next?

So we got HelloWorld working; what next? You would think that with all this computing power and storage capacity at your fingertips, finding a practical application would be a piece of cake! In fact, we have struggled somewhat to find one, because the applications that we work with on a day-to-day basis require some kind of database, which in some cases contains sensitive data. EC2 does not provide static IP addresses, so when you start an instance it is essentially an untrusted agent that would ordinarily be blocked by any corporate network. These are a couple of the issues that we needed to resolve for ourselves before we could see real benefit to our business.
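
The static IP problem is easy to see with a sketch of the kind of address allowlist a corporate firewall might apply. Everything here is assumed for illustration: the `TRUSTED_NETWORKS` list, the `is_trusted` helper and the addresses, which come from the ranges reserved for documentation.

```python
import ipaddress

# Networks the corporate firewall trusts (made-up documentation range).
TRUSTED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_trusted(address):
    """Return True if the address falls inside any trusted network."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in TRUSTED_NETWORKS)


print(is_trusted("192.0.2.15"))    # True: a known, fixed internal address
print(is_trusted("203.0.113.7"))   # False: an address handed out at instance start
```

Because a new instance may come up on a different address each time, there is nothing stable to add to such a list in advance.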

To that end, we put together a roadmap covering four areas that we thought were core to any successful application running on EC2/S3:

· Deployment; deploy a working web application on Tomcat
· Scalability; introduce clustering and load balancing, dynamically scale AMIs
· Security
· Integration

Deployment

I guess what we were trying to prove here is that working with an AMI is as straightforward as working with a local environment; the same principles apply. We didn’t really want to have to create our own self-contained web application, and we wanted something with a standardised look and feel. We do quite a bit of Spring-based development in-house, so we looked at the [tag]Spring 2.0 JPetstore[/tag] application as a viable candidate; it is a standard MVC-type web application bundled with an in-memory Hypersonic SQL database. Perfect!

To run this application, we had to install a JDK and a Tomcat instance; both are readily available on the Internet. I did not download them directly to the running instance. Instead, I downloaded them to my local workstation and moved them to a publicly available FTP server, from which I transferred them by FTP to the running instance. Once they were copied across, I extracted the Tomcat files and installed the JDK. I did likewise with the latest Spring distribution.

At this point it is worth noting that any changes made to a running instance are short-lived; if you want these changes to persist in the long term, you will need to rebundle your image files and register them as a new AMI with AWS (this process is well documented in the EC2 getting started documentation).

Once everything had been installed, it was simply a matter of deploying the petstore onto Tomcat and configuring Tomcat to listen on port 80 for HTTP requests. This was as simple as doing it on your local workstation.

Once I was happy with the installation and configuration of the Tomcat server, and had made sure that the petstore landing page was accessible, I rebundled the AMI as a new image and uploaded it to S3 for future use. Now, whenever I need a new petstore with its own dedicated computing resources, I can simply start up a new EC2 instance using the saved image.
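
The relationship between an image and its running instances can be sketched as follows. This is a toy Python model under my own assumptions, not the real AMI format or bundling tools: `MachineImage`, `Instance`, `install` and `rebundle` are hypothetical names, and a file system is reduced to a dictionary of paths.

```python
class MachineImage:
    """Fixed snapshot of a file system (toy model, not the real AMI format)."""

    def __init__(self, files):
        self._files = dict(files)

    def files(self):
        return dict(self._files)     # hand out a copy so the image never changes


class Instance:
    """A running machine started from an image."""

    def __init__(self, image):
        self.files = image.files()   # every instance starts from a fresh copy

    def install(self, path, content):
        self.files[path] = content   # visible only on this running instance

    def rebundle(self):
        return MachineImage(self.files)   # save the current state as a new image


base = MachineImage({"/etc/hostname": "hello-world"})
first = Instance(base)
first.install("/opt/tomcat/conf/server.xml", "port=80")

lost = Instance(base)                # started from the old image
kept = Instance(first.rebundle())    # started from the rebundled image
print("/opt/tomcat/conf/server.xml" in lost.files)   # False
print("/opt/tomcat/conf/server.xml" in kept.files)   # True
```

Anything installed on `first` disappears with it unless it is rebundled, which is why the rebundled petstore image is the one worth keeping.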

Now, I understand that there is no practical use for being able to start any number of petstore instances, but it proves that the same is possible for any standard Java application.

Summary

What have we achieved so far? Well, we managed to deploy and access a standard Java application on an EC2 instance. Not exactly groundbreaking stuff, but we have accomplished one of the steps on our roadmap to discovering EC2 and its potential.

S3 has some well-documented success stories, but for now EC2 has none. I think this is because people are struggling to find a suitable application for this kind of facility. Maybe when we have further explored the three other areas on our roadmap we will have a better understanding of what can be achieved with EC2.

At the moment I am playing around with getting Tomcat clustering working, even though Amazon does not currently support multicasting for EC2. [tag]Terracotta Sessions[/tag] looks promising, and I will let you know how I get on in the next post.

Jay

You may also be interested in:

Clustering Tomcat Instances on EC2 with Terracotta Sessions

and

Setting up a Gnome Desktop Environment on EC2 and Access Remotely with FreeNx

  1. #1 by Paul Browne - Technology in Plain English on January 30, 2007 - 9:11 am

    Good article.

    Do you know of any tools that allow you to build your S3 / EC2 image locally, then deploy in one go?

    Paul

  2. #2 by jay on January 31, 2007 - 9:46 am

    Hi Paul,

    On the AWS forums, one guy from a company called Enomaly (they have created an EC2 management console) mentioned a tool called qemu-img which supposedly can convert VMware images to AMIs. I guess in theory you could convert any image created with VMware Workstation to an AMI; I don’t know, because I haven’t tried yet, but I will when I get some time.

    Let me know how you get on with this. It might be worth posting a question on the aws forum about this to see if people have had any success with it.

    Jay

  3. #3 by Colum Sisk on March 30, 2007 - 3:36 pm

    Very interesting stuff.

    This article covers a real-world example of using S3.

    http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=9015143&intsrc=hm_list
