Eight Ways that Cloud Computing Will Change Business is a wonderful post by Dion Hinchcliffe. The synopsis of this article is that large businesses are laggards with respect to technology adoption for the simple reason that the cost of betting on the wrong horse is too high. However, sometimes new technologies are so compelling that this wait-and-see approach is trumped. According to the article:
"Cloud Computing is quickly beginning to shape up as one of these major changes and the hundreds of thousands of business customers of cloud offerings from Amazon, Salesforce, and Google, including a growing number of Fortune 500 companies, is showing both considerable interest and momentum in the space". The article goes on to spell out eight ways cloud computing will change business:
- Creation of a new generation of products and services
- New lightweight form of real-time partnerships and outsourcing with IT suppliers
- New awareness and leverage of the greater Internet and Web 2.0 in particular
- A reconciliation of traditional SOA with cloud and other emerging IT models
- The rise of new industry leaders and IT vendors
- More self-service IT from the business-side
- More tolerance for innovation and experimentation from business
- The slow-moving, dinosaur firms will have trouble keeping up with more nimble adopters and fast-followers
I have always argued that cloud computing will be defined by the bottom of the economic pyramid. Smaller businesses do not have existing and legalized corporate standards of quality, accountability, and security, and they can simply piggyback on the standards provided by the data centers on which they deploy. This provides them with a first mover advantage that doesn't waste energy trying to sell cloud computing solutions inside an already stressed IT organization of a large enterprise.
Secondly, consumers in many ways are much more adaptable than enterprises. I am using Google or Amazon or AT&T but I don't get bent out of shape if my service experiences a hiccup. Take cell phone service: if you insisted on 99.999% availability, like many enterprise customers seem to demand, you couldn't use a cell phone. However, everybody agrees that a cell phone is a net productivity improvement. It is this consumer, conditioned by an imperfect world, who is demanding new services for their iPhones, BlackBerries, and Pres and is willing to accept a less stringent SLA in exchange for lower cost and convenience. And there is a legion of startups that is willing to test out that appetite.
Brand loyalty in this connected world is non-existent for the simple reason that most services are multi-vendor anyways. You get a Nokia phone on a Verizon network connecting to a Real Rhapsody music service to satisfy your need for mobility. I switched from Yahoo search, to Google search, to Microsoft search in a matter of minutes simply because either their UI and/or their results provided a better fit for my sensibilities. I find it wonderful that after a decade of technology consolidation and stagnation we are back to a world of innovation and rapid expansion of new services. And I believe that it is the consumer that will define these services, not the enterprise.
The cloud has evolved from the managed hosting concept. With data centers like EC-2 making it easier to provision servers on-demand, elasticity can be built into the application to scale dynamically. Microsoft Azure provides a similar, and nicely integrated, platform for the Windows application world. But how well do these clouds hold up when demand is elastic for compute intensive workloads? The short of it? Not so well.
I found two papers that report on experiments that take Amazon EC-2 as IT fabric and deploy compute intensive workloads on it. They compare these results to the performance obtained from on-premise clusters that follow best-known practices for compute intensive workloads. The first paper uses the NAS benchmarks to get a broad sampling of high-performance computing workloads on the EC-2 cloud. They use the high-performance instances of Amazon and compare them to similar processor gear in a cluster at NCSA. The IT gear details are shown in the following table:
| | EC-2 High-CPU Cluster | NCSA Cluster |
|---|---|---|
| Compute Node | 7GB memory, 4 cores per socket, 2 sockets per server, 2.33GHz Xeon, 1600GB storage | 8GB memory, 4 cores per socket, 2 sockets per server, 2.33GHz Xeon, 73GB storage |
| Network Interconnect | Specific interconnect technology unknown | Infiniband network |
The NAS Parallel Benchmarks are a widely used set of programs designed to evaluate the performance of high performance computing systems. The suite mimics critical computation and data movement patterns important for compute intensive workloads.
Clearly, when the workload is confined to a single server the difference between the two compute environments is limited to the virtualization technology used and the effective available memory. In this experiment the difference between Amazon EC-2 and a best-known practice cluster is between 10% and 20% in favor of the non-virtualized server.
However, when the workload needs to reach across the network to other servers to complete its task the performance difference is striking, as is shown in the following figure.

Figure 1: NPB-MPI runtimes on 32 cores (= 4 dual socketed servers)
The performance difference ranges from 2x to 10x in favor of an optimized high-performance cluster. Since you pay for the time a job occupies the servers, Amazon is effectively up to ten times more expensive to operate for these workloads than your own optimized equipment would be.
The second paper addresses the cost adder of using cloud computing IT infrastructure for compute intensive workloads. In this experiment, they use a common workload for measuring the performance of a supercomputer, HPL, which is an acronym for High Performance LINPACK. HPL is relatively kind to a cluster in the sense that it does not tax the interconnect bandwidth/latency as much as other compute intensive workloads such as optimization, information retrieval, or web indexing. The experiment measures the average floating point operations (FLOPS) obtained divided by the average compute time used. This experiment shows an exponential decrease in performance with respect to the dollar cost of the clusters. This implies that if we double the cluster size, the FLOPS/sec for the money spent goes down.
The first paper has a wonderful graph that explains what is causing this weak scaling result.

This figure shows the bisection bandwidth of the Amazon EC-2 cluster and that of a best-known practice HPC cluster. Bisection bandwidth is the bandwidth between two equal parts of a cluster. It is a measure of how well-connected all the servers are to one another. The focus of typical clouds on providing a productive and high margin service pushes them into IT architectures that do not favor interconnect bandwidth between servers. Many clouds are commodity servers connected to a SAN, and the bandwidth is allocated to that path, not to the paths between servers. That is the opposite of what high performance clusters for compute intensive workloads have evolved to.
This means that for enterprise class problems, where the efficiency of IT equipment is a differentiator in solving the problems at hand, cloud IT infrastructure solutions are not well matched yet. However, for SMBs that are seeking mostly elasticity and on-demand use, cloud solutions still work since there are still monetary benefits to be extracted from deploying compute intensive workloads on Amazon or other clouds.
In my continued quest to build an operational model that properly accounts for the costs of different cloud web services, I have reached back to the visual vocabulary of operational analysis. If it was good enough to build BMC Software I figured it would be good enough for this task.
The following figure captures the typical resources in a modern data center. In the vocabulary of operational analysis we have servers and transactions, and the diagram depicts the read and write transactions going into different services such as filers or the Internet, and read responses coming out. If you were to build your own data center, these servers and services would reflect all your capital and operational expenditures.

Different data centers select different resources to monetize. This makes the comparison between different providers so difficult: they are all selling something different.
Let's start with Amazon as the baseline since AWS tries to monetize all the resources in its data center, except for the internal routers. The next diagram shows the resource costs that Amazon charges you when running an application on their data centers.

Now compare that with a second provider, GoGrid. GoGrid does not monetize the incoming internet connection into their data center. So if you have a workload that reads a lot of data from the internet, GoGrid is fantastic. Also, GoGrid does not use a filer in their architecture, instead giving the server its own local disk instance that is managed and maintained. This works very well for web applications but does not work well for running a distributed file system instance. So running Hadoop on GoGrid is not attractive. The following diagram depicts GoGrid's monetization strategy.

When you compare both diagrams it is clear that GoGrid is the better solution for running a web application server. On top of that, GoGrid offers free load balancers, which you would need to pay for separately on Amazon.
The visual vocabulary presented here makes it very easy to identify what types of workloads would fit on different cloud providers. It also shows you the high-cost items in the overall IT infrastructure you need to outsource your application.
To make the accounting complete, we also need a model of our workload that quantifies the storage, compute, and I/O requirements. For web application services the world of cloud solutions is well represented, but for utility computing this is not the case. The costs of filer and storage are significant and quickly become the overriding cost components for a workload. Furthermore, the fact that storage costs accumulate even when you are not computing makes the on-demand argument less genuine. Finally, the use of cpu instance hours is not good enough for utility computing. Using the electric grid as a comparison, I am consuming electrons, and pay accordingly. In proper utility computing I am consuming instructions and I/Os. These metrics are independent of the speed of the processors or filer on which I run and thus I do not need to guess what type of cpu-instance-hours I would consume. By providing instructions and I/Os as consumables, providers can differentiate on the basis of capacity or latency in the same way that electricity providers do. Without that compensation model, utility computing is a ways off IMHO.
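To make the contrast between the two compensation models concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the job size, the processor speeds, and both metered rates are hypothetical placeholders, and no provider I know of publishes prices in these units. The point is only that the instance-hour bill changes with processor speed while the metered bill does not.

```python
# Illustrative only: every rate and job size here is a hypothetical placeholder.

def instance_hour_bill(instructions: float, instr_per_second: float,
                       rate_per_cpu_hour: float) -> float:
    """Bill by elapsed cpu hours, which depends on how fast the processor is."""
    cpu_hours = instructions / instr_per_second / 3600.0
    return cpu_hours * rate_per_cpu_hour


def metered_bill(instructions: float, io_ops: float,
                 rate_per_giga_instruction: float,
                 rate_per_million_io: float) -> float:
    """Bill by consumed instructions and I/Os, independent of processor speed."""
    return (instructions / 1e9 * rate_per_giga_instruction
            + io_ops / 1e6 * rate_per_million_io)


JOB_INSTRUCTIONS = 3.6e13   # hypothetical job size
JOB_IO_OPS = 5e6            # hypothetical I/O count
CPU_HOUR_RATE = 0.80        # hypothetical $/cpu-hr

# The same job on a slow and a fast processor: 10 versus 5 cpu hours.
slow = instance_hour_bill(JOB_INSTRUCTIONS, 1e9, CPU_HOUR_RATE)
fast = instance_hour_bill(JOB_INSTRUCTIONS, 2e9, CPU_HOUR_RATE)

# The metered bill never asks which processor ran the job.
metered = metered_bill(JOB_INSTRUCTIONS, JOB_IO_OPS, 0.0002, 0.10)

print(f"instance-hours, slow cpu: ${slow:.2f}")
print(f"instance-hours, fast cpu: ${fast:.2f}")
print(f"metered instructions + I/O: ${metered:.2f}")
```

With instance hours I have to guess which processor I will land on before I can budget; with metered consumables the bill follows from the workload alone.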
Two articles that are just wonderfully expansive...
http://www.techdirt.com/articles/20070503/012939.shtml
http://alwayson.goingon.com/permalink/post/30283
I came across these articles while researching and thinking about SaaS and PaaS and what would be the best road forward for startups in that space. Salesforce may have blazed the trail but SugarCRM does most of what I am doing with Salesforce. Hosting SugarCRM on demand on Amazon would save me money over Salesforce. However, in the end it is not the SaaS CRM system that is the value, it is the data inside it and my internal business process surrounding that CRM data. I want the flexibility to take this data and process anywhere so that I can take advantage of available skill or innovation and extract more value out of the accumulated data.
Cloud computing exposes this fundamental problem of data movement. It was not perceived as much of a problem for on-premise applications due to the false impression that local data is always usable. To make cloud services ubiquitous this problem of data movement needs to get solved, and robust, free Open Source components will be developed to solve it since users will demand it.
GoGrid's Michael Sheehan just published his cloud computing predictions for 2009.
1- Clouds reduce the effect of the recession.
The basic argument is that since cloud computing is a more cost effective means to obtain IT services, cloud computing enables the IT budget to go further. But that would simply take money away from the IHVs and big consultancies, so a more careful study would need to be made to determine whether this is a zero-sum game or not. My thought here would be that the recession may accelerate the adoption of cloud computing so that consumers of IT spend less, but it will hurt the IHVs.
2- Broader depth of clouds
This prediction is the simple progression of a new technology that is getting adopted. More customers are coming in and all have slightly different requirements that the cloud providers will cater to. It is easier to do that with specialized solutions and thus we'll see a broadening of the features offered in clouds.
3- VC, money & long term viability
This is an interesting prediction from Michael: cloud aggregators will be funded and the other players in the stack will get squeezed. Cloud aggregators are companies like RightScale and Cassatt and there is no doubt in my mind that they will do well since leveraging cloud computes is still hard work. I personally think that the VCs are not going to play in this space because of the presence of large incumbents like IBM, Amazon, Google, HP, and Sun. Personally, I think the real innovation investments will come from the emerging markets since they have the most to gain from lower IT costs.
4- Cloud providers become M&A targets
This item reads as a prediction that the consolidation in the cloud space will accelerate in 2009. My prediction is contrarian in the sense that I think we'll see more specialized clouds show up to cater to very specific niches and thus we'll see a market segmentation first before we'll see a consolidation. For example, most clouds are web application centric, and putting up a web server is one feature that is widely supported. However, the financial industry has a broader need than just web servers, as do product organizations like Boeing and GE. I think there is a great opportunity to build specialized clouds for those customers as it can be piggybacked on supply chain integration so players like Tibco can come in. That is a very large market with very high value: much more interesting than a little $49/month hosted web server.
5- Hybrid solutions
On-premise and cloud solutions working together. That prediction is more of a look back, but it is a sign that cloud computing is accepted and that companies are actively planning how to leverage this new IT capability in their day to day operation.
6- Web 3.0
More tightly integrated Web 2.0? It clearly is all about the business or entertainment value. I really like what I am seeing in the data mining space where knowledge integration is creating opportunities for small players with deep domain experts to make a lot of money. Simply take a look at marketing intelligence: the most innovative solutions come from tiny players. I think this innovation will drive cloud computing for the next couple of years since it completely levels the playing field between SMBs and large enterprises. This makes domain expertise more valuable, and the SMBs are much more nimble and can now monetize that skill. Very exciting!
7- Standards and interoperability
Customers will demand it, incumbent cloud providers will fight it. I can't see Google and IBM giving up their closed systems, so the world will add another ETL layer to IT operations and spring to life some more consultants.
8- Staggered growth
A simple prediction that everything cloud will expand.
9- Technology advances at the cloud molecular level
This is an item dear to my heart: cloud optimized silicon. It is clear that a processor that works well in your iPhone will not be the right silicon for the cloud. There are many problems to be solved in cloud computing that only have a silicon answer, so we are seeing fantastic opportunities here. This innovation will be attenuated by the lack of liquidity in the western world but this provides amazing opportunities for the BRIC countries to develop centers of excellence that surpass the US. And 2009 will be the key year for this possible jump since the US market will be distracted trying to stay in cash till clarity improves. As they say, fortunes are made in recessions.
10- Larger Adoption
A good prediction to end with for a cloud computing audience: business will be good in 2009.
The next step was to select our benchmarks and calculate their costs. We extracted two workloads that are common to many product development companies: a regression workload that arises when a team collaborates on the same development task, and a technical workload when an individual is using computer models to generate new insight/knowledge.
The regression workload can be generated by a software design team developing a new application, a financial engineering team back testing new trading strategies, or a mechanical design team designing a new combustion engine that runs on alternative fuels.
The technical workload can be a new rendering algorithm to model fur on an animated character, or a new economic model that drives critical risk parameters in a trading strategy, or an acoustic characterization of an automobile cabin.
The first workload is characterized by a collection of tests that are run to guarantee correctness of the product during development. Our test case for a typical regression run is 1000 tests that run for an average of 15 minutes each. Each developer typically runs two such regressions per day, and for a 50 person design team this yields 100 regression runs per day. The total workload equates to roughly 1050 cpu hours per hour and would keep a 1000 processor cluster 100% occupied.
The second workload shifts the focus from capacity to capability. The computational task is a single simulation that requires 5 cpu hours to complete. The benchmark workload is the work created by a ten person research team that runs five simulations per day. Many of these algorithms can actually run in parallel and such a task could run in 30 minutes when executed in parallel on ten processors. Latency to solution is a major driver on R&D team productivity and this workload would have priority over the regression workload particularly during the work day. The total workload equates to roughly 31 cpu hours per hour because this workload runs just in the eight hour work day.
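The sizing arithmetic for both workloads can be checked with a few lines of Python. This is a sketch under two stated assumptions: the regression load is spread evenly over 24 hours, and each of the ten researchers runs five simulations per day (the reading that reproduces the roughly 31 cpu hours per hour quoted above).

```python
# Workload sizing from the figures quoted in the text.

def cpu_hours_per_day(tasks_per_day: float, cpu_hours_per_task: float) -> float:
    """Total cpu hours generated in one day by a stream of identical tasks."""
    return tasks_per_day * cpu_hours_per_task


# Regression workload: 1000 tests x 15 minutes per run,
# 50 developers x 2 runs each per day.
regression_run_cpu_hours = 1000 * 15 / 60
regression_runs_per_day = 50 * 2
regression_daily = cpu_hours_per_day(regression_runs_per_day,
                                     regression_run_cpu_hours)
# Assumption: the regression load is spread evenly over 24 hours.
regression_rate = regression_daily / 24

# Technical workload: 5 cpu hours per simulation.
# Assumption: each of the 10 researchers runs 5 simulations per day.
technical_daily = cpu_hours_per_day(10 * 5, 5)
# This workload runs only during the eight hour work day.
technical_rate = technical_daily / 8

print(f"regression: {regression_daily:.0f} cpu-hr/day, "
      f"{regression_rate:.1f} cpu-hr per hour")
print(f"technical:  {technical_daily:.0f} cpu-hr/day, "
      f"{technical_rate:.2f} cpu-hr per hour")
```

The regression figure comes out slightly below the roughly 1050 cpu hours per hour quoted above, which is consistent with that number being a rounded estimate.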
Running these two workloads on our cloud computing providers we get the following costs per day:
| Benchmark | Amazon | Rackspace/Mosso |
|---|---|---|
| Regression Workload | $25,075.17 | $18,250.25 |
| Knowledge Discovery | $265.09 | $230.13 |
The total cost of roughly $18-25k per day makes the regression workload too expensive for outsourcing to today's cloud providers. A 1000 processor on-premise x86 cluster costs roughly $10k/day including overhead and amortization. The cost of bulk computes like the regression workload needs to go down by at least a factor of 5 before cloud computing can bring in small and medium-sized enterprises. However, the technical workload at $250/day is very attractive to move to the cloud since this workload is periodic with respect to the development cycle, and it moves CapEx to OpEx, which frees up capital for other purposes.
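As a quick check on these numbers, the sketch below compares the daily regression costs from the table with the roughly $10k/day on-premise figure and shows where a factor of 5 price reduction would land. The dictionary and function names are my own; the dollar amounts are the ones quoted in this post.

```python
# Daily regression workload costs quoted in the table above.
cloud_per_day = {"Amazon": 25_075.17, "Rackspace/Mosso": 18_250.25}

# "Roughly $10k/day" for a 1000 processor on-premise x86 cluster.
ON_PREMISE_PER_DAY = 10_000.00


def premium_over_on_premise(cloud_cost: float,
                            on_premise_cost: float = ON_PREMISE_PER_DAY) -> float:
    """How many times more the cloud costs per day than the on-premise cluster."""
    return cloud_cost / on_premise_cost


for provider, cost in cloud_per_day.items():
    ratio = premium_over_on_premise(cost)
    after_cut = cost / 5   # the factor of 5 reduction argued for in the text
    print(f"{provider}: {ratio:.2f}x on-premise today, "
          f"${after_cut:,.2f}/day after a 5x price cut")
```

A factor of 5 would put both providers well under the on-premise cost, which is the kind of margin that would make it worth giving up control of your own equipment.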
The big cost difference between Rackspace/Mosso and Amazon is the Disk I/O charge. It doesn't appear that Rackspace monetizes this cost. From the cost models, this appears to be a liability for them since the Disk I/O cost (moving the VM image and data sets to and from disk) represents roughly 20% of the total costs. Fast storage is notoriously expensive so this appears to be a weakness of Rackspace.
In a future article we will dissect these costs further.
The past month we have been trying to quantify the cost of moving some of our workloads into the cloud. It has been a very painful experience. Each vendor insists on mixing up the pricing in such a way that direct comparisons require major mental gymnastics. On top of that, the big three, IBM Blue Cloud, HP Adaptive Infrastructure as a Service, or AIaaS (who in the marketing department came up with that one?), and Sun Network.com are so incredibly opaque that we have just given up. Furthermore, Sun started out at $1/cpu hour and that simply is not competitive. Sun has taken the site down and the home page of the site claims that they are working on something else. Out of sheer frustration, we have ditched IBM and HP as well. It appears that they are catering to their existing deep-pocket customers and we do not expect their solutions to be cost competitive for the disruptive cloud computing concept that will usher in the new economics.
Many activities at the US National Labs are directed to evaluate if it is cost effective to move to AWS or similar services. To be able to compare our results to that research we decided to map all costs into AWS compatible pricing units. This yielded the following very short list:
| Provider | CPU $/cpu-hr | Disk I/O $/GB | Internet I/O $/GB | Storage $/GB-month |
|---|---|---|---|---|
| Amazon | $0.80 | $0.10 | $0.17 | $0.15 |
| Rackspace/Mosso | $0.72 | $0.00 | $0.25 | $0.50 |
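With all prices mapped into the same four units, a workload cost becomes a simple weighted sum. The following Python sketch shows one way to set that up. The rates are the ones in the table above; the class names, the pro rata 30-day storage billing, and the sample workload volumes are assumptions made for illustration, not the actual inputs behind our cost models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    cpu_hour: float          # $/cpu-hr
    disk_io_gb: float        # $/GB of disk I/O
    internet_io_gb: float    # $/GB of Internet I/O
    storage_gb_month: float  # $/GB-month


@dataclass(frozen=True)
class Workload:
    cpu_hours: float
    disk_io_gb: float
    internet_io_gb: float
    storage_gb: float
    days: float = 1.0        # how long the data stays in storage


# Rates from the table above.
RATES = {
    "Amazon": Rates(0.80, 0.10, 0.17, 0.15),
    "Rackspace/Mosso": Rates(0.72, 0.00, 0.25, 0.50),
}


def workload_cost(w: Workload, r: Rates) -> float:
    """Weighted sum of the four metered resources.

    Assumption: storage is billed pro rata against a 30-day month.
    """
    storage = w.storage_gb * r.storage_gb_month * w.days / 30.0
    return (w.cpu_hours * r.cpu_hour
            + w.disk_io_gb * r.disk_io_gb
            + w.internet_io_gb * r.internet_io_gb
            + storage)


# Hypothetical workload: the volumes are placeholders, not measured values.
sample = Workload(cpu_hours=100, disk_io_gb=50, internet_io_gb=10,
                  storage_gb=300, days=1)

for provider, rates in RATES.items():
    print(f"{provider}: ${workload_cost(sample, rates):.2f}")
```

Swapping in the measured cpu hours and I/O volumes of a real workload is all it takes to compare providers on equal terms.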
The reason for the short list is that there are very few providers that actually sell computes. Most of the vendors that use the label cloud provider are actually just hosting companies of standard web services. Companies like 3Tera, Bungee Labs, Appistry, and Google cast their services in terms of web application services, not generic compute services. This makes these services not applicable to the value-add computes that are common during the research and development phase of product companies.
In the next article we are going to quantify the cost of different IT workloads.
CIO Magazine just surveyed 173 IT business leaders to gauge what the common attitudes are towards cloud computing in the enterprise. 58 percent indicated that cloud computing will dramatically change the IT business, and 47 percent said they are already using it. On the other side, 18 percent think that cloud computing is a fad.
CIO used a broad definition of cloud computing: "a style of computing where massively scalable IT-related capabilities are provided 'as a service' using Internet technologies to multiple external customers". Other terms used are "on-demand services", "cloud services", and "Software-as-a-Service".
The survey confirmed that cloud computing is a solution to the need for flexibility in IT resource management. IT needs flexibility and cost savings, but is unwilling to jump in with both feet until some lingering concerns are addressed: the top concern being security.
Cloud computing will be used in many pilot/proof-of-concept projects by the incumbents, and it will be experimented with as a full blown business model by a growing cadre of start-ups. We have argued many times in this blog that the cloud computing model will be driven by the small and medium business segment because they value cost savings over security or SLAs. And typically with technologies that offer dramatic cost savings, when successful, there will be carnage among the companies that are holding on too tightly to old fashioned business models.
IBM announced cloud computing applications at Lotusphere in January of this year. A service called Bluehouse is a web-delivered social networking and collaboration service targeted to the SMB market. Bluehouse enables people to share documents, projects, and contacts, and offers online conferencing features as well. The Bluehouse service has gone into public beta.
Willy Chiu, VP of High Performance On-Demand Solutions at IBM, stated: "We are moving our clients, the industry and even IBM itself to have a mixture of data and applications that live in the data centre and in the cloud." IBM's approach is to expand its cloud computing offerings through a 'four-pronged strategy':
- Deliver a home spun set of cloud services
- Enable ISVs to design and build cloud services
- Help customers integrate cloud services into their business
- Sell cloud computing infrastructure to businesses for on-premise deployments
In addition to Bluehouse, IBM is also rolling out a handful of web services. Policy Tester On-Demand will automate the scanning of web content to ensure that it complies with industry legislation, and AppScan On-Demand will scan web applications for security bugs. Sean Poulley, VP of Online Collaboration Services compared the Bluehouse tools to those of Microsoft and Google: "Whereas Microsoft is document centric and Google is email centric, our solution is a mixture of both".
IBM's $400M investment in a new data center to support this new mid-market SaaS/Cloud Computing services portfolio brings another large player into the mix. These tools have been a long time coming but with every major brand now on-line, the branding wars can begin.
As soon as I finished yesterday's blog entry, I became aware of a posting by Amazon's CTO, Werner Vogels, where he announces that Microsoft's Windows Server is available on Amazon EC2. According to Vogels: "we can now run the majority of popular software systems in the cloud". So there you have it, both Amazon and Microsoft are/will be offering Windows based applications in the cloud.
According to Vogels' blog, the area that accelerated the adoption of this functionality in Amazon's Elastic Compute Cloud was the entertainment industry, due to the wide range of excellent codecs available for Windows. Here is the power of Microsoft's dominance of the client side translating into a huge benefit for cloud adoption. With 20-20 hindsight, the benefit of quality codecs is now obvious, and it will drive very quick adoption of Windows in the Cloud. Content apparently is still king and thus the conduit that delivers it is a critical component. Turns out that Microsoft does have an unfair advantage in the Cloud space.