The Money Shot

“Doing more with less” is a term you hear a lot these days. The shaky economic conditions have left companies across the globe reeling, and IT budgets were one of the first squeeze points. As matters begin to improve we still have the seeds of the idea that we should be seeking to ‘do more with less’, which has led to an explosion of interest in more efficient IT technologies and processes. Witness The Cloud. Whether or not you are ready for The Cloud, whether The Cloud is something more attainable than vapourware, and whether The Cloud will rain on your parade are all secondary to the rush for The Cloud coming from the boardroom. Those airline magazines have a lot to answer for.

Ignoring the massive scope of such projects for a moment, we can instead focus on some practical and tactical changes in our operations that can bring huge returns and improved services for our customers. I hinted at this in an earlier post, claiming it was possible to save over $500,000 with a single UCS blade. It’s the kind of money that allows an investment such as Cisco UCS to pay for itself, and even have some left over to fund C-level dreams of The Cloud. The concepts are nothing new; it’s all about deriving maximum value from investments. I don’t claim to have discovered a cure for the common cold, nor can I Make You Thin, give you Instant Confidence, or Change Your Life In 7 Days. But I can help you deliver these concise messages to show that while others drift along with dreams of The Cloud, you are saving potential fortunes and delivering improved services today.

I come from an Oracle background, so I am going to use Oracle as my example here. I know the licensing model, which is a black art in itself, and I know the operational considerations. However, the ideas here are equally applicable to many enterprise application services such as SQL Server and SAP. I’m not a walking encyclopaedia of price lists, so I stuck with something I knew well in Oracle. To begin our understanding of how we can save money delivering Oracle services on UCS, we first need to understand the challenges.

Oracle Challenges

Traditionally, Oracle has been safe and snug on Big Iron platforms. In fifteen years delivering Oracle production databases behind some of the world’s largest enterprise systems and corporate ecommerce websites, I saw far more Sun SPARC servers than anything else. Everyone liked Sun: terrific performance and a reputation for bombproof hardware pleased server teams and delivery staff alike. The only concern came from the boardroom, where the massive price tag of this performance and reliability was keenly felt. The problem was further highlighted in the Oracle domain by:

  • Server sprawl – a typical enterprise might have hundreds of database servers, ranging from small locally booted servers to servers with different LAN or SAN requirements, or servers with their own storage arrays. Plus that server under that guy’s desk, the test system he cobbled together that now has vital dev code embedded in the database stored on local disk.
  • No memory capacity – Oracle is a hungry beast and will happily chew through almost as much memory as you can throw at it. Memory drives Oracle’s cache structure and is therefore key to excellent performance. It’s not uncommon for Oracle servers to run at 95%+ memory utilisation.
  • Low CPU utilisation – despite all the money spent on the Sun E10K server a few years ago, the (admittedly critical) database it serves is barely causing the server to break sweat on the CPU side. It’s reliable and we have our five-nines performance, but it’s hardly efficient, especially in the days of Doing More With Less. It’s not uncommon for Oracle database servers to be running at less than 30% CPU utilisation.

Oracle on UCS Solutions

The Memory Hog

We’ve seen how an enterprise application such as Oracle can present some unique problems, rapidly sprawling through our corporate domain like bad gossip while tearing through expensive RAM DIMMs in its quest for performance.

As mentioned, Oracle uses memory to feed its cache, reducing IO load on the storage array and driving performance improvements for the end user. Other applications such as VMware will also use vast amounts of memory quite happily, but this time for reasons of scale. If you want to run more VMs on your server, then it is most likely you will start to be constrained by physical memory limitations before you are constrained on CPU. It’s a common thread in deployment of many enterprise applications, that we are restricted by RAM before we redline the CPU.

Cisco UCS was designed to bring more balance to this equation. It’s an enterprise solution developed to deliver enterprise services. Cisco’s unique extended memory technology is one of the keys to unlocking massive benefits through consolidation of database services on UCS servers, delivering efficient RAM & CPU utilisation while extracting maximum benefit from your software license.

At the time of writing, Cisco offers the following blade servers for use in the standard UCS 5108 chassis:

  • B200 M1 – Nehalem 5500-series half-width blade with two quad-core CPUs and up to 96GB RAM
  • B250 M1 – Nehalem 5500-series full-width blade with two quad-core CPUs and up to 384GB RAM using Cisco’s extended memory technology
  • B200 M2 – Nehalem 5600-series half-width blade with two six-core CPUs and up to 96GB RAM
  • B250 M2 – Nehalem 5600-series full-width blade with two six-core CPUs and up to 384GB RAM using Cisco’s extended memory technology
  • B440 M1 – Nehalem 7500-series full-width blade with four eight-core CPUs and up to 256GB RAM

In Oracle terms, the range of blades covers smaller databases in the B200 range (which is also well suited to Oracle RAC nodes), through to large memory footprint databases in the B250 blades, without the expense of typical multi-socket servers which add significant cost to software licenses. If you really do need extra CPU power, then the B440 with 32 cores (64 logical cores with HyperThreading) is well matched to handle those exceptional workloads. UCS is unique with the B250 in that we finally have a server that offers the large memory footprint that enterprise applications demand, but does not punish us with additional CPU cores we will not use and yet have to license.

How much?

I keep mentioning the Oracle license. The penny is probably beginning to drop. Oracle is pretty expensive, and even without dipping into the delicious delights of the cost option list the basic list price for an Enterprise Edition license is $47,500 at the time of writing. $50k doesn’t sound too bad, until you see the ‘core weighting’ table and realise that the license is directly related to the CPU power on your server. All those CPUs running at 30% utilisation means 70% of your license cost wasted. And that’s just one server.

The core weighting varies based on the CPU type. The weighting for Intel CPUs is 0.5, with Sun SPARC processors weighted at 0.75 as they ordinarily have fewer cores. The cost to license your server for the basic Enterprise Edition software then becomes:

(total cores * core weighting) * $47,500

Applying the formula to our UCS blades gives the following license costs:

  • B250 M1 = $190,000
  • B250 M2 = $285,000
  • B440 M1 = $760,000
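As a quick check on those figures, here is a minimal sketch of the formula in Python. The function and constant names are my own for illustration; the core counts come from the blade list earlier (sockets multiplied by cores per socket), and the list price and 0.5 weighting are the ones quoted above.

```python
# Enterprise Edition list price per licensed processor, as quoted in this post.
EE_LIST_PRICE_USD = 47_500
# Core weighting for Intel CPUs, as quoted in this post.
INTEL_CORE_WEIGHTING = 0.5


def license_cost(total_cores, core_weighting, list_price=EE_LIST_PRICE_USD):
    """Return (total cores * core weighting) * list price, in whole dollars."""
    return round(total_cores * core_weighting * list_price)


# Physical core counts from the blade list: sockets * cores per socket.
BLADE_CORES = {
    "B250 M1": 2 * 4,
    "B250 M2": 2 * 6,
    "B440 M1": 4 * 8,
}

for blade, cores in BLADE_CORES.items():
    cost = license_cost(cores, INTEL_CORE_WEIGHTING)
    print(f"{blade}: {cores} cores -> ${cost:,}")
```

HyperThreading does not enter into it: the calculation uses physical cores, which is how the three figures in the list are reached.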

Oracle RAC adds 50% to those figures per server, and many of the additional cost options such as Advanced Security or Advanced Compression add 25% each. The prices add up very quickly and can easily outstrip UCS hardware costs with the license implications from just one or two blades.
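To see how quickly the options stack up, here is a rough sketch that treats each option as a flat uplift on the base license, using the rounded percentages above. The helper name and the dictionary are illustrative only, and real price-list figures will differ slightly from these round numbers.

```python
def cost_with_options(base_license_cost, option_uplifts):
    """Add each option's uplift (a fraction of the base cost) to the base cost."""
    return round(base_license_cost * (1 + sum(option_uplifts.values())))


# Rounded uplifts quoted in this post; assumed to apply additively to the base.
UPLIFTS = {
    "RAC": 0.50,
    "Advanced Security": 0.25,
    "Advanced Compression": 0.25,
}

B250_M1_BASE = 190_000  # base Enterprise Edition cost for a B250 M1

rac_only = cost_with_options(B250_M1_BASE, {"RAC": UPLIFTS["RAC"]})
all_three = cost_with_options(B250_M1_BASE, UPLIFTS)

print(f"B250 M1 with RAC: ${rac_only:,}")
print(f"B250 M1 with RAC and both Advanced options: ${all_three:,}")
```

On these rounded figures, RAC plus the two Advanced options doubles the license bill for a single blade.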

I Don’t Care, I’ve Got an ELA!

I'll just have a wafer-thin mint

While the Oracle Enterprise License Agreement does indeed offer an all-you-can-eat buffet from the Oracle table, you’d better be ready when the bill arrives.

Oracle won’t renegotiate an ELA in a downward direction, so you may be thinking that consolidation doesn’t really apply to you. Oracle won’t give you any money back, so what’s the point? The point is that if your Oracle estate grows 100% in the next three years, when Oracle comes knocking to audit for your new ELA you’re looking at double the cost of your previous agreement. If you consolidate heavily during the period, you won’t see an immediate return on your investment through a reduction in license costs, but if your estate only grows 50% over the period then your next ELA is much more palatable. Consolidate now to save down the line.

Testing Times

We’ve established that Oracle is expensive to license, and that we can extract a lot of benefit by maximising our investment in the software. To best achieve this license optimisation we need to deliver a large memory footprint while reining in the number of expensive CPU licenses. While initially seeming unbalanced, we have shown that enterprise applications using RAM for scale or performance can function efficiently in this configuration. The Cisco UCS B250 server presents itself as a capable database workhorse for these configurations, offering up to 384GB of RAM with a dual socket CPU that is comparatively inexpensive to license.

After trumpeting the benefit of the B250 and its 384GB memory footprint I am now going to contradict myself. The memory footprint is actually too big for most commonly-sized database services! I am sure there are edge cases where you really need a massive cache and that 384GB will be very useful, but those same HPC-style environments may equally need more CPU, in which case the B440 may be the better choice. In tackling median cases in the Oracle domain, we couldn’t strike a perfect balance between database sizes and the massive 384GB footprint. I’m sure you have a case or two, and the fully loaded B250 will suit your needs perfectly. We are trying to develop a more general message around consolidation and best use of resources, so I’m not going to recommend loading your B250s with 384GB RAM and trying to fill them with Oracle databases, as you’ll most likely exhaust the CPU before you chew through all that memory.

But the B250 does offer something that I really think is ‘best use of resources’. Let’s take advantage of the Cisco extended memory technology, but instead of using pricey 8GB DIMMs to fill those 48 slots let’s instead use much cheaper 4GB DIMMs, offering 192GB on the server for less than $10,000 instead of $50,000 for 384GB. We’ve used the advanced technology of the B250 to deliver a creative solution instead of the maximum solution. Let’s be smart instead of massive, that’s why we’re moving away from those E10K servers, right?
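The arithmetic behind that choice is simple enough to sketch. The memory prices below are the approximate totals quoted above (the 4GB figure is an upper bound), not a price list, and the names are mine.

```python
DIMM_SLOTS = 48  # DIMM slots on the B250, as stated in this post


def total_ram_gb(dimm_size_gb, slots=DIMM_SLOTS):
    """Total memory when every slot holds a DIMM of the given size."""
    return dimm_size_gb * slots


# Approximate total memory cost per configuration, taken from this post.
# Keys are DIMM sizes in GB; the 4GB figure is "less than $10,000".
APPROX_MEMORY_COST_USD = {4: 10_000, 8: 50_000}

for dimm_gb, cost in APPROX_MEMORY_COST_USD.items():
    ram = total_ram_gb(dimm_gb)
    print(f"{dimm_gb}GB DIMMs: {ram}GB total, roughly ${cost / ram:,.0f} per GB")
```

Half the capacity for around a fifth of the memory spend is the trade being made here.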

In order that our tests were not trivial, and trying to represent realistic workloads, we used customer test data to establish the bounds for our test:

  • Deliver at least 10,000 transactions per minute (TPM) per database
  • 300 concurrent users per database
  • Total SGA (Oracle caches) footprint of 40GB maximum per database
  • Drive CPU utilisation >85%

These figures will not map onto all workloads, but in trying to define a baseline they offer database sizings and performance that are non-trivial and cover a wide section of deployed database services in operation today.

What about the B200?

The B200 is a great general-purpose workhorse and is eminently suitable for many database workloads. However, using the same CPU socket configuration as the B250, we suspected its comparative lack of memory would be a disadvantage if we were trying to consolidate workloads and maximise license efficiency.

Using the 40GB memory footprint database on a 96GB B200 M1, we deployed two databases and used Swingbench to simulate database load.

Swingbench results for 2 databases

We have managed to deliver on our workload requirements, with over 11,000 TPM in both our databases, each running 300 users. With our 40GB memory footprint per database we’re also getting great value from the 96GB RAM installed on the blade, but what about CPU?

B200 M1 CPU utilisation

With CPU utilisation at roughly 35% we aren’t able to maximise value from our $190,000 Oracle license for this server as we have exhausted memory capacity before CPU. Therefore while the B200 is ideally suited for smaller databases and as a node in RAC clusters, it’s not our preferred option if we are trying to maximise database consolidation.

B250 M1 Results

Taking the same performance test to the B250 M1 with 192GB of RAM gives interesting results.

Swingbench results for 4 databases

We see a slight drop in TPM performance, but we are still above our 10,000 TPM target result in each database and serving 1200 concurrent users across the server as required. Over 160GB of our 192GB RAM is assigned to the databases, so we have excellent value from the DIMMs we have deployed, but again, what about CPU?

B250 M1 CPU Utilisation

With four databases running a concurrent workload of 40,000 TPM and 1200 users on our B250 M1 blade, we now achieve CPU utilisation in the 90% region, driving superb value from our hardware and software investment without detriment to our performance service level requirements. If we had more memory on this server, we wouldn’t be able to drive more value with similarly-sized and performing databases, as we have exhausted CPU. As we can see, it is a balancing act between our performance requirements and database size, versus the hardware resource of the server. The difference is that with applications commonly starved of memory before CPU starvation, the B250 and Cisco’s extended memory technology finally provide a cost-effective answer without incurring excessive license costs. Everything in balance. All very Zen.
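The memory side of that balancing act can be sketched in a few lines. This is a deliberate simplification: it counts only the 40GB SGA per database and ignores the operating system, PGA and anything else that needs RAM, and the function names are my own. It does, however, line up with the two and four databases we actually deployed.

```python
def databases_that_fit(ram_gb, sga_gb):
    """How many databases fit if each needs sga_gb of RAM (SGA only)."""
    return ram_gb // sga_gb


def license_cost_per_database(license_cost, databases):
    """Spread one server's license cost across the databases it hosts."""
    return license_cost / databases


SGA_GB = 40             # maximum SGA footprint per database in our tests
LICENSE_USD = 190_000   # Enterprise Edition cost for either dual-socket blade
BLADE_RAM_GB = {"B200 M1": 96, "B250 M1": 192}

for blade, ram in BLADE_RAM_GB.items():
    count = databases_that_fit(ram, SGA_GB)
    per_db = license_cost_per_database(LICENSE_USD, count)
    print(f"{blade}: {count} databases, ${per_db:,.0f} of license per database")
```

Same license, twice the databases: the license cost carried by each database halves on the B250 M1.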

Show Me The Money!

Our example test case was a customer already invested in x86 technology, with previous generation Intel Harpertown servers due for refresh. With 48GB per server and a single database running on each, the dual-socket Intel CPUs attracted the same Oracle licensing cost as current Intel Nehalem CPUs!

Each customer server costs $190,000 in Oracle licenses (dual-socket, quad-core). A single B250 M1 costs the same, yet was proven able to run four typically-sized customer databases due to its memory footprint and Nehalem CPUs.

Moving four databases from single servers onto the B250 M1 can save $570,000 in license cost alone (without looking at OPEX or server costs) in our test case, with no performance penalty in terms of our required SLA.
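For anyone who wants to plug in their own estate, the saving is just the licenses you no longer need. A minimal sketch, with a function name of my own choosing and assuming every server before and after attracts the same license cost, as in our test case:

```python
def consolidation_saving(license_per_server, servers_before, servers_after=1):
    """License cost avoided by consolidating onto fewer identically licensed servers."""
    return license_per_server * (servers_before - servers_after)


LICENSE_PER_SERVER_USD = 190_000  # dual-socket, quad-core, 0.5 core weighting

saving = consolidation_saving(LICENSE_PER_SERVER_USD, servers_before=4)
print(f"Four servers onto one B250 M1 saves ${saving:,} in license cost")
```

This covers the base Enterprise Edition license only; cost options, OPEX and hardware are left out, as they are in the figure above.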

Before I am accused of picking on Oracle, or stopping an Oracle Sales Guy from getting that new 458 Italia he’s had his eye on, I am sure these techniques of license optimisation are equally valid elsewhere. I know Oracle, I’m an Oracle guy and it’s in my interest to see Oracle do well so my skills remain valuable. Oracle is viewed as being extremely expensive software, suitable for only the highest of high-end enterprise workloads. While that remains largely true, what I have tried to show is that we can improve value from our investment in Oracle through careful and clever consolidation. I’d like more people to use Oracle because they see it as good value, deploying their license on servers that are well suited to these workloads, servers such as Cisco’s UCS B250 blade server. Doing more with less is about driving value, not wholesale shifts to open source technology and abandonment of enterprise service levels, so put the dogs back in the kennel, Larry, I am on your side.

And with that I move onto another customer with over 200 SPARC servers. Average utilisation is 3%. They’re really interested. Are you?

Update: EMC have recently published a white paper (http://bit.ly/aAvJjD), demonstrating their shift from a Sun SPARC hardware platform onto Cisco UCS for their Oracle eBusiness Suite and database. They were previously using 224 cores of SPARC processing, and mention the E25k hardware which uses the UltraSPARC IV+ CPU with a 0.75 core weighting.

According to rough calculations 224 cores of UltraSPARC IV+ CPU in a RAC configuration would cost $11,844,000 in license cost alone. Just for the database portion. Oracle eBusiness is expensive too, but I don’t know which apps are being licensed so let’s just stick to the DB and RAC option for now. EMC have migrated to four (yes, four) B200 M1 blades with a total of 32 cores, giving an indicative license cost of $1,128,000 for the DB+RAC option.

$10M less. Oh, and they massively improved performance too.
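Here is how those rough calculations can be reproduced. One assumption to flag loudly: the $70,500 combined database plus RAC price per licensed processor is inferred by working backwards from the two totals above ($47,500 for the database leaves $23,000 for RAC), so treat it as indicative rather than a quoted price. The names are illustrative.

```python
# ASSUMPTION: combined DB + RAC price per licensed processor, inferred from
# the totals in this post ($47,500 database + $23,000 RAC), not a quoted price.
DB_PLUS_RAC_PER_PROCESSOR_USD = 70_500


def db_rac_license_cost(cores, core_weighting,
                        price=DB_PLUS_RAC_PER_PROCESSOR_USD):
    """License cost for a given core count and core weighting, in whole dollars."""
    return round(cores * core_weighting * price)


sparc_cost = db_rac_license_cost(224, 0.75)  # UltraSPARC IV+ weighting
ucs_cost = db_rac_license_cost(32, 0.5)      # four B200 M1 blades, Intel weighting

print(f"SPARC, 224 cores: ${sparc_cost:,}")
print(f"UCS, 32 cores:    ${ucs_cost:,}")
print(f"Difference:       ${sparc_cost - ucs_cost:,}")
```

The difference comes from two places at once: seven times fewer cores, and a lower weighting on each of them.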

I’m not saying these figures are deadly accurate as there may be factors not demonstrated in the paper. But as a ballpark example, it brings home the implications of consolidation and just how small the hardware cost can be in relation to potential savings on software. That legacy kit sitting in the corner is costing you a whole lot more than it should.

Next Time: Pimp My SLA