<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Carl Bradshaw</title>
	<atom:link href="http://carlbradshaw.com/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://carlbradshaw.com</link>
	<description></description>
	<lastBuildDate>Mon, 13 Jun 2011 15:29:46 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.3</generator>
		<item>
		<title>Pimp My SLA</title>
		<link>http://carlbradshaw.com/?p=72</link>
		<comments>http://carlbradshaw.com/?p=72#comments</comments>
		<pubDate>Mon, 13 Jun 2011 14:57:29 +0000</pubDate>
		<dc:creator>Carl</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[UCS]]></category>

		<guid isPermaLink="false">http://carlbradshaw.com/?p=72</guid>
		<description><![CDATA[The Service Provider lives and dies by their Service Level Agreement, and delivering on that ]]></description>
			<content:encoded><![CDATA[<p><a href="http://carlbradshaw.com/wp-content/uploads/2011/06/PimpMySLA.jpg"><img class="size-full wp-image-73 alignnone" title="PimpMySLA" src="http://carlbradshaw.com/wp-content/uploads/2011/06/PimpMySLA.jpg" alt="" width="456" height="261" /></a></p>
<p>The Service Provider lives and dies by their Service Level Agreement, and delivering on that commitment to their customers. At the start of the ‘noughties’ I was lucky enough to be working for Loudcloud, Marc Andreessen’s automated service delivery company, which had a unique ‘100%’ SLA as a key differentiator. It helped Loudcloud punch way above its weight, attracting large enterprise customers that might otherwise look to more established names for their critical services and systems. It taught me both the value of the SLA in driving business within the SP arena, and also the beauty and simplicity of delivering complete commitment to the customer. Five nines, three nines, nine nines. None are 100%, and Loudcloud delivered a really simple message with that 100% commitment: if we mess up, you get paid. It was simple, in a language the customer board could understand, but concealed an intrinsic commitment of the Loudcloud staff and management to service excellence that was critical to delivering a profitable business. For Loudcloud to have any success, failures needed to be extremely infrequent, as financial penalties could be crippling. They needed talented people, excellent tools, and bombproof processes to make the business even remotely viable.</p>
<p>Ultimately, Loudcloud attracted too much attention to remain undisturbed and was sold as a service organisation to EDS, with its automation tool (Opsware) separately sold to HP. Both eventually came back together when HP bought EDS, but many of the talented people had long since departed by that time, and many of the principles that made Loudcloud a unique proposition had been diluted in the interim. Without that total commitment to the customer SLA, and the right people to deliver on that commitment, the service became more anonymous in the explosive SP market. It was further proof of the importance of the SLA as both a marketing tool and a barometer of your commitment to the customer.</p>
<p>Enterprise applications such as Oracle differentiate themselves in the market by virtue of their high availability (HA) and disaster recovery (DR) mechanisms and features. For critical systems, Oracle offers its Real Application Clusters (RAC) feature, where the cluster database can survive node failures by rerouting connections to other surviving nodes automatically. When uptime must absolutely be guaranteed, Oracle RAC is an excellent solution, but with license premiums at 50% over the regular Enterprise Edition database, it can also be expensive. For slightly less critical services, an automated failover of a single instance database can be sufficient. A solution using Symantec VCS clustering would use a heartbeat between two servers to monitor for issues in the primary server, and fail over the database to the secondary server in case of problems. This would incur a system outage, unlike Oracle RAC, but the outage would be measured in minutes rather than hours. Finally, for test and development systems it is normal that the DR solution consists of a backup, if you’re lucky. Typically these servers will fail just prior to a new system release, just when you need them the most!</p>
<p>We can represent these systems, their SLA requirement and distribution in an organisation, as follows:</p>
<p><a href="http://carlbradshaw.com/wp-content/uploads/2011/06/SLA_diagram.jpg"><img class="alignright size-large wp-image-75" title="SLA_diagram" src="http://carlbradshaw.com/wp-content/uploads/2011/06/SLA_diagram-1024x590.jpg" alt="" width="594" height="342" /></a></p>
<p><em><strong>SLA vs Distribution vs Downtime</strong></em></p>
<p>In the pyramid diagram there are significant cost elements associated with the rising SLA and the technology employed to support such levels of availability. It’s these cost elements that traditionally force organisations to make significant decisions around the SLA for test and/or development servers, which will be extremely valuable at certain times but are otherwise not so critical that a backup would be insufficient. Under an engineer ‘onsite’ support agreement, a hardware failure would normally incur a minimum four-hour outage while the engineer is dispatched to site with replacement hardware for the failed server.</p>
<p>With Cisco UCS, we can reduce this outage period to below ten minutes. And it doesn’t cost a penny extra.</p>
<p>By abstracting the logical definition of the server into a UCS Service Profile, our ‘server’ becomes nothing more than a file, needing some hardware to spring into life. That hardware could be a blade in any of your UCS chassis, and it could change from day to day if you so desired. When a blade fails, as we have now managed to decouple our ‘server’ from the electronics, we can move our Service Profile onto another blade and restart our stateless server within minutes. To the OS and applications it would appear that the server had crashed and restarted. They have no concept that they are running on a different blade, maybe in another chassis. We can then repair the failed blade and bring our system back to full strength without suffering extensive downtime as we would have previously experienced.</p>
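<p>To put the four-hour and ten-minute figures into SLA terms, here is a minimal sketch of the arithmetic. It assumes a single hardware failure per server per year, which is my illustrative assumption rather than a measured failure rate, and the function name is purely hypothetical.</p>
<pre>
```python
# Illustrative arithmetic only: one hardware failure per year is an assumption.
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def availability_pct(outage_minutes, outages_per_year=1):
    """Return yearly availability (%) for a given outage length and frequency."""
    downtime_hours = outage_minutes * outages_per_year / 60
    return 100 * (1 - downtime_hours / HOURS_PER_YEAR)


engineer_callout = availability_pct(4 * 60)  # four-hour onsite repair
spare_blade = availability_pct(10)           # Service Profile moved to a spare blade

print(f"Engineer call-out: {engineer_callout:.3f}%")
print(f"Spare blade:       {spare_blade:.3f}%")
```
</pre>
<p>Under that assumption a single four-hour outage leaves you at roughly 99.95% for the year, while a ten-minute restart on a spare blade keeps you at roughly 99.998%.</p>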
<p>I’d suggest this method is so good that we’d actually want to revisit some of the departmental-level servers that are maybe served by a VCS-type automated failover with dedicated standby hardware. Would they really notice the difference between a 5min outage and a 10min outage? I am sure many of these user environments would notice, but I’d bet there are a few that wouldn’t, and with that 1:1 primary/standby hardware ratio just removing a few of those systems could save a lot of money. By ensuring we have one or two spare blades in our UCS system (a system that could support up to 320 blades in theory) we offer speedy failover services without extensive investment in redundant hardware or costly software automation.</p>
<p>It’s definitely not a replacement: I’m sure you will still need to offer those higher SLAs for a great number of your systems, but with software licensing and hardware savings it’s another area where investment in Cisco UCS can be quickly recouped through a little smart thinking.</p>
<p>And in the meantime those other servers with no failover protection and a 4hr downtime while the engineer shuffles along to your site? Stop the call out. Let him snooze. You’re back up and running on a spare blade in 10 mins.</p>
<p>Pimped.</p>
<div id="attachment_77" class="wp-caption alignnone" style="width: 410px"><a href="http://carlbradshaw.com/wp-content/uploads/2011/06/PimpedRide.jpg"><img class="size-full wp-image-77 " title="PimpedRide" src="http://carlbradshaw.com/wp-content/uploads/2011/06/PimpedRide.jpg" alt="" width="400" height="291" /></a><p class="wp-caption-text">That&#39;s how we roll</p></div>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://carlbradshaw.com/?feed=rss2&#038;p=72</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Money Shot</title>
		<link>http://carlbradshaw.com/?p=20</link>
		<comments>http://carlbradshaw.com/?p=20#comments</comments>
		<pubDate>Mon, 13 Jun 2011 14:50:26 +0000</pubDate>
		<dc:creator>Carl</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[UCS]]></category>

		<guid isPermaLink="false">http://carlbradshaw.com/?p=20</guid>
		<description><![CDATA[“Doing more with less” is a term you hear a lot these days. The shaky ]]></description>
			<content:encoded><![CDATA[<p><a href="http://carlbradshaw.com/wp-content/uploads/2011/06/Loadsamoney1.jpg"><img class="alignleft size-medium wp-image-40" title="Loadsamoney" src="http://carlbradshaw.com/wp-content/uploads/2011/06/Loadsamoney1-204x300.jpg" alt="" width="204" height="300" /></a>“Doing more with less” is a term you hear a lot these days. The shaky economic conditions have left companies across the globe reeling, and IT budgets were one of the first squeeze points. As matters begin to improve we still have the seeds of the idea that we should be seeking to ‘do more with less’, which has led to an explosion of interest in more efficient IT technologies and processes. Witness The Cloud. Whether or not you are ready for The Cloud, whether The Cloud is something more attainable than vapourware, and whether The Cloud will rain on your parade are all secondary to the rush for The Cloud coming from the boardroom. Those airline magazines have a lot to answer for.</p>
<p>Ignoring the massive scope of such projects for a moment, we can instead focus on some practical and tactical changes in our operations that can bring huge returns and improved services for our customers. I hinted at this in an earlier post, claiming it was possible to save over $500,000 with a single UCS blade. It’s the kind of money that allows an investment such as Cisco UCS to pay for itself, and even have some left over to fund C-level dreams of The Cloud. The concepts are nothing new; it’s all about deriving maximum value from investments, and I don’t claim to have discovered a cure for the common cold, nor can I Make You Thin, give you Instant Confidence, or Change Your Life In 7 Days. But I can help you deliver these concise messages to show that while others drift along with dreams of The Cloud, you are saving potential fortunes and delivering improved services today.</p>
<p>I come from an Oracle background, so I am going to use Oracle as my example here. I know the licensing model, which is a black art in itself, and I know the operational considerations. However, the ideas here are equally applicable to many enterprise application services such as SQL Server and SAP. I’m not a walking encyclopaedia of price lists, so I stuck with something I knew well in Oracle. To begin our understanding of how we can save money delivering Oracle services on UCS, we first need to understand the challenges.</p>
<h3><strong>Oracle Challenges</strong></h3>
<p>Traditionally, Oracle has been safe and snug on Big Iron platforms. In fifteen years delivering Oracle production databases behind some of the world’s largest enterprise systems and corporate ecommerce websites, I saw far more Sun SPARC servers than anything else. Everyone liked Sun, with terrific performance and a bombproof hardware reputation pleasing both server teams and delivery staff alike. The only concern came from the boardroom, where the massive price tag of this performance and reliability was keenly felt. The problem was further highlighted in the Oracle domain by:</p>
<ul>
<li>Server sprawl – a typical enterprise might have hundreds of database servers, ranging from small locally booted servers to servers with different LAN or SAN requirements, or servers with their own storage arrays. Plus that server under that guy’s desk, the test system he cobbled together that now has vital dev code embedded in the database stored on local disk.</li>
<li>No memory capacity – Oracle is a hungry beast and will happily chew through almost as much memory as you can throw at it. Memory drives Oracle’s cache structure and is therefore key to excellent performance. It’s not uncommon for Oracle servers to run at 95%+ memory utilisation.</li>
<li>Low CPU utilisation – despite all the money spent on the Sun E10K server a few years ago, the (admittedly critical) database it serves is barely causing the server to break sweat on the CPU side. It’s reliable and we have our five-9s performance, but it’s hardly efficient, especially in the days of Doing More With Less. It’s not uncommon for an Oracle database server to be running at less than 30% CPU utilisation.</li>
</ul>
<h3><strong>Oracle on UCS Solutions</strong></h3>
<div id="attachment_22" class="wp-caption alignright" style="width: 310px"><a href="http://carlbradshaw.com/wp-content/uploads/2011/06/MemoryHogs.jpg"><img class="size-medium wp-image-22 " style="border: 0px initial initial;" title="MemoryHogs" src="http://carlbradshaw.com/wp-content/uploads/2011/06/MemoryHogs-300x244.jpg" alt="" width="300" height="244" /></a><p class="wp-caption-text">The Memory Hog</p></div>
<p>We’ve seen how an enterprise application such as Oracle can present some unique problems, rapidly sprawling through our corporate domain like bad gossip while tearing through expensive RAM DIMMs in its quest for performance.</p>
<p>As mentioned, Oracle uses memory to feed its cache, reducing IO load on the storage array and driving performance improvements for the end user. Other applications such as VMware will also use vast amounts of memory quite happily, but this time for reasons of scale. If you want to run more VMs on your server, then it is most likely you will start to be constrained by physical memory limitations before you are constrained on CPU. It’s a common thread in deployment of many enterprise applications, that we are restricted by RAM before we redline the CPU.</p>
<p>Cisco UCS was designed to bring more balance to this equation. It’s an enterprise solution developed to deliver enterprise services. Cisco’s unique extended memory technology is one of the keys to unlocking massive benefits through consolidation of database services on UCS servers, delivering efficient RAM &amp; CPU utilisation while extracting maximum benefit from your software license.</p>
<p>At the time of writing, Cisco offers the following blade servers for use in the standard UCS 5108 chassis:</p>
<ul>
<li>B200 M1 – Nehalem 5500-series half-width blade with two quad-core CPUs and up to 96GB RAM</li>
<li>B250 M1 – Nehalem 5500-series full-width blade with two quad-core CPUs and up to 384GB RAM using Cisco’s extended memory technology</li>
<li>B200 M2 – Nehalem 5600-series half-width blade with two six-core CPUs and up to 96GB RAM</li>
<li>B250 M2 – Nehalem 5600-series full-width blade with two six-core CPUs and up to 384GB RAM using Cisco’s extended memory technology</li>
<li>B440 M1 – Nehalem 7500-series full-width blade with four eight-core CPUs and up to 256GB RAM</li>
</ul>
<p>In Oracle terms, the range of blades covers smaller databases in the B200 range (which is also well suited to Oracle RAC nodes), through to large memory footprint databases in the B250 blades, without the expense of typical multi-socket servers which add significant cost to software licenses. If you really do need extra CPU power, then the B440 with 32 cores (64 logical cores with HyperThreading) is well matched to handle those exceptional workloads. UCS is unique with the B250 in that we finally have a server that offers the large memory footprint that enterprise applications demand, but does not punish us with additional CPU cores we will not use and yet have to license.</p>
<h3><strong>How much?</strong></h3>
<p>I keep mentioning the Oracle license. The penny is probably beginning to drop. Oracle is pretty expensive, and even without dipping into the delicious delights of the cost option list the basic list price for an Enterprise Edition license is $47,500 at the time of writing. $50k doesn’t sound too bad, until you see the ‘core weighting’ table and realise that the license is directly related to the CPU power on your server. All those CPUs running at 30% utilisation means 70% of your license cost wasted. And that’s just one server.</p>
<p>The core weighting varies based on the CPU type. The weighting for Intel CPUs is 0.5, with older Sun SPARC processors weighted at 0.75 as they ordinarily have fewer cores (the latest SPARC chips have a 0.5 weighting). The cost to license your server for the basic Enterprise Edition software then becomes:</p>
<p><strong>(total cores * core weighting) * $47,500</strong></p>
<p>Applying the formula to our UCS blades gives the following license costs:</p>
<ul>
<li>B250 M1 = $190,000</li>
<li>B250 M2 = $285,000</li>
<li>B440 M1 = $760,000</li>
</ul>
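<p>For anyone who wants to check the figures in the list above or plug in their own servers, the formula can be sketched in a few lines of Python. The socket and core counts come from the blade list earlier in this post; the function and variable names are my own and purely illustrative.</p>
<pre>
```python
LIST_PRICE = 47_500         # Enterprise Edition list price quoted above (USD)
INTEL_CORE_WEIGHTING = 0.5  # core weighting for Intel CPUs quoted above

# (sockets, cores per socket), taken from the blade list earlier in the post
BLADES = {
    "B250 M1": (2, 4),
    "B250 M2": (2, 6),
    "B440 M1": (4, 8),
}


def license_cost(sockets, cores_per_socket, weighting=INTEL_CORE_WEIGHTING):
    """Apply (total cores * core weighting) * list price."""
    total_cores = sockets * cores_per_socket
    return int(total_cores * weighting * LIST_PRICE)


costs = {name: license_cost(*spec) for name, spec in BLADES.items()}
for name, cost in costs.items():
    print(f"{name}: ${cost:,}")
```
</pre>
<p>Swap in a weighting of 0.75 to see what the same core count would cost on older SPARC hardware.</p>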
<p>Oracle RAC adds 50% to those figures per server, and many of the additional cost options such as Advanced Security or Advanced Compression add 25% each. The prices add up very quickly and can easily outstrip UCS hardware costs with the license implications from just one or two blades.</p>
<h3><strong>I Don’t Care, I’ve Got an ELA!</strong></h3>
<div id="attachment_28" class="wp-caption alignright" style="width: 208px"><a href="http://carlbradshaw.com/wp-content/uploads/2011/06/MrCreosote.jpg"><img class="size-medium wp-image-28 " title="MrCreosote" src="http://carlbradshaw.com/wp-content/uploads/2011/06/MrCreosote-198x300.jpg" alt="" width="198" height="300" /></a><p class="wp-caption-text">I&#39;ll just have a wafer-thin mint</p></div>
<p>While the Oracle Enterprise License Agreement does indeed offer an all-you-can-eat buffet from the Oracle table, you’d better be ready when the bill arrives.</p>
<p>Oracle won’t renegotiate an ELA in a downward direction, so you may be thinking that consolidation doesn’t really apply to you. Oracle won’t give you any money back, so what’s the point? The point is that if your Oracle estate grows 100% in the next three years, when Oracle comes knocking to audit for your new ELA you’re looking at double the cost of your previous agreement. If you consolidate heavily during the period, you won’t see an immediate return on your investment through a reduction in license costs, but if your estate only grows 50% over the period then your next ELA is much more palatable. Consolidate now to save down the line.</p>
<h3><strong>Testing Times</strong></h3>
<p>We’ve established that Oracle is expensive to license, and that we can extract a lot of benefit by maximising our investment in the software. To best achieve this license optimisation we need to deliver a large memory footprint while reining in the number of expensive CPU licenses. While initially seeming unbalanced, we have shown that enterprise applications using RAM for scale or performance can function efficiently in this configuration. The Cisco UCS B250 server presents itself as a capable database workhorse for these configurations, offering up to 384GB of RAM with a dual socket CPU that is comparatively inexpensive to license.</p>
<p>After trumpeting the benefit of the B250 and its 384GB memory footprint I am now going to contradict myself. The memory footprint is actually too big for most commonly-sized database services! I am sure there are edge cases where you really need a massive cache and that 384GB will be very useful, but equally they may need more CPU for such HPC environments and the B440 may be the better choice in that case. In tackling median cases in the Oracle domain, we couldn’t strike a perfect balance between database sizes and the massive 384GB footprint. I’m sure you have a case or two, and the fully loaded B250 will suit your needs perfectly. We are trying to develop a more general message around consolidation and best use of resources, so I’m not going to recommend loading your B250s with 384GB RAM and trying to fill them with Oracle databases, as you’ll most likely exhaust the CPU before you chew through all that memory.</p>
<p>But the B250 does offer something that I really think is ‘best use of resources’. Let’s take advantage of the Cisco extended memory technology, but instead of using pricey 8GB DIMMs to fill those 48 slots let’s instead use much cheaper 4GB DIMMs, offering 192GB on the server for less than $10,000 instead of $50,000 for 384GB. We’ve used the advanced technology of the B250 to deliver a creative solution instead of the maximum solution. Let’s be smart instead of massive, that’s why we’re moving away from those E10K servers, right?</p>
<p>In order that our tests were not trivial, and trying to represent realistic workloads, we used customer test data to establish the bounds for our test:</p>
<ul>
<li>Deliver at least 10,000 transactions per minute (TPM) per database</li>
<li>300 concurrent users per database</li>
<li>Total SGA (Oracle caches) footprint of 40GB maximum per database</li>
<li>Drive CPU utilisation &gt;85%</li>
</ul>
<p>These figures will not map onto all workloads, but in trying to define a baseline they offer database sizings and performance that are non-trivial and cover a wide section of deployed database services in operation today.</p>
<h3><strong>What about the B200?</strong></h3>
<p>The B200 is a great general-purpose workhorse and is eminently suitable for many database workloads. However, using the same CPU socket configuration as the B250, we suspected its comparative lack of memory would be a disadvantage if we were trying to consolidate workloads and maximise license efficiency.</p>
<p>Using the 40GB memory footprint database on a 96GB B200 M1, we deployed two databases and used Swingbench to simulate database load.</p>
<p><a href="http://carlbradshaw.com/wp-content/uploads/2011/06/2dbSwing.jpg"><img class="alignleft size-full wp-image-53" title="2dbSwing" src="http://carlbradshaw.com/wp-content/uploads/2011/06/2dbSwing.jpg" alt="" width="531" height="332" /></a></p>
<p><strong><em>Swingbench results for 2 databases</em></strong></p>
<p>We have managed to deliver on our workload requirements, with over 11,000 TPM in both our databases, each running 300 users. With our 40GB memory footprint per database we’re also getting great value from the 96GB RAM installed on the blade, but what about CPU?</p>
<p><a href="http://carlbradshaw.com/wp-content/uploads/2011/06/b200util.jpg"><img class="alignright size-full wp-image-58" title="b200util" src="http://carlbradshaw.com/wp-content/uploads/2011/06/b200util.jpg" alt="" width="605" height="275" /></a></p>
<p><strong><em>B200 M1 CPU utilisation</em></strong></p>
<p>With CPU utilisation at roughly 35% we aren’t able to maximise value from our $190,000 Oracle license for this server as we have exhausted memory capacity before CPU. Therefore while the B200 is ideally suited for smaller databases and as a node in RAC clusters, it’s not our preferred option if we are trying to maximise database consolidation.</p>
<h3><strong>B250 M1 Results</strong></h3>
<p>Taking the same performance test to the B250 M1 with 192GB of RAM gives interesting results.</p>
<p><a href="http://carlbradshaw.com/wp-content/uploads/2011/06/b250util.jpg"><img class="size-full wp-image-60 alignleft" title="b250util" src="http://carlbradshaw.com/wp-content/uploads/2011/06/b250util.jpg" alt="" width="537" height="500" /></a></p>
<p><strong><em>Swingbench results for 4 databases</em></strong></p>
<p>We see a slight drop in TPM performance, but we are still above our 10,000 TPM target result in each database and serving 1200 concurrent users across the server as required. Over 160GB of our 192GB RAM is assigned to the databases, so we have excellent value from the DIMMs we have deployed, but again, what about CPU?</p>
<p><a href="http://carlbradshaw.com/wp-content/uploads/2011/06/b250cpu.jpg"><img class="alignright size-full wp-image-63" title="b250cpu" src="http://carlbradshaw.com/wp-content/uploads/2011/06/b250cpu.jpg" alt="" width="629" height="235" /></a></p>
<p><strong><em>B250 M1 CPU Utilisation</em></strong></p>
<p>With four databases running a concurrent workload of 40,000 TPM and 1200 users on our B250 M1 blade, we now achieve CPU utilisation in the 90% region, driving superb value from our hardware and software investment without detriment to our performance service level requirements. If we had more memory on this server, we wouldn’t be able to drive more value with similarly-sized and performing databases, as we have exhausted CPU. As we can see it is a balancing act between our performance requirements and database size, versus the hardware resource of the server. The difference is that with applications commonly starved of memory before CPU starvation, the B250 and Cisco’s extended memory technology finally provide a cost-effective answer without incurring excessive license costs. Everything in balance. All very Zen.</p>
<h3><strong>Show Me The Money!</strong></h3>
<p>Our example test case was a customer already invested in x86 technology, with previous generation Intel Harpertown servers due for refresh. With 48GB per server and a single database running on each, the dual-socket Intel CPUs attracted the same Oracle licensing cost as current Intel Nehalem CPUs!</p>
<p>Each customer server costs $190,000 in Oracle licenses (dual-socket, quad-core). A single B250 M1 costs the same, yet was proven able to run four typically-sized customer databases due to its memory footprint and Nehalem CPUs.</p>
<blockquote><p>the B250 M1 can save $570,000 in license cost alone</p></blockquote>
<p>Moving four databases from single servers onto the B250 M1 can save $570,000 in license cost alone (without looking at OPEX or server costs) in our test case, with no performance penalty in terms of our required SLA.</p>
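<p>The $570,000 figure is simple arithmetic, sketched below using only the numbers from this test case. The function name is hypothetical, and the sketch deliberately ignores OPEX, hardware and any cost options.</p>
<pre>
```python
LICENSE_PER_SERVER = 190_000  # dual-socket quad-core Intel server, from the post
DATABASES = 4                 # databases consolidated in the test case


def consolidation_saving(databases, license_per_server, consolidated_servers=1):
    """License saving from moving one-database-per-server onto fewer servers."""
    before = databases * license_per_server
    after = consolidated_servers * license_per_server
    return before - after


saving = consolidation_saving(DATABASES, LICENSE_PER_SERVER)
license_per_db_before = LICENSE_PER_SERVER
license_per_db_after = LICENSE_PER_SERVER / DATABASES

print(f"License saving:            ${saving:,}")
print(f"License per database:      ${license_per_db_before:,} before")
print(f"License per database:      ${license_per_db_after:,.0f} after")
```
</pre>
<p>Looked at per database, the license burden drops from $190,000 to $47,500, which is the same saving seen from the other side.</p>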
<p>Before I am accused of picking on Oracle, or stopping an Oracle Sales Guy from getting that new 458 Italia he’s had his eye on, I am sure these techniques of license optimisation are equally valid elsewhere. I know Oracle, I’m an Oracle guy and it’s in my interest to see Oracle do well so my skills remain valuable. Oracle is viewed as being extremely expensive software, suitable for only the highest of high-end enterprise workloads. While that remains largely true, what I have tried to show is that we can improve <em>value</em> from our investment in Oracle through careful and clever consolidation. I’d like more people to use Oracle because they see it as good value, deploying their license on servers that are well suited to these workloads, servers such as Cisco’s UCS B250 blade server. Doing more with less is about driving value, not wholesale shifts to open source technology and abandonment of enterprise service levels, so put the dogs back in the kennel, Larry, I am on your side.</p>
<p>And with that I move onto another customer with over 200 SPARC servers. Average utilisation is 3%. They’re really interested. Are you?</p>
<p><strong>Update:</strong> EMC have recently published a white paper (<a href="http://bit.ly/aAvJjD">http://bit.ly/aAvJjD</a>), demonstrating their shift from a Sun SPARC hardware platform onto Cisco UCS for their Oracle eBusiness Suite and database. They were previously using 224 cores of SPARC processing, and mention the E25k hardware which uses the UltraSPARC IV+ CPU with a 0.75 core weighting.</p>
<p>According to rough calculations 224 cores of UltraSPARC IV+ CPU in a RAC configuration would cost $11,844,000 in license cost alone. Just for the database portion. Oracle eBusiness is expensive too, but I don’t know which apps are being licensed so let’s just stick to the DB and RAC option for now. EMC have migrated to four (yes, four) B440 blades with a total of 128 cores, giving an indicative license cost around $4.5M for the DB+RAC option.</p>
<p>Over $6M less. Oh, and they massively improved performance too.</p>
<p>I’m not saying these figures are deadly accurate as there may be factors not demonstrated in the paper. But as a ballpark example, it brings home the implications of consolidation and just how small the hardware cost can be in relation to potential savings on software. That legacy kit sitting in the corner is costing you a whole lot more than it should.</p>
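<p>If you want to reproduce the ballpark yourself, the sketch below applies the core weighting formula to both estates. It assumes the $47,500 list price and models RAC as a flat 50% premium, as described earlier in this post; my own rough calculations above were not necessarily done that way, so expect the output to land near the quoted figures, not exactly on them.</p>
<pre>
```python
LIST_PRICE = 47_500  # Enterprise Edition list price quoted earlier (USD)
RAC_PREMIUM = 0.5    # assumption: RAC modelled as a flat +50% on top


def db_plus_rac_cost(cores, weighting):
    """(total cores * core weighting) * list price, plus the RAC premium."""
    return cores * weighting * LIST_PRICE * (1 + RAC_PREMIUM)


sparc_cost = db_plus_rac_cost(224, 0.75)   # 224 UltraSPARC IV+ cores
ucs_cost = db_plus_rac_cost(4 * 32, 0.5)   # four B440 blades, 32 cores each
difference = sparc_cost - ucs_cost

print(f"SPARC estate: ${sparc_cost:,.0f}")
print(f"UCS B440s:    ${ucs_cost:,.0f}")
print(f"Difference:   ${difference:,.0f}")
```
</pre>
<p>Under these assumptions the sketch comes out at roughly $12.0M against roughly $4.6M, which is in line with the ballpark above.</p>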
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://carlbradshaw.com/?feed=rss2&#038;p=20</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Cisco UCSM 1.4 and Oracle RAC</title>
		<link>http://carlbradshaw.com/?p=11</link>
		<comments>http://carlbradshaw.com/?p=11#comments</comments>
		<pubDate>Mon, 13 Jun 2011 14:03:43 +0000</pubDate>
		<dc:creator>Carl</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[UCS]]></category>

		<guid isPermaLink="false">http://carlbradshaw.com/?p=11</guid>
		<description><![CDATA[Cisco UCS Manager software version 1.4 has been well-documented over the past couple of weeks ]]></description>
			<content:encoded><![CDATA[<p>Cisco UCS Manager software version 1.4 has been well-documented over the past couple of weeks for its raft of new features. Many of these features are major upgrades to the functionality of Cisco UCS systems and as a Cisco employee it is very exciting to see the product develop in this fashion.</p>
<p>However, there is a lesser-known fix for Oracle RAC 11gR2 installation that’s also worth a look if you have any customers that are planning an Oracle install in the near future.</p>
<p>Oracle 11gR2 includes a lot of new technology and has a very different codebase to 11gR1. It has been widely speculated that Oracle 11gR1 was actually more like a ‘10gR3’ than a true 11g product set, but I don’t know the validity of such speculation. 11gR2 is proving rightly popular as Oracle looks to wind down Premium support status for 10gR2 and start customers thinking about the migration path for its latest and greatest product release. Among the significant changes in the new software, ‘Clusterware’ has now become ‘Grid Infrastructure’ and brings a number of key enhancements to develop the concept of database compute grids. One of these enhancements is Grid Plug and Play, designed to help ease the growth (or contraction) of database grids through use of automatic network provisioning. It relies on a number of other sub-services, including the Single Client Access Name (SCAN) address and Grid Naming Services (GNS).</p>
<p>Oracle GNS listens on a virtual IP address defined in the DNS server before installation. Resolution of node names in the RAC cluster is delegated to GNS, and GNS also relies on a DHCP server to dynamically allocate virtual IP addresses as required by the cluster. If you choose not to use Grid Plug and Play, then you can manually define static IPs for all RAC nodes and modify SQL*Net / server configuration files as per Oracle 10g RAC deployment practice.</p>
<p>To support the GNS operation Oracle Grid Infrastructure release 2 (11.2) runs an Oracle mDNS daemon on each node, using multicasting on all interfaces to communicate with other nodes in the cluster.  By default Oracle used the 230.0.1.0/24 address range, but later released a patch to add support for multicast on 224.0.0.251/24. This has caused some issues for vendors as the 224 address was supposedly a reserved address range. Oracle note 1212703.1 details the issues, which can manifest as a failure to join the cluster during Oracle RAC installation.</p>
<p>Cisco is working on an updated Reference Architecture white paper which will include very detailed definition of the issues and workaround/resolution for all versions of the UCS Manager deployment. This paper will be made available on <a href="http://www.cisco.com/go/oracle">http://www.cisco.com/go/oracle</a> shortly, but in the meantime I have tried to document the decision tree as it stands:</p>
<p><a href="http://carlbradshaw.com/wp-content/uploads/2011/06/Oracle-RAC-Decision-Tree-GNS1.jpg"><img class="alignleft size-full wp-image-15" title="Oracle RAC Decision Tree GNS" src="http://carlbradshaw.com/wp-content/uploads/2011/06/Oracle-RAC-Decision-Tree-GNS1.jpg" alt="" width="572" height="448" /></a></p>
<p>As is shown in the diagram, use of the 230 address range requires an upstream IGMP querier to avoid timeout issues from the Fabric Interconnect.</p>
<p>Use of the 224 address range does not require the northbound IGMP querier but can present problems for servers with Palo network adapters that are not running the latest UCSM 1.4 software. This is due to the fact that network optimisations on the Palo card were not expecting multicast traffic on the reserved 224 address range, so we must explicitly set the interface to request all packets at the OS layer if we are using UCSM 1.3. The issue is resolved in UCSM 1.4, so if you’re using Palo and Oracle RAC go to UCSM 1.4 for an easy life.</p>
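<p>The two paragraphs above boil down to a small piece of branching logic, sketched here in Python. This is only my reading of those two paragraphs, not the full decision tree in the diagram, and the function name and its arguments are hypothetical rather than part of any Cisco or Oracle tool.</p>
<pre>
```python
def rac_multicast_advice(multicast_range, adapter, ucsm_version):
    """Return the actions suggested in this post for a given setup.

    multicast_range: "230" or "224", the two ranges discussed above
    adapter:         adapter name, for example "Palo"
    ucsm_version:    (major, minor) tuple such as (1, 3) or (1, 4)
    """
    advice = []
    if multicast_range == "230":
        # The 230 range needs a querier upstream of the Fabric Interconnect.
        advice.append("configure an upstream IGMP querier")
    elif multicast_range == "224":
        # No querier needed, but Palo adapters on UCSM older than 1.4 need help.
        if adapter.lower() == "palo" and (1, 4) > tuple(ucsm_version):
            advice.append(
                "set the interface to request all packets at the OS layer, "
                "or upgrade to UCSM 1.4"
            )
    else:
        raise ValueError("expected multicast_range of '230' or '224'")
    return advice


for setup in [("230", "Palo", (1, 4)), ("224", "Palo", (1, 3)), ("224", "Palo", (1, 4))]:
    print(setup, rac_multicast_advice(*setup))
```
</pre>
<p>An empty list means nothing extra is called for by the two cases described here; anything outside them should be checked against the diagram and the Cisco documents.</p>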
<p>As mentioned previously, the upcoming Cisco documents will cover all the detail required, but if you need further help in the meantime do not hesitate to contact me.</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://carlbradshaw.com/?feed=rss2&#038;p=11</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

