TCO / ROI
Capacity Conundrum Part Deux
Oct 12th
– The vTrooper Report –
This is a continuation of the Capacity Conundrum. If you missed the first part, start here.
$ per Compute VM
So let’s cut to the chase. For the compute tiles of our Quad we have a price per vCPU and a price per GB of RAM to settle. Keeping our example 2U server in play, we could expect to spend approximately $15,000 for a 2U fully loaded with 4GB DIMMs. Unfortunately, part of that $15K is consumed by I/O cards and maintenance, which need to be pulled out to get the compute number. For our argument we will use $10K for the compute system without the I/O cards and maintenance costs. This is the CAPEX we will offset in our $-per-unit values.
vCOMPUTE – FIREPOWAH!
We know how a CPU works, right? Move a process into memory, execute CPU cycles, churn, churn more, back to the I/O guys, rinse and repeat. Basically, this is where the hardware container happens in our data centers. I say container because it’s easy to show it as a box; it’s hard to define what it will always be in physical form. 1U, 2U, 4U, half blade, full blade, appliance, PC; you name it, it is probably in someone’s ‘datacenter’. The lowest common denominator I have been able to settle on across form factors is cores per RAM, grouped per socket. Grouping per socket fits because you are measuring the memory that is close to the CPU socket. The NUMA architectures of AMD and Intel, with on-board memory controllers and transports to the memory DIMMs that bypass the I/O controllers (e.g. the Northbridge), help define the grouping.
TECHNOTE: Every core has associated memory banks it will use, and every container (physical server) has a series of sockets that it controls. A hypervisor has a limit to how well it can keep a VM’s memory close to its vCPUs. Generally the hypervisor will schedule a VM’s vCPUs on the same socket and place the memory for those processes in the memory banks of that socket. It does this for the efficiencies of the x86 architecture. It can move the VM to another socket and readdress the memory, but there is a ‘cost’ associated with such a move. The path of least resistance is to stay in the same socket.
If you create a 4 vCPU VM and run it on a 1 core processor it gets bogged down. If you have the same VM on a two-socket quad-core host (8 cores), the four cores utilized by the VM are likely to all be on socket 1 or all on socket 2. The cost of splitting the vCPUs between the two physical sockets is greater than running them in the same socket. AMD delivered this earlier than Intel and sustained higher levels of virtualization consolidation “Per Host” than similar class Intel systems could provide through the Northbridge. Core i7 is a new game for Intel and the results of Nehalem show the improvements.
For more in-depth information, here is a good read: CPU Scheduler in VMware ESX
We have a host with a $10K CapX charge that has two sockets at a 4/45GB Socket Ratio, with approximately $5K spent on each socket. Looking at our hardware invoice, the CPU cores are about 25% of the cost of a socket, so we can assume that our per-socket cost breaks down into 25% core and 75% memory. So our Socket Ratio yields a $1250 cost for 4 cores and $3750 for 45GB of memory:
Per Core CapX = $312.50; Per GB RAM CapX = $83.33
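The socket split above is easy to check in a few lines of Python. This is only a sketch of the arithmetic using the figures from this post; the variable names are mine, and the 25/75 split is the invoice assumption stated above.

```python
# Figures from the post; names are illustrative only.
HOST_CAPEX = 10_000       # compute-only CapX for the 2U host, in dollars
SOCKETS = 2
CORES_PER_SOCKET = 4
GB_PER_SOCKET = 45
CORE_SHARE = 0.25         # share of socket cost attributed to cores (invoice assumption)

socket_cost = HOST_CAPEX / SOCKETS
per_core = socket_cost * CORE_SHARE / CORES_PER_SOCKET
per_gb = socket_cost * (1 - CORE_SHARE) / GB_PER_SOCKET

print(f"Per Core CapX:   ${per_core:.2f}")
print(f"Per GB RAM CapX: ${per_gb:.2f}")
```

Change `CORE_SHARE` or `GB_PER_SOCKET` to match your own invoice and the unit rates follow.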
That gives us a bare metal cost without a hypervisor charge on top, but we need a hypervisor to get a VM running. Adding in the cost of a per-socket license of ESX Enterprise Plus (worst case), you can add $3500 for each socket.
ESX Lic. Cost per socket CapX = $3500
Raw burn rate of the host would be $8500 per socket if we never loaded a VM on the Host. Well, we did it for a reason, so let’s get our money back. If we target the standard allocation for this host (4/45GB socket ratio) we get our target VM count of 16 per socket(1 vCPU/2.8 GB RAM). Also, keep in mind that we broke the socket cost down by 25% to CPU and 75% to Memory so we will keep that same split here. If we don’t do the split, then any VM that is deployed to the socket will bear the same cost regardless of its size.
ESX Lic. Cost per VM= $218.75 ( 3500 / 16 )
-Or-
Split by the 25/75 % we did previously for the cost of the CPU and Memory and you get a little different calculation.
3500 * .25 = 875, and 875 / 16 vCPUs = $55
3500 * .75 = 2625, and 2625 / 45 GB = $58
per vCPU=$55
per vMEM=$58
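Here is a quick sketch of both ways to spread the license, flat per VM and split 25/75, using only the numbers above (the names are mine). The useful sanity check is that for the standard VM on this socket (1 vCPU and 45/16 GB) the split lands on the same $218.75 as the flat charge; it only diverges once a VM is bigger or smaller than standard.

```python
LICENSE_PER_SOCKET = 3500   # ESX Enterprise Plus per socket (from the post)
VCPUS_PER_SOCKET = 16       # 4 cores at the 4:1 target ratio
GB_PER_SOCKET = 45
CPU_SHARE = 0.25            # same 25/75 split used for the hardware

# Option 1: every VM pays the same, regardless of size.
flat_per_vm = LICENSE_PER_SOCKET / VCPUS_PER_SOCKET

# Option 2: charge per vCPU and per GB of RAM.
lic_per_vcpu = LICENSE_PER_SOCKET * CPU_SHARE / VCPUS_PER_SOCKET
lic_per_gb = LICENSE_PER_SOCKET * (1 - CPU_SHARE) / GB_PER_SOCKET

# The standard VM (1 vCPU, 45/16 GB) should cost the same either way.
standard_vm = 1 * lic_per_vcpu + (GB_PER_SOCKET / VCPUS_PER_SOCKET) * lic_per_gb

print(f"flat per VM: ${flat_per_vm:.2f}")
print(f"per vCPU:    ${lic_per_vcpu:.2f}")
print(f"per GB:      ${lic_per_gb:.2f}")
print(f"standard VM under the split: ${standard_vm:.2f}")
```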
Adding it up with our target ratios in tow we get the burn rate of the $ per Compute on a VM basis.
($312 / 4, for the 4:1 vCPU-to-core ratio) + ($83 * 2.8) + {($55 * 1) + ($58 * 2.8)} = $530
Or Summarized: (vCPU = $78)+(vMEM = $233)+(Hyper$=219)=(vCOMPUTE = $530)
Assuming 8760 Hours (1 year) this VM would cost $.06/hr in vCOMPUTE.
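As a rough check, the same sum in Python, using unrounded unit rates and the 2.8 GB figure from above (the names are mine). Because the post rounds each term before adding, the script lands within a dollar or so of $530 rather than exactly on it.

```python
# Unit rates derived earlier in the post, left unrounded.
CAPEX_PER_CORE = 1250 / 4     # $312.50
CAPEX_PER_GB = 3750 / 45      # about $83.33
LIC_PER_VCPU = 875 / 16       # about $55
LIC_PER_GB = 2625 / 45        # about $58
VCPU_PER_CORE = 4             # the 4:1 target ratio
HOURS_PER_YEAR = 8760

vcpus, gb_ram = 1, 2.8        # the standard VM for this host

v_cpu = vcpus * CAPEX_PER_CORE / VCPU_PER_CORE
v_mem = gb_ram * CAPEX_PER_GB
hyper = vcpus * LIC_PER_VCPU + gb_ram * LIC_PER_GB
v_compute = v_cpu + v_mem + hyper
per_hour = v_compute / HOURS_PER_YEAR

print(f"vCPU ${v_cpu:.0f} + vMEM ${v_mem:.0f} + Hyper ${hyper:.0f} = ${v_compute:.0f}")
print(f"per hour over one year: ${per_hour:.2f}")
```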
Lets apply that to some other VM systems and see if it sticks. If we plan for the following VM deployment on our socket:

The costs would spit out as such:

Or slice it up into a per hour number:

So based on this analysis some of my VMs probably only cost $.05 per hour for vCompute. Interesting. What is more interesting is that the memory cost associated with a VM scales more accurately with consumption. You can have as much memory as you like for your new 4 and 8 GB aspirations (e.g. memory leaks); you just need to pay for it accordingly.
Too bad that only pays for the top part of my total cost model. That said, the benefit here is that this model can span hypervisors: any hypervisor on the market can be split up the same way to show the cost of a VM consumed on a Xen, KVM, VirtualIron, Parallels, or Hyper-V infrastructure.
I will be working on a few PowerShell scripts and Excel calculators that one can use to make this model more repeatable. At the very least, it is a model that I will use to evaluate CapacityIQ and third-party products like the offering from VKernel, and the output they measure, especially if they add costs on a per-socket basis, which I can now calculate as overhead.
Alas, there is more to consider. Stay tuned for Part III – “the I/O that binds”
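To price a VM of any size, the unit rates can be wrapped in a small function. This is a sketch only: `vcompute_cost` is a name I made up, and the S/M/L sizes are borrowed from Part 1 purely as sample inputs, not necessarily the deployment planned above.

```python
# Per-socket figures from this post.
CAPEX_PER_CORE = 1250 / 4
CAPEX_PER_GB = 3750 / 45
LIC_PER_VCPU = 875 / 16
LIC_PER_GB = 2625 / 45
VCPU_PER_CORE = 4
HOURS_PER_YEAR = 8760


def vcompute_cost(vcpus, gb_ram):
    """Yearly vCOMPUTE dollars for one VM: hardware share plus license share."""
    cpu = vcpus * (CAPEX_PER_CORE / VCPU_PER_CORE + LIC_PER_VCPU)
    mem = gb_ram * (CAPEX_PER_GB + LIC_PER_GB)
    return cpu + mem


# Sample sizes only (compute half of the S/M/L maxims from Part 1).
sample_sizes = {"S": (1, 1), "M": (2, 2), "L": (4, 4)}

for name, (vcpus, gb_ram) in sample_sizes.items():
    yearly = vcompute_cost(vcpus, gb_ram)
    print(f"{name}: ${yearly:,.2f} per year, ${yearly / HOURS_PER_YEAR:.3f} per hour")
```

Because the cost is linear in vCPU and GB, doubling both doubles the bill, which is exactly the "pay for what you ask for" behavior the flat per-VM charge can't give you.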
Capacity Conundrum Part 1
Sep 27th
–The vTrooper Report –
In an effort to gel up an internal billing and allocation model (GaaS – Gouging as a Service) I’ve been struggling with the concept of cost per VM. I asked a simple question in the twittersphere about that idea and it turned into a discussion and well…got out of hand. I apologize for that, as this is a better format to explain. (Special thanks to @asweemer for a dumping ground)
If I had a Nickel for each VM…
At VMworld 2009 there was a presentation in the keynote that showed the price of a VM hosted with Terremark: $.05 per hour. I thought, wow. A nickel per hour. If I had a nickel per VM per hour, how much would I have available to spend on coffee?
Then I thought, wait. I have VMs. How much do they really cost me per hour? Well, the answer is … it depends. Old servers with high power consumption and low density versus new systems with Intel 5500s packed in blades have different burn rates, visible to different systems (power, cooling, depreciation). I haven’t found a great model to break those units down to my satisfaction yet. I need another way.
As a general practice I keep a few maxims in mind when creating a VM.
- S – 1vCPU, 1GB Ram, 1GB Net, 10GB Disk
- M – 2vCPU, 2GB Ram, 1GB Net, 20GB Disk
- L – 4vCPU, 4GB Ram, 2GB Net, 40GB Disk
Seems simple enough, but it doesn’t really generate a cost model on a consistent basis. Hardware continues to change, and each VM that consumes resources does it at different rates and times of the day. A VM that isn’t doing anything isn’t really ‘consuming’ anything, right? I thought I would try to break it down further by creating a 4 quadrant block with two macro categories: Compute (CPU and Memory) and I/O (Network and Disk)
Each resource area could increase or decrease for a reason without changing the size of the original maxim it was created under. This allows for small variations of size without having a customer yell that their bill went up by $2 this month.
The Measurable Unit
Use a unit of measure to identify the four quadrants: vCPU : vMEM : vNET : vDISK or C:M:N:D . Then overlay the VM creation to count up the units. This way the growth of a ‘VM’ during its lifecycle can be adequately allocated back into the proper IT metric. Using the VM creation maxims up above this may be:
- S – 1:1:1:1
- M – 2:2:1:2
- L – 4:4:2:4
This isn’t perfect, but it at least allows the average CPU cost to be allocated separately from memory, network, and disk costs. After all, you don’t usually get to upgrade all four parts of the quadrant in the same fiscal year. This also gives you a way to trend your average cost rate per unit over a period of months and years to see which cost areas are improving. It is an interesting metric for the business and IT. Win-Win in my book, even if no one internally ever has to pay the values back (Showback). It also helps police which VM is consuming too much of a specific value, which would skew the numbers if you simply took the cost of the ESX hosts and divided by the number of VMs.
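For anyone who wants to count units in a script rather than a spreadsheet, the C:M:N:D idea maps neatly onto a tiny data structure. This is a minimal sketch; the `CMND` class and `total_units` helper are names I invented for illustration, and the inventory is made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CMND:
    """Units of vCPU : vMEM : vNET : vDISK consumed by a VM."""
    cpu: int
    mem: int
    net: int
    disk: int

    def __add__(self, other):
        return CMND(self.cpu + other.cpu, self.mem + other.mem,
                    self.net + other.net, self.disk + other.disk)

    def __str__(self):
        return f"{self.cpu}:{self.mem}:{self.net}:{self.disk}"


# The three creation maxims expressed in units.
SIZES = {"S": CMND(1, 1, 1, 1), "M": CMND(2, 2, 1, 2), "L": CMND(4, 4, 2, 4)}


def total_units(vm_sizes):
    """Add up the units for a list of size labels such as ['S', 'M']."""
    total = CMND(0, 0, 0, 0)
    for size in vm_sizes:
        total = total + SIZES[size]
    return total


# Hypothetical inventory: two small VMs, one medium, one large.
print(total_units(["S", "S", "M", "L"]))  # prints 8:8:5:8
```

Keeping the four counts separate is the whole point: each one can later be multiplied by its own unit rate instead of one blended per-VM number.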
Apples , Oranges, Lemons, and Grapes = Frutti Results
So you have a unit of measure and a type of system to match the measurement against over a period of time. Here’s where the fruit cart and the horse get hooked up.
This is all very complex, why can’t I just buy the same server I have purchased for the last 5 years?
Sorry, kids. They don’t build ’em like they used to. In today’s market, the UCS system from Cisco brings a new buzz alongside the original players: IBM, HP, and Dell. How do you sort any of that out among the offerings, and how do you select the right platform for your new ESX system? By the socket! Every system in the x86 family has them, from both Intel and AMD. And now that you have to pay for your hypervisor and additional tools (CapacityIQ, AppSpeed, Nexus 1000V) per socket, it matters more. I need to squeeze the value out of those sockets.
Still staying in the upper half of the Quad, let’s measure cores and RAM as a ratio, assuming dual rank 4GB DIMMs, and apply that to some of the standard 2 socket servers.
Standard Intel x5450
2 Socket – 4 Core – 16 Dimms (8 per socket) produces 4 cores/ 32 GB Ram
Standard Intel Nehalem x5500
2 Socket – 4 Core – 18 Dimms (9 per socket) produces 4 cores/ 45 GB Ram
Cisco UCS extension on x5500
2 Socket – 4 Core – 48 Dimms (24 per socket) produces 4 cores/ 96 GB ram
What this shows is that for every license of ESX consumed in the environment there are different amounts of memory available for a VM to use. The approach taken by the UCS system allows a much higher allowance of memory to a VM at the same licensing cost. Sure, you could buy 4-way servers and claim that the 256 GB of RAM gives the VM more allowance, but in reality the VM will still contend for vCPU and memory within each of the 4 sockets. You can change the size of the container by moving to a 4-way, but it won’t change the ratio of cores to memory for that container.
CPU Contention
The idea of CPU contention is becoming more visible to most administrators of virtualized environments because the desire to pack VMs onto a host is so strong. If I can get 10 VMs on a host for $5000, then getting 25 VMs on the same host lowers my cost per VM. It could also be cheating your customers of the performance they paid for, especially if you have multiple vCPUs assigned to those 25 VMs. This is where the ratio of VMs per host becomes obsolete and vCPU/core makes more sense.
Using the example containers above you can generate an expected number of VMs per socket. There is no reason to do a 1:1 ratio of cores to VMs because the point of virtualization is to run more with less. I think a good vCPU-to-core ratio to start with is 4:1 for production VMs and 16:1 for a VDI implementation:
Standard Intel x5450 - (4 /32 GB SocketRatio) yields 16 VM’s with a 1 vCPU/ 2GB ram configuration per socket
Standard Intel Nehalem x5500 - (4 /45GB SocketRatio) yields 16 VM’s with a 1 vCPU/ 2.8GB ram configuration per socket
Cisco UCS - (4 /96GB SocketRatio) yields 16 VM’s with a 1 vCPU/ 6GB ram configuration per socket
You can always adjust your actual deployment if these ratios don’t match up for your environment. The expected deployment number helps determine how large the pizza slices are for the team. Not how many slices each of them consume. In these configurations you can see where the density of the RAM per socket (SocketRatio) of the UCS allows for much larger VM configurations before overcommitment. A nice fit for the new 64bit installations. These expected numbers of VM per socket help determine what the burn rate of a C:M:N:D value is for the CapX spend you made.
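The expected allocation per socket is simple enough to script. This sketch uses the three SocketRatios exactly as listed above and the 4:1 production ratio; `expected_allocation` is a name I made up.

```python
VCPU_PER_CORE = 4   # production starting ratio from the post (use 16 for VDI)

# (cores per socket, GB RAM per socket) as listed above
SOCKET_RATIOS = {
    "Standard Intel x5450": (4, 32),
    "Standard Intel Nehalem x5500": (4, 45),
    "Cisco UCS": (4, 96),
}


def expected_allocation(cores, gb_ram, vcpu_per_core=VCPU_PER_CORE, vcpus_per_vm=1):
    """Return (expected VMs per socket, GB of RAM per VM)."""
    vms = cores * vcpu_per_core // vcpus_per_vm
    return vms, gb_ram / vms


for name, (cores, gb_ram) in SOCKET_RATIOS.items():
    vms, gb_per_vm = expected_allocation(cores, gb_ram)
    print(f"{name}: {vms} VMs at 1 vCPU / {gb_per_vm:.1f} GB each")
```

Passing `vcpu_per_core=16` gives the VDI flavor of the same socket, and `vcpus_per_vm=2` shows how quickly multi-vCPU VMs eat into the count.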
BurnRate
To fully understand how much a VM costs, one has to look at what was spent in the CapX of the host and agree on the measuring stick to measure the C:M:N:D value of the created VM. If a series of hosts are in service from different families and are at different parts of lifecycle there may have to be some averaging. The SocketRatio of Cores/RAM is a consistent way to measure systems from different form factors and families and levelset the expected allocation of VM’s. The expected allocation of VM’s for a host helps determine what density ratio is desired for vCPU:vMEM.
This is the end of Part 1 – In Part Deux I will take a deeper dive into the Compute and I/O areas and assign a more detailed cost-per-VM model.
Thinking about upgrading to vSphere 4? It’s a no brainer.
May 4th
Since I was in a meeting during the launch of vSphere 4 on April 22nd, and since I found myself wide awake at 3AM, I decided to watch the recording of the webcast early this morning. And as I was watching, I heard Steve Herrod (VMware CTO) make the following statement …
… So if you’re an existing customer today and you have a 100 host deployment using our vi3.5 product, simply upgrading the software will save you $2 Million dollars a year …
Wow, that’s pretty powerful. In an age when words like costly, frustration, and BSOD’s (Blue Screen of Death) are often associated with software upgrades, it’s no wonder many companies are taking an “if it ain’t broke, don’t fix it” approach. But here’s a software upgrade that, if for no other reason, should be considered purely for economic reasons.
How can VMware make such a bold claim? Steve’s statement was based on the following efficiencies you’ll achieve with vSphere4 (over and above what you’re already seeing with VI3):
30% Greater Consolidation
Most people know that you’ll get the greatest VM density with VMware due to superior technologies like memory over commitment and Distributed Resource Scheduling. But did you know that VM density is a critical metric when determining TCO? There’s a great blog post over at VMware: Virtual Reality which goes into detail. But the following graphic sums it up pretty well …
Simply put, with VMware you’ll have fewer physical servers to buy, fewer network and storage connections, less floor space and less power and cooling to support your virtual infrastructure … AND you’ll have superior functionality like VMotion, DRS, HA, etc.
If you’re an existing VMware customer, then you are already benefiting from the efficiencies afforded to you by VI3. And upgrading to vSphere4 is going to give you even greater efficiencies, allowing you to achieve an even greater VM density, as well as capture a greater number of high I/O applications that were previously considered non VM candidates. The following table taken from the webcast summarizes these performance improvements in vSphere 4.
50% Storage Savings
There are over 150 new features in vSphere 4. One of the more exciting features is Thin Provisioning. This feature is already included in VMware’s virtual desktop offering, VMware View, and I have a blog post about storage savings with View if you’re looking for more technical detail. But for this post, know that the technology has been applied to vSphere 4 and allows for significant storage savings.
Basically, Thin Provisioning allows the VM to consume no more space than the data requires. So, for example, if you have a VM with a 100G virtual drive but only 20G of data within the virtual drive, then only 20G will actually be consumed. When applied across all your VMs, you’ll achieve economies of scale and you’ll likely see a 50% reduction in storage, if not more.
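If you want to estimate this for your own environment, the math is just provisioned versus used. A minimal sketch, assuming you can export those two numbers per VM; `thin_savings` is a made-up helper, and the fleet below is invented except for the first entry, which is the 100G/20G example above.

```python
def thin_savings(vms):
    """vms is a list of (provisioned_gb, used_gb) pairs.

    Returns (total provisioned, total used, fraction saved by thin provisioning).
    """
    provisioned = sum(p for p, _ in vms)
    used = sum(u for _, u in vms)
    return provisioned, used, 1 - used / provisioned


# First entry is the example from the post; the rest are hypothetical.
fleet = [(100, 20), (40, 30), (60, 25), (200, 110)]

provisioned, used, saved = thin_savings(fleet)
print(f"provisioned {provisioned}G, consumed {used}G, saved {saved:.0%}")
```

Your actual savings depend entirely on how full your virtual disks are, so treat the 50% figure as something to verify against your own export.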
20% Power Savings with Distributed Power Management
What is Distributed Power Management (DPM)? Steve Herrod calls it “VM Tetris” or “Server Defrag,” which I thought was clever. During low server utilization, DPM will intelligently VMotion workloads down to the smallest number of acceptable physical servers and then power off the unused servers. As traffic increases during peak hours, DPM will power on the servers and again redistribute the workloads with VMotion.
Distributed Power Management isn’t new, as it was introduced over a year ago in VI3. However, up until vSphere 4, this feature was only experimentally supported. And with the lack of full support in VI3, I don’t believe many customers actually used DPM. But VMware supports DPM in vSphere 4, assuming your hardware has IPMI, WOL or iLO. And it can deliver significant savings in your power and cooling costs. Plus you get the added bonus of doing your part to save the environment.
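To get a feel for the “VM Tetris” idea, here is a toy consolidation pass using first-fit decreasing bin packing. To be clear, this is not VMware’s algorithm; real DPM works with DRS and weighs far more than a single load number. The function name, the loads, and the capacity are all invented for illustration.

```python
def consolidate(vm_loads, host_capacity):
    """Pack VM loads onto hosts with first-fit decreasing.

    Returns a list of hosts, each a list of the loads placed on it.
    Hosts not needed by the result are the ones that could be powered off.
    """
    hosts = []
    for load in sorted(vm_loads, reverse=True):
        if load > host_capacity:
            raise ValueError(f"load {load} does not fit on any host")
        for placed in hosts:
            if sum(placed) + load <= host_capacity:
                placed.append(load)
                break
        else:
            # No existing host had room, so keep another one powered on.
            hosts.append([load])
    return hosts


# Hypothetical overnight loads (percent of one host) across a 4-host cluster.
night_loads = [10, 5, 20, 15, 5, 10, 5]
packed = consolidate(night_loads, host_capacity=40)
print(f"hosts needed: {len(packed)} of 4")
for placed in packed:
    print(placed, "->", sum(placed), "% used")
```

The capacity is deliberately set well below 100% to leave headroom; how much headroom is acceptable is the kind of policy a real scheduler decides for you.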
Here is an awesome video some of the VMware engineers created showing DPM in action …
At this point, you’re probably asking the following questions …
- Sounds great, but how much is it going to cost me? Nothing. Your software maintenance covers like-for-like upgrades. So, if you have VI3 Enterprise, then you can upgrade to vSphere 4 Enterprise at no additional cost.
- Is it a difficult process to upgrade? Will it require massive configuration changes? Nope. The upgrade is actually rather simple. I upgraded my three lab VI3 servers to vSphere 4 in under an hour with no downtime of any of my VMs. Basically, Update Manager handled just about everything for me.
- Do I get anything else with vSphere? Heck yeah! Remember, there are 150 new features in vSphere 4, which I’m sure I’ll address in future posts. I only addressed the ones that will save you money.
So let’s see if I can summarize this properly … a zero cost, easy upgrade = 30% Greater Consolidation + 50% Storage Savings + 20% Power Savings. To me, that’s a no brainer. What other software company in the world offers that kind of value?
Virtualizing Tier 1 Applications
Aug 31st
With virtualization finding its way into every nook and cranny of the data center, it would seem that tier 1 applications are the only safe harbor for the few remaining “Server Huggers” out there. Their mantra usually sounds something like this …
“My application is too I/O intensive for virtualization,” or “My xyz application vendor doesn’t support VMware” or possibly “My application is too important to be virtualized” (this is one of my favorites). Believe it or not, I even heard one guy say “you can virtualize my server when you pry it from my cold dead hands” … um, wow. He has issues. Last I heard, he was de-virtualizing a server farm at the NRA. Hehehe.
Anyway, for the rest of us with our heads NOT buried in the sand, I’m here to tell you that tier 1 applications can and should be virtualized. I’ll go so far as to say that if you’re not virtualizing tier 1 applications, you are doing your company a major disservice.
Below is a brief overview of a presentation I gave in Cincinnati a few weeks ago to a group of about 75 professionals. The topic was “Virtualizing Microsoft Exchange.” And while the content that follows is geared towards the Microsoft Exchange application, it can really apply to any tier 1 application.
Performance
I’ll start with performance because this is typically the first objection to virtualizing a Tier 1 app. The perception is that virtualization creates too much overhead and therefore applications in a VM will certainly underperform applications running on a physical server. This current perception was born out of a previous reality. In the early days, virtualization really did introduce enough overhead to warrant physical servers for applications with high I/O. But a perfect storm is a-brewin’ and I summarize it with the following equation:
hypervisor improvements + server hardware improvements + application improvements =
better than native performance
That’s right. Mileage will vary, but given a properly architected solution, virtual can actually outperform physical. And even in scenarios where physical outperforms virtual, the delta is probably measurable, but not observable. So let’s take a closer look at the three areas I mentioned in the equation above.
Hypervisor Improvements
The hypervisor (AKA, the virtualization layer, AKA the Server Hugger’s worst nightmare) has come a long way in the past few years. And in VMware’s ESX product, the latest version has the following performance improvements over previous versions:
- Increased guest OS memory to 64GB
- Increased physical RAM on ESX to 256GB
- TCP segment offload to further lower CPU utilization
- NUMA optimizations improve multiple VM performance
- Support for 64-bit clustering with boot from SAN
These improvements alone can capture almost all tier 1 applications, but combined with the next two, almost no tier 1 app can hide from becoming a candidate for virtualization.
Server Hardware Improvements
We’re now seeing server hardware with 256GB+ of physical RAM. Multi-core CPUs with 2 and 4 cores are running in production today and 6/8/12 cores are coming soon. And best of all, hardware-assisted virtualization technologies are emerging, pushing the virtualization overhead down to the hardware and getting the hypervisor ever closer to native performance.
And because the vast majority of applications simply can’t fully utilize hardware with this much horsepower, ironically, virtualization is the only way to truly capture the full ROI of these physical investments.
Application Improvements
As applications continue to evolve, bugs are fixed and bad code is optimized, performance improvements within the application are being realized, further reducing the need for a physical server. Speaking specifically about Microsoft Exchange, the following performance improvements exist in 2007 over 2003:
| Exchange 2003 | Exchange 2007 |
| 32-bit Windows | 64-bit Windows |
| 900MB database cache | Multi-GB database cache |
| 4KB block size | 8KB block size |
| High read/write ratio | 1:1 read/write ratio |
| Requires high-end storage | Affordable storage (iSCSI) |
| Storage is common pain point | Eliminates storage pain point |
| | 50% reduction in disk I/O |
Of course the improvements for this piece of the equation will vary from one app to the next.
Bottom Line: Performance should not be a barrier to virtualizing an application.
A Virtual Server is Better than a Physical Server
Tier 1 applications are the most critical, important applications in your organization and therefore they need to run on the best infrastructure possible. So almost by definition, tier 1 applications need to run in a VM. Here are a few of my favorite reasons why a VM is better than a physical server. Keep in mind, these aren’t the only reasons, just my favorites.
Reason #1: Better up time
The “eggs in one basket” argument no longer applies. And for those of you who don’t know what I’m talking about, the objection usually sounds something like this … “If I put 30 VMs on a single physical server, and that physical server crashes, then I’ve just lost 30 applications instead of one!” This was a very legitimate concern five years ago. But today you can get better uptime in a VM than you can with a physical machine. In the worst case scenario, if a physical server dies, those VMs are automatically powered up on a different physical server. In my experience, the VMs are usually back up and taking requests in under two minutes (and yes, I’ve timed it with a stop watch). And this is worst case scenario for a VM today! What’s best case scenario for restoring a physical server after a hardware crash? Weeks? Days? Hours (if you’re lucky and really prepared)?
So with today’s technology (and it’s only going to get better with what’s coming soon), worst case scenario for a VM is better than best case scenario for a physical server. And you might ask, what’s best case scenario? Even with hardware maintenance, you can achieve 100% uptime with VMs. How? Check out a few of VMware’s features like VMotion, DRS and Update Manager.
Reason #2: Better hardware utilization
The average server utilization across the globe is less than 10% and in my experience, it’s often less than 5%. Why? A single application can rarely harness the power of the hardware it’s running on. And for a ton of different reasons (which I won’t go into here), critical applications typically require a dedicated server. That is like buying a Ferrari and never driving it more than 5 mph … what an awful waste! Get the most for your money by putting each app in a VM, running multiple VMs per physical server. Open that baby up and let it do what it was built to do! I think the following two screen shots do a great job of showing you what I’m talking about.
CPU Utilization Before VMware
CPU Utilization After VMware
Reason #4: Avoid over provisioning
Why waste time and energy planning for future capacity (which is really nothing more than an educated guess based upon a ton of assumptions)? The tendency has been to over provision hardware to account for future growth, but this often leads to under utilized hardware. With Virtual Machines, additional CPU and RAM can be added at any time with a few clicks of a mouse. And moving to more powerful systems in the future can be done in real time with VMotion and/or Storage VMotion. With virtualization, it only makes sense to simply build your application for the capacity you need and then throttle as necessary.
Reason #5: Better Security
Typically, protection engines come in two forms, host based and network based. The problem with network based security software is that it has no (or very limited) visibility into the host. And the problem with host based security software is that it’s running in the same context as the malware that it’s trying to protect against. And the creators of malware are not stupid! They continually find new ways to hide their malware and/or attack the protection engine, creating a never ending vicious circle of cat-and-mouse.
But we now have a new, trusted layer with the much smaller codebase of the hypervisor where we can provide protection from outside of the operating system. A protection engine from this layer provides a much stronger defense because it’s “underneath” the VM, completely isolated from the malware. And this is a great place for a protection engine to live because it can see all I/O of the VM and inspect each of the virtual components (CPU, Memory, Network and Storage). Better yet, we now have the ability to do things like:
- Intercept, view, modify and replicate I/O traffic from one, many or all VMs
- Provide inline protection or passive monitoring
- Mount and read virtual disks
Reason #6: DR made easy
In the physical world, DR is a pain in the butt and super expensive. The reason is that DR solutions for physical servers often require similar hardware at the DR site to avoid issues with driver, hardware, and software compatibility. These dependencies are eliminated in a virtual world, which means any VM can run on any physical server with an ESX hypervisor. And because a VM is completely encapsulated, the entire VM exists in a small set of files. This simplifies replication and therefore simplifies the process of keeping your production and your DR environment in sync. And finally, servers at the DR site can be used for other purposes, like test and development, until they are required for DR purposes. Which means an investment in a DR infrastructure will not sit idle.
Support
I love it when I hear someone say “my application vendor says they won’t support VMware.” Hmmmmm. Here’s a crazy question for ya, isn’t it VMware’s job to support VMware? Now, I’m sure what they really mean is that the vendor won’t support their application in a virtualized environment. But just to make things clear, if you have a problem with VMware … call VMware.
And support for applications in a virtualized environment is rapidly changing. Examples are numerous, but two big ones that come to mind are SAP and Microsoft. In the earlier part of the year, SAP announced full support for their software on VMware. And just recently, Microsoft announced the Server Virtualization Validation Program (SVVP) where they will support their OS’s and a good list of their applications in a virtualized environment. And VMware’s ESX is the industry’s first hypervisor to be validated by Microsoft.
What about those vendors who still don’t support their applications in a virtualized environment? Most of my customers do two things. First, they put pressure on the vendor to start providing support. For large companies, this can be very effective since the software providers want to keep their big customers happy. Second, many of them have a “swing server.” So when a vendor’s support team requires them to reproduce the problem on physical hardware, they simply V2P the VM on the swing server and continue on their merry way. (Yes, I know, this isn’t always as easy as I make it sound. Though it often can be just that easy)
Still not convinced?
The table above shows the results of a survey of 500 VMware customers taken over a year ago, and the numbers are growing rapidly. Simply put, customers are virtualizing tier 1 applications today.