VMware

A glimpse of things to come? … PCoIP at 30,000 feet

I’ve got to tell you, I’m pretty darn excited right now.  Why?  I’m typing this to you from 30,000 feet on a Delta flight from Cincinnati to Las Vegas (for VMware Partner Exchange).  And why is that so special?  Because, as the title suggests, I’m typing this on my VDI image which resides hundreds of miles away and thousands of feet below me.

Delta has a fairly new service from gogo called “gogo inflight … wi-fi with wings.”  This is my first time using the service because, on the past few flights I’ve taken, I’ve either not had the need to connect or the aircraft I happened to be on did not yet have the service.  But this time I have some work to do (i.e. my next “confessions” article for VSM), so I figured I’d give it a whirl.  And, being a glutton for punishment, I decided to see if I could push the limits of PCoIP.  After a quick sign-up form (gogo isn’t free) and firing off a VPN connection back to my home office, I launched the View client and crossed my fingers.

And I can tell you that I am thoroughly impressed!  Windows are snappy, Flash is decent and low-end multimedia is adequate.  I was watching a YouTube video with full sound and, while the picture was a little blurry and sound/video sync was slightly off, it was totally watchable.  And furthermore, it didn’t cripple my session.  Not bad, considering my latency is between 150ms and 250ms, with an estimated average of about 200ms.

Is this a glimpse of things to come?  Right now it may seem pretty far-fetched.  After all, the process to connect to my desktop image was fairly painful.  I had to …

  1. Boot into my local OS
  2. Connect to the gogo inflight wireless access point
  3. Launch my Firefox browser and walk through the gogo signup form
  4. Dig through my briefcase for my wallet and pay for the service
  5. Fire off my OpenVPN client to my home VPN server
  6. Launch the VMware View Client

Not exactly what I’d call a seamless user experience.  And I believe that conquering this experience – that is, the mobile user experience – will be the coup de grace for traditional desktop infrastructure.  Until then, virtual desktop infrastructure will certainly happen in pockets, but massive, wide-scale adoption will continue to elude us.  So what has to happen here?  As I see it, the following things need to happen …

True ubiquity of wireless Internet

This means two things.  First, the Internet has to be everywhere at all times.  I’m a true mobile user and I need to know that no matter where I am – whether it be on a puddle jumper, or in a remote country hotel – that when I power on my laptop, I will have access.

And second, this also means the connection to the Internet has to be completely integrated and transparent.  I don’t want to have to dig for my credit card every time.  But even more than that, I want the connection to happen for me automatically, in the background, as part of the boot process.  My software client should auto detect the available wireless networks, connect, and debit my account.  Will I have a single unified account that works across all providers?  Or will I have multiple accounts that my software client will handle?  Or will it be a single wireless / satellite provider that can reach me anytime, anywhere?  I don’t know and I don’t really care.  The point is, I don’t want to deal with it.  I want to press power and, after a short boot (maybe even zero boot?), have access.  Period.

A purpose-built Thin OS

Booting into a local OS just to launch a client and connect to a remote OS just isn’t going to cut it.  The boot process needs to be fast and do nothing more than present me with a login GUI.  If I’m remote, the VPN connection (and any necessary login parameters) needs to be part of the login process.  There’s no need for a full-blown local OS if our goal is to do little more than connect to our primary desktop environment.  Sure, we hardcore tech weenies will almost always want some sort of backdoor access to the local OS.  But 99% of the users out there don’t care and just want a seamless desktop experience.  In fact, if done correctly, they shouldn’t even know that there is a local OS or that their desktop is actually running in a remote datacenter.

Does this actually exist yet?  Sort of.  ThinClients typically deliver this kind of user experience.  But for the most part, ThinClients aren’t mobile devices.  I’ve seen a ThinClient laptop model before, but I don’t know a single person actually using one.  I’ve actually seen far more cases of customers converting PCs and laptops to ThinClients.  Theron Conrey gives us a great example with his blog post VMware View Linux Live CD How-to.  And there are enterprise solutions for converting PCs to ThinClients from both Wyse and DevonIT.  So, we’re pretty darn close on this front, but still not 100%.

A rich user experience in low bandwidth, high latency environments

Like I stated earlier, my current PCoIP experience is pretty darn impressive.  It is, by far, the best experience I’ve witnessed on a remote desktop.  But I’m not sure the average in-flight user would be ecstatic about it.  Sure, all things considered, you can’t beat it.  But I recognize all the variables working against me right now.  The typical user will not know or even care.  They just want it to work.  The good news is that PCoIP will continue to improve and brings the promise of delivering a rich user experience, whether at 30k feet or a single switch port away.

So, I ask again, is this a true glimpse of the not-too-distant future?  Ten years ago, I was the only one of my friends and family to have a cell phone.  Five years ago, mainstream virtualization in the datacenter was laughed at.  And a few short months ago, typing this blog post on my VMware View image was impossible.  So, you tell me.


VMware Virtual Center Attributes – It’s all about the details


Overview

A fellow engineer extraordinaire (Mike Evans) inspired me to write up this blog post.  Mike and I have been using the “notes” attribute for virtual machines for a few years.  It has come in very handy to track who requested the virtual machine resource, and the date the virtual machine was provisioned.  If you’re not familiar with the notes field, it’s at the bottom of the summary page of a virtual machine’s properties.

image

This little piece of information might seem trivial to the layperson, but the larger your virtual environment grows, the more complex it becomes.  Having a way to track this fluid, ever-changing infrastructure becomes more and more important as you begin to scale up and out.

Attributes

The “Notes” field was great for us, except that we began to notice variations in the details we had entered into each virtual machine’s properties.  Probably not a huge deal if you have a small VMware environment, but when you start tracking several hundred virtual machines, it really starts having an effect on reporting.  A newer feature that was added to Virtual Center was the ability to use attributes, or pre-defined fields that can be populated.  This gives a VMware administrator the ability to have a common format for reporting on Virtual Infrastructure.  Below is a screen shot of the Custom Attributes you can find in Virtual Center:

image

Notice there are three different categories displayed in this view: Global, Host, and Virtual Machine.  You can set attributes at each place in your VI environment that you wish to track.  You can see we are interested in Virtual Machine attributes for certain variables (Owner, Provision Date, Provisioned By, Purpose).  We have a different interest at the host level (Build Date) for maintenance tracking of physical hardware assets.

Reporting

Here is where all your hard work starts to pay off.

It’s audit time.  You are tasked with trimming the fat in your environment because once again you are out of capacity, and the budget just got crushed for the rest of the year because (insert your reason here) the UPS batteries just exploded!  Go into Virtual Center and generate a report of your virtual infrastructure so you can see who owns what, and what date it was deployed.  Go to your Virtual Machine view, select your datacenter, go to the menu option “Export” and then select “Export List”.  Save the export as an Excel spreadsheet, and view your results.  Notice the highlighted columns K through N; these are the custom attributes that we added above.

image
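If you would rather summarize the export from a shell than in Excel, something like the following could do it.  This is only a sketch under a few assumptions: the list was saved as plain comma-separated text rather than as a spreadsheet, no field contains an embedded comma, and the header row uses the attribute name (for example Owner) as the column title.  The function name count_by_column is made up for illustration.

```shell
#!/bin/bash
# Count rows per distinct value of a named column in a simple CSV export.
# Assumes a header row and no quoted fields that contain commas.
count_by_column() {
    local file=$1 column=$2
    awk -F',' -v col="$column" '
        { sub(/\r$/, "") }                       # tolerate Windows line endings
        NR == 1 {                                # locate the column in the header
            for (i = 1; i <= NF; i++) if ($i == col) idx = i
            next
        }
        idx {                                    # only count if the column exists
            key = ($idx == "") ? "(blank)" : $idx
            count[key]++
        }
        END { for (k in count) print count[k], k }
    ' "$file" | sort -rn
}
```

Calling count_by_column with the export file and Owner prints one line per owner with the number of virtual machines they hold, largest first, which also makes blank Owner fields easy to spot.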

Conclusion

Virtual Center custom attributes are a great way to help manage your growing environment.  Sit down with your team, or your potential customer, and find out what values matter most in your environment.  Create the custom attributes at the various places in Virtual Center.  Make sure you are diligent about filling out the details when you bring up new systems, and make it part of your internal process and documentation.  You will thank yourself down the road.


VMware vSphere Capacity IQ Overview – I’m Impressed!


With the launch of VMware vSphere came some new products that I hadn’t really paid much attention to (busy upgrading I guess).  One of the newer products is a Virtual Center reporting tool called Capacity IQ.  This product gives an administrator the ability to analyze, forecast and plan for future growth across an ESX environment.  I have had a lot of experience with monitoring/reporting tools in the past (I won’t bore you with the details), so I was quite skeptical of a 1.0 reporting tool for Virtual Center.  I must admit I was blown away by the immediately relevant reports the product was able to produce.

After pulling down the trial install and obtaining the demo key, I loaded it up for a spin.  I am not going to document the installation steps needed as Eric Gray has done this for us already.  It is by far the easiest reporting application I have ever installed.  If you’re interested in taking it for a trial run, download the virtual appliance from VMware’s website here (OVF format).  Once you import the virtual appliance and give it a static IP address, it will need to collect data about your environment for a while.

There are three basic views that CIQ gives you once you install the plug-in: dashboard, views and reports.

Dashboard

The dashboard tab is designed to give you a quick overview of the item you have selected.  Capacity IQ uses the same approach as Virtual Center does: whatever object you have selected will be reported and focused on.  Here is a view of one of our clusters; notice January 11th on the Trend and Forecast graph on top.

Dashboard

One of our clusters was out of resources, so I added two more physical hosts to the cluster.  You can see CIQ picks up the new physical host resources for the cluster and reflects this by increasing the number of virtual machines it believes the cluster can accommodate.  Want to see something even more interesting?  Check out the pink graph on the 17th.  Capacity IQ is already using a prebuilt formula to estimate what it thinks we will have (or won’t have) a week out.  Pretty impressive.

Views

The views tab is designed to give you a more detailed look at some of the specific data points.  Here is a screenshot of the various reports you can execute:

Views

So here is where you can get some great visual reports to present to either upper management, or a potential customer.  This gives you a nice interface that you can customize with data points that you can tweak.  Check out the first report on this cluster:

image

This gives you a graphical historical view of your cluster: how many virtual machines you have added over the course of time.  Notice the horizontal sliding bar at the bottom of the chart.  This allows you to adjust your time/date window.  The lighter shaded line to the right is the forecasted growth of the cluster.  The views tab is a great place to run some ad-hoc reports; it gives you the ability to select the type of report, and even allows you to export the data.

Reports

The reports tab holds the “pre-canned” reports that can be executed by the administrator.  The one thing I was disappointed not to see here was the ability to schedule these reports to run at a particular interval (weekly/monthly).  This is something that I assume will be introduced in future releases of the product.

Reports

After the report is executed and compiled, you are then provided with a .pdf or .csv version of your dataset to download and review.  The first report totaled 17 pages and provided some great technical information.  Here is the table of contents:

image

Conclusion

I am very impressed with Capacity IQ.  There are no agents you need to install across the virtual machines you wish to report against.  The installation was very straightforward; I think I had it up and running in about 15 minutes.  Once the virtual appliance was in place, all it needed was a little bit of time to start crunching some data.  The reports are well written and very relevant to what an administrator would wish to see.  If you’re looking for a nice reporting tool to help you forecast, give this one a test to see if it fits your needs.


Mass upgrade of VMware Tools in Linux guests


Installing and/or upgrading VMware tools has always been a bit more complicated for Linux guests than for Windows guests.  After the installation of the package binaries, the vmware-config-tools.pl script must be run to configure the tools for your environment.  This script has to be run from the console, which is a pain when you’ve got more than just one or two Linux VMs.  And may the good Lord help you if the modules aren’t suitable for your running kernel and you don’t have a compiler (or the C header files for your running kernel) already installed.

When VMware added the Automatic Tools Upgrade …

image

The situation certainly improved, but it is by no means a foolproof solution.  In my experience, it doesn’t work 100% of the time for Linux guests (though this *could* be due to the heavy modification I’ve done in my distro).  And furthermore, what if you want to automatically upgrade hundreds of Linux guests, not just one?  Or what if you’ve already got a deployment tool that you’d like to use to push the tools out?  (Kind of tough when the script needs to be run directly in the console.)

So, I looked to see if there was a way to improve the situation.  First, I needed to find a way to run vmware-config-tools.pl remotely in an automated fashion.  And by the way, it’s not that you can’t run this script remotely via SSH, because you can.  The problem is that when you do so, you immediately get the following question …

 

It looks like you are trying to run this program in a remote session. This program will temporarily shut down your network connection, so you should only run it from a local console session.  Are you SURE you want to continue?

 

Unfortunately, to run vmware-config-tools.pl remotely, we need to include the -d flag so that the script will automatically select the default answers to all of the questions for us.  And the problem is, the default answer to this question is “no.”

So I looked through vmware-config-tools.pl and found that it’s really only checking to see if the SSH_CONNECTION environment variable is set.  Well, that’s easy … simply executing vmware-config-tools.pl in a different shell, one that doesn’t carry that variable, allows us to sidestep this.
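If the check really does hinge only on SSH_CONNECTION, one way to picture the side step is to start the command with that variable removed from its environment.  This is a minimal sketch and not necessarily the method used in the script further down: the helper name without_ssh_env is made up for illustration, and it assumes an env command that supports the -u option.

```shell
#!/bin/bash
# Run a command with SSH_CONNECTION removed from its environment.
# Everything else in the environment is passed through unchanged.
without_ssh_env() {
    env -u SSH_CONNECTION "$@"
}

# Hypothetical usage on a guest, from within an SSH session:
#   without_ssh_env /usr/bin/vmware-config-tools.pl -d
```

The calling shell keeps its own SSH_CONNECTION; only the child process sees an environment without it.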

Next I just created a simple bash script that gets pushed out to the /tmp directory along with the vmware tools installation package (also pushed to the /tmp directory) and gets executed remotely by my deployment tools (which for me are just more bash scripts, but this should work with any enterprise deployment tool).  Here’s the simple script I used for my guests …

 

#!/bin/bash
# Upgrade VMware Tools from an RPM that has already been pushed to /tmp.

LOG=/tmp/vmware_tools_upgrade.log

# File name of the VMware Tools package sitting in /tmp
RPM=$(ls /tmp | grep VMwareTools)

rpm -e VMwareTools
echo "Old VMwareTools removed" > "$LOG"

rpm -i "/tmp/$RPM"
echo "$RPM installed" >> "$LOG"

# Run the configuration in a fresh login shell, which does not carry
# SSH_CONNECTION, so the remote-session question is never asked.
su -l root -c "/usr/bin/vmware-config-tools.pl -d"
echo "vmware-config-tools.pl -d executed" >> "$LOG"

service vmware-tools restart
echo "vmware-tools restarted" >> "$LOG"

service network restart
echo "network restarted" >> "$LOG"

exit

 

This is obviously a very basic script and could easily be enhanced with better logging and error handling.  Also, for Debian distros, such as Ubuntu, you’d need to modify this script to handle the tar.gz installation package … unless, of course, you’ve modified your distro to handle RPMs (as I have).
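As one idea for the logging and error handling mentioned above, each step could be wrapped in a small helper that records whether it actually succeeded.  This is a sketch only: the helper name run_step is made up, and the log path simply reuses the one from the script.

```shell
#!/bin/bash
# Log file; can be overridden from the environment.
LOG=${LOG:-/tmp/vmware_tools_upgrade.log}

# Run one step, send its output to the log, append a timestamped
# OK/FAILED line, and return the step's own exit status.
run_step() {
    local desc=$1
    shift
    if "$@" >>"$LOG" 2>&1; then
        echo "$(date '+%F %T') OK: $desc" >>"$LOG"
    else
        local rc=$?
        echo "$(date '+%F %T') FAILED ($rc): $desc" >>"$LOG"
        return "$rc"
    fi
}

# Hypothetical usage inside the upgrade script:
#   run_step "remove old tools" rpm -e VMwareTools
#   run_step "install new tools" rpm -i "/tmp/$RPM" || exit 1
```

With this in place, the log says what happened rather than what was attempted, and the script can stop before touching the network if the install itself failed.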

The good news is that, at least for my environment:

  1. This works 100% of the time and a restart of the VMs is not necessary. 
  2. I no longer have to upgrade many guests by hand.

However keep in mind, there is still a network outage during the upgrade (usually just about a minute or two), so be sure to continue using a maintenance window for your upgrades. 
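For anyone without a deployment tool, the push step itself can be sketched in a few lines of bash.  Everything named here is an assumption for illustration: the host name, the file names, and the fact that root can log in over SSH with keys.  By default it runs in dry-run mode and only prints the commands it would execute.

```shell
#!/bin/bash
# Set DRY_RUN=0 to really execute; the default only prints the commands.
DRY_RUN=${DRY_RUN:-1}

# Either print a command (dry run) or execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Copy the tools package and the upgrade script to /tmp on one guest,
# then execute the script there.
push_upgrade() {
    local host=$1 rpm=$2 script=$3
    run scp "$rpm" "$script" "root@${host}:/tmp/" &&
    run ssh "root@${host}" "bash /tmp/$(basename "$script")"
}

# Hypothetical usage, one guest per line in guests.txt:
#   while read -r guest; do
#       push_upgrade "$guest" VMwareTools.rpm upgrade_tools.sh < /dev/null
#   done < guests.txt
```

Because the network restarts while the script runs, the SSH session may drop before the script finishes reporting back, so check the log on the guest afterwards rather than relying on the exit status of ssh.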


vSphere 4 Update 1 with Update Manager Shenanigans


Happy New Year!  I hope everyone enjoyed the holidays and got to spend some time with friends and family.  If you’re reading this, I suggest you pay tribute to the quality of Virtual Insanity and give the gift of voting.  Eric Siebert has released a “best of 2009 blog contest”.  If Virtual Insanity has helped you out in some way in the past, I suggest casting a vote for this great virtualization blog space!  Ok, on to the real reason for this post…

I ran into an oddity while bringing a new host online today into our vSphere environment, and thought it best to publish my findings.  Hopefully this might save someone a support call.  With vSphere 4 update 1 came a couple of technical issues, which are detailed here and here.  We don’t use ESXi, so only the first one was a major issue for us.  We are an HP shop, so the issue around the HP agents and update 1 was a major concern (it basically would render the host unbootable).  Luckily VMware support is proactive about announcing issues like this to the community and most people were aware of the problem right away.

The problem I hit today was strange and I thought it was just being off from work for a week.  I went to apply our update 1 baseline to a new host I was bringing up, rescanned, and then got this:

compliant1

What the?  I know this isn’t compliant; our base build is still at 4.0.  Check out the build number, that’s proof.  I have used the update 1 baseline for 50+ hosts so I know it’s not that.  So maybe update manager is still on holiday as well?  I restart the service and life is good?  Nope.  Same thing.

To make a long story longer, I poke around in the repository and check the update 1 patch and see it’s valid, yep 11/19/09 that’s the right release date.  Why is this thing not working?

update1-first

I kept poking and prodding thinking maybe they released an update to the update?  Sure enough it slipped by me when I wasn’t looking, or it went to my spam mail.  Check the date 12/9/09.

update1-second

I created a new test baseline, dropped the 12/9/09 update 1 into it, and applied it to my new host.  Lo and behold:

compliant2

That’s much better.  Strange the older update 1 patch didn’t reflect anything and showed compliant.  As an end user I would have liked to have seen some type of error message, or a reference to the newer released update 1.  Ran the new update, (still stopped the HP agents just in case).  And now things look good again (build number):

looksbetter

Conclusion

Go vote for this site, and make sure you update your update manager, update 1 baseline.  That’s a lot of updates.  See you online!

Scott


A handy new addition to the Command Line Tool for View 4

First things first

Thanks to Scott Sauer (@ssauer) and John Blessing (@vTrooper) for holding down the fort here at Virtual Insanity while I’ve been finishing up some unfinished projects and preparing for the VCDX Design Exam (which I take later this month).  One of Scott’s posts actually won a vSphere blog contest.  Nice work Scott!  These two guys are becoming pretty good friends of mine here in the Cincinnati area, so hopefully I can convince them to keep the content flowin’.

An itch I couldn’t scratch

I’ve mentioned here on this blog, at least once or twice, that I “eat the dog food” and actually run my primary XP desktop as a VMware View image.  Since the conversion almost a year ago, everything has been running pretty well with only a few minor bumps along the way.  And with the recent addition of PCoIP, I can’t imagine ever going back.

But there was one little recurring problem I was having for which I couldn’t seem to find an answer.  It wasn’t a show stopper of an issue, but it was just an “itch I couldn’t scratch,” if you know what I mean.  And the problem went something like this …

  1. Inside my desktop VM I have a Cisco VPN client, necessary for a secure connection back to corporate HQ in Palo Alto, CA. 
  2. When connecting to my desktop with the VPN client inside the VM inactive, I had no issue.
  3. However, if I disconnected from my desktop while the VPN session was active, then I couldn’t reconnect to my desktop via VMware View.

The reason?  The broker was sending me the new IP address of the Cisco VPN Adapter, which is an IP address on the VPN, and an IP address my local computer didn’t know about. 

Now, if I were to log off instead of disconnect from my desktop, this would terminate the VPN session and therefore wouldn’t be a problem.  But who wants to log off every time?  More often than not, I have things open on my desktop (e.g. half written emails, documents, browsers with many many open tabs, etc.) that I don’t want to bother saving and closing every time I step away from the computer.  And really the bigger issue is with unintentional disconnects that result from local power/network/OS issues.

I tried all sorts of things to fix this.  Among other things, I tried …

  1. Reordering the NICs, hoping the broker was just grabbing the first NIC. 
  2. Poking around the broker and agent install files, hoping to find a way to force the IP address. 
  3. I even tried uninstalling and reinstalling the View agent and the Cisco client, hoping the order of installation might do the trick (admittedly, this was a random shot in the dark)

But nothing seemed to work.  So until recently, to reconnect I would have to connect directly to my desktop via RDP, or connect to the console via the VMware Infrastructure Client, then disconnect the Cisco VPN and then reconnect via the View client. 

See what I mean?  Not a show stopper, but man what a pain in the butt! 

The solution

Well, I found a way around this with a handy new addition to the Command Line Tool in View 4.  Check out page 12 of the Command Line Tool for View Manager document, in the section titled “Override IP Address.”  On the broker, from a DOS prompt in the c:\Program Files\VMware\VMware View\Server\bin directory, execute the following …

vdmadmin.exe -A -d <desktop name> -m <machine name> -override -i hostname

The “desktop name” is the name of the VM in the broker.  The “machine name” is the name of the VM in vCenter.  It’s likely they’ll be the same, but they don’t have to be and in fact, in my case they weren’t the same.  The “hostname” can be either a FQDN or an IP address.  Oh, and I can tell you that all parameters must be present or the command won’t execute. 

But that was all there was to it.  Now I can disconnect and reconnect to my desktop, regardless of the state of my VPN client.


More Bang for your Buck with PVSCSI (Part 2)

Part 2 Doing the work

As you might have noticed, this blog post is a continuation of my first post about PVSCSI; you can access Part 1 here.

Hopefully now you have a better understanding of what the Paravirtual SCSI driver is all about, and we can prove there are tangible reasons to move in this direction.  Let’s get on with the important part, the implementation phase.

SCSI2

(I need to finish off this blog post, I am running out of pictures of SCSI cables)

There are some caveats I need to start out with.  In case you missed it, PVSCSI drivers on virtual machines aren’t supported on operating system disks unless you are running vSphere 4 update 1.  You can use the driver on a secondary data disk if you so desire, but for this post I am going to assume you are running vSphere 4 update 1 (Virtual Center and ESX Hosts) and want to know how to get the driver working on all disks.

In most cases, it’s easier to build new.  You know you have a clean install, the drivers are updated, the configuration is solid.  I would suggest updating your templates to include the new paravirtual SCSI driver.  Your existing virtual machines run fine with their existing configurations, and depending on your environment, it might be a lot of work to go back and target all of your virtual machines.  For an upgrade path, my personal opinion would be to target your heavy I/O virtual machines.  Upgrade the VMs that will make a difference, and you will see some immediate benefits.  Reducing the I/O load on the disk subsystem will only benefit the other virtual machines that might share those same physical disk spindles.

Clean install

This section will walk you through the process of installing the driver with a Microsoft Windows 2003/2008 operating system.  Currently these two operating systems are the only ones supported.  Hopefully we will see some added support for the various Linux operating systems down the road.

Walk through the “New virtual machine Wizard” as you normally would.  On step 9, ensure you select the “VMware Paravirtual” option as seen below.

para_wiz

Before powering your new VM up, you need to connect the virtual floppy image file that has the driver for your desired guest operating system.  This is not on the VMware.com website under downloads; it already exists on your ESX host.  You will need to browse to the following location on your ESX host: [Datastores]\vmimages\floppies.  I would wait to connect your floppy disk image until after you boot off the Windows CD-ROM so it doesn’t try to boot off the floppy drive.

pvscsi-flop

When you power up your new virtual machine, select the F6 option to tell the operating system you need to use a third party SCSI driver:

windowsf6

Now connect your floppy disk image to your virtual machine under the “edit settings” option.  You should now be able to point the operating system to the driver as seen below:

pvscsi_select

Continue on with your normal installation, and you are complete.  Your new virtual machine is now utilizing the Paravirtual SCSI drivers.  I suggest now converting this image you created to a template for future deployments with this configuration.

Upgrading an Existing Virtual Machine

To upgrade an existing virtual machine, the process is pretty straightforward.  Assuming you have already upgraded to the latest virtual hardware (Version 7), make sure your VMtools are upgraded post Update 1.  Shut down the VM, edit the settings, and select “Change Type” as shown below:

chng1-pvscsi

You will get another window that will allow you to change the type of controller as seen below:

chng2-pvscsi

Select the “VMware Paravirtual” and then select ok.  Boot up your virtual machine and you are all set.  Your system is now running with the updated drivers and you can take advantage of the newer drivers that provide better throughput and less latency!
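If you want to confirm from the service console which controller type a virtual machine ended up with, the setting is recorded in the VM’s .vmx file.  The sketch below assumes the controller is scsi0 and that the type is stored under the virtualDev key (for example pvscsi or lsilogic); the function name scsi_controller_type and the path in the usage comment are made up for illustration.

```shell
#!/bin/bash
# Print the virtual device type configured for a SCSI controller
# in a .vmx file.  The controller name defaults to scsi0.
scsi_controller_type() {
    local vmx=$1 controller=${2:-scsi0}
    # Match a line such as:  scsi0.virtualDev = "pvscsi"
    sed -n "s/^${controller}\.virtualDev *= *\"\(.*\)\" *\$/\1/p" "$vmx"
}

# Hypothetical usage:
#   scsi_controller_type /vmfs/volumes/datastore1/myvm/myvm.vmx
```

An empty result means the key is not present in the file, in which case the VM is using whatever default controller its guest OS type implies.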

Hope you found this post useful.  Good luck!

Scott


Capacity Conundrum Part Deux

 

– The vTrooper Report –

 

This is a continuation of the Capacity Conundrum, if you missed the first part start here.

$ per Compute VM

So let’s cut to the chase.  In the case of the compute tiles of our Quad we have a price per vCPU and $ per GB of RAM to settle.  Keeping our example 2U server in play, we could expect to spend approximately $15,000 for a 2U fully loaded with 4GB DIMMs.  Unfortunately, part of that $15K is consumed by I/O cards and maintenance, which needs to be pulled out to get the compute number.  For our argument we will use $10K for the compute system without the I/O cards and maint. costs; this is the CAPEX we will offset in our $/per values.

vCOMPUTE – FIREPOWAH!

We know how a CPU works, right?  Move process into memory, execute CPU cycles, churn, churn more, back to the I/O guys, rinse and repeat.  Basically, this is where the hardware container happens in our data centers.  I say container because it’s easy to show it as a box; it’s hard to define what it will always be in physical form.  1U, 2U, 4U, half blade, full blade, appliance, PC; you name it, it is probably in someone’s ‘datacenter’.  The lowest common denominator I have been able to settle on for a common form factor is Cores per RAM.  Grouping per socket fits because you are measuring the memory that is close to the CPU socket.  The NUMA architectures of AMD and Intel, with on-board memory controllers and transports to the memory DIMMs that don’t pass through the I/O controllers (e.g. Northbridge), help define the grouping.

TECHNOTE:  Every core has associated memory banks it will use and every container (physical server) has a series of sockets that it controls.  A hypervisor has a limit to how well it can keep the associated memory space close to the nearest vCPU.  Generally the hypervisor will schedule available vCPUs from the same socket and place the corresponding memory for those processes in the memory banks of the corresponding socket.  It does this for efficiency on x86 architectures.  It can move the VM to another socket and readdress the memory, but there is a ‘cost’ associated with such a move.  The path of least resistance is to stay in the same socket.

If you create a 4 vCPU VM and run it on a 1 core  processor it gets bogged down.   If you have the same VM on a two socket Quad Core (8 Cores)  the four cores utilized by the VM are likely to be on socket 1 or socket 2 .  The cost of splitting the vCPU between the two physical sockets by the scheduler is greater than running the vCPU in the same socket.    AMD delivered this earlier than Intel and sustained higher levels of virtualization consolidation “Per Host” than similar class systems of Intel could provide through the Northbridge.   Core i7 is a new game for Intel and the results of Nehalem show the improvements.

For more in-depth information here is a good read:  CPU Scheduler in VMware ESX

We have a host with a $10K CapX charge that has two sockets at a 4/45GB Socket Ratio, with approx. $5K spent on each socket.  Looking at our hardware invoice, the CPU cores are about 25% of the cost of a socket, so we can assume that our per socket cost is broken down into 25% Core and 75% Memory.  So our Socket Ratio yields a $1250 cost for 4 cores and $3750 for 45GB of memory:

Per Core CapX = $312.50;  Per GB RAM CapX = $83.33

That gives us a bare metal cost without a hypervisor charge on top, but we need a hypervisor to get a VM running.  Adding in the ESX cost for a per socket license of ESX Enterprise Plus (worst case) you can add $3500 each socket.

ESX Lic. Cost per socket CapX = $3500

Raw burn rate of the host would be $8500 per socket if we never loaded a VM on the Host.  Well, we did it for a reason, so let’s get our money back. If we target the standard allocation for this host (4/45GB socket ratio) we get our target VM count of 16 per socket(1 vCPU/2.8 GB RAM).  Also, keep in mind that we broke the socket cost down by 25%  to CPU and 75% to Memory so we will keep that  same  split here.  If we don’t do the split, then any VM that is deployed to the socket will bear the same cost regardless of its size.

ESX Lic. Cost per VM= $218.75   ( 3500 / 16 )

-Or-

Split by the 25/75 % we did previously for the cost of the CPU and Memory and you get a little different calculation.

(3500 * .25) / 16 = 875 / 16 = $55     AND     (3500 * .75) / 45 = 2625 / 45 = $58

per vCPU=$55

per vMEM=$58

Adding it up with our target ratios in tow we get the burn rate of the $ per Compute on a VM basis.

($312 / 4 at the 4:1 ratio) + (83 * 2.8) + {(55 * 1) + (58 * 2.8)} = approx. $530

Or Summarized: (vCPU = $78)+(vMEM = $233)+(Hyper$=219)=(vCOMPUTE = $530)

Assuming 8760 Hours (1 year) this VM would cost $.06/hr in vCOMPUTE.
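The whole chain can be sketched as one small Python function.  This is only a sketch under the assumptions used above ($5,000 of hardware and $3,500 of license per socket, 4 cores, 45GB, a 4:1 vCPU to core ratio and the 25/75 split); the function and parameter names are hypothetical.  Because it does not round the intermediate values, it lands a few cents under the rounded $530.

```python
HOURS_PER_YEAR = 8760


def vcompute_cost(vcpus, ram_gb, socket_hw=5000.0, socket_license=3500.0,
                  cores=4, socket_ram_gb=45, vcpu_per_core=4, cpu_share=0.25):
    """Break one VM's yearly vCompute cost into the pieces used in the text."""
    vcpu_slots = cores * vcpu_per_core    # 16 vCPU slots per socket at 4:1
    mem_share = 1 - cpu_share
    hw_cpu = vcpus * socket_hw * cpu_share / vcpu_slots
    hw_mem = ram_gb * socket_hw * mem_share / socket_ram_gb
    # The license is split the same way: 25% across vCPU slots, 75% across GB
    hyper = (vcpus * socket_license * cpu_share / vcpu_slots
             + ram_gb * socket_license * mem_share / socket_ram_gb)
    total = hw_cpu + hw_mem + hyper
    return {"vCPU": hw_cpu, "vMEM": hw_mem, "Hyper$": hyper,
            "vCOMPUTE": total, "per_hour": total / HOURS_PER_YEAR}


target = vcompute_cost(vcpus=1, ram_gb=2.8)
for name, dollars in target.items():
    print(f"{name:9s} ${dollars:,.2f}")
```

Passing a different `vcpus` and `ram_gb` prices any other VM size against the same socket.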

Let’s apply that to some other VM systems and see if it sticks.  If we plan for the following VM deployment on our socket:

[Image: vmGrid]

The costs would spit out as such:

[Image: vCompute]

Or slice it up into a per hour number:

[Image: perhour]

So based on this analysis, some of my VMs probably only cost $.05 per hour for vCompute.  Interesting.  What is more interesting is that the memory cost associated with a VM scales more accurately with its consumption.  You can have as much memory as you like for your new 4 and 8 GB aspirations (e.g. memory leaks); you just need to pay for it accordingly.

Too bad that only pays for the top part of my total cost model.  That said, the benefit here is that this model can span hypervisors: any hypervisor on the market can be split up the same way to show the cost of a VM consumed on a Xen, KVM, VirtualIron, Parallels, or Hyper-V infrastructure.

I will be working on a few PowerShell scripts and Excel calculators that one can use to make this model more repeatable.  At the very least, it is a model that I will use to evaluate CapacityIQ and third party products like the offering from VKernel, and the output they measure, especially if they add costs on a per socket basis, which I can now calculate as overhead.

Alas, there is more to consider.  Stay tuned for Part III – “the I/O that binds”


Get Thin Provisioning working for you in vSphere

Going Thin and not looking back.

Yes, I am slowly losing my hair like many other aging men out there, but it wouldn’t be virtual insanity if I were blogging about my personal male pattern baldness issues.  With the latest release of VMware vSphere comes a lot of new features and functionality that can be leveraged to make our lives easier.  One of these features, which I personally have been looking forward to for a while, is Thin Provisioning.  If you aren’t familiar with this technology, jump over to Gestalt IT for a great explanation of what it is and how it works.

One of the exciting promises of thin provisioning is getting more “bang for your buck” out of the expensive enterprise storage you have been investing in for your ESX environment.  But, as Bret Michaels once said, “Every rose has its thorn,” and there are some things to look out for and considerations to make before implementing thin disk technologies.

Efficiencies are great if they work right and don’t overcomplicate the environment.

Do your homework and make sure you understand the characteristics of the virtual machine that you are considering migrating into a thin disk configuration.  The last thing you want to do is convert every VM to thin disk, only to find four months down the road that all of your datastores are filling up and you’re scrambling for storage CAPEX.  Some people are of the opinion that you should do thin provisioning either on the host side (VMware) or on the storage array side, but not both.  Take a gander at Chad Sakac’s blog, which discusses thin on thin and some thoughts around each of these approaches.  I’m not going to go into all of the pluses and minuses of thin provisioning, but rather focus on how to make it work for you.

Coffee Talk


So now that we have some of the basics out of the way, I wanted to share my thoughts on thin provisioning.  Like many organizations, we get requests from our customers that err on the side of caution.  They want to plan for the worst case and ensure that their project and/or application isn’t set up for failure.  The best way to do that is to pad it: request more than what you might really need, just in case something comes up down the road.  I don’t blame them really; I do it myself all the time when I make coffee at home.  I always end up making more coffee than I typically drink, just in case I might need that extra charge.  Virtual machine disk storage in some cases fits this same profile.  If my coffee maker granted me access to hot coffee on demand, I would stop making extra coffee.  Thin disks can give your end users that capacity on demand, so you can gain control of the padding effect that typically takes place in most corporate organizations.

Take it back…

So now you have done your research, you’re starting to get a feel for what this thin stuff is and how it might play out in your shop.  It’s go time.  If you’re a smaller VMware customer, you probably already have an idea of what are good target disks to convert.  If you’re a larger environment, it might be a little more difficult to gauge where the bloated pigs are hiding.

I worked at GE for a couple of years and was exposed to some of the Six Sigma methodologies they preach as well as practice.  Sounds boring, right?  Not really.  You can leverage DMAIC for a lot of IT related problems, issues, and projects.  You don’t have to take it to the extreme; just use the framework to help guide you on your quest:

DMAIC

The DMAIC project methodology has five phases:

  • Define high-level project goals and the current process.
  • Measure key aspects of the current process and collect relevant data.
  • Analyze the data to verify cause-and-effect relationships. Determine what the relationships are and attempt to ensure that all factors have been considered.
  • Improve or optimize the process based upon data analysis using techniques like Design of experiments.
  • Control to ensure that any deviations from target are corrected before they result in defects. Set up pilot runs to establish process capability, move on to production, set up control mechanisms and continuously monitor the process.

We have already defined our project goals and what we are trying to accomplish.  We need a good “Measure” tool to really find where we might benefit from thin provisioning.  PowerShell is a great tool that most VMware administrators use, or have at least heard of.  So this was the first place I turned to for assistance.

Alan Renouf of “Virtu-AL” http://www.virtu-al.net/ gave me a hand in writing the PowerShell script needed.  (Thanks again, Alan!)  Alan already had a one-liner script to produce a list of VMs, their assigned disks, and how much data each disk was consuming.  I needed the ability to see this data outside a PowerShell window and analyze it in a better format.  We have a decent-sized VMware environment, and exporting this out to a .csv for analysis is extremely helpful.  Here is the script!

************************************************************************

# Set the filename for the exported data
$Filename = "C:\VMDisks.csv"

Connect-VIServer MYVIServer

# Sort so the VM with the most disks comes first: Export-Csv builds its
# column list from the first object, so that VM has to lead the list.
$AllVMs = Get-View -ViewType VirtualMachine
$SortedVMs = $AllVMs | Select-Object *, @{N="NumDisks";E={@($_.Guest.Disk).Length}} | Sort-Object NumDisks -Descending

$VMDisks = @()
ForEach ($VM in $SortedVMs) {
    # One row per VM, with a set of columns for each guest disk
    $Details = New-Object PSObject
    $Details | Add-Member -Name Name -Value $VM.Name -MemberType NoteProperty
    $DiskNum = 0
    ForEach ($Disk in $VM.Guest.Disk) {
        # Guest disk figures are reported in bytes; convert to whole MB
        $Details | Add-Member -Name "Disk$($DiskNum)path" -MemberType NoteProperty -Value $Disk.DiskPath
        $Details | Add-Member -Name "Disk$($DiskNum)Capacity(MB)" -MemberType NoteProperty -Value ([math]::Round($Disk.Capacity / 1MB))
        $Details | Add-Member -Name "Disk$($DiskNum)FreeSpace(MB)" -MemberType NoteProperty -Value ([math]::Round($Disk.FreeSpace / 1MB))
        $DiskNum++
    }
    $VMDisks += $Details
    Remove-Variable Details
}
$VMDisks | Export-Csv -NoTypeInformation $Filename

***********************************************************************

So now that you have this great spreadsheet, you can do all sorts of crazy sorting and reporting within Excel.  Take some time on phase 3 and “Analyze” what you’re seeing.  Talk to your VM stakeholders to see how things might be changing from their perspective.  Try to plan for the surprises and position yourself accordingly.
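If Excel isn’t your thing, the same first pass can be scripted.  Below is a rough Python sketch that reads a CSV in the layout the script above writes (Name, then DiskNpath, DiskNCapacity(MB) and DiskNFreeSpace(MB) per disk) and lists the disks that are mostly empty.  The 50% free threshold, the function name and the sample rows are all assumptions made up for illustration; pick a threshold that suits your shop.

```python
import csv
import io
import re

CAPACITY_COLUMN = re.compile(r"Disk(\d+)Capacity\(MB\)")


def thin_candidates(csv_file, min_free_pct=50.0):
    """Return (vm, disk path, capacity MB, free MB, free %) for mostly empty disks."""
    found = []
    for row in csv.DictReader(csv_file):
        for column, value in row.items():
            match = CAPACITY_COLUMN.fullmatch(column or "")
            if not match or not value:
                continue  # not a capacity column, or this VM has no such disk
            n = match.group(1)
            capacity = float(value)
            free = float(row.get(f"Disk{n}FreeSpace(MB)") or 0)
            if capacity <= 0:
                continue
            free_pct = 100.0 * free / capacity
            if free_pct >= min_free_pct:
                found.append((row["Name"], row.get(f"Disk{n}path") or "",
                              capacity, free, round(free_pct, 1)))
    # Biggest potential savings first
    return sorted(found, key=lambda item: item[3], reverse=True)


# Made-up sample rows in the layout the export script writes
SAMPLE = r"""Name,Disk0path,Disk0Capacity(MB),Disk0FreeSpace(MB),Disk1path,Disk1Capacity(MB),Disk1FreeSpace(MB)
web01,C:\,40960,30720,D:\,102400,92160
db01,C:\,40960,8192,,,
"""

candidates = thin_candidates(io.StringIO(SAMPLE))
for vm, path, capacity, free, pct in candidates:
    print(f"{vm} {path} {capacity:.0f} MB provisioned, {pct}% free")
```

To run it against the real export, pass `open(r"C:\VMDisks.csv", newline="")` instead of the sample.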

Next is the “Improve” phase of DMAIC (see, it’s easy!).  This is the part where you actually do the work.  It’s time to start leveraging the Storage VMotion APIs and reclaim some of that unused disk.

  1. Select the target VM in the VC client.
  2. Right click on the VM and select the option “Migrate”.
  3. Select the option “Change Datastore”.
  4. Select the destination, or click advanced if you are targeting one particular disk.
  5. Select “Thin provisioned format”.
  6. Select Finish.

Rinse and Repeat for the rest of that spreadsheet you have worked so hard on.

The last phase of DMAIC is “Control”.  This is one of the most important pieces of thin provisioning, in my opinion.  At the minimum, you need to set up Virtual Center alerts to monitor when your datastores are approaching critical levels.  You can’t implement thin disks in your vSphere environment and walk away.  The smart people over at VMware have given us the ability to monitor datastore disk space usage and over-allocation with the latest release of Virtual Center.  Set up your monitors so you are e-mailed when some of these thin disks begin to grow and you need to take some action.
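To make the two numbers concrete, here is a small Python sketch of the arithmetic behind a usage check and an over-allocation check.  It is not the Virtual Center alarm logic itself; the 75% and 85% thresholds, the function name and the example datastore are assumptions picked for illustration.

```python
def datastore_status(capacity_gb, used_gb, provisioned_gb,
                     warn_pct=75.0, critical_pct=85.0):
    """Classify a datastore by how full it is and report its over-allocation."""
    usage_pct = 100.0 * used_gb / capacity_gb            # space actually written
    overalloc_pct = 100.0 * provisioned_gb / capacity_gb  # space promised to VMs
    if usage_pct >= critical_pct:
        level = "critical"
    elif usage_pct >= warn_pct:
        level = "warning"
    else:
        level = "ok"
    return level, usage_pct, overalloc_pct


# Hypothetical 1 TB datastore with 1.5 TB of thin disks promised on it
level, usage, overalloc = datastore_status(capacity_gb=1000, used_gb=800,
                                           provisioned_gb=1500)
print(f"{level}: {usage:.0f}% used, {overalloc:.0f}% allocated")
```

An over-allocation above 100% is the whole point of thin provisioning; it is the usage number creeping toward the threshold that should wake you up.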


Eric Gray of VMware takes this to the next level; check out his blog post on utilizing PowerShell to prevent datastore emergencies.  My personal approach to this concept is to set up a “hot spare” datastore for your environment.  A good practice here would be to reclaim enough storage from your migrations to thin disks to free up that hot spare datastore.  Implementing an automated recovery solution like Eric’s will help you sleep easier at night.  Worried about what might happen if your script doesn’t work, or you do hit the perfect storm and end up with a full VMFS volume?  Intelligence has been built into vSphere to automatically pause the virtual machines.  Impressive.  Check out Eric’s video:

Wrapping it all up

Thin disk provisioning is a great feature that you should consider leveraging in your environment.  With some forward thinking and best practices you can achieve higher ROI for your ESX storage.  VMware vSphere offers the ability for you to migrate from thick to thin with no downtime, so you can begin reclaiming storage on the fly.  Keep it simple; start out with a high level analysis of your infrastructure.  Identify the candidates that are a good fit and worth focusing on.  Set up your alerts on the datastores as soon as you migrate your first virtual machine, so you are protecting yourself from problems down the road.  Consider taking automated actions if your datastores are reaching critical thresholds.

I hope you found this article helpful, good luck!

Scott Sauer


Capacity Conundrum Part 1

 

–The vTrooper Report –

 

In an effort to gel up an internal billing and allocation model (GaaS – Gouging as a Service) I’ve been struggling with the concept of cost per VM.  I asked a simple question in the twittersphere about that idea, and it turned into a discussion and well … got out of hand.  I apologize for that, as this is a better format to explain.  (Special thanks to @asweemer for a dumping ground.)

If I had a Nickel for each VM…

At VMworld 2009 there was a presentation in the keynote that showed the price of a VM hosted with Terremark: $.05 per hour.  I thought, wow.  A nickel per hour.  If I had a nickel per VM per hour, how much would I have available to spend on coffee?

Then I thought, wait.  I have VMs.  How much do they really cost me per hour?  Well, the answer is … it depends.  Old servers with high power consumption and low density versus a new system with Intel 5500s packed into blades have different burn rates, visible to different systems (power, cooling, depreciation).  I haven’t found a great model to break those units down to my satisfaction yet.  I need another way.

As a general practice I create in my mind some maxims that I follow in the creation of a VM.

  1. S – 1vCPU, 1GB Ram, 1GB Net, 10GB Disk
  2. M – 2vCPU, 2GB Ram, 1GB Net, 20GB Disk
  3. L – 4vCPU, 4GB Ram, 2GB Net, 40GB Disk

Seems simple enough, but it doesn’t really generate a cost model on a consistent basis.  Hardware continues to change, and each VM that consumes resources does it at different rates and times of the day.  A VM that isn’t doing anything isn’t really ‘consuming’ anything, right?  I thought I would try to break it down further by creating a 4 quadrant block with two macro categories:  Compute (CPU and Memory) and I/O (Network and Disk)

[Image: cmnd]

Each resource area could increase or decrease for a reason without changing the size of the original maxim it was created under.  This allows for small variations of size without having a customer yell that their bill went up by $2 this month.

The Measurable Unit

Use a unit of measure to identify the four quadrants:   vCPU : vMEM : vNET : vDISK  or C:M:N:D  .   Then overlay the VM creation to count up the units.  This way the growth of a ‘VM’ during its lifecycle can be adequately allocated back into the proper IT metric.  Using the VM creation maxims up above this may be:

  1. S – 1:1:1:1
  2. M – 2:2:1:2
  3. L – 4:4:2:4

This isn’t perfect, but it at least allows for the average CPU cost to be allocated separately from a memory, network, and disk cost.  After all, you don’t usually get to upgrade all four parts of the quadrant in the same fiscal year.  This also allows a way to trend an average of your cost rate per unit over a period of months and years to see which cost areas are improving.  It is an interesting metric for the business and IT.  Win-Win in my book, even if no one internally ever has to pay the values back (Showback).  It also helps police which VM is consuming too much of a specific value, which would skew the numbers if you simply took the cost of the ESX hosts and divided by the number of VMs.
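One way to picture the C:M:N:D unit is as a tiny record that can be priced one quadrant at a time.  The Python sketch below is only an illustration: the class name is invented, and the dollar rates are placeholders rather than real costs.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CMND:
    """Units (or per-unit rates) for vCPU : vMEM : vNET : vDISK."""
    cpu: float
    mem: float
    net: float
    disk: float

    def __str__(self):
        return f"{self.cpu}:{self.mem}:{self.net}:{self.disk}"

    def charge(self, rates):
        """Price each quadrant separately, then add them up."""
        return (self.cpu * rates.cpu + self.mem * rates.mem
                + self.net * rates.net + self.disk * rates.disk)


# The three maxims, expressed in units
SIZES = {"S": CMND(1, 1, 1, 1), "M": CMND(2, 2, 1, 2), "L": CMND(4, 4, 2, 4)}

# Placeholder dollar rates per unit; substitute your own burn rates
RATES = CMND(cpu=10, mem=20, net=5, disk=2)

medium = SIZES["M"]
grown = replace(medium, mem=3)   # the VM picks up one more memory unit
print(medium, medium.charge(RATES))
print(grown, grown.charge(RATES))
```

Growing the VM by one memory unit changes only the memory part of the bill, which is the point of tracking the four quadrants separately.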

Apples, Oranges, Lemons, and Grapes = Frutti Results

So you have a unit of measure and a type of system to match the measurement up towards over a period of time.  Here’s where the fruit cart and the horse get hooked up.

This is all very complex, why can’t I just buy the same server I have purchased for the last 5 years?

Sorry, kids.  They don’t build ’em like they used to.  But in today’s market, the UCS system from Cisco has a new buzz next to the original players of IBM, HP, and Dell.  How do you sort any of that out among the offerings, and how do you select the right platform for your new ESX system?  By the socket!  Every system of the x86 family has them, from both the Intel and AMD families.  And now that you have to pay for your hypervisor and additional tools (CapacityIQ, AppSpeed, Nexus 1000V) per socket, it matters more.  I need to squeeze the value out of those sockets.

Still staying in the upper half of the quad, let’s measure cores and RAM as a ratio, assuming dual rank 4GB DIMMs, and measure them against some of the standard 2 socket servers.

Standard Intel x5450

2 Socket – 4 Core – 16 Dimms (8 per socket) produces 4 cores/ 32 GB Ram

Standard Intel Nehalem  x5500

2 Socket – 4 Core – 18 Dimms (9 per socket)  produces 4 cores/ 45 GB Ram

Cisco UCS extension on x5500

2 Socket – 4 Core – 48 Dimms (24 per socket)   produces 4 cores/ 96 GB ram

What this shows is that for every license of ESX consumed in the environment there are different amounts of memory available for a VM to use.  The approach by the UCS system allows for a much higher allowance of memory to a VM at the same licensing cost.   Sure you could buy 4 way servers and claim that the 256 GB of RAM gives the VM more allowance but in reality the vm will have ratios of contention to the vCPU and Memory within each of the 4 sockets. You can change the size of the container by moving to a 4 way,  but it won’t change the value of the ratio  for that container in regards to the cores and memory.

CPU Contention

The idea of CPU contention is becoming more visible to most administrators of virtualized environments because the desire to pack the VMs onto a host is so strong.  If I can get 10 VMs on a host for $5000, then getting 25 VMs on the same host lowers my cost per VM.  It could also be cheating your customers of the performance they paid for, especially if you have multiple vCPUs assigned to those 25 VMs.  This is where the ratio of VMs per host becomes obsolete and vCPU/core makes more sense.

Using the example containers above you can generate an expected number of VM’s per socket.  There is no reason to do a 1:1 ratio of cores to VM because the point of virtualization is to run more with less.  I think a good ratio to start with is 4:1 for a production VM and 16:1 for a VDI implementation:

Standard Intel x5450  -  (4 /32 GB SocketRatio)   yields 16  VM’s with a 1 vCPU/ 2GB ram configuration per socket

Standard Intel Nehalem  x5500  -  (4 /45GB SocketRatio) yields 16  VM’s with a 1 vCPU/ 2.8GB ram configuration per socket

Cisco UCS  -   (4 /96GB SocketRatio) yields 16  VM’s with a 1 vCPU/ 6GB ram configuration per socket
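The same expected allocation can be worked out with a few lines of Python.  This sketch simply takes the SocketRatio numbers listed above as given and applies the 4:1 (or 16:1 for VDI) vCPU to core ratio; the function name and its parameters are invented for illustration.

```python
def expected_allocation(cores, ram_gb, vcpu_per_core=4, vcpus_per_vm=1):
    """Return (VMs per socket, GB of RAM per VM) for one socket."""
    vms = cores * vcpu_per_core // vcpus_per_vm
    return vms, ram_gb / vms


# (cores, GB) per socket, taken from the SocketRatio list above
containers = {
    "Standard Intel x5450": (4, 32),
    "Standard Intel Nehalem x5500": (4, 45),
    "Cisco UCS": (4, 96),
}

for name, (cores, ram_gb) in containers.items():
    vms, gb_per_vm = expected_allocation(cores, ram_gb)
    print(f"{name}: {vms} VMs at 1 vCPU / {gb_per_vm:.1f}GB per socket")

# The 16:1 VDI ratio on the same Nehalem socket
vdi_vms, vdi_gb = expected_allocation(4, 45, vcpu_per_core=16)
print(f"VDI on x5500: {vdi_vms} desktops at {vdi_gb:.2f}GB each")
```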

You can always adjust your actual deployment if these ratios don’t match up for your environment.   The expected deployment number helps determine how large the pizza slices are for the team.  Not how many slices each of them consume.  In these configurations you can see where the density of the RAM per socket (SocketRatio) of the UCS allows for much larger VM configurations before overcommitment. A nice fit for the new 64bit installations. These expected numbers of VM per socket help determine what the burn rate of a C:M:N:D value is for the CapX  spend you made.

BurnRate

To fully understand how much a VM costs, one has to look at what was spent in the CapX of the host and agree on the measuring stick to measure the C:M:N:D value of the created VM.  If a series of hosts are in service from different families and are at different parts of lifecycle there may have to be some averaging.  The SocketRatio of Cores/RAM is a consistent way to measure systems from different form factors and families and levelset the expected allocation of VM’s.  The expected allocation of VM’s for a host helps determine what density ratio is desired for vCPU:vMEM.

This is the end of Part 1 –  In Part Deux I will take a deeper dive into the Compute and I/O areas and assign a more detailed cost per VM model.
