vSphere

Upgrade your Virtual Hardware in a few minutes, with a twist.

Introduction

I attended last month's Cincinnati VMUG (VMware User Group) and was surprised to hear from the audience how many customers had not yet taken the plunge and upgraded to vSphere. I think there were a handful of users that had just completed the upgrade. Sometimes I forget to step out of my own personal space and consider what others have going on in their own environments. If you're still wondering about the upgrade, Aaron has a post on some of the benefits of going from VI3 to vSphere.

Part of the process of upgrading your existing investment is the need to upgrade all of the virtual machines to the latest and greatest virtual machine hardware, version 7. Someone mentioned to me how much of a pain this was since you had to touch each virtual machine, and my response to them was “It only takes a couple of minutes”. I wanted to prove this theory in a different way, so I mulled it over and came up with a timed video clip. The song I chose is 2 minutes and 39 seconds long, so I figured if I could knock this out within the time it takes for the song to play, well then, mission accomplished.

vSphere Upgrade Thoughts

Before getting into my bizarre video clip challenge, some quick thoughts and comments from my personal experiences on the upgrade are as follows.

  • Make sure that you check the new HCL for vSphere prior to the upgrade; some of your older server hardware might not be compatible or supported with the new release of code.
  • Understand the licensing changes that have taken place before you begin your upgrade process.  Work with your account team or VAR and understand the features and functionality that fit your environment.  You need to ensure your current licenses get ported over so the newer licensing server will be able to register your newer ESX hosts.
  • If you're going to slowly transition over to vSphere, you will need to maintain a legacy license server for the older VI3 hosts until your migration is complete.
  • Testing your upgrade in a lab environment is always a good approach if you have the infrastructure.
  • If you are utilizing hardware management agents on your ESX hosts or third party backup software, make sure you get the latest agents that support the current release of vSphere.
  • If you are upgrading your existing Virtual Center database, make sure you do a backup prior to the upgrade. We chose to “leap frog” into our new environment, so we built the new Virtual Center server from the ground up, then disconnected the ESX hosts from the old server and connected them to the new one.

Virtual Hardware

So what is virtual hardware anyways and why do I care?  Virtual hardware is an important component of your infrastructure and you should understand what it means to you.  You must be running version 7 to leverage some of the new features you will find in vSphere like the paravirtual storage driver (pvSCSI) and the paravirtual network driver (VMXNET3).  Here is the technical definition straight out of the admin guide.

The hardware version of a virtual machine indicates the lower-level virtual hardware features supported by the virtual machine, such as BIOS, number of virtual slots, maximum number of CPUs, maximum memory configuration, and other characteristics typical to hardware.  Virtual machines with hardware versions lower than 4 can run on ESX/ESXi 4.x hosts but have reduced performance and capabilities. In particular, you cannot add or remove virtual devices on virtual machines with hardware versions lower than 4 when they reside on an ESX/ESXi 4.x host.

Here is a table that lists what each version of the virtual hardware can support and what limitations you might experience:

image

Get your Groove on

Here are the steps that I take in the video to upgrade the virtual hardware of a Windows virtual machine from version 4 to version 7. Many thanks to Scott Lowe for posting these upgrade instructions to his blog; they helped our efforts tremendously as we were early adopters of vSphere. Don’t forget to upgrade your templates so all of the virtual machines you deploy in the future will be running version 7.

  1. Upgrade your VMware tools in the guest operating system.
  2. Once the upgrade is complete, shut the guest operating system down.
  3. Upgrade the virtual machine hardware.  (Right click virtual machine, upgrade)
  4. Add the new VMXNET3 network adapter. (now an option)
  5. Remove the old network adapter.
  6. Power on the virtual machine.
  7. Let the hardware discovery execute and add the new devices.
  8. Reboot the system.
  9. Finished.

Matt Costa is the featured artist here, the song is titled “Sweet Rose”.  Enjoy!!

Performance Troubleshooting VMware vSphere – Memory

Introduction

As memory prices continue to drop and the 64-bit (x64) architecture is adopted more widely in the industry, we continue to see a rise in memory demands. Only a few years ago, 1-2 GB virtual machines were the norm, 95% of these being 32-bit operating systems. From my personal experience I have seen this trend change to 2-4 GB as a norm, with the higher performing virtual machines consuming anywhere from 4-16 GB of memory. VMware has answered this demand with vSphere now delivering up to 1 TB of addressable memory per physical host, and up to 255 GB per virtual machine.

With processors now more powerful than ever, the limiting factor for virtual machines is generally shifting from compute to memory. This is reflected in our industry today as we see an increase in the memory footprint on traditional servers (Intel Nehalem), and vendors such as Cisco introducing extended memory technology which can more than double the standard memory configuration. I recently had the opportunity to sit in on a Cisco Unified Computing System architectural overview class, and was impressed with what I saw. The extended memory technology is quite unique because it not only allows you to scale out your memory configuration, it uses a special ASIC to virtualize the memory so there is no reduction in bus speed. A financial advantage to having this many DIMM sockets is that you can use lower capacity DIMMs (2 GB or 4 GB) to achieve the same memory configuration that would require 8 GB DIMMs in a standard server.

Memory Technologies in VMware vSphere

There are some major benefits of virtualization when it comes to memory.  VMware implements some sophisticated and unique ways of maximizing physical memory workloads within an ESX host.  All of these features work out of the box with no advanced configuration necessary.  To understand problems that might occur in your environment you need to be familiar with these basic memory concepts.

  • Transparent Page Sharing – The VMkernel compares physical memory pages to find duplicates, then frees up the redundant copies and replaces them with a pointer to a single shared page. If multiple operating systems are running on one physical host, why should you load the same files multiple times? Think of this as the data de-duplication process we are seeing in a majority of backup solutions in the industry.
  • Memory Overcommitment – The act of assigning more memory to powered on virtual machines than the physical server has available. This allows virtual machines that have heavier memory demands to utilize memory that is not actively being used on underutilized machines.
  • Memory Overhead – Once a virtual machine is powered on, the ESX host reserves memory for the normal operations of the VMware infrastructure. This memory can’t be used for swapping or ballooning, and is reserved for the system.
  • Memory Balloon Driver – When VMware Tools are installed on a virtual machine, they add device drivers inside the guest operating system that talk to the host virtualization layer. Part of this package is the balloon driver or “vmmemctl”, which can be observed inside the guest. The balloon driver communicates with the hypervisor to reclaim memory inside the guest when it’s no longer valuable to the operating system. If the physical ESX server begins to run low on memory, it will grow the balloon driver to reclaim memory from the guest. This process reduces the chance that the physical ESX host will begin to swap, which would cause performance degradation. Here is an illustration of ballooning in ESX:

image

What to look for

  • Check ESX host swapping. If you are overcommitting memory on the physical ESX host, you can run into a situation where each virtual machine needs the total amount it has been granted. When the host is out of memory it will begin to page out. Keep an eye on the oversubscription rates of your physical hosts, or ensure you have enough memory resources across your DRS clusters so they can balance the load more effectively. Swapping will occur when the following condition is met:

Total_active_memory > (Memory_Capacity – Memory_Overhead) + Total_balloonable_memory + Page_sharing_savings

  • Check for virtual machine swapping. Make sure your virtual machines have enough memory for the application workload that they are supporting. If virtual machine swapping starts to occur, this can put a strain on the disk subsystem.
  • Check to ensure VMware Tools are installed and updated. VMware Tools not only provide drivers from the guest to the hypervisor, they also install the balloon driver. For proper memory management the ESX host relies on the balloon driver to reclaim memory.
  • Check memory reservation settings. By default VMware ESX dynamically tries to reclaim memory when it's not needed. There are situations when you might choose to utilize memory reservations. If you set memory reservations in your environment, be aware that reserved memory is guaranteed to that virtual machine, and the host cannot reclaim it for other virtual machines even when it’s not being used. Don’t sell the balloon driver short; many third-party application vendors over spec their configurations for personal safety, and ballooning can help counteract some of that wasted “fluff factor”.
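To make the swapping condition a bit more concrete, here is a minimal Python sketch that plugs numbers into it. The function name and the sample host values are made up for illustration; in practice you would take the inputs from Virtual Center or esxtop, and this is only my reading of the formula, not a tool VMware provides.

```python
def host_must_swap(total_active_mb, capacity_mb, overhead_mb,
                   balloonable_mb, page_sharing_savings_mb):
    """Return True when active memory exceeds what the host can supply.

    Mirrors the condition from the text:
    Total_active_memory > (Memory_Capacity - Memory_Overhead)
                          + Total_balloonable_memory + Page_sharing_savings
    """
    available_mb = ((capacity_mb - overhead_mb)
                    + balloonable_mb
                    + page_sharing_savings_mb)
    return total_active_mb > available_mb


# Hypothetical host: 64 GB of RAM, 4 GB overhead, 8 GB balloonable,
# 6 GB saved through transparent page sharing (all values in MB).
example = dict(capacity_mb=65536, overhead_mb=4096,
               balloonable_mb=8192, page_sharing_savings_mb=6144)

print(host_must_swap(70000, **example))  # active demand fits: no swapping
print(host_must_swap(80000, **example))  # active demand too high: swapping
```

With these made-up numbers the host can supply 75,776 MB, so 70,000 MB of active memory fits and 80,000 MB does not.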

Monitoring with Virtual Center

The first place I would start with checking memory configurations is Virtual Center.  Virtual Center provides excellent reporting and gives you granular control over which metrics you would like to report against.  VMware vSphere now includes a nice graphical summary in the performance tab of the physical host.  This gives you a quick dashboard type view of the overall health of the system over a 24 hour period.  Here are some memory samples:

Check your overall % usage (lower is better)

image

Check your ballooning (lower is better)

image

Selecting the Advanced tab gives you a much more granular way of viewing performance data. At first glance this might look like overkill, but with a little bit of fine tuning, you can make it report on some great historical information. Here is a snapshot of memory utilization with many of the variables we just discussed above. It gives a great picture of what’s going on (the host below looks healthy):

Check your various metrics, mainly for swapping activity

image

The Virtual Center performance statistics by default display the past hour of statistics, and show a more detailed analysis of what’s currently happening on your host. Select the option “Chart Options” to change values such as time/date range and which counters you would like to display.

Virtual Center Alarms are an excellent tool that can sometimes be overlooked and forgotten about. While this is more of a proactive tool than a reactive or troubleshooting tool, I thought it was worth mentioning. Set up memory alerts so you will be notified via e-mail if a problem starts to manifest itself. Here is an alarm configured to trigger if physical host memory usage is above 90% for 5 minutes or longer. A lot of these alerts are built into Virtual Center so you don’t have to do a lot of pre-configuration work. You do need to make sure you set up the e-mail notifications under the “Actions” tab.
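If you want to apply the same "above 90% for 5 minutes" idea to numbers you have collected yourself, the logic can be sketched in a few lines of Python. This is only an illustration of a sustained-threshold check, not how Virtual Center evaluates its alarms internally; the function name and the 20-second sample interval are my own assumptions.

```python
def sustained_breach(samples, threshold_pct, window_s=300, interval_s=20):
    """Return True if usage stays above threshold_pct for at least window_s.

    samples    -- usage percentages, one per sample interval, oldest first
    interval_s -- seconds between samples (assumed, adjust to your data)
    """
    needed = window_s // interval_s  # consecutive samples required
    run = 0
    for value in samples:
        # Extend the current run of breaching samples, or start over.
        run = run + 1 if value > threshold_pct else 0
        if run >= needed:
            return True
    return False


# Fifteen consecutive 20-second samples above 90% cover five minutes.
print(sustained_breach([95] * 15, threshold_pct=90))
# A single dip below the threshold resets the clock.
print(sustained_breach([95] * 14 + [80] + [95] * 14, threshold_pct=90))
```

Requiring the whole window to breach keeps short spikes from paging you in the middle of the night.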

image

Monitoring with ESXTOP

Esxtop is another excellent way to monitor performance metrics on an ESX host. Similar to the Unix/Linux “top” command, it is designed to give an administrator a snapshot of how the system is performing. SSH to one of your ESX servers and execute the command “esxtop”. The default screen that you should see is the CPU screen; if you need to monitor memory, press the “m” key. Esxtop gives you great real-time information and can even be set to log data over a longer time period in batch mode; try “esxtop -a -b > performance.csv”. Check your total physical memory here, and make sure you aren’t overcommitting and causing swapping. Examine what your virtual machines are doing; if you want to display just the virtual machine worlds, hit the “V” key.
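Once you have a batch-mode CSV file, you don't have to open it in a spreadsheet to find the interesting columns. Here is a rough Python sketch that pulls the peak value of every column whose header contains a keyword such as "swap". The function name is mine, and the column headers in the sample are simplified stand-ins; check the first line of your own file for the real counter names.

```python
import csv
import io


def peak_by_counter(csv_text, keyword):
    """Return {column header: max value} for headers containing keyword."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    wanted = [i for i, name in enumerate(header)
              if keyword.lower() in name.lower()]
    peaks = {}
    for row in reader:
        for i in wanted:
            try:
                value = float(row[i])
            except (ValueError, IndexError):
                continue  # skip blank or malformed cells
            name = header[i]
            peaks[name] = max(peaks.get(name, value), value)
    return peaks


# Simplified, hypothetical sample; real esxtop headers are much longer.
sample = (
    '"Time","esx01 Memory Swap Read MBytes/sec","esx01 Memory Free MBytes"\n'
    '"01/01/2010 10:00:00","0.00","2048.00"\n'
    '"01/01/2010 10:00:05","1.50","1024.00"\n'
)

print(peak_by_counter(sample, "swap"))
```

For a real capture you would read the file instead, for example `peak_by_counter(open("performance.csv").read(), "swap")`. Keep in mind that a capture with all counters can get large quickly.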

image

Monitor inside the Virtual Machine

A great feature VMware introduced for Windows virtual machines was integrating VMware performance counters right into the Performance Monitor or “perfmon” tool. If you're running vSphere 4 Update 1, make sure you read this post first, as there is a bug with VMware Tools that will prevent the counters from showing up. You can monitor the same metrics found in Virtual Center and esxtop here. It's just another way of getting at the data, especially if you have a background in Microsoft Windows and are familiar with perfmon.

image

Monitoring with PowerCLI

Another great place to go for finding potential memory problems and bottlenecks is PowerCLI. I have been using PowerGUI from Quest, accompanied by a powerpack from Alan Renouf. If you're not a command line guru, don’t let this discourage you. PowerGUI is a Windows application that allows you to run pre-defined PowerCLI commands against your Virtual Center server or your physical ESX hosts. Want to find out what your ESX host service console memory is set to? How about virtual machines that have memory reservations, shares or limits configured? You can pull all of this information using Alan’s powerpack.

image

Conclusion

If you're using VMware vSphere, there are many different ways to monitor for memory problems. The Virtual Center database is the first place you should start. Check your physical host memory conditions, then work your way down the stack to the virtual machine(s) that might be indicating a problem. Take a look at esxtop and check some of the key metrics that we discussed above.

Look for the outliers in your environment. If something doesn’t look right, it probably isn't. Scratch away at the surface and see if something pops up. Use all possible tools available to you, like PowerCLI. Approaching problems from a different perspective will sometimes bring light to a situation you weren’t aware of. If all else fails, engage VMware support and open a service request. Support contracts exist for a reason, and I have opened many SRs that turned out to be new technical problems VMware support had not seen before.

Performance Troubleshooting VMware vSphere – The Tetralogy

Yes a bold topic I know, but I wanted to tackle this subject because it’s such an important aspect that everyone typically deals with at some point.  I also find it personally useful to document some of my thoughts so I can solidify my own understanding of these processes and tools.  I will admit, this was a challenge for me to write up.  There is so much material and information that I had to really focus on keeping it simple and to the point.  Performance problems can span such a wide array of possibilities that there is never typically one easy answer.  Hopefully by highlighting some of the tools that are available for use, and offering some of my personal thoughts and experiences, I might be able to help when problems arise in your infrastructure.

There is so much useful information floating around in PDFs, blogs, websites and PowerPoint decks that one could easily get consumed by this topic. Since this is such a broad topic, I wanted to try and set the stage. The focus of this series of blog posts is to highlight some key components to examine, and then provide tools that will give you insight into your own environment and/or situation. This page will be the launch point for the various categories. Each blog post will cover a different category relating to the possible points of I/O in a VMware ESX environment.

Performance Troubleshooting VMware vSphere – CPU

Performance Troubleshooting VMware vSphere – Memory

Performance Troubleshooting VMware vSphere – Storage

Performance Troubleshooting VMware vSphere – Network

There is one last I/O component that I will not be covering, and that is the human factor.  These posts will assume that your installation or upgrade is of sound mind and body.  If there are underlying installation issues or post upgrade issues, I suggest engaging VMware support before examining conventional performance problems.

Acknowledgments/References:

VMware vSphere 4 Performance Troubleshooting Guide – Hal Rosenberg

Performance Monitoring and Analysis – Scott Drummonds

VMworld 2009 TA2963 ESXtop for Advanced Users – Krishna Raj Raja

http://www.vmware.com/support/

http://www.yellow-bricks.com/esxtop/ – Duncan Epping

Performance Troubleshooting VMware vSphere – CPU

Introduction

Processors have come a long way in a very short time, and over the past few years we have seen the industry embrace the multi-core x86 architectures (Intel and AMD) which is allowing us to consolidate with even greater efficiencies than previous processor architectures.  Ensuring available compute cycles to virtual machine workloads is critical, and should be monitored closely as you scale out your infrastructure.

What to look for

  • Check for physical CPU utilization that is consistently above 80-90%. Getting high consolidation rates is a wonderful thing, but don’t overtax the physical server. Maybe it’s time to purchase another host for your DRS cluster and let the software balance your workloads better.
  • Watch pCPU0 on non-ESXi hosts. If pCPU0 is consistently saturated, this will negatively impact performance of the overall system. If you are using third-party agents, ensure they are functioning properly. A couple of years ago we had issues with the HP System Insight management agents (Pegasus process), which were creating a heavy load on our service console (COS). All of the virtual machines looked fine from a performance perspective, but once we dug a little bit deeper, we discovered this was our root cause.
  • Watch for high CPU ready times. Ready time is the time a virtual machine was ready to run but had to wait for the host to schedule it on a physical CPU. High values point to CPU contention on the host, for example too many vCPUs competing for the physical cores or a CPU limit set on the virtual machine, even when the utilization numbers of the virtual machine itself look fine.
  • Watch for virtual machines that are consistently at 80-100% utilization. This is not a typical pattern of a conventional server. Most likely if you log in to the guest you will find a runaway process that is consuming all of the CPU cycles. I actually found an offshore contractor running Rosetta@home (a cancer research screen saver) inside one of our virtual machines! If something doesn’t look right, it’s worth checking it out.
  • Watch for virtual machines where the kernel or HAL is not set to use more than one CPU (SMP) and the VM is allocated multiple processors via Virtual Center. I was approached by a Linux administrator who told me he wasn’t seeing any performance improvements after he added a second processor. After I poked around a little bit, I discovered he was running a uniprocessor kernel and hadn’t recompiled his operating system for SMP. If the operating system doesn’t have the ability to recognize more than one processor, you won’t see any performance gains by throwing more vCPUs at a larger workload.
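One thing that trips people up with CPU ready is that Virtual Center charts it as a summation in milliseconds, while esxtop shows a percentage (%RDY). A small Python sketch of the usual conversion is below. The function name is mine, and the 20-second default is the sample interval of the real-time chart as I understand it; other chart ranges use longer intervals, so treat the default as an assumption to verify.

```python
def cpu_ready_percent(ready_ms, interval_s=20):
    """Convert a CPU ready summation (milliseconds) into a percentage.

    ready_ms   -- ready time accumulated during one sample interval
    interval_s -- length of that sample interval in seconds (assumed 20 s
                  for the real-time chart; adjust for other chart ranges)
    """
    return ready_ms * 100.0 / (interval_s * 1000.0)


# 2000 ms of ready time in a 20-second sample means the virtual machine
# spent 10% of that interval waiting to be scheduled.
print(cpu_ready_percent(2000))
```

The value is normally reported per virtual machine, so for a multi-vCPU guest you may want to divide by the number of vCPUs before comparing it against any rule of thumb.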

Monitoring with Virtual Center

Virtual Center is a great place to start for CPU performance monitoring, both at a physical level and a virtual machine level. Before getting into too much detail I wanted to explain Virtual Center statistics logging. There are various levels of logging that can be set for the VC database. Beware! You can easily overrun your database and fill up your existing disk space by setting all of these to the maximum setting. Think of this as a debug level: the higher you set it, the more information will be captured to the database for analysis (and the more disk space consumed). If you need to get to some of the more detailed performance statistics, VC performance counters and their corresponding levels can be found here. To change these settings, click Administration -> vCenter Server Settings -> Statistics.

image

Let’s take a look at a physical ESX host's performance metrics through Virtual Center. vSphere now includes a nice graphical summary in the performance tab of the physical host. This gives you a quick dashboard type view of the overall health of the system over a 24 hour period. Here is the CPU sample:

image

Selecting the Advanced tab gives you a much more granular way of viewing performance data. At first glance this might look like overkill, but with a little bit of fine tuning, you can make it report on some great historical information. Here is a snapshot of physical CPU utilization across all processors:

image

The Virtual Center performance statistics by default display the past hour of statistics, and show a more detailed analysis of what’s currently happening on your host. Select the option “Chart Options” to change values such as time/date range and which counters you would like to display.

Virtual Center Alarms are an excellent tool that can sometimes be overlooked. While this is more of a proactive tool than a reactive or troubleshooting tool, I thought it was worth mentioning. Set up CPU alerts so you will be notified via e-mail if a problem starts to manifest itself. Here is an alarm configured to trigger if physical host CPU utilization is at 75% for 5 minutes or longer.

image

Monitoring with ESXTOP

Esxtop is another excellent way to monitor performance metrics on an ESX host. Similar to the Unix/Linux “top” command, it is designed to give an administrator a snapshot of how the system is performing. SSH to one of your ESX servers and execute the command “esxtop”. The default screen that you should see is the CPU screen; if you ever need to get back to this screen in the future, just hit the “c” key on your keyboard. Esxtop gives you great real-time information and can even be set to log data over a longer time period in batch mode; try “esxtop -a -b > performance.csv”. Check your PCPU and CCPU (physical/console) values here. Examine what your virtual machines are doing; if you want to display just the virtual machine worlds, hit the “V” key.

image

A detailed list of esxtop counters can be found here:

http://communities.vmware.com/docs/DOC-5240

http://communities.vmware.com/docs/DOC-9279

Monitor inside the Virtual Machine

A great feature VMware introduced for Windows virtual machines was integrating VMware performance counters right into the Performance Monitor or “perfmon” tool. If you're running vSphere 4 Update 1, make sure you read this post first, as there is a bug with VMware Tools that will prevent the counters from showing up. Check your % Processor Time, which is the current load of the virtual processor.

image

Monitoring with PowerCLI

Another great place to go for finding potential CPU problems and bottlenecks is PowerCLI. I have been using PowerGUI from Quest, accompanied by a powerpack from Alan Renouf. If you're not a command line guru, don’t let this discourage you. PowerGUI is a Windows application that allows you to run pre-defined PowerCLI commands against your Virtual Center server or your physical ESX hosts. Want to find virtual machines with CPU ready time? How about virtual machines that have CPU reservations, shares or limits configured? You can pull all of this information using Alan’s powerpack.

image

Conclusion

If you're using VMware vSphere, there are many different ways to monitor for CPU problems. The Virtual Center database is the first place you should start. Check your physical host CPU contention, then work your way down the stack to the virtual machine(s) that might be indicating a problem. Take a look at esxtop and check physical CPU, console CPU, and then the VM worlds that are running on the ESX host.

Look for the outliers in your environment. If something doesn’t look right, it probably isn't. Scratch away at the surface and see if something pops up. Use all possible tools available to you, like PowerCLI. Approaching problems from a different perspective will sometimes bring light to a situation you weren’t aware of. If all else fails, engage VMware support and open a service request. Support contracts exist for a reason, and I have opened many SRs that turned out to be new technical problems VMware support had not seen before.

VMware pvSCSI – When and when not to use it

Introduction

Hopefully you have read my previous blog posts on pvSCSI. Part one describes what the driver is, how it works, and how it can positively impact your performance and workloads. Part two covers the process of installing the pvSCSI driver on an existing system and a new system. Both can be found here on the site and you might find them useful:

http://www.virtualinsanity.com/index.php/2009/11/21/more-bang-for-your-buck-with-pvscsi-part-1/

http://www.virtualinsanity.com/index.php/2009/12/01/more-bang-for-your-buck-with-pvscsi-part-2/

Interrupt Coalescing

VMware recently published a KB article that answers a question that has been floating around the community for a while. The pvSCSI driver sounds superior to the LSI driver with direct I/O access to the hypervisor, so why not use it in all cases? The article states that you should only use the newer driver when driving higher workloads, those that are typically 2000 IOPS or greater. For those that don’t know, 2000 IOPS is a pretty big workload. Consider this: a standard Fibre Channel 10,000 RPM drive averages around 125 IOPS per disk.
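To put that number in perspective, here is the back-of-the-envelope math as a tiny Python sketch. The function name is mine, and the calculation deliberately ignores RAID write penalties and array cache, so treat it as a rough sizing illustration only.

```python
import math


def spindles_needed(target_iops, iops_per_disk=125):
    """Rough count of disks needed to sustain target_iops.

    Uses the ~125 IOPS per 10,000 RPM disk figure quoted above and
    ignores RAID penalties and cache effects.
    """
    return math.ceil(target_iops / iops_per_disk)


print(spindles_needed(2000))  # 16
```

In other words, a single virtual machine at the 2000 IOPS threshold would keep roughly sixteen of those drives busy all by itself.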

I didn’t really understand this, and the knowledge base article is lacking any detail on the rationale behind the statement. I reached out to VMware performance engineer Scott Drummonds to see if he had anything he could publish to help clarify the KB article. Scott was nice enough to research this and posted his findings here.

So it appears that the technical explanation is interrupt coalescing, or buffering. The paravirtual SCSI driver was designed to handle receiving multiple requests at a high rate and then “batching” the requests together for better efficiency in throughput. If you aren’t generating high enough workloads on the virtual machine, an I/O request could unnecessarily sit in the queue while the “batch” waits to be filled up for the next transaction. This could cause storage performance problems, which would typically be seen as higher latency and would negatively impact the virtual machine.
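A toy model helps me picture why batching hurts at low I/O rates. The Python sketch below is not the pvSCSI driver's actual algorithm; it simply assumes requests arrive at an even rate and are released only when a fixed-size batch is full. The function name and the batch size of 8 are made up for illustration.

```python
def average_batch_wait_ms(iops, batch_size):
    """Average time a request waits for its batch to fill (toy model).

    Assumes evenly spaced arrivals and a batch that is released the
    moment its last request arrives.
    """
    gap_ms = 1000.0 / iops  # time between two consecutive requests
    # Request k (0-based) waits for the remaining batch_size - 1 - k arrivals.
    waits = [(batch_size - 1 - k) * gap_ms for k in range(batch_size)]
    return sum(waits) / batch_size


print(average_batch_wait_ms(2000, batch_size=8))  # busy virtual machine
print(average_batch_wait_ms(100, batch_size=8))   # quiet virtual machine
```

In this model the busy virtual machine adds well under 2 ms of waiting per request, while the quiet one adds 35 ms, which is the kind of extra latency the KB article is warning about.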

Now and Then

The great news is the current release of the driver is optimized for heavy workloads. If you are starting to virtualize SQL/Oracle systems and need the performance, go for the pvSCSI driver and get better throughput. If you're deploying standard virtual machines that are doing lower workloads, continue to embrace the existing LSI Logic driver.

If you are new to vSphere 4, or have just upgraded from 3.5 and are starting to rebuild your templates to embrace virtual hardware version 7, don’t use the pvSCSI driver as part of your standard template.  VMware is working on the driver and will be introducing advanced coalescing functionality.  When this is built into the driver stack pvSCSI will then be able to be utilized for all workloads as it will understand when it needs to ramp up for higher workloads.

Thanks again to Scott Drummonds for taking the time out of his busy schedule to track this one down.

VMware Windows Perfmon counters missing in vSphere 4u1

I am attempting to pull together a blog series around performance. It’s going to be a four part segment, one for each I/O component of VMware: CPU, Memory, Storage and Networking. My goal is to try and cover the various tools that you can use to help troubleshoot performance problems that you might experience in your virtual environment.

image

While I was going through some of the methods, I wanted to illustrate how VMware now includes Windows Performance Counters inside a guest virtual machine to assist with performance monitoring/troubleshooting. I jumped on a test virtual machine I have, and pulled up Windows perfmon. To my dismay the VMware counters were missing! We are currently running VMware vSphere 4.0 Update 1, so I checked with a few other people online like Rick Vanover (@RickVanover). He confirmed that it seemed to be related to this specific release of vSphere.

I reached out to Scott Drummonds via Twitter (@drummonds), a performance systems engineer who works for VMware, and also opened a service request with support.  Scott validated that he saw the same issue and was launching an investigation.  Unfortunately the SR didn’t get very far as I was instructed that this was an “experimental feature and was removed from vSphere”.  Uhhh ok, I knew that wasn’t right so I waited to hear back from Scott.

Scott has since written a blog post that discusses this issue. It looks like a complete uninstall of the VMware Tools on the client followed by a re-install resolves the issue. For those that are not familiar with this process, it does require a reboot. The problem appears to be related to mofcomp, which is a tool that Microsoft provides to register WMI information (such as the VMware performance counters) with Windows.

Thanks to Scott for jumping on this so quickly and posting a fix to the issue, it’s great to see social media paying off in the real world.  Thanks to Rick for helping me figure out what was going on and validating some of my assumptions.  Rick has also written up an excellent blog post on this same issue.  Hopefully a patch will be rolled into the next minor release of vSphere 4 that will resolve this bug going forward.

VMware Virtual Center Attributes – It’s all about the details

Overview

A fellow engineer extraordinaire (Mike Evans) inspired me to write up this blog post. Mike and I have been using the “notes” attribute for virtual machines for a few years. It has come in very handy to track who requested the virtual machine resource, and the date the virtual machine was provisioned. If you're not familiar with the notes field, it’s at the bottom of the summary page of a virtual machine's properties.

image

This little piece of information might seem trivial to the layperson, but the larger your virtual environment grows, the more complex it becomes. Having a way to track this fluid, ever changing infrastructure becomes more and more important as you begin to scale up and out.

Attributes

The “Notes” field was great for us, except that we began to notice variations in the details that we had entered into each virtual machine's properties. Probably not a huge deal if you have a small VMware environment, but when you start tracking several hundred virtual machines, it really starts having an effect on reporting. A newer feature that was added to Virtual Center was the ability to use attributes, or pre-defined fields that can be populated. This gives a VMware administrator the ability to have a common format for reporting on Virtual Infrastructure. Below is a screen shot of the Custom Attributes you can find in Virtual Center:

image

Notice there are three different categories displayed in this view: Global, Host, and Virtual Machine.  You can set attributes at multiple places in your VI environment that you wish to track.  You can see we are interested in Virtual Machine attributes for certain variables (Owner, Provision Date, Provisioned By, Purpose).  We have a different interest at the host level (Build Date) for maintenance tracking of physical hardware assets.

Reporting

Here is where all your hard work starts to pay off.

It’s audit time.  You are tasked with trimming the fat in your environment because once again you are out of capacity, and the budget just got crushed for the rest of the year because (insert your reason here) the UPS batteries just exploded!  Go into Virtual Center and generate a report of your virtual infrastructure so you can see who owns what, and what date it was deployed.  Go to your Virtual Machine view, select your datacenter, go to the menu option “Export” and then select “Export List”.  Save the export as an Excel spreadsheet, and view your results.  Notice the highlighted columns K through N; these are the custom attributes that we added above.

image
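
Once the list is exported, a small script can answer the “who owns what” question without opening a spreadsheet.  The following is only a sketch under some assumptions: the list was saved as CSV rather than Excel, the header row contains a column named “Owner” (the custom attribute from above), and no field contains an embedded comma.  The function name count_by_owner is made up for this example.

```shell
#!/bin/sh
# Sketch only: assumes a CSV export whose header row has an "Owner" column
# and whose fields contain no embedded commas.
# count_by_owner FILE: print "owner count" pairs, sorted by owner.
count_by_owner() {
    awk -F',' '
        { sub(/\r$/, "") }               # tolerate Windows line endings
        NR == 1 {                         # locate the Owner column by header name
            for (i = 1; i <= NF; i++) if ($i == "Owner") col = i
            next
        }
        col {                             # tally one VM for this row
            owner = ($col == "" ? "(none)" : $col)
            n[owner]++
        }
        END { for (o in n) print o, n[o] }
    ' "$1" | sort
}

# Usage (the file name is a placeholder):
#   count_by_owner exported_vm_list.csv
```

Virtual machines with an empty Owner field are grouped under “(none)”, which is often the most interesting line in an audit.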

Conclusion

Virtual Center custom attributes are a great way to help manage your growing environment.  Sit down with your team, or your potential customer, and find out what values matter most in your environment.  Create the custom attributes at the various places in Virtual Center.  Make sure you are diligent about filling out the details when you bring up new systems, and make it part of your internal process and documentation.  You will thank yourself down the road.


VMware vSphere Capacity IQ Overview – I’m Impressed!

ciq-icon

With the launch of VMware vSphere came some new products that I hadn’t really paid much attention to (busy upgrading, I guess).  One of the newer products is a Virtual Center reporting tool called Capacity IQ.  This product gives an administrator the ability to analyze, forecast and plan for future growth across an ESX environment.  I have had a lot of experience with monitoring/reporting tools in the past (I won’t bore you with the details), so I was quite skeptical of a 1.0 reporting tool for Virtual Center.  I must admit I was blown away by the immediately relevant reports the product was able to produce.

After pulling down the trial install and obtaining the demo key, I loaded it up for a spin.  I am not going to document the installation steps, as Eric Gray has done this for us already.  It is by far the easiest reporting application I have ever installed.  If you’re interested in taking it for a trial run, download the virtual appliance from VMware’s website here (OVF format).  Once you import the virtual appliance and give it a static IP address, it will need to collect data about your environment for a while.

There are three basic views that CIQ gives you once you install the plug-in: dashboard, views and reports.

Dashboard

The dashboard tab is designed to give you a quick overview of the item you have selected.  Capacity IQ uses the same approach as Virtual Center does: whatever object you have selected will be reported and focused on.  Here is a view of one of our clusters; notice January 11th on the Trend and Forecast graph on top.

Dashboard

One of our clusters was out of resources, so I added two more physical hosts to the cluster.  You can see CIQ picks up the new physical host resources for the cluster and reflects this by increasing the number of virtual machines it believes the cluster can accommodate.  Want to see something even more interesting?  Check out the pink graph on the 17th.  Capacity IQ is already using a prebuilt formula to estimate what it thinks we will have (or won’t have) a week out.  Pretty impressive.

Views

The views tab is designed to give you a more detailed look at some of the specific data points.  Here is a screenshot of the various reports you can execute:

Views

So here is where you can get some great visual reports to present to either upper management, or a potential customer.  This gives you a nice interface that you can customize with data points that you can tweak.  Check out the first report on this cluster:

image

This gives you a graphical historical view of your cluster, showing how many virtual machines you have added over the course of time.  Notice the horizontal sliding bar at the bottom of the chart.  This allows you to adjust your time/date window.  The lighter shaded line to the right is the forecasted growth of the cluster.  The views tab is a great place to run some ad-hoc reports; it gives you the ability to select the type of report, and even allows you to export the data.

Reports

The reports tab holds the “pre-canned” reports that can be executed by the administrator.  The one thing I was disappointed not to see here was the ability to schedule these reports to run at a particular interval (weekly/monthly).  This is something that I assume will be introduced in future releases of the product.

Reports

After the report is executed and compiled, you are then provided with a .pdf or .csv version of your dataset to download and review.  The first report totaled 17 pages and provided some great technical information.  Here is the table of contents:

image

Conclusion

I am very impressed with Capacity IQ.  There are no agents you need to install across the virtual machines you wish to report against.  The installation was very straightforward; I think I had it up and running in about 15 minutes.  Once the virtual appliance was in place, all it needed was a little bit of time to start crunching some data.  The reports are well written and very relevant to what an administrator would wish to see.  If you’re looking for a nice reporting tool to help you forecast, give this one a test to see if it fits your needs.


Mass upgrade of VMware Tools in Linux guests

linuximage

Installing and/or upgrading VMware Tools has always been a bit more complicated for Linux guests than for Windows guests.  After the installation of the package binaries, the vmware-config-tools.pl script must be run to configure the tools for your environment.  This script has to be run from the console, which is a pain when you’ve got more than just one or two Linux VMs.  And may the good Lord help you if the modules aren’t suitable for your running kernel and you don’t have a compiler (or the C header files for your running kernel) already installed.

When VMware added the Automatic Tools Upgrade …

image

The situation certainly improved, but it is by no means a foolproof solution.  In my experience, it doesn’t work 100% of the time for Linux guests (though this *could* be due to the heavy modification I’ve done in my distro).  And furthermore, what if you want to automatically upgrade hundreds of Linux guests, not just one?  Or what if you’ve already got a deployment tool that you’d like to use to push the tools out?  (Kind of tough when the script needs to be run directly in the console.)

So, I looked to see if there was a way to improve the situation.  First, I needed to find a way to run vmware-config-tools.pl remotely in an automated fashion.  And by the way, it’s not that you can’t run this script remotely via SSH, because you can.  The problem is that when you do so, you immediately get the following question …

 

It looks like you are trying to run this program in a remote session. This program will temporarily shut down your network connection, so you should only run it from a local console session.  Are you SURE you want to continue?

 

Unfortunately, to run vmware-config-tools.pl remotely, we need to include the -d flag so that the script will automatically select the default answers to all of the questions for us.  And the problem is, the default answer to this question is “no.”

So I looked through vmware-config-tools.pl and found that it’s really only checking to see if the SSH_CONNECTION environment variable is set.  Well, that’s easy … simply executing vmware-config-tools.pl in a fresh shell that does not inherit that variable allows us to sidestep this.
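
To see why a fresh shell helps, here is a minimal sketch of the idea.  The session_type function is a hypothetical stand-in for the check (the real script is Perl, and this is not its code); it only reports whether SSH_CONNECTION is visible to a child process.  I use env -u here to drop the variable for a single command, which is one way to get a shell without it, not necessarily the only one.

```shell
#!/bin/sh
# Hypothetical stand-in for the remote-session test; NOT the real VMware code.
# Prints "remote" when a child shell can see SSH_CONNECTION, "local" otherwise.
session_type() {
    sh -c 'if [ -n "${SSH_CONNECTION:-}" ]; then echo remote; else echo local; fi'
}

# Same check, but the child is started with SSH_CONNECTION removed from
# its environment (env -u unsets a variable for a single command).
session_type_clean() {
    env -u SSH_CONNECTION \
        sh -c 'if [ -n "${SSH_CONNECTION:-}" ]; then echo remote; else echo local; fi'
}

# Pretend we are logged in over SSH (sample value, for illustration only).
export SSH_CONNECTION="192.0.2.10 50000 192.0.2.20 22"

session_type         # child inherits the variable
session_type_clean   # child starts without it
```

The first call sees the variable and the second does not, even though both are launched from the same SSH-style session.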

Next I just created a simple bash script that gets pushed out to the /tmp directory along with the VMware Tools installation package (also pushed to the /tmp directory) and gets executed remotely by my deployment tools (which for me are just more bash scripts, but this should work with any enterprise deployment tool).  Here’s the simple script I used for my guests …

 

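#!/bin/bash

# Find the new VMware Tools package that was pushed to /tmp
RPM=$(ls /tmp | grep VMwareTools)

# Remove the old tools; ">" starts a fresh log file for this run
rpm -e VMwareTools
echo "Old VMwareTools removed" > /tmp/vmware_tools_upgrade.log

# Install the new package; ">>" appends so the earlier entry is kept
rpm -i "/tmp/$RPM"
echo "$RPM installed" >> /tmp/vmware_tools_upgrade.log

# Run the configuration in a fresh root login shell (su -l resets the
# environment, so SSH_CONNECTION is not inherited); the command is quoted
# so that -d reaches the script rather than su
su -l root -c "/usr/bin/vmware-config-tools.pl -d"
echo "vmware-config-tools.pl -d executed" >> /tmp/vmware_tools_upgrade.log

service vmware-tools restart
echo "vmware-tools restarted" >> /tmp/vmware_tools_upgrade.log

service network restart
echo "network restarted" >> /tmp/vmware_tools_upgrade.log

exit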

 

This is obviously a very basic script and could easily be enhanced with better logging and error handling.  Also, for Debian-based distros, such as Ubuntu, you’d need to modify this script to handle the tar.gz installation package … unless, of course, you’ve modified your distro to handle RPMs (as I have).
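
As a rough idea of what “better logging and error handling” could look like, here is a sketch of two helpers.  The names log and run_step and the LOG variable are my own inventions for this example; they are not part of the script above or of VMware Tools.

```shell
#!/bin/bash
# Sketch only: LOG, log and run_step are illustrative names.
LOG="${LOG:-/tmp/vmware_tools_upgrade.log}"

# Append a timestamped line to the log.
log() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') $*" >> "$LOG"
}

# run_step DESCRIPTION COMMAND [ARGS...]
# Run one step, send its output to the log, record whether it succeeded,
# and return the command's exit status to the caller.
run_step() {
    local desc rc
    desc="$1"; shift
    if "$@" >> "$LOG" 2>&1; then
        log "OK: $desc"
    else
        rc=$?
        log "FAILED ($rc): $desc"
        return "$rc"
    fi
}

# Usage inside the upgrade script might look like this (not executed here):
#   run_step "install $RPM" rpm -i "/tmp/$RPM" || exit 1
```

Stopping at the first failed step means a bad package never gets as far as restarting the network.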

The good news is that, at least for my environment:

  1. This works 100% of the time and a restart of the VMs is not necessary. 
  2. I no longer have to upgrade many guests by hand.

However, keep in mind that there is still a network outage during the upgrade (usually just about a minute or two), so be sure to continue using a maintenance window for your upgrades.
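
For anyone without a deployment tool, the push step described earlier could be sketched roughly as follows.  Everything specific here is an assumption for illustration: the guest names, the file names, the DRY_RUN switch, and key-based root SSH access to each guest.  By default it only prints the commands it would run.

```shell
#!/bin/bash
# Sketch only: host names, file names and DRY_RUN are placeholders.
# Assumes key-based root SSH access to each guest.
DRY_RUN="${DRY_RUN:-1}"      # 1 = only print the commands, 0 = really run them

# Either echo a command or execute it, depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# push_upgrade HOST RPM SCRIPT
# Copy both files to /tmp on HOST, then run SCRIPT there.
push_upgrade() {
    local host="$1" rpm="$2" script="$3"
    run scp "$rpm" "$script" "root@$host:/tmp/" &&
    run ssh "root@$host" "bash /tmp/$(basename "$script")"
}

# Example: loop over a list of guests (placeholder names).
for host in linux-vm01 linux-vm02; do
    push_upgrade "$host" VMwareTools.rpm upgrade_tools.sh
done
```

Set DRY_RUN=0 only once the printed commands look right for your environment.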


vSphere 4 Update 1 with Update Manager Shenanigans

New Year Lights 2010

Happy New Year!  I hope everyone enjoyed the holidays and got to spend some time with friends and family.  If you’re reading this, I suggest you pay tribute to the quality of Virtual Insanity and give the gift of voting.  Eric Siebert has released a “best of 2009 blog contest”.  If Virtual Insanity has helped you out in some way in the past, I suggest casting a vote for this great virtualization blog space!  OK, on to the real reason for this post…

I ran into an oddity while bringing a new host online today in our vSphere environment, and thought it best to publish my findings.  Hopefully this might save someone a support call.  With vSphere 4 update 1 came a couple of technical issues, which are detailed here and here.  Personally we don’t use ESXi, so only the first one was a major issue for us.  We are an HP shop, so the issue around the HP agents and update 1 was a major concern (it would basically render the host unbootable).  Luckily VMware support is proactive about announcing issues like this to the community and most people were aware of the problem right away.

The problem I hit today was strange, and at first I thought it was just me being off work for a week.  I went to apply our update 1 baseline to a new host I was bringing up, rescanned, and then got this:

compliant1

What the?  I know this isn’t compliant; our base build is still at 4.0.  Check out the build number, that’s proof.  I have used the update 1 baseline for 50+ hosts so I know it’s not that.  So maybe Update Manager is still on holiday as well.  I restart the service and life is good?  Nope.  Same thing.

To make a long story longer, I poke around in the repository and check the update 1 patch and see it’s valid, yep 11/19/09 that’s the right release date.  Why is this thing not working?

update1-first

I kept poking and prodding thinking maybe they released an update to the update?  Sure enough it slipped by me when I wasn’t looking, or it went to my spam mail.  Check the date 12/9/09.

update1-second

I created a new test baseline, dropped the 12/9/09 update 1 into it, and applied it to my new host.  Lo and behold:

compliant2

That’s much better.  Strange that the older update 1 patch didn’t reflect anything and showed compliant.  As an end user I would have liked to have seen some type of error message, or a reference to the newer released update 1.  I ran the new update (still stopping the HP agents first, just in case), and now things look good again (build number):

looksbetter
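
Since the build number turned out to be the real proof here, a quick cross-check that does not depend on Update Manager can be handy.  The sketch below assumes version text shaped like “VMware ESX 4.0.0 build-NNNNNN”, which is what I would expect from running vmware -v on the host; the function names and the TARGET_BUILD variable are made up for this example, and you would supply the build number you expect yourself.

```shell
#!/bin/sh
# Sketch only: assumes version text shaped like "VMware ESX 4.0.0 build-NNNNNN".
# build_number TEXT: print just the digits that follow "build-".
build_number() {
    echo "$1" | sed -n 's/.*build-\([0-9][0-9]*\).*/\1/p'
}

# needs_update TEXT MIN_BUILD
# Succeeds (exit 0) when the build in TEXT is older than MIN_BUILD.
# Returns non-zero if the host is current OR if no build number was found.
needs_update() {
    b="$(build_number "$1")"
    [ -n "$b" ] && [ "$b" -lt "$2" ]
}

# Usage on a host (TARGET_BUILD is whatever build you expect after patching):
#   needs_update "$(vmware -v)" "$TARGET_BUILD" && echo "host is still behind"
```

If a host reports compliant but this check says it is still behind, that is a good hint to look for a newer version of the patch in the repository.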

Conclusion

Go vote for this site, and make sure you update your Update Manager update 1 baseline.  That’s a lot of updates.  See you online!

Scott
