Leverage VMware vCops to pull custom vSphere statistics

vcops_perf

Introduction

A couple of weeks ago I was presenting at the regional Columbus, OH VMUG on “Troubleshooting Storage Performance in vSphere”.  The content was put together by our internal storage guru Joseph Dieckhans and I modified some of the content along the way.  If you are interested in seeing the presentation, you can view it here.

The presentation covers a lot of great information on troubleshooting with ESXTOP and identifying the various subcomponents of the storage stack that are important to monitor.  When I deliver this presentation, it typically brings up some great questions and conversations.  One of the questions that was asked was around VMware vCenter Operations Manager and its ability to monitor storage.  My answer was yes, vCops will do a great job monitoring your storage infrastructure, as it uses analytics to understand your storage performance and will send smart alerts when there are anomalies.  But the customer wanted to know if we were specifically monitoring all of the components in ESXTOP that we were covering in the session.  Good question!

vCops Metrics

I decided to dig into this one to see if there were any gaps between good old ESXTOP and vCops, so let’s take a look.  Below is a screenshot of the vCops disk statistics that are being monitored for the various LUNs.  In this example I am showing you an iSCSI device being presented to the ESX host.

vcops-disk

As you can see vCops is monitoring latency, Kbps, and SCSI reservation conflicts.  That’s a pretty good list of metrics that you would want to know about if you suspected a problem with the storage infrastructure.  I think even CTU’s very own technical specialist, Chloe O’Brian, would be happy with those metrics.

chloe24

Get more Detail

If you think you’re better than Chloe, and need more detail than what’s provided out of the box with vCops, have no fear.  VMware vCops is very flexible and you can customize the data feeds in a lot of different ways.  You might have recently seen Clint Kitson’s posts around injecting metrics into vCops.  This was the first phase of EMC integrating their storage-specific metrics into vCops for analysis and reporting (unsupported).  EMC is working on an official adapter that their customers will be able to leverage if they are a VMware vCops customer.  I expect we will see more and more storage vendors offering up a supported adapter for vCops in the future.

PowerShell is a great way to pull VMware performance data.  You can utilize the “get-esxtop” or “get-stat” commands to get the same visibility as what is covered in the troubleshooting storage presentation.  Let’s see if we can add more details to vCops than what is given to us out of the box.

PowerCLI commands

Let’s start with an important metric we covered in the presentation: “KAVG”.  Let’s get it from PowerCLI and have it display data back for a system we are interested in monitoring.  Here I am utilizing the PowerCLI command “get-stat” to pull some statistics on the VMkernel and its associated latency.  (It should be close to 0 ms; if it is above 2 ms you should investigate!)

get_disk_stat

Connect-VIServer -Server [YOUR HOST] -User root -Password [Your Password]
get-stat -Entity (Get-VMHost) -instance [YOUR DEVICE] -Stat disk.kernellatency.average

Here are the returned values I get back from the above query:

getstat_value
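If you want the 2 ms rule of thumb applied to each returned value automatically, a small helper can do it.  This is only a sketch: the function name Get-KavgStatus and its parameter names are my own for illustration and are not part of PowerCLI, and the 2 ms default simply mirrors the guidance above.

```powershell
# Hypothetical helper (not a PowerCLI cmdlet): flags a kernel latency value
# against the "above 2 ms, investigate" rule of thumb.
function Get-KavgStatus {
    param(
        [Parameter(Mandatory = $true)][double]$LatencyMs,
        [double]$ThresholdMs = 2   # assumption: taken from the guidance in this post
    )
    if ($LatencyMs -gt $ThresholdMs) { 'investigate' } else { 'ok' }
}

# Made-up values, only to show the call.
$quiet = Get-KavgStatus -LatencyMs 0.1
$busy  = Get-KavgStatus -LatencyMs 3.5
```

In practice you would pass the Value property of a get-stat result instead of the made-up numbers.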

Let’s format the data results for vCops.  Just append the following to the end of the above command so it looks like this:

Connect-VIServer -Server 192.168.1.101 -User root -Password REDACTED
get-stat -Entity (Get-VMHost) -instance naa.5000144f05346019 -Stat disk.kernellatency.average | sort timestamp -desc | select -first 1 | select @{n="name";e={$_.instance}},value
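To see what the sort and select steps of that pipeline do without needing a live host, here is a minimal sketch that runs the same steps against hand-made objects.  The sample data, the device name and the function name Select-LatestSample are all hypothetical stand-ins; only the Instance, Timestamp and Value property names come from the command above.

```powershell
# Hypothetical stand-in for get-stat output (values are made up).
$samples = @(
    [pscustomobject]@{ Instance = 'naa.example-device'; Timestamp = [datetime]'2013-01-01 10:00:00'; Value = 0.2 }
    [pscustomobject]@{ Instance = 'naa.example-device'; Timestamp = [datetime]'2013-01-01 10:00:40'; Value = 0.4 }
    [pscustomobject]@{ Instance = 'naa.example-device'; Timestamp = [datetime]'2013-01-01 10:00:20'; Value = 0.3 }
)

# Illustrative helper: keep only the newest sample and reshape it to name/value.
function Select-LatestSample {
    param([Parameter(Mandatory = $true)][object[]]$Samples)
    $Samples |
        Sort-Object -Property Timestamp -Descending |   # newest first
        Select-Object -First 1 |                        # keep one data point
        Select-Object @{ Name = 'name'; Expression = { $_.Instance } }, Value
}

$latest = Select-LatestSample -Samples $samples
```

The result is a single object with a name and a value property, which is the shape being handed to the posting script in the next step.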

 

OK great, now we have the data points I am interested in, so let’s take them into vCops with the work Clint Kitson and Matt Cowger put together.  The following PowerShell script takes the output and passes it off to vCops via an HTTP POST.

http_post

C:\Program Files (x86)\VMware\Infrastructure\vSphere PowerCLI> C:\Users\ssauer\Desktop\kavg.ps1 | C:\Users\ssauer\Desktop\ps_vcops_httpost.ps1 -vcopsip 192.168.1.220 -devicename iSCSI -resourcedescription "iSCSI KAVG" -devicetype ps-vmware-esxtop -protocol https -vcopsuser admin -vcopspass *REDACTED* -post;sleep 60

 

Let’s log in to the vCops custom UI (https://[VCOPS-IP]/vcops-custom) and check out our data to see if it’s posting correctly.  Navigate to the environment tab at the top of the screen, then select the option “environment overview” to find the new HTTP POST resource.  It most likely will show a blue icon, as vCops hasn’t had enough time to baseline the data to understand the dynamic thresholds.

vcops_data

The above data graph isn’t really that sexy, since my home ESX lab host isn’t being worked hard enough to produce much kernel latency.  You can now set up a task to run the PowerShell script every x amount of minutes to automate the data pull.  From here you can create a customized dashboard for the specific data metrics you would like to present back to your operations team, or possibly your manager to show him why you deserve a raise.
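If you don’t want to set up a scheduled task right away, a simple loop is enough to repeat the pull while you are troubleshooting.  The sketch below is an assumption on my part rather than anything shipped with PowerCLI or vCops: Invoke-Every and its parameters are hypothetical names, and the script block here only emits a placeholder string so the example runs anywhere.

```powershell
# Hypothetical helper: runs a script block a fixed number of times,
# pausing between runs, and returns whatever the script block outputs.
function Invoke-Every {
    param(
        [Parameter(Mandatory = $true)][scriptblock]$Action,
        [int]$IntervalSeconds = 60,
        [int]$Iterations = 3
    )
    for ($i = 1; $i -le $Iterations; $i++) {
        & $Action
        # No need to wait after the final run.
        if ($i -lt $Iterations) { Start-Sleep -Seconds $IntervalSeconds }
    }
}

# Placeholder action and a zero-second interval, just to demonstrate the loop.
$ticks = @(Invoke-Every -Action { 'tick' } -IntervalSeconds 0 -Iterations 3)
```

To use it for real, you would swap the placeholder script block for the kavg.ps1 pipeline shown earlier and pick an interval in seconds that matches how often you want data posted.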

Conclusion

The question about getting ESXTOP data into vCops has now been answered.  With the example above you can pull specific ESXTOP or other vSphere statistics into the product.  This is obviously not an approved or supported method, and certainly not a method I would recommend implementing at large scale.  It is a helpful utility that you can leverage for troubleshooting performance problems in your storage stack.  Not only do you have a visual representation of these data metrics, but you are now leveraging the vCops patented analytics to start getting smart alerts on data anomalies.

-Scott