ESXi Performance Caveats on Drobo Elite

Several months ago, a small firm I consult for ordered a Drobo Elite (recently replaced by the B800i).  They had run ESXi for a while in one of their environments and wanted to explore some of the features that require shared storage.  Like most small businesses, they wanted to get there without breaking the bank.  There aren’t a ton of options in the $6-7k range for iSCSI arrays on the VMware Compatibility Guide (VCG), so it was an easy choice.

Their CIO called up Drobo and placed the order.  He explained what they were going to use it for, and the guy configured it right over the phone and shipped it out.  A few days later, the Drobo Elite arrived configured with 8 x 2TB Western Digital (WD20EARS) disks at a cost of just under $6k.

Setup in ESXi was straightforward.  I followed the documentation from Drobo, set the PSP to VMW_PSP_MRU and the SATP to VMW_SATP_DEFAULT_AA, and started throwing VMs on for testing.

The initial tests were okay.  I wasn’t really bouncing around the room yet, but I am used to larger FC array speeds.  Once I saw that IOmeter was pushing the expected number of IOPS, we were ready to throw on a few VMs.  For some context, we’re talking about a 100-person company with about 20 servers in total.  They’re running 50% of those on ESXi right now on two hosts.  Once normal daily production started with 3-4 VMs hitting the Drobo, everything ground to a halt.

Latency, as reported in ESXTOP, was showing 4,000-5,000 ms, and no single workload was giving it a tough time.  I went back in and double-checked the iSCSI config.  All the bindings were correct, as were the PSP and SATP.  Nothing had changed except adding a couple more VMs to the Drobo.
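If you capture esxtop in batch mode, you can scan the resulting CSV for latency spikes instead of watching the screen.  Here is a minimal sketch of that idea.  The counter name, the 50 ms threshold, the host name, and the device ID are all assumptions for illustration, so check them against the header row of your own capture.

```python
import csv
import io

# Assumed counter name: adjust it to match the header row of your own capture.
LATENCY_SUFFIX = "Average Guest MilliSec/Command"


def high_latency_samples(csv_text, threshold_ms=50.0, suffix=LATENCY_SUFFIX):
    """Return (timestamp, column, value) for every sample above threshold_ms."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Indexes of the columns whose header ends with the latency counter name.
    wanted = [i for i, name in enumerate(header) if name.endswith(suffix)]
    hits = []
    for row in reader:
        for i in wanted:
            try:
                value = float(row[i])
            except (ValueError, IndexError):
                continue  # skip blank or malformed cells
            if value > threshold_ms:
                hits.append((row[0], header[i], value))
    return hits


# Hypothetical two-sample capture with a single device column.
SAMPLE = "\n".join([
    r'"(PDH-CSV 4.0) (UTC)(0)","\\esx01\Physical Disk Device(naa.example)\Average Guest MilliSec/Command"',
    '"08/01/2011 10:00:00","12.5"',
    '"08/01/2011 10:00:05","4512.0"',
])

for timestamp, column, value in high_latency_samples(SAMPLE):
    print(timestamp, value)
```

The threshold is arbitrary; pick whatever your environment considers unhealthy.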

I began to suspect the switch was misconfigured, so I pulled it out, and went direct to the Drobo.  That didn’t really yield a noticeable improvement.  After troubleshooting this forever, and deliberating on the phone with Drobo, they announced their verdict.  Apparently the WD “Green” drives are not supported with VMware.  They said we’d need to buy the Black drives.

Their site quickly confirmed this.  But again, since Drobo configured the unit knowing it was for a two-host VMware environment, we both assumed the Green drives were sufficient and that the extra cost of the Black drives wasn’t warranted for this environment.  I could understand if the customer had gone out and bought some random drives, but they came with the unit directly from Drobo.

They had us run some of their own IOmeter tests directly connected from a Windows box using the MS iSCSI initiator.  We then went ahead and swapped the disks for the recommended WD Black disks, and below are a few charts showing the results.

 

The Black is faster in every way, but the most noticeable aspect is write latency.  I suspect this is due to the increased processing power and faster cache.  Nevertheless, the results speak for themselves.

Bottom line is, if you’re going to run ESXi on a Drobo, don’t go green!

vSphere 5 and the new vSphere Distributed Switch – NetFlow

 

Introduction

With vSphere 5 comes a plethora of new features and functionality across the entire VMware virtualization platform.  One of the core components that got a nice upgrade was the vSphere Distributed Switch (vDS).  For those of you who have not had the chance to use the vDS, it is a centralized administrative interface that lets you manage and update the network configuration in one location instead of on each separate ESX host.  This saves vSphere administrators and network engineers a lot of operational configuration time and scripting.  The vDS is a feature that is packaged with Enterprise Plus licensing.  Here are some of the new features that are included with the vDS 5.0:

  • New stateless firewall that is built into the ESXi kernel (iptables is no longer used)
  • Network I/O Control improvements (network resource pools and 802.1p tagging support)
  • LLDP standard is now supported for network discovery (no longer just CDP support)
  • The ability to mirror ports for advanced network troubleshooting or analysis
  • The ability to configure NetFlow for visibility of inter-VM communication (NetFlow version 5)

 

NetFlow Basics

I could do a write-up on each one of these components as they are all worth discussing in more detail, but I wanted to focus on the NetFlow feature for this post as I think it’s an awesome addition.  NetFlow has had experimental support in vSphere for some time, but now VMware has integrated the functionality right into the vDS, and it is officially supported.

NetFlow gives the administrator the ability to monitor virtual machine network communications to assist with intrusion detection, network profiling, compliance monitoring, and network forensics in general.  Enabling this functionality can give you some real insight into what is going on within your environment from a network perspective.  Having “cool features” is nice, but having features that you can put to use and that show value back to the business is a different matter entirely.
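To make “monitoring network communications” a bit more concrete, here is a rough sketch of what a NetFlow version 5 export looks like on the wire: a 24-byte header followed by one 48-byte record per flow.  The layout follows the published v5 format.  The function names and the sample addresses are made up for illustration, and a real collector does far more than this.

```python
import socket
import struct

HEADER = struct.Struct("!HHIIIIBBH")                # 24-byte v5 header
RECORD = struct.Struct("!4s4s4sHHIIIIHHBBBBHHBBH")  # 48-byte v5 flow record


def parse_netflow_v5(datagram):
    """Decode one NetFlow v5 export datagram into a list of flow dicts."""
    version, count = HEADER.unpack_from(datagram, 0)[:2]
    if version != 5:
        raise ValueError("not a NetFlow v5 datagram: version %d" % version)
    flows = []
    for n in range(count):
        f = RECORD.unpack_from(datagram, HEADER.size + n * RECORD.size)
        flows.append({
            "src": socket.inet_ntoa(f[0]),
            "dst": socket.inet_ntoa(f[1]),
            "packets": f[5],
            "bytes": f[6],
            "src_port": f[9],
            "dst_port": f[10],
            "protocol": f[13],  # IP protocol number: 6 = TCP, 17 = UDP
        })
    return flows


def build_sample():
    """Build a hypothetical one-flow datagram for trying out the parser."""
    header = HEADER.pack(5, 1, 0, 0, 0, 0, 0, 0, 0)
    record = RECORD.pack(
        socket.inet_aton("10.0.0.5"), socket.inet_aton("10.0.0.9"),
        socket.inet_aton("0.0.0.0"), 0, 0, 10, 1500, 0, 0,
        49152, 443, 0, 0, 6, 0, 0, 0, 0, 0, 0)
    return header + record


for flow in parse_netflow_v5(build_sample()):
    print(flow["src"], "->", flow["dst"], flow["dst_port"], flow["bytes"])
```

Each record tells you who talked to whom, on which ports, and how much data moved, which is the raw material for every report shown later in this post.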

Let’s look at how to set up NetFlow on the new vDS, then take a look at the data you can extract from NetFlow with a third-party NetFlow viewer.  Once you see the value of the data, you can then make some important IT business decisions on how to mitigate risk and protect your investment by getting ahead of the curve (for example, with VMware vShield or some other third-party software).

 

Set Up Your vDS 5 Switch

Ensure you are running VMware vSphere 5.0 and have activated Enterprise Plus licensing so you can set up the vDS in your environment.  You can see below the new option to deploy a vDS 5.0 switch, and of course we offer backwards compatibility for those who need to deploy to their 4.x environments.  Select the 5.0 version and hit next.

 

image

In the “General” section, give the vDS a name; in this example I am calling it “dvSwitch5”.  Next, select the number of network interface cards you want to participate in the switch, and then select next.

 

image

For each host in your cluster that you wish to participate in the vDS, you will need to configure the network interfaces that will support this vDS implementation.  In this example I have selected vmnic 4 and vmnic 5 to be members of the vDS 5 switch.  Select next.

 

image

That’s it.  Review the summary and select finish to bring your vDS configuration online, and then you can begin configuring NetFlow.

 

image

 

Set Up NetFlow on the vDS 5

Now that you have a fully functioning vDS 5.0 switch, you can actually start to use it!  First let’s configure NetFlow on the vDS and enable it on the dvPortGroup, then we will move some virtual machines over to the new vDS so we can get some real data flowing.  Right click on your newly created dvSwitch and select “edit settings”.  Go to the “NetFlow” tab across the top of the page.  You will need to give your vDS an IP address so your NetFlow tool will know where the data is coming from.  Populate an IP address for the vDS, then enter the IP address of the collector you plan on using to receive the data.  Make sure you enter the port number (default is 1) that matches how you set up your NetFlow application to communicate.
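Before pointing a full analysis product at the switch, it can help to confirm that exports are actually reaching the collector address and port you entered.  This is a minimal sketch of a UDP listener that waits for one datagram and reports who sent it.  Port 2055 is only an assumed example value, so use whatever port you typed into the NetFlow tab.

```python
import socket
import struct


def open_collector(bind_ip="0.0.0.0", port=2055, timeout=5.0):
    """Open a UDP socket on the collector address; 2055 is an assumed port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    sock.bind((bind_ip, port))
    return sock


def wait_for_export(sock):
    """Wait for one datagram; return (exporter_ip, version, flow_count)."""
    data, (exporter_ip, _port) = sock.recvfrom(65535)
    if len(data) < 4:
        raise ValueError("datagram too short to be a NetFlow export")
    # The first two 16-bit fields of a NetFlow header are version and count.
    version, count = struct.unpack_from("!HH", data, 0)
    return exporter_ip, version, count
```

Calling `wait_for_export(open_collector())` on the collector machine should return the IP address you assigned to the vDS and a version of 5.  If it raises `socket.timeout` instead, recheck the collector IP, the port, and any firewall in between.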

 

image

Right click on the dvPortGroup within the vDS and select the “monitoring” option and enable NetFlow so you can begin to collect data.

 

image

Move a few VMs over to the new vDS so you can begin to capture some real data within your newly established NetFlow configuration.  I have highlighted below how you can change the network connection on a VM to now utilize the dvSwitch5 we created earlier.

 

image

Pull Some Data

You will need a third-party NetFlow analysis tool to parse the data we have started to generate.  In the example below I am using a pretty nice application called ManageEngine NetFlow Analyzer.  I won’t be covering how to install or set up this application here, as your organization might already have a network tool that it has standardized on.  Once you have moved some virtual machines over to the new vDS, create some traffic so there is relevant data to examine.  I ran a few speedtest.net downloads and hit some websites to generate the traffic shown below.

 

image

Below you can see the different virtual interfaces on my vDS that are being monitored.  You can see our application is showing us what type of traffic we are examining, and the consumption of the different TCP/UDP ports that are communicating both inbound and outbound on the switch.
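Under the hood, a per-port consumption view like this is mostly a matter of summing bytes per protocol and destination port.  Here is a small sketch of that aggregation.  The flow dictionaries and their field names are hypothetical, and this is not how any particular product implements its reports.

```python
from collections import Counter


def bytes_by_service(flows):
    """Total bytes per (IP protocol number, destination port), largest first."""
    totals = Counter()
    for flow in flows:
        totals[(flow["protocol"], flow["dst_port"])] += flow["bytes"]
    return totals.most_common()


# Hypothetical flows: protocol 6 is TCP, protocol 17 is UDP.
SAMPLE_FLOWS = [
    {"protocol": 6, "dst_port": 80, "bytes": 120000},
    {"protocol": 6, "dst_port": 443, "bytes": 40000},
    {"protocol": 17, "dst_port": 53, "bytes": 900},
    {"protocol": 6, "dst_port": 80, "bytes": 30000},
]

for (protocol, port), total in bytes_by_service(SAMPLE_FLOWS):
    print(protocol, port, total)
```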

 

image

The “under the covers” reporting is great stuff, but let’s start to look at how this can help the business.  Consider a VMware View environment where you are supporting hundreds if not thousands of desktop images.  You can use the NetFlow data to examine whether certain VMs are communicating with production systems that they shouldn’t be communicating with at all.  How about reducing the overall workload on your VMware View ESX server?  Many of the NetFlow products like the one I am showing here will produce reports on where users are going externally on the internet.  See the report below.  YouTube is probably a website you want to keep an eye on, as streaming video can greatly impact a virtual desktop environment.
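The “desktops talking to production” check can be sketched in a few lines once you have flow records in hand.  The subnets, the allowed port, and the flow field names below are all hypothetical, so substitute your own View desktop and production ranges.

```python
import ipaddress

# Hypothetical address plan; replace with your own ranges and exceptions.
DESKTOP_NET = ipaddress.ip_network("10.20.0.0/16")     # View desktops
PRODUCTION_NET = ipaddress.ip_network("10.10.0.0/16")  # production servers
ALLOWED_PORTS = {443}                                  # permitted services


def policy_violations(flows):
    """Return flows from a desktop to production on a port that is not allowed."""
    hits = []
    for flow in flows:
        src = ipaddress.ip_address(flow["src"])
        dst = ipaddress.ip_address(flow["dst"])
        from_desktop_to_prod = src in DESKTOP_NET and dst in PRODUCTION_NET
        if from_desktop_to_prod and flow["dst_port"] not in ALLOWED_PORTS:
            hits.append(flow)
    return hits


SAMPLE_FLOWS = [
    {"src": "10.20.1.15", "dst": "10.10.5.20", "dst_port": 1433},  # flagged
    {"src": "10.20.1.15", "dst": "10.10.5.21", "dst_port": 443},   # allowed port
    {"src": "10.30.0.7", "dst": "10.10.5.20", "dst_port": 1433},   # not a desktop
]

for flow in policy_violations(SAMPLE_FLOWS):
    print(flow["src"], "->", flow["dst"], flow["dst_port"])
```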

 

image

From an intrusion detection and compliance perspective, you can now gain visibility into the vSphere environment to begin to understand some of the network communications that are taking place.  See below:

 

image

 

From a risk mitigation perspective, VMware can help you eliminate the security vulnerabilities that you are beginning to gather data on.  VMware vShield has three different solutions that can help protect your environment from the edge to the core.  I would suggest segmenting and protecting your internal workloads to eliminate these security risks.  From a virtual desktop perspective, the desktop workloads are better served being contained in their own protected segment (VLANs are broadcast domains, not protected segments).  Below is an example of how a logical vShield configuration can begin to help you segment your virtual infrastructure.

 

image

Conclusion

VMware vSphere 5 offers some great new features that are integrated into the new vSphere 5 Distributed Switch.  Start to leverage your existing investment by examining your network infrastructure with the NetFlow data you can now begin to extract.  Once you have gathered this data, begin considering how you can mitigate some of the security and compliance risks within your organization.  VMware vShield is a product that can help you in this regard and will integrate into your current environment.

 

-Scott

VMware Licensing Debate Post-Mortem

 

BREAKING – PALO ALTO (VP)

The VMware licensing debate was killed this afternoon while trying to rescue the #vTax hashtag from the inside lane of the Ridiculous Interstate.  Witnesses say a bearded, balding “smart-looking” man was driving north at a very high rate of speed in a truck with the license plate VMW when the debate was struck.  The truck backed up and struck the debate again and again before authorities arrived and pronounced the debate dead at 5 PM PDT today.

I am writing this as a blogger at Virtual Insanity, and a customer of VMware. I don’t sell VMware, and I’ve never worked for VMware. I don’t even work for a partner. I barely get to chat with my fellow bloggers who work for VMware, and am certainly not privy to inside information, despite my company’s NDA.

With that out of the way, VMware has done the right thing here. The fact that they can take customer feedback and mold it into a dramatic licensing change, just a few weeks before a product goes GA, is astounding. That speaks not only to the agility of the company, but also to their willingness to please their customers.

They even went out of their way to please NON-PAYING customers with this change. The change to the free version was causing more drama than the change to customers who spend millions with VMware.

Should VMware have focus-grouped the licensing change more than they did? Yes. It would have preempted the customer perception wildfire they have had to fight for the past couple of weeks. I am sure they ran the numbers and knew that only a small percentage would be impacted. But the fact is an even smaller percentage actually ran the scripts to see how it would affect them. Once the fervor got started by a few, it wasn’t going to stop.

A price increase was inevitable. VMware has given us HUNDREDS of new features in the past several years for free. I think not increasing it with 4.0 was the right move, but they couldn’t hold out forever. The new vRAM allotments and policies are spot on, and are going to put a lot of customers’ fears to bed.

Now we can get on with discussing the amazing new features of vSphere 5.0 without that licensing cloud hanging over our heads.

Kudos VMware!