More Bang for Your Buck with PVSCSI (Part 1)

One of the new features added in the VMware vSphere 4.0 release is a new SCSI subsystem driver that delivers more I/O throughput and lower latency per virtual machine.  What the heck is PVSCSI?  Here is the technical definition, pulled right from the vSphere storage guide (RTFM).


“VMware Paravirtualized SCSI (PVSCSI) is a special-purpose driver for high-performance storage adapters that offers greater throughput and lower CPU utilization for virtual machines. It is best suited for environments in which guest applications are very I/O intensive. VMware requires that you create a primary adapter for use with the disk that will host the system software (boot disk) and a separate PVSCSI adapter for the disk that will store user data, such as a database.  The primary adapter will be the default for the guest operating system on the virtual machine; for example, on a virtual machine with a Microsoft Windows 2008 guest operating system, LSI Logic is the default primary adapter. The PVSCSI driver is similar to vmxnet in that it is an enhanced, optimized special-purpose driver for VM traffic, and it works only with certain guest OS versions, which currently include Windows Server 2003, Windows Server 2008, and RHEL 5. It can also be shared by multiple VMs running on a single ESX host, unlike VMDirectPath I/O, which dedicates a single adapter to a single VM.”
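The operational takeaway from that definition is the adapter split: the boot disk stays on the default controller (LSI Logic for Windows 2008), and the data disk hangs off a second, PVSCSI controller.  Just to make that idea concrete, here is a minimal sketch of what the reconfiguration looks like against the vSphere API using the pyVmomi Python bindings; the connection details, VM name, and disk size are placeholders, so treat it as an illustration rather than the exact procedure (that is what part 2 is for).

```python
# Illustrative sketch: add a PVSCSI controller plus a data disk to an existing VM,
# leaving the boot disk on its default (LSI Logic) controller.
# Assumptions: pyVmomi is installed; host, credentials, and VM name are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.local", user="administrator",
                  pwd="password", sslContext=ctx)
content = si.RetrieveContent()

# Find the target VM by name (placeholder name).
view = content.viewManager.CreateContainerView(content.rootFolder,
                                               [vim.VirtualMachine], True)
vm = next(v for v in view.view if v.name == "pvscsi-test-vm")

# New PVSCSI controller on SCSI bus 1 (bus 0 keeps the boot disk's default adapter).
controller = vim.vm.device.ParaVirtualSCSIController()
controller.key = -101                 # temporary negative key, resolved by vCenter
controller.busNumber = 1
controller.sharedBus = 'noSharing'

ctrl_spec = vim.vm.device.VirtualDeviceSpec()
ctrl_spec.operation = vim.vm.device.VirtualDeviceSpec.Operation.add
ctrl_spec.device = controller

# New thin-provisioned data disk attached to that controller.
backing = vim.vm.device.VirtualDisk.FlatVer2BackingInfo()
backing.diskMode = 'persistent'
backing.thinProvisioned = True

disk = vim.vm.device.VirtualDisk()
disk.key = -102
disk.controllerKey = -101             # points at the new PVSCSI controller above
disk.unitNumber = 0
disk.capacityInKB = 20 * 1024 * 1024  # 20 GB, arbitrary for this example
disk.backing = backing

disk_spec = vim.vm.device.VirtualDeviceSpec()
disk_spec.operation = vim.vm.device.VirtualDeviceSpec.Operation.add
disk_spec.fileOperation = vim.vm.device.VirtualDeviceSpec.FileOperation.create
disk_spec.device = disk

task = vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(deviceChange=[ctrl_spec, disk_spec]))
print("Reconfigure task started: %s" % task)

Disconnect(si)
```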

So what does all that mean for you?  Better disk performance and fewer CPU cycles spent processing those disk requests.  I took some notes at VMworld 2009 during a few different sessions that discussed PVSCSI.  Here is my logical diagram of what PVSCSI is.  Download the PDF version here so you can print it out and frame it on your cube wall!

Sauer_PVSCSI.pdf

[Figure: PVSCSI logical diagram]

With the initial release of vSphere 4.0, PVSCSI was only supported on disks other than the operating system disk (secondary data drives).  For more information, reference KB Article 1010398.

vSphere 4 Update 1 has now been released, and it brings exciting news for those looking to utilize PVSCSI: support for boot disks attached to a Paravirtualized SCSI (PVSCSI) adapter has been added for Windows 2003 and 2008 guest operating systems.

So let’s first find out if it’s all that.  We need to do some testing to validate the hype.  I created two virtual machines, one with the traditional LSI Logic SCSI driver and one with the new PVSCSI driver.  The host is the same for each VM: a 4-socket Intel Xeon system with 64 GB of RAM, connected to EMC CLARiiON CX3-80 storage.  The RAID configuration is a 4+1 RAID 5 set (10K spindles), with the default CLARiiON active/passive MRU setup (no PowerPath/VE).  Each VM has 2 vCPUs and 4 GB of RAM, and both are running 32-bit Microsoft Windows 2003 R2.  Both virtual machines’ data disks were partitioned with diskpart so the tracks were correctly aligned.  Anti-virus real-time scanning was disabled on both systems.  This test is meant to get as close as possible to a standard configuration that we can benchmark from.
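As an aside, if you want to double-check that alignment from inside the Windows guest, the partition starting offsets are exposed through WMI.  Here is a quick illustrative sketch in Python; it assumes the third-party wmi package and uses a 64 KB boundary as the target, which is what the CLARiiON guidance I followed calls for, so adjust it to whatever your array’s best practice says.

```python
# Illustrative only: report whether each partition's starting offset falls on a
# 64 KB boundary.  Assumes the third-party "wmi" package (pip install wmi) on a
# Windows guest; the 64 KB target is an assumption based on CLARiiON guidance.
import wmi

ALIGNMENT_BYTES = 64 * 1024  # 64 KB boundary

c = wmi.WMI()
for partition in c.Win32_DiskPartition():
    offset = int(partition.StartingOffset)
    aligned = offset % ALIGNMENT_BYTES == 0
    print("%-30s offset=%-12d aligned=%s" % (partition.DeviceID, offset, aligned))
```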

I used IOmeter as my testing engine.  I didn’t go too deep into the various settings.  The first run is 32K 50%R 0%W.

Non-PVSCSI

[Screenshot: IOmeter results, LSI Logic]

With-PVSCSI

[Screenshot: IOmeter results, PVSCSI]

Quite the difference, no?  To be honest, I was seeing a lot of fluctuation while doing my tests.  I probably should have isolated things a little more, but the screen captures show the average of the results.  I figured I should also run IOmeter’s built-in random/combined access specification, so here you go.

Non-PVSCSI

[Screenshot: IOmeter combined-run results, LSI Logic]

With-PVSCSI

[Screenshot: IOmeter combined-run results, PVSCSI]

I believe the results speak for themselves.  I still need to do a little more testing to satisfy my own curiosity.  I want to get more insight into the differences between reads and writes at the various transfer sizes.  I am certain array cache has a lot to do with the results, but IOmeter can largely bypass cache when you force the randomizer.  I’m also curious about the sweet spot for block sizes and how that plays out for reads versus writes.
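To keep that follow-up testing organized, I’ll most likely script out the matrix of runs up front.  Here is a quick Python sketch of the kind of sweep I have in mind; the block sizes and read/random mixes are just my own picks to feed into IOmeter by hand, nothing IOmeter-specific.

```python
# Sketch of the test matrix for part 2: every combination of adapter, block size,
# read percentage, and randomness I plan to run through IOmeter by hand.
# The specific values are my own choices for this sweep, not IOmeter defaults.
from itertools import product

adapters = ["LSI Logic", "PVSCSI"]
block_sizes_kb = [4, 8, 16, 32, 64]   # transfer request sizes to sweep
read_percentages = [0, 50, 100]       # 0 = pure write, 100 = pure read
random_percentages = [0, 100]         # sequential vs. fully random access

runs = [
    {"adapter": a, "block_kb": b, "read_pct": r, "random_pct": rnd}
    for a, b, r, rnd in product(adapters, block_sizes_kb,
                                read_percentages, random_percentages)
]

for run in runs:
    print("{adapter:10s} {block_kb:>3d}K  {read_pct:>3d}% read  "
          "{random_pct:>3d}% random".format(**run))
print("Total runs: %d" % len(runs))
```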

Conclusion

PVSCSI is a technology worth moving toward.  There is no cost involved, and it can deliver better disk performance across your ESX environment.  It can also bring your host CPU utilization down, which can give you better consolidation ratios across your clusters.  Stay tuned for part 2, where I will cover the “how to do it” aspect so you can begin to leverage this technology you are already paying for!

I hope you found this information helpful.  Thanks go out to Aaron Sweemer (@asweemer) for letting me abuse his website so I don’t have to deal with bringing up my own site.

Thanks!

Scott Sauer