The vTrooper Report

I was asked a question about a specific use case where a second vCPU should be added to a VM in a virtualized environment. Generally it's an easy answer:

If the server can execute multiple threads and really uses the second vCPU for that other thread

then it’s probably OK to add the second vCPU to the VM.

Adding a vCPU to a server to make it SMP is an elementary task in VMware, but it has a few impacts:

  • It will change your metrics for reporting
  • It changes the HA slot size for your failover needs
  • It will modify your consolidation ratio per core, and indirectly per socket, which affects your capacity planning
  • It will make you redeploy the Ubuntu or other Linux server that you forgot to build with an SMP kernel. (Not to be taken lightly or your server won’t boot)
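
To put a number on the consolidation ratio point, here is a minimal Python sketch. The host size and VM counts are made-up figures for illustration, and `vcpus_per_core` is a hypothetical helper, not part of any VMware tool.

```python
def vcpus_per_core(vm_vcpu_counts, physical_cores):
    """Return the vCPU-to-physical-core consolidation ratio for one host."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vm_vcpu_counts) / physical_cores


# Hypothetical host: 8 physical cores running 20 single-vCPU VMs.
before = [1] * 20
# Same host after 5 of those VMs are given a second vCPU.
after = [2] * 5 + [1] * 15

if __name__ == "__main__":
    print(f"before: {vcpus_per_core(before, 8):.2f} vCPUs per core")
    print(f"after:  {vcpus_per_core(after, 8):.2f} vCPUs per core")
```

In this example, five extra vCPUs move the host from 2.5 to roughly 3.1 vCPUs per core without adding a single VM.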

I was exploring the use case and impacts when a bit of information popped up:

Garbage collection in .NET applications requires a second vCPU to run in ‘Server Mode’ rather than ‘Workstation Mode’.

Explanation from MSDN: http://msdn.microsoft.com/en-us/library/bb680014.aspx

Managed code applications that use the server API receive significant benefits from using the server-optimized garbage collector (GC) instead of the default workstation GC.

Workstation is the default GC mode and the only one available on single-processor computers. Workstation GC is hosted in console and Windows Forms applications. It performs full (generation 2) collections concurrently with the running program, thereby minimizing latency. This mode is useful for client applications, where perceived performance is usually more important than raw throughput.

The server GC is available only on multiprocessor computers. It creates a separate managed heap and thread for each processor and performs collections in parallel. During collection, all managed threads are paused (threads running native code are paused only when the native call returns). In this way, the server GC mode maximizes throughput (the number of requests per second) and improves performance as the number of processors increases. Performance especially shines on computers with four or more processors.

This caught me by surprise and makes me think: for every VM disk around the world that is out of whack (misaligned), there is an equal number of .NET app servers, virtualized with P2V tools across the globe, that are starved for the correct garbage collection mechanism…

OH, THE HUMANITY !!

All would Perish

Hindenburg and .NET

Whoa!  I gotta settle down!

Ok

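Now you are going to ask where the special override switch or Regedit value is that would fix it.  The answer is even easier: there isn’t one.  The .NET runtime looks at how many processors the guest presents and decides for the app.  On a single vCPU you get workstation GC, and you cannot override that, even if the application’s configuration requests server GC.  You can add up to 4 vCPUs to improve its performance, but you cannot switch server mode on with only one.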
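
The decision described above can be modeled in a few lines. This is only a sketch of the documented behavior, written in Python rather than .NET. `effective_gc_mode` and its parameters are hypothetical names, and `server_gc_requested` stands in for the application's `<gcServer enabled="true"/>` configuration setting.

```python
import os


def effective_gc_mode(server_gc_requested, logical_cpus):
    """Model which GC flavor a .NET app ends up with.

    Server GC is only available when the guest sees more than one
    processor; otherwise the runtime uses workstation GC.
    """
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    if server_gc_requested and logical_cpus > 1:
        return "server"
    return "workstation"


if __name__ == "__main__":
    # os.cpu_count() reports the logical CPUs the guest OS sees (may be None).
    cpus = os.cpu_count() or 1
    print(f"{cpus} logical CPU(s): {effective_gc_mode(True, cpus)} GC")
```

Run inside the guest, this shows how many processors the VM actually presents and which mode an application asking for server GC would end up with under this model.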

This probably explains a few of the things that have already happened or will happen in your app development world:

  • .NET development on dual-core workstations works fine, but when you move the application to a single-core VM it hits a performance hiccup while GC runs.
  • VM admins who have been averse to an idle second vCPU now have a reason to deploy one, but they won’t like it.
  • It gives you a reason to migrate to vSphere sooner rather than later, because of the relaxed CPU scheduler that was introduced in 4.0.
  • Additional vCPUs will drive more ‘Eggs’ into your baskets. Do Not Panic.

As all the world’s workloads increase, it’s inevitable that the number of vCPUs will increase as well.  The push for 64-bit systems with over 4GB of RAM is driving up the size of the VM in most farms, as seen in the new maximums in the VMware vSphere release.  Just remember that you can look for vCPU contention (for example, the %RDY counter) and NUMA pressure in the esxtop values.


  • http://twitter.com/kculw Kelly Culwell

    Nice one, John! I love explaining to people how multiple vCPUs can sometimes hurt performance as well.

  • Stu

    Excellent post – too often we focus purely on the infrastructure side of things (i.e. misaligned disks) with zero regard for implications above the OS.