My Take On EMC VNX

Yesterday I had a marathon five-hour executive briefing with EMC, and I learned a lot about the VNX, and EMC’s strategy going forward.  This was good information, and I want to give my take on what I learned, as well as maybe open up some discussion in the comments.

First off, I have to say one thing about EMC.  Regardless of what you think about their storage technology and product line-up, their sales team is absolutely fantastic!  I have yet to experience pre-sales as good as EMC’s.  They are polished, professional, and knowledgeable about their product.  They brought in a wide array of talent, including a vSpecialist, and there was not a single question they could not answer.  This is impressive, considering some of our questions were a bit. . . creative.

Aside from all this, I am most impressed with their ability to put in an awful lot of work to analyze my current environment, and figure out exactly what my needs are.  No one else has even come close in my recent experience.  This alone goes a long way toward overcoming what could be perceived as product weaknesses versus their competition.

In my opinion, the VNX launch is an attempt at addressing the lessons learned by EMC over the past several years.  I believe we can all agree that EMC was caught off guard by the ability of other vendors to integrate more tightly with VMware than VMware’s majority owner.  Their recent moves have shown their willingness to correct that, and a strong desire to leapfrog the competition.

Considering the combination of VNX, recent tighter integration with VMware, and the intense growth of EMC’s battalion of vSpecialists, one can only presume that they are willing to use brute force to become the VMware platform of choice.  I find it remarkable that a company with as many divisions and products can move with the degree of agility they have shown recently.

VNX introduces some interesting changes besides the concept of a truly unified platform.  Fibre Channel is out, and SAS is in.  This makes sense to me, considering the current pricing trends of SSD.  It no longer makes as much sense to go with FC as a tier of storage.  When SSD was 30x more expensive, not many considered it a viable alternative to FC, despite the huge performance advantage.  These days, it’s maybe 3-5x more expensive, and that’s easy to justify with the performance.  The price should continue to plummet as more manufacturers come online with more fab plants, so it won’t be long before there is price parity between FC and SSD.  This was the right call, in my opinion.

A new version of FAST VP is another significant change introduced with VNX.  When you lose an entire category of disks (FC), and only have SAS and SSD, you can eliminate the issues with FAST where you still could have hot spots on slower disks, and cool spots on your faster disks.  Now there are only two tiers, so there should not be an issue with data finding itself on the wrong tier.

There is no way I can go through all the features and software changes introduced with VNX, and that is not really my intent.  I do want to flesh out one area where I see a potential design flaw.  Of course, this is all based on my opinion, and I am not in the same league as EMC, NetApp, HPar, or XioTech storage engiNerds, so take it for what it is worth.

The whole idea of FAST Cache bothers me.  EMC is using SSDs for caching.  While there are advantages to FAST Cache over a standard pool of SSDs, they do not seem to justify this design decision.  FAST Cache uses 64 KB chunks, which is more flexible than normal FAST operation on an SSD pool, which uses 1 GB chunks of data.  I can see the advantage of tiering things in smaller increments.  I cannot see any reason why EMC did not go with a PCI-based cache option.
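To put the chunk-size difference in numbers, here is a rough sketch of how much flash gets consumed when a small hot spot is promoted in whole chunks. The 256 KB hot region is a made-up figure for illustration, and the function assumes the hot region starts on a chunk boundary; it is not a model of how EMC actually places data.

```python
KB = 1024
GB = 1024 ** 3

def promoted_bytes(hot_bytes: int, chunk_bytes: int) -> int:
    """Bytes of flash consumed when a hot region is promoted in whole chunks.

    Assumption: the hot region is contiguous and starts on a chunk boundary.
    """
    chunks = (hot_bytes + chunk_bytes - 1) // chunk_bytes  # round up
    return chunks * chunk_bytes

hot_region = 256 * KB  # hypothetical hot spot, for illustration only

fast_cache = promoted_bytes(hot_region, 64 * KB)  # 64 KB granularity
fast_pool = promoted_bytes(hot_region, 1 * GB)    # 1 GB granularity

print(f"64 KB chunks: {fast_cache // KB} KB of flash used")
print(f"1 GB chunks:  {fast_pool // KB} KB of flash used")
print(f"Ratio: {fast_pool // fast_cache}x")
```

With these assumed numbers, the 1 GB granularity ties up 4096 times as much flash for the same hot data, which is the flexibility advantage of the smaller chunk.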

In my opinion, flash is so fast that wrapping it up in a traditional disk interface makes little sense if you want the best performance.  I guess at this point, I should disclose that we use a few Texas Memory Systems RamSan devices here, so I know how fast PCI-based flash can be.  Maybe this skews my opinion a bit, but the EMC engineers who were here yesterday were touting how much faster the VNX is thanks to its PCI 2.0 bus.  I agree.  So why not use it for cache?

Maybe I am missing something, and maybe in the real world, it won’t matter.  But if EMC ever bothers to perform an SPC benchmark, I suspect we would see a bottleneck caused by this method of crippling SSDs with SAS interfaces.  That said, no one is going to argue that replacing a hot-swappable SSD is not 100x easier than shutting an array down to change out a bad PCI flash card.  I am just not sure the performance penalty is worth the extra convenience, for me.

If someone who knows a lot more than me can help me understand why this decision was made, feel free to comment below.  For now, I will hold out hope that maybe there will be a VNX-p at some point with some faster cache.

In the coming weeks, I will have similar briefings with a few other vendors, as we try to narrow down our choices for a new storage platform to replace our aged HP EVAs.  If I come across anything else interesting, I will certainly pass it along to our readers here.

UPDATE: An Opportunity to Voice YOUR Feedback for the Next VCP Test!!

I am one of the SE SME content contributors to the VCP certification exam testing and blueprint process at VMware. We are about to start the analysis portion of the preparation for the next version of the exam (and no, I cannot share any timelines or version numbers since they are under NDA!), so I thought it might be a good opportunity to get some direct feedback from the VCP-certified readers of this blog to take back to my curriculum development colleagues. This is YOUR opportunity to tell me what you like, dislike, would like to see changed, areas that need more or less content coverage, etc. Please be honest, but reasonable and constructive as well. I will take the feedback from the comments back to our upcoming curriculum development meetings. Let your voice be heard!

UPDATE: Thanks for all of the feedback, folks! I have submitted the input to others on the curriculum dev team as we start to formulate the next tests. I have had one informal conversation with one of the test developers, and he agreed that we will be looking to not totally eliminate the min/max, but reduce it to -5% of the overall test question complement… So mission accomplished on your feedback! Let me know if you have other comments or suggestions! TMac

How I Made Millions of Dollars in 15 Minutes

OK good. . . the catchy title worked, and you’ve committed to reading this post. So let’s get to work on making your millions.

Lucky for us technology guys and gals, there are still plenty of C-level employees at companies everywhere who have yet to witness all the benefits of VMware first-hand. Steve Duplessie with ESG just published the following statistics:

• 58% of organizations have virtualized less than 1/3 of their servers.
• Thus far, IT-owned applications (File/Print, etc.) dominate what’s being virtualized. 59% haven’t virtualized ANY “mission-critical” applications.

So why are there still this many laggards? These execs have all read countless articles showing how much money they’ll save with virtualization, but oftentimes they haven’t seen the benefits of virtualization beyond simple consolidation. I’m going to show you how to open their eyes in just 15 minutes.

I have done this demo multiple times for audiences ranging from the CEO and CIO level all the way down to customer-facing business executives. Each time it has literally been shock and awe, and although I can’t give all the detailed numbers, I will say that the purse strings were blown open and FINALLY our project has a green light. . . millions of them in fact.

This post assumes you have at least rudimentary presentation skills, and understand how to communicate well with people at the executive level.

Fitting vSphere’s “Greatest Hits” into only 15 minutes is not an easy task. It took weeks of careful planning, and test runs before I was able to get everything timed just right. Each time I have presented this, it has gone beyond 15 minutes, but only because there are lots of great questions coming from the audience. I expect you will see the same results if your audience is somewhat intelligent.

After going through a list of features I thought might be interesting to executives, I pared it down to just three in order to make my time limit of 15 minutes. If you have more time, that’s even better, but we all know these guys are very busy.

And that brings me to my first tip: Try and schedule your demo during the winter months so that tee times won’t conflict with your meeting.

Here’s how my 15 minute demo goes down.

Most of your audience will probably already have some idea of what virtualization is at this point, but it’s still a good idea to make sure you cover a few quick points on the basics. I do this while showing them the vCenter console. I show them the virtual machine view, and tell them about typical x86 workload resource usage, and how virtualization allows us to maximize our hardware investments, etc.

Important: Make sure you clarify for them what a “host” is, and whatever term you use for a guest, whether it’s “server”, “VM”, or “guest”. You need to emphasize this point up front, and multiple times during your demo so people stay on track and understand what exactly they are seeing.

The first feature I show is a simple cloning operation. By now in your career, this may be old hat, but believe me your executive audience will be impressed by this.

Always make sure you give the benefits when showing any feature. I start my cloning operation while telling them what I’m doing, and then while it runs you’ll have time to explain the benefits. Try and tailor the benefits around how they will impact the customer. Whether the customer is an internal one, or an external one, your audience will appreciate this point of view.

Tip: Use something like BgInfo on your VMs to show the server name so that your audience can follow more easily. If you don’t use BgInfo, at least change the wallpaper to show the name.

With server cloning, I touch on how valuable it is to be able to clone a server that is having issues so that Development or QC will be able to reliably duplicate a bug or issue. Explain how tough this is in a traditional environment where you have to try and duplicate an issue on a server that is not 100% identical. Also explain how a server can be cloned to test patching or an application update without impacting the production environment, or the customer.

This section should run 3-4 minutes, depending on your SAN speed. If your SAN is slower, do it on local storage to get it done faster. When it’s done, make sure you change the VLAN so you don’t get a conflict, and boot it up. You can quickly log in to show them the server is identical to the one you cloned. You don’t want to spend more than 5 minutes on this one if your allotted time is only 15 minutes.

This is a perfect time to share the next tip: TEST ALL THESE STEPS several times. Time yourself each time, and even do a few dry runs with your team if possible. This will ensure your demo comes off flawlessly, and that’s important. Believe me when I say some of these people are looking for a reason NOT to virtualize. Most of the time, it has less to do with virtualization, and more to do with fear of change.

My next demonstration is vMotion / Maintenance Mode. This will be mind-blowing for your audience, especially if any of them have a technical background. I start off by telling a story about how some fans have stopped working on one of our hosts, and we need to get the new fans installed before it overheats. We can’t wait for the maintenance window. (Make sure you know ahead of time which host has your intended vMotion candidate.)

Normally in this situation, a second cluster node, or hot spare would have to be brought on line, which would mean a short outage for the customer. In this demo, we don’t have time to enter Maintenance Mode, so we’re going to vMotion a single server. I explain that this is how Maintenance Mode works, and how this will be transparent to our customers, and then I prove it.

Bring up a console session on the server to be vMotioned. I use an IIS server as an example, as it’s customer facing, and they understand that well. In my console session, I start a ping -t to another server. In this case, it’s an application server, which the IIS server needs to maintain contact with, or customers will be impacted. Then I execute my vMotion. You might need to explain what “ping” is, so that everyone is on board.
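During rehearsals, it can help to save the ping -t output and count the drops rather than eyeball them. Here is a minimal sketch that assumes the standard English Windows ping output ("Reply from ..." and "Request timed out."); the 192.0.2.10 address and the sample lines are placeholders, not output from my environment.

```python
def summarize_ping(lines):
    """Count successful replies and lost pings in Windows `ping -t` output.

    Assumption: English-language Windows ping output format.
    """
    replies = 0
    lost = 0
    for raw in lines:
        line = raw.strip()
        # A real echo reply includes "bytes="; "Reply from ...: Destination
        # host unreachable." does not, so it is counted as lost below.
        if line.startswith("Reply from") and "bytes=" in line:
            replies += 1
        elif line == "Request timed out." or "unreachable" in line.lower():
            lost += 1
    total = replies + lost
    loss_pct = 100.0 * lost / total if total else 0.0
    return replies, lost, loss_pct

# Placeholder sample; in practice, read the lines from your saved output.
sample = """\
Pinging 192.0.2.10 with 32 bytes of data:
Reply from 192.0.2.10: bytes=32 time<1ms TTL=128
Reply from 192.0.2.10: bytes=32 time<1ms TTL=128
Request timed out.
Reply from 192.0.2.10: bytes=32 time=2ms TTL=128
Reply from 192.0.2.10: bytes=32 time<1ms TTL=128
""".splitlines()

replies, lost, loss_pct = summarize_ping(sample)
print(f"{replies} replies, {lost} lost ({loss_pct:.0f}% loss)")
```

If a dry run shows anything other than zero lost, you want to find out why before the executives are in the room.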

After the server vMotions, I show them that we didn’t drop any packets, and that the customer has not been impacted, and then show them that the VM is on another host. I always reduce my DRS automation level before the demo. I don’t want them to see other migrations happening while we’re demoing. That would spawn a discussion we don’t want to have right now. This takes us to the 10 minute point, barring any questions.

Inevitably at this point, someone asks what would happen if the server just failed with no warning. This plays right into your hands, as it’s the perfect segue into your next feature: Fault Tolerance. If they don’t ask, then you ask.

For FT, I set up an FTP server using FileZilla. You can set up whatever works best for your business, but make sure it is something that can clearly demonstrate that customers will not be impacted by an outage. I have preselected a “customer data file”, and set up a simple “FTP Client” VM with FileZilla Client.

I did have to adjust my incoming FTP speed for the server so that the file wouldn’t complete too quickly. You’ll want to make sure you have enough time to test a failover operation, and show the file transfer still going from both the client, and server perspective. So either select a huge file, or bump down your bit rate for the client in the FTP software.
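The arithmetic for picking a speed limit is simple enough to script. This is only a sketch: the 600 MiB file size and 5-minute target are hypothetical numbers, and it assumes your FTP software takes its speed limit in KiB/s, so check the units in whatever you use.

```python
def rate_limit_kib_per_s(file_size_mib: float, target_seconds: float) -> float:
    """Speed limit (KiB/s) that makes a file of the given size take target_seconds."""
    if target_seconds <= 0:
        raise ValueError("target_seconds must be positive")
    return file_size_mib * 1024 / target_seconds

def transfer_seconds(file_size_mib: float, limit_kib_per_s: float) -> float:
    """How long the transfer runs at a given speed limit (KiB/s)."""
    if limit_kib_per_s <= 0:
        raise ValueError("limit_kib_per_s must be positive")
    return file_size_mib * 1024 / limit_kib_per_s

# Hypothetical example: stretch a 600 MiB "customer data file" over 5 minutes.
limit = rate_limit_kib_per_s(600, 5 * 60)
print(f"Set the speed limit to about {limit:.0f} KiB/s")
print(f"Transfer time at that limit: {transfer_seconds(600, limit) / 60:.1f} minutes")
```

Leave yourself some slack on top of the failover itself, since you also need time to narrate what is happening while the transfer runs.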

Open up a console session on the FTP server, and then point out the “secondary” instance. Open a console session to the secondary. With both sessions side by side, poke around in the primary and open some windows, a browser, or whatever. You’ll want to demonstrate that the servers are in lockstep with one another. Open a console session to the FTP Client.

At this point, I explain how the customer is sending us this file, and start the transfer. I then explain that the particular host that the FTP server lives on is going to go down without warning. Show them the host name. You can simulate this with the Test Failover option.

It will take less than a minute, during which you will want to point out the bytes transferred on the server, and the client. Point out that the client has no errors, and then you’ll see the secondary come back online.

Again, show the host name so they can see that it has indeed changed servers. I found it helpful to time the file transfer so that it would complete right around this time. Then you can show them the server, the completed file, and the client, once again explaining how the customer has no idea that a server went down in our datacenter.

If you don’t get gasps or applause at this point, you did it wrong.  Once again, PRACTICE this over, and over before taking it to the executives! You don’t want to be up there looking like Bill Gates demoing Win98.  You want to look like John Chambers demoing the Cius.
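When you time your dry runs, a few lines of script will tell you which segment is eating your 15 minutes. The budget below is my reading of the timings in this post: no more than 5 minutes on cloning, vMotion done by the 10-minute mark, which leaves roughly 5 minutes for FT (that last figure is inferred, not measured). The rehearsal timings are made up.

```python
# Per-segment budget in minutes. The FT figure is inferred from the
# 15-minute total; adjust all three to suit your own demo.
BUDGET_MINUTES = {"cloning": 5.0, "vmotion": 5.0, "fault_tolerance": 5.0}

def check_rehearsal(timings):
    """Return (total minutes, {segment: minutes over budget}) for one dry run."""
    overruns = {}
    for name, limit in BUDGET_MINUTES.items():
        spent = timings.get(name, 0.0)
        if spent > limit:
            overruns[name] = spent - limit
    total = sum(timings.get(name, 0.0) for name in BUDGET_MINUTES)
    return total, overruns

# Hypothetical dry-run timings, in minutes.
dry_run = {"cloning": 4.0, "vmotion": 5.5, "fault_tolerance": 5.0}

total, overruns = check_rehearsal(dry_run)
print(f"Total: {total} minutes")
for name, over in overruns.items():
    print(f"{name} ran {over} minutes over budget")
```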

Wrap up with an explanation that these are just three of hundreds of VMware features, and then answer the questions that follow. At the end, these people will be frantically searching for their checkbooks. Your millions should start flowing after the next budget committee meeting.