While I was cruising the show floor at VMworld San Francisco, the Xiotech booth seemed to be abuzz every single time I walked by.  Finally, I stopped in to see what all the commotion was about.  It was not about Xiotech at all.  It was about iPads.  As soon as I stepped close to the booth, there was a rush to scan my badge so I could win an iPad, but no clear picture of what the heck Xiotech actually did.  I moved on to the next booth, and I suspect most of the 17,000 attendees did as well.

The first time I heard about Xiotech was a year or so ago, and the only thing I understood was that they offered storage in a sealed box that was supposed to be more reliable than your average monolithic array.  At the time, it sounded far-fetched to me.  I didn’t see how they could make those claims while using the same disks we all use, just in some special locked box.

It wasn’t until I heard Xiotech’s CEO Alan Atkinson on an Infosmack podcast that I decided to investigate further.  Alan didn’t really go into details, but after hearing him talk about some of the heavy hitters on his team, I was intrigued.  Their own website doesn’t really spell out in detail what they are doing that makes them stand out.  I had to make a call and talk to their engineers.  What I found was so amazing, I have to share it here.

Since Seagate is one of Xiotech’s major shareholders, these guys have unfettered access to the inner workings of the disk drive, from the firmware on down.  Inside a Xiotech “magic box”, the disks are the exact same model as one would find spinning inside an array from another manufacturer.  The firmware is where the magic actually happens.

Apparently disk drives have all kinds of cool things they can do besides reporting your typical “OMG I’m going to fail soon!” messages.  Seagate reports that nearly 80% of the “failed” disks they receive are actually fine.  They cannot find any issues whatsoever.  What does Seagate do with those disks?  They remanufacture them and toss them back into the “refurbished” bin.

Although the disk has the capability to “remanufacture” itself in the firmware, it cannot be done inside your traditional array.  The vibration is simply too high to do this reliably.  A typical shelf full of rotating disks vibrates at over 40 rads (units of rotational vibration).  When a dozen or more disks are all rotating at the same speed, in the same direction, it’s not hard to imagine the vibration in a tray of disks.  Xiotech actually mounts its disks so that they are counter-rotating.  One disk rotates clockwise, and the disk beside it is mounted so that the rotation is in the other direction.  This reduces vibration inside the Xiotech box to 2 rads.  This means they can reliably remanufacture a disk inside the box while the array is in operation.

Cool huh?

Here’s a breakdown of what a Xiotech array does when it detects a potential disk error:

  1. Data is migrated from the suspect disk to another disk elsewhere in the array
  2. Disk is power cycled
  3. Complete factory remanufacturing process
  4. Recalibrate heads
  5. Rewrite servo tracks
  6. Perform a low-level format
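
For the code-minded, the sequence above can be sketched as a simple ordered workflow.  This is only my illustration of the order of operations, not Xiotech's firmware: the step names come from the list, while `SuspectDisk`, `recover`, and the disk name are made up for the example.

```python
from dataclasses import dataclass, field

# Step names mirror the list above; everything else here is hypothetical.
RECOVERY_STEPS = [
    "migrate data off the suspect disk",
    "power cycle the disk",
    "run the factory remanufacturing process",
    "recalibrate heads",
    "rewrite servo tracks",
    "perform a low-level format",
]


@dataclass
class SuspectDisk:
    """Stand-in for a disk that has reported a potential error."""
    name: str
    log: list = field(default_factory=list)


def recover(disk: SuspectDisk) -> list:
    """Walk the recovery steps in order and record each one in the disk's log."""
    for number, step in enumerate(RECOVERY_STEPS, start=1):
        disk.log.append(f"{number}. {step}")
    return disk.log


if __name__ == "__main__":
    for entry in recover(SuspectDisk("disk-07")):
        print(entry)
```

The point of the ordering is that the data leaves the disk first, so everything after step 1 can be as destructive as it needs to be.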

It gets better.  The unit of storage inside the Xiotech is not actually the disk.  It’s the head.  This means that each individual platter is a unit of storage, and the data is striped as such.  Why is that a big deal?  For one, as disk drives get larger and larger, they take longer and longer to rebuild.  This increases your exposure to another failure.  Second, if the most common catastrophic disk failure is a head crash, wouldn’t it be nice just to be able to disable that particular head and move on?  When all the above steps are complete, if a head is still not responding, it will be disabled, and that platter will not be usable.  The rest of the disk is good to go.
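
To make the "head as the unit of storage" idea concrete, here is a minimal sketch of a drive modeled as a set of heads, where losing one head costs only that head's share of the capacity.  The `Drive` class, the head count, and the per-head capacity are all assumptions for illustration; I don't know the real geometry of the drives inside an ISE.

```python
from dataclasses import dataclass, field


@dataclass
class Drive:
    """Toy model: capacity is tracked per head, not per drive."""
    heads: int            # number of read/write heads (hypothetical value)
    gb_per_head: float    # capacity served by each head (hypothetical value)
    disabled: set = field(default_factory=set)

    def disable_head(self, index: int) -> None:
        """Take a single unresponsive head out of service."""
        if not 0 <= index < self.heads:
            raise ValueError(f"no such head: {index}")
        self.disabled.add(index)

    def usable_gb(self) -> float:
        """Capacity still available from the heads that remain in service."""
        return (self.heads - len(self.disabled)) * self.gb_per_head


if __name__ == "__main__":
    drive = Drive(heads=8, gb_per_head=125.0)
    drive.disable_head(3)
    print(f"usable after one head is disabled: {drive.usable_gb()} GB")
```

In this toy model, only the data behind the disabled head has to be rebuilt elsewhere, which is a much smaller job than rebuilding a whole drive.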

The net result of all this firmware magic is that the Xiotech array comes with a 5-year warranty at no cost.  Why not include it when you’re seeing 99.9983% uptime in the field?  To prove the point, Xiotech took 200 disks that were marked “failed” and returned to Seagate, and stuffed them into a rack of Xiotech arrays.  They ran for TWO YEARS without a service event.
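
If you want to put that uptime number in perspective, the arithmetic is quick.  This is just my back-of-the-envelope conversion of the quoted percentage, assuming it is measured against a plain 365-day year; the function name is mine.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(uptime_percent: float) -> float:
    """Convert an uptime percentage into expected minutes of downtime per year."""
    return (100.0 - uptime_percent) / 100.0 * MINUTES_PER_YEAR


if __name__ == "__main__":
    minutes = downtime_minutes_per_year(99.9983)
    print(f"99.9983% uptime is about {minutes:.1f} minutes of downtime per year")
```

That works out to roughly nine minutes a year.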

So we’ve established reliability.  What about performance?  Disk drives aren’t really getting any faster.  How can we squeeze more performance out of spinning disks?  Xiotech has answered that with yet another firmware tweak.  Oftentimes, the disk subsystem has to wait for data to be read because the head is busy reading another part of the platter.  Even under optimal conditions, disks have a few ms of latency built in for the heads to move.

One of the storage geniuses over there actually came up with a plan to have the heads constantly move back and forth across the platter.  When I heard this, I wanted to fly out to Minnesota and buy this guy a beer.  If the heads never actually park, then statistically speaking, the head will always be closer to the data it needs.  What does this mean for you and me?  It means that a single 3U Xiotech ISE performs at 12,600 IOPS on the SPC-1 benchmark.
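
The "statistically closer" part can be sanity-checked with a toy simulation.  To be clear, this is my own simplified model and not how the firmware works: I normalize the head's travel to the range 0 to 1, assume requests land uniformly across it, assume a parked head always sits at one edge, and ignore rotational latency and head acceleration entirely.  All names here are made up.

```python
import random


def parked(rng: random.Random) -> float:
    """A parked head always starts from the same edge of its travel."""
    return 0.0


def sweeping(rng: random.Random) -> float:
    """A constantly sweeping head is caught at a uniformly random position."""
    return rng.random()


def mean_seek_distance(start_position, samples: int = 200_000, seed: int = 1) -> float:
    """Average distance from the head's start position to a uniformly random target."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        start = start_position(rng)
        target = rng.random()
        total += abs(start - target)
    return total / samples


if __name__ == "__main__":
    print(f"parked head:   {mean_seek_distance(parked):.3f} of full travel")
    print(f"sweeping head: {mean_seek_distance(sweeping):.3f} of full travel")
```

Under those assumptions the parked head averages about half of the full travel per seek, while the sweeping head averages about a third, which matches the textbook result for the distance between two uniformly random points.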

Since the cache and controllers are on board each ISE, these IOPS scale linearly, as opposed to a monolithic array, which could run out of gas if its controllers get saturated.  So with 5 ISEs at 15U, one can expect to hit 63,000 IOPS.  This is clearly not your father’s storage array.
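
The scaling math is simple enough to write down.  This sketch just multiplies out the per-ISE figures quoted above and assumes perfectly linear scaling, which is the claim rather than something I measured; the function and constant names are mine.

```python
IOPS_PER_ISE = 12_600      # SPC-1 figure quoted for a single ISE
RACK_UNITS_PER_ISE = 3     # each ISE occupies 3U


def aggregate(ise_count: int) -> tuple:
    """Return (total IOPS, total rack units), assuming perfectly linear scaling."""
    return ise_count * IOPS_PER_ISE, ise_count * RACK_UNITS_PER_ISE


if __name__ == "__main__":
    for count in (1, 5):
        iops, rack_units = aggregate(count)
        print(f"{count} ISE(s): {iops:,} IOPS in {rack_units}U")
```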

Xiotech has their own storage virtualization appliance with the ISE 9000 for larger enterprises, which works quite well with VMware, and is ICON capable for ease of integration with just about anything.  This would also be absolutely amazing behind some virtualization from FalconStor, NetApp, Nexenta, or really any of the other storage virtualization products out there.

With all these patents and truly groundbreaking technology, one is left wondering why Xiotech has yet to secure a huge OEM deal.  Chris Mellor did a nice write-up on this topic, and his argument is that manufacturers are worried about sourcing disks from only one manufacturer.  If Seagate were to have a bad batch of disks go out, it could be crippling.

I can understand his point of view, but I was born a skeptic.  Considering that storage array manufacturers generate huge chunks of highly profitable revenue from servicing arrays, I doubt we’ll be seeing a commercial with an EMC, NetApp, or HP guy out fishing with the Maytag repairman anytime soon.  Regardless of whether they land a huge OEM deal or not, Xiotech has proven that spinning disks are far from the end of their useful life.


  • http://twitter.com/StorageTexan StorageTexan

    Brandon – great write up!!! I think you really captured some of the solid points of the #Xiotech ISE architecture. The guys that developed this “storage brick” spent 5 years re-defining what you can do with disk drives and storage controllers and it shows in this 80+ patent solution!!!

    Innovation RULEZ !!!!!

    @StorageTexan

    • http://twitter.com/BrandonJRiley B. Riley

      Thanks Tex. I wanted to get this out there b/c I think only the hardest of hardcore storage freaks know about the technology here. What I found most surprising after meeting with the engineer for a couple hours was that NONE of this is on the website. There is emphasis on reliability and performance, but no “why”.

      Then again, the entire storage industry is saturated with marketing and not enough real information, so Xiotech is not alone.

  • http://twitter.com/barrymmartin Barry Martin

    Brandon, you have hit the points head on and that is why we lead with Xiotech in our storage practice for our customers. The product just works and works well.

    @barrymmartin

  • Anonymous

    This is very informative and exciting news. If there is one thing that keeps me up at night it is the potential for hard drive failure on my servers. Even with backups and hot swappable drives, a drive failure can devastate an array under the right circumstances.


  • Anonymous

    I have a customer using this solution for a very busy OLTP SQL instance and have been impressed with the performance. Great write up.