Back in June I was asked to build and captain the vFabric SQLFire lab at VMworld. Now, I’ve been a big champion of the vFabric technologies since the beginning, so I was certainly excited to be given the opportunity. But I don’t think I fully realized just how fortunate I was at the time. Because not only did the experience force me to go deep into the technology, but it also forced me to focus on a layer where I previously had little direct involvement in my career. Let me explain.
We often like to think of the cloud in three distinct layers: Infrastructure (IaaS), Platform (PaaS) and Software (SaaS). But in my opinion, these categories may be a bit too broad because there are definitely layers between the layers. For example, where does the data live? Infrastructure guys often mentally push the data up into the Platform layer, while application guys often mentally push the data down towards the Infrastructure layer. Obviously, regardless of which side of the infrastructure-platform fence you live on, we all “touch” the data all the time. And of course we all recognize how vitally important the data is. But unless you’re a DBA, the data usually ends up being a problem for someone on the other side of that fence.
So now I mentally place data in its own layer in between the Infrastructure and the Platform, where it in many ways (at least in my mind) serves as the “glue” which binds the two. You might be asking yourself, “What’s his point?” or even “Who the heck cares?” Well, I want to make the distinction for a couple reasons.
First, I believe the way we categorize and compartmentalize things in our mind has a dramatic effect on our focus and behavior. Mentally placing important concepts in the wrong compartment usually leads to confusion and misunderstanding, and we can miss tremendous opportunities. However, the correct mental placement will bring clarity and focus, and potentially open up a whole new world to us. So now for me, instead of just thinking of data as something that lived in a file somewhere or in a database that some DBA was responsible for, a whole new world has been unlocked. I’ll come back to this point in a bit.
Second, lots of Infrastructure folks are a bit concerned about their future due to the high degree of automation and integration happening in the Infrastructure layer right now. We’re starting to hear whispers of things like “infrastructure guys need to move up the stack or they’ll be left behind.” Scott Lowe’s recent post, “The End of the Infrastructure Engineer?”, not only articulates the concern well but also suggests the concern may be unwarranted. I’m not sure I fully agree with Scott, but I’m simply trying to highlight that the concern is out there and it’s growing. Shoot, I know I’ve certainly implied numerous times here on this blog that we all need to start moving towards the application/development space (here, here and here). But would I actually go so far as to say that everyone needs to stop what they’re doing and go buy the latest copy of Programming for Dummies? Probably not.
What I do believe, however, is this new data layer (not that the data layer is actually new, of course) may be a way for infrastructure engineers to stay relevant as the world moves towards application-centric clouds. It may be a way for us to “move up the stack” by taking a few steps, rather than a career-changing leap of faith. After all, building skills in this layer isn’t a shock to the system because, like I said in the beginning, we’ve all indirectly worked with the data layer our entire careers. Whereas application development is a completely different world for an infrastructure engineer (and vice versa), data lives much closer to home. Infrastructure and data are like “kissing cousins” … kind of awkward, but not completely taboo either.
So, if you couldn’t tell by now, I’ve been thinking a lot about data. Databases, data grids, data fabrics, data warehouses, data in the cloud, moving data, securing data, big data, data data data data data data. Most of the hard problems (not all, of course) with cloud computing are with data. It’s always the data that seems to trip us up …
User: “Why can’t I VMotion my server from here to China?”
Admin: “Well, aside from the fact that we don’t have a data center in China, your application has a terabyte of data and it would take a month to get there.”
User: “Why can’t I use Dropbox anymore?”
Admin: “Because you put sensitive company data on it, which was compromised and now we’re being sued. By the way, your boss is looking for you.”
User: “The performance of my application you put on our hybrid cloud is pitiful. You suck.”
Admin: “You told us there would be no more than 100 users, and now there are 50k users trying to access the same database at the same time. You’re an idiot.”
Granted, in all of these situations, it’s not just the data that’s the issue. Well, actually, come to think of it, it’s not really the data at all, now is it? It’s all the things we must do in order to deliver/secure/migrate/manage/scale the data in the cloud that becomes the issue. So data is often the root of our problems, but never really the problem itself. Instead it’s data handling in the cloud that’s the big challenge. Yes, that’s it! And it would appear someone really smart from JPMorgan Chase would agree with me …
Whew! Validation gives me warm fuzzies. Anyway, circling back to a point I made earlier, since I’ve been focused on the data layer, a whole big crazy world has opened up for me. Much like what vSphere did for servers, there is a ton of activity happening at the data layer to transform the way we handle data at scale in the cloud. And again, what’s so cool about this layer is that it is all too familiar. When studying up on how data grids can make data globally available via their own internal WAN replication techniques, or when learning about how a new breed of load balancers is enabling linear scalability for databases, or when exploring how in-memory databases can dramatically improve application performance … the concepts/language/lingo are easily understood and relatable to things I already know.
Now in the midst of all this learnin’ it occurred to me: everyone has been talking about the Platform as the next big thing (myself included) … but I would think the data problems need to be solved first, don’t they? Sure, I know things won’t happen serially here; lots of smart people and cool new companies are working to solve cloudy problems at both layers in parallel. But we all know that where there are problems, there are opportunities. And it would appear to me that the more immediate problems that need to be solved are with data handling. So could it be that the next big thing is actually in the layer between the layers? And could this really be the place where developers and engineers finally meet and give each other a great big awkward hug?
Which brings me back to the very beginning of this blog post (the next big thing, not the awkward hug). After digging pretty deep into SQLFire, I’ve found it’s a radically new kind of database that addresses many of the issues with data handling in the cloud. It’s a database built differently from the ground up because it is built on amazing, disruptive data grid technology, yet presents itself to an application as a regular old database. It can unobtrusively slide in between applications and their existing databases to solve performance problems, or it can stand on its own as a complete database solution. It can instantly scale linearly, it can make your data extremely fault tolerant, and it can make your data available globally, all with very little effort and/or overhead. Pretty amazing stuff. You should check it out and let me know what you think. And even if you don’t take a look at SQLFire, what do you think about the “layer between the layers?” The next big thing?