Software Defined Storage Use Case – Block SAN Storage for Traditional Workloads

In my last post, IBM Spectrum Storage Suite – Revolutionizing How IT Managers License Software Defined Storage, I introduced a simple and predictable licensing model for most all the storage needs an IT manager might have. That’s a pretty big concept if you think about all the storage use cases an IT manager has to deal with.

  • Block SAN storage for traditional workloads
  • File storage for analytic or Big Data workloads
  • Object storage for cloud and mobile workloads
  • Scale-out block storage for VMware datastores
  • Storage for housing archive or backup copies

Just to name a few… The idea behind software defined storage is that an IT manager optimizes storage hardware capacity purchases for performance, environmentals (like power and space consumption), and cost. Then he ‘software defines’ that capacity into something useful – something that meets the needs of whatever particular use case he is trying to deal with. But is that really possible under a single, predictable software defined storage license? The best way I can think of to answer the question is to look at several of the most common use cases we see with our clients.

Perhaps the most widely deployed enterprise use case today is block SAN storage for the traditional workloads that all our businesses are built on – databases, email systems, ERP, customer relationship management and the like. Most IT managers know exactly what kind of storage capabilities they need to deploy for this use case. It’s stuff like snapshots, mirroring, thin provisioning, compression and automated tiering.

Here’s the thing… This use case has been evolving for years and most IT managers have it deployed. The problem isn’t that the capabilities don’t exist. The problem is that the capabilities are most often tied directly to a specific piece of hardware. If you like a particular capability, the only way to get it is to buy that vendor’s hardware. It’s a hardware-defined model, and you’re locked in. With IBM Spectrum Storage, IBM has securely unboxed those capabilities from the hardware. The idea of software defined changes everything. With the software securely unboxed from the hardware, you really are free to choose whatever hardware you want, from most any vendor and at whatever tier you like. And since the software can stay the same even while the hardware is changing, you don’t experience any operational or procedural tax when you make those changes.

All of the capabilities mentioned above for addressing this Block SAN storage for traditional workloads use case can be accomplished with one IBM Spectrum Storage Suite software license. This may be the most widely deployed use case today, but it’s not the fastest growing use case. In my next posts, I’ll continue looking at the wide variety of use cases that are covered by the simple, predictable IBM Spectrum Storage Suite software defined storage license.

Are you interested in taking the first step with software defined storage? Contact your IBM Business Partner or sales representative. And join the conversation with #IBMStorage and #softwaredefined.

Edge 2013 Day 4 – Poke in the eye

I’ve spent a good bit of time this week talking to clients, business partners, managed service providers and IBMers about their perspective on IBM Edge. One of the strengths they point out is the diversity of programming. My experience at the conference has included main tent sessions and sessions at Executive Edge, Technical Edge, the MSP Summit and, today, Winning Edge. Winning Edge is a sales training boot camp exclusively for IBM Specialty Business Partners. It’s advertised mostly by word-of-mouth and on IBM PartnerWorld. Unlike other areas of Edge, the Winning Edge sessions and ensuing hallway conversations are focused on, well, winning competitive engagements. As a result, there is a fair amount of talk about the strength of IBM offerings and the weakness of competitive offerings. In other words – “there’s your competitor, go poke ’em in the eye!”

Now, before I go on, here’s my disclaimer. Although I am employed by IBM, my perspectives are my own and do not necessarily represent the views, positions, strategies or opinions of IBM or IBM management.  Enough said?

Butterfly AERs

Back in September last year, IBM acquired Butterfly Software. This little company brought a tectonic shift in the way I and our customers think about the value of storage software. As has been repeated over and over this week at Edge, it’s all about economics. Butterfly has developed what they call an Analysis Engine Report (AER) that follows a straightforward thought process (a rough sketch of the projection math follows the list).

  1. Using a very lightweight collector, gather real data about the existing storage infrastructure at a potential customer.
  2. Using that data, explain in good detail what the as-is effectiveness of the environment is and what costs will look like in five years’ time if the customer continues on the current approach.
  3. Show what a transformed storage infrastructure would look like compared to the as-is approach, and more importantly what future costs could look like compared to continuing as-is.
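To make the economics concrete, here is a minimal sketch of the kind of five-year projection this thought process produces. The growth rate, cost-per-TB figures and efficiency factor below are invented placeholders for illustration, not Butterfly’s actual model or data.

```python
# Hypothetical five-year cost projection in the spirit of a Butterfly AER.
# Every figure below (growth rate, $/TB, efficiency factor) is an illustrative
# placeholder, not an actual Butterfly model parameter.

def project_costs(start_tb, annual_growth, cost_per_tb, years=5):
    """Return the cumulative storage spend over `years` of capacity growth."""
    total_cost = 0.0
    capacity = start_tb
    for _ in range(years):
        capacity *= 1 + annual_growth          # the data keeps growing
        total_cost += capacity * cost_per_tb   # pay for the capacity deployed
    return total_cost

as_is = project_costs(start_tb=500, annual_growth=0.35, cost_per_tb=1200)
# Assume a software-defined transformation (thin provisioning, compression,
# tiering) lowers the effective cost per TB by some efficiency factor.
transformed = project_costs(start_tb=500, annual_growth=0.35, cost_per_tb=1200 * 0.6)

print(f"As-is 5-year spend:       ${as_is:,.0f}")
print(f"Transformed 5-year spend: ${transformed:,.0f}")
print(f"Projected savings:        {100 * (1 - transformed / as_is):.0f}%")
```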

Butterfly has two flavors of AERs, one for primary storage infrastructure and one for copy data (or backup) infrastructure. They have analyzed some 850 different infrastructures scattered across every industry in most parts of the world and comprising over 2 exabytes of data. In all that analysis, they have discovered some remarkable things about IBM’s ability to transform the economic future of storage for its clients. (Editorial comment: the results probably have something to do with why IBM acquired the company.)

  • When compared to as-is physical storage environments, transforming to a software-defined storage environment with IBM SmartCloud Virtual Storage Center (built on a Storwize family software-defined layer) is, on average, 63% more economically efficient. That’s the average; your results may vary. As an example, in my post on Tuesday I talked about LPL Financial, who followed the recommendations of a Butterfly Storage AER and saved an astounding 47% in total infrastructure costs.
  • When compared to as-is competitive backup environments, transforming to an IBM Tivoli Storage Manager (TSM) approach is, on average, 38% more efficient. Again, your results may vary. For example, when you look just at the mass of Backup AER results from as-is Symantec NetBackup environments, transforming to IBM TSM was 45% more efficient. For those who had CommVault Simpana, the Backup AER results showed TSM to be 54% more economically efficient. EMC NetWorker? The transformed TSM approach was 45% less expensive. There’s data by industry and for many other competitive backup approaches, but you get the picture. Choosing a TSM approach saves money.

IBM software-defined storage vs EMC VPLEX

First I have to say that I wish EMC would focus. Trying to figure out which horse they’re riding for software-defined storage and virtualization makes it hard for sellers to know what they’ll be competing against. Maybe that’s the point. In my post on EMC ViPR: Breathtaking but not Breakthrough, I talked about how EMC is openly asking the question “Is ViPR a modern interpretation of what we now mean when we say ‘storage virtualization’?” It’s certainly EMC’s modern interpretation, having tried before to virtualize physical storage with Invista (circa 2005) and VPLEX (circa 2010). This week at Edge I heard an industry analyst refer to “ViPR-ware” (say it fast, you’ll get the pun), so for the moment, there’s still competitive talk about VPLEX.

In the hallway, I bumped into an IBM Business Partner whose firm has already helped 80 different clients implement active-active datacenters with the Storwize family software-defined storage layer in a stretched-cluster configuration using the IBM SAN Volume Controller engine. Eighty clients, one Business Partner. I wonder if that’s more than the sum total of EMC VPLEX deployments in the world? Anyway, back to the larger conversation. At Winning Edge there were a few observations made about EMC’s approach to VPLEX that are worth noting.

  1. VPLEX is missing most every important storage service IT managers need. Snapshot, mirroring, thin provisioning, compression, automated tiering, the list goes on. As a result, VPLEX depends on those capabilities being delivered somewhere else, either in an underlying physical disk array or in bolt-on software or appliances like EMC RecoverPoint.  Because of the dependence, mobility of the VPLEX virtual volumes is limited. Think about it, if a workload stores its data on a VPLEX virtual volume and that data is expecting services like automated tiering or thin provisioning to keep storage costs down, then the virtual volume is limited in its mobility because it must be stored on some physical array that provides those services. It’s sort of like if you were using VMware for a virtual machine, but you were told you couldn’t use vMotion from a physical Dell server over to an HP server because there was some capability on the Dell server that your workload was depending on and the HP server couldn’t deliver it. That scenario simply doesn’t happen (and IT managers wouldn’t tolerate it if it did) because VMware provides the required services in the hypervisor and doesn’t depend on anything from the underlying physical servers. VPLEX hasn’t gotten there yet.
  2. Workload migration with VPLEX is a manual pain. Let’s think about what happens in a VPLEX environment when you need to move a workload off a certain piece of physical hardware. Why would you be moving? Everyday reasons like resolving a performance issue, getting off an array that is failing, or simply unloading a device that is being replaced. Assuming the IT manager has convinced himself that issue #1 above is okay (that tying the capability of a virtual volume to some piece of physical hardware is okay) and that he has a replacement array handy that happens to exactly match the set of capabilities available on the array being vacated, then what’s the process for getting the workload moved? Well, for each and every virtual volume you want to move (a sketch of the loop follows this list):
  • A new physical LUN has to be created on the target array that looks exactly like the source LUN on the array being replaced.
  • The virtual volume to be moved has to be mirrored between the two arrays.
  • Once in sync, the mirror has to be broken and the old physical volume taken offline.
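Here is a runnable sketch of that per-volume drudgery. The printed steps are stand-ins for the manual administrator actions described above; none of the function or array names are real VPLEX or disk array commands.

```python
# A runnable sketch of the repetitive, per-volume procedure described above.
# The printed steps stand in for manual administrator actions; they are not
# real VPLEX or disk array commands.

volumes_to_move = [f"vvol_{i:03d}" for i in range(1, 201)]  # say the old array holds 200 LUNs

def migrate_volume(volume: str, source: str, target: str) -> None:
    print(f"[{volume}] create a matching LUN on {target} (same size, same settings as the source)")
    print(f"[{volume}] mirror the virtual volume between {source} and {target}")
    print(f"[{volume}] wait for the mirror to reach full sync ...")
    print(f"[{volume}] break the mirror and take the old LUN on {source} offline")

for vol in volumes_to_move:
    migrate_volume(vol, source="old_array", target="new_array")

print(f"Four manual steps, repeated {len(volumes_to_move)} times, with no room for error.")
```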

Today’s large arrays can have LOTS of LUNs on them, meaning the above procedure would have to be executed LOTS of times. How often do new arrays come and go in your datacenter? How often do you experience a performance incident? How often do repetitive procedures like this work flawlessly over and over and over? This is not something that most customers I talk with would attempt. VPLEX hasn’t yet matured to the point where analytics, not people with procedures, drive virtual volume movement to avoid array performance issues, and where unloading a physical array is a simple command.

Who is your favorite competitor to poke at? Get your stick out and leave a comment!

EMC ViPR: Breathtaking, but not Breakthrough

Last Monday, EMC announced ViPR as its new Software-defined Storage platform. Almost simultaneously, Chuck Hollis described it as ‘Breathtaking’ in his usually excellent blog. I must admit, one thing I routinely find breathtaking about EMC is their approach to marketing. They have a knack for being able to take unexceptional technology (or, as in this case, combinations of technology and theories about the future), and spin an extraordinarily compelling story. With all seriousness and without tongue in cheek… Nicely done EMC!

Chuck’s blog described ViPR in three parts. To a heritage EMC customer, these three concepts may seem revolutionary because, to-date, EMC hasn’t successfully offered this sort of technology. However, for clients of IBM, Hitachi, or other smaller vendors, the environment EMC hopes to create with ViPR will seem familiar because, in large part, it’s been evolving for years. Let’s look at the three parts one at a time.


The first ViPR idea Chuck describes is to help create “a better control plane for existing storage arrays: EMC and others”. To be clear, EMC is just getting started with ViPR, so initially the ‘others’ include only NetApp, but you can expect the list to expand if ViPR matures. Chuck is describing a software virtualization layer that discovers existing physical storage arrays and allows administrators to construct virtual storage arrays as abstractions across the multiple units. The ‘better control plane’ comes when the virtual array capabilities are surfaced via a storage service catalog that describes things like snaps, replication, remote sites, etc. Administrators are then able to make requests for these services, in turn driving an orchestrated set of provisioning steps.

IBM clients over the last decade have come to understand that this first idea is extraordinarily powerful. Today, the IBM SmartCloud Virtual Storage Center helps clients create a software-defined abstraction layer over existing physical arrays from EMC and LOTS of others. Regardless of the brand, tier, or capability of your existing physical arrays, the virtual arrays are capable of snaps, replication, stretching a virtual volume across two physical sites at distance to facilitate active-active datacenters, thin provisioning, real-time compression, transparent data mobility, etc. Administrators can describe named collections of services for different workloads — “here are the services ‘Database’ workloads need, and here are the different set of services ‘E-mail’ workloads need” — greatly simplifying provisioning. If you need help in understanding your unique data and its needs, IBM has developed consulting services to assist. Once service levels are defined and named, administrators simply specify a) what service level they need, b) how much capacity they need in that service level, and c) what machine needs access. Requests kick off an orchestrated workflow that performs all the mundane tasks of creating virtual volumes with the right services, provisioning the remote replication relationships if needed, zoning the SAN and masking the virtual volumes for secure access, configuring the host multi-pathing for access resiliency, etc. Requests can be made by administrators via a visually intuitive GUI, or programmatically via REST APIs, an OpenStack Cinder plug-in, or deep integration with VMware vSphere Storage APIs. SmartCloud Virtual Storage Center also meters client capacity usage by service level, and CIOs can effectively manage these and other IT costs with IBM SmartCloud Cost Management.
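As a rough illustration of that request model, here is a sketch of what a catalog-driven provisioning call might look like from a script. The endpoint URL, payload fields and catalog names are invented for this example; they are not the actual SmartCloud Virtual Storage Center REST API.

```python
# Hypothetical illustration of catalog-driven provisioning: name a service
# level, a capacity, and a host, and let an orchestrated workflow do the rest.
# The endpoint and payload fields are invented for this sketch and are not the
# actual SmartCloud Virtual Storage Center REST API.
import json
import urllib.request

request_body = {
    "service_class": "Database",  # a named collection of services (snaps, replication, tiering, ...)
    "capacity_gb": 2048,          # how much capacity is needed at that service level
    "host": "prod-db-01",         # which machine needs access
}

req = urllib.request.Request(
    url="https://storage-controller.example.com/api/provision",  # placeholder URL
    data=json.dumps(request_body).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Behind an endpoint like this, the orchestrator would create the virtual
# volume with the right services, set up replication if the service class
# calls for it, zone the SAN, mask the volume to the host, and configure
# multipathing.
# response = urllib.request.urlopen(req)  # uncomment against a real endpoint
```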

SmartCloud Virtual Storage Center visually intuitive GUI

The second ViPR idea Chuck describes is ‘changing how data is presented depending on a given application’s access needs’. What he is describing is a storage approach that layers access methods. In the case of ViPR, Chuck describes NFS as the base method and then other methods that could be layered on top, like object-over-NFS or HDFS-over-NFS. The SmartCloud Virtual Storage Center implements block storage as the base layer. Related offerings that use the same block code stack, like the IBM Storwize V7000 Unified and IBM SONAS, offer file-over-block and are looking forward to adding object methods. This area is evolving rapidly, and I agree with Chuck’s speculation that storing a piece of data once and accessing it through multiple methods could be important in the future.
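To make the idea of layered access methods a little more tangible, here is a toy sketch of a file view layered over a block store, with a single copy of the data underneath. The classes and layout are invented for illustration; they do not reflect how Storwize V7000 Unified or SONAS actually implement their protocol layers.

```python
# Toy illustration of layering access methods over one copy of data:
# a block store at the base, with a minimal file-over-block view on top.
# The classes and layout are invented for this sketch.

class BlockStore:
    """Base layer: fixed-size blocks addressed by logical block number."""
    def __init__(self, block_size: int = 16):
        self.block_size = block_size
        self.blocks = {}  # lba -> bytes

    def write_block(self, lba: int, data: bytes) -> None:
        self.blocks[lba] = data.ljust(self.block_size, b"\x00")

    def read_block(self, lba: int) -> bytes:
        return self.blocks.get(lba, b"\x00" * self.block_size)


class FileOverBlock:
    """File layer: maps a file name to the blocks that hold its contents."""
    def __init__(self, store: BlockStore):
        self.store = store
        self.inodes = {}  # file name -> list of lbas
        self.next_lba = 0

    def write_file(self, name: str, payload: bytes) -> None:
        lbas = []
        for off in range(0, len(payload), self.store.block_size):
            self.store.write_block(self.next_lba, payload[off:off + self.store.block_size])
            lbas.append(self.next_lba)
            self.next_lba += 1
        self.inodes[name] = lbas

    def read_file(self, name: str) -> bytes:
        data = b"".join(self.store.read_block(lba) for lba in self.inodes[name])
        return data.rstrip(b"\x00")


fs = FileOverBlock(BlockStore())
fs.write_file("report.txt", b"one copy of the data, more than one way to get at it")
print(fs.read_file("report.txt").decode())
```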

The third ViPR idea Chuck describes is ‘Storage Services For Cloud Applications’. In his blog, he’s wrestling with a great question. A decade ago, ‘server virtualization’ was a budding young concept. Today it is foundational to the way we do IT. CIOs have long since made their decisions on server virtualization and are now working to complete the virtual datacenter. We’ve found that with the servers handled, virtualizing the storage infrastructure is the focus in 2013. The question Chuck is wrestling with is “Is ViPR a modern interpretation of what we now mean when we say ‘storage virtualization’?” It’s certainly EMC’s modern interpretation, having tried before to virtualize physical storage with Invista (circa 2005) and VPLEX (circa 2010). At IBM, we started virtualizing storage in 2003. Today, that software stack and its ecosystem of integration with applications, server hypervisors, orchestrators, cloud stacks, and cost managers is implemented in thousands of datacenters. If nothing else, we’ve stayed focused on growing what works. In recent posts, I have explored how the industry defines software-defined storage, and whether it is a key to a successful private cloud. If EMC breaks tradition and sticks with ViPR for the long term, the words they are using in their marketing demonstrate they understand what ViPR needs to become if it wants to be a complete offering. However, as CIOs make decisions on software-defining their storage in 2013, I think they’ll find that the IBM SmartCloud Virtual Storage Center is already accomplishing for storage what server hypervisors have accomplished for servers.

SNW Spring 2013 recap

(Originally posted April 4, 2013 on my domain blog at ibm.com. Reposted here for completeness as I move to WordPress.com)
 

I’m just returning from the SNW Spring conference in Orlando. It seemed sparsely attended but my 5-foot tall wife of almost 28 years has always told me that dynamite comes in small packages (I believe her!).

As I noted in my last post, I was in Orlando to participate in a round table discussion on storage hypervisors hosted by ESG Senior Analyst Mark Peters. I was joined by Claus Mikkelsen – Chief Scientist at Hitachi Data Systems, Mark Davis – CEO of Virsto (now a VMware company), and George Teixeira – CEO of DataCore. Conspicuously missing from the conversation both at this SNW and at a similar round table held during the SNW Fall 2012 conference was any representation from EMC. More on that in a moment.

The session this time drew a crowd roughly three times the size of the Fall 2012 installment – a completely full room. And the level of audience participation in questioning the panel members further demonstrated just how much the industry conversation is accelerating. I was pleased to see that most of the discussion was focused on use cases for what was interchangeably referred to as storage virtualization, storage hypervisors, and software-defined storage. Following are a few of the use cases that were probed.

Data migration was noted as an early and enduring use case for software-defined storage. Today’s physical disk arrays are capable of housing many TBs of data, often from MANY simultaneous business applications. When one of these physical disk arrays has reached the end of its useful life (the lease is about to terminate), the process of emptying the data from that old disk array to a newer, more modern disk array can be time consuming. The difficult part isn’t the volume of data, it’s the number of application disruptions that have to be scheduled to make the data available for moving. And if you happen to be switching physical disk array vendors, that can create related effort on each of the host machines accessing the data to ensure the correct drivers are installed. Clients we have worked with tell us the process can take months. That’s not only hard on the storage administration team, but it’s also wasteful because a) you have to bring in a new target array months ahead of time and b) both it and the source array remain only partially used during those months as the data is migrated. The economic value of solving this data migration issue is an early use case that has fueled solutions like IBM SAN Volume Controller (SVC), Hitachi Virtual Storage Platform, and DataCore SANsymphony-V. Each of these is designed to provide the basic mechanics of storage virtualization and mobility across most any physical disk array you might choose – all without disruption of any kind to the business applications that are accessing the data.
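To illustrate those basic mechanics, here is a minimal toy model of a virtualization layer: hosts keep addressing the same virtual volume while the mapping underneath moves to a different physical array. The class and method names are invented for this sketch; this is not how SAN Volume Controller is actually implemented.

```python
# Toy model of a storage virtualization layer: hosts address virtual volumes,
# and the layer maps each one to a physical array. Migration is a change of
# mapping plus a background copy; the host-facing identity never changes, so
# applications are not disrupted. Illustrative only.

class VirtualizationLayer:
    def __init__(self):
        self.mapping = {}  # virtual volume name -> physical array name

    def provision(self, vvol: str, array: str) -> None:
        self.mapping[vvol] = array

    def serve_io(self, vvol: str) -> str:
        # Hosts only ever see the virtual volume name, never the array behind it.
        return f"I/O for {vvol} served from {self.mapping[vvol]}"

    def migrate(self, vvol: str, new_array: str) -> None:
        old = self.mapping[vvol]
        # A background copy from the old array to the new one would run here.
        self.mapping[vvol] = new_array
        print(f"{vvol}: moved from {old} to {new_array}, host path unchanged")


layer = VirtualizationLayer()
layer.provision("erp_data", array="aging_array")
print(layer.serve_io("erp_data"))
layer.migrate("erp_data", new_array="shiny_new_array")  # no outage, no host driver changes
print(layer.serve_io("erp_data"))
```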

A quick side comment. While the data migration use case carries a strong economic benefit for IT managers (transparent migration from old to new disk arrays), it can just as easily be used to migrate from old to new disk array ‘vendors’. For the IT manager, this has the potential for even greater economic benefit because it creates the very real threat of competition among physical disk array vendors, driving cost down and service up. But for an incumbent disk array vendor, there’s not a lot of built-in motivation to introduce their client to such a technology. At SNW this week, it was suggested that this dynamic may be responsible for the relatively low awareness and deployment of storage virtualization technologies. Incumbent vendors are happy to keep their clients in the dark about software-defined storage and data migration use cases. Interestingly, almost 10 years after these technologies were first introduced, EMC (whose market share makes them the most frequent incumbent physical disk array vendor) is still only talking about this topic in the shadows of ‘small NDA sessions’. See Chuck’s Blog from earlier this week.

Flash storage ‘everywhere’ was identified as a more recent, and perhaps more powerful use case. SNW drew a strong contingent of storage industry analysts from firms like IDC, ESG, Evaluator Group, Silverton Consulting and Mesabi Group. A consistent theme from the analysts I spoke with, as well as from the panel discussion, is that data- and performance-hungry workloads are driving an unusually rapid adoption of flash storage. Early deployments were as simple as adding a new ‘flash’ disk type into existing physical disk arrays, but now flash is showing up ‘everywhere’ in the data path from the server on down. The frontier now is in the efficient management of this relatively expensive real estate whether it is deployed in disk arrays, in purpose-built drawers, or in servers. Flash is simply too expensive to park whole storage volumes on because a lot of what gets stored isn’t frequently accessed and would be better stored on something slower and less expensive. This is where the basic mechanics of storage virtualization and mobility from the data migration use case come in. At IBM, we’ve evolved the original SVC capabilities to couple the basic mechanics with analytics and automation that guide how and when to employ the mechanics most efficiently. The evolved offering, SmartCloud Virtual Storage Center, was introduced last year.

Consider this scenario. You are an IT manager who has invested in two tiers of physical disk arrays. You have also added a third disk technology – a purpose-built flash drawer (perhaps an IBM TMS RamSan). You have gathered all that physical capacity and put it under the management of a software-defined storage layer like the SmartCloud Virtual Storage Center. All of your application data is stored in virtual volumes that SmartCloud Virtual Storage Center can move at will across any of the physical disk arrays or flash storage. Knowing which ones to move, when, and where to move them is where SmartCloud Virtual Storage Center excels. Here’s an example. Let’s suppose there is a particular database-driven workload that is only active during month-end processing. The analytics engine in SmartCloud Virtual Storage Center can discover this and create a pattern of sorts that has this volume living in a hybrid pool of tier-1 and flash storage during month end and on tier-2 storage the rest of the month. In preparation for month end, the volume can be transparently staged into the hybrid pool (we call it an EasyTier pool), at which point more real-time analytics take over, identifying which blocks inside the database are being most accessed. Only these are actually staged into flash, leaving the lesser-utilized blocks on tier-1 spinning disks. Do you see the efficiency? The icing on the cake comes when all this data is compressed in real time by the storage hypervisor. This kind of intelligent analytics – directing the mechanics of mobility – from a software-defined layer is critical to economically deploying flash.
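Here is a heavily simplified sketch of the kind of decision those analytics make: rank the extents of a volume by recent access heat and stage only the hottest ones into a limited amount of flash. The heat counters, capacity limit and placement rule are invented for illustration; this is not the actual Easy Tier algorithm.

```python
# Simplified illustration of analytics-driven tiering: rank the extents of a
# volume by recent access counts and place only the hottest ones on flash,
# leaving the rest on spinning disk. The numbers and placement rule are
# invented for this sketch; they are not the actual Easy Tier algorithm.
import random

random.seed(42)

# Pretend access counters for the 1 GB extents of a month-end database volume.
extent_heat = {f"extent_{i:04d}": random.randint(0, 1000) for i in range(64)}

FLASH_CAPACITY_EXTENTS = 8  # flash is expensive, so only a few extents fit

# Stage the hottest extents into flash; keep everything else on tier-1 disk.
ranked = sorted(extent_heat.items(), key=lambda kv: kv[1], reverse=True)
placement = {
    extent: ("flash" if rank < FLASH_CAPACITY_EXTENTS else "tier1_disk")
    for rank, (extent, _heat) in enumerate(ranked)
}

hot = [extent for extent, tier in placement.items() if tier == "flash"]
print(f"{len(hot)} of {len(extent_heat)} extents staged to flash: {hot[:4]} ...")
```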

Commoditization of physical disk capacity. Yikes! One of the more insightful observations offered by panel members, including VMware, was that if you follow the intent of a software-defined storage layer to its conclusion, it leads to a commoditization of physical disk capacity prices. From a client perspective, this is welcome news, and really, it’s economically required to keep storage viable. Think about it: data is already growing at a faster pace than disk vendors’ ability to improve areal density (the primary driver behind reduced cost), and the rate of data growth is only increasing. Intelligence, analytics, efficiency, mobility… in a software-defined storage layer will increase in value, freeing IT managers to shift, en masse, toward much lower-cost storage capacity.
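A quick back-of-the-envelope sketch shows why. The two rates below are assumptions chosen only to illustrate the point that when data grows faster than cost per TB falls, the raw capacity bill keeps rising.

```python
# Back-of-the-envelope sketch: if data grows faster than areal density (and so
# cost per TB) improves, raw capacity spend keeps climbing year over year.
# Both rates below are assumptions for illustration, not industry figures.

data_growth = 0.40   # assumed: data volume grows 40% per year
cost_decline = 0.20  # assumed: cost per TB falls 20% per year

capacity_tb, cost_per_tb = 1000.0, 100.0
for year in range(1, 6):
    capacity_tb *= 1 + data_growth
    cost_per_tb *= 1 - cost_decline
    print(f"Year {year}: {capacity_tb:,.0f} TB at ${cost_per_tb:,.2f}/TB -> ${capacity_tb * cost_per_tb:,.0f}")

# Spend rises every year even though $/TB falls, which is why the efficiency
# has to come from the software-defined layer on top.
```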

Another quick side comment. With EMC still lurking in the shadows on this conversation and VMware agreeing with the ultimate end state, it seems the two still have some internal issues to resolve. I don’t fault them. It’s a sobering thought for any vendor who has a substantial business in physical disk capacity. But at least for the two disk vendors represented on this week’s SNW panel, we are actively engaged in helping clients achieve the necessary end goal.

The conversation continues. Check out the blog by Kate Davis at HP, How do you define software-defined storage?

Join the conversation! Share your point of view here. Follow me on Twitter @RonRiffe and the industry conversation under #SoftwareDefinedStorage.