IBM on Software Defined Storage

I routinely follow a number of blogs by storage industry thought leaders. Among them is a usually insightful blog by EMC’s Chuck Hollis. Last Friday I read his post titled Software-Defined Storage – Where Are We? As Chuck described, the post was intended to explore “Where are the flags being planted? Is there any consistency in the perspectives? How do various vendor views stack up? And what might we see in the future?” The questions themselves captured my attention. First, they are great questions that everyone watching this space should want answered. Second, I wanted to see which vendors EMC was interested in comparing itself with. Notably missing from Chuck’s list was IBM, a vendor who has both a lot to say and a lot to offer on the subject of software defined.

I thought Chuck did a nice job in the sections of his post on Basic Software Defined Storage (SDS) Concepts and Towards a Superset of Characteristics. My only critique would be that he didn’t acknowledge some of the forward-leaning work being done in the space. For example, in the area of concepts he rightly observed of the past that “there is little consensus on what is software-defined storage, and what isn’t,” but he failed to acknowledge the important work by the team at IDC in providing the industry with an unbiased nomenclature and taxonomy for software-based storage. See my post from a couple of months back on How do you define Software-defined Storage? Chuck also suggested that “the required technology isn’t quite there yet — but there are all signs that it’s coming along very quickly. By next year, there should be several good products in the marketplace to concretely evaluate.” That may be true for EMC and the rest of the vendors he chose to talk about, but by the end of this post I hope you will understand that when it comes to IBM, Chuck’s statement is several years behind.

The aim of software-defined

Software defined storage isn’t an end unto itself. It is a necessary piece in the evolution to a software defined environment (SDE), also referred to as a software defined datacenter (SDDC). I like IDC’s definition of what this is: “a loosely coupled set of software components that seek to virtualize and federate datacenter-wide hardware resources such as storage, compute, and network resources and eventually virtualize facilities-centric resources as well. The goal for a software-defined datacenter is to tie together these various disparate resources in the datacenter and make the datacenter available in the form of an integrated service…” IBM is one of the few vendors working in all areas of software defined, and Jamie Thomas, Vice President and General Manager of Software Defined Systems, heads the division that coordinates that work.

Jamie thinks about SDE from the perspective of workloads and patterns of expertise that can help simplify operations, reducing labor costs and improving security. A software defined environment is also more responsive and adaptive as workloads expand from today’s enterprise applications to mobile, social, big data analytics and cloud. Her view is that open source and standards communities are crucial to the long-term viability of SDE. IBM’s work in software defined compute with the Open Virtualization Alliance and oVirt, in SDN with OpenDaylight, and in cloud with OpenStack is helping propel the construction of software defined environments.


IBM’s work in software defined storage

The words have morphed over time. What VMware did for Intel servers has been referred to as a hypervisor, as virtualization, and now is being called software defined compute to line up with the rest of the SDE vocabulary. The foundation of a software defined environment is, well, software that offers a full suite of services and federates physical infrastructure together to provide the basic commodity. In the case of VMware, the commodity is Intel megahertz. In the case of SDS, the commodity is terabytes.

IBM clients first began using these capabilities in 2003 with the IBM SAN Volume Controller software drawing its compute horsepower from commodity Intel processors and managing terabytes provided by federated disk arrays. That software base has since been renamed to the Storwize family software platform and given an expanded set of commodity engines to run on. Today, there are federating systems with no storage capacity of their own, systems with internal solid-state drives to speed the input/output (I/O) of other federated storage, and systems that carry their own serial attached SCSI (SAS) disk and flash capacity to augment other federated capacity. There are entry models, midrange models, enterprise models and even models that are embedded in the IBM PureSystems family converged infrastructure. For a more complete description of the suite of services offered, the breadth of physical storage that can be federated, and the I/O performance that can be enjoyed, see my post Has IBM created a software-defined storage platform? Over the last decade, this software platform has been referred to as virtualization, as a storage hypervisor, and now with a total capacity under Storwize software management on its way to an exabyte, we call it SDS v1.0.


SDS v2.0

SDS v2.0 came along early in 2012 with the introduction of IBM SmartCloud Virtual Storage Center (VSC). Building on the successful base of the Storwize family software platform, VSC added a number of important capabilities.

  • Service catalog: Administrators organize the suite of VSC storage services into named patterns – catalog entries. Patterns describe workload needs in terms of capacity efficiency, I/O performance, access resilience, and data protection. For example, a pattern for ‘Database’ might describe needs that translate to compressed, thin-provisioned capacity on a hybrid flash and SAS pool, with a single-direction synchronous mirror and load-balanced multi-path access. The beauty of the service catalog is that requestors (application owners or orchestrators, as we’ll see shortly) don’t need to concern themselves with the details. They just need to know they need ‘Database’ capacity.
  • Programmable means of requesting services: VSC includes APIs that surface the service catalog patterns to portals and orchestrators. The questions that must be answered are quite simple. How much capacity do you need? In what service level do you need it? Who needs access? From there, storage-centric orchestration takes over and performs all the low-level, mundane tasks of satisfying the request. And it works on a wide variety of physical storage infrastructure. The VSC APIs have been consumed by an end-user accessible portal, SmartCloud Storage Access, and by higher-level SDE orchestrators like SmartCloud Orchestrator. (An illustrative sketch of this request flow appears after this list.)
  • Metering for usage-based chargeback: Service levels and capacity usage are metered in VSC. Metering information is made available to usage and cost managers like SmartCloud Cost Management so that individual consumers may be shown, or charged for, their consumption. Because VSC meters service levels as well as usage, higher prices can be established for higher levels of SDS service. Remember IBM’s perspective: we are building out SDE, of which SDS is a necessary part. SmartCloud Cost Management follows the model, providing insight into the full spectrum of virtualized and physical assets.
  • Management information and analytics: When the challenges of day-to-day operations happen (and they do happen most every day), administrators need straightforward information surrounded by visually intuitive graphics and analytic-driven automation to speed decision making and problem resolution. Last year we introduced just this approach with SmartCloud Virtual Storage Center. I discussed it more thoroughly in my post Do IT managers really “manage” storage anymore? If you watch the news, you’ll know that IBM is leading a transformation toward cognitive computing. We’re not there yet with the management of SDS, but consider this scenario. You are an IT manager who has invested in two tiers of physical disk arrays, probably from different vendors. You have also added a third storage technology – a purpose-built flash drawer. You have gathered all that physical capacity and put it under the management of a software defined storage layer like the SmartCloud Virtual Storage Center. All of your workloads store their data in virtual volumes that SmartCloud Virtual Storage Center can move at will across any of the physical disk arrays or flash storage. Knowing which volumes to move, when, and where to move them is where SmartCloud Virtual Storage Center excels. Here’s an example. Let’s suppose there is a particular database workload that is only active during month-end processing. The analytics in SmartCloud Virtual Storage Center can discover this and create a pattern of sorts that has this volume living in a hybrid pool of tier-1 and flash storage during month end and on tier-2 storage the rest of the month. In preparation for month end, the volume can be transparently staged into the hybrid pool (we call it an EasyTier pool), at which point more real-time analytics take over, identifying which blocks inside the database are being most accessed. Only these are actually staged into flash, leaving the lesser-utilized blocks on tier-1 spinning disks. Can you see the efficiency? (A conceptual sketch of this staging logic appears after this list.)
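To make the service catalog and the programmable request path concrete, here is a purely illustrative Python sketch. To be clear, this is not the VSC API – every name in it (StoragePattern, CATALOG, request_capacity, the pool and price values) is invented for illustration – but it captures the flow: a requestor names a pattern, a size, and an owner, and the catalog carries the low-level details, including the service-level pricing that metering and chargeback can build on.

```python
from dataclasses import dataclass

# Hypothetical pattern definition -- not the VSC schema; the field names
# are invented to mirror the catalog attributes described above.
@dataclass(frozen=True)
class StoragePattern:
    name: str
    pool: str                # e.g. a hybrid flash + SAS EasyTier pool
    compressed: bool
    thin_provisioned: bool
    sync_mirror: bool        # single-direction synchronous mirror
    multipath: bool          # load-balanced multi-path access
    price_per_tb: float      # higher service levels metered at higher rates

CATALOG = {
    "Database": StoragePattern("Database", pool="hybrid-flash-sas",
                               compressed=True, thin_provisioned=True,
                               sync_mirror=True, multipath=True,
                               price_per_tb=900.0),
    "Archive": StoragePattern("Archive", pool="nearline-sas",
                              compressed=True, thin_provisioned=True,
                              sync_mirror=False, multipath=False,
                              price_per_tb=120.0),
}

def request_capacity(pattern_name: str, size_tb: int, owner: str) -> dict:
    """Answer the three questions a portal or orchestrator must ask:
    how much, at what service level, and for whom."""
    pattern = CATALOG[pattern_name]
    # In a real deployment, storage-centric orchestration would now carry
    # out the low-level provisioning against the federated infrastructure.
    return {"owner": owner, "size_tb": size_tb, "pattern": pattern.name,
            "monthly_charge": size_tb * pattern.price_per_tb}

# A requestor only needs to know it needs 'Database' capacity.
print(request_capacity("Database", size_tb=5, owner="app-team-7"))
```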
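The tiering analytics are, of course, far more sophisticated than anything that fits in a blog post, but here is a conceptual sketch of the month-end scenario (again, every name below is hypothetical): a volume whose busy days have been learned by analytics is staged into the hybrid pool a little ahead of its busy window and lives on tier-2 storage the rest of the month.

```python
import datetime

# Hypothetical pool names; in the scenario above these would be an EasyTier
# hybrid flash/tier-1 pool and a tier-2 pool of spinning disks.
HYBRID_POOL = "hybrid-flash-tier1"
TIER2_POOL = "nearline-tier2"

def pool_for(active_days: set[int], today: datetime.date,
             stage_ahead_days: int = 2) -> str:
    """Pick a pool for a volume whose activity pattern (the days of the
    month it is busy) has been learned. Staging happens a little before
    the busy window so the move is transparent to the workload."""
    upcoming = {(today + datetime.timedelta(days=d)).day
                for d in range(stage_ahead_days + 1)}
    return HYBRID_POOL if active_days & upcoming else TIER2_POOL

# A database volume that analytics discovered is only busy at month end.
# Once it lands in the hybrid pool, finer-grained real-time analytics
# would decide which of its blocks actually earn a spot on flash.
month_end = {28, 29, 30, 31, 1}
print(pool_for(month_end, datetime.date(2013, 6, 27)))  # hybrid pool
print(pool_for(month_end, datetime.date(2013, 6, 12)))  # tier-2 pool
```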

SDS v3.0

So where are we?

  • SDS v1.0 delivered. Software that offers a full suite of services and federates physical infrastructure.
  • SDS v2.0 delivered. A service catalog with a programmable means of accessing services, a portal and SDE cloud orchestration integration. Metering for usage-based chargeback and management information with analytics.

Where do we go from here? At IBM we’re busy opening up the Storwize family software platform for industry innovation, helping VSC become even more aware of application patterns, and progressing the notion of cognitive, analytic-driven decision making in SDS. Watch this space!

Users of IBM SDS speak

More than just theory and a point of view, IBM SDS is helping real customers. At the recent IBM Edge conference there were over 75 client testimonials shared, many of them about the benefits realized from using IBM SDS. I covered several of them in my post on Edge Day 2.

One of the coolest stories came earlier in the year at the IBM Pulse conference from IBM’s internal IT operations. IBM’s CIO manages 100 petabytes of data and, by leveraging SmartCloud Virtual Storage Center, was able to reduce costs by 50% with no impact to performance.

Did this help clarify IBM’s position in SDS?

Edge 2013 Day 2 – “I could have cried!”

In my post yesterday I mentioned that we heard the first of over 75 client testimonials being shared at IBM Edge 2013. Today, the client stories came fast and furious. Several caught my attention.

Sprint is a US telecommunications firm that has 90% of its 16 petabytes of SAN storage capacity under the control of software-defined storage – specifically the Storwize family software running on IBM SAN Volume Controller engines. Because of the flexibility of software-defined storage, Sprint was able to seamlessly introduce IBM FlashSystem as a new MicroLatency tier and transparently move a call center workload to the new flash storage. The results were impressive: 45x faster access to customer records. That’s right, a 4,500% improvement!

eBay is both the world’s largest online marketplace and a company that offers solutions to help foster merchant growth. They are serious about open collaborative solutions in their datacenters. When it comes to cloud, they use OpenStack. eBay implemented IBM XIV storage with its OpenStack Cinder driver integration and is now able to guarantee storage service levels to their internal customers.
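For readers curious what that looks like from the consumer side, here is a minimal sketch assuming the python-cinderclient v2 bindings. The credentials, endpoint, and the ‘ibm-xiv’ backend name are placeholders (the real backend name comes from the operator’s cinder.conf), but volume types carrying a volume_backend_name extra spec are the standard Cinder mechanism for pinning a class of service to specific storage.

```python
# Minimal sketch, assuming python-cinderclient v2; all credentials and the
# backend name below are placeholders, not eBay's actual configuration.
from cinderclient.v2 import client

cinder = client.Client("USER", "PASSWORD", "PROJECT",
                       "http://keystone.example.com:5000/v2.0")

# Define a 'gold' volume type pinned to the XIV backend so the Cinder
# scheduler only places gold volumes on storage that can honor the SLA.
gold = cinder.volume_types.create("gold")
gold.set_keys({"volume_backend_name": "ibm-xiv"})  # assumed backend name

# Internal customers then simply ask for 'gold' capacity.
cinder.volumes.create(100, name="commerce-db", volume_type="gold")
```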

Ricoh is a global technology company specializing in office imaging equipment, production print solutions, document management systems and IT services. All of their physical storage capacity is under the control of a Storwize family software-defined layer inside the IBM SmartCloud Virtual Storage Center. This enabled extreme efficiency, saving them 125 TB of real capacity and delivering a 40% cost reduction through tiering. As the Ricoh speaker left the stage, the IBM host asked an unscripted question, “Can you imagine running your IT without software-defined storage virtualization?” to which Ricoh responded, “No! It would be catastrophic.”

LPL Financial is the largest independent broker-dealer in the US. Their physical storage infrastructure was multi-vendor, isolated in islands, and underutilized, with little administrative visibility. The inflexible nature of physical storage had isolated workloads to certain disk arrays even though excess capacity might exist elsewhere in the datacenter. LPL implemented SmartCloud Virtual Storage Center (built on a Storwize family software-defined layer) for their most problematic areas in just three months – 3 months! The seamless workload mobility provided by this software-defined storage approach solved issues like performance incident resolution, islands of waste, and the headaches associated with retiring old physical arrays. The quote of the day came from Chris Peek, Senior Vice President of Production Engineering at LPL Financial: “It was so good I could have cried!” LPL continued by building a new datacenter with a 100% software-defined storage infrastructure using SmartCloud Virtual Storage Center. Using software layer capabilities like tiering, thin provisioning and real-time compression, they were able to save an astounding 47% in total infrastructure.

Arkivum is a public archive cloud provider in Europe. They use tape media with the IBM Linear Tape File System as the foundation of an archive service that economically offers a 100% guarantee for 25 years. The thing that struck me: in a storage industry that speaks in terms of five-nines, Arkivum is combining cloud and tape with a 100% guarantee.

There were others too. Kroger is a grocery chain in the US. They implemented IBM FlashSystem and reduced latency tenfold for their Oracle Retail platform. And CloudAccess.net is a cloud service provider that needed to drive 400,000 I/Os per second. They replaced a bank of disk drives with a single IBM FlashSystem drawer at one-tenth the cost.

I have to say that all the focus on client outcomes is refreshing. Sure, Edge has plenty of discussion around IBM’s strategy and the innovative technology being announced this week. But I agree with Kim Stevenson, Chief Information Officer at Intel, who said, “Organizations don’t buy technology, they buy benefits.”

I’m sure there were other client stories shared in sessions that I missed. Share your favorite outcomes below. Leave a comment!

Edge 2013 Day 1 – Clouds and floods of Flash

After using Rainy days and sunshine to describe Day 0, Clouds and Flash seemed natural for Day 1. Not sure where the weather will lead tomorrow.

Day 1 of IBM Edge was action packed. With 65 new and refreshed products being announced and the first of over 75 client testimonials being shared, there was a lot of information to consume.

Here are the highlights that caught my attention.

Stephen Leonard is IBM’s General Manager for Global Markets and a current resident of the UK. He described his experience traveling to the US for Edge. Sitting at breakfast in his home, he used his mobile phone to select a seat on the airplane, check in and get his boarding pass. Leaving his house, the satnav in his car downloaded real-time traffic data for the London area and optimized his route to the airport. In the airport terminal he read a British newspaper on his tablet, and because he had been looking at real estate in the US the day before, the digital British newspaper presented him with advertisements for realtors in Connecticut.

  • We are creating wider and wider trails of data that are as unique to an individual as fingerprints and DNA.
  • These data trails are also being created by manmade things like roads, railways, cities, and supply chains, as well as natural things like rivers, wind, and cattle.
  • The world is being shaped by Big Data.
  • But today’s datacenters aren’t made for this kind of world.
  • He referenced an analyst study (I admit I didn’t catch the source) that suggested that in the last two decades the cost of IT administration has grown from less than one-third of the IT budget to over two-thirds, making investment in innovation difficult.

Bernie Meyerson is an IBM Fellow and Vice President of Innovation.

  • We will reach the density limits of silicon in 7-10 years.
  • The limits of magnetic recording are also approaching quickly.
  • Bernie showed a now famous IBM Research video of A Boy and His Atom to make the point that it’s important we know the physical limits of current technology and when they are coming because massive investment is needed to come up with what’s next.
  • IBM spends about $6B annually in R&D and has been #1 in patent production for the last 20 years. Last year alone, IBM was awarded 6,478 patents.

Kim Stevenson is the Chief Information Officer at Intel.

  • We live in a sharing economy. We share pictures on Instagram, music on Pandora, bicycles with Citi Bike, cars with Zipcar, and vacation rentals with Airbnb.
  • As we move forward, most IT will be delivered in a shared model. Public, private, and hybrid clouds.
  • Stephen Leonard added IBM’s point of view that important innovation will also be shared. He pointed to IBM’s strong involvement in and support of Linux, Eclipse, Apache, and now OpenStack and Hadoop.

Ed Walsh is IBM’s Vice President of Storage Systems Marketing and Strategy.

  • You know about virtualizing servers and the benefits that led both you and your peers to broadly adopt it for your compute infrastructure. Imagine if you could achieve the same benefits by virtualizing your storage infrastructure.
  • This is the promise of software-defined storage (SDS). The good news is that it is here today.
  • SDS v1.0 is virtualization of physical storage infrastructure, regardless of your choice in hardware vendor. IBM was delivering this in the Storwize family software platform before the industry started calling it SDS.
  • SDS v2.0 is making that platform open and extensible, kicking off an era of industry innovation. This is also here today.
  • SDS v3.0 adds analytic- and application-driven patterns – hints provided to the SDS platform through open APIs – enabling it to adapt and optimize services to the workload. (A purely conceptual sketch of such a hint follows this list.)
  • I understand from other attendees that there was standing room only in two of the Technical Edge sessions in this area. Performance optimization expert and Master Inventor Barry Whyte discussed the history of the Storwize family software platform and product strategist Jason Davison talked about today’s SmartCloud Virtual Storage Center packaging of that technology. If you were in either of those sessions, please leave a comment below with your perspective.
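No v3.0 hint API has been published, so treat the following as a thought experiment rather than product documentation – the endpoint, path, and payload are all invented. It simply illustrates the idea of an application passing a workload hint to the SDS platform through an open API so the platform can adapt:

```python
# Purely conceptual -- every endpoint and field name here is hypothetical.
import json
import urllib.request

def send_workload_hint(sds_endpoint: str, volume: str, hint: dict) -> None:
    """POST a workload hint to a (hypothetical) SDS platform endpoint."""
    body = json.dumps({"volume": volume, "hint": hint}).encode()
    req = urllib.request.Request(sds_endpoint + "/v3/hints", data=body,
                                 headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)

# A batch-analytics job announces an upcoming sequential scan; the platform
# might respond by prefetching aggressively or deferring tiering moves.
send_workload_hint("http://sds.example.com", "warehouse-vol-12",
                   {"pattern": "sequential-read", "duration_min": 90})
```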

TSM Operations Center

  • The much-anticipated Tivoli Storage Manager Operations Center was announced today.
  • Reports were that the session with product manager Xin Wang on the TSM Operations Center was also standing room only. If you were in that session, please leave a comment below with your perspective.  

Edge 2013 Day 0 – Rainy days and sunshine

I woke this morning at home just outside Dallas, Texas to a surprise rain storm and remarkably cool weather. A plane ride later I was in Las Vegas, Nevada where the sun was shining and the temperature topped out at 108 degrees Fahrenheit (almost 40 degrees higher than what I woke to).

At Day 0 of the IBM Edge 2013 conference, the temperature wasn’t the only thing that was hot. I spent the afternoon with 900+ of the best partners in storage and the evening at a reception for some of the most influential analysts in the industry. Like me, they are all here to join IBM in a conversation on pushing the Edge of possibility – the conference theme.

Here are my key observations from Day 0

  1. IBM is going to be majoring in three areas of technology during this conference.
    • FlashSystem: You’ll remember back in April IBM announced a $1B investment in Flash. If you missed it, John Furrier at SiliconANGLE did a nice job of covering the announcement. Edge 2013 is going to continue to reveal what’s behind that investment and where it’s heading.
    • PureSystems: Almost exactly a year before the announcement of a $1B Flash investment, IBM announced PureSystems, the result of a $2B, 4-year investment.
    • Storwize: Software defined environments are strategic and IBM has been maturing this software-defined storage platform for the last decade. It’s now becoming a center for industry innovation.
  2. IBM’s chief economist, Martin Fleming, shared his point of view that the global economy is going through a transformation but macro trends are starting to normalize and recover. He noted that as far back as 1771, with the dawn of the industrial revolution, there have been technological disruptions followed by a frenzy of strong economic growth, followed by an economic crash similar to the one we’ve been experiencing. These eras are measured in decades. His observation is that we are coming through the crash, starting to normalize again, and are positioning for an era of strong growth. Companies like IBM, which are built for distance, are prepared to thrive.

If you aren’t able to attend Edge in person, the conference organizers have made it possible for you to watch keynote addresses by Livestream at http://www.ibm.com/edge. If you are here at Edge, join the conversation. Come back to my blog each day this week. I’ll be sharing what I found important, but it’s a big conference (Executive Edge, Technical Edge, Winning Edge, MSP Summit, Business Partner Forum) and I can’t cover everything. Leave comments letting me know what your impressions were.

Has IBM created a software-defined storage platform?

You get to be the judge.

If you have been anywhere close to the storage industry for the last several months you’ve noticed that software-defined storage (SDS) is all the rage. One of the difficulties has been pinning down exactly what SDS means. In my post “How do you define Software-defined Storage,” I pointed to a definitive piece of work by IDC laying out a complete taxonomy for SDS (they call it Software-based Storage). In the report, IDC describes three key attributes that IT managers can look for in identifying software-based storage solutions.

  1. A software-based storage solution is software. It is designed to run on commodity hardware and leverage commodity persistent data resources. This is in contrast to most traditional storage systems that, while they may have software microcode at their core, depend on some custom application-specific integrated circuit (ASIC), a specialized processor or a controller to perform some or all of their storage functions.
  2. A software-based storage solution offers a full suite of storage services.
  3. A software-based storage solution federates physical storage capacity from multiple locations like internal disks, flash systems, other external storage systems and soon from the cloud and cloud object platforms. (The toy sketch after this list illustrates the idea.)
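IDC’s third attribute – federation – is the one that most separates SDS from a conventional array, so here is a toy Python sketch of the concept (every class and policy below is invented for illustration; it is neither IBM nor IDC code): capacity from dissimilar backends is pooled behind a single namespace, and consumers never see which physical device holds their volume.

```python
# Toy illustration of the federation attribute -- all names hypothetical.
class Backend:
    """One physical capacity source: internal disks, a flash system,
    an external array, or eventually a cloud object store."""
    def __init__(self, name: str, free_gb: int):
        self.name, self.free_gb = name, free_gb

class FederatedPool:
    """A single namespace over many backends; placement is the pool's
    problem, never the consumer's."""
    def __init__(self, backends: list[Backend]):
        self.backends = backends
        self.placement: dict[str, Backend] = {}

    def create_volume(self, name: str, size_gb: int) -> None:
        # Simplistic placement policy: the backend with the most free space.
        target = max(self.backends, key=lambda b: b.free_gb)
        if target.free_gb < size_gb:
            raise RuntimeError("pool exhausted")
        target.free_gb -= size_gb
        self.placement[name] = target  # consumers never see this mapping

pool = FederatedPool([Backend("internal-disks", 2_000),
                      Backend("flash-system", 500),
                      Backend("vendor-x-array", 10_000)])
pool.create_volume("erp-data", 750)  # lands wherever the pool decides
```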

Next week at IBM Edge, I expect to hear a lot of conversation on SDS and, in particular, the IBM Storwize family software platform. Although the SDS craze is a relatively recent phenomenon, IBM has been investing in and tuning the Storwize family software platform for more than a decade. IBM clients first began federating physical storage in 2003, and today the total capacity under Storwize software management is on its way to an exabyte.

From its beginnings in 2003, the Storwize family software platform was designed to be portable, leveraging commodity hardware engines for its compute needs. In fact, the internal code name for the Almaden Research Center project was Compass (Commodity Parts Storage System). The motivation behind the decision was cost: IBM recognized that broad market dynamics would push the price and performance curve for Intel-based processors much more quickly than any custom-designed ASIC or specialized processor could match. In the early days, the decision was ridiculed by IBM competitors, most of whom were building high-margin disk systems on custom hardware. Today, there are still some vendors who argue against the idea of SDS on commodity hardware. See the recent post by Hu Yoshida, “Software Defined Storage is not about commodity storage.”

The first commodity hardware engines to house the Storwize family software platform were the IBM 2145 SAN Volume Controller engine and the Cisco MDS 9000 Caching Services Module. The Cisco engine was short-lived, but the IBM SAN Volume Controller engine is now in its seventh generation and processes about 800% more input/output operations per second (IOPS) than the original engine, proving exactly what IBM had hoped when it chose to ride the commodity price and performance curve. This current-generation engine holds the Storage Performance Council SPC-1 benchmark record for the fastest external storage system at over 520,000 IOPS. Along the way, other commodity-based engines in a variety of configurations have joined the family. There are federating systems with no storage capacity of their own, systems with internal solid-state drives to speed the input/output (I/O) of other federated storage, and systems that carry their own serial attached SCSI (SAS) disk and flash capacity to augment other federated capacity. There are entry models, midrange models, enterprise models and even models that are embedded in the IBM PureSystems family converged infrastructure.

The beauty in the Storwize family software platform is not only that it runs on commodity engines and federates capacity from a wide variety of internal disks, flash systems and other external storage systems, but also that it accomplishes this using a single software code base across the entire range.

As IDC noted, one of the foundational requirements of an SDS platform is that it provide a complete set of storage services. If not, IT managers would be forced to weigh the tradeoff between commodity cost efficiency and missing storage services that are traditionally offered by high-margin disk arrays on custom hardware. Ten years into its development, the IBM Storwize family software platform solves the dilemma in grand fashion. Check out the following features:

Deep in traditional array capabilities:

  • Snapshot
  • Synchronous mirroring
  • Asynchronous mirroring

Efficient by design:

  • Real-time compression giving up to 5x more usable capacity on all federated storage without impacting application performance (see the quick arithmetic sketch after these bullets)
  • Thin provisioning
  • Streamlined human interface that Edison Group testing has shown saves over 47 percent of administrator time and is 31 percent less complex versus performing the same set of tasks on physical storage
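As a quick back-of-the-envelope illustration of those efficiency features (using the ‘up to’ ceilings from the bullets above, not guaranteed ratios – real results depend entirely on the data):

```python
# Back-of-the-envelope arithmetic for the efficiency bullets above.
physical_tb = 100
compression_ratio = 5.0      # "up to 5x more usable capacity" -- a ceiling
effective_tb = physical_tb * compression_ratio
print(f"{physical_tb} TB physical -> up to {effective_tb:.0f} TB usable")

# Thin provisioning: promise more than was bought, then watch real writes.
provisioned_tb = 800         # capacity promised to applications
written_tb = 80              # what applications have actually written
consumed_tb = written_tb / compression_ratio  # physical space after compression
print(f"Overcommit: {provisioned_tb / effective_tb:.1f}x of usable capacity")
print(f"Physical utilization: {consumed_tb / physical_tb:.0%}")
```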

Self-optimizing:

  • EasyTier automated tiering places the hottest data on the highest-performing storage for up to 3x improvement in I/O performance with as little as 5 percent of your total infrastructure on flash.

Cloud agile:

  • Active-active data center configurations federating storage capacity from two physical locations into a single instance of a virtual storage volume; used in conjunction with active-active virtual server capabilities like VMware vMotion over distance or IBM PowerVM Live Partition Mobility, this enables transparent site switching.
  • OpenStack Cinder driver can automatically deploy federated storage capacity with new cloud workloads
  • Integration with IBM PureSystems family for rapid deployment and management simplicity in converged cloud infrastructure

What do you think? Is IDC’s taxonomy a good measure of what the software-defined storage craze is all about? Does the IBM Storwize family software platform hit the mark as software-defined storage?