Not too long ago, the gold standard for protecting organizational data involved using a disk-to-disk-to-tape process. First, a copy of production data went to secondary disk to expedite rapid recovery if needed, and then the data went to tape for long-term retention. Previously, some organizations used only tape, and a few moved to using only disk when that became cost-effective. Even then, most IT groups knew it made more sense to use a combination of the two media types to leverage what each did best: disk for recovery, tape for retention.

Here we are today, and it appears the gold standard for protecting data, or at least the de facto standard at enterprise-scale organizations, has changed once again. It now centers on using multiple data protection methods per workload, namely employing the following:

Backups to provide multiple previous versions over an extended timespan;
Snapshotting to deliver the fastest recovery from a near-current version; and
Replication for data survivability at an alternate location.

Some have argued that one of these data protection methods should usurp the others, depending on your point of view. But the fact is the best approach has always been to use each process for what it does best, in a complementary manner with the others.

At the same time, backup itself has fragmented: many enterprises now use one backup product for their hypervisors, another for their databases, and a unified product for everything else.

Importantly, since most of those “everything else” backup products do not protect data generated from software-as-a-service applications such as Office 365, Google Docs or Salesforce, most enterprises end up using four different types of data protection products.
Ramifications of fragmentation
Do we really need four backup products? No, but because the approach to protection is already so fragmented, many IT operations admins — and even senior IT decision makers — are often no longer able to persuade their vAdmin and database admin colleagues that a unified product would really work better.
This situation is really the fault of vendors — specifically, the unified-product vendors who haven’t invested enough in marketing the economic benefits and technical proofs of how their products support varied workloads as well as niche offerings do. Arguably, this lack of effective promotion has held back the uptake of unified data protection more than any engineering gap in delivering equitable protection capabilities. Until vendors fix this messaging problem, today’s data protection-related fragmentation will continue.

In the meantime, having each administrator perform their own backups for the technological areas under their domain is a dangerous practice. Think about it: Most workload or platform admins only really care about being able to achieve 30-, 60- or 90-day rollbacks, for example. They are not worried about 5-year, 7-year or 10-year retention rules.
Corporate data must be protected to a corporate standard, however, which can include adhering to long-term retention and deletion requirements. That’s a consideration regardless of how fragmented the actual execution of protection is. So, right now, some organizational data is being under-protected and some over-protected. This patchwork of data protection methods is leaving organizations vulnerable.
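The gap between workload-admin rollback windows and corporate retention standards is easy to quantify. Below is a minimal Python sketch of that check; the workload names and retention values are illustrative assumptions, not figures from any product or survey:

```python
from datetime import timedelta

# Corporate retention standard (illustrative: 7 years).
CORPORATE_RETENTION = timedelta(days=7 * 365)

# Retention actually configured by each workload admin (illustrative values).
workload_retention = {
    "hypervisor-backups": timedelta(days=90),    # vAdmin cares about rollback
    "database-backups": timedelta(days=60),      # DBA cares about rollback
    "unified-backups": timedelta(days=7 * 365),  # backup admin owns compliance
}

def under_protected(configured: dict, standard: timedelta) -> list:
    """Return workloads whose retention falls short of the corporate standard."""
    return sorted(name for name, kept in configured.items() if kept < standard)

print(under_protected(workload_retention, CORPORATE_RETENTION))
# -> ['database-backups', 'hypervisor-backups']
```

In this toy model, two of the three tools would pass their owners’ rollback needs while silently failing the corporate retention rule, which is exactly the under-protection risk described above.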
A multiplicity of gold standards
Another thing to note is that gold standards do not necessarily replace one another. Many organizations may be using two, three or four backup products across their workloads, but they are still supplementing those backups with snapshots and replicas. That’s actually a wise move, as no backup offering can replace the agility that comes with snapshotting or replication.
They’re also still using disk for rapid restoration and tape for long-term retention. And many are now adding cloud-based protection (disaster recovery as a service, for example) to achieve added agility.
At the end of the day, what these admins and the organizations they work for should care most about is the agility and reliability of the protection effort — regardless of the various mechanisms and media used to facilitate that protection.
We are going to have heterogeneous protection media, and we are going to have multiple data protection methods. With those realities in mind, to avoid unnecessary risk, the answer might be to have as close to a common catalog, control layer (for policy management) and console as possible. That way everyone will understand what is really going on across an environment via a single pane of glass, regardless of fragmentation behind the scenes.
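To make the “common catalog” idea concrete, here is a minimal Python sketch. The per-product record formats are hypothetical stand-ins for whatever each fragmented tool actually exports; the point is only that normalizing them into one schema enables a single-pane view:

```python
# Hypothetical export records from three separate, fragmented backup products.
vm_tool_records = [{"vm": "web01", "ts": "2016-10-01"}]
db_tool_records = [{"database": "orders", "backed_up": "2016-10-02"}]
saas_tool_records = [{"mailbox": "ceo@example.com", "when": "2016-09-30"}]

def unified_catalog():
    """Normalize each tool's records into one shared schema: (workload, protected_on)."""
    catalog = []
    catalog += [(r["vm"], r["ts"]) for r in vm_tool_records]
    catalog += [(r["database"], r["backed_up"]) for r in db_tool_records]
    catalog += [(r["mailbox"], r["when"]) for r in saas_tool_records]
    return sorted(catalog, key=lambda entry: entry[1])  # one ordering, one pane of glass

for workload, protected_on in unified_catalog():
    print(f"{protected_on}  {workload}")
```

A real common catalog would of course carry far more context (retention, location, copy type), but even this shape shows why normalization, not any single backup engine, is what makes the single pane of glass possible.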
As ESG often does, here is a short summary video of ESG’s impressions from a major industry event – Microsoft Ignite, held in Atlanta over September 26-29, 2016 – from a backup perspective.
In the video, I suggested that Microsoft is a leader in Windows data protection. Certainly, this is not to disparage the many Microsoft partners that have built whole companies and product lines around data protection. And from a revenue perspective, Microsoft’s backup offerings wouldn’t register at all. But …
Almost every version of Windows has shipped with a built-in backup utility to address the immediate, per-machine need for ad-hoc backups or file roll-back, with today’s “Previous Versions” functionality more closely resembling software-based snapshots than “backup” per se. That said, it has always been recognized that a more full-featured, multi-server solution is almost always warranted.
Of course, Microsoft has been producing the Volume Shadow Copy Service (VSS) for over a decade, which underpins how virtually every backup vendor protects data within Windows systems.
Microsoft has been shipping its for-sale backup offering, Data Protection Manager (within System Center), for over a decade. And though a good number of Hyper-V-centric environments use DPM, the greater impact is how DPM gave (and gives) Microsoft insights into how to improve VSS, thereby improving all data protection offerings in the market.
The point is – Microsoft is not new to “backup.” Backup hasn’t previously been a monetary focus, but Microsoft has consistently recognized it as intrinsic to the management story and to assuring satisfaction with Windows Server and its application server offerings. All of that may be changing, with Azure as the crown jewel of the Microsoft ecosystem and OMS as a cloud-based management stack.
Azure Backup is a key pillar of OMS, just as “backup” is a key pillar of many management strategies, alongside “provisioning,” “monitoring,” and “patching.” IT operations folks who are responsible for those latter three activities increasingly want to handle backups (as preparation for recoveries) as well … which makes Azure Backup something to watch for its own sake, and for the sake of the data protection market in 2017, as many organizations reassess their partners while embracing cloud services.
This week, the newly unencumbered Veritas (from Symantec) relaunched its premier user event – Veritas Vision. There was a palpable energy that resonated around “we’re back and ready to resume our leadership mantle,” starting with an impressive day one from main stage:
Bill Coleman (CEO) opened the event by making “information is everything” personal, tying medical data to a young girl with health struggles, giving context that would resonate with everyone — and then it got “real” when he revealed her as his niece.
Ana Pinczuk (CPO) introduced us to the journey on which Veritas wants to partner with its customers and ecosystem – from what you already rely on Veritas for, to ambitious data management and enablement – with an impressive array of announcements, almost all of which coupled Veritas’ established flagships with emerging offerings that unlock some very cool data-centric capabilities.
Mike Palmer (SVP, Data Insights), who is arguably the best voice on the Veritas stage in a very long time, delivered a brilliant session describing data and metadata through the movie Inside Out, tying the movie’s memory globes to data and the colors to metadata, combined with formative insights, pools of repositories, outcomes, etc. I’ll be hard-pressed to use any other analogy to describe metadata for a long time to come. Vendors, if you want to see how to completely nail a keynote, watch the recording of Mike’s session.
And that was just the first two hours. Along the way, the new Veritas also wanted everyone to know that they are combining two decades of storage/data protection innovation with a youthful, feisty aggressiveness against perceived legacy technologies, with EMC + Dell being the punchline of many puns and direct takedowns. That could have come across as mean or disrespectful, but was delivered with enough wit that it served to bring the room together. The competitive digs may have had arguable merit, but they clearly cast Veritas as a software-centric, data-minded contrast to hardware vendors – with a level of spunk that ought to energize its field, partners, and customer base.
As further testament to their approach for combining flagships with emerging offerings, many of the breakouts leveraged multiple, integrated Veritas products for solution-centric outcomes — which candidly is their best route to the ambitious journey that Veritas is embarking on. Gluing together their new journey through integrated solutions that are then underpinned by products (instead of jumping straight to what’s new in product X this year) will be a key thing to watch as the new Veritas continues to redefine itself. As a reminder that “Veritas” is much more than “NetBackup,” check out their current portfolio.
For further impressions on the event, check out ESG’s video coverage from the event:
Congratulations Veritas on a fresh vision (and Vision) that ought to propel you into some exciting opportunities.
I am a huge fan of the Marvel movies. Each of the individual hero movies has done an awesome job contributing to the greater, albeit fictional, universe. Each of the heroes has a unique role to play within the Avengers team. And yet, in the latest movie, released on Blu-ray today, it appears as if this colorful array of heroes is divided. They have similar goals but seemingly opposing methods that put them at odds with each other. Data protection can have similar contradictions.
The Spectrum of Data Protection activities can seem similar. We often talk about the spectrum as a holistic perspective on the myriad of data protection outcomes—and the potentially diverse tools that enable those outcomes. And yet, sometimes, the spectrum can appear opposed to itself:
Some in your organization are focused on “data management” (governance, retention, and compliance), which centers on how long you can or should retain data in a cost-effective way that unlocks the value of the data.
Others in your organization are focused on “data availability” (assured availability and BC/DR), as part of ensuring the users’ and the business’ productivity.
Do these goals actually contradict? No.
But … you have to start with the core of what is common: data protection, powered by a complementary approach of backup, snapshots, and replication. But as backup evolves to data protection, many come to a crossroads where that evolution only goes down one path or the other—data management or data availability.
We’ll have to wait until next year to see how the Avengers reconcile into a single team again—but you can’t afford to wait that long. Start with your core focus areas and then evolve toward the edges, as opposed to coming from the edges in.
The foundation of any data protection, preservation, and availability strategy is grounded in “backup,” period. Yes, a majority of organizations supplement backups with snapshots, or replicas, or archives, as shown in what ESG refers to as the Data Protection Spectrum:
And as much as some of those other colors (approaches to data protection) can add agility or flexibility to a broader data protection strategy, make no mistake that for most organizations of any and all sizes, backup still matters!
In fact, ESG wrote a brief on the relevance of backup today, within the context of how other methods supplement backups and vice versa. ESG is now making this brief publicly available, courtesy of Commvault.
In fact, Commvault believes so much in this backup-centric yet comprehensive approach to data management, protection, and recovery that they’ve invited me to speak at their Commvault GO conference in October, in a session aptly named “Why You Still Need Backup … and Beyond” (session description below):
ESG research shows that for the past five years, improving data backup and recovery has consistently been one of the IT priorities most reported by organizations. However, to evolve from traditional backup to true data management is to get smarter about all of the iterations of data throughout an infrastructure, including not only the backups/snapshots/replicas/archives from data protection, but also the primary/production data and the copies of data created for non-protection purposes, like test/dev, analytics/reporting, etc. Further, the cloud offers a new way to approach data protection, disaster recovery and some of those non-protection use cases. In this session, leading industry analyst Jason Buffington discusses the trends in data protection today and the market shifts that customers MUST understand in order to keep pace with the changing IT landscape.
My thanks to Commvault for syndicating the brief and the chance to share ESG’s perspectives on how the realm of data protection and data management is changing, and what to look for as it does. See you in Orlando!
Most organizations supplement traditional backup with some combination of snapshots, replication and archiving to achieve more comprehensive data protection. They are using what we at ESG refer to as the Spectrum of Data Protection.
Innovative data protection vendors, meanwhile, are constantly reacting to the changing IT landscape in their attempts to give their customers and prospects what they are looking for. With backup no longer enough for many organizations, data protection startups and industry dominators are stretching out and evolving in one of two tracks: data availability or data management.
Let’s take a look at these complementary yet decisively distinct branches of the data protection family tree.
A significant challenge in embracing a comprehensive data protection strategy, instead of simply a backup strategy, is the myriad methods employed. They can cause drastic over-protection or under-protection. While most data still requires backups (routine, multiversion retention over an extended period of time), the agile recovery needed for heightened availability often comes from snapshots and replicas before even attempting a restore. And those activities are complemented by application- or platform-specific availability/clustering/failover mechanisms.
The key to a successful data availability strategy then becomes a heterogeneous control plane across the multiple data protection methods, and a common catalog to mitigate over-/under-protection while unlocking all of the copies available for recovery.
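That “snapshots and replicas before even attempting a restore” ordering amounts to a tiered lookup: recover from the fastest copy type that actually exists for the data. A small Python sketch follows; the tier names and recovery-time figures are illustrative assumptions, not benchmarks:

```python
# Recovery tiers in preferred order: fastest first (illustrative RTOs in minutes).
RECOVERY_TIERS = [("snapshot", 5), ("replica", 15), ("backup", 120)]

def pick_recovery_source(available: set):
    """Return the fastest tier that actually holds a copy of the data."""
    for tier, rto_minutes in RECOVERY_TIERS:
        if tier in available:
            return tier, rto_minutes
    raise LookupError("no recoverable copy found -- the true data-loss scenario")

print(pick_recovery_source({"replica", "backup"}))
# -> ('replica', 15)
```

The common catalog described above is what fills in the `available` set: without it, the operator cannot know which copies exist, and the fallback to slower tiers (or the discovery that no copy exists) happens during the outage instead of before it.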
Data management can be seen as both the reactive and proactive result of a truly mature IT infrastructure that has evolved beyond data protection. From a reactive perspective, all copies of data created through both data protection and data availability initiatives are economically unsustainable without data management.
That’s because primary production storage and secondary/tertiary protection storage are each growing faster than IT budgets. As such, organizations need to look at how they can unlock additional business value from their sprawling data protection infrastructure by leveraging otherwise dormant copies of information for reporting, test/dev enablement and analytics. Most in the industry call this copy data management, which was pioneered by startups and is now starting to be championed by industry leaders as an evolution of their broader data protection portfolios.
The proactive side of data management encompasses data protection areas such as e-discovery and compliance, archiving and, of course, backups (essentially, the left one-third of the family tree diagram). Here, organizations embrace real archival technologies instead of just long-term backups. They combine those technologies with processes and corporate culture changes to enable information governance and regulatory compliance.
For any of this to happen (data management, data availability or even just comprehensive data protection), organizations need a framework that we at ESG refer to as “The 5 Cs of Data Protection”:
Containers: Organizations should have multiple containers (repositories) for production storage and protection storage, including tape, disk and cloud.
Conduits: Enterprises will likely have multiple conduits (data movers). They frequently include not only snapshot and replication mechanisms that are often hardware based, but also multiple backup applications for general-purpose platforms and perhaps tools specifically for databases, VMs or SaaS.
Control: Because of the presumed heterogeneity of containers and conduits, organizations should seek out a single control plane (policy engine) that can ensure adequate protection across the underlying widgets without over- or under-protection.
Catalog: This needs to be a real catalog of what you have stored across those containers, regardless of which conduits created them. While some vendors might claim a rich catalog, they merely have an index of backup jobs, perhaps with enumerated file sets, instead of something that recognizes the contextual or embedded business value of the information within the data that has been stored.
Consoles: Lastly, to make sense of the whole data protection environment, most organizations need multiple consoles, whereby different roles can get contextual insight — through vCenter plug-ins, System Center packs, or workload (e.g., Oracle/SQL) connectors — as well as a ubiquitous lens across the vastly heterogeneous arrays and backup/replication technologies for a view that enables IT operations specialists to gain insights from their catalog and drive their control plane.
You can read more about The 5 Cs of Data Protection on ESG’s blog.
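The relationships among the 5 Cs can be sketched as a tiny data model. Everything below is an illustrative structure, not any vendor’s API: copies live in containers, conduits create them, the catalog records what exists where, and the control plane compares that against policy:

```python
from dataclasses import dataclass, field

@dataclass
class Copy:
    workload: str
    container: str   # e.g., "disk", "tape", "cloud"
    conduit: str     # e.g., "backup-app", "snapshot", "replication"

@dataclass
class Catalog:
    copies: list = field(default_factory=list)

    def record(self, copy: Copy):
        self.copies.append(copy)

    def containers_for(self, workload: str) -> set:
        """Which media currently hold a copy of this workload?"""
        return {c.container for c in self.copies if c.workload == workload}

def control_plane_check(catalog: Catalog, workload: str, required: set) -> set:
    """Policy engine: which required containers are missing a copy?"""
    return required - catalog.containers_for(workload)

cat = Catalog()
cat.record(Copy("payroll-db", "disk", "snapshot"))
cat.record(Copy("payroll-db", "tape", "backup-app"))
missing = control_plane_check(cat, "payroll-db", {"disk", "tape", "cloud"})
print(missing)
# -> {'cloud'}
```

The consoles would then render what this check returns for each role; the key design point is that the control plane reasons over the catalog, not over any single conduit’s job history.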
Yes, of course, data protection has to evolve to keep up with how production platforms are evolving, but I would offer that the presumptive ‘gold standard,’ the norm for those on the front lines of proactive data protection, is evolving in at least three different directions at the same time.
Here is a 3-minute video on what we are seeing and what you should be thinking about as the evolutions continue.
As always, thanks for watching.
Announcer: The following is an ESG video blog.
Jason: Hi. I’m Jason Buffington. I’m a Principal Analyst at ESG covering all things data protection. The gold standard for data protection has evolved over the years. Way back in the day, it was disk-to-disk-to-tape, D2D2T. Meaning, we’d first try to recover from disk for rapid restoration, or we’d go to tape as a longer-term medium. There were a few folks out there that only used tape. There were even fewer folks out there that said you only needed disk. But most of us figured out we should use both as a better-together scenario.
Fast forward a couple of years and the gold standard changed. Now we’re talking about supplementing those longer-term retention tiers of backups with snapshots for even faster recovery, and replication for data survivability and agility. Again, there’s a few folks out there that will try to convince you that one might usurp the others. But most of us have figured out that it makes sense to have all of them in complement or in supplement to each other.
Fast forward a few more years and the gold standard continues to evolve. Now what we’re seeing is that backup is actually fragmenting: we’re seeing different solutions for virtualization than for databases than for unified backup. But the problem is the unified solutions don’t typically cover SaaS. So now we’re up to four different backup products.
Do you really need four different backup products? Probably not. But based on what we’re seeing in the industry, evidently there are a lot of IT operations folks and a lot of IT decision makers out there who haven’t been able to convince their V-Admin and DBA colleagues that a unified solution might be superior. And that brings up two challenges. Workload owners tend to think about 30-day, 60-day, 90-day rollback in order to keep their platforms productive, whereas a backup admin tends to think in 5-year, 10-year, 15-year retention windows. Corporate data has to be protected to a corporate standard regardless of how many different people are clicking the mouse.
The other thing to note is that one gold standard doesn’t replace the ones before it. We’re still using disk plus tape, and we’re supplementing that with cloud. We’re still supplementing backups with snapshots and replication. And we’re continuing to fragment across the virtualization and DBA tools and all the other ones we’ve talked about. And that’s gonna lead us to what ESG calls the Five Cs of data protection.
We’ve already covered the containers, meaning all those different media types you’re gonna store to. We’ve already talked about the conduits, those data movers: the backups, and the snaps, and the replicas, etc. And if you’re gonna have that much heterogeneity, you’d better have a single control plane to make sure that everything is operating in sync, mitigating that over- and under-protection. You’d better have a catalog that’s rich enough to tell you what you have, where it is, and how long you need to keep it. And you’d better have at least one console that can tell you what all is going on within the environment, making all of this actionable and insightful.
Those are the Five Cs. I’m Jason Buffington for ESG. Thanks for watching.
Last week provided a case study in how to #EpicFail at a product launch. The vendor in question took a fresh look at the market and then created a completely new offering, built on a trusted brand, but stretching in a new and intriguing direction. And then, it completely failed in its first days in market.
The product (and service) is Pokemon Go, but there are several lessons that a lot of vendors can appreciate. As a disclaimer, when I am not preaching about Data Protection or Scouting, I can often be found with a game controller in my hand. And in one of my past lives, I was a columnist for Xbox.com (archived on XboxDad.com) and was also a product manager for the Microsoft management tools used for Xbox Live, back when Call of Duty and Halo release dates would wreak havoc during week one matchmaking events.
Here are five things that the Pokemon Go folks could learn (and maybe some of us, too):
Plan for scalability to accommodate success. Especially when you are using a cloud service (which is presumably elastic, that being one of the benefits of a cloud service), there is no excuse for authentication, user-creation or user-login issues. Quite simply, people couldn’t log on fast enough – and the pokemon.com site couldn’t handle it. Any as-a-service vendor would kill to have Pokemon Go’s initial sign-up rates, but the game systems weren’t prepared and left a horrible first impression for days.
Don’t confuse folks with your branding. Ask most folks where Pokemon comes from, and they’ll answer “Nintendo.” But if you go to the AppStore, you’ll find a game from Niantic. In a world where malware is everywhere, cautious users might shy away, or at least reconsider whether this is a knock-off from someone trying to sneak in. In this case, the splash screen (after you’ve installed it and whatever dubious hidden gems are there) shows “Niantic, a Pokemon Company.” There have to be better ways to build on brand recognition and assure credibility without creating new company names.
Whatever 1.0 is, functionality-wise, it has to work (every time). It’s okay to not be feature-rich, or to hold new capabilities for version 1.1 or 2.0. But whatever functions are there have to work. Either Niantic didn’t do enough QA testing, or they didn’t run enough of a beta program, or they just don’t care — but there are a lot of folks out there who are habitually rebooting the app. Some of it is likely tied to the failure to connect to the back-end server farm, but still — it is broken as often as it is running. For anything besides a treasured game franchise, those customers would never come back or try again.
Ensure your initial users’ success if you want them to evangelize for you. Part of ensuring users’ success is telling them how your stuff should be operated. Provide a help file, some tutorial material, a few friendly videos, something! Even the most avid fans are left with blank stares in this new application, with the community coming to the vendor’s rescue with fan-made materials. A short “system requirements” list, so that folks don’t install it only to find it doesn’t work, would also be helpful.
Don’t be afraid to reinvent yourself, but respect what made you originally successful. This game is very different from anything you’ve played on a GameBoy, DS, or a console. It plays on your smartphone, uses your GPS, and expects you to get up and walk around. If you want to “Catch ‘em All” then you are going to have to get off that couch and start walking (a lot). In that regard, Pokemon Go actually deserves kudos for challenging the gamer stereotype and encouraging fitness, while extending a coveted brand and a few decades of avid fans.
In the case of Pokemon Go, there really are decades of fans out there who will begrudgingly forgive the catastrophic missteps that would have killed similar projects in the real world, and those folks will keep trying until the overwhelmed developers and system admins at Niantic figure out how to make it better. And they likely will figure it out, because (based on the huge initial attempts) they know that they have a potential hit on their hands, so they’ll (hopefully) devote the extra effort to fixing it.
To be fair, with Nintendo having seen billions in increased valuation after the launch, the company, the Pokemon empire and (eventually) the game will be fine — but the rest of us can’t count on that kind of forgiveness.
For the rest of us, your stuff has to work at launch, don’t make it hard to trust you or try your stuff, and you can’t be timid to the point that you can’t handle success (especially if it is cloud-based). If you don’t have a plan for what happens when you find success, then you very likely never will.
Though some folks mistakenly view tape as an outdated technology — thanks in no small part to the fact that disk companies keep predicting its demise — there’s no denying its continued place at the backup table. A recent Enterprise Strategy Group report indicates that 56% of organizations are still using tape. Perhaps not surprisingly, the larger the overall IT environment, the more likely that organization is to embrace tape as part of its data management strategy. There are two main reasons for this: More data heightens the importance of finding ways to store it efficiently, and, as an organization grows, it has a greater responsibility to store data for longer periods of time.
As a result, when looking at the Opex and Capex numbers, not to mention the ease of long-term data retention, tape moves to the forefront of the data management platform. When the LTO Consortium announced the release of LTO-7, with its support for the Linear Tape File System, in fall 2015, it reinforced the notion that tape technology is continuing to move forward. This three-part guide will take a closer look at what’s out there and what’s coming in terms of tape backup and tape libraries. It will also explain why archiving does not need to be complicated or expensive. In fact, when done correctly, it can actually save organizations money while meeting archival needs.
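The Opex/Capex point can be illustrated with a toy cost model. All figures below are placeholder assumptions, not quoted prices; the takeaway is only the shape of the comparison: tape carries a higher fixed cost (the library) but far cheaper media, so it wins as retained capacity grows:

```python
def retention_cost(tb_retained: float, per_tb_media: float, fixed_infra: float) -> float:
    """Toy total cost: fixed infrastructure (library/array) plus per-TB media."""
    return fixed_infra + tb_retained * per_tb_media

# Illustrative placeholder prices (USD) -- not real market figures.
def cheaper_medium(tb_retained: float) -> str:
    disk = retention_cost(tb_retained, per_tb_media=25.0, fixed_infra=0.0)
    tape = retention_cost(tb_retained, per_tb_media=5.0, fixed_infra=10_000.0)
    return "tape" if tape < disk else "disk"

print(cheaper_medium(100))   # small archive: the fixed library cost dominates
print(cheaper_medium(1000))  # at scale, tape's cheaper media wins
```

With these made-up numbers the crossover sits around 500 TB retained; real crossovers depend on actual pricing, power, and labor, but the same fixed-versus-variable dynamic is why larger environments lean harder on tape.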
In the 25-plus years that I have been in data protection, much of that time has been spent hearing about “better” alternatives to tape as a medium. Certainly, in the earlier days, tape earned its reputation for slowness and unreliability. But nothing else in IT is the same as it was twenty years ago, so why do people presume that tape hasn’t changed?
Do I believe that most recoveries should come from disk? Absolutely. But candidly, my preferred go-to would be a disk-based snapshot/replica first, and then my backup storage disks, which would presumably be deduplicated and durable.
Do I believe in cloud as a data protection medium? Definitely. But not because it is the ultimate cost-saver for cold storage. Instead, cloud-based data protection services (cloud-storage tier or BaaS) are best when you are either dealing with non-data center data (including endpoints, ROBO servers or IaaS/SaaS native data) or when you want to do more with your data than just store it (BC/DR preparedness, test/dev/analytics). Of course, ‘cloud’ isn’t actually a medium, but a consumption model for service-delivered disk or tape, but we’ll ignore that for now.
Do I believe that tape is as relevant as it’s ever been? Yes, I really do. As data storage requirements in both production and protection continue to skyrocket, retention mandates continue to lengthen, and IT teams struggle to ‘do more with less,’ there are many organizations that need to rediscover what modern tape (not the legacy stuff) really can do for their data protection and data management strategies.
Check out this video that we did in partnership with a few vendors within the LTO community:
Your organization’s broader data protection and data management strategy should almost certainly use all three mediums for what each of them is best at. Disk is a no-brainer and cloud is on everyone’s minds, but don’t forget about tape.