8/4/2019 Data Deduplication for Dummies 2011

Data Deduplication For Dummies
Quantum 2nd Special Edition

Compliments of Quantum

Get up to speed on the hottest topic in storage!

by Mark R. Coppock and Steve Whitner

A Reference for the Rest of Us! Free eTips at dummies.com
These materials are the copyright of Wiley Publishing, Inc. and any dissemination, distribution, or unauthorized use is strictly prohibited.
Data Deduplication For Dummies
Quantum 2nd Special Edition
by Mark R. Coppock and Steve Whitner
Data Deduplication For Dummies, Quantum 2nd Special Edition
Published by
Wiley Publishing, Inc., 111 River Street, Hoboken, NJ 07030-5774
www.wiley.com
Copyright 2011 by Wiley Publishing, Inc., Indianapolis, Indiana
Published by Wiley Publishing, Inc., Indianapolis, Indiana
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without the prior written permission of the Publisher. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ
07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.
Trademarks: Wiley, the Wiley Publishing logo, For Dummies, the Dummies Man logo, A Reference for the Rest of Us!, The Dummies Way, Dummies.com, Making Everything Easier, and related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries, and may not be used without written permission. Quantum and the Quantum logo are trademarks of Quantum Corporation. All other trademarks are the property of their respective owners. Wiley Publishing, Inc., is not associated with any product or vendor mentioned in this book.
LIMIT OF LIABILITY/DISCLAIMER OF WARRANTY: THE PUBLISHER AND THE AUTHOR MAKE NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE ACCURACY OR COMPLETENESS OF THE CONTENTS OF THIS WORK AND SPECIFICALLY DISCLAIM ALL WARRANTIES, INCLUDING WITHOUT LIMITATION WARRANTIES OF FITNESS FOR A PARTICULAR PURPOSE. NO WARRANTY MAY BE CREATED OR EXTENDED BY SALES OR PROMOTIONAL MATERIALS. THE ADVICE AND STRATEGIES CONTAINED HEREIN MAY NOT BE SUITABLE FOR EVERY SITUATION. THIS WORK IS SOLD WITH THE UNDERSTANDING THAT THE PUBLISHER IS NOT ENGAGED IN RENDERING LEGAL, ACCOUNTING, OR OTHER PROFESSIONAL SERVICES. IF PROFESSIONAL ASSISTANCE IS REQUIRED, THE SERVICES OF A COMPETENT PROFESSIONAL PERSON SHOULD BE SOUGHT. NEITHER THE PUBLISHER NOR THE AUTHOR SHALL BE LIABLE FOR DAMAGES ARISING HEREFROM. THE FACT THAT AN ORGANIZATION OR WEBSITE IS REFERRED TO IN THIS WORK AS A CITATION AND/OR A POTENTIAL SOURCE OF FURTHER INFORMATION DOES NOT MEAN THAT THE AUTHOR OR THE PUBLISHER ENDORSES THE INFORMATION THE ORGANIZATION OR WEBSITE MAY PROVIDE OR RECOMMENDATIONS IT MAY MAKE. FURTHER, READERS SHOULD BE AWARE THAT INTERNET WEBSITES LISTED IN THIS WORK MAY HAVE CHANGED OR DISAPPEARED BETWEEN WHEN THIS WORK WAS WRITTEN AND WHEN IT IS READ.
For general information on our other products and services, please contact our Business Development Department in the U.S. at 317-572-3205. For details on how to create a custom For Dummies book for your business or organization, contact [email protected]. For information about licensing the For Dummies brand for products or services, contact BrandedRights&[email protected].
ISBN: 978-1-118-03204-6
Manufactured in the United States of America
10 9 8 7 6 5 4 3 2 1
Contents
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1
How This Book Is Organized .................................................... 1
Icons Used in This Book ............................................................ 2
Chapter 1: Data Deduplication: Why Less Is More . . . . .3
Duplicate Data: Empty Calories for Storage and Backup Systems .............................................................. 3
Data Deduplication: Putting Your Data on a Diet .................. 4
Why Data Deduplication Matters ............................................. 6
Chapter 2: Data Deduplication in Detail . . . . . . . . . . . . . .7
Making the Most of the Building Blocks of Data .................... 7
Fixed-length blocks versus
variable-length data segments ................................... 8
Effect of change in deduplicated storage pools ......... 10
Sharing a Common Data Deduplication Pool ....................... 12
Data Deduplication Architectures ......................................... 13
Chapter 3: The Business Case for Data Deduplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . .15
Deduplication to the Rescue: Replication
and Disaster Recovery Protection ..................................... 16
Reducing the Overall Cost of Storing Data ........................... 18
Data Deduplication Also Works for Archiving ..................... 20
Looking at the Quantum Data Deduplication Advantage ......20
Chapter 4: Ten Frequently Asked Data Deduplication Questions (And Their Answers) . . . .23
What Does the Term Data Deduplication Really Mean? .....23
How Is Data Deduplication Applied to Replication? ............ 24
What Applications Does Data Deduplication Support? ...... 24
Is There Any Way to Tell How Much Improvement Data Deduplication Will Give Me? ...................................... 24
What Are the Real Benefits of Data Deduplication? ............ 25
What Is Variable-Block-Length Data Deduplication? ........... 25
If the Data Is Divided into Blocks, Is It Safe? ......................... 26
When Does Data Deduplication Occur during Backup? ...... 26
Does Data Deduplication Support Tape? .............................. 27
What Do Data Deduplication Solutions Cost? ...................... 28
Appendix: Quantum's Data Deduplication Product Line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
DXi4500 ........................................................................... 31
DXi6500 Family ............................................................... 31
DXi6700 ........................................................................... 31
DXi8500 ........................................................................... 32
Publisher's Acknowledgments

We're proud of this book and of the people who worked on it. For details on how to create a custom For Dummies book for your business or organization, contact [email protected]. For details on licensing the For Dummies brand for products or services, contact BrandedRights&[email protected].
Some of the people who helped bring this book to market include the following:
Acquisitions, Editorial, and Media
Development
Project Editor: Linda Morris
Editorial Managers: Jodi Jensen, Rev Mengle
Acquisitions Editor: Kyle Looper
Business Development Representative: Karen Hattan
Custom Publishing Project Specialist: Michael Sullivan
Composition Services
Project Coordinator: Kristie Rees
Layout and Graphics: Lavonne Roberts, Laura Westhuis
Proofreaders: Jessica Kramer, Lindsay Littrell
Publishing and Editorial for Technology Dummies
Richard Swadley, Vice President and Executive Group Publisher
Andy Cummings, Vice President and Publisher
Mary Bednarek, Executive Director, Acquisitions
Mary C. Corder, Editorial Director
Publishing and Editorial for Consumer Dummies
Diane Graves Steele, Vice President and Publisher, Consumer Dummies
Ensley Eikenburg, Associate Publisher, Travel
Composition Services
Debbie Stailey, Director of Composition Services
Business Development
Lisa Coleman, Director, New Market and Brand Development
Introduction
Right now, duplicate data is stealing time and money from your organization. It could be a presentation sitting in hundreds of users' network folders or a group e-mail sitting in thousands of inboxes. This redundant data makes both storage and your backup process more costly, more time-consuming, and less efficient. Data deduplication, used on Quantum's DXi-Series disk backup and replication appliances, dramatically reduces this redundant data and the costs associated with it.

Data Deduplication For Dummies, Quantum 2nd Special Edition, discusses the methods and rationale for reducing the amount of duplicate data maintained by your organization. This book is intended to provide you with the information you need to understand how data deduplication can make a meaningful impact on your organization's data management.
How This Book Is Organized

This book is arranged to guide you from the basics of data deduplication, through its details, and then to the business case for data deduplication.

Chapter 1: Data Deduplication: Why Less Is More: Provides an overview of data deduplication, including why it's needed, the basics of how it works, and why it matters to your organization.

Chapter 2: Data Deduplication in Detail: Gives a relatively technical description of how data deduplication functions, how it can be optimized, its various architectures, and what happens when it gets applied to replication.

Chapter 3: The Business Case for Data Deduplication: Provides an overview of the business costs of duplicate data, how data deduplication can be effectively applied to your current data management process, and how it can aid in backup and recovery.
Chapter 4: Ten Frequently Asked Data Deduplication Questions (And Their Answers): This chapter lists, well, frequently asked questions and their answers.
Icons Used in This Book

Here are the helpful icons you see used in this book.

The Remember icon flags information that you should pay special attention to.

The Technical Stuff icon lets you know that the accompanying text explains some technical information in detail.

A Tip icon lets you know that some practical information that can really help you is on the way.

A Warning lets you know of a potential problem that can occur if you don't take care.
Chapter 1
Data Deduplication: Why Less Is More
In This Chapter

Understanding where duplicate data comes from
Identifying duplicate data
Using data deduplication to reduce storage needs
Figuring out why data deduplication is needed
Maybe you've heard the cliché "Information is the lifeblood of an organization." But many clichés have truth behind them, and this is one such case. The organization that best manages its information is likely the most competitive.

Of course, the data that makes up an organization's information must also be well-managed and protected. As the amount and types of data an organization must manage increase exponentially, this task becomes harder and harder. Complicating matters is the simple fact that so much data is redundant.

To operate most effectively, every organization needs to reduce its duplicate data, increase the efficiency of its storage and backup systems, and reduce the overall cost of storage. Data deduplication is a powerful technology for doing just that.
Duplicate Data: Empty Calories for Storage and Backup Systems

Allowing duplicate data in your storage and backup systems is like eating whipped cream straight out of the bowl: You get
plenty of calories, but no nutrition. Take it to an extreme, and you end up overweight and undernourished. In the IT world, that means buying lots more storage than you really need.

The tricky part is that it's not really the IT team that controls how much duplicate data you have. All of your users and systems generate duplicate data, and the larger your organization and the more careful you are about backup, the bigger the impact is.

For example, say that a sales manager sends out a 10MB presentation via e-mail to 500 salespeople and each person stores the file. The presentation now takes up 5GB of your storage space. Okay, you can live with that, but look at the impact on your backup!

Because yours is a prudent organization, each user's network share is backed up nightly. So day after day, week after week, you are adding 5GB of data each day to your backup, and most of the data in those files consists of the same blocks repeated over and over and over again. Multiply this by untold numbers of other sources of duplicate data, and the impact on your storage and backup systems becomes clear. Your storage needs skyrocket, and your backup costs explode.
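To put numbers on the example above, here is a quick back-of-envelope calculation in Python. The 10MB deck and 500 recipients come straight from the text; the 30-day retention window is an assumption added purely for illustration.

```python
# Figures from the example in the text; the retention window is assumed.
attachment_mb = 10
recipients = 500
days_retained = 30

primary_copy_gb = attachment_mb * recipients / 1024     # space on user shares
backup_no_dedup_gb = primary_copy_gb * days_retained    # the same ~5GB, every night
backup_dedup_gb = attachment_mb / 1024                  # one unique copy, plus pointers

print(f"User shares hold:     {primary_copy_gb:.1f} GB")
print(f"30 nightly backups:   {backup_no_dedup_gb:.1f} GB without deduplication")
print(f"Unique data to store: {backup_dedup_gb:.2f} GB with deduplication")
```

Even ignoring the small overhead of the pointers, a month of nightly backups collapses from roughly 146GB of repeated blocks to a single 10MB copy.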
Data Deduplication: Putting Your Data on a Diet

If you want to lose weight, you either reduce your calories or increase your exercise. The same is sort of true for your data, except you can't make your storage and backup systems run laps to slim down.

Instead, you need a way to identify duplicate data and then eliminate it. Data deduplication technology provides just such a solution. Systems like Quantum's DXi products that use block-based deduplication start by segmenting a dataset into variable-length blocks and then check for duplicates. When they find a block they've seen before, instead of storing it again, they store a pointer to the original. Reading the file is simple: the sequence of pointers makes sure all the blocks are accessed in the right order.
Compared to other storage reduction methods that look for repeated whole files (single-instance storage is an example), data deduplication provides much more granularity. That means that in most cases, it dramatically reduces the amount of storage space needed.

As an example, consider the sales deck that everybody saved. Imagine that everybody put their name on the title page. A single-instance system would identify all the files as unique and save all of them. A system with data deduplication, however, can tell the difference between unique and duplicate blocks inside files and between files, and it's designed to save only one copy of the redundant data segments. That means that you use much less storage.

Data deduplication isn't a stand-alone technology: it can work with single-instance storage and conventional compression. That means data deduplication can be integrated into existing storage and backup systems to decrease storage requirements without making drastic changes to an organization's infrastructure.
A brief history of data reduction

One of the earliest approaches to data reduction was data compression, which searches for repeated strings within a single file. Different types of compression technologies exist for different types of files, but all share a common limitation: Each reduces duplicate data only within specific parts of individual files.

Next came single-instance storage, which reduces storage needs by recognizing when files are repeated. Single-instance storage is used in backup systems, for example, where a full backup is made first, and then incremental backups are made of only changed and new files. The effectiveness of single-instance storage is limited because it saves multiple copies of files that may have only minor differences.

Data deduplication is the newest technique for reducing data. Because it recognizes differences at a variable-length block basis within files and between files, data deduplication is the most efficient data reduction technique yet developed and allows for the highest savings in storage costs.
Data deduplication utilizes proven technology. Most data is already stored in non-contiguous blocks, even on a single-disk system, with pointers to where each file's blocks reside. In Windows systems, the File Allocation Table (FAT) maps the pointers. Each time a file is accessed, the FAT is referenced to read blocks in the right sequence. Data deduplication references identical blocks of data with multiple pointers, but it uses the same basic principles for reading multi-block files that you are using today.
Why Data Deduplication Matters

Increasing the data you can put on a given disk makes sense for an IT organization for lots of reasons. The obvious one is that it reduces direct costs. Although disk costs have dropped dramatically over the last decade, the increase in the amount of data being stored has more than eaten up the savings.

Just as important, however, is that data deduplication also reduces network bandwidth needs for transmitting data: when you store less data, you have to move less data, too. That opens up new protection and disaster recovery capabilities (replication of backup data, for example) which make management of data much easier.

Finally, there are major impacts on indirect costs: the amount of space required for storage, cooling requirements, and power use. Management time is also reduced, often dramatically. Quantum DXi customers in a recent survey averaged a 63 percent reduction in the amount of time they had to spend managing their backups.
Chapter 2
Data Deduplication in Detail
In This Chapter

Understanding how data deduplication works
Optimizing data deduplication
Defining the data deduplication architectures
Data deduplication is really a simple concept with very smart technology behind it: You only store a block once. If it shows up again, you store a pointer to the first one; that takes up less space than storing the whole thing again. When data deduplication is put into systems that you can actually use, however, there are several options for implementation. And before you pick an approach to use or a model to plug in, you need to look at your particular data needs to see whether data deduplication can help you. Factors to consider include the type of data, how much it changes, and what you want to do with it. So let's look at how data deduplication works.
Making the Most of the Building Blocks of Data
Basically, data deduplication segments a stream of data into variable-length blocks and writes those blocks to disk. Along the way, it creates a digital signature (like a fingerprint) for each data segment and an index of the signatures it has seen. The index, which can be recreated from the stored data segments, lets the system know when it's seeing a new block.
When data deduplication software sees a duplicate block, it inserts a pointer to the original block in the dataset's metadata (the information that describes the dataset) rather than storing the block again. If the same block shows up more than once, multiple pointers to it are created. It's a slam dunk: pointers are smaller than blocks, so you need less disk space.

Data deduplication technology clearly works best when it sees sets of data with lots of repeated segments. For most people, that's a perfect description of backup. Whether you back up everything every day (and lots of us do this) or once a week with incremental backups in between, backup jobs by their nature send the same pieces of data to a storage system over and over again. Until data deduplication, there wasn't a good alternative to storing all the duplicates. Now there is.
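The signature-index-pointer mechanism just described can be sketched in a few lines of Python. This toy class fingerprints each segment with SHA-256, stores a segment only the first time its fingerprint appears, and keeps a per-dataset pointer list as metadata. It splits on fixed 4-byte boundaries purely for brevity; Quantum's systems use variable-length segments, as the next section explains.

```python
import hashlib

class DedupStore:
    """Toy block store: keeps each unique segment once, plus an index
    of fingerprints and, per dataset, an ordered list of pointers."""

    def __init__(self):
        self.blocks = {}      # fingerprint -> segment bytes (the pool)
        self.datasets = {}    # name -> list of fingerprints (the metadata)

    def write(self, name, data, seg_size=4):
        pointers = []
        for i in range(0, len(data), seg_size):
            seg = data[i:i + seg_size]
            fp = hashlib.sha256(seg).hexdigest()   # the digital signature
            if fp not in self.blocks:              # new block: store it
                self.blocks[fp] = seg
            pointers.append(fp)                    # duplicate: pointer only
        self.datasets[name] = pointers

    def read(self, name):
        # Follow the pointer sequence to reassemble the original data.
        return b"".join(self.blocks[fp] for fp in self.datasets[name])

store = DedupStore()
store.write("mon", b"ABCDEFGHABCDEFGH")   # second half repeats the first
store.write("tue", b"ABCDEFGHABCDXYZ!")   # mostly the same as Monday
assert store.read("mon") == b"ABCDEFGHABCDEFGH"
assert store.read("tue") == b"ABCDEFGHABCDXYZ!"
print(len(store.blocks), "unique segments stored")  # 3, not 8
```

Two "backups" reference eight segments between them, but only three unique segments ever hit disk; everything else is pointers.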
Fixed-length blocks versus variable-length data segments

So why variable-length blocks? You have to think about the alternative. Remember, the trick is to find the differences between datasets that are made up mostly but not completely of the same segments. If segments are found by
A word about words

There's no science academy that forces IT writers to standardize word use; that's a good thing. But it means that different companies use different terms. In this book, we use data deduplication to mean a variable-length block approach to reducing data storage requirements, and that's the way most people use the term. But some companies use the same word to describe systems that look for duplicate data in other ways, like at a file level. If you hear the term and you're not sure how it's being used, ask.
dividing a data stream into fixed-length blocks, then changing any single block means that all the downstream blocks will look different the next time the data set is transmitted. Bottom line, you won't find very many common segments.

So instead of fixed blocks, Quantum's deduplication technology divides the data stream into variable-length data segments using a system that can find the same block boundaries in different locations and contexts. This block-creation process lets the boundaries float within the data stream so that changes in one part of the dataset have little or no impact on the blocks in other parts of the dataset. Duplicate data segments can then be found globally: at different locations inside a file, inside different files, inside files created by different applications, and inside files created at different times. Figure 2-1 shows fixed-block data deduplication.
A B C D
E F G H
Figure 2-1: Fixed-length block data in data deduplication.
The upper line shows the original blocks; the lower shows the blocks after making a single change to Block A (an insertion). The shaded sequence is identical in both lines, but all of the blocks have changed and no duplication is detected: there are eight unique blocks.
Data deduplication utilizes variable-length blocks. In Figure 2-2, Block A changes when the new data is added (it is now E), but none of the other blocks are affected. Blocks B, C, and D are all identical to the same blocks in the first line. In all, we have only five unique blocks.
A B C D

E B C D
Figure 2-2: Variable-length block data in data deduplication.
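The boundary-floating behavior behind Figures 2-1 and 2-2 can be demonstrated with a toy content-defined chunker. The single-byte bit test below is a stand-in for the rolling-hash boundary functions real products use; the point is only that boundaries follow content rather than offsets, so an insertion shifts every fixed-length block but leaves most content-defined chunks intact.

```python
def fixed_chunks(data, size=4):
    """Cut every `size` bytes: boundaries depend on byte offsets."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def content_chunks(data, mask=0x03):
    """Toy content-defined chunking: cut after any byte whose low bits
    match `mask`, so boundaries depend on the data itself."""
    chunks, start = [], 0
    for i, b in enumerate(data):
        if b & mask == mask:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks

original = b"the quick brown fox jumps over the lazy dog"
edited = b"X" + original   # a one-byte insertion at the front

# Fixed blocks: the insertion shifts every downstream boundary.
shared_fixed = set(fixed_chunks(original)) & set(fixed_chunks(edited))
# Content-defined: only the chunk containing the insertion changes.
shared_cdc = set(content_chunks(original)) & set(content_chunks(edited))

print(len(shared_fixed), "fixed blocks survive the edit")
print(len(shared_cdc), "content-defined chunks survive the edit")
assert len(shared_cdc) > len(shared_fixed)
```

After the one-byte insertion, almost every content-defined chunk is found again in the store, while the fixed-length blocks nearly all look new, which is exactly the difference between Figure 2-1 and Figure 2-2.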
Effect of change in deduplicated storage pools

When a dataset is processed for the first time by a data deduplication system, the number of duplicate data segments varies depending on the nature of the data (both file type and content). The gain can range from negligible to 50% or more in storage efficiency.

But when multiple similar datasets (like a sequence of backup images from the same volume) are written to a common deduplication pool, the benefit is very significant because each new write only increases the size of the total pool by the number of new data segments. In typical business data sets, it's common to see block-level differences between two backups of only 1% or 2%, although higher change rates are also frequently seen.

The number of new data segments in each new backup depends a little on the data type, but mostly on the rate of change between backups. And total storage requirement also depends to a very great extent on your retention policies: the number of backup jobs and the length of time they are held on disk. The relationship between the amount of data sent to the deduplication system and the disk capacity actually used to store it is referred to as the deduplication ratio.
Figure 2-3 shows the formula used to derive the data deduplication ratio, and Figure 2-4 shows the ratio for different backup datasets with different change rates (compression also figures in, so the figure also shows different compression effects). These charts assume full backups, but deduplication also works when incremental backups are included. As it turns out, though, the total amount of data stored in the deduplication appliance may well be the same for either method because the storage pool only stores new blocks under either system. The deduplication ratio differs, though, because the amount of data sent to the system is much greater in a daily full model. So the storage advantage is greater for full backups even if the amount of data stored is the same.
Data deduplication ratio = Total data before reduction / Total data after reduction
Figure 2-3: Deduplication ratio formula.
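Plugging numbers into the Figure 2-3 formula reproduces the "Data set 2" curve from Figure 2-4 (2:1 compressibility, 1 percent daily change, 20:1 after 11 backup events). The sketch below assumes, for illustration, a 1TB nightly full backup in which only the changed blocks are new after the first night:

```python
full_backup_tb = 1.0      # assumed size of each nightly full backup
change_rate = 0.01        # 1% of blocks change between backups (Figure 2-4)
compression = 2.0         # 2:1 compressibility (Figure 2-4)
events = 11               # backup events to reach 20:1 (Figure 2-4)

total_before = full_backup_tb * events                       # data sent to the system
unique = full_backup_tb * (1 + change_rate * (events - 1))   # first full + daily new blocks
total_after = unique / compression                           # disk actually consumed

ratio = total_before / total_after
print(f"Sent {total_before:.0f} TB, stored {total_after:.2f} TB -> {ratio:.0f}:1")
# -> Sent 11 TB, stored 0.55 TB -> 20:1
```

The same arithmetic matches the other panel too: with zero change and 5:1 compression, four events give 4 x 5 = 20:1.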
It makes sense that data deduplication has the most powerful effect when it is used for backup data sets with low or modest change rates, but even for data sets with high rates of change, the advantage can be significant.

To help you select the right deduplication appliance, Quantum uses a sizing calculator that models the growth of backup datasets based on the amount of data to be protected, the backup methodology, type of data, overall compressibility, rates of growth and change, and the length of time the data is to be retained. The sizing calculator helps you understand where data deduplication has the most advantage and where more conventional disk or tape backup systems provide more appropriate functionality.
[Figure 2-4 consists of two charts plotting Cumulative Protected TB, Cumulative Unique TB stored, and De-Dup Ratio across successive backup days. Backups for Data set 1: compressibility = 5:1, data change = 0%, events to reach 20:1 ratio = 4 (Day 1 through Day 4). Backups for Data set 2: compressibility = 2:1, data change = 1%, events to reach 20:1 ratio = 11 (Day 1 through Day 11).]

Figure 2-4: Effects of data change on deduplication ratios.
Contact your Quantum representative to participate in a deduplication sizing exercise.
Sharing a Common Data Deduplication Pool

Several data deduplication systems allow multiple streams of data from different servers and different applications to be sent into a common deduplication pool (also called a blockpool); that way, common blocks between different datasets can be deduplicated on a global basis. Quantum's DXi-Series appliances are an example of such systems.
DXi-Series systems offer different connection personalities depending on the model and configuration, including NAS volumes (CIFS or NFS) and virtual tape libraries (VTLs). The series even supports Symantec's specific Logical Storage Unit (LSU) presentation, which is part of the OpenStorage Initiative (OST). Because all the presentations offered in the same unit access a common blockpool, redundant blocks are eliminated across all the datasets written to the appliance: global deduplication. This means that a DXi-Series appliance recognizes and deduplicates the same data segments on a print and file server coming in through one backup job and on an e-mail server backed up on a different server. Figure 2-5 demonstrates a sharing pool utilizing DXi-Series appliances.
DXi-Series Appliance Storage Pool
Sharing storage pool in DXi-Series appliances: All the datasets written to the DXi appliance share a common, deduplicated storage pool irrespective of what presentation, interface, or application is used during ingest. One DXi-Series appliance can support multiple backup applications at the same time.

Source 1
Source 2
Source 3
Figure 2-5: Sharing a global deduplication storage pool.
Data Deduplication Architectures

Data deduplication, like compression or encryption, introduces computational overhead, so the choice of where and how deduplication is carried out can affect backup performance. The
most common approach today is to carry out deduplication at the destination end of backup, but deduplication can also occur at the source (that is, at the server where the backup data is initially processed by the backup software, or even at the host server where an application is backed up initially).

Wherever the data deduplication is carried out, just as with compression or encryption, you get the fastest performance from purpose-built systems optimized for the process. If deduplication is carried out by backup software agents running on general-purpose servers, it's usually slower, you have to manage agents on all the servers, and deduplication can compete with and slow down primary applications. It can also be complex to deploy or change.
The data deduplication approach with the highest performance and ease of implementation is generally one that is carried out on specialized hardware systems at the destination end of the backup. Backup is faster and deduplication can work with any backup software, so it's easier to deploy and to change down the road.
Deduplication appliances have been around for three or four years, and as vendors create later-generation products, the development teams are getting smarter about how to get the most performance and data reduction out of a system. Quantum's latest generation of products, for example, use different kinds of storage inside the appliances to store the data used for specific, often repeated operations. Looking up and checking signatures happens all the time and is a pretty intensive operation, so that data is held on solid-state disks or on small, fast, conventional disk drives with a high-bandwidth connection. Since both have very fast seek times, the performance of the whole system is increased significantly. One recent new product more than tripled the performance of the model it replaced. Is there room for even more improvement? The engineers seem to think so, so keep an eye out.
Chapter 3
The Business Case for Data Deduplication
In This Chapter
Looking at the business value of deduplication
Finding out why applying the technology to replication and disaster recovery is key
Identifying the cost of storing duplicate data
Looking at the Quantum data deduplication advantage
As with all IT investments, data deduplication must make business sense to merit adoption. At one level, the value is pretty easy to establish. Adding disk to your backup strategy can provide faster backup and restore performance, as well as give you RAID levels of fault tolerance. But with conventional storage technology, the amount of disk people need for backup just costs too much. Data deduplication solves that problem for many users by letting them reduce the amount of disk they need to hold their backup data by 90 percent or more, which translates into immediate savings.
Conventional disk backup has a second limitation that some users think is even more important: disaster recovery (DR) protection. Can data deduplication help there? Absolutely! The key is using the technology to power remote replication, and the outcome provides another compelling set of business advantages.
Deduplication to the Rescue: Replication and Disaster Recovery Protection
The minimum disaster recovery (DR) protection you need is to make backup data safe from site damage and other natural or man-made disasters. After all, equipment and applications can be replaced, but digital assets may be irreplaceable. And no matter how many layers of redundancy a system has, when all copies of anything are stored on a single hardware system, they are vulnerable to fires, floods, or other site damage.
For most users, removable media provides all or most of their site-loss protection. And it's one of the big reasons that disk backup isn't used more: When backup data is on disk, it just sits there. You have to do something else to get DR protection. People talk about replicating backup data over networks, but almost nobody actually does it: Backup sets are too big and network bandwidth is too limited.
Data deduplication changes all that by finally making remote replication of backup practical and smart. How does it work? Just as you store only the new blocks in each backup, you have to replicate only the new blocks. Suppose 1 percent of a 500GB backup has changed since the previous backup. That means you have to move only 5GB of data to keep the two systems synchronized, and you can move that data in the background over several hours. That means you can use a standard WAN to replicate backup sets.
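To see why a standard WAN suffices, here's the arithmetic from the paragraph above as a quick sketch. The link speed and protocol-overhead figures are illustrative assumptions, not from the text:

```python
# Illustrative replication arithmetic; link speed and overhead are assumed.
backup_size_gb = 500        # nightly backup image
change_rate = 0.01          # 1% of blocks are new since the last backup

new_data_gb = backup_size_gb * change_rate   # only new blocks cross the WAN

# A modest 10 Mbit/s WAN link, ~80% usable after protocol overhead (assumed).
link_mbit_per_s = 10
usable_gb_per_hour = link_mbit_per_s * 0.8 * 3600 / 8 / 1024

hours_to_replicate = new_data_gb / usable_gb_per_hour

print(f"Data to move: {new_data_gb:.0f} GB")
print(f"Background transfer time: {hours_to_replicate:.1f} hours")
```

Even on this small assumed link, the nightly delta moves in well under a working day, which is what makes background replication over an existing WAN practical.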
For disaster recovery, that means you can have an off-site replica image of all your backup data every day, and you can reduce the amount of removable media you handle. That's especially nice when you have smaller sites that don't have IT staff. Less removable media can mean lower costs and less risk. Daily replication means better protection. It's a win-win situation.
How do you get them synched up in the first place? The first replication event may take longer, or you can co-locate devices and move data the first time over a faster network, or you can put backup data at the source site on tape and copy it locally onto the target system. After that first sync-up is finished, the replication needs to move only the new blocks.
What about tape? Do you still need it? Disk-based deduplication and replication can reduce the amount of tape you use, but most IT departments combine the technologies, using tape for longer-term retention. This approach makes sense for most users. If you want to keep data for six months or three years or seven years, tape provides the right economics and portability, and the new encryption capabilities that tape drives offer now make securing the data that goes off site on tape easy.
The best solution providers will help you get the right balance, and at least one of them, Quantum, lets you manage the disk and tape systems from a single management console, and it supports all your backup systems with the same service team.
The asynchronous replication method employed by Quantum in its DXi-Series disk backup and replication solutions can give users extra bandwidth leverage. Before any blocks are replicated to a target, the source system sends a list of blocks it wants to replicate. The target checks this list of candidate blocks against the blocks it already has, and then it tells the source what it needs to send. So if the same blocks exist in two different offices, they have to be replicated to the target only one time.
Figure 3-1 shows how the deduplication process works on replication over a WAN.
[Figure: Step 1: The source sends a list of elements to replicate to the target; the target returns the list of blocks not already stored there. Step 2: Only the missing data blocks are replicated and moved over the WAN.]
Figure 3-1: Verifying data segments prior to transmission.
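The two-step exchange in Figure 3-1 can be sketched in a few lines of Python. This is an illustrative model only, not Quantum's implementation; the `fingerprint` function and the in-memory dictionaries stand in for the appliance's signature index:

```python
import hashlib

def fingerprint(block: bytes) -> str:
    # Real appliances compute a signature per block; SHA-256 stands in here.
    return hashlib.sha256(block).hexdigest()

def replicate(source_blocks, target_store):
    """Sketch of the two-step exchange in Figure 3-1 (simplified).

    Step 1: the source offers a list of candidate fingerprints; the target
    answers with the subset it does not already hold.
    Step 2: only those missing blocks travel over the WAN.
    """
    offered = {fingerprint(b): b for b in source_blocks}
    missing = [fp for fp in offered if fp not in target_store]  # step 1
    for fp in missing:                                          # step 2
        target_store[fp] = offered[fp]
    return missing

target = {}
sent_first = replicate([b"A", b"B", b"C", b"D"], target)   # first office sends all 4
sent_second = replicate([b"A", b"B", b"D", b"C"], target)  # second office sends none
print(len(sent_first), len(sent_second))
```

Note how the second office transmits nothing: its blocks already exist on the target, which is the cross-site bandwidth saving the text describes.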
Because many organizations use public data exchanges to supply WAN services between distributed sites, and because data transmitted between sites can take multiple paths from source to target, deduplication appliances should offer encryption capabilities to ensure the security of data transmissions.
In the case of DXi-Series appliances, all replicated data, both metadata and actual blocks of data, can be encrypted at the source using AES 128-bit encryption and decrypted at the target appliance.
Reducing the Overall Cost of Storing Data
Storing redundant backup data brings with it a number of costs, from hard costs such as storage hardware to operational costs such as the labor to manage removable backup media and off-site storage and retrieval fees. Data deduplication offers a number of opportunities for organizations to improve the effectiveness of their backup and to reduce overall data protection costs.
These include the opportunity to reduce hardware acquisition costs, but even more important for many IT organizations is the combination of all the costs that go into backup. They include ongoing service costs, costs of removable media, the time spent managing backup at different locations, and the potential lost-opportunity or liability costs if critical data becomes unavailable.
The situation is also made more complex by the fact that in the backup world, there are several kinds of technology, and different situations often call for different combinations of them. If data is changing rapidly, for example, or only needs to be retained for a few days, the best option may be conventional disk backup. If it needs to be retained for longer periods (six months, a year, or more), traditional tape-based systems may make more sense. For many organizations, the need is likely to be different for different kinds of data.
Combining disk-based backup, deduplication, replication, and tape in an optimal way can provide very significant savings when users look at their total data-protection costs. A recent analysis at a major software supplier showed how the supplier could add deduplication and replication to its backup mix and save more than $1,000,000 over a five-year period, reducing overall costs by about one-third. Where were the savings? In reduced media usage, lower power and cooling, and savings on license and service costs. The key was data deduplication and combining it with traditional tape in an optimal way. If the supplier tried the same approach using conventional disk technology, it would have increased costs, both because of higher acquisition expenses and much higher requirements for space, power, and cooling. (See Figure 3-2.)
Figure 3-2: Conventional disk technology (1PB in 10 racks) versus Quantum's DXi-Series appliance (28:1 deduplication, 1PB in 20U).
The key to finding the best answer is looking clearly at all the alternatives and finding the best way to combine them. A supplier like Quantum that can provide and support all the different options is likely to give users a wider range of solutions than a company that offers only one kind of technology, and such suppliers have teams of people that can help IT departments look at the alternatives in an objective way.
Work with Quantum and the company's sizing calculator to help identify the right combination of technologies for the optimal backup solution, both in the short term and the long term. See Chapter 2 for more on the sizing calculator.
Data Deduplication Also Works for Archiving
We've talked about the power of data deduplication in the context of backup because that application includes so much redundant data. But data deduplication can also have very significant benefits for archiving and nearline storage applications that are designed to handle very large volumes of data. By boosting the effective capacity of disk storage, data deduplication can give these applications a practical way of increasing their use of disk-based resources cost-effectively. Storage solutions that use Quantum's patented data deduplication technology work effectively with standard archiving storage applications as well as with backup packages, and the company has integrated the technology into its own StorNext data management software. Combining high-speed data sharing with cost-effective content retention, StorNext helps customers consolidate storage resources so that workflow operations run faster and the storage of digital business assets costs less. With StorNext, data sharing and retention are combined in a single solution that now also includes data deduplication to provide even greater levels of value across all disk storage tiers.
Looking at the Quantum Data Deduplication Advantage
The DXi-Series disk backup and replication systems use Quantum's data deduplication technology to reduce the amount of disk users need to store backup data by 90 percent or more. And they make automated replication of backup data over WANs a practical tool for DR protection. All DXi-Series systems share a common replication methodology, so users can connect distributed and midrange sites with enterprise data centers. The result is a cost-effective way for IT departments to store more backup data on disk, to provide high-speed, reliable restores, to increase DR protection, to centralize backup operations, and to reduce media management costs.
Quantum deduplication products cover a broad range of sizes, from compact units for small businesses and remote offices, to midrange appliances, to enterprise systems that can hold 4 petabytes of backup data. All systems include deduplication and replication functionality in their base price, and the larger systems include software for creating tapes directly.
The DXi-Series works with all leading backup software, including Symantec's OpenStorage API, to provide end-to-end support that spans multiple sites and integrates with tape backup systems to make integrating deduplication technology into existing backup architecture easy for users. DXi-Series appliances are part of a comprehensive set of backup solutions from Quantum, the leading global specialist in backup, recovery, and archive. Whether the solution is disk with deduplication and replication, conventional disk, tape, or a combination of technologies, Quantum offers advanced technology, proven products, centralized management, and expert professional services offerings for all your backup and archive systems.
The results that Quantum DXi customers report show the kind of direct business benefits that adding deduplication technology can have on IT departments. In a recent survey, IT departments that added DXi to their backup systems reported that:
Average backup performance more than doubled (up 125 percent), while time for restores was reduced to a few minutes for most files.
Failed backup jobs were reduced by 87 percent.
Even though users still deployed tape for long-term retention and regulatory compliance, removable-media purchase costs were reduced by an average 48 percent, and media retrieval costs were reduced by 97 percent.
Overall, the amount of time people spent managing their backup and restore processes was reduced by an average 63 percent. For environments that deployed deduplication-based replication for DR, overall savings were higher. Dollar savings varied, but it was common for IT departments to reduce costs enough that they could pay for their deployments in roughly a year.
Chapter 4
Ten Frequently Asked Data Deduplication Questions (And Their Answers)

In This Chapter
Figuring out what data deduplication really means
Discovering the advantages of data deduplication
In this chapter, we answer the ten questions most often asked about data deduplication.

What Does the Term Data Deduplication Really Mean?

There's really no industry-standard definition yet, but there are some things that everyone agrees on. For example, everybody agrees that it's a system for eliminating the need to store redundant data, and most people limit it to systems that look for duplicate data at a block level, not a file level. Imagine 20 copies of a presentation that have different title pages: To a file-level data-reduction system, they look like 20 completely different files. Block-level approaches see the commonality between them and use much less storage.
The most powerful data deduplication uses a variable-length block approach. A product using this approach looks at a sequence of data, segments it into variable-length blocks, and, when it sees a repeated block, stores a pointer to the original instead of storing the block again. Because the pointer takes up less space than the block, you save space. In backup, where the same blocks show up again and again, users typically reduce disk needs by 90 percent or more.
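A minimal sketch of the pointer idea, assuming a SHA-256 signature per block (real products use their own signatures and variable-length segmentation, so treat this as illustrative only):

```python
import hashlib

class DedupStore:
    """Toy block-level deduplication store (illustrative only)."""
    def __init__(self):
        self.blocks = {}   # signature -> block data, each stored once
        self.backups = []  # each backup is just a list of signatures (pointers)

    def backup(self, blocks):
        recipe = []
        for b in blocks:
            sig = hashlib.sha256(b).hexdigest()
            self.blocks.setdefault(sig, b)  # store a block only the first time
            recipe.append(sig)              # otherwise keep just a pointer
        self.backups.append(recipe)

    def restore(self, i):
        # Follow the pointers to rebuild the original block sequence.
        return [self.blocks[sig] for sig in self.backups[i]]

store = DedupStore()
image = [b"block-%d" % n for n in range(100)]
for _ in range(10):              # ten nightly backups of identical data
    store.backup(image)

logical = 10 * len(image)        # blocks a conventional store would keep
physical = len(store.blocks)     # unique blocks actually stored
print(f"reduction: {1 - physical / logical:.0%}")
```

Ten identical backups store only one copy of each block, which is where the "90 percent or more" reduction in the text comes from; restores still return the full original sequence.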
How Is Data Deduplication Applied to Replication?
Replication is the process of sending duplicate data from a source to a target. Typically, a relatively high-performance network is required to replicate large amounts of backup data. But with deduplication, the source system (the one sending data) looks for duplicate blocks in the replication stream. Blocks already transmitted to the target system don't need to be transmitted again. The system simply sends a pointer, which is much smaller than the block of data and requires much less bandwidth.
What Applications Does Data Deduplication Support?
When used for backup, data deduplication supports all applications and all qualified backup packages. Certain file types (some rich media files, for example) don't see much advantage the first time they are sent through deduplication because the applications that wrote the files already eliminated redundancy. But if those files are backed up multiple times or backed up after small changes are made, deduplication can create very powerful capacity advantages.
Is There Any Way to Tell How Much Improvement Data Deduplication Will Give Me?
Four primary variables affect how much improvement you will realize from data deduplication:
How much your data changes (that is, how many new blocks get introduced)
How well your data compresses using conventional compression techniques
How your backup methodology is designed (that is, full versus incremental or differential)
How long you plan to retain the backup data
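As a rough illustration of how those variables interact, here is a back-of-envelope model. It is our own simplification, not Quantum's sizing calculator, and the inputs are made-up examples:

```python
def estimated_dedup_ratio(num_backups, change_rate, compression=2.0):
    """Rough model: the first full backup is stored once; each later backup
    adds only its changed blocks; conventional compression applies on top.
    All figures are in units of one full backup."""
    logical = num_backups
    physical = 1 + (num_backups - 1) * change_rate
    return logical / physical * compression

# Example: 20 full backups retained, 5% change between backups, 2:1 compression
ratio = estimated_dedup_ratio(20, 0.05)
print(f"~{ratio:.0f}:1")
```

The model captures the qualitative behavior the list describes: longer retention and lower change rates push the ratio up, while data that barely compresses or changes heavily pushes it down.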
Quantum offers sizing calculators to estimate the effect that data deduplication will have on your business. Pre-sales systems engineers can walk you through the process and show you what kind of benefit you will see.
What Are the Real Benefits of Data Deduplication?
There are two main benefits of data deduplication. First, data deduplication technology lets you keep more backup data on disk than with any conventional disk backup system, which means that you can restore more data faster. Second, it makes it practical to use standard WANs and replication for disaster recovery (DR) protection, which means that users can provide DR protection while reducing the amount of removable media (that's tape) handling that they do.
What Is Variable-Block-Length Data Deduplication?
It's easiest to think of the alternative to variable-length, which is fixed-length. If you divided a stream of data into fixed-length segments, every time something changed at one point, all the blocks downstream would also change. The system of variable-length blocks that Quantum uses allows some of the segments to stretch or shrink, while leaving downstream blocks unchanged. This increases the ability of the system to find duplicate data segments, so it saves significantly more space.
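The boundary-shift effect is easy to demonstrate. The toy chunker below cuts wherever a byte matches a bit pattern, a crude stand-in for the rolling-hash methods real products use. After inserting one byte at the front of a stream, every fixed-size chunk changes, while the content-defined chunks re-synchronize:

```python
def fixed_chunks(data, size=8):
    # Fixed-length segmentation: boundaries at fixed byte offsets.
    return [data[i:i + size] for i in range(0, len(data), size)]

def variable_chunks(data, mask=0x0F):
    """Toy content-defined chunking: cut after any byte whose low bits match
    a pattern, so boundaries follow the content, not byte offsets.
    (Real systems use a rolling hash such as Rabin fingerprinting.)"""
    chunks, start = [], 0
    for i, byte in enumerate(data):
        if byte & mask == mask:        # boundary condition met
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks

original = bytes(range(64)) * 4
shifted = b"X" + original              # insert one byte at the front

# Fixed-size blocks: the insertion shifts every downstream boundary,
# so no chunk of the shifted stream matches a chunk of the original.
fixed_common = set(fixed_chunks(original)) & set(fixed_chunks(shifted))

# Variable blocks: boundaries re-synchronize after the insertion,
# so the downstream chunks still match and can be deduplicated.
var_common = set(variable_chunks(original)) & set(variable_chunks(shifted))

print(len(fixed_common), len(var_common))
```

With fixed blocks the two nearly identical streams share nothing; with content-defined blocks they share every unique downstream chunk, which is exactly why variable-length deduplication finds so much more duplicate data.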
If the Data Is Divided into Blocks, Is It Safe?
The technology for using pointers to reference a sequence of data segments has been standard in the industry for decades: You use it every day, and it is safe. Whenever a large file is written to disk, it is stored in blocks on different disk sectors in an order determined by space availability. When you read a file, you are really reading pointers in the file's metadata that reference the various sectors in the right order. Block-based data deduplication applies a similar kind of technology, but it allows a single block to be referenced by multiple sets of metadata.
When Does Data Deduplication Occur during Backup?

There are really three choices. You can send all your backup data to a backup target and perform deduplication there (usually called target-based deduplication), you can perform the deduplication on each protected host, or you can use a central media server to carry out the deduplication. All three systems are available and have advantages.
If you deduplicate on the host during backup, you send less data over your backup connection, but you have to manage software on all the protected hosts, backup slows down because deduplication adds overhead, and you're using a general-purpose server, which can slow down other applications.
If deduplication is carried out in the backup application on the media server, you don't have to buy a special-purpose target deduplication device, but support is limited to one application, all the overhead of the deduplication is added to the server's other duties, and deduplication systems that provide good reduction require significant processing.
So users deploying server-based deduplication report slower backup, limited scalability, and requirements to upgrade their disk storage and buy more, heavier-duty servers.
If you use a target deduplication appliance, you send all the data to the device and deduplicate it there. You have to buy an appliance, but in most cases, the appliance is designed just for deduplication. This means the backup and restore performance stays high and deduplication doesn't slow down other backups or require that you beef up your backup servers.
Does Data Deduplication Support Tape?
Yes and no. Data deduplication needs random access to data blocks for both writing and reading, so it must be implemented in a disk-based system. But tape can easily be written from a deduplication data store, and, in fact, that is the typical practice. Most deduplication customers keep a few weeks or months of backup data on disk, and then use tape for longer-term storage. Quantum makes that easy by providing a direct disk-to-tape connection in its larger deduplication appliances so you can create tapes directly without sending the data back through a backup server. Supported applications include many of the leading backup software packages, including Symantec's OpenStorage API (OST).
An important point: When you create a tape from data in a deduplicated data pool, most vendors re-expand the data and apply normal compression. That way, files can be read directly in a tape drive and do not have to be staged back to a disk system first. That is important because you want to be able to read those tapes directly in case of an emergency restore. A few suppliers write deduplicated data blocks to tape to save space, but there is a big downside: You'll have to write any data back to disk before you can restore it, so for a restore of a significant size, or one that involves files of different ages, you might have to have a lot of free disk space available. Most users find that being able to read data directly from tape is a much better solution.
What Do Data Deduplication Solutions Cost?
Costs can vary a lot, but seeing list prices in the range of 30 to 75 cents per GB of stored, deduplicated data is common. A good rule-of-thumb rate for deduplication is 20:1, meaning that you can store 20 times more data than conventional disk. Using that figure, a system that could retain 40TB of backup data would have a list price of $12,500, or 31 cents a GB. So even at the manufacturer's suggested list (and discounts are normally available), deduplication appliance costs are a lot lower than if you protected the same data using conventional disk. Even more important, customers commonly report that they save enough money from switching to a dedupe appliance to pay for their system in about a year.
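The rule-of-thumb numbers above check out with simple arithmetic (the prices are the text's illustrative figures, not current quotes):

```python
# Checking the cost rule of thumb from the text.
retained_tb = 40                 # logical backup data retained on the appliance
dedup_ratio = 20                 # 20:1 rule of thumb
list_price = 12_500              # example system list price, in dollars

physical_tb = retained_tb / dedup_ratio          # disk actually purchased
cost_per_gb = list_price / (retained_tb * 1000)  # dollars per GB of retained data

print(f"{physical_tb:.0f} TB of physical disk, ${cost_per_gb:.2f}/GB")
```

At 20:1, 40TB of retained backups fits on 2TB of physical disk, and $12,500 divided across 40,000 retained gigabytes works out to the quoted 31 cents per GB.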
Appendix
Quantum's Data Deduplication Product Line
In This Appendix
Reviewing the Quantum DXi-Series disk backup and remote replication solutions
Identifying the features and benefits of the DXi-Series
Quantum Corp. is the leading global storage company specializing in backup, recovery, and archive. Combining focused expertise, customer-driven innovation, and platform independence, Quantum provides a comprehensive range of disk, tape, and software solutions supported by a world-class sales and service organization. As a long-standing and trusted partner, the company works closely with a broad network of resellers, original equipment manufacturers (OEMs), and other suppliers to meet customers' evolving data protection needs.
Quantum's DXi-Series disk backup solutions leverage patented data deduplication technology to reduce the disk needed for backup by 90 percent or more and make remote replication of data between sites over existing wide area networks (WANs) a practical and cost-effective DR technique. Figure A-1 shows how DXi-Series replication uses existing WANs for DR protection, linking backup data across sites and reducing or eliminating media handling.
Figure A-1: DXi-Series replication. [A DXi8500 at the central data center, alongside a Scalar i500 tape library, receives replicated backup data over existing WANs from remote office A (DXi4500), remote office B (DXi6500), and remote office C (DXi4500), providing automated DR protection and centralized media management; cross-site deduplication prior to data transmission adds further bandwidth savings.]
The DXi Series spans the widest range of backup capacity points in the industry. Some of the features and benefits of Quantum's DXi Series include:
Patented data deduplication technology that reduces disk requirements by 90 percent or more
A broad solution set of turnkey appliances for small and medium business, distributed and midrange sites, and scalable systems for the enterprise
High backup performance that provides enterprise-scale protection, even for tight backup windows
Software licenses that are included in the base price to maximize value and streamline deployment
Quantum's data deduplication also dramatically reduces the bandwidth needed to replicate backup data between sites for automated disaster recovery protection.
All models share a common software layer, including deduplication and remote replication, allowing IT departments to connect all their sites in a comprehensive data protection strategy that boosts backup performance, reduces or eliminates media handling, and centralizes disaster recovery operations. Support includes the Symantec OpenStorage API (OST) for both disk and tape on DXi4500, DXi6500, and DXi8500 models.
The following sections offer more details about the individualDXi systems.
DXi4500

The DXi4500 disk appliances with deduplication make it easy and affordable to increase backup performance, improve restores, and reduce data protection costs. Quantum's deduplication technology provides disk performance for your backups while it reduces typical capacity needs. Backups can be economically retained on disk for instant restores, simplified management, and reduced use of removable media. DXi4500 units are designed for rapid, seamless integration and maximum client performance without changes to existing backup architectures or potentially disruptive media server upgrades, unlike software-based deduplication. Support for remote replication, the Symantec OpenStorage (OST) interface, and virtual environments are standard features.
DXi6500 Family

The DXi6500 is a family of pre-configured disk backup appliances that provides simple and affordable solutions for user backup problems. They provide disk-to-disk backup and restore performance with all leading backup applications using a simple NAS interface, and they leverage deduplication technology to reduce typical capacity requirements. For DR protection, the DXi6500 models replicate encrypted backup data between sites using global deduplication to reduce typical network bandwidth needs by a factor of 20 or more.
DXi6700

The DXi6700 is a high-performance disk backup appliance for Fibre Channel environments that provides a simple and affordable solution for backup problems using a proven VTL interface. The deduplication technology of the DXi6700 reduces typical capacity requirements by 90 percent or more so systems stop filling up, and it scales easily without a service visit, providing effective investment protection. For DR protection, the DXi6700 replicates encrypted backup data between sites to reduce typical network bandwidth needs by a factor of 20 or more. For long-term retention, the DXi6700 is designed to provide direct tape creation in conjunction with leading backup applications.
DXi8500

The DXi8500 is a high-performance deduplication solution with the power and flexibility to anchor an enterprise-wide backup, disaster recovery, and data protection strategy. The DXi8500 offers industry-leading performance and advanced deduplication technology that reduces typical disk and bandwidth requirements by 90 percent or more. The DXi8500 presents a wide range of interface choices. Featuring an automated, direct path to tape for both VTL and OST presentations, the DXi8500 integrates short-term protection and long-term retention requirements.
What are the true costs in storage space, cooling requirements, and power use for all your redundant data? Redundant data increases disk needs and makes backup and replication more costly and more time-consuming. By using data deduplication techniques and technologies from Quantum, you can dramatically reduce disk requirements and media management overhead while increasing your DR options.
Find listings of all our books
Choose from many different subject categories
Sign up for eTips at etips.dummies.com

Explanations in plain English
"Get in, get out" information
Icons and other navigational aids
Top ten lists
A dash of humor and fun
Use replication to automate disaster recovery across sites!

Make a meaningful impact on your data protection and retention:
Eliminate duplicate data
Reduce disk requirements
Lower network bandwidth requirements