diskeeper_best practices for defragmenting_0

Upload: carlos-patino

Post on 07-Apr-2018

212 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/6/2019 DISKEEPER_Best Practices for Defragmenting_0

    1/8

    WHITE PAPER

    Best Practices or DeragmentingThin Provisioned Storage

  • 8/6/2019 DISKEEPER_Best Practices for Defragmenting_0

    2/8

    BEsT PRAcTIcEs foR DEfRAgmEnTIng THIn PRovIsIonED sToRAgE 1

    Why Defragment?

    Beore we cover considerations and recommended congurations in thin provisioned storage

    environments, its important to revisit why deragmentation o Windows operating systems is soimportant in a virtualized machine and/or virtualized storage environment.

    The problem is that ragmented data in a local disk le system, such as NTFS, causes the

    operating system to generate additional I/O requests. For each logical ragment in the le

    system, a separate I/O request packet (IRP) must be generated and passed on to underlying

    storage layers. So, or example, a le in 100 ragments would generate 100 separate smaller

    I/Os, rather than a single larger I/O.

    This translates to an operating system processing a great deal more unnecessaryI/O trac, thereby

    increasing CPU and memory demand. In many cases, that excess I/O is passed on to a Storage

    Area Network (SAN) and/or virtualization platorm, causingadditionalunnecessary overhead.

    In some cases, data that is in a contiguous sequence o clusters in a local disk le system will

    be physically contiguous on the actual storage media, i.e., the disk drive/array. This is generally

    a valuable added benet, but by no means required or deragmentation to greatly increase

    perormance.

    Some le systems (e.g., log-structured le system) used in SANs may intentionally ragment data

    at the block level. They may coalesce random writes rom the OS into sequential writes within

    the storage. While this will minimize I/O activity in the SAN, it actually increases the likelihood that

    the data in those sequentially written stripes is physically ragmented, because the coalescing

    process is not based on re-ordering o blocks as they map to a common le it simply dumpsthe data to the media. For these environments, youll need to check with your storage vendor

    regarding proprietary deragmentation solutions or their SAN.

    Regardless o spatial proximity, the benet o a ragment-ree local disk le system (NTFS) is that

    your OS and virtualization platorms arent processing extra I/Os generated due to ragmentation,

    and will thereore be able to host more operating systems and process more data, aster.

    Thin Provisioning 101

    Thin provisioning allocates resources

    rom an aggregate storage pool, which isessentially divided into assignable units

    commonly reerred to as chunks.

    Provisioning storage in thin environ-

    ments is done in chunks that are pulled

    rom that pool o available, and as yet

    unallocated, storage.

    We use it [Diskeeper

    perormance sotware] onour big SQL box (8-way

    processor, hundreds o

    gigs o space on a SAN,

    16 gigs o RAM) and it

    has increased our disk

    perormance by a actor

    o about 8 or 9. We were

    looking at adding more

    spindles to our SAN to help

    with some disk I/O issues

    we had, but this wonderul

    sotware did it or us. Dave Underwood,

    Senior Engineer,

    CustomScoop

    Storage pools are made up o many small storage

    units (chunks)

  • 8/6/2019 DISKEEPER_Best Practices for Defragmenting_0

    3/8

    BEsT PRAcTIcEs foR DEfRAgmEnTIng THIn PRovIsIonED sToRAgE 2

    As data is added to a thin provisioned container, such as a Dynamic/Thin virtual disk or a LUN,

    that container increments, usually in a just-in-time basis, by a chunk or number o those chunks,

    depending on how many chunks are needed to house all the incoming writes. A chunk can

    be anywhere rom a ew kilobytes to gigabytes in size, and varies rom one thin provisioning

    technology vendor to the next. In some cases it is a xed size, in other solutions the chunk size

    is user-selectable.

    How and when chunks are allocated also varies rom vendor to vendor.

    Many thin provisioning technologies provision orevery write. They monitor blocks, and specically

    changes to blocks. As new data is written, space is provisioned or it on a just-in-time basis, and

    it is stored.

    The term Thin on Thin

    reers to the use o thin

    provisioning technology

    at both the virtual platorm

    layer and the storage

    array level.

    Since Windows XP, all parts

    o a fle stream, up to and

    including the allocation size

    (the fle tail between valid

    data length and end o fle)

    can be deragmented.

    Another method to provision space is based on the Windows volume high-water mark. A

    high-water mark, with respect to a volume in this denition, is the term that describes the last

    written cluster/block o data (the highest used Logical Cluster Number, or LCN, on the volume).

    Everything beyond the high-water mark is assumed to be null.

    NTFS Write and Delete Design

    While not exactly thin riendly, NTFS is undeserved o the reputation o being a problem or thin

    provisioned disks/LUNs. It has been mistakenly stated that NTFS carelessly writes to continuingly

    new and higher LCNs, until it has written to every cluster on the volume, beore circling back

    around to clusters since reed up rom le deletes. This is not correct.

    When describing NTFS design as it relates to storage provisioning, we should rst describe the

    various le sizes. There are three sizes or les in NTFS, and they use high-water marks, too.

    The Valid Data Length (VDL) is the distance into the le that data has actually been written, as itresides in the cache. It is depicted as the blue bar in the diagram (next page). A VDL can include

    sparse runs interspersed between data. The highest written LCN that constitutes the VDL is

    the high-water mark or that le. There is no data, at least related to this le, that resides past

    the high-water mark. Without having to actually write zeroes, and just as with high-water mark

    storage volumes, reads attempted past the high-water mark return zeroes.

  • 8/6/2019 DISKEEPER_Best Practices for Defragmenting_0

    4/8

  • 8/6/2019 DISKEEPER_Best Practices for Defragmenting_0

    5/8

    BEsT PRAcTIcEs foR DEfRAgmEnTIng THIn PRovIsIonED sToRAgE 4

    A solution or this challenge, commonly known as Thin Reclamation, encompasses the awareness

    o space ormerly occupied by deleted data and actions then undertaken to recover and

    re-provision that space. There are a variety o solutions available to aid with thin reclamation, such

    as zeroing deleted clusters to the SCSI UNMAP / SCSI WRITE_SAME commands, and will vary

    rom vendor to vendor.

    Defragmentation and Thin Provisioning

    As covered earlier, deragmentation is vital to achieve and maintain peak perormance. When Thin

    Provisioning is implemented on a shared virtualization host le system, it creates a high degree o

    probability o thin/dynamic virtual disk les themselves becoming ragmented, adding additional I/O

    overhead. In those storage systems, solving ragmentation becomes even more important.

    However, or all the benets o deragmentation, it is important to be aware o potential side

    eects. The side eects rom deragmentation can vary rom one thin technology implementation

    to the next, so it is important to know how the two technologies interact.

    Using special IOCTLs (I/O controls) in Windows, deragmentation is essentially moving data to

    consolidate le ragments and to pool ree space into large contiguous extents.

    Where the provisioning technology allocates space on new writes, a deragmentation process

    (which is actually only moving data) will appear as new writes. Additionally, the ormer locations

    o moved data will not necessarily be known to be re-usable. Derag will thereore generate

    additional storage capacity requirements or every piece o data moved.

    What can occur is that the new writes are redundantly provisioned, which results in unnecessarilyconsumed space.

    Many storage solutions

    track block-level changes

    (e.g., VMFS Change

    Block Tracking CBT) to

    support advanced eatures

    like snapshots and live

    migration o data. So, i

    the fle system blocks used

    or deragmentation are

    minimized and the same

    subset o blocks are

    re-used through the derag

    process, the actual storage

    growth will likely be minimal.

  • 8/6/2019 DISKEEPER_Best Practices for Defragmenting_0

    6/8

    BEsT PRAcTIcEs foR DEfRAgmEnTIng THIn PRovIsIonED sToRAgE 5

    Thin reclamation can eectively recover the wasted space, as could executing a data deduplication

    process (which would recognize and remove redundant data).

    Where high-water mark provisioning is used, the water mark always increases and never decreases(on Windows), indicating less available space, creating a potential problem. I a le is written (or

    moved via deragmentation) to a higher cluster, the thin provisioning technology will need to provision

    space to accommodate. That is true even i the le is only moved to a high cluster temporarily.

    V-locity virtual platorm

    disk optimizer, Diskeeper

    Pro Premier, Diskeeper

    Server and Diskeeper

    EnterpriseServer editions

    use specialized engines

    (Terabyte Volume Engine

    and Titan Derag Engine

    technologies) designed

    to generally deragment/

    move fles to lower LCNs on

    Windows volumes.

    On the opposite end o the spectrum, moving les orward can allow or space reclamation

    processes tobetterrecover over-provisioned space (depicted below).

    The process o compacting les to the ront o a volume is something deragmenters can

    assist with.

  • 8/6/2019 DISKEEPER_Best Practices for Defragmenting_0

    7/8

    BEsT PRAcTIcEs foR DEfRAgmEnTIng THIn PRovIsIonED sToRAgE 6

    Proactive Fragmentation Prevention

    It is important to evaluate marketing claims rom deragmentation vendors about eliminating/

    preventing most ragmentation beore it happens, as the technology behind the marketing claimcan have diering consequences or thin provisioned storage.

    Reactive solutions that rely on aggressive ree space consolidation (packing les together) in

    order to rely on NTFSs native best t attempts will cause thin provisioned growth.

    Proactive technologies that do not require additional movement o any data in order to

    accomplish their objective do not cause increases in thin provisioned storage. They provide the

    benet o a largely ragment-ree OS le system without any negative consequences or thin

    provisioned storage.

    Patent pending IntelliWrite technology, rom Diskeeper Corporation, is such a proactive solution.

    IntelliWrite is a superior design (to NTFS native over-allocations) or reserving space at the tail o a

    les valid data. IntelliWrite is smarter in that it looks at the source o le writes/modications and

    learns their behaviors over time. This heuristic process means that IntelliWrite knows better how

    much reservation space an open le needs to prevent ragmentation. It may be the le needs

    more than NTFS would natively oer, or it may pad less. The result o IntelliWrites intelligent over-

    allocations is an unmatched degree o successul ragmentation prevention (up to 85% success

    rate and more).

    Best Practices

    Use proactive fragmentation prevention technology, such as IntelliWrite from Diskeeper

    Corporation.

    Know,fromyourvendorofchoice,howtheythinprovisionandwhatsolutionstheyhavefor

    space (thin) reclamation.

    InThin-on-Thinprovisionedenvironments,spacereclamationatonelayer(e.g.,thinvirtualdisk)

    does not necessarily address other provisioned storage on subsequent layers (e.g., LUN).

    Defragment thin provisioned volumes when the corresponding storage growth can be

    addressed (e.g., a de-duplication/thin reclamation process).

    For high-water mark provisioning, use a defragmenter that moves les to lower LCNs

    (i.e., the ront).

    UseanOS/GOSdefragmenter,oradefragmenter-modethatfocusesonperformanceand

    not a pretty display.

  • 8/6/2019 DISKEEPER_Best Practices for Defragmenting_0

    8/8

    BEsT PRAcTIcEs foR DEfRAgmEnTIng THIn PRovIsIonED sToRAgE 7

    ApplySAN/VMvendortoolstoeliminatefragmentationpertheirrecommendedpracticesfor

    their proprietary clustered le systems.

    Filesequencing/orderingtechnologiesfoundinenterpriseOSdefragmenterscanbequitevaluable in many environments, especially perormance-ocused solutions on Direct Attached

    Storage. However, they can cause thin provisioned storage technologies to grow excessively

    due to their extra movement o data, so the general recommendation is to disable them or

    run them only when the eects (i.e., storage growth) can be addressed.

    DiskeeperCorporation 7590NorthGlenoaksBoulevard BurbankCalifornia91504-1052USA

    Toll-Free 800-829-6468 Phone 818-771-1600 Fax 818-252-5514 www.diskeeper.com

    Counting over 80% o the U.S. Fortune 1000 as volume license customers, and with over two decades o innovation in system

    perormance and reliability (ocused on storage perormance), Diskeeper Corporation is a recognized expert in the storage

    perormance industry.

    2010 Diskeeper Corporation. All Rights Reserved. Diskeeper Corporation, the Diskeeper Corporation logo, IntelliWrite, Terabyte Volume Engine, Titan Derag Engine,V-locity, and Diskeeper are registered trademarks or trademarks owned by Diskeeper Corporation. All other trademarks are the property o their respective owners.