Solving the IO Bottleneck in NextGen DataCenters & Cloud Computing

IMEX RESEARCH.COM © 2010-11 IMEX Research, Copying prohibited. All rights reserved. Are SSDs Ready for Enterprise Storage Systems. Anil Vasudeva, President & Chief Analyst, IMEX Research. Solving the IO Bottleneck in NextGen DataCenters & Cloud Computing. Anil Vasudeva, President & Chief Analyst, [email protected], 408-268-0800. © 2007-11 IMEX Research. All Rights Reserved. Copying Prohibited. Please write to IMEX for authorization.

Post on 13-Feb-2021

TRANSCRIPT

  • IMEXRESEARCH.COM

    © 2010‐11  IMEX Research, Copying prohibited. All rights reserved.

    Are SSDs Ready for Enterprise Storage Systems

    Anil Vasudeva, President & Chief Analyst, IMEX Research

    Solving the IO Bottleneck in NextGen DataCenters & Cloud Computing

    Anil Vasudeva, President & Chief Analyst, [email protected]

    © 2007-11 IMEX Research. All Rights Reserved.

    Copying Prohibited. Please write to IMEX for authorization.

  • Abstract

    Solving the I/O Bottleneck in NextGen Data Centers & Cloud Computing

    Virtualization brings tremendous advantages to data centers of all sizes, large and small. But the rapid proliferation of virtual machines (VMs) per physical server creates, in its wake, a highly randomized I/O problem that raises performance bottlenecks. To address this I/O issue, the IT industry, and storage administrators in particular, have begun adopting a slew of newer technology solutions: faster and larger memories in cache, SSDs for cost-effective higher IOPS, high-bandwidth (10GbE) networks using NPIV for virtualized networks between VMs and shared storage systems, and embedded intelligence software to optimize various VM workloads, all in order to meet SLA metrics of performance, availability, cost, etc.

    Are you aware of the side effects created by Server Virtualization, and prepared to ask pertinent questions of your suppliers of IT Infrastructure Equipment, Storage Virtualization and Data Storage Management software, as well as about their implementation, to achieve targeted results in performance, availability, scalability, interoperability and data management in your virtualized data centers?

    This presentation provides an illustrative view of the impact of Server Virtualization on existing storage I/O solutions and best practices. It delineates the roles, capabilities and cost effectiveness of emerging technologies in mitigating the I/O bottlenecks, so that IT infrastructure implementers can achieve their targeted performance, under various workloads, from their storage systems in virtualized data centers.

  • Agenda

    • Data Centers & Cloud Infrastructure
    • Cloud Computing Architecture
    • Performance Metrics by Workload
    • Anatomy of Data Access
    • Data Center Performance Bottlenecks
    • Improving Query Response Time in OLTP
    • Role of SSD in Improving I/O Perf. Gap
    • SCM: A New Storage Class Memory
    • SSDs - Price Erosion & IOPS/GB
    • Choosing SSD vs. Memory to Improve TPS
    • New Storage Usage Hierarchy in NGDC & Clouds
    • IO Bottleneck Mitigation in Virtualized Servers
    • I/O Forensics for Auto Storage-Tiering
    • Apps Benefitting from Improved I/O
    • Key Takeaways
    • Acknowledgements

  • Data Centers & Cloud Infrastructure

    [Figure: network diagram. Home networks (Cable/DSL, Cellular, Wireless), Web 2.0 social networks (Facebook, Twitter, YouTube…), Remote/Branch Offices and Supplier/Partners connect through edge ISPs and the Internet's core optical ISPs to Public Cloud Centers (Servers, VPN; IaaS, PaaS, SaaS; Vertical Clouds) and to the Enterprise VZ Data Center / On-Premise Cloud.]

    Labels inside the Enterprise VZ Data Center:
    • Tier-1 Edge Apps: Caching, Proxy, FW, SSL, IDS, DNS, LB, Web Servers
    • Tier-2 Apps: Application Servers (HA, File/Print, ERP, SCM, CRM Servers)
    • Tier-3 Data Base: Database Servers, Middleware, Data Mgmt
    • Switches: Layer 4-7, Layer 2, 10GbE, FC Stg
    • FC/IP SANs
    • Management: Directory, Security, Policy, Middleware Platform

    Source: IMEX Research - Cloud Infrastructure Report ©2009-11

  • IT Industry’s Journey - Roadmap

    SIVAC ®IMEX

    • Standardization: Standard IT Infrastructure - Volume Economics HW/Syst SW (Servers, Storage, Networking Devices, System Software (OS, MW & Data Mgmt SW))
    • Integration/Consolidation: Integrate Physical Infrast./Blades to meet CAPSIMS ®IMEX (Cost, Availability, Performance, Scalability, Inter-operability, Manageability & Security)
    • Virtualization: Pools Resources. Provisions, Optimizes, Monitors. Shuffles Resources to optimize Delivery of various Business Services
    • Automation: Automatically Maintains Application SLAs (Self-Configuration, Self-Healing ©IMEX, Self-Acctg. Charges etc.)
    • Cloudization: On-Premises > Private Clouds > Public Clouds. DC to Cloud-Aware Infrast. & Apps. Cascade migration to SPs/Public Clouds.

    Source: IMEX Research - Cloud Infrastructure Report ©2009-11

  • Cloud Computing Architecture

    [Figure: cloud computing stack. Labels in the figure: Resources (Servers, Storage, Networks); Virtualization; Operating Systems; Platform Tools & Services (.Net, Python, EJB, Ruby, PHP, …); SaaS Applications; service layers IaaS, PaaS and SaaS; Management; App/SLA pairs; deployment as Private Cloud (Enterprise), Public Cloud Service Providers and Hybrid Cloud.]

    Application’s SLA dictates the Resources Required to meet specific requirements of Availability, Performance, Cost, Security, Manageability etc.

  • Performance Metrics by Workload

    [Figure: workloads positioned by IOPS* (vertical axis, log scale: 10, 100, 1K, 10K, 100K, 1000K) against throughput in MB/sec (horizontal axis: 1, 5, 10, 50, 100, 500). Workloads shown: OLTP, eCommerce, Transaction Processing, Business Intelligence, OLAP, Data Warehousing, Web 2.0, Audio, Video, Imaging, Scientific Computing, HPC. RAID annotations in the figure: (RAID - 1, 5, 6) and (RAID - 0, 3).]

    * IOPS for a required response time (ms): $\mathrm{IOPS} = \#\mathrm{Channels} \times \mathrm{Latency}^{-1}$

    Source: IMEX Research - Cloud Infrastructure Report ©2009-11
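The footnote relation on this slide (IOPS = #Channels × 1/Latency) can be sketched in a few lines of Python. This is only an illustration: the function name `iops` and the example channel counts and latencies are assumptions, not figures from the presentation.

```python
def iops(channels: int, latency_ms: float) -> float:
    """IOPS = #Channels x (1 / latency), with latency given in milliseconds."""
    if channels <= 0 or latency_ms <= 0:
        raise ValueError("channels and latency_ms must be positive")
    # 1000 ms per second, so one channel completes 1000 / latency_ms I/Os per second
    return channels * (1000.0 / latency_ms)


# Hypothetical inputs, chosen only to show how the two terms trade off
single_slow_channel = iops(channels=1, latency_ms=10.0)
many_fast_channels = iops(channels=8, latency_ms=0.1)
print(single_slow_channel, many_fast_channels)
```

Read this way, a required response time fixes the latency term, so higher IOPS at that response time has to come from more parallel channels.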

  • Anatomy of Data Access

    [Figure: Performance over time (1980, 1990, 2000, 2010) for Processor, Server and Disk I/O, with the widening difference labelled "I/O Gap".]

    A 7.2K/15K rpm HDD can do 100/140 IOPS.

    For the time it takes to do each Disk Operation:
    • Millions of CPU Operations can be done
    • Hundreds of Thousands of Memory Operations can be accomplished

    [Figure: Anatomy of Data Access. Time taken by CPU, Memory, Network and Disk for a typical I/O Operation during a Data Access.]
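The claim above can be checked with rough arithmetic. The sketch below takes the 100 and 140 IOPS figures from the slide; the per-operation times for CPU and memory (1 ns and 50 ns) are order-of-magnitude assumptions, not values from the presentation.

```python
def service_time_ms(iops: float) -> float:
    """Average time per I/O, in milliseconds, for a device sustaining `iops`."""
    return 1000.0 / iops


def ops_per_disk_io(disk_iops: float, op_time_ns: float) -> float:
    """How many operations lasting op_time_ns fit into one disk I/O."""
    return service_time_ms(disk_iops) * 1e6 / op_time_ns


HDD_IOPS = {"7.2K rpm": 100, "15K rpm": 140}  # figures quoted on the slide
CPU_OP_NS = 1.0    # assumption: about 1 ns per simple CPU operation
MEM_OP_NS = 50.0   # assumption: about 50 ns per DRAM access

for drive, rate in HDD_IOPS.items():
    print(
        f"{drive}: {service_time_ms(rate):.2f} ms per I/O, "
        f"{ops_per_disk_io(rate, CPU_OP_NS):,.0f} CPU ops, "
        f"{ops_per_disk_io(rate, MEM_OP_NS):,.0f} memory ops"
    )
```

Under these assumptions, a single disk operation spans millions of CPU operations and on the order of a hundred thousand or more memory accesses, which is the gap the slide describes.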

  • Data Center Performance Bottlenecks

    [Figure: data path from Clients (Windows, Linux, Unix) over LAN Access Networks to Servers (Web Servers, Application Servers, Database Servers) and over Storage I/O Access to Storage (Web, Application, Database).]

    • User Bottlenecks: Connectivity Timeouts, Workload Surges
    • Network I/O: Network Congestion, Dropped Packets, Data Retransmissions, Timeouts, Component Failures
    • Applications: Excessive Locking, Data Contention, I/O Delays/Errors
    • Server Bottlenecks: Lack of Srvr Power, IO Wait & Queuing, CPU Overhead, I/O Timeouts
    • Storage I/O Connect: Lack of Bandwidth, Overloaded PCIe Connect, Storage Device Contention
    • Device Bottlenecks: Device I/O Hotspots, Cache Flush, Lack of Storage Capacity

  • Improving Query Response Time in OLTP

    A cost-effective way to improve query response time for a given number of users, or to service an increased number of users at a given response time, is to use SSDs or a Hybrid (SSD + HDD) approach, particularly for Database and Online Transaction applications.

    [Figure (conceptual only, not to scale): Query Response Time (ms), 0 to 8, plotted against IOPS (or Number of Concurrent Users), 0 to 40,000, for four configurations, each marked with a relative cost in $ signs: HDDs (14 drives); HDDs with short stroking (112 drives); SSDs (12 drives); Hybrid HDDs (36 drives) + SSDs ($$$$).]

    Source: IMEX Research SSD Industry Report ©2011

  • Role of SSD in Improving I/O Perf. Gap

    [Figure: Price ($/GB) plotted against Performance (I/O Access Latency) for Tape, HDD, SATA SSD, PCIe SSD, NAND, NOR, SCM, DRAM, SDRAM and CPU, with regions labelled Servers and Hybr. Storage.]

    • HDD becoming Cheaper, not faster
    • DRAM getting Faster (to feed faster CPUs) & Larger (to feed Multi-cores & Multi-VMs from Virtualization)
    • SSD segmenting into PCIe SSD Cache (as back end to DRAM) & SATA SSD (as front end to HDD)

    Source: IMEX Research SSD Industry Report ©2011

  • SCM: A New Storage Class Memory

    • SCM (Storage Class Memory)
      • Solid State Memory filling the gap between DRAMs & HDDs
      • Marketplace segmenting SCMs into SATA and PCIe based SSDs

    • Key Metrics Required of Storage Class Memories
      • Device - Capacity (GB), Cost ($/GB)
      • Performance - Latency (Random/Block RW Access - ms); Bandwidth BW (R/W - GB/sec)
      • Data Integrity - BER (Better than 1 in 10^17)
      • Reliability - Write Endurance (30K PE Cycles: no. of writes before death); Data Retention (5 Years); MTBF (2 million hours)
      • Environment - Power Consumption (Watts); Volumetric Density (TB/cu.in.); Power On/Off Time (sec)
      • Resistance - Shock/Vibration (g-force); Temp./Voltage Extremes 4-Corner (°C, V); Radiation (Rad)

  • SSDs - Price Erosion & IOPS/GB

    [Figure: 2009 to 2014. Left axis: Units (Millions), 0 to 600, for Enterprise HDD Units (M) and SSD Units (M). Right axis: IOPS/GB, 0 to 8, for IOPS/GB SSDs and IOPS/GB HDDs.]

    Note: 2U storage rack.
    • 2.5” HDD max cap = 400GB / 24 HDDs, de-stroked to 20%
    • 2.5” SSD max cap = 800GB / 36 SSDs

    Source: IMEX Research SSD Industry Report ©2011
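The rack note on this slide lends itself to a quick capacity check. The sketch below assumes one reading of the note: 24 HDDs of 400 GB each with only 20% of each drive in use after de-stroking, and 36 SSDs of 800 GB each, fully used. The 140 IOPS per HDD is the 15K rpm figure quoted earlier in the deck; applying it to de-stroked drives is a simplifying assumption, and the helper names are illustrative.

```python
def usable_gb(drives: int, capacity_gb: int, usable_percent: int = 100) -> float:
    """Usable capacity of a rack when only usable_percent of each drive is used."""
    return drives * capacity_gb * usable_percent / 100


def iops_per_gb(total_iops: float, capacity_gb: float) -> float:
    """IOPS density: aggregate IOPS divided by usable capacity."""
    return total_iops / capacity_gb


hdd_rack_gb = usable_gb(drives=24, capacity_gb=400, usable_percent=20)
ssd_rack_gb = usable_gb(drives=36, capacity_gb=800)

# Assumption: 140 IOPS per drive (15K rpm figure from the earlier slide)
hdd_density = iops_per_gb(total_iops=24 * 140, capacity_gb=hdd_rack_gb)

print(f"HDD rack usable capacity: {hdd_rack_gb:.0f} GB")
print(f"SSD rack usable capacity: {ssd_rack_gb:.0f} GB")
print(f"HDD rack IOPS/GB: {hdd_density:.2f}")
```

De-stroking raises the HDD rack's IOPS/GB mainly by shrinking the denominator, which is why the usable capacity of the two racks differs so sharply in this reading.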

  • Choosing SSD vs. Memory to Improve TPS

    [Figure: TPS (0 to 2500) plotted against Buffer Pool (Memory) size in GB (0 to 20) for three storage options: SSD PCIe, SSD SATA and HDD RAID 10. Annotations on the chart: 2.5x and 4.0x.]

  • Data Storage Usage Patterns – Data Access vs. Age of Data

    [Figure: Data Access and Storage Growth curves plotted against Age of Data (1 Day, 1 Week, 1 Month, 2 Mo., 3 Mo., 6 Mo., 1 Year, 2 Yrs). Annotations in the figure: Performance; 80% of IOPs; SSDs; Scale; Cost; Data Protection; Data Reduction; 80% of TB.]

    Source: IMEX Research - Cloud Infrastructure Report ©2009-11

  • New Storage Hierarchy in NGDC & Clouds

    I/O Access Frequency vs. Percent of Corporate Data

    [Figure: % of I/O Accesses (vertical axis: 65%, 75%, 95%) plotted against % of Corporate Data (horizontal axis: 1%, 2%, 10%, 50%, 100%), divided into three storage tiers.]

    • SSD: Logs, Journals, Temp Tables, Hot Tables
    • FCoE/SAS Arrays: Tables, Indices, Hot Data
    • Cloud Storage / SATA: Back Up Data, Archived Data, Offsite DataVault

    Source: IMEX Research - Cloud Infrastructure Report ©2009-11

  • IO Bottleneck Mitigation in Virtualized Servers

    [Figure: vSphere ESX host running VM Client 1, VM Client 2, … VM Client n, each with I/O Reg. and an XCL Driver. Other labels in the figure: XCL VLUN, XCL Mgr., ESX Kernel, SSD Drive w/ ESX Driver, Disk Controller, SSD Driver, Primary Storage.]

    • Offloading IOPS from Primary Storage
    • Both Applications & Storage Run Faster

  • I/O Forensics for Auto Storage-Tiering

    LBA Monitoring and Tiered Placement
    • Every workload has unique I/O access signature
    • Historical performance data for a LUN can identify performance skews & hot data regions by LBAs

    [Figure: Storage-Tiering at LBA/Sub-LUN Level (Storage-Tiered Virtualization). A Logical Volume maps onto Physical Storage, with Hot Data on SSD Arrays and Cold Data on HDD Arrays.]

    Source: IBM & IMEX Research SSD Industry Report 2011 ©IMEX 2010-11
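Identifying hot regions from historical LBA data can be sketched as a simple counting exercise. The code below is an illustrative sketch, not the method used by any product named in the deck: the function names, the region size of 1024 blocks, the 65% I/O share target and the sample trace are all assumptions made for the example.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def rank_regions(lba_trace: Iterable[int], region_blocks: int) -> List[Tuple[int, int]]:
    """Bucket accessed LBAs into fixed-size regions; return (region, hits), hottest first."""
    counts = Counter(lba // region_blocks for lba in lba_trace)
    return counts.most_common()


def pick_hot_regions(ranked: List[Tuple[int, int]], io_share: float) -> List[int]:
    """Smallest prefix of the ranking that covers at least io_share of all I/Os."""
    total = sum(hits for _, hits in ranked)
    hot, covered = [], 0
    for region, hits in ranked:
        if covered >= io_share * total:
            break
        hot.append(region)
        covered += hits
    return hot


# Hypothetical trace: most accesses fall in the first 1024-block region
trace = [5, 7, 5, 1030, 6, 5, 2050, 7, 4, 5]
ranked = rank_regions(trace, region_blocks=1024)
hot = pick_hot_regions(ranked, io_share=0.65)
print("ranking:", ranked)
print("candidate regions for the SSD tier:", hot)
```

In this toy trace a single region absorbs most of the I/O, so it alone would be the migration candidate; a real tiering engine would also weigh recency, I/O size and migration cost.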

  • Apps Benefitting from Improved I/O

    • Smart Mobile Devices: Instant On Boot Ups; Rugged, Low Power
    • Commercial Visualization: Rendering (Texture & Polygons); Very Read Intensive, Small Block I/O
    • Bioinformatics & Diagnostics
    • Decision Support / Bus. Intelligence: Data Warehousing; Random IO, High OLTPM
    • Entertainment - VoD / U-Tube: Most Accessed Videos; Very Read Intensive

    Bandwidth and latency callouts in the figure, in order of appearance: 1GB/s, __ms; 10 GB/s, __ms; 4 GB/s, __ms; 1GB/s, __ms

    Data: IMEX Research & Panasas

  • Key Takeaways

    • Solving I/O Problems
      • I/O Bottlenecks occur at multiple places in the Compute Stack, the largest being at Storage I/O
      • SSD comes out cheaper per IOP for I/O-intensive Apps
      • To get rid of Reads: improve indexing, archive out old data
      • Minimize the impact of Writes: get rid of temp tables/filesorts on slow disks
      • Compress big varchar/text/blobs

    • Data Forensics and Tiered Placement
      • Every workload has a unique I/O access signature
      • Historical performance data for a LUN can identify performance skews & hot data regions by LBAs
      • Use Smart Tiering to identify hot LBA regions and non-disruptively migrate hot data from HDDs to SSDs
      • Typically 4-8% of data becomes a candidate and, when migrated to SSDs, can provide a response time reduction of ~65% at peak loads