DSC861A Emerging Technology

Storage Virtualization. Team 3: Jennifer Brola-Richards, Mohib Fanek, Kathy Larson, Donovan Miles, Vishu Reddy, Fran Trees

Upload: thornton-cedric

Post on 30-Dec-2015


DESCRIPTION

Storage Virtualization Team 3 Jennifer Brola-Richards Mohib Fanek Kathy Larson Donovan Miles Vishu Reddy Fran Trees. DSC861A Emerging Technology. lecture 21 Storage Virtualization: Forms of virtualization. Presentation Outline. Storage Virtualization - PowerPoint PPT Presentation

TRANSCRIPT

1

Storage Virtualization

Team 3

Jennifer Brola-Richards
Mohib Fanek
Kathy Larson
Donovan Miles
Vishu Reddy
Fran Trees

2

Presentation Outline

Storage Virtualization: What is storage virtualization and why storage virtualization?

Storage Evolution and Fundamental Concepts: What are innovations and fundamental concepts associated with storage?

Storage Virtualization Deep Dive: What, Where and How of Storage Virtualization?

Case Study

Research Topics in Storage Virtualization: What are potential topics of research and dissertation?

Summary and Verbal Quiz

3

4

Storage Virtualization is the next frontier in Storage Advances that aims to provide a layer of abstraction to reduce complexity.

Storage Networking Industry Association (SNIA) defines Storage Virtualization as:

1. The act of abstracting, hiding, or isolating the internal functions of a storage (sub) system or service from applications, host computers, or general network resources, for the purpose of enabling application and network-independent management of storage or data.

2. The application of virtualization to storage services or devices for the purpose of aggregating functions or devices, hiding complexity, or adding new capabilities to lower level storage resources.
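
The "aggregating devices" part of the second definition can be illustrated with a minimal Python sketch. It is not taken from the SNIA material: the `VirtualVolume` class and the device names `array_a` and `array_b` are hypothetical, and the only technique shown is simple concatenation of devices into one logical block range.

```python
from bisect import bisect_right


class VirtualVolume:
    """Presents several backing devices as one contiguous logical block range."""

    def __init__(self, devices):
        # devices: list of (name, block_count) pairs, concatenated in order.
        self.devices = devices
        self.starts = []          # first logical block served by each device
        total = 0
        for _name, count in devices:
            self.starts.append(total)
            total += count
        self.size = total

    def resolve(self, logical_block):
        """Translate a logical block number to (device name, physical block)."""
        if not 0 <= logical_block < self.size:
            raise IndexError("logical block out of range")
        i = bisect_right(self.starts, logical_block) - 1
        name, _count = self.devices[i]
        return name, logical_block - self.starts[i]


# Hypothetical devices: the host sees one 150-block volume.
volume = VirtualVolume([("array_a", 100), ("array_b", 50)])
print(volume.size)           # 150
print(volume.resolve(0))     # ('array_a', 0)
print(volume.resolve(120))   # ('array_b', 20)
```

The host addresses only logical blocks; which physical device serves a block is hidden behind `resolve`, which is the abstraction the definition describes.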

What is storage virtualization?

5

Allocate and manage storage in accordance with the Quality of Service (QoS) associated with the data (e.g., Gartner estimates that the average data center doubles its storage every 18 to 24 months).

Provide continuous availability despite exponential growth (e.g., Facebook: over 55 billion page views a month, 41 million active users (1)).

Storage Virtualization aims to provide a layer of abstraction to manage storage and reduce complexity.

Why storage virtualization?

Effectively group and manage heterogeneous storage devices and servers (e.g., an estimated 450,000 Google servers (2)).

(1) Lucas Nealan, php|works, Atlanta, September 13, 2007
(2) Wikipedia

Multiple Storage Software Platforms (e.g. IBM, EMC, HP,..)

Mergers and Acquisitions (e.g. Microsoft & Yahoo!)

6

Client-side storage innovations: a variety of storage devices that are smaller, cheaper, and higher in capacity have helped end users cope with increasing storage requirements.

What are the innovations and fundamentals associated with storage?

7

Server-side storage innovations: a combination of storage device, storage interface, and storage software innovations has helped enterprises cope with exponential growth in data storage requirements.

Storage devices have evolved from tapes to hard drives to RAID hard drives increasing capacity and resiliency.
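
As an illustration of how RAID adds resiliency, the sketch below shows XOR parity, the mechanism used by single-parity schemes such as RAID 5. The slides do not say which RAID level they have in mind, so this is one assumed example; `xor_blocks` and the sample data are hypothetical.

```python
def xor_blocks(*blocks):
    """Byte-wise XOR of equal-length blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


d0 = b"STOR"                      # data block on drive 0
d1 = b"AGE!"                      # data block on drive 1
parity = xor_blocks(d0, d1)       # parity block stored on a third drive

# If the drive holding d1 fails, rebuild its block from the survivors.
rebuilt = xor_blocks(d0, parity)
print(rebuilt == d1)              # True
```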

What are the innovations and fundamentals associated with storage?

8

Storage interfaces have evolved from SCSI to iSCSI, Fibre Channel (FCP), and InfiniBand to interconnect devices and transport data faster.

SCSI, iSCSI, FCP, InfiniBand

What are the innovations and fundamentals associated with storage?

9

File level access: Files are accessed through file-semantic operations (example: Open, Close). Data inside a file is accessed by byte ranges within the file (example: the first 10 bytes of a file). GFS (Google File System) is an example of a large-scale distributed file system.

Block level access: Block addresses are used to Read/Write data [Read/Write, Block #] to the storage media.

Sample conventional Block Allocation Map

Storage Access: File level access takes center stage alongside conventional Block level access.
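
The two access styles can be contrasted in a short Python sketch. It is an illustration only: an ordinary temporary file stands in for both the file and the raw device, and `BLOCK_SIZE`, `read_byte_range`, and `read_block` are assumed names, not part of any storage API.

```python
import os
import tempfile

BLOCK_SIZE = 512  # assumed block size for this sketch


def read_byte_range(path, offset, length):
    """File-level access: open by name, read a byte range, close."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def read_block(device_path, block_number):
    """Block-level access: address data by block number only."""
    with open(device_path, "rb") as dev:
        dev.seek(block_number * BLOCK_SIZE)
        return dev.read(BLOCK_SIZE)


# An ordinary temp file stands in for both a file and a raw device.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE)
    path = tmp.name

first_ten = read_byte_range(path, 0, 10)   # "the first 10 bytes of a file"
second_block = read_block(path, 1)         # [Read, Block #1]
os.remove(path)

print(first_ten)             # b'AAAAAAAAAA'
print(second_block[:4])      # b'BBBB'
```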

What are the innovations and fundamentals associated with storage?

10

Metadata is data about data. In the context of storage, metadata may describe an individual datum or content item, or a collection of data including multiple content items.

Examples include: file size, who created file, attributes such as read only, free block bitmaps, control data.
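
Several of these examples can be inspected directly from Python's standard library. The sketch below is illustrative: the temporary file, the `metadata` dictionary keys, and the eight-entry `free_blocks` list are assumptions made for the example.

```python
import os
import stat
import tempfile

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello storage")
    path = tmp.name

os.chmod(path, stat.S_IREAD)          # mark the file read-only

info = os.stat(path)                  # metadata only; file contents are not read
metadata = {
    "size_bytes": info.st_size,
    "owner_uid": info.st_uid,
    "read_only": not (info.st_mode & stat.S_IWUSR),
    "modified": info.st_mtime,
}
print(metadata["size_bytes"], metadata["read_only"])   # 13 True

os.chmod(path, stat.S_IREAD | stat.S_IWRITE)   # restore so cleanup works everywhere
os.remove(path)

# A free-block bitmap is also metadata: one flag per block, True = free.
free_blocks = [True] * 8
free_blocks[0] = free_blocks[1] = False      # blocks 0 and 1 are in use
print(free_blocks.index(True))               # 2
```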

What are the innovations and fundamentals associated with storage?

11

Storage software has evolved from simple backup and restore to advanced storage networks and storage management software functions.

(A) Simple Direct Attached Storage (DAS)

(B) Storage Area Network (SAN)

(C) Network Attached Storage (NAS)

What are the innovations and fundamentals associated with storage?

12

SAN and NAS: Key Differences

                            NAS                 SAN
Access Methods              File access         Disk block access
Access Medium               Ethernet            Fibre Channel
Architecture                Decentralized       Centralized
Transport Protocol          Layer over TCP/IP   SCSI/FC and SCSI/IP
Efficiency                  Less                More
Sharing and Access Control  Good                Poor
Typical Applications        Web                 Database
Typical Clients             Workstations        Database servers

What are the innovations and fundamentals associated with storage?

Taxonomy, Configuration, Challenges of CAS

13

14

What and Where can Storage be Virtualized?

[Figure: SNIA Storage Model with the potential areas of virtualization marked 1 to 6]

Potential Areas of Virtualization:

File Level Virtualization
Block Virtualization
Host Level Virtualization *
Network Virtualization
Device Virtualization **
Storage Level Virtualization

* Host aka Server
** Device = aggregation of Host and Network (Meta Data)

Source: The Storage Networking Tutorials, SNIAVIRT, page 20, http://www.snia.org/education/tutorials/

15

Storage Virtualization: Innovations and Trends

[Figure: areas of virtualization (File Level, Storage Device Level, Network, Host Level, Block, Device) annotated with historical and recent examples]

Historical: Mainframe. Recent development example: VMware.

Historical: RAID Level, SCSI Interface. Recent development examples: Fibre Channel.

Sub-Technique. Historical: Mainframe. Recent development example: NAS.

What and Where can Storage be Virtualized?

Major innovations continue to emerge even in historical areas of storage virtualization

Symmetrical (aka in-band) and Asymmetrical (aka Out-of-Band) are emerging as key areas of abstraction and virtualization.

16

How is storage virtualized at the enterprise level?

Source: IBM Redbook, page 8, http://www.redbooks.ibm.com/redbooks/pdfs/sg246210.pdf

In-band: Metadata or Storage Volume Controllers (SVC) are placed in the path of data flow.

Out-of-band: Metadata or Storage Volume Controllers are placed outside the path of data flow.

Currently Networks are virtualized using Metadata or Storage Volume Controllers. There are two types of network virtualization…
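
The difference between the two placements can be sketched in Python. This is a toy model, not IBM's implementation: `StoragePool`, `Controller`, and the device name `disk1` are hypothetical. The point is only where the data travels: through the controller (in-band) or directly between host and storage after a metadata lookup (out-of-band).

```python
class StoragePool:
    """Stand-in for back-end storage: a dict of (device, block) -> bytes."""

    def __init__(self):
        self.blocks = {}

    def read(self, device, block):
        return self.blocks[(device, block)]

    def write(self, device, block, data):
        self.blocks[(device, block)] = data


class Controller:
    """Holds the logical-to-physical map (the metadata)."""

    def __init__(self, pool, mapping):
        self.pool = pool
        self.mapping = mapping            # logical block -> (device, block)
        self.data_requests = 0            # I/Os that passed through the controller

    def lookup(self, logical_block):
        # Out-of-band use: return only the location; no data flows here.
        return self.mapping[logical_block]

    def read(self, logical_block):
        # In-band use: the controller itself fetches and forwards the data.
        self.data_requests += 1
        return self.pool.read(*self.mapping[logical_block])


pool = StoragePool()
pool.write("disk1", 7, b"payload")
controller = Controller(pool, {0: ("disk1", 7)})

# In-band: host -> controller -> storage
in_band = controller.read(0)

# Out-of-band: host asks the controller where, then reads storage directly
device, block = controller.lookup(0)
out_of_band = pool.read(device, block)

print(in_band == out_of_band)       # True
print(controller.data_requests)     # 1
```

Only the in-band read increments `data_requests`, which is why an in-band controller can become an I/O bottleneck while an out-of-band controller handles metadata traffic only.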

17

In-Band Virtualization

Source: IBM Redbook, page 10, http://www.redbooks.ibm.com/redbooks/pdfs/sg246210.pdf

Metadata or Storage Volume Controllers (SVC) are placed in-band, that is, in the path of data flow.

How is storage virtualized at the enterprise level?

SVC controls who can access the storage device, how storage can be accessed, how storage is allocated, etc.

SVCs are managed through Storage Management Software.

The key challenge is potential I/O bottlenecks.

18

Out-of-Band Network Virtualization

Metadata or Storage Volume Controllers (SVC) are placed out-of-band, that is, outside the path of data flow.

Host sends metadata to SVC.

Storage Pool sends metadata to SVC.

SVC controls who can access the storage device, how storage can be accessed, how storage is allocated, etc.

Source: IBM Redbook, page 12, http://www.redbooks.ibm.com/redbooks/pdfs/sg246210.pdf

How is storage virtualized at the enterprise level?

Types of virtualization and case study

19

[Figure: High-level diagram of typical primary/secondary site data replication with storage virtualization (D. Miles, 06/09/07)]

PRIMARY SITE. Environment: PROD, DEV, QA, SIT. Application: App1, App2.

SECONDARY SITE. Environment: Prod. Application: App1, App2.

Components shown: xSeries, pSeries, and Blade servers; SAN Fabric A and SAN Fabric B, each with a director; Blade SAN fabric; virtualization engines with monitors; Type 1 SAN storage (52 TB); Type 2 SAN storage (labeled 40 TB and 26 TB each); libraries with LTO3 drives; Network Appliances; (2) Cisco 6509 switches; Ethernet; DWDM; VPN comm-links for remote support.

VLAN labels: Management VLAN: QA/DEV, storage, library, director (950). PROD: Blades + Blade Fabric (955).

20

How is storage virtualized at enterprise level?

The Study

1. Shows that commingling data and metadata on a single logical device means there is no way to achieve different service level objectives for data and metadata in the same file system without moving file-system-specific knowledge into the logical disk layers.

2. Shows that the standard assumptions underlying the organization of data and metadata in file systems are no longer valid in a virtualized storage environment, so file systems built on them fail to realize the full benefits of storage virtualization.

3. Proposes a different file system organization of data and metadata designed to exploit the power of virtualized storage.

21

22

• Organization A: Needs No Encryption

• Organization B: Needs Encryption
– Stores medical records
– Security requirements for file data are extremely high
– Performs nightly indexing operation on file systems
– All directory information and file access times must be read to determine the "changed" state of data
– Business requirement that all file data be encrypted at rest
– File metadata has no security requirement

Service Level requirements within a single file system

In the Unix Fast File System (FFS), a logical disk is divided into collections of blocks called cylinder groups, each of which stores both file data blocks and file metadata.

Results
◦ Clean logical separation between data and metadata
◦ Allows file system features to use virtualization features and achieve different SLOs

Redesign changes
◦ Code change
◦ Packing the relocated cylinder group headers in the first few metadata cylinder groups ensures each header is located at a fixed, predictable offset from the front of the block device
◦ User-configurable block address boundary, before which no data is stored and after which no metadata is stored
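
The user-configurable boundary can be pictured with a small Python sketch. This is not the study's code: `SplitLayoutAllocator`, its method names, and the block counts are hypothetical, and the encryption rule simply mirrors Organization B's requirement that file data, but not metadata, be encrypted.

```python
class SplitLayoutAllocator:
    """Metadata blocks below a boundary, file data blocks at or above it."""

    def __init__(self, total_blocks, boundary):
        if not 0 < boundary < total_blocks:
            raise ValueError("boundary must fall inside the device")
        self.total_blocks = total_blocks
        self.boundary = boundary          # user-configurable split point
        self.next_meta = 0
        self.next_data = boundary

    def allocate(self, kind):
        if kind == "metadata":
            if self.next_meta >= self.boundary:
                raise RuntimeError("metadata region full")
            block = self.next_meta
            self.next_meta += 1
        elif kind == "data":
            if self.next_data >= self.total_blocks:
                raise RuntimeError("data region full")
            block = self.next_data
            self.next_data += 1
        else:
            raise ValueError("kind must be 'metadata' or 'data'")
        return block

    def needs_encryption(self, block):
        # Hypothetical policy for Organization B: encrypt file data only.
        return block >= self.boundary


alloc = SplitLayoutAllocator(total_blocks=1000, boundary=100)
inode_block = alloc.allocate("metadata")
file_block = alloc.allocate("data")
print(inode_block, alloc.needs_encryption(inode_block))   # 0 False
print(file_block, alloc.needs_encryption(file_block))     # 100 True
```

Because the block address alone tells the lower layer whether a block holds data or metadata, a per-region policy such as encryption can be applied without any file-system knowledge in the logical disk layer.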

23

24

5-7% gains on the new file system layout.

31-44% gains for the file lookup and file delete benchmarks. Because these result in little or no file data I/O, the advantage of data-only encryption becomes obvious.

Future Work
• Differing SLOs for granular metadata
• Completely separate fixed/dynamic metadata
• Separate file data from user-defined file attribute data

Bayesian analysis for resource management

Bayesian analysis for diagnostics

Trusted domains for security

Storage Virtualization and Metadata Standards

Algorithm advances for block, device and other component virtualization techniques

25

What are potential topics of research and dissertation?

26

Storage Basics
1. What type of storage is found in your workstation?
2. What type of storage systems may be found in a large enterprise?
3. How is data accessed from storage?
4. Network Attached Storage (NAS) is well suited for what type of applications?
5. Storage Area Network (SAN) is well suited for what type of applications?

Storage Virtualization
1. What is Storage Virtualization?
2. Where and What can be virtualized in storage?
3. How is storage virtualized at a network level?
4. How is storage virtualization currently implemented?
5. What are the potential research topics in storage virtualization?