Open Storage: Intel’s Investments in Object Storage
Paul Luse and Tushar Gohad, Storage Division, Intel
Transforming the Datacenter through Open Standards
- Transforming the Business: Assure OpenStack-based cloud implementations offer the highest levels of agility, automation and efficiency using IA platform innovations.
- Transforming the Ecosystem: Strengthen open solutions with Intel code contributions and silicon innovations to speed up development, while building a foundation of trust.
- Transforming the Infrastructure: Speed up deployment of new applications and services on software-defined infrastructure created from widely available IA servers.
Legal Disclaimers
Copyright © 2014 Intel Corporation. All rights reserved. Intel, the Intel logo, Xeon, Atom, and QuickAssist are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. All products, computer systems, dates and figures specified are preliminary based on current expectations, and are subject to change without notice.
Agenda
- Storage Policies in Swift: Swift primer / storage policies overview; Swift storage policy implementation; usage models
- Erasure Coding Policy in Swift: Erasure Coding (EC) policy; Swift EC design considerations and proposed architecture; Python EC library (PyECLib); Intel® Intelligent Storage Acceleration Library (ISA-L)
- COSBench: Cloud Object Storage Benchmark; status, user adoption, roadmap
- Public Swift Test Cluster
OPENSTACK SUMMIT 2014
Storage Policies for OpenStack Swift
Swift Primer
- OpenStack Object Store: distributed, scale-out object storage
- CAP: eventually consistent, highly available (no single point of failure), partition tolerant
- Well suited for unstructured data
- Uses a container model for grouping objects with like characteristics
- Objects are identified by their paths and have user-defined metadata associated with them
- Accessed via a RESTful interface: GET, PUT, DELETE
- Built on standard hardware: cost effective, efficient
The Big Picture
[Diagram: clients issue RESTful API requests to a load balancer fronting an access tier of proxy nodes; on upload, object A is stored as three copies on storage nodes in the capacity tier, spread across Zones 1–5; on download, a copy is read back through a proxy. An auth service authenticates requests.]
Proxy nodes:
- Handle incoming requests
- Handle failures, ganged responses
- Scalable shared-nothing architecture
- Consistent hashing ring distribution
Storage nodes:
- Actual object storage
- Variable replication count
- Data integrity services
- Scale-out capacity
The access and capacity tiers scale independently for concurrency and/or capacity.
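The "consistent hashing ring distribution" used by the proxies can be illustrated with a minimal sketch. This is not Swift's actual ring implementation (which adds partitions, replica counts, and zone-aware placement); node names and the virtual-node count are illustrative.

```python
import hashlib
from bisect import bisect_right

class Ring:
    """Toy consistent-hash ring mapping object names to storage nodes."""

    def __init__(self, nodes, vnodes=64):
        # Place several virtual points per node to even out the distribution.
        self.points = sorted(
            (self._hash("%s#%d" % (node, i)), node)
            for node in nodes for i in range(vnodes)
        )
        self.keys = [h for h, _ in self.points]

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get_node(self, obj_name):
        # The first point clockwise from the object's hash owns the object.
        i = bisect_right(self.keys, self._hash(obj_name)) % len(self.points)
        return self.points[i][1]

ring = Ring(["node1", "node2", "node3", "node4", "node5"])
owner = ring.get_node("AUTH_acct/container/objA")
# Adding or removing one node only remaps keys adjacent to its points,
# which is what makes rebalancing cheap in a scale-out cluster.
```
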
Why Storage Policies? New Opportunities for Swift
Would you like 2x or 3x? Are all nodes equal? Can I add something like erasure codes?
- Support grouping of storage: expose or make use of differentiated hardware within a single cluster
- Performance tiers: a tier with high-speed SSDs can be defined for better performance characteristics
- Multiple durability schemes: erasure coded; mixed-mode replicated (Gold 3x, Silver 2x, etc.)
- Other usage models: geo tagging, e.g. ensuring the geographical location of data within a container
Adding Storage Policies to Swift
[Diagram: three different policies map to three different object rings: Triple Replication (3 locations, same object), Reduced Replication (2 locations, same object), and Erasure Codes (n locations, object fragments).]
- Community effort with primary contributions from Intel and SwiftStack*
- Introduction of multiple object rings
- Introduction of a container tag: X-Storage-Policy
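In the implementation that landed in Swift, policies are declared in swift.conf and each policy gets its own object ring (object.ring.gz, object-1.ring.gz, and so on). A sketch along those lines; the policy names are illustrative, and the erasure-coding options shown reflect option names from later Swift releases, after EC support landed:

```ini
# swift.conf: each [storage-policy:N] section maps to its own object ring
# (object.ring.gz, object-1.ring.gz, ...). Policy names are illustrative.

[storage-policy:0]
# 3x replication (replica count comes from the ring)
name = gold
default = yes

[storage-policy:1]
# e.g. a 2x-replication ring for reduced redundancy
name = silver

[storage-policy:2]
# Erasure coding arrived in later Swift releases; options per that design.
name = ec104
policy_type = erasure_coding
ec_type = liberasurecode_rs_vand
ec_num_data_fragments = 10
ec_num_parity_fragments = 4
```

A container then opts in at creation time via the X-Storage-Policy tag, e.g. `curl -X PUT -H "X-Auth-Token: $TOKEN" -H "X-Storage-Policy: silver" https://<proxy>/v1/AUTH_test/mycontainer`.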
Storage Policy Touch Points
[Diagram: components touched by storage policies on the proxy and storage nodes.]
Proxy nodes: wsgi server; middleware (partially, modules like list_endpoints); swift proxy wsgi application with account, container and object controllers and helper functions.
Storage nodes: swift object wsgi application (replicator, auditor, updater, expirer); swift account wsgi application (replicator, auditor, reaper, helper functions); swift container wsgi application (replicator, auditor, updater, sync); plus DB schema updates.
Usage Model – Reduced Redundancy
[Diagram: a container with a 3x policy next to a container with a 2x policy.]
Usage Model – Performance Tier
[Diagram: a container with an HDD policy next to a container with an SSD policy. SSDs were previously limited to use for the account/container DBs.]
Note: entire systems can comprise a policy as well.
Usage Model – Geo Tagging
[Diagram: containers pinned to Geo #1 and Geo #2.]
Usage Model – Erasure Codes
[Diagram: a container with a 3x policy next to a container with an EC policy holding EC fragments.]
Note: EC could also be on dedicated hardware.
Erasure Coding Policy in OpenStack Swift
Erasure Codes
- Object split into k data and m parity chunks and distributed across the cluster
- Space-optimal redundancy and high availability: k = 10, m = 4 translates to roughly 50% of the space required by 3x replication
- Higher compute and network requirements
- Suitable for archival workloads (high write %)
[Diagram: an object is encoded into data chunks D1 … Dk and parity chunks P1 … Pm.]
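The space claim above is quick to check: a (k, m) code stores (k + m)/k bytes of raw capacity per byte of data. With the slide's k = 10, m = 4 example:

```python
def ec_overhead(k: int, m: int) -> float:
    """Raw-to-usable storage ratio for a (k, m) erasure code."""
    return (k + m) / k

# k = 10 data + m = 4 parity fragments -> 1.4x raw storage per object,
# versus 3.0x for triple replication.
ec = ec_overhead(10, 4)      # 1.4
replication = 3.0
print(ec / replication)      # ~0.467, i.e. roughly 50% of 3x replication's footprint
```
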
Swift with Erasure Coding Policy
[Diagram: clients use a RESTful API (similar to S3) through a load balancer to proxy nodes in the access tier; on upload, the EC encoder at the proxy splits object A into k + m fragments distributed across storage nodes in Zones 1–5; on download, fragments are gathered and the EC decoder reassembles object A. Redundancy: n = k data fragments + m parity fragments. An auth service authenticates requests.]
- Applications control policy
- Inline EC
- Supports multiple policies
- EC flexibility via plug-in
EC Policy – Design Considerations
- First significant (non-replication) storage policy in OpenStack Swift
- In-line, proxy-centric datapath design: erasure code encode/decode during PUT/GET is done at the proxy server, aligned with the Swift architecture of focusing demanding services in the access tier
- Erasure coding policy applied at the container level: new container metadata identifies whether the objects within it are erasure coded; follows from the generic Swift storage policy design
- Keep it simple and leverage the current architecture
- Multiple new storage node services are required to assure erasure code chunk integrity as well as erasure code stripe integrity; modeled after the replica services
- Storage nodes participate in erasure code encode/decode for reconstruction, analogous to replication services synchronizing objects
- Community effort with primary contributions from Intel, Box* and SwiftStack*
Erasure Coding Policy Touchpoints
[Diagram: components touched by the EC policy.]
Proxy nodes: swift proxy wsgi application with controller modifications and an EC library interface with pluggable backends (plug-in 1, plug-in 2); existing modules and middleware are unchanged.
Storage nodes: the swift object wsgi application gains an EC auditor and an EC reconstructor, both using the EC library interface and its plug-ins; the swift account and swift container wsgi applications see metadata changes only.
Python Erasure Code Library (PyECLib)
- Python interface wrapper library with pluggable C erasure code backends
- Backend support planned in v1.0: Jerasure, Flat-XOR, Intel® ISA-L EC
- BSD-licensed, hosted on bitbucket: https://bitbucket.org/kmgreen2/pyeclib
- Used by Swift at the proxy server and storage node level; most of the erasure coding details are opaque to Swift
- Jointly developed by Box*, Intel and the Swift community
[Diagram: in the swift proxy server, EC modifications to the object controller call into PyECLib (Python), which dispatches to Jerasure (C) or ISA-L (C, asm); in the swift object server, the EC auditor and EC reconstructor use the same PyECLib stack; existing modules are unchanged.]
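The pluggable-backend shape can be sketched in pure Python. The names below (XORBackend, ECDriver) are illustrative, not PyECLib's actual API, and the toy XOR code tolerates the loss of only a single fragment; a real backend such as Jerasure or ISA-L implements full Reed-Solomon recovery in C.

```python
from functools import reduce

class XORBackend:
    """Toy systematic code: k data fragments + 1 XOR parity fragment.
    Stands in for a real C backend (Jerasure, ISA-L) behind the driver."""

    def __init__(self, k):
        self.k = k

    def encode(self, data):
        # Split into k equal-sized data fragments (last one zero-padded).
        size = -(-len(data) // self.k)  # ceil division
        frags = [data[i*size:(i+1)*size].ljust(size, b"\0")
                 for i in range(self.k)]
        parity = reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), frags)
        return frags + [parity]

    def decode(self, frags, orig_len):
        # frags: list of (index, bytes); at most one of the k+1 may be missing.
        present = dict(frags)
        if len(present) < self.k:
            raise ValueError("too many fragments lost for XOR parity")
        missing = [i for i in range(self.k + 1) if i not in present]
        if missing:
            # The missing fragment is the XOR of all surviving fragments.
            present[missing[0]] = reduce(
                lambda a, b: bytes(x ^ y for x, y in zip(a, b)),
                present.values())
        data = b"".join(present[i] for i in range(self.k))
        return data[:orig_len]

class ECDriver:
    """Thin dispatch layer, in the spirit of PyECLib's pluggable design:
    Swift talks to the driver and stays opaque to the backend details."""

    def __init__(self, backend):
        self.backend = backend

    def encode(self, data):
        return self.backend.encode(data)

    def decode(self, frags, orig_len):
        return self.backend.decode(frags, orig_len)

driver = ECDriver(XORBackend(k=4))
obj = b"hello object storage"
frags = driver.encode(obj)                                   # 4 data + 1 parity
survivors = [(i, f) for i, f in enumerate(frags) if i != 2]  # lose fragment 2
assert driver.decode(survivors, len(obj)) == obj
```
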
Intel® ISA-L EC Library
- Part of the Intel® Intelligent Storage Acceleration Library
- Provides primitives for accelerating storage functions: encryption, compression, de-duplication, integrity checks
- The current open source version provides erasure code support: fast Reed-Solomon (RS) block erasure codes
- Includes optimizations for Intel® architecture: uses Intel® SIMD instructions for parallelization; an order of magnitude faster than commonly used lookup-table methods; makes other non-RS methods designed for speed irrelevant
- Hosted at https://01.org/storage-acceleration-library, BSD licensed
ISA-L Primitives (v2.10): SSE PQ Gen (16+2), SSE XOR Gen (16+1), Reed-Solomon EC (16+6,2), SSE MB SHA-1, SSE MB SHA-256, SSE MB SHA-512, SSE MB MD5, CRC T10, CRC IEEE (802.3), CRC32 iSCSI, AES-XTS 128, AES-XTS 256, Compress "Deflate" (IGZIP0, IGZIP1, IGZIP0C, IGZIP1C)
Project Status
- Target: Summer ’14 – PyECLib upstream on bitbucket and PyPI
- Storage policies in plan for OpenStack Juno; EC expected to coincide with OpenStack Juno
- Ongoing development activities: the community uses a Trello discussion board (https://trello.com/b/LlvIFIQs/swift-erasure-codes) and Launchpad blueprints (https://blueprints.launchpad.net/swift)
Additional Information
- Attend the Swift track in the design summit (B302, Thu 5:00pm)
- Talk to us on #openstack-swift or on the Trello discussion board
- To give PyECLib a test run, install it from https://pypi.python.org/pypi/PyECLib
- For information on ISA-L, check out http://www.intel.com/storage
COSBench: Cloud Object Storage Benchmark
What is COSBench?
- Open source cloud object storage benchmarking tool, announced at the Portland design summit, 2013
- Open source (Apache License), cross platform (Java + Apache OSGi)
- Distributed load-testing framework
- Pluggable adaptors for multiple object storage backends
- Flexible workload definition
- Web-based real-time performance monitoring
- Rich performance metric reporting (performance timeline, response time histogram)
[Table: supported storage backends and authentication options.]
- Storage backends: OpenStack* Swift, Ceph (librados, rados GW swift, rados GW s3), Amazon* S3 (integrated), Scality sproxyd, Amplidata* AmpliStor, CDMI (CDMI-base with basic/digest, CDMI-swift with swauth/keystone), plus None and Mock for testing
- Authentication: tempauth, swauth, keystone, direct, none, basic/digest
COSBench is to object storage what Iometer is to block storage.
Workload Configuration
- Workflow for complex stages
- Read/write operations
- Object size distribution
- Flexible load control
Flexible configuration for complex workloads.
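A workload definition along these lines exercises the knobs listed above; stage and container/object names are illustrative, and the exact schema is documented in the COSBench user guide:

```xml
<workload name="swift-demo" description="sample 80/20 read-write mix">
  <storage type="swift" config="" />
  <auth type="keystone" config="" />
  <workflow>
    <workstage name="prepare">
      <!-- create 4 containers before the measured stage -->
      <work type="init" workers="1" config="containers=r(1,4)" />
    </workstage>
    <workstage name="main">
      <!-- 16 workers, 5 minutes, 80% reads / 20% 128KB writes -->
      <work name="mix" workers="16" runtime="300">
        <operation type="read" ratio="80"
                   config="containers=u(1,4);objects=u(1,1000)" />
        <operation type="write" ratio="20"
                   config="containers=u(1,4);objects=u(1,1000);sizes=c(128)KB" />
      </work>
    </workstage>
    <workstage name="cleanup">
      <work type="cleanup" workers="1"
            config="containers=r(1,4);objects=r(1,1000)" />
    </workstage>
  </workflow>
</workload>
```

The selectors give the flexible load control the slide mentions: r(…) iterates a range, u(…) picks uniformly at random, and c(…) holds a value constant.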
Web Console
[Screenshot: test generators, active workloads, and history views.]
Performance Reporting
[Charts: a response time histogram with read CDF (%), and a performance loadline (100% read) plotting throughput (op/s) and average response time (ms) against worker count.]
Rich performance data helps characterization.
Progress since Havana
- New object store backends: Amazon S3 adapter; Ceph adapter (librados-based and radosgw-based); CDMI adapter (Swift through CDMI middleware, Scality)
- Authentication: support for HTTP basic and digest
- Core functionality: new selectors / new operators, object integrity checking, response time breakdown, job management
- User interface improvements: batch workload configuration UI adds batch test configuration to COSBench, making workload configuration more like Iometer
- Bug fixes: 85 issues resolved
Roadmap
- 0.3.0 (13Q2): open source baseline
- 0.3.x (13Q4): S3 adapter, Ceph adapter
- 0.4.x (14Q2): usability, CDMI adapter
- 0.5.x (*14Q4): workload suite, multi-part
- 0.6.x (*15Q2): profiling tool, Google/MS storage
User Adoption
[Chart: github activity (2 weeks).]
Contributing to COSBench
Active code repository and community:
- Repository: https://github.com/intel-cloud/cosbench
- License: Apache v2.0
- Mailing list: http://cosbench.1094679.n5.nabble.com
Public Swift Test Cluster
Public Swift Test Cluster
- Joint effort by SwiftStack*, Intel and HGST*
- 6 Swift PACO nodes: 8-core Intel® Atom™ CPU C2750 @ 2.40 GHz, 16 GB main memory, 2x Intel X540 10GbE, 4x 1GbE
- Storage: 12x HGST* 6TB Ultrastar® He6 helium-filled HDDs
- Operating environment: Ubuntu/Red Hat Linux, OpenStack Swift 1.13
- Load balancing / management / control / monitoring using the SwiftStack* Controller