TRANSCRIPT
Intelligent Storage and Data Management with IBM General
Parallel File System (GPFS)
Maciej Remiszewski
IBM Forum 2012 – Estonia
Tallinn, October 9, 2012
Red Bull STRATOS – redbullstratos.com/live/
Explosion of data
Inflexible IT infrastructures
Escalating IT complexity
How to spot trends, predict outcomes and take meaningful actions?
How to manage inflexible, siloed systems and business processes to improve business agility?
How to manage IT costs and complexity while speeding time-to-market for new services?
Critical IT Trends for Technical Computing Users
Introducing the new IBM Technical Computing Portfolio – Powerful. Comprehensive. Intuitive.
Systems & Storage: BG/Q, iDataPlex, Intelligent Cluster, System x & BladeCenter, P7-775, DCS3700, LTO Tape, 3592, tape automation, DS5000, DS3000, SoNAS
Software: Parallel Environment Runtime, Platform MPI, Engineering and Scientific Libraries, GPFS, Platform LSF, Platform HPC, Platform Symphony, Platform Application Center, Parallel Environment Developer, Platform Cluster Manager
Solutions: HPC Cloud, Integrated Solutions, Industry Solutions, Intelligent Cluster, PureFlex, Big Data
NEW HPC Cloud Solutions
Overview
• Innovative solutions for dynamic, flexible HPC cloud environments
What's New
• New LSF add-on: IBM Platform Dynamic Cluster V9.1
  o Workload-driven dynamic node re-provisioning
  o Dynamically switch nodes between physical & virtual machines
  o Automated job checkpoints and migration
  o Smart, flexible policy and performance controls
• Enhanced Platform Cluster Manager – advanced capabilities
• New complete, end-to-end solutions
Use Case 1: HPC Infrastructure Management
• Self-service cluster provisioning & management
• Consolidate resources into an HPC cloud
• Cluster flexing

Use Case 2: Self-service HPC
• Self-service job submission & management
• Dynamic provisioning
• Job migration and/or checkpoint-restart
• 2D/3D remote visualization

Use Case 3: Cloud Bursting
• 'Burst' internally to available resources
• Burst externally to cloud providers
NEW Financial Risk and Crimes Solution
Overview
• High-performance, low-latency integrated risk solution stack with Platform Symphony Advanced Edition and partner products, including:
  o BigInsights and IBM Algorithmics
  o 3rd-party partner products: Murex and Calypso
What's New
• New solution stacks to manage and process big data with speed and scale
• Sales tools that highlight the value of IBM Platform Symphony
  o Financial Risk: customer testimonial videos; inclusion in SWG risk frameworks and S&D blueprints
  o Financial Crime: with BigInsights for credit card fraud analytics
  o TCO tool and benchmarks
Use Case 1: Financial Risk including Credit Value Adjustment (CVA) analytics
• Accelerates compute-intensive workloads up to 4X, e.g. Monte Carlo simulations, Algorithmics RiskWatch "cube" simulations
• Integrated with IBM Algorithmics, Murex and Calypso
• High throughput: 17K tasks/sec

Use Case 2: Big Data for Financial Crimes
• Accelerates analysis of data for fraud and irregularities
• Supports BigInsights
• Faster than the Apache Hadoop distribution
NEW Technical Computing for Big Data Solutions
Overview
• High-performance, low-latency "Big Data" solution stack featuring Platform Symphony, GPFS, DCS3700 and Intelligent Cluster – proven across many industries
• Low-latency Hadoop stack with Platform Symphony Advanced Edition and InfoSphere BigInsights
What's New
• New solution stacks to manage and process big data with speed and scale
IBM General Parallel File System (GPFS)
IBM DCS3700 / IBM Intelligent Cluster
IBM is delivering a Smarter Computing foundation for Technical Computing
Achieve faster time to insight with scalable, low latency data access and control for Big Data analytics
Optimize agility with on-demand and workload-driven dynamic cluster, grid, and HPC clouds
Increase throughput, utilization, and lower operating costs with workload optimized systems and intelligent resource management
Designed for data
Managed with Cloud Technologies
Tuned to the task
Smarter Computing
Technical Computing Software
Simplified management, optimized performance
The backbone of Technical Computing
IBM acquired Platform Computing, a leader in cluster, grid, and HPC cloud management software
• 20-year history delivering leading management software for technical computing and analytics distributed computing environments
• Enables the use and management of thousands of systems as one – powering the evolution from clusters to grids to HPC clouds
• 2,000+ global customers, including 23 of the 30 largest enterprises
• Market-leading scheduling engine with high performance, mission-critical reliability and extreme scalability
• Comprehensive capability footprint, from ready-to-deploy complete cluster systems to large global grids
• Heterogeneous systems support
• Large ISV and global partner ecosystem
• Global services and support coverage
De facto standard for commercial HPC
Over 5 million CPUs under management
60% of top financial services firms
From June 2012, the IBM Platform Computing™ portfolio is ready to deploy!
IBM Platform Computing can help accelerate your application results
Aggregates Resource Pools
• Compute- & data-intensive apps • Heterogeneous resources • Physical, virtual, cloud • Easy user access

Optimizes Workload Management
• Batch and highly parallelized • Policy- & resource-aware scheduling • Service level agreements • Automation / workflow

Transforms Static Infrastructure to Dynamic
• Workload-driven dynamic clusters • Bursting and "in the cloud" • Enhanced self-service / on-demand • Multi-hypervisor and multi-boot

Delivers Shared Services
• Multiple user groups, sites • Multiple applications and workloads • Governance • Administration / reporting / analytics

For technical computing and analytics distributed computing environments
Clients span many industries

Platform LSF
“Platform Computing came to us as a true innovation partner, not a supplier of technology, but a partner who was able to understand our problems and provide appropriate solutions to us, and work with us to continuously improvethe performance of our system”
- Steve Nevey, Business Development Manager Red Bull Technology
Watch Red Bull video
“Platform’s software was a clear leader from the beginning of the process”
-Chris Collins, Head of Research & Specialist Computing
University of East Anglia
Platform HPC
Platform Symphony
"Platform Computing and its enterprise grid solution enable us to share a formerly heterogeneous and distributed hardware infrastructure across applications regardless of their location, operating system and application logic, … helping us to achieve our internal efficiency targets while at the same time improving our performance and service quality"
– Lorenzo Cervellin, Head of Global Markets and Treasury Infrastructure, UniCredit Global Information Services
European Bank
IBM also offers the most widely used, commercially available, technical computing data management software
IBM General Parallel File System – scalable, highly available, high-performance file system optimized for multi-petabyte storage management
Virtualized Access to Data
GPFS™ – virtualized, centrally deployed, managed, backed up and grown
Cluster file system: all nodes access data.
Seamless capacity and performance scaling
GPFS pioneered Big Data management
File system
• 2^63 files per file system
• Maximum file system size: 2^99 bytes
• Maximum file size equals file system size
• Largest production file system: 5.4 PB
Number of nodes
• 1 to 8,192
Extreme Scalability

No Special Nodes
• Add/remove nodes and storage on the fly
• Rolling upgrades
• Administer from any node

Proven Reliability
• Data replication

Performance
• High-performance metadata
• Striped data
• Equal access to data
• Integrated tiered storage
IBM innovation continues with GPFS Active File Management (AFM) for global namespace
GPFS introduced concurrent file system access from multiple nodes (1993).
Multi-cluster expanded the global namespace by connecting multiple sites (2005).
AFM takes the global namespace truly global by automatically managing asynchronous replication of data (2011).
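The asynchronous replication idea behind AFM can be sketched as a write-behind cache: writes complete against the local site immediately and are queued for later propagation to the home site. The following is a toy model under those assumptions, not the GPFS implementation; the class name and the dict standing in for the "home" site are invented for illustration:

```python
from collections import deque

class WriteBehindCache:
    """Toy model of AFM-style asynchronous replication: writes
    complete locally at once and are propagated to the 'home'
    site later, in order."""

    def __init__(self, home):
        self.home = home          # dict standing in for the home site
        self.cache = {}           # local (edge) copy
        self.pending = deque()    # queued updates not yet replicated

    def write(self, path, data):
        self.cache[path] = data            # local write completes immediately
        self.pending.append((path, data))  # replication happens later

    def read(self, path):
        # Serve from the cache; fetch from home on a miss.
        if path not in self.cache and path in self.home:
            self.cache[path] = self.home[path]
        return self.cache[path]

    def flush(self):
        # Asynchronous replication step: drain the queue to home.
        while self.pending:
            path, data = self.pending.popleft()
            self.home[path] = data

home = {"/docs/readme": b"v1"}
edge = WriteBehindCache(home)
edge.write("/docs/readme", b"v2")
print(home["/docs/readme"])  # still b'v1' before the flush
edge.flush()
print(home["/docs/readme"])  # b'v2' once replication has run
```

The point of the model is that the writer never waits on the WAN: latency to the home site only affects how stale the home copy is, not how fast local work proceeds.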
[Diagram: GPFS™ eliminating data islands – native RAID (GNR), storage-rich servers (SNC), MapReduce farms, legacy HPC storage, legacy NFS storage and TSM/HPSS archive tiers, all joined through NSD servers and AFM + multi-cluster]
How can GPFS deliver value to your business?
• Knowledge management and efficiency through file sharing
• Business flexibility with Cloud Storage
• Innovate with MapReduce/Hadoop
• Maintain business continuity through Disaster Recovery
• Reduce storage costs through lifecycle management
• Speed time-to-market through faster analytics
Speed time-to-market with faster analytics
• Issue:
  – We are in the era of "Smarter Analytics"
  – Data explosion makes I/O a major hurdle
  – Deep analytics result in longer-running workloads
  – Demand for lower-latency analytics to beat the competition
• GPFS was designed for complex and/or large workloads accessing lots of data:
  – Real-time disk scheduling and load balancing ensure all relevant information and data can be ingested for analysis
  – Built-in replication ensures that deep analytics workloads can continue running should a hardware or low-level software failure occur
  – Distributed design means it can scale as needed
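Striping is what lets a parallel file system feed data-hungry analytics: a file is cut into fixed-size blocks spread round-robin across all disks, so a large sequential read is served by every disk at once. A rough sketch of that placement logic follows; the block size and disk count are invented for illustration (real GPFS block sizes are much larger):

```python
BLOCK_SIZE = 4   # bytes per block, tiny for demonstration only
NUM_DISKS = 3

def stripe(data):
    """Split data into fixed-size blocks and assign them
    round-robin across the available disks."""
    disks = [[] for _ in range(NUM_DISKS)]
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        disks[(i // BLOCK_SIZE) % NUM_DISKS].append(block)
    return disks

def read_back(disks):
    """Reassemble the file by walking the disks round-robin,
    mirroring the placement order used by stripe()."""
    blocks = []
    counters = [0] * NUM_DISKS
    total = sum(len(d) for d in disks)
    for n in range(total):
        d = n % NUM_DISKS
        blocks.append(disks[d][counters[d]])
        counters[d] += 1
    return b"".join(blocks)

data = b"abcdefghijklmnopqrstuvwx"   # 24 bytes -> 6 blocks over 3 disks
disks = stripe(data)
assert read_back(disks) == data
print([len(d) for d in disks])       # each disk holds 2 blocks: [2, 2, 2]
```

Because consecutive blocks land on different disks, reading the whole file keeps all three "disks" busy simultaneously, which is the source of the aggregate bandwidth claimed above.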
Reduce storage costs through lifecycle management
• Issue:
  – Increasing storage costs as dormant files sit on spinning disks
  – Redundant files stored across the enterprise to ease access
  – Aligning user file requirements with the cost of storage
• GPFS has policy-driven, automated tiered storage management for optimizing file location:
  – ILM tools manage sets of files across pools of storage based upon user requirements
  – Tiering across different economic classes of storage: SSD, spinning disk, tape – regardless of physical location
  – Interfaces with external storage subsystems such as TSM and HPSS to exploit ILM capability enterprise-wide
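GPFS expresses these ILM decisions in an SQL-like policy language; rules are installed with mmchpolicy and evaluated with mmapplypolicy. The fragment below is an illustrative sketch only: the pool names, thresholds and file pattern are invented for the example, not taken from any real configuration.

```
/* Place new database files on the fast SSD pool. */
RULE 'ssd-placement' SET POOL 'ssd'
  WHERE UPPER(NAME) LIKE '%.DB'

/* When the fast pool passes 80% full, migrate files not
   accessed for 30 days down to nearline disk until it is
   back to 60% full. */
RULE 'age-out' MIGRATE FROM POOL 'system'
  THRESHOLD(80,60) TO POOL 'nearline'
  WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 30

/* Everything else lands in the default pool. */
RULE 'default' SET POOL 'system'
```

Placement rules (SET POOL) run at file creation; migration rules run when the policy engine is invoked or a threshold fires, which is how dormant data drifts to cheaper tiers without user involvement.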
Maintain business continuity through disaster recovery
• Issue:
  – Need for real-time or low-latency file access
  – File data contained in geographic areas susceptible to downtime
  – Fragmented file-based information across a wide geographic area
• GPFS has inherent features designed to ensure high availability of file-based data:
  – Remote file replication with built-in failover
  – Multi-site clustering enables risk reduction of stored data via WAN
  – Space-efficient point-in-time snapshot views of the file system enable quick recovery
Innovate with Big Data or Map-Reduce/Hadoop
• Issue:
  – Unlocking value in large volumes of unstructured data
  – Mission-critical applications requiring enterprise-tested reliability
  – Looking for alternatives to the Hadoop File System (HDFS) for map-reduce applications
• As part of a Research project, there is an active development effort called GPFS-SNC to provide a robust alternative to HDFS:
  – HDFS is a centralized file system with a single point of failure, unlike the distributed design of GPFS
  – GPFS POSIX compliance expands the range of applications that can access files (read, write, append), whereas HDFS cannot append or overwrite
  – GPFS contains all of the rich ILM features for high availability and storage management; HDFS does not
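The POSIX-compliance point is concrete: on a POSIX file system any ordinary program can reopen an existing file to append to it or overwrite bytes in place, which classic write-once HDFS does not allow. A minimal sketch of what POSIX semantics permit, using plain Python file I/O on a temporary file:

```python
import os
import tempfile

# Create a file, then do the two things write-once HDFS forbids:
# append to it, and overwrite bytes in the middle of it.
path = os.path.join(tempfile.mkdtemp(), "demo.log")

with open(path, "w") as f:
    f.write("line one\n")

with open(path, "a") as f:          # POSIX append: reopen and extend
    f.write("line two\n")

with open(path, "r+b") as f:        # POSIX in-place overwrite
    f.seek(0)
    f.write(b"LINE")                # rewrite the first four bytes

with open(path) as f:
    print(f.read())                 # "LINE one\nline two\n"
```

Because GPFS presents exactly this interface, existing tools and libraries work unmodified, while HDFS-only applications must be written against its more restrictive API.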
Knowledge management and efficiency through file sharing
• Issue:
  – Geographically dispersed employees need access to the same set of file-based information
  – Supporting "follow-the-sun" product engineering and development processes (CAD, CAE, etc.)
  – Managing and integrating the workflow of highly fragmented and geographically dispersed file data generated by employees
• GPFS global namespace support and Active File Management provide core capabilities for file sharing:
  – The global namespace enables a common view of files, no matter where the requestor or the file resides
  – Active File Management handles file version control to ensure integrity
  – Parallel data access allows large numbers of files and people to collaborate without performance impact
Intelligent Cluster
System x iDataPlex
Optimized platforms to right-size your Technical Computing operations
IBM leadership for a new generation of Technical Computing
Technical Computing is no longer just the domain of large problems
– Businesses of all sizes need to harness the explosion of data for business advantage
– Workgroups and departments are increasingly using clustering at a smaller scale to drive new insights and better business outcomes
– Smaller groups lack the skills and resources to deploy and manage the system effectively
IBM brings experience in supercomputing to smaller workgroup and department clusters with IBM Intelligent Cluster™
– Reference solutions for simple deployment across a range of applications
– Simplified end-to-end deployment and resource management with Platform HPC software
– Factory integrated and installed by IBM
– Supported as an integrated solution
– Now even easier with IBM Platform Computing
IBM intelligence for clusters of all sizes!
IBM Technical Computing expertise
IBM Intelligent Cluster™ – it’s about faster time-to-solution
Building Blocks: Industry-leading IBM and 3rd Party components
OS
Management Servers
Compute Nodes
Networking
Storage
IBM Intelligent Cluster
Factory-integrated, interoperability-tested system with compute, storage, networking and cluster management tailored to your requirements and supported as a solution!
Cluster Management
Design • Build • Test • Install • Support

Take the time and risk out of Technical Computing deployment
Allows clients to focus on their business, not their IT – backed by IBM
IBM Intelligent Cluster simplifies large and small deployments

Research
• LRZ SuperMUC – Europe-wide research cluster; 9,587 servers, direct water-cooled
• University of Chile – earthquake prediction and astronomy; 56 servers, air-cooled

Media
• Illumination Entertainment – 3D feature-length movies; 800 iDataPlex servers, Rear-Door Heat eXchanger cooled
• Kantana Animation Studios – Thailand television production; 36 iDataPlex servers, air-cooled
Technical Computing Storage
Complete, scalable, dense solutions from a single vendor
IBM System Storage® for Technical Computing
• Complete, scalable, integrated solutions from a single vendor
• Scaling to multiple petabytes and hundreds of gigabytes per second
• Industry-leading data management software and services
• Big Green features lower overall costs
• Worldwide support and service

Storage: DCS3700, DS5000, LTO Tape, SONAS
Middleware/Tools: GPFS
Services
IBM System Storage DCS3700 Performance Module – 6Gb/s x4 SAS-based storage system
IBM's densest storage solution just got better…
Expandable performance, scalability and density starting at entry-level prices
• Powerful hardware platform
  – 2.13GHz quad-core processor
  – 12, 24 or 48GB cache per controller pair
  – 8x base 8Gb FC ports per controller pair
  – Additional host port options via Host Interface Cards
• Drastically improved performance
• Supports up to 360 drives
• Fully supports features in recent and upcoming releases (10.83 feature set)
  – DDP, Enhanced FlashCopy, FlashCopy Consistency Groups, Thin Provisioning, ALUA, VAAI
IBM System Storage DCS3700, now with Performance Module option – 6Gb/s x4 SAS-based storage system
Expanded capabilities of IBM's densest storage solution…
Expandable performance, scalability and density starting at entry-level prices
• New DCS3700 Performance Controller
• High-density storage system designed for general-purpose computing and high-performance technical computing applications
• IBM's densest disk system: 60 drives and dual controllers in 4U, now scaling to over 1PB per system with 3TB drives
• New Dynamic Disk Pooling feature enables easy-to-configure, worry-free storage, reducing maintenance requirements and delivering consistent performance
• New Thin Provisioning, ALUA, VAAI and Enhanced FlashCopy features deliver increased utilization, higher efficiency, and performance
• Superior serviceability and easy installation with front-load drawers
• Bullet-proof reliability and availability designed to ensure continuous high-speed data delivery
The DCS3700 can scale in clusters… with IBM GPFS™
• Combining IBM's GPFS clustered file management software with the DCS3700 creates an extremely scalable and dense file-based management system
• Using a flexible architecture, "building blocks" of DCS3700 + GPFS can be organized into larger configurations
                             Single Building Block        Two Building Blocks
Configuration                2 GPFS x3650 servers,        4 GPFS x3650 servers,
                             3 DCS3700                    6 DCS3700
Capacity (raw / usable)      360TB / 262TB                720TB / 524TB
Streaming rate (write/read)  up to 4.8 / 5.5 GB/s         up to 9.6 / 11.0 GB/s
IOP rate, 4K (write/read)    3,600 / 6,000 IOP/s          7,200 / 12,000 IOP/s
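The pattern in the table is linear scaling: each building block contributes the same capacity and bandwidth, so sizing a cluster is simple multiplication. A back-of-the-envelope sizing helper, using the single-building-block figures from the table above and assuming ideal linear scaling (real deployments may be limited by the network fabric):

```python
# Per-building-block figures from the DCS3700 + GPFS table above.
PER_BLOCK = {
    "raw_tb": 360,
    "usable_tb": 262,
    "write_gbs": 4.8,
    "read_gbs": 5.5,
    "write_iops": 3_600,
    "read_iops": 6_000,
}

def size_cluster(blocks):
    """Linear-scaling estimate: n building blocks deliver n times
    the capacity and throughput of one (fabric limits ignored)."""
    return {k: v * blocks for k, v in PER_BLOCK.items()}

two = size_cluster(2)
print(two["usable_tb"], "TB usable,", two["read_gbs"], "GB/s read")
# matches the two-building-block column: 524 TB usable, 11.0 GB/s read
```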
Customer Success Stories
Applying IBM technology and experience to solve real-world issues and deliver value
Solution components: IBM Power Systems, IBM General Parallel File System
The Need:
NDA needed a cost-effective IT solution that it could use to significantly increase internal efficiencies and archiving capacity. NDA currently holds almost 15 million photographs, 30,000 sound recordings and 2,500 films, and provides free access to these materials.
The Solution:
The client implemented a solution based on IBM Power Systems servers, IBM System Storage devices, and IBM GPFS. To provide scalability for the ongoing work, CompFort Meridian helped NDA implement an IBM Power 750 Express server. Using this system, the client will be able to rapidly expand the digital archive without impeding the performance of the ZoSIA service. NDA also uses IBM General Parallel File System to conduct online storage management and integrated information lifecycle management, and to scale accessibility to the expanding volume of archived material. Using this solution, the client can maintain the performance of the ZoSIA service even when numerous users access the same resource at the same time.
The Benefit:
NDA saved its nearly 290,000 users an estimated $35 million by enabling them to check archives online rather than spending the time and money required to visit NDA in person. The client also gained a high-performance, stable and secure solution to support the ZoSIA online archive system. In addition, with this solution in place, NDA can consolidate national and cultural remembrance, and extend a sense of national origin and heritage into the future.
National Digital Archive (NDA) – PolandHeritage and cultural preservation for society.
The need:
DEISA wanted to advance European computational science through close collaboration between Europe’s most important supercomputing centers by supporting challenging computing tasks and sharing data across a wide-area network.
The solution:
To allow different IBM® and non-IBM supercomputer architectures to access data from across a wide-area network, DEISA worked closely with the IBM Deep Computing team to create a global multicluster file system based on IBM General Parallel File System (GPFS™).
The benefit:
• Allows scientists in different countries to share supercomputing resources and collaborate on large-scale projects
• Enables allocation of specific computing tasks to the most suitable supercomputing resources, boosting performance
• Provides a rapid, reliable and secure shared file system for a variety of supercomputing architectures, both IBM and non-IBM
“Our work with IBM GPFS demonstrates the viability and usefulness of a global file system for collaboration and data-sharing between supercomputing centers—even when the individual supercomputing clusters are based on very different technical architectures. The flexibility of GPFS and its ability to support all the different DEISA supercomputers is highly impressive.”
– Dr. Stefan Heinzel, Director of the Rechenzentrum Garching at the Max Planck Society
Solution components: IBM Power Systems™, IBM BlueGene®/P, IBM PowerPC®, several non-IBM supercomputing architectures including Cray XT5, NEC SX8 and SGI Altix, and IBM® General Parallel File System (GPFS™)
DEISAEnabling 15 European supercomputing centers to collaborate
Solution components: IBM System x iDataPlex, IBM General Parallel File System™
The Need:
To reduce its impact on the environment, Snecma focuses on a number of key factors, such as reducing fuel consumption (and therefore greenhouse gas emissions), reducing noise, and choosing environmentally friendly materials for the manufacturing and maintenance of aviation engines. The company was required to meet the 'Vision 2020' plan set by the European Community. The plan defines the European aviation industry's objectives for 2020, with ambitious environmental targets, including:
– a 50% reduction in perceived noise and in CO2 released per passenger-kilometer
– an 80% reduction in nitrogen oxide (NOx) emissions compared to the year 2000.
To meet these objectives, Snecma needed heavy investment in research and development, powered by supercomputers.
The Solution:
Snecma implemented a powerful high performance computing (HPC) environment with optimal energy efficiency. The core architectural components were based upon highly dense, low-power server cluster packaging, a low-latency interconnect, and a high-performance parallel file system.
Thanks to IBM technologies, Snecma gains a powerful and reliable high performance computing solution. The new supercomputer will be used by leading-edge researchers to perform highly complex computations in the aviation field. The simulations carried out on the iDataPlex supercomputer allow Snecma to reduce fuel consumption (and therefore greenhouse gas emissions) and reduce noise, while also addressing the data center energy crisis.
SnecmaHPC to achieve regulatory objectives
Vestas Wind Systems – maximize power generation and durability in its wind turbines with HPC

Solution components:
• IBM Technical Computing – General Parallel File System
• IBM InfoSphere® BigInsights Enterprise Edition
• IBM System x®, iDataPlex®

The Opportunity
This wind technology company relied on the Weather Research and Forecasting (WRF) modeling system to run its turbine location algorithms, in a process generally requiring weeks and posing inherent data capacity limitations. Poised to begin the development of its own forecasts and to add actual historical data from existing customers to the mix of factors used in the model, Vestas needed a solution to its Big Data challenge that would be faster, more accurate, and better suited to its expanding data set.

"Today, more and more sites are in complex terrain. Turbulence is a big factor at these sites, as the components in a turbine operating in turbulence are under more strain and consequently more likely to fail. Avoiding these pockets of turbulence means improved cost of energy for the customer."
– Anders Rhod Gregersen, Senior Specialist, Plant Siting & Forecasting

What Makes It Smarter
Precise placement of a wind turbine can make a significant difference in the turbine's performance, and in its useful life. In the competitive new arena of sustainable energy, winning the business can depend on both the value demonstrated in the proposal and the speed of the RFP response. Vestas broke free of its dependency on the WRF model with a powerful solution that sliced weeks from the processing time and more than doubled the capacity needed to include all the factors it considers essential for accurately predicting turbine success. Using a supercomputer that is one of the world's largest to date and a modeling solution designed to harvest insights from both structured and unstructured data, the company can factor in temperature, barometric pressure, humidity, precipitation, wind direction and wind velocity from ground level up to 300 feet, along with its own recorded data from customer turbine placements. Other sources to be considered include global deforestation metrics, satellite images, geospatial data and data on phases of the moon and tides. The solution raises the bar for due diligence in determining effective turbine placement.

Real Business Results
– Reduces from weeks to hours the response time for business user requests
– Provides the capability to analyze ALL modeling and related data to improve the accuracy of turbine placement
– Reduces cost per kilowatt-hour produced for customers and increases the precision of customer ROI estimates
…best to hear from a client themselves. Please join me in welcoming
Ivar Koppel, Deputy Director of Research