TRANSCRIPT
Intelligent Storage and Data Management with IBM General
Parallel File System (GPFS)
Maciej Remiszewski
IBM Forum 2012 – Estonia
Tallinn, October 9, 2012
Red Bull STRATOS – redbullstratos.com/live/
Explosion of data
Inflexible IT infrastructures
Escalating IT complexity
How to spot trends, predict outcomes and take meaningful actions?
How to manage inflexible, siloed systems and business processes to improve business agility?
How to manage IT costs and complexity while speeding time-to-market for new services?
Critical IT Trends for Technical Computing Users
Introducing the new IBM Technical Computing Portfolio – Powerful. Comprehensive. Intuitive.
Systems & Storage: BG/Q, iDataPlex, Intelligent Cluster, System x & BladeCenter, P7-775, DCS3700, LTO Tape, 3592, tape automation, DS5000, DS3000, SoNAS
Software: Parallel Environment Runtime, Platform MPI, Engineering and Scientific Libraries, GPFS, Platform LSF, Platform HPC, Platform Symphony, Platform Application Center, Parallel Environment Developer, Platform Cluster Manager
Solutions: HPC Cloud, Integrated Solutions, Industry Solutions, Intelligent Cluster, PureFlex, Big Data
NEW HPC Cloud Solutions
Overview
• Innovative solutions for dynamic, flexible HPC cloud environments
What's New
• New LSF add-on: IBM Platform Dynamic Cluster V9.1
  o Workload-driven dynamic node re-provisioning
  o Dynamically switch nodes between physical & virtual machines
  o Automated job checkpoints and migration
  o Smart, flexible policy and performance controls
• Enhanced Platform Cluster Manager – advanced capabilities
• New complete, end-to-end solutions
Use Case 1: HPC Infrastructure Management
• Self-service cluster provisioning & management
• Consolidate resources into an HPC cloud
• Cluster flexing

Use Case 2: Self-service HPC
• Self-service job submission & management
• Dynamic provisioning
• Job migration and/or checkpoint-restart
• 2D/3D remote visualization

Use Case 3: Cloud Bursting
• 'Burst' internally to available resources
• Burst externally to cloud providers
NEW Financial Risk and Crimes Solution
Overview
• High-performance, low-latency integrated risk solution stack with Platform Symphony Advanced Edition and partner products, including:
  o BigInsights and IBM Algorithmics
  o 3rd-party partner products: Murex and Calypso
What's New
• New solution stacks to manage and process big data with speed and scale
• Sales tools that highlight the value of IBM Platform Symphony
  o Financial Risk: customer testimonial videos; inclusion in SWG risk frameworks and S&D blueprints
  o Financial Crime: with BigInsights for credit card fraud analytics
  o TCO tool and benchmarks
Use Case 1: Financial Risk including Credit Value Adjustment (CVA) analytics
• Accelerates compute-intensive workloads up to 4X, e.g. Monte Carlo simulations, Algorithmics RiskWatch "cube" simulations
• Integrated with IBM Algorithmics, Murex and Calypso
• High throughput: 17K tasks/sec

Use Case 2: Big Data for Financial Crimes
• Accelerates analysis of data for fraud and irregularities
• Supports BigInsights
• Faster than the Apache Hadoop distribution
NEW Technical Computing for Big Data Solutions
Overview
• High-performance, low-latency "Big Data" solution stack featuring Platform Symphony, GPFS, DCS3700 and Intelligent Cluster – proven across many industries
• Low-latency Hadoop stack with Platform Symphony Advanced Edition and InfoSphere BigInsights
What's New
• New solution stacks to manage and process big data with speed and scale
IBM General Parallel File System (GPFS)
IBM DCS3700 / IBM Intelligent Cluster
IBM is delivering a Smarter Computing foundation for Technical Computing
Achieve faster time to insight with scalable, low latency data access and control for Big Data analytics
Optimize agility with on-demand and workload-driven dynamic cluster, grid, and HPC clouds
Increase throughput, utilization, and lower operating costs with workload optimized systems and intelligent resource management
Designed for data
Managed with Cloud Technologies
Tuned to the task
Smarter Computing
Technical Computing Software
Simplified management, optimized performance
The backbone of Technical Computing
IBM acquired Platform Computing, a leader in cluster, grid, and HPC cloud management software
• 20-year history delivering leading management software for technical computing and analytics distributed computing environments
• Enables the use and management of thousands of systems as one – powering the evolution from clusters to grids to HPC clouds
• 2,000+ global customers, including 23 of the 30 largest enterprises
• Market-leading scheduling engine with high performance, mission-critical reliability and extreme scalability
• Comprehensive capability footprint, from ready-to-deploy complete cluster systems to large global grids
• Heterogeneous systems support
• Large ISV and global partner ecosystem
• Global services and support coverage
De facto standard for commercial HPC
Over 5 million CPUs under management
60% of top financial services firms
From June 2012, the IBM Platform Computing™ portfolio is ready to deploy!
IBM Platform Computing can help accelerate your application results
Aggregates Resource Pools
• Compute- & data-intensive apps • Heterogeneous resources • Physical, virtual, cloud • Easy user access

Optimizes Workload Management
• Batch and highly parallelized • Policy- & resource-aware scheduling • Service level agreements • Automation / workflow

Transforms Static Infrastructure to Dynamic
• Workload-driven dynamic clusters • Bursting and "in the cloud" • Enhanced self-service / on-demand • Multi-hypervisor and multi-boot

Delivers Shared Services
• Multiple user groups, sites • Multiple applications and workloads • Governance • Administration / reporting / analytics

For technical computing and analytics distributed computing environments
Clients span many industries

Platform LSF
“Platform Computing came to us as a true innovation partner, not a supplier of technology, but a partner who was able to understand our problems and provide appropriate solutions to us, and work with us to continuously improvethe performance of our system”
- Steve Nevey, Business Development Manager Red Bull Technology
Watch Red Bull video
“Platform’s software was a clear leader from the beginning of the process”
-Chris Collins, Head of Research & Specialist Computing
University of East Anglia
Platform HPC
Platform Symphony
"Platform Computing and its enterprise grid solution enable us to share a formerly heterogeneous and distributed hardware infrastructure across applications regardless of their location, operating system and application logic, … helping us to achieve our internal efficiency targets while at the same time improving our performance and service quality"
– Lorenzo Cervellin, Head of Global Markets and Treasury Infrastructure, UniCredit Global Information Services
European Bank
IBM also offers the most widely used, commercially available, technical computing data management software
IBM General Parallel File System – scalable, highly available, high-performance file system optimized for multi-petabyte storage management
Virtualized Access to Data
GPFS™ – virtualized, centrally deployed, managed, backed up and grown
Cluster file system: all nodes access data.
Seamless capacity and performance scaling
GPFS pioneered Big Data management
File system
• 2^63 files per file system
• Maximum file system size: 2^99 bytes
• Maximum file size equals file system size
• Largest production file system: 5.4 PB
Number of nodes
• 1 to 8,192
Extreme Scalability

No Special Nodes
• Add/remove nodes and storage on the fly
• Rolling upgrades
• Administer from any node

Proven Reliability
• Data replication

Performance
• High-performance metadata
• Striped data
• Equal access to data
• Integrated tiered storage
IBM innovation continues with GPFS Active File Management (AFM) for global namespace
GPFS introduced concurrent file system access from multiple nodes (1993).
Multi-cluster expanded the global namespace by connecting multiple sites (2005).
AFM takes the global namespace truly global by automatically managing asynchronous replication of data (2011).
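The asynchronous replication idea behind AFM can be sketched as a write-behind cache: writes complete against the local site immediately and are queued for later propagation to the home site. The following is a toy model under those assumptions, not the GPFS implementation; the class name and the dict standing in for the "home" site are invented for illustration:

```python
from collections import deque

class WriteBehindCache:
    """Toy model of AFM-style asynchronous replication: writes
    complete locally at once and are propagated to the 'home'
    site later, in order."""

    def __init__(self, home):
        self.home = home          # dict standing in for the home site
        self.cache = {}           # local (edge) copy
        self.pending = deque()    # queued updates not yet replicated

    def write(self, path, data):
        self.cache[path] = data            # local write completes immediately
        self.pending.append((path, data))  # replication happens later

    def read(self, path):
        # Serve from the cache; fetch from home on a miss.
        if path not in self.cache and path in self.home:
            self.cache[path] = self.home[path]
        return self.cache[path]

    def flush(self):
        # Asynchronous replication step: drain the queue to home.
        while self.pending:
            path, data = self.pending.popleft()
            self.home[path] = data

home = {"/docs/readme": b"v1"}
edge = WriteBehindCache(home)
edge.write("/docs/readme", b"v2")
print(home["/docs/readme"])  # still b'v1' before the flush
edge.flush()
print(home["/docs/readme"])  # b'v2' once replication has run
```

The point of the model is that the writer never waits on the WAN: latency to the home site only affects how stale the home copy is, not how fast local work proceeds.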
[Diagram: GPFS™ eliminating data islands – native RAID (GNR), storage-rich servers (SNC), MapReduce farms, legacy HPC storage, legacy NFS storage and TSM/HPSS archive tiers, all joined through NSD servers and AFM + multi-cluster]
How can GPFS deliver value to your business?
• Knowledge management and efficiency through file sharing
• Business flexibility with Cloud Storage
• Innovate with MapReduce/Hadoop
• Maintain business continuity through Disaster Recovery
• Reduce storage costs through lifecycle management
• Speed time-to-market through faster analytics
Speed time-to-market with faster analytics
• Issue:
  – We are in the era of "Smarter Analytics"
  – Data explosion makes I/O a major hurdle
  – Deep analytics result in longer-running workloads
  – Demand for lower-latency analytics to beat the competition
• GPFS was designed for complex and/or large workloads accessing lots of data:
  – Real-time disk scheduling and load balancing ensure all relevant information and data can be ingested for analysis
  – Built-in replication ensures that deep analytics workloads can continue running should a hardware or low-level software failure occur
  – Distributed design means it can scale as needed
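Striping is what lets a parallel file system feed data-hungry analytics: a file is cut into fixed-size blocks spread round-robin across all disks, so a large sequential read is served by every disk at once. A rough sketch of that placement logic follows; the block size and disk count are invented for illustration (real GPFS block sizes are much larger):

```python
BLOCK_SIZE = 4   # bytes per block, tiny for demonstration only
NUM_DISKS = 3

def stripe(data):
    """Split data into fixed-size blocks and assign them
    round-robin across the available disks."""
    disks = [[] for _ in range(NUM_DISKS)]
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        disks[(i // BLOCK_SIZE) % NUM_DISKS].append(block)
    return disks

def read_back(disks):
    """Reassemble the file by walking the disks round-robin,
    mirroring the placement order used by stripe()."""
    blocks = []
    counters = [0] * NUM_DISKS
    total = sum(len(d) for d in disks)
    for n in range(total):
        d = n % NUM_DISKS
        blocks.append(disks[d][counters[d]])
        counters[d] += 1
    return b"".join(blocks)

data = b"abcdefghijklmnopqrstuvwx"   # 24 bytes -> 6 blocks over 3 disks
disks = stripe(data)
assert read_back(disks) == data
print([len(d) for d in disks])       # each disk holds 2 blocks: [2, 2, 2]
```

Because consecutive blocks land on different disks, reading the whole file keeps all three "disks" busy simultaneously, which is the source of the aggregate bandwidth claimed above.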
Reduce storage costs through lifecycle management
• Issue:
  – Increasing storage costs as dormant files sit on spinning disks
  – Redundant files stored across the enterprise to ease access
  – Aligning user file requirements with the cost of storage
• GPFS has policy-driven, automated tiered storage management for optimizing file location:
  – ILM tools manage sets of files across pools of storage based upon user requirements
  – Tiering across different economic classes of storage: SSD, spinning disk, tape – regardless of physical location
  – Interfaces with external storage subsystems such as TSM and HPSS to exploit ILM capability enterprise-wide
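GPFS expresses these ILM decisions in an SQL-like policy language; rules are installed with mmchpolicy and evaluated with mmapplypolicy. The fragment below is an illustrative sketch only: the pool names, thresholds and file pattern are invented for the example, not taken from any real configuration.

```
/* Place new database files on the fast SSD pool. */
RULE 'ssd-placement' SET POOL 'ssd'
  WHERE UPPER(NAME) LIKE '%.DB'

/* When the fast pool passes 80% full, migrate files not
   accessed for 30 days down to nearline disk until it is
   back to 60% full. */
RULE 'age-out' MIGRATE FROM POOL 'system'
  THRESHOLD(80,60) TO POOL 'nearline'
  WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 30

/* Everything else lands in the default pool. */
RULE 'default' SET POOL 'system'
```

Placement rules (SET POOL) run at file creation; migration rules run when the policy engine is invoked or a threshold fires, which is how dormant data drifts to cheaper tiers without user involvement.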
Maintain business continuity through disaster recovery
• Issue:
  – Need for real-time or low-latency file access
  – File data contained in geographic areas susceptible to downtime
  – Fragmented file-based information across a wide geographic area
• GPFS has inherent features designed to ensure high availability of file-based data:
  – Remote file replication with built-in failover
  – Multi-site clustering enables risk reduction of stored data via WAN
  – Space-efficient point-in-time snapshot views of the file system enable quick recovery
Innovate with Big Data or Map-Reduce/Hadoop
• Issue:
  – Unlocking value in large volumes of unstructured data
  – Mission-critical applications requiring enterprise-tested reliability
  – Looking for alternatives to the Hadoop File System (HDFS) for map-reduce applications
• As part of a Research project, there is an active development effort called GPFS-SNC to provide a robust alternative to HDFS:
  – HDFS is a centralized file system with a single point of failure, unlike the distributed design of GPFS
  – GPFS POSIX compliance expands the range of applications that can access files (read, write, append), whereas HDFS cannot append or overwrite
  – GPFS contains all of the rich ILM features for high availability and storage management; HDFS does not
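The POSIX-compliance point is concrete: on a POSIX file system any ordinary program can reopen an existing file to append to it or overwrite bytes in place, which classic write-once HDFS does not allow. A minimal sketch of what POSIX semantics permit, using plain Python file I/O on a temporary file:

```python
import os
import tempfile

# Create a file, then do the two things write-once HDFS forbids:
# append to it, and overwrite bytes in the middle of it.
path = os.path.join(tempfile.mkdtemp(), "demo.log")

with open(path, "w") as f:
    f.write("line one\n")

with open(path, "a") as f:          # POSIX append: reopen and extend
    f.write("line two\n")

with open(path, "r+b") as f:        # POSIX in-place overwrite
    f.seek(0)
    f.write(b"LINE")                # rewrite the first four bytes

with open(path) as f:
    print(f.read())                 # "LINE one\nline two\n"
```

Because GPFS presents exactly this interface, existing tools and libraries work unmodified, while HDFS-only applications must be written against its more restrictive API.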
Knowledge management and efficiency through file sharing
• Issue:
  – Geographically dispersed employees need access to the same set of file-based information
  – Supporting "follow-the-sun" product engineering and development processes (CAD, CAE, etc.)
  – Managing and integrating the workflow of highly fragmented and geographically dispersed file data generated by employees
• GPFS global namespace support and Active File Management provide core capabilities for file sharing:
  – The global namespace enables a common view of files, no matter where the requestor or the file resides
  – Active File Management handles file version control to ensure integrity
  – Parallel data access allows large numbers of files and people to collaborate without performance impact
Intelligent Cluster
System x iDataPlex
Optimized platforms to right-size your Technical Computing operations
IBM leadership for a new generation of Technical Computing
Technical Computing is no longer just the domain of large problems
– Businesses of all sizes need to harness the explosion of data for business advantage
– Workgroups and departments are increasingly using clustering at a smaller scale to drive new insights and better business outcomes
– Smaller groups lack the skills and resources to deploy and manage the system effectively
IBM brings experience in supercomputing to smaller workgroup and department clusters with IBM Intelligent Cluster™
– Reference solutions for simple deployment across a range of applications
– Simplified end-to-end deployment and resource management with Platform HPC software
– Factory integrated and installed by IBM
– Supported as an integrated solution
– Now even easier with IBM Platform Computing
IBM intelligence for clusters of all sizes!
IBM Technical Computing expertise
IBM Intelligent Cluster™ – it’s about faster time-to-solution
Building Blocks: Industry-leading IBM and 3rd Party components
OS
Management Servers
Compute Nodes
Networking
Storage
IBM Intelligent Cluster
Factory-integrated, interoperability-tested system with compute, storage, networking and cluster management tailored to your requirements and supported as a solution!
Cluster Management
Design • Build • Test • Install • Support

Take the time and risk out of Technical Computing deployment
Allows clients to focus on their business, not their IT – backed by IBM
IBM Intelligent Cluster simplifies large and small deployments

Research
• LRZ SuperMUC – Europe-wide research cluster; 9,587 servers, direct water-cooled
• University of Chile – earthquake prediction and astronomy; 56 servers, air-cooled

Media
• Illumination Entertainment – 3D feature-length movies; 800 iDataPlex servers, Rear-Door Heat eXchanger cooled
• Kantana Animation Studios – Thailand television production; 36 iDataPlex servers, air-cooled
Technical Computing Storage
Complete, scalable, dense solutions from a single vendor
IBM System Storage® for Technical Computing
• Complete, scalable, integrated solutions from a single vendor
• Scaling to multiple petabytes and hundreds of gigabytes per second
• Industry-leading data management software and services
• Big Green features lower overall costs
• Worldwide support and service

Storage: DCS3700, DS5000, LTO Tape, SONAS
Middleware/Tools: GPFS
Services
IBM System Storage DCS3700 Performance Module – 6Gb/s x4 SAS-based storage system
IBM's densest storage solution just got better…
Expandable performance, scalability and density starting at entry-level prices
• Powerful hardware platform
  – 2.13GHz quad-core processor
  – 12, 24 or 48GB cache per controller pair
  – 8x base 8Gb FC ports per controller pair
  – Additional host port options via Host Interface Cards
• Drastically improved performance
• Supports up to 360 drives
• Fully supports features in recent and upcoming releases (10.83 feature set)
  – DDP, Enhanced FlashCopy, FlashCopy Consistency Groups, Thin Provisioning, ALUA, VAAI
IBM System Storage DCS3700, now with Performance Module option – 6Gb/s x4 SAS-based storage system
Expanded capabilities of IBM's densest storage solution…
Expandable performance, scalability and density starting at entry-level prices
• New DCS3700 Performance Controller
• High-density storage system designed for general-purpose computing and high-performance technical computing applications
• IBM's densest disk system: 60 drives and dual controllers in 4U, now scaling to over 1PB per system with 3TB drives
• New Dynamic Disk Pooling feature enables easy-to-configure, worry-free storage, reducing maintenance requirements and delivering consistent performance
• New Thin Provisioning, ALUA, VAAI and Enhanced FlashCopy features deliver increased utilization, higher efficiency, and performance
• Superior serviceability and easy installation with front-load drawers
• Bullet-proof reliability and availability designed to ensure continuous high-speed data delivery
The DCS3700 can scale in clusters… with IBM GPFS™
• Combining IBM's GPFS clustered file management software with the DCS3700 creates an extremely scalable and dense file-based management system
• Using a flexible architecture, "building blocks" of DCS3700 + GPFS can be organized into larger configurations
                             Single Building Block        Two Building Blocks
Configuration                2 GPFS x3650 servers,        4 GPFS x3650 servers,
                             3 DCS3700                    6 DCS3700
Capacity (raw / usable)      360TB / 262TB                720TB / 524TB
Streaming rate (write/read)  up to 4.8 / 5.5 GB/s         up to 9.6 / 11.0 GB/s
IOP rate, 4K (write/read)    3,600 / 6,000 IOP/s          7,200 / 12,000 IOP/s
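The pattern in the table is linear scaling: each building block contributes the same capacity and bandwidth, so sizing a cluster is simple multiplication. A back-of-the-envelope sizing helper, using the single-building-block figures from the table above and assuming ideal linear scaling (real deployments may be limited by the network fabric):

```python
# Per-building-block figures from the DCS3700 + GPFS table above.
PER_BLOCK = {
    "raw_tb": 360,
    "usable_tb": 262,
    "write_gbs": 4.8,
    "read_gbs": 5.5,
    "write_iops": 3_600,
    "read_iops": 6_000,
}

def size_cluster(blocks):
    """Linear-scaling estimate: n building blocks deliver n times
    the capacity and throughput of one (fabric limits ignored)."""
    return {k: v * blocks for k, v in PER_BLOCK.items()}

two = size_cluster(2)
print(two["usable_tb"], "TB usable,", two["read_gbs"], "GB/s read")
# matches the two-building-block column: 524 TB usable, 11.0 GB/s read
```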
Customer Success Stories
Applying IBM technology and experience to solve real-world issues and deliver value
Solution components: IBM Power Systems, IBM General Parallel File System
The Need:
NDA needed a cost-effective IT solution that it could use to significantly increase internal efficiencies and archiving capacity. NDA currently holds almost 15 million photographs, 30,000 sound recordings and 2,500 films, and provides free access to these materials.
The Solution:
The client implemented a solution based on IBM Power Systems servers, IBM System Storage devices, and IBM GPFS. To provide scalability for the ongoing work, CompFort Meridian helped NDA implement an IBM Power 750 Express server. Using this system, the client will be able to rapidly expand the digital archive without impeding the performance of the ZoSIA service. NDA also uses IBM General Parallel File System to conduct online storage management and integrated information lifecycle management, and to scale accessibility to the expanding volume of archived material. Using this solution, the client can maintain the performance of the ZoSIA service even when numerous users access the same resource at the same time.
The Benefit:
NDA saved its nearly 290,000 users an estimated $35 million by enabling them to check archives online rather than spending the time and money required to visit NDA in person. The client also gained a high-performance, stable and secure solution to support the ZoSIA online archive system. In addition, with this solution in place, NDA can consolidate national and cultural remembrance, and extend a sense of national origin and heritage into the future.
National Digital Archive (NDA) – PolandHeritage and cultural preservation for society.
The need:
DEISA wanted to advance European computational science through close collaboration between Europe’s most important supercomputing centers by supporting challenging computing tasks and sharing data across a wide-area network.
The solution:
To allow different IBM® and non-IBM supercomputer architectures to access data from across a wide-area network, DEISA worked closely with the IBM Deep Computing team to create a global multicluster file system based on IBM General Parallel File System (GPFS™).
The benefit:
• Allows scientists in different countries to share supercomputing resources and collaborate on large-scale projects
• Enables allocation of specific computing tasks to the most suitable supercomputing resources, boosting performance
• Provides a rapid, reliable and secure shared file system for a variety of supercomputing architectures, both IBM and non-IBM
“Our work with IBM GPFS demonstrates the viability and usefulness of a global file system for collaboration and data-sharing between supercomputing centers—even when the individual supercomputing clusters are based on very different technical architectures. The flexibility of GPFS and its ability to support all the different DEISA supercomputers is highly impressive.”
– Dr. Stefan Heinzel, Director of the Rechenzentrum Garching at the Max Planck Society
Solution components: IBM Power Systems™, IBM BlueGene®/P, IBM PowerPC®, several non-IBM supercomputing architectures including Cray XT5, NEC SX8 and SGI Altix, and IBM® General Parallel File System (GPFS™)
DEISAEnabling 15 European supercomputing centers to collaborate
Solution components: IBM System x iDataPlex, IBM General Parallel File System™
The Need:
To reduce its impact on the environment, Snecma focuses on a number of key factors, such as reducing fuel consumption (and therefore greenhouse gas emissions), reducing noise, and choosing environmentally friendly materials for the manufacturing and maintenance of aviation engines. The company was required to meet the 'Vision 2020' plan set by the European Community. The plan defines the European aviation industry's objectives for 2020, with ambitious environmental targets, including:
– a 50% reduction in perceived noise and in CO2 released per passenger-kilometer
– an 80% reduction in nitrogen oxide (NOx) emissions compared to the year 2000.
To meet these objectives, Snecma needed heavy investment in research and development, powered by supercomputers.
The Solution:
Snecma implemented a powerful high performance computing (HPC) environment with optimal energy efficiency. The core architectural components were based upon highly dense, low-power server cluster packaging, a low-latency interconnect, and a high-performance parallel file system.
Thanks to IBM technologies, Snecma gains a powerful and reliable high performance computing solution. The new supercomputer will be used by leading-edge researchers to perform highly complex computations in the aviation field. The simulations carried out on the iDataPlex supercomputer allow Snecma to reduce fuel consumption (and therefore greenhouse gas emissions) and reduce noise, while also addressing the data center energy crisis.
SnecmaHPC to achieve regulatory objectives
Vestas Wind Systems – maximize power generation and durability in its wind turbines with HPC

Solution components:
• IBM Technical Computing – General Parallel File System
• IBM InfoSphere® BigInsights Enterprise Edition
• IBM System x®, iDataPlex®

The Opportunity
This wind technology company relied on the Weather Research and Forecasting (WRF) modeling system to run its turbine location algorithms, in a process generally requiring weeks and posing inherent data capacity limitations. Poised to begin the development of its own forecasts and to add actual historical data from existing customers to the mix of factors used in the model, Vestas needed a solution to its Big Data challenge that would be faster, more accurate, and better suited to its expanding data set.

"Today, more and more sites are in complex terrain. Turbulence is a big factor at these sites, as the components in a turbine operating in turbulence are under more strain and consequently more likely to fail. Avoiding these pockets of turbulence means improved cost of energy for the customer."
– Anders Rhod Gregersen, Senior Specialist, Plant Siting & Forecasting

What Makes It Smarter
Precise placement of a wind turbine can make a significant difference in the turbine's performance, and in its useful life. In the competitive new arena of sustainable energy, winning the business can depend on both the value demonstrated in the proposal and the speed of the RFP response. Vestas broke free of its dependency on the WRF model with a powerful solution that sliced weeks from the processing time and more than doubled the capacity needed to include all the factors it considers essential for accurately predicting turbine success. Using a supercomputer that is one of the world's largest to date and a modeling solution designed to harvest insights from both structured and unstructured data, the company can factor in temperature, barometric pressure, humidity, precipitation, wind direction and wind velocity from ground level up to 300 feet, along with its own recorded data from customer turbine placements. Other sources to be considered include global deforestation metrics, satellite images, geospatial data and data on phases of the moon and tides. The solution raises the bar for due diligence in determining effective turbine placement.

Real Business Results
– Reduces from weeks to hours the response time for business user requests
– Provides the capability to analyze ALL modeling and related data to improve the accuracy of turbine placement
– Reduces cost per kilowatt-hour produced for customers and increases the precision of customer ROI estimates
…best to hear from a client themselves. Please join me in welcoming
Ivar Koppel, Deputy Director of Research