TRANSCRIPT
University of Lorraine
Peta Project: Scalable Research Data Storage
Sébastien Morosi
Deputy CIO
University of Lorraine
Tuesday, May 8, 2018
Frédéric Nass
Project Manager
University of Lorraine
Tony Botté
Key Account Manager
Red Hat France
AGENDA
The Red Hat Academic Program
University of Lorraine & its context
The Peta Project
Questions / Answers
Red Hat Academic Program
● Our common goals
○ Develop and strengthen Open Source technologies for the future IT workforce
○ Support IT departments through innovative educational IT projects
● National agreement with French Ministry of Higher Education, Research and Innovation
○ Easier access to our technologies for IT students and researchers
○ Provides supported, reliable solutions for education IT departments
● French eligible institutions
○ 40+ French Ministry “Academies” (regional education districts) and education authorities (K-12)
○ 300+ universities, specialized schools & national research institutions
University of Lorraine & its context
« Innovation through the dialogue of knowledge »
[Map: “You are here” (conference venue) / “We are there” (the Lorraine region, France)]
● Offers curricula in all fields of knowledge*
● 63,500+ students including 2,000 PhDs
● 7,500+ staff members including 4,500 teachers & researchers
● 65 research units spread over 50 geographical sites
● “Lorraine University of Excellence” is a project designed as an “engine” for the development of excellence, by stimulating dialogues between knowledge fields
* sciences, health, technology, engineering sciences, human and social sciences, law, economics, management, arts, literature and languages
Secure research data
● Project started in 2014: status report & statement of needs
○ Research Data is critical
○ Many different storage media: USB hard drives, local servers, public cloud services
○ Must be protected from destruction, loss and theft
○ Must comply with the French data security policy
● New needs, new technologies
○ Needs: 1 petabyte of usable storage, scalability (start small and grow), maintained performance while growing, long-term storage, hardware-agnostic and durable, economically acceptable
○ Traditional SAN and NAS storage were no longer suitable
○ Decision: use distributed storage technology, with professional support for a production service
The PETA Project
Red Hat Ceph Storage: How it fulfilled our needs
● Why Ceph?
● RHCS vs. the community version of Ceph?
● RHCS Jumpstart including licensing, support and on-site consulting
● On-site support during cluster installation in December 2015
Production cluster: 3 datacenters in the Lorraine region
● Stretched over 3 datacenters
● Why? To survive the permanent loss of a datacenter plus the loss of a host in one of the two remaining datacenters
● Possible? Yes, with a maximum network latency between datacenters of less than 1 ms
● Supported? Yes, with a support exception (which we didn’t know about at the time)
Cluster map: Stretch cluster
● 3 sites located in 2 cities
● Network connections
○ Based on a dedicated Education & Research WAN
○ 35 miles, 0.8 ms latency, 20 Gbit/s
○ 5 miles, 0.3 ms latency, 40 Gbit/s
Cluster figures: What hardware for how much capacity?
● Per datacenter: 1 MON node, 5 OSD nodes (12x 4 TB OSDs per node)
● Cluster capacity: 720 TB (raw) for 400 TB (usable) using an erasure coding k=5, m=4 data placement scheme (a worked check follows below)
● 180 OSDs providing great performance given the stretch cluster configuration:
○ 4K writes on a replicated pool at 16,000 IOPS
○ 4K writes on an erasure coding k=5, m=4 pool at 20,000 IOPS
● Majority of data written to the cluster through highly available S3 gateways (RGW)
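The slide’s figures can be checked directly. Here is a minimal Python sketch (ours, not from the presentation) that reproduces the erasure coding capacity arithmetic and the failure scenario, assuming the 9 shards are spread 3 per datacenter with at most 1 per host:

```python
# Capacity and fault-tolerance arithmetic for the Peta cluster
# (illustrative sketch; node/OSD counts taken from the slides).

K, M = 5, 4                      # erasure coding profile: 5 data + 4 coding shards
DCS, OSD_NODES_PER_DC = 3, 5
OSDS_PER_NODE, OSD_SIZE_TB = 12, 4

raw_tb = DCS * OSD_NODES_PER_DC * OSDS_PER_NODE * OSD_SIZE_TB
usable_tb = raw_tb * K / (K + M)          # EC efficiency = k / (k + m)
print(f"raw: {raw_tb} TB, usable: {usable_tb:.0f} TB")  # raw: 720 TB, usable: 400 TB

# Failure scenario from the slides: one whole datacenter is lost,
# plus one host in a remaining datacenter. With 9 shards spread
# evenly (3 per DC, at most 1 per host), the surviving shard count
# must stay >= k for every object to remain readable.
shards_per_dc = (K + M) // DCS            # 3 shards per datacenter
surviving = (K + M) - shards_per_dc - 1   # lose one DC, then one host
assert surviving >= K, "data would be unreadable"
print(f"surviving shards: {surviving} (need {K})")      # 5 (need 5)
```

Losing one datacenter (3 shards) plus one host (1 more shard) leaves exactly the k = 5 shards needed to rebuild every object, which matches the survival goal stated earlier.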
Initial use cases: Research Data, and more…
● What kind of data? Lots of it. Large files, hosted for a long time. No need for performance, but for a high level of storage efficiency
● Erasure coding seemed a perfect fit, along with the S3 gateways (RGW); a minimal client sketch follows this list
● The S3 protocol was not really “user friendly” and would not facilitate collaborative work
● We needed more…
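Since RGW speaks the standard S3 API, any S3 client can push research data into the erasure-coded pool. A minimal sketch using boto3; the endpoint, credentials, bucket and file names are all hypothetical placeholders, not values from the presentation:

```python
import boto3

# Hypothetical RGW endpoint and credentials -- replace with your own.
s3 = boto3.client(
    "s3",
    endpoint_url="https://rgw.example.org",  # assumed gateway URL
    aws_access_key_id="RESEARCHER_ACCESS_KEY",
    aws_secret_access_key="RESEARCHER_SECRET_KEY",
)

# Buckets created here land on the erasure-coded pool behind RGW
# (pool/placement mapping is configured server-side, not by the client).
s3.create_bucket(Bucket="lab-dataset-2018")
s3.upload_file("results.tar.gz", "lab-dataset-2018", "results.tar.gz")

# Listing confirms the uploaded object.
for obj in s3.list_objects_v2(Bucket="lab-dataset-2018").get("Contents", []):
    print(obj["Key"], obj["Size"])
```

This works, but as the next slide explains, a raw S3 client is not what most researchers want for day-to-day collaboration.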
Storage Made Easy: Enterprise File Sync and Share (EFSS)
● On-premise solution
● Provides collaborative functionalities
● Multi-tenancy (adapts to multi-sites and multi-entity environments)
● Provides desktop integration and is OS- and mobile-agnostic
● Data can be synced to local desktop
● Data can be accessed without being synced to desktop
● Connects many different storage backends together
● Turns any S3 bucket into an SME Shared Team Folder
Email: Zimbra storage for 100,000+ mailboxes
● 21 virtual servers
● 100,000 email accounts
● 32 TB of data
● Rados connector from Beezim.fr
Zimbra on Ceph
● Ceph is very suitable for other services
● It meets our mailbox storage needs
● We back up only metadata and restore data from Ceph snapshots (an illustrative sketch follows this list)
● Instant backup and restore, unlimited scalability, Zimbra updates with no downtime
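As an illustration of the snapshot idea only (this is not the Beezim connector’s actual code), here is a sketch using the python-rados bindings; the pool, snapshot and object names are hypothetical:

```python
import rados

# Take a pool snapshot of a hypothetical mailbox pool, then roll an
# object back to it after an upgrade or accidental deletion.
cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("zimbra")          # assumed pool name

ioctx.create_snap("before-upgrade")           # instant, pool-wide snapshot

# ... Zimbra upgrade or accidental deletion happens here ...

# Restore a single mailbox blob from the snapshot.
ioctx.snap_rollback("mailbox-blob-0001", "before-upgrade")  # assumed object name

ioctx.close()
cluster.shutdown()
```

Because the snapshot is taken inside Ceph, it is near-instant regardless of the 32 TB of mailbox data, which is what makes the “backup metadata only” approach practical.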
Current and future use cases: Virtualization, web and application hosting
● Virtual machine hosting through Ceph iSCSI targets
● Web hosting with CephFS for CMSs and blogs
● Data hosting for S3-compatible applications like ownCloud and Nuxeo ECM
● Data hosting through Rados for home-made applications (sketch below)
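For home-made applications talking to Rados directly, the python-rados bindings are enough. A minimal sketch; the pool, object and xattr names are hypothetical:

```python
import rados

# Connect using the standard Ceph config and client keyring.
cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()

ioctx = cluster.open_ioctx("apps")            # assumed pool name
ioctx.write_full("report-2018-05", b"hello from librados")
print(ioctx.read("report-2018-05"))           # b'hello from librados'

# Arbitrary key/value metadata can ride along with the object.
ioctx.set_xattr("report-2018-05", "owner", b"peta-project")

ioctx.close()
cluster.shutdown()
```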
Benefits: From using RHCS and from our special partnership with Red Hat
● We sleep soundly thanks to the best possible support in production
● We get pro-active support thanks to the fantastic job of our dedicated CSM
● We get support for planning and trying out “Tech Preview” features
● We’re confident that every new use case in our University will benefit from using RHCS in the future
u2l.fr/googleplus
linkedin.com/school/universit-de-lorraine
u2l.fr/youtube
facebook.com/univlorraine
twitter.com/univ_lorraine
THANK YOU