PRACE-2IP WP10 - iRODS workshop
iRODS projects @ CINES
Gerard GIL (CINES) – [email protected] (Linkoping 26-28 September 2012)
PRACE-2IP WP10 - iRODS workshop (Linkoping 26-28 September 2012)
CINES presentation
CINES projects using iRODS:
• ADONIS
• Archive Replication
• ISAAC
CINES is supervised and funded by the Ministry of Higher Education and Research.
CINES is located in Montpellier (South of France).
30 years serving national academic research.
CINES provides the French public research community with computing resources and services.
50 staff (technicians, engineers and administrative staff).
2 missions:
• Digital preservation
• High Performance Computing
Digital preservation
In 2004 CINES was given the mandate to provide long-term preservation capabilities for digital objects related to scientific and technical information:
• Electronic PhD theses
• Digitized publications
• Multimedia teaching materials
• Scientific datasets
International certification process: CINES is one of the four pilot sites (alongside UKDA, DNb and DANS) testing the European Certification Framework for long-term preservation (supported by the European Commission).
ISO certification.
National agreement (Dec 2010): CINES has received the agreement of the French Institution of Archives for its Digital Archive Repository (PAC platform).
Member of EUDAT
High Performance Computing
Scalar, vector and parallel processing
French T1 system: JADE, SGI Altix ICE 8200
• 23040 Intel cores, 267 Tflops
• 600 TB Lustre filesystem
Accelerators: GPU, CELL, FPGA, ...
PRACE member on behalf of GENCI
National HPC Center since 1980
iRODS in CINES projects
ADONIS
Archives Replication
ISAAC
ADONIS TGE (Very Large Infrastructure)
The TGE is in charge of building a digital infrastructure for unified access to the digital documents and data produced by the « Human and Social Sciences ».
CRDO (Resource Center for Oral Data): pilot project selected to validate options for « Oral data » (2010).
ADONIS
[Diagram labels: Archive Platform; Transfer; Synchronization; Format conversions; Dissemination System; ADONIS; iRODS; iRODS ?; deposit; replication/synchronization]
• Fault tolerance
• Integrity control
• Resource management rules
• High speed data transfers
ARCHIVES REPLICATION
To complete its national agreement for Digital Archive, CINES has to provide a distant copy of the archives it manages.
The Archives Replication Project has been defined to reach this goal.
Several solutions studied: Arcsys, iRODS, iSCSI, …
iRODS was selected:
• Open solution
• Metadata description
• Resource management rules
• Fault tolerance
• Integrity control
• Authorization management
• …
[Diagram labels: CINES Archive system; CINES iRODS server; other iRODS server; other Archive system]
ARCHIVES REPLICATION (cross-replication)
[Diagram: cross-replication between two iRODS zones. Labels: CINES ZONE; other ZONE; replication (×2); other distant storage resource; CINES distant storage resource]
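As a rough illustration of what a single replication step involves, the sketch below copies a file to a second location and confirms that both copies have the same SHA-256 digest. It uses plain local directories in place of iRODS zones and storage resources, and the names `sha256_of` and `replicate_and_verify` are hypothetical: they are not part of iRODS or of the CINES setup.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate_and_verify(source: Path, distant_dir: Path) -> Path:
    """Copy `source` into `distant_dir` and check that the copy is identical.

    `distant_dir` stands in for a distant storage resource (assumption).
    """
    distant_dir.mkdir(parents=True, exist_ok=True)
    replica = distant_dir / source.name
    shutil.copy2(source, replica)
    if sha256_of(source) != sha256_of(replica):
        raise IOError(f"replica of {source.name} does not match the original")
    return replica


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        local = Path(tmp) / "local" / "archive.bin"
        local.parent.mkdir()
        local.write_bytes(b"example archive content")
        copy = replicate_and_verify(local, Path(tmp) / "distant")
        print(copy.name, "replicated and verified")
```

In a cross-replication arrangement, each site would run the same step toward the storage resource hosted by its partner.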
ISAAC: Information Scientifique Archivée Au CINES (Scientific Information Archived at CINES)
Mid-term preservation of scientific data: 3 to 4 years of preservation/archiving.
Objectives:
• Preserve/archive for 3 to 4 years
• Give the researcher additional time to appraise the relevance/importance of the information
• Put in place processes for scientific data valorization/preservation
This goes well beyond simple storage or backup.
At the end of this 3-4 year period, two options:
• Migration onto the long-term preservation platform (PAC)
• Restitution to the producer/owner
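The end-of-period choice above can be pictured as a small decision function. This is only a sketch: the slides do not say how the decision is taken, so the `keep_long_term` flag (standing for the outcome of the researcher's appraisal) and both function names are assumptions.

```python
from datetime import date


def end_of_retention(deposit: date, retention_years: int) -> date:
    """Date on which the mid-term retention period ends."""
    try:
        return deposit.replace(year=deposit.year + retention_years)
    except ValueError:
        # 29 February deposit and a non-leap target year: use 28 February.
        return deposit.replace(year=deposit.year + retention_years, day=28)


def next_step(deposit: date, retention_years: int, today: date,
              keep_long_term: bool) -> str:
    """Return the action for a deposit: keep it, migrate it, or return it.

    `keep_long_term` is a hypothetical flag for the appraisal outcome.
    """
    if today < end_of_retention(deposit, retention_years):
        return "retain in ISAAC"
    return "migrate to PAC" if keep_long_term else "return to producer"


if __name__ == "__main__":
    deposited = date(2012, 9, 26)
    print(next_step(deposited, 3, date(2014, 1, 1), keep_long_term=True))
    print(next_step(deposited, 3, date(2016, 1, 1), keep_long_term=False))
```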
[Diagram: ISAAC workflow. The labelled blocks are listed below.]
DATA PRODUCER
TRANSFER
INGEST
• Document audit
• Format validation
• Metadata input
• Unique persistent identifier
• Additional checks
• Rights management
STORAGE
• Fixity checks
• Replication
• Event logging
ACCESS
• Search on metadata
• Files catalog
• Rights management
• File download
DATA USER
• Producer
• Authorized user
• Communities
EXPERTS GROUP, THEMATIC COMMITTEE
Preservation context definition and service level agreement:
• Metadata
• File formats
• Knowledge base
• Etc.
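The "fixity checks" and "event logging" items of the STORAGE block can be sketched as a periodic audit: each object's current digest is compared with the digest recorded at ingest, and one event is logged per object. The version below is a minimal sketch that works on in-memory dictionaries; the function name `fixity_audit`, the event fields and the choice of SHA-256 are assumptions, not a description of the ISAAC implementation.

```python
import hashlib
from datetime import datetime, timezone


def fixity_audit(stored: dict, manifest: dict) -> list:
    """Compare each object's current digest with the one recorded at ingest.

    `stored` maps an identifier to the object's bytes; `manifest` maps the
    same identifier to the SHA-256 hex digest recorded at ingest. Both are
    in-memory stand-ins (assumptions) for a real storage layer and catalog.
    Returns one event record per identifier listed in the manifest.
    """
    events = []
    for identifier, expected in sorted(manifest.items()):
        data = stored.get(identifier)
        if data is None:
            outcome = "missing"
        elif hashlib.sha256(data).hexdigest() == expected:
            outcome = "ok"
        else:
            outcome = "corrupted"
        events.append({
            "id": identifier,
            "outcome": outcome,
            "checked_at": datetime.now(timezone.utc).isoformat(),
        })
    return events


if __name__ == "__main__":
    manifest = {
        "doc-1": hashlib.sha256(b"thesis").hexdigest(),
        "doc-2": hashlib.sha256(b"dataset").hexdigest(),
    }
    stored = {"doc-1": b"thesis", "doc-2": b"dataset with a flipped bit"}
    for event in fixity_audit(stored, manifest):
        print(event["id"], event["outcome"])
```

Keeping the result as a list of event records means the same data can feed both an alert on failures and a preservation history for each object.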
ISAAC
iRODS
• Distributed data
• Resource management rules
• Metadata description
• Versatile management for Big Data
• Authorization management
• Fault tolerance
• Integrity checks
• High speed data transfer
• …
[Diagram: ISAAC architecture. The labelled blocks are listed below.]
STORAGE (×3)
Storage abstract layer
Ingest API, format validation, metadata management, integrity checks, rights management, …
WEB INTERFACE
DATA USER
• Producer
• Authorized user
• Communities
ISAAC
ISAAC: Information Scientifique Archivée Au CINES (Scientific Information Archived at CINES)
A generic platform with a national/European scope.
A prototype is being put in place for the PRECCINSTA datasets:
• Prediction and Control of Combustion Instabilities for industrial gas turbines
• Potentially 2 TB of data (up to 10 TB June 2012)
• HDF5, NetCDF, XDMF
Developments based on standard, open technologies: iRODS, Java, PostgreSQL, OpenLDAP, …
Questions ?
Information: http://www.cines.fr, [email protected] (Digital Archive Dept. leader)