the mars matrix of rna‐seq project · 2016-11-30 · the exploration of mars! implementation of...

1
Fabien Pierrat 1 , L. Manchon 1 , R. Bruno 1 , D. Piquemal 1 1 ACOBIOM, Montpellier, France 1682, rue de la Valsière / CS 77394 – Cap Delta Biopôle Euromédecine II / 34184 MONTPELLIER Cedex 04 Phone: +33 (0)467 419 748 / Fax: +33 (0)467 457 726 [email protected] / www.acobiom.com Corresponding author: Fabien PIERRAT [email protected] Next? The Exploration of MaRS! Implementation of an advanced search tool to query the matrix in order to highlight biomarkers. Extend work to other species: Mouse. The MaRS (Matrix of RNA‐Seq) Project B. Cirou 2 , V. Cameo Ponz 2 , F. Daumas 2 2 CINES, Montpellier, France Recent years have seen a dramatic increase in the amount of genomic and transcriptomic data produced by laboratories around the world. The aim of the MaRS research program is to collect and to allow the comparison of these data. MaRS is focused on the RNA‐Seq method, which reflects the expression of the genes in a specific condition. RNA‐Seq method is used in a wide variety of applications like identifying disease‐related genes, analysing the effects of drugs on tissues or providing insight into disease pathways. The RNA‐Seq is widely used to identify gene expression patterns associated with tumor formation. MaRS represents a fantastic tool for the discovery and the validation of Biomarkers. 27000 Human RNA‐Seq profiles are selected for the project. In collaboration with the CINES, all these data are collected and computed with a single method to allow their compilation in one matrix: MaRS. It represents a huge amount of data and requires the use of High performance computing (HPC) on the cluster Occigen (CINES): 120 To of compressed data downloaded and 1.2 million hours/core consumed. Tissues & pathologieswealth in MaRS The MaRS project In partnership with: High Performance Computing toolchain Many challenges were encountered and solved during the project: ‐ the download of the very large amount of data ‐ the installation and the settings of the software's in the cluster ‐ the optimization of the compiler and the packages ‐ the constraints in jobs duration and jobs limitations.

Upload: others

Post on 12-Jul-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: The MaRS Matrix of RNA‐Seq Project · 2016-11-30 · The Exploration of MaRS! Implementation of an advancedsearch tool to query the matrix in order to highlight biomarkers. Extend

Fabien Pierrat1, L. Manchon1, R. Bruno1 , D. Piquemal1 1 ACOBIOM, Montpellier, France

1682, rue de la Valsière  /  CS 77394 – Cap DeltaBiopôle Euromédecine II  /  34184 MONTPELLIER Cedex 04Phone: +33 (0)467 419 748  /  Fax: +33 (0)467 457 [email protected] /  www.acobiom.com

Corresponding author: Fabien [email protected]

Next? The Exploration of MaRS! Implementation of an advanced search tool to query the matrix in order to highlight biomarkers.Extend work to other species: Mouse.

The MaRS (Matrix of RNA‐Seq) Project

B. Cirou2, V. Cameo Ponz2, F. Daumas2 2 CINES, Montpellier, France

Recent years have seen a dramatic increase in the amount of genomic and transcriptomic data produced bylaboratories around the world. The aim of the MaRS research program is to collect and to allow the comparisonof these data. MaRS is focused on the RNA‐Seq method, which reflects the expression of the genes in a specificcondition.

RNA‐Seq method is used in a wide variety of applicationslike identifying disease‐related genes, analysing the effectsof drugs on tissues or providing insight into diseasepathways. The RNA‐Seq is widely used to identify geneexpression patterns associated with tumor formation.MaRS represents a fantastic tool for the discovery and thevalidation of Biomarkers.

27000 Human RNA‐Seq profiles are selected forthe project. In collaboration with the CINES, allthese data are collected and computed with asingle method to allow their compilation in onematrix: MaRS.It represents a huge amount of data andrequires the use of High performance computing(HPC) on the cluster Occigen (CINES): 120 To ofcompressed data downloaded and 1.2 millionhours/core consumed.

Tissues & pathologies’ wealth in MaRS

The MaRS project

In partnership with:

High Performance Computing toolchain

Many challenges were encountered and solved during theproject:‐ the download of the very large amount of data‐ the installation and the settings of the software's in thecluster‐ the optimization of the compiler and the packages‐ the constraints in jobs duration and jobs limitations.