Slide 1 - vorgogoz/bioinfocourses/2018-10... · 2018-11-05 · Lamarck building 4th floor ....

https://paris7bioinfo.wordpress.com/

UMR7216 – Who are we?

http://parisepigenetics.com/fr/

Epigenetics & cell fate - UMR7216 – Where are we?

Lamarck building, 4th floor

Our biological questions

•Development

•Stem cell biology

•Response to stress

•Oncogenesis

•Human diseases

•Differentiation

Kanherkar et al

Our biological questions

•Gene expression

•Chromatin modification

•Chromatin organisation

•DNA modification / structure

•RNA biology

Sequencing technologies

Timeline: late 70's, 90's, 2007

1st generation, 2nd generation, 3rd generation

HTS : High throughput sequencing

NovaSeq = 10 × 10^9 seq / 44 h
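To put the NovaSeq figure in perspective, here is a small worked calculation. It assumes the slide's number reads as 10 × 10^9 sequences per 44-hour run; the function name `throughput` is only illustrative.

```python
def throughput(sequences, hours):
    """Return (sequences per hour, sequences per second) for one run."""
    per_hour = sequences / hours
    per_second = sequences / (hours * 3600)
    return per_hour, per_second

# Assumption: 10 x 10^9 sequences in a 44-hour run, as read from the slide.
per_hour, per_second = throughput(10 * 10**9, 44)
print(f"{per_hour:.2e} sequences/hour, {per_second:,.0f} sequences/second")
# prints: 2.27e+08 sequences/hour, 63,131 sequences/second
```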

High throughput sequencing applications in biology

« OMICS »

Bioinformatics at UMR 7216

1/ Design data analysis workflows

2/ Write/adapt scripts with
- open source tools / algorithms
- in-house tools

3/ Perform data processing with
- (Galaxy)
- local machine
- cluster (RPBS)

Biological questions:

•Integrated view of epigenome and transcriptome alterations in pathological contexts
•Understanding the causal link:
- between these different molecular mechanisms
- with cellular phenotypes and clinical signs
•Establishing biomarkers

Public data: ENCODE, GEO, EBI, UCSC

Approaches: Illumina methylation arrays (450K-850K), RNA-Seq, ChIP-seq, and clinical data

Data management: Galaxy, R

Key packages: DESeq2, Minfi, ChAMP

Chromatin segmentation (ChromHMM)

RepeatMasker

Repli-seq

RNA-seq: total, small, « medium »

ChIP-seq

Methylome

Generated data: OMICS project

Team Claire Francastel : Guillaume Velasco, Florent Hubé

Somatic cell: repressed C/T gene

Cancer cell: re-activated C/T gene

Biological questions:

Public data: colon, breast and lung cancers

GEO Dataset, SRA

WGS, RNA-Seq, 450K data and clinical data

Data generated in the lab: ChIP-Seq, RRBS, Nanostring, microarrays

Data management: R, Unix shell, conda, Docker locally or on the cluster, Rmd, notebooks, Jupyter

Packages / tools: (TCGAbiolinks), DESeq2, MCPcounter ++ in-house scripts

HOMER, Samtools, bedtools, bedops, star, hisat2, htseq, methylkit, ggplot2, complexheatmaps, etc.

Team Pierre-Antoine Defossez : Marthe Laisné, Olivier Kirsh

DNA methylation dependent:
- Gene expression in cancer
- Gene expression in mES
- DNA methylation maintenance
- Functions of an MBD (ZBTB4)
- HIV latency mechanisms
- Interplay between signalling pathways & chromatin remodellers

1. How does an intracellular parasite control both host & parasite gene expression for its survival and the completion of its life cycle?

A complex life cycle with different hosts and many cell types involving immortalisation, proliferation and differentiation processes

Comparative genomics
- approaches: Db, motifs & sequence mining, network analyses, MSA & phylogeny
- main tools: blast, mafft, ete3, Iqtree

Functional (epi)genomics
- approaches: RNA-seq, ChIP-seq, clustering, Ontologies/GSEA
- main tools: Rsubread, HISAT2, EdgeR, Music, Deeptools

data: Lab datasets & public db (EuPathDb, Pfam, Interpro, OrthoMCL-db)

work env.: (local) Linux VM & cluster, Bash, R (Bioconductor) & Docker

2. How did the parasite genome evolve these functions?

compact & dynamic apicomplexan genomes

Team Jonathan Weitzman : Kevin Cheeseman

In-house data: mouse embryo cortices

BS-seq: targeted capture of the DNA methylome (EpiCapture)

ChIP-seq of TFs involved in stress response

Prenatal alcohol exposure during brain development: molecular mechanisms involved in DNA methylation defects

[Figure: differentially methylated region? Tracks: EtOH, control, HSF2 EtOH, HSF2 control]

•Environment - languages:
- bash
- R: packages + personal scripts

•Project management:
- local computer or remote server
- Galaxy

•People involved in BI project:
- Agathe Duchateau (PhD student)
- Délara Sabéran Djoneidi (MCF)
- Sascha Ott (collab. Warwick University)

Impact of inflammatory stress on cell identity

[Figure labels: O4+, MAGNET]

ATAC-seq

In-house data: oligodendrocytes

Microarray

Team Valérie Mezger : Agathe Duchateau, Délara Sabéran Djoneidi

Objectives: consequences of SETDB1 deregulation on the genome-wide distribution of H3K9me3, the localisation of CTCF/cohesin, and the transcriptome

CTCF, cohesin

H3K9me3 and SETDB1

Effect on gene expression?

Transcriptome: RNA-Seq

Impact on the 3D architecture of the genome

Hi-C

SETDB1 amplification could play a key role in the 3D organisation of the genome and the regulation of gene expression

Chromatin immunoprecipitation sequencing (ChIP-Seq)

Epigenomic consequences?

[Figure labels: H3K9me3, CTCF, SETDB1]

Team Slimane Ait-Si-Ali : Gilles Tellier, Vlada Zakharova

Biological questions

• Roles of long non-coding RNAs, repeat elements and genomic organization in X chromosome inactivation
• X chromosome inactivation dynamics in the human hematopoietic lineage: physiological and pathological conditions
• Systematic identification and characterization of non-coding RNAs associated with bladder cancer progression

Bioinformatics approach

• RNA-seq : hierarchical clustering, differential expression analysis, transcript reconstruction

• single-cell RNA-seq : allelic analysis
• ChIP-seq
• Hi-C

• "Homemade" datasets

• Databases : BLUEPRINT, TCGA, GEO, SRA, HPA

Data analysis management

• MTI cluster + Migale (INRA)
• PCs: Ubuntu / Mac
• NAS (32 TB)
• R packages : DESeq2, ConsensusClusterPlus, limma, ggplot2, RCircos…
• Other tools : fastqc, hisat2, samtools, StringTie, scallop, GATK4, DE-kupl, deeptools, HiC-Pro, Blast…

• Used languages : BASH, R, python

Team Claire Rougeulle: Olga Rospopoff, Louis Chauvières, Jean-François Ouimette

Team Bertrand Cosson: Costas Bouyioukos

Biological-bioinformatics questions

• Post-transcriptional epigenetic regulation. The role and the fate of mature mRNA. Regulation at the translation level. mRNA RNA-binding properties and interactions.
• Large-scale, genome-wide integration of multiple -omics datasets. Exploratory and predictive analysis.
• Network systems biology, software development of analytical tools for biological knowledge extraction.

Research approaches

• All kinds of *omics datasets: RNA-seq, ribo-seq, polysome profile, Hi-C, RIP-seq and others.

• Sourced either locally, through collaborations or from large public databases (SRA, ENA, 4D genomes, etc.).

• Using the full repertoire of R / Bioconductor and scientific computing for data science in Python (Scipy, Pandas, biopython, sklearn, etc.)

• Software development of user-friendly, visualisation-rich bioinformatics applications and pipelines (nextflow, shell scripts, python and R packages, Docker - Singularity).

• Curation of the Paris Epigenetics GitHub repository.

https://github.com/parisepigenetics/

Methods and competences.

• Exploratory data analysis. Dimensionality reduction, correlation matrices.

• Computational and statistical learning. Supervised/unsupervised machine learning:
- k-means, hierarchical clustering.
- Bi-clustering.
- Building classifiers (ANN, random forests, boosting, etc.).

• Network inference, computational modelling.
• Visualisation as a tool.
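As an illustration of the unsupervised learning methods listed above, here is a minimal pure-Python sketch of k-means clustering (Lloyd's algorithm). It is not the team's code: the function `kmeans`, the toy `profiles` and the choice of starting centroids are all hypothetical, and real analyses would normally rely on R or scikit-learn implementations.

```python
def kmeans(points, centroids, n_iter=10):
    """Lloyd's algorithm on equal-length numeric tuples.

    Returns (labels, centroids); starting centroids are supplied by the caller.
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(n_iter):
        # Assignment step: index of the nearest centroid (squared Euclidean distance).
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids

# Hypothetical toy data: six samples measured on two features.
profiles = [(1.0, 1.2), (0.9, 1.0), (1.1, 0.8),
            (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
labels, centroids = kmeans(profiles, centroids=[profiles[0], profiles[3]])
print(labels)
```

Passing the starting centroids explicitly keeps the sketch deterministic; random initialisation with several restarts is the more usual choice in practice.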

Training & teaching

• Teaching in the M1-2 BIB:
- Network Systems Biology
- Bioinformatics Genomics
- Scientific Software Carpentry
- Genomics week
- R / Python

• Magistère de génétique – RNA-seq
• L3 BMA
• L3 OMIC
• L2 / L3 Biostatistics - R

C Bouyioukos, MCF

O Kirsh, MCF

JF Ouimette, MCF

A Duchateau, PhD student

K Cheeseman, Post Doc

O Rospopoff, Post Doc

UFR SDV UMR7216

M Laisné, PhD student

B Cosson, PR

G Velasco, MCF

Continuing education - DU

Diplôme Universitaire Bioinformatique Intégrative (university diploma in integrative bioinformatics)

Organisers: Jacques Van Helden, Hélène Chiapello, Bertrand Cosson

Diplôme Universitaire (DU), Université Paris Diderot

Integrate different datasets that come from different levels of analysis in the cell: genomes, transcriptomes, proteomes, metabolomes, interaction networks…

PLoS ONE, Guo et al. 2016

Integrative bioinformatics - Multi-omics approaches - Data Analysis

https://paris7bioinfo.wordpress.com/

Save the date

UMR7216 – people