BIF-30806 Group Project
Group (A)rabidopsis: David Nieuwenhuijse, Matthew Price, Qianqian Zhang, Thijs Slijkhuis
Species: C. elegans
Project: Advanced (+Basic)
Progress Report
Project Overview
◦ Dataset Preparation
◦ Transcriptome Construction Pipeline
◦ Differentially Expressed Genes
◦ Gene Function
◦ Biological Explanation
◦ Co-expressed Genes Modules
◦ Functional Description & Explanation
◦ Module Conservation between species
◦ Gene Expression (Basic Project)
◦ Relationship to Transcript Properties
◦ Visualisation of Interaction Network
Results so far
David Nieuwenhuijse
◦ GeneID and GO term extraction tool
◦ Cytoscape GO enrichment analysis
◦ Finding an automatic GO enrichment tool for the pipeline
Qianqian Zhang
◦ Created a shell script for running the Cuffdiff, Gffread and Samtools programs
◦ Got the gene lists of the most differentially expressed genes and the highest expressed genes
◦ Visualization of differentially expressed genes with the cummeRbund package: density plot, scatter plot, volcano plot, p-value distribution plot, MA plot etc.
◦ Basic statistics of differentially expressed genes
Results so far
Matthew Price
◦ Script for listing the top 100 expressed genes
◦ Script for determining GC-content, transcript & intron length
◦ Script for getting the correlation between each transcript property and the expression level
Thijs Slijkhuis
◦ Created a shell script that:
    Downloads the source files
    Converts SRA into FASTQ files
    Performs bowtie2-build
    Performs tophat
    Performs cufflinks
◦ Programmed a script that sorts cuffdiff output on p-value (significance in differential expression) and extracts the gene names from it
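The sorting step in the last bullet could look roughly like the Python sketch below. The group's own script is not shown, so this is only an illustration under stated assumptions: the input is Cuffdiff's tab-separated gene_exp.diff table with its usual `gene_id`, `gene`, `status` and `p_value` columns, tests with a status other than OK are skipped, and the function name `top_genes_by_p_value` and its `limit` parameter are hypothetical.

```python
import csv
import io


def top_genes_by_p_value(diff_file, limit=100):
    """Return gene names from a Cuffdiff gene_exp.diff table, smallest p-value first.

    diff_file is any open text file (or file-like object) with the header line
    that Cuffdiff writes. Skipping non-OK tests is an assumption of this sketch.
    """
    reader = csv.DictReader(diff_file, delimiter="\t")
    # Keep only the tests that Cuffdiff reports as completed (status OK).
    rows = [row for row in reader if row["status"] == "OK"]
    rows.sort(key=lambda row: float(row["p_value"]))
    names = []
    for row in rows:
        # Loci without a gene name carry "-" in the gene column; use gene_id instead.
        name = row["gene"] if row["gene"] != "-" else row["gene_id"]
        if name not in names:
            names.append(name)
    return names[:limit]


if __name__ == "__main__":
    # Tiny made-up table, only to show the call; not data from the project.
    header = ("test_id\tgene_id\tgene\tlocus\tsample_1\tsample_2\tstatus\t"
              "value_1\tvalue_2\tlog2(fold_change)\ttest_stat\tp_value\t"
              "q_value\tsignificant\n")
    row = "XLOC_1\tXLOC_1\tgeneA\tI:1-10\tq1\tq2\tOK\t1\t2\t1\t1\t0.01\t0.05\tyes\n"
    print(top_genes_by_p_value(io.StringIO(header + row)))
```

With a real run, the function would be called on an opened gene_exp.diff file instead of the in-memory example.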
Issues/Challenges
Co-expressed Genes Modules
◦ WGCNA package not usable in our case
◦ Use the cummeRbund package to get heatmaps
GO enrichment analysis
◦ Not many genes are annotated in the GO database.
◦ Gene IDs of the differentially expressed genes are not compatible with the NCBI database.
Transcript sequences
◦ Not all expressed transcripts in the .gtf file can be matched to their corresponding sequence in the fasta file.
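One way to see which transcripts fail to match is to compare the identifiers in both files directly. The Python sketch below is only a diagnostic illustration, not the group's method. It assumes that each FASTA header starts with the same identifier that the GTF uses as `transcript_id`, which may not hold for these files. All function names are hypothetical.

```python
import re


def gtf_transcript_ids(gtf_lines):
    """Collect transcript_id values from the attribute column of GTF lines."""
    ids = set()
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # comment or header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match:
            ids.add(match.group(1))
    return ids


def fasta_ids(fasta_lines):
    """Collect sequence identifiers: the first word after '>' on header lines."""
    ids = set()
    for line in fasta_lines:
        if line.startswith(">"):
            words = line[1:].split()
            if words:
                ids.add(words[0])
    return ids


def unmatched_transcripts(gtf_lines, fasta_lines):
    """Return the GTF transcript IDs that have no FASTA record, sorted by name."""
    return sorted(gtf_transcript_ids(gtf_lines) - fasta_ids(fasta_lines))
```

Both arguments can be opened files, since iterating a text file yields its lines. If the resulting list is long, the two files probably use different naming schemes; if it is short, individual records are more likely missing.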
Thank you for your attention!