![Page 2: Intro to Biocoductor and GeneGraph by Kyrylo Bessonov (kbessonov@ulg.ac.be) 9 Oct 2012](https://reader033.vdocuments.us/reader033/viewer/2022051018/56649e9d5503460f94b9ebeb/html5/thumbnails/2.jpg)
Bioconductor repository
• A repository of extensions and libraries for the R language, found at http://www.bioconductor.org/
• Bioconductor libraries cover:
  • Microarray analysis
  • Genetic variant analysis (SNPs)
  • Sequence analysis (FASTA, RNA-seq)
  • Annotation (pathways, genes)
  • High-throughput assays (Mass Spec)
• All libraries are free to use and contain documentation (i.e. vignettes)
• Vignettes are short and require some previous knowledge of R and/or of the libraries they depend on
Configuring R for Bioconductor
• To install Bioconductor libraries, the R environment first needs to be configured:
    source("http://bioconductor.org/biocLite.R")
• To download any Bioconductor library, use the biocLite("package_name") function:
    biocLite("GenomeGraphs")
• Load the downloaded library's functions via the library("lib_name") function:
    library("GenomeGraphs")
• Read the help file(s) via ?? or browseVignettes():
    ??library_name (e.g. ??GenomeGraphs)
    browseVignettes(package = "GenomeGraphs")
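The biocLite.R script reflects the Bioconductor setup at the time of these slides (2012). On more recent R/Bioconductor releases the same step is done through the BiocManager package from CRAN. A minimal sketch follows. The helper name install_bioc is hypothetical, and whether a particular package (GenomeGraphs included) is still distributed for a given Bioconductor release should be checked rather than assumed.

```r
# Hypothetical helper (not part of the slides): install Bioconductor packages
# via BiocManager, which replaces source(".../biocLite.R") on recent releases.
install_bioc <- function(pkgs) {
  # Keep only the packages that are not installed yet
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  to_install <- pkgs[!installed]
  if (length(to_install) == 0) {
    return(invisible(to_install))
  }
  # BiocManager itself is an ordinary CRAN package
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  BiocManager::install(to_install)
  invisible(to_install)
}

# Usage (needs network access, so it is left commented out):
# install_bioc(c("biomaRt", "GenomeGraphs"))
```

The function returns, invisibly, the names it had to install, so calling it with packages that are already present does nothing.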
Genome Data Display and Plotting
The GenomeGraphs library (Bioconductor library)
Authors: Steffen Durinck and James Bullard
Intro to GenomeGraphs library
• Allows retrieving data from:
  • Ensembl, a comprehensive DB on genomes
    • intron/exon locations
    • sequence genetic variation data
    • protein properties (pI, domains, motifs)
• The GenomeGraphs Bioc library can display data on:
  • Gene expression (expression / location)
  • Comparative genomic hybridization (CGH)
  • Sequencing (e.g. variations/introns/exons)
GenomeGraphs: Plotting with gdPlot
• gdPlot is the main plotting function:
    gdPlot(gdObjects, minBase, maxBase, ...)
  • gdObjects = any objects created by GenomeGraphs: a) BaseTrack; b) Gene; c) GenericArray; d) RectangleOverlay
  • minBase = the lowest nt position to be plotted (optional)
  • maxBase = the largest nt position to display (optional)
• Objects are data structures that hold/organize a set of variables
  1) Create an object to plot using the makeBaseTrack() function:
       ObjBaseT = makeBaseTrack(1:100, rnorm(100), strand = "+")
  2) Plot the newly created object:
       gdPlot(ObjBaseT)
• Display the object structure / properties (e.g. of BaseTrack):
    attributes(ObjBaseT)
  • In the output, the $ sign next to a name marks an object variable
• Change or view individual object variables:
    attr(ObjBaseT, "strand")
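The attributes()/attr() calls work because, for R's formal (S4) classes, the object variables (slots) are stored as attributes. The sketch below shows this on a toy class. ToyTrack is a hypothetical stand-in defined here for illustration, not the GenomeGraphs BaseTrack class, and it assumes GenomeGraphs objects behave like ordinary S4 objects in this respect.

```r
library(methods)

# Hypothetical stand-in for a BaseTrack-like object (not the GenomeGraphs class)
setClass("ToyTrack",
         representation(base = "numeric", value = "numeric", strand = "character"))

trk <- new("ToyTrack",
           base = as.numeric(1:5),
           value = c(0.2, 1.1, 0.7, 0.4, 0.9),
           strand = "+")

# attributes() lists the slots by name, plus the class itself
slot_names <- setdiff(names(attributes(trk)), "class")

# Three equivalent ways to read one slot
via_attr <- attr(trk, "strand")
via_at   <- trk@strand
via_slot <- slot(trk, "strand")

# attr<- writes the slot directly
attr(trk, "strand") <- "-"
```

After the last line, trk@strand holds "-", which is the same effect as the attr() assignment style used in these slides.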
Manipulating GenomeGraphs class graphical properties
• Changing values of variables (classical way):
    attr(Object, "variable/parameter") = value
    attr(ObjBaseT, "strand") = "-"
• If the variable is an array, access individual elements with [element number 1..n]:
    attr(ObjBaseT, "variable/parameter")[number] = new_value
    attr(ObjBaseT, "base")[1] = 0
• Changing graphical parameters such as color:
    showDisplayOptions(ObjBaseT)
      alpha = 1
      lty = solid
      color = orange
      lwd = 1
      size = 5
      type = p
    getPar(ObjBaseT, "color")
      [1] "orange"
    setPar(ObjBaseT, "color", "blue")
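Note that setPar() is called without assigning its result, which suggests the display parameters are updated in place. One way R supports such in-place updates is by storing values in an environment. The sketch below illustrates that mechanism only; new_pars, set_par and get_par are hypothetical names, and this is an assumption about how such a store can work, not a description of GenomeGraphs internals.

```r
# Hypothetical display-parameter store backed by an environment
new_pars <- function(...) list2env(list(...), envir = new.env())

set_par <- function(pars, name, value) {
  assign(name, value, envir = pars)  # modifies the environment in place
  invisible(pars)
}

get_par <- function(pars, name) get(name, envir = pars, inherits = FALSE)

p <- new_pars(color = "orange", lwd = 1)
tracks <- list(p)                      # the list refers to the same environment

set_par(tracks[[1]], "color", "blue")  # no assignment back into p or tracks
color_now <- get_par(p, "color")       # "blue"
```

Because environments are not copied on assignment, the change made through the list element is visible through the original variable as well.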
gdPlot composite/group plots
• gdPlot(…) can take any number of gdObjects to plot
• Let's plot two BaseTrack objects, a and b:
    a = makeBaseTrack(1:100, rlnorm(100), strand = "+")
    b = makeBaseTrack(1:100, rnorm(100), strand = "-")
    ab = list(a, b)
    gdPlot(ab)
Manipulating values of the grouped gdObjects
• ab is a list of objects a and b
• To access an individual object within the list, use [ ]:
    ab[1] will display object a
    ab[2] will display object b
• To modify an object inside the group, use double [[ ]]
• To change the color of b to red:
    getPar(ab[[2]], "color")
    setPar(ab[[2]], "color", "red")
  -OR- (re-create the object)
    b = makeBaseTrack(1:100, rnorm(100), strand = "-",
                      dp = DisplayPars(color = "red"))
    ab = list(a, b)
    gdPlot(ab)
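The difference between [ ] and [[ ]] is plain R list behavior and can be tried without GenomeGraphs. In this small sketch, ordinary numeric vectors stand in for the BaseTrack objects a and b (an assumption made only to keep the example self-contained).

```r
# Plain numeric vectors stand in for the BaseTrack objects a and b
a <- c(1, 2, 3)
b <- c(10, 20, 30)
ab <- list(a, b)

wrapped <- ab[2]    # still a list, of length 1, that contains b
elem    <- ab[[2]]  # b itself

is.list(wrapped)    # TRUE
identical(elem, b)  # TRUE

# Replacing an element of the group also goes through [[ ]]
ab[[2]] <- b * 2
```

Single brackets always return a sub-list, so functions that expect the object itself need the double-bracket form.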
Plotting with labels and legend
• gdPlot does not provide functions to label axes
• Trick = use "labeled / tagged" objects:
    "label" = GenomeGraphs object
    "+ strand" = makeBaseTrack(1:100, rlnorm(100), strand = "+")
• To display a legend, use makeLegend("text", "color"):
    ab = list("+" = a, "-" = b, makeGenomeAxis(),
              makeLegend(c("+", "-"), c("orange", "red")))
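The "tags" in this trick are ordinary R list names. The sketch below shows what a partially named list looks like, using character strings as placeholders for the track, axis and legend objects (placeholders chosen only so the example runs without GenomeGraphs).

```r
# Strings stand in for the GenomeGraphs objects
tracks <- list("+" = "track a", "-" = "track b", "axis placeholder")

# Elements given without a tag get an empty name
tag_names <- names(tracks)  # "+" "-" ""

# A tag can also be added or changed after the list is built
names(tracks)[3] <- "axis"
```

Names containing symbols such as + or - must be quoted when the list is created, as done on the slide.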
Retrieving and Displaying Data from Public Databases
Combining capabilities of the biomaRt and GenomeGraphs libraries
Welcome to biomaRt
• The biomaRt library allows retrieving data from public DBs:
    ensembl            ENSEMBL GENES 68 (SANGER UK)
    snp                ENSEMBL VARIATION 68 (SANGER UK)
    unimart            UNIPROT (EBI UK)
    bacteria_mart_14   ENSEMBL BACTERIA 14 (EBI UK)
  ***Use listMarts() to see all available databases***
• Let's retrieve gene data of the Bacillus subtilis strain
• useMart(database, dataset) connects to the specified database and to a dataset within this database:
    db = useMart("bacteria_mart_14")
    listDatasets(db)
      Dataset      Description                                 version
      …            …                                           …
      bac_6_gene   Bacillus subtilis genes (EB 2 b_subtilis)   EB 2 b_subtilis
    db = useMart("bacteria_mart_14", "bac_6_gene")
Exploring biomaRt object
• listAttributes() shows all properties of the biomaRt object
• The db object has a total of 4175 genes
• Use getBM(attribute, filter, value, biomaRt_obj) to extract values belonging to the specified attributes
  • attribute: general term such as gene name / chromosome # / strand (+ or -)
  • filter: parameter applied on an attribute, such as the genomic region to consider (i.e. start and end in nt)
  • value: actual values of the applied filter(s)
    getBM(c("external_gene_id", "description", "start_position", "end_position", "strand"),
          filters = c("start", "end"), values = list(1, 10000), mart = db)
      ID     description                           start(nt)   end(nt)   strand
      metS   Methionyl-tRNA synthetase             45633       47627      1
      ftsH   Cell division protease ftsH homolog   76984       78897      1
      hslO   33 kDa chaperonin                     79880       80755      1
      Dgk    Deoxyguanosine kinase                 23146       23769     -1
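To make the roles of attribute, filter and value concrete, here is a purely local analogy on a small data frame holding the four rows listed on this slide. The function toy_getBM is hypothetical: it selects columns (attributes) and keeps rows whose filter column equals one of the given values. The real getBM() queries a remote BioMart server, and its filters have server-defined meanings that this sketch does not try to reproduce.

```r
# The four rows from the slide, typed in by hand as a toy table
genes <- data.frame(
  external_gene_id = c("metS", "ftsH", "hslO", "Dgk"),
  description = c("Methionyl-tRNA synthetase",
                  "Cell division protease ftsH homolog",
                  "33 kDa chaperonin",
                  "Deoxyguanosine kinase"),
  start_position = c(45633, 76984, 79880, 23146),
  end_position = c(47627, 78897, 80755, 23769),
  strand = c(1, 1, 1, -1),
  stringsAsFactors = FALSE
)

# Hypothetical local analogue of getBM(): attributes pick columns,
# filters/values pick rows by exact match
toy_getBM <- function(attributes, filters, values, table) {
  keep <- rep(TRUE, nrow(table))
  for (i in seq_along(filters)) {
    keep <- keep & table[[filters[i]]] %in% values[[i]]
  }
  table[keep, attributes, drop = FALSE]
}

# Gene name and start position for genes on the + strand (coded as 1)
plus_strand <- toy_getBM(c("external_gene_id", "start_position"),
                         filters = "strand", values = list(1), table = genes)
```

Each filter is paired with the value at the same position in the values list, which mirrors how the filters and values arguments line up in the getBM() call.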
Plotting the selected “Genome Region”
• Create an object with the makeGeneRegion() function:
    makeGeneRegion(start, end, chromosome name, strand, biomaRt object, plotting options)
• Find the notation used for chromosome naming:
    getBM("chromosome_name", "", "", db)
        chromosome_name
      1 Chromosome
• Create the gene region object and plot it:
    gRegion = makeGeneRegion(1, 10000, chr = "Chromosome",
                             strand = "+", biomart = db,
                             dp = DisplayPars(plotId = TRUE, idRotation = 90,
                                              cex = 0.8, idColor = "black"))
    gdPlot(list(gRegion, makeGenomeAxis(), makeTitle("Position(nt)", cex = 3, "black", 0.1)))
Bacillus subtilis genome region (intron / exon)
[Figure: plot of the selected genome region; label on slide: ensembl_gene_id]
Mapping RNA-seq Expression Data
makeGenericArray()
Intro to RNA-seq data
• HT sequencing technologies allow sequencing mRNA as a series of short contigs of 50-200 bp
• In addition to gene expression analysis, it is possible to:
  • Map the location of introns (UTRs) / exons
    • Principle: one searches for rapid changes in the abundance of the RNA-Seq signal (contigs)
  • Integrate sequence + expression information
• Ugrappa Nagalakshmi et al. (2008) used this strategy to accurately map the yeast genome [1]
• Task: display part of the seqDataEx dataset having both
  • abundance of mRNA (cDNA) transcripts and
  • annotated yeast genome
[1] Ugrappa Nagalakshmi et al. The transcriptional landscape of the yeast genome defined by RNA sequencing. Science, 2008
Plotting RNA-seq data with makeGenericArray()
• Need to get mRNA contig abundance data:
    data("seqDataEx")
    rnaSeqAb = seqDataEx$david
• Create a GenericArray object with
    makeGenericArray(intensity, probeStart, probeEnd, trackOverlay, dp = NULL)
  • intensity: either microarray or RNA-seq transcript abundance signal
  • probeStart: start position of the probe (location in nt)
• Plot contig abundance w.r.t. genomic location:
    gdPlot(list("abun" = makeGenericArray(rnaSeqAb[, "expr", drop = FALSE],
                                          rnaSeqAb[, "location"]),
                makeGenomeAxis()))
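The two subsetting expressions in the plotting call differ in one detail: the intensity column is taken with drop = FALSE, the location column without it. The sketch below shows the effect on a toy matrix. The column names follow the slide's indexing, the numbers are made up, and the reading that makeGenericArray wants its intensities as a matrix is an assumption based on how the slide calls it.

```r
# Toy matrix standing in for seqDataEx$david (values are made up)
toy <- cbind(location = c(1299000, 1299050, 1299100),
             expr     = c(0.5, 2.3, 1.7))

as_matrix <- toy[, "expr", drop = FALSE]  # stays a 3 x 1 matrix, column name kept
as_vector <- toy[, "location"]            # collapses to a plain numeric vector

dim(as_matrix)  # 3 1
dim(as_vector)  # NULL
```

Without drop = FALSE, selecting a single column silently converts the result to a vector, which loses the column name and the matrix shape.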
Adding genomic annotation to the previous plot
• Need to get annotated yeast genome data:
    annotGen = useMart("ensembl", "scerevisiae_gene_ensembl")
• Create the mRNA contig abundance track:
    mRNAabun = makeGenericArray(rnaSeqAb[, "expr", drop = FALSE],
                                rnaSeqAb[, "location"])
• Create the annotated sequence covering the mRNA contig locations:
    annotSeq = makeGeneRegion(start = min(rnaSeqAb[, "location"]),
                              end = max(rnaSeqAb[, "location"]), chr = "IV",
                              strand = "+", biomart = annotGen,
                              dp = DisplayPars(plotId = TRUE, idRotation = 0,
                                               cex = 0.85, idColor = "black", size = 0.5))
• Combine the objects and plot them:
    gdPlot(list("abund" = mRNAabun, makeGenomeAxis(), "+" = annotSeq),
           1299000, 1312000)
Resulting plot: Transcript abundance w.r.t. location
Overlays of Basic Shapes and Custom Text
• To overlay a rectangle, use makeRectangleOverlay():
    makeRectangleOverlay(start, end, region = NULL,
                         coords = c("genomic", "absolute"), dp)
    rectOver = makeRectangleOverlay(1301500, 1302200, region = c(1, 2), "genomic",
                                    DisplayPars(alpha = 0.5))
• To overlay text, use makeTextOverlay():
    tOver = makeTextOverlay("Ribosomal Large subunit",
                            1302000, 0.95, region = c(1, 1),
                            dp = DisplayPars(color = "red"))
• Combine all overlay objects into one vector with c(v1, v2):
    gdPlot(list("abund" = mRNAabun, makeGenomeAxis(), "+" = annotSeq),
           overlays = c(rectOver, tOver))
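When c() is applied to objects of a formal (S4) class that defines no c() method of its own, base R returns a list holding the objects. The sketch below shows this with a toy class; ToyOverlay is hypothetical and stands in for the overlay classes only on the assumption that they are ordinary S4 objects.

```r
library(methods)

# Hypothetical stand-in for an overlay class (not a GenomeGraphs class)
setClass("ToyOverlay", representation(label = "character"))

rect_toy <- new("ToyOverlay", label = "rectangle")
text_toy <- new("ToyOverlay", label = "text")

overlays <- c(rect_toy, text_toy)

is.list(overlays)  # TRUE
length(overlays)   # 2

# Each element is still a complete object
labels <- vapply(overlays, function(o) o@label, character(1))
```

Writing list(rect_toy, text_toy) produces the same structure and makes the intent explicit.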
Overlay of rectangle and text
Alternative splicing of transcript
makeTranscript(id, type, biomart, dp = NULL)
Displaying alternative splicing of a gene
• mRNA coming from the same ORF could be spliced in many ways
  • E.g. case of VDR genes of IgG
• Given a biomaRt object, makeTranscript() will extract splicing information for a given id (i.e. gene)
• Connect to the human genome database:
    hGenome <- useMart("ensembl", "hsapiens_gene_ensembl")
• Select an Ensembl ID to look at:
    head(getBM(c("ensembl_gene_id", "description"), "", "", hGenome))
• Get splicing data from the biomaRt object (hGenome):
    spliceObj = makeTranscript("ENSG00000168309", "ensembl_gene_id", hGenome)
• Plot the object with gdPlot:
    gdPlot(list(makeTitle("Transcript ID: ENSG00000168309"), spliceObj, makeGenomeAxis()))
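The retrieval and plotting steps on this slide can be chained in a small wrapper so that the gene ID is typed only once. This is a sketch under stated assumptions: plot_transcripts is a hypothetical name, it uses only the calls shown on this slide, and it expects GenomeGraphs to be attached and mart to come from useMart().

```r
# Hypothetical convenience wrapper around the calls shown on this slide.
# Assumes library(GenomeGraphs) has been run and `mart` was created by useMart().
plot_transcripts <- function(gene_id, mart) {
  # Retrieve transcript (splicing) information for the gene
  transcripts <- makeTranscript(gene_id, "ensembl_gene_id", mart)
  # Title, transcript track and genome axis in one plot
  gdPlot(list(makeTitle(paste("Transcript ID:", gene_id)),
              transcripts,
              makeGenomeAxis()))
  invisible(transcripts)
}

# Usage (needs GenomeGraphs and network access, so it is left commented out):
# plot_transcripts("ENSG00000168309", hGenome)
```

Returning the transcript object invisibly lets the caller reuse it, for example to change display parameters and plot again.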
The final result
Conclusion
• GenomeGraphs provides a wide range of ways to plot genomic data
• Can use external databases through biomaRt
• Main useful features:
  • identifies exons/introns
  • allows cross-referencing expression / genome data
  • flexible albeit complex plotting capabilities
  • allows overlaying graphical objects and text
  • ability to create custom legends
  • annotation capabilities provided by the powerful biomaRt
Thank you for your patience!
&
Happy Bioconductor/R Exploration!