week 13, lecture 26
Practical Bioinformatics for Life Scientists
István Albert
Bioinformatics Consulting Center, Penn State
Visualizing high-dimensional data
ggplot2, by Hadley Wickham: http://had.co.nz/
There is nothing like it in any programming environment!
Parts of this presentation follow the ggplot2 tutorial:
Getting started with ggplot2
http://had.co.nz/ggplot2/book/qplot.pdf
We will start out with example plots from this manual
Then at the end we generate a peak distribution plot around gene start sites.
Install ggplot2
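A minimal sketch of the install step, assuming a standard R setup with access to CRAN. The mirror URL is an assumption; any CRAN mirror should work.

```r
# Install ggplot2 from CRAN only if it is not already available,
# then load it for the current session.
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2", repos = "https://cloud.r-project.org")
}
library(ggplot2)
```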
diamonds.txt (data comes with ggplot2)
NOTE: For the next few slides I will be changing only line 10 (sometimes we use all the data, sometimes just the small data).
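A rough sketch of the "all data" versus "small data" setup. It assumes the diamonds data frame that ships with ggplot2 rather than the diamonds.txt file, and the name dsmall, the seed, and the sample size of 100 are illustrative choices, not necessarily what the lecture script uses.

```r
library(ggplot2)

# diamonds is bundled with ggplot2, so no file needs to be read.
data(diamonds)

# Draw a small random subset; the fixed seed makes the sample reproducible.
set.seed(1410)
dsmall <- diamonds[sample(nrow(diamonds), 100), ]

# Basic scatter plot on the small data.
# Replace dsmall with diamonds to plot all of the data instead.
p <- ggplot(dsmall, aes(x = carat, y = price)) +
  geom_point()
print(p)
```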
ggplot2 concepts
• geometry: what the plot looks like
• faceting: how many plots/panels
• statistics: transformation on the data
• positioning: fine-tunes locations in the plot
• scales: map data to x,y coordinates
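One way to see how the concepts above show up in code is a single plot that touches each of them. This is a sketch on the bundled diamonds data; the choice of variables, the bin count, and the log scale are assumptions made for illustration.

```r
library(ggplot2)

p <- ggplot(diamonds, aes(x = price, fill = cut)) +
  # geometry: draw bars; statistics: bin the prices and count rows per bin;
  # positioning: stack the bars of the different cuts on top of each other
  geom_histogram(stat = "bin", bins = 30, position = "stack") +
  # faceting: one panel per diamond color
  facet_wrap(~ color) +
  # scales: map price to the x axis on a log10 scale
  scale_x_log10()
print(p)
```

Changing position = "stack" to position = "dodge", or facet_wrap to facet_grid, alters one concept at a time while the rest of the plot stays the same.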
Faceting - multiplots
Faceting and shapes and colors
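A sketch of combining faceting with shapes and colors, again on the bundled diamonds data. The sample size, the seed, and which variable goes to which aesthetic are assumptions chosen for illustration.

```r
library(ggplot2)

# Work on a random subset so individual points stay visible.
set.seed(1)
dsmall <- diamonds[sample(nrow(diamonds), 500), ]

p <- ggplot(dsmall, aes(x = carat, y = price,
                        colour = color,                        # color encodes diamond color
                        shape = factor(cut, ordered = FALSE))) + # shape encodes cut
  geom_point() +
  facet_wrap(~ clarity) +   # one panel per clarity level
  labs(shape = "cut")
print(p)
```

Five variables (carat, price, color, cut, clarity) end up in one figure, which is the point of the high-dimensional approach.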
Scripts are in the supporting data, located in the 26.tar.gz file on the website.
Recall intersecting peaks with genes from the ChIP-Seq lecture. We need an R script to prepare the data for plotting. The code is included in this week's download.
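The script from the download is not reproduced here. As a rough sketch of what the preparation step could look like, the example below uses small made-up data frames (genes and peaks, with hypothetical column names), computes each peak's signed distance to the nearest gene start, and plots the distribution. The 50 bp bin width is also an assumption.

```r
library(ggplot2)

# Hypothetical inputs: gene start coordinates with strand, and peak midpoints.
genes <- data.frame(
  gene   = c("g1", "g2", "g3"),
  start  = c(1000, 5000, 9000),
  strand = c("+", "-", "+")
)
peaks <- data.frame(
  mid = c(950, 1020, 1100, 4900, 5050, 5200, 8800, 9010, 9300)
)

# For each peak, find the index of the nearest gene start.
nearest <- sapply(peaks$mid, function(m) which.min(abs(genes$start - m)))

# Signed distance from the gene start; flip the sign on the minus strand
# so that negative values always mean "upstream of the gene start".
d <- peaks$mid - genes$start[nearest]
peaks$distance <- ifelse(genes$strand[nearest] == "-", -d, d)

# Peak distribution around gene starts.
p <- ggplot(peaks, aes(x = distance)) +
  geom_histogram(binwidth = 50) +
  labs(x = "Distance from gene start (bp)", y = "Number of peaks")
print(p)
```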
Homework 26
Generate four plots with ggplot2 that demonstrate one or more features, including:
– histograms
– shapes
– colors
– faceting