Page 1: Introduction to S-Plus

Introduction to S-Plus
by Francesco Ferretti
Analysis of Biological Data Course
Winter term 2007, Dalhousie University

Page 2: Introduction to S-Plus

Introduction
• S-Plus and R are statistical programs that use the S language.
• S was developed at AT&T's Bell Labs in the 1970s by Rick Becker, John Chambers and Allan Wilks.
• In 1987 Douglas Martin, of the University of Washington, founded the company that is now Insightful Corporation. It made S more popular and compatible with many hardware platforms, and provided the support needed for technical and statistical problems. S became S-Plus.
• The R project took shape in the 1990s: R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, and the R core team was formed in 1997. R is similar to S-Plus and freely available.

Page 3: Introduction to S-Plus

S-Plus and R
• Flexible and powerful statistical programs
• Particularly appealing for their graphical capabilities
• Can be problematic with large amounts of data
  • SAS is more powerful in these cases

Page 4: Introduction to S-Plus

GUI (Graphical User Interface)
• Main toolbar and several windows
• Object Explorer
  • Overview of what is available on the system
  • Computational engine: data frames, lists, matrices, vectors
  • Interface objects: search path, menu items, toolbars, dialogs
  • Document objects (outputs): graph sheets, scripts and reports
• The Object Explorer displays all the objects you have in your working directory

Page 5: Introduction to S-Plus

GUI (Graphical User Interface)
• Import data
  • File>Import Data>From file
• Export data
  • File>Export Data>to file
  • Choose among the data frames present in your working directory, then give a location and an extension
• Creating graphs
  1. Highlight a dataset in the Object Explorer
  2. Select variables (Ctrl-select)
  3. Click on 2D plots
  4. Choose the preferred graph type
  5. Save the graph
     • Default: *.sgr (S-Plus graph sheet)
     • Optionally, choose another picture format with File>Export Graph.., then specify location, name and extension and click OK

Page 6: Introduction to S-Plus

GUI (Graphical User Interface)
• Summary statistics
  1. From the Object Explorer, select a data frame
  2. On the main toolbar, select Statistics>Summary Statistics
  3. Select the data, variables and statistics to be shown, then click OK

Page 7: Introduction to S-Plus

Programming mode
Full potential and flexibility of S-Plus. Highly recommended! While the GUI can run many of the S-Plus commands and functions, programming mode lets you solve virtually any problem you will encounter in data manipulation, analysis and plotting.
• Command window
  • Can be used step by step, interactively
• Writing functions
  • Using a text editor (Notepad, Emacs, EditPlus, etc.) or directly on the command line

Page 8: Introduction to S-Plus

Command line (the basics)
• S-Plus is case sensitive
• # is the comment sign
• ? calls the help
• q() quits S-Plus
• <- is the assignment sign. It associates a value or a function with a variable name
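A minimal sketch of these basics. The object names (radius, Radius, area) are made up for illustration, and the lines were checked against R, which shares this syntax with S-Plus.

```r
# Anything after a '#' is a comment and is ignored.
radius <- 2            # '<-' binds the value 2 to the name 'radius'
Radius <- 10           # a different object: names are case sensitive
area <- pi * radius^2  # uses 'radius' (2), not 'Radius' (10)

print(radius)
print(Radius)
print(area)

# ?mean   would open the help page for mean()
# q()     would quit the session
```

The last two commands are left as comments so that the snippet can be pasted into a session without opening help or quitting.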

Page 9: Introduction to S-Plus

Use of S-Plus in programming mode
• Calculator
  • *, /, +, -, =, log, exp, sqrt, ^, sin, cos
  • Follows the usual arithmetic rules: * and / before + and -, and () before * and /
• Manipulating data
• Fitting models to data
• Plotting graphs
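The calculator rules can be tried directly at the prompt. The following is a small sketch with made-up object names; it was checked against R and is expected to behave the same way in S-Plus.

```r
# Multiplication and division are evaluated before addition and subtraction.
no.brackets <- 2 + 3 * 4      # 2 + 12 = 14
# Parentheses are evaluated first.
brackets <- (2 + 3) * 4       # 5 * 4 = 20
# Exponentiation (^) is evaluated before multiplication.
power <- 2 * 3^2              # 2 * 9 = 18

# Built-in mathematical functions; log() is the natural logarithm.
funs <- sqrt(16) + exp(0) + log(1)   # 4 + 1 + 0 = 5
trig <- sin(0) + cos(0)              # 0 + 1 = 1
```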

Page 10: Introduction to S-Plus

Logical values
• Boolean values: TRUE, FALSE
• Comparison operators: < (less than), > (greater than), <= (less than or equal to), >= (greater than or equal to), == (equal to), != (not equal to)
• Conditional expressions and operators
  • if, else, ifelse
  • & (and), | (or)
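A short sketch of these operators applied to a small vector. The vector and all object names are illustrative; the code was checked against R.

```r
x <- c(1, 5, 7, 8)

# Comparisons work element by element and return logical values.
big <- x > 5                 # FALSE FALSE TRUE TRUE
not.five <- x != 5           # TRUE FALSE TRUE TRUE

# & (and) and | (or) combine logical vectors element by element.
middle <- x > 2 & x < 8      # FALSE TRUE TRUE FALSE
ends <- x == 1 | x >= 8      # TRUE FALSE FALSE TRUE

# ifelse() chooses a value for each element.
size <- ifelse(x > 5, "large", "small")

# if / else test a single condition (the sum of x is 21).
if (sum(x) > 20) {
    verdict <- "total above 20"
} else {
    verdict <- "total at most 20"
}
```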

Page 11: Introduction to S-Plus

Brackets
• () enclose the arguments of functions and group arithmetic calculations
• [] index objects
  • x <- c(1,5,7,8), then x[3] is 7
• {} enclose groups of commands
  • Function bodies
  • if/else statements
  • Loops
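The three kinds of brackets can be seen side by side in the sketch below, which reuses the vector x from the slide. The other object names are made up, and the code was checked against R.

```r
x <- c(1, 5, 7, 8)

# [] picks elements by position.
third <- x[3]          # the third element, 7
first.two <- x[1:2]    # the first two elements, 1 and 5
no.first <- x[-1]      # everything except the first element

# () hold function arguments and group arithmetic.
half.range <- (max(x) - min(x)) / 2    # (8 - 1) / 2 = 3.5

# {} group the commands that form the body of a loop.
total <- 0
for (i in 1:length(x)) {
    total <- total + x[i]
}
# total is now 1 + 5 + 7 + 8 = 21
```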

Page 12: Introduction to S-Plus

S-Plus common objects
• Vector
  • Ordered group of numbers or strings
  • x <- c(45,29,27)
  • z <- c(180,180,165)
  • y <- c("Hall","Francesco","Sara")
• Matrix
  • "Rectangular layout of cells, each one containing a value"
  • AH <- matrix(c(45,29,27,180,180,165), nrow = 3)
  • AH <- matrix(c(x,z), nrow = 3)
• Array
  • Multidimensional matrix
• Data frame
  • AHP <- data.frame(x,z,y)
• List
  • Groups together data that do not share the same structure. Model output and summaries are returned as lists, and you can access or reuse parts of them.
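The sketch below builds each object type from the slide's own vectors and then pulls pieces back out. The array arr and the list info are illustrative additions, not part of the original slides; the code was checked against R.

```r
x <- c(45, 29, 27)
z <- c(180, 180, 165)
y <- c("Hall", "Francesco", "Sara")

# Matrix: the values fill the cells column by column.
AH <- matrix(c(x, z), nrow = 3)
AH.dim <- dim(AH)          # 3 rows, 2 columns
second.x <- AH[2, 1]       # row 2, column 1: 29

# Array: a matrix with more than two dimensions (here 2 x 3 x 4).
arr <- array(1:24, dim = c(2, 3, 4))
last.cell <- arr[2, 3, 4]  # the last of the 24 values

# Data frame: columns may hold different types.
AHP <- data.frame(x, z, y)
third.z <- AHP$z[3]        # column z, third row: 165
under40 <- AHP[AHP$x < 40, ]   # the rows where x is below 40

# List: components of different structure, accessed by name.
info <- list(people = y, table = AH, n = 3)
second.name <- info$people[2]  # "Francesco"
```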

Page 13: Introduction to S-Plus

Functions
• A set of commands performed on specified variables
  • y <- mean(x)
  • ...or, for a vector x of four elements, y <- (x[1]+x[2]+x[3]+x[4])/4
  • ...or y <- sum(x)/4
  • ...or y <- sum(x)/length(x)
• You can build your own functions
  • On the command line:
    SD <- function(x){sqrt(var(x))}
    The function is saved in your working directory and can then be called as SD(x)
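A quick sketch showing that the three ways of computing the mean agree, followed by the slide's SD function. SE is a hypothetical second function added here only to show one user-defined function calling another; the code was checked against R.

```r
x <- c(1, 5, 7, 8)

# Three equivalent ways to compute the mean of a four-element vector.
m1 <- mean(x)
m2 <- (x[1] + x[2] + x[3] + x[4]) / 4
m3 <- sum(x) / length(x)       # works for a vector of any length

# The function from the slide: standard deviation as the square root of the variance.
SD <- function(x) {
    sqrt(var(x))
}

# Hypothetical extra example: standard error of the mean, built on SD().
SE <- function(x) {
    SD(x) / sqrt(length(x))
}

sd.x <- SD(x)
se.x <- SE(x)
```

Dividing by length(x) rather than by 4 is the safer habit, since the function keeps working when the vector changes size.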

Page 14: Introduction to S-Plus

Functions
• Creating a file with an s extension (file.s, a sort of library where you can store one or more functions)
  • Open an editor
  • Write the function:

      # this function creates the dataset "buddy" and
      # plots its variables one against the other
      buddy <- function() {
          x <- c(2,3,5,6,8,10)
          y <- c(4,6,10,12,16,20)
          buddy <- data.frame(x,y)
          plot(buddy$x, buddy$y, xlab="x", ylab="y", type="l")
          print(buddy)
      }

  • Save the file as an s file: c:\buddy.s
  • Load the file with source("c:\\buddy.s")
  • Call the function as buddy()
• In the code above, buddy is the function name, the parentheses after function hold its arguments (none here), and the braces enclose the body of the function, a set of commands
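The save-then-source workflow can be sketched without touching c:\ by writing the function file to a temporary location. The function double.it and the temporary file are hypothetical stand-ins for buddy and c:\buddy.s; the code was checked against R.

```r
# Build a temporary file name ending in .s (a stand-in for c:\buddy.s).
fun.file <- paste(tempfile(), ".s", sep = "")

# Write a small function definition into the file, one line per element.
cat("double.it <- function(v) {",
    "    2 * v",
    "}",
    file = fun.file, sep = "\n")

# source() reads the file and evaluates it, which defines double.it().
source(fun.file)

doubled <- double.it(c(2, 3, 5))   # 4 6 10
```

When a real Windows path is written inside quotes, each backslash has to be doubled, as in source("c:\\buddy.s"), because a single backslash starts an escape sequence.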

Page 15: Introduction to S-Plus

Use of S-Plus in programming mode (manipulation of data)
• Datasets are never ready for analysis
  • Importing datasets: read.table()
  • Subsetting objects
  • Creating new variables
    • seq(), rep(), sort(), unique(), length()
  • Merging and binding datasets: merge(), cbind(), rbind()
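A compact sketch of these manipulation steps. All data values, column names and the temporary file are invented for illustration. The code was checked against R; the defaults of write.table() and read.table() may differ in S-Plus, so treat the import lines as an assumption there.

```r
# Write a small invented table to a file, then import it with read.table().
sizes <- data.frame(id = c(1, 2, 3, 4), len = c(12.5, 30.1, 18.0, 30.1))
data.file <- tempfile()
write.table(sizes, file = data.file, row.names = FALSE)
dat <- read.table(data.file, header = TRUE)

# Subsetting: keep the rows where len exceeds 15.
long <- dat[dat$len > 15, ]

# Creating a new variable from an existing one.
dat$len.mm <- dat$len * 10

# Helper functions.
s <- seq(0, 10, by = 2)             # 0 2 4 6 8 10
r <- rep(c("a", "b"), times = 2)    # "a" "b" "a" "b"
sorted <- sort(dat$len)             # ascending order
distinct <- unique(dat$len)         # duplicates removed
n.distinct <- length(distinct)      # how many distinct values

# merge() joins two data frames on a shared column; only ids 1 to 3 are in both.
sexes <- data.frame(id = c(1, 2, 3), sex = c("F", "M", "F"))
both <- merge(dat, sexes, by = "id")

# rbind() adds rows, cbind() adds columns.
longer <- rbind(sizes, data.frame(id = 5, len = 22.0))
wider <- cbind(sizes, group = c("A", "A", "B", "B"))
```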

Page 16: Introduction to S-Plus

Graphical analysis
• Plotting to the active device: the S-Plus window or a file
  • pdf.graph(file="", horizontal="")
  • postscript(file="", horizontal="")
  • graphsheet(file="", format="")
• Important functions:
  • par(), plot(), hist(), boxplot(), pairs()
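A sketch of plotting to a file rather than to the screen. It is written for R, where pdf() plays the role that pdf.graph() has in S-Plus; that substitution is an assumption made so the example can run outside S-Plus. The data are the x and y vectors from the buddy example, and the output file is a temporary one.

```r
x <- c(2, 3, 5, 6, 8, 10)
y <- c(4, 6, 10, 12, 16, 20)

# Open a PDF file as the active graphics device (R's counterpart of pdf.graph()).
plot.file <- paste(tempfile(), ".pdf", sep = "")
pdf(file = plot.file)

# par() sets graphical parameters; here a 2 x 2 grid of panels per page.
par(mfrow = c(2, 2))

plot(x, y, xlab = "x", ylab = "y")      # scatter plot
hist(y, main = "Histogram of y")        # histogram
boxplot(x, y, names = c("x", "y"))      # side-by-side box plots

# pairs() lays out its own grid of panels, so it starts a new page.
pairs(data.frame(x, y))

# Close the device so that the file is written to disk.
invisible(dev.off())
```

Nothing appears on screen while the file device is active; the graphs are only in the file once the device has been closed.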

Page 17: Introduction to S-Plus

Fitting a model to data
• Take the SharkLife data
• Summary of the data: summary()
• EDA (Exploratory Data Analysis): pairs(), hist(), boxplot(), plot()
• Fitting a linear regression model between Lmax and birth.size: model1 <- lm()
• Checking the model (using statistics and plots): summary(model1), plot(model1)
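The workflow above might look like the sketch below. The SharkLife data are not reproduced in these slides, so the data frame sharks.demo and every value in it are invented for illustration only. Treating Lmax as the response and birth.size as the predictor is also an assumption, since the slide leaves the lm() call empty. The code was checked against R.

```r
# Invented stand-in data; NOT the SharkLife dataset.
sharks.demo <- data.frame(
    birth.size = c(20, 35, 50, 60, 80, 110),
    Lmax       = c(70, 120, 160, 210, 260, 380)
)

# Summary of the data.
print(summary(sharks.demo))

# Fit a linear regression (assumed direction: Lmax explained by birth.size).
model1 <- lm(Lmax ~ birth.size, data = sharks.demo)

# Check the model with statistics.
print(summary(model1))
estimates <- coef(model1)       # intercept and slope
resid1 <- residuals(model1)     # observed minus fitted values

# plot(model1) would draw the diagnostic plots; it is left as a comment
# so that the sketch runs without opening a graphics device.
```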

Page 18: Introduction to S-Plus

Programming mode
• Script window
  • A mode where you can write programs, run them and keep track of your operations for future work
  • File>New>Script File

Page 19: Introduction to S-Plus

Useful reference books
• The Basics of S-Plus, by Krause A. and Olson M.
• Statistical Computing: An Introduction to Data Analysis using S-Plus, by Crawley M.J.
• Modern Applied Statistics with S-Plus, by Venables W.N. and Ripley B.D.
• ...and much more on the internet