Overview of Statistical Software SPSS, Stata, SAS, R Debby Kermer Data Services George Mason University


Post on 15-Mar-2021


Page 1: Overview of Statistical Software: SPSS, Stata, SAS, R

Debby Kermer

Data Services

George Mason University

Page 2: Software

SPSS   v 25     spss.com
Stata  v 15     stata.com
SAS    v 9.4    sas.com
R      v 3.5.1  r-project.org

Page 3: Pros and Cons

            SPSS       Stata     SAS        R
Use         High       Low       High       Growing
Jobs        Some       Academic  Many       More
Cost        Expensive  Depends   Expensive  Free
Learning    Easy       Middle    Hard       Very Hard
Extensible  Scripts    Users     Built-in   Users

Page 4: What can it do well?

SPSS
ANOVA, Factor Analysis, Discriminant Analysis
License modules separately: Trends, Missing Data, Tables

Stata
Regression, diagnostics, and robust regression; Analysis of Survey Data, Time Series, SEM
Freely downloadable packages

SAS
Data Management; Complex models; Mixed Model Analysis
License components separately: SAS/GIS, SAS/STAT, SAS/ACCESS

R
Anything, if you can find a [well written] package
Download additional packages from CRAN for free

Page 5: Who Uses it?

SPSS
Academic: Social Scientists (the “SS”), and non-scientists
Non-Academic: Companies that just want to do neat things

Stata
Academic: Economics, Public Policy, Biomedical Researchers
Non-Academic: Groups that often work with academics

SAS
Academic: Statistics, Medicine
Non-Academic: Government, and corporations that are serious about data

R
Academic: Statistics, various
Non-Academic: Small companies with big plans, and others serious about data

Page 6: Which to Pick?

SPSS
Easy to start, limited capability
Best for those with infrequent and/or minimal needs

Stata
Easy syntax, highly extensible
Best for academics doing cutting-edge research

SAS
Hard to learn, highly capable
Best for managing huge and/or complex datasets

R
Hard to learn, highly extensible
Best for those who program and know what they are doing

Page 7: Job Prospects

Page 8: R vs SAS vs Python

[Chart] Survey of selected “quantitative professionals”, 2016

http://www.burtchworks.com/2016/07/13/sas-r-python-survey-2016-tool-analytics-pros-prefer/

Page 9: Use in Academia

[Chart] Number of scholarly articles on Google Scholar, 2015

http://r4stats.com/articles/popularity/

Page 10: Use in Industry

[Chart] Number of analytics jobs on Indeed.com, February 2014

http://r4stats.com/articles/popularity/

Page 11: Companies using it

http://blog.datacamp.com/statistical-language-wars-the-infograph/

Page 12: Use

Page 13: Interface

[Screenshots: SPSS, Stata, SAS, R]

Page 14: GUI

[Screenshots: SPSS, Stata, SAS Studio, Deducer & R Cmdr]

Page 15: Syntax

Contingency table for variables q1 and q2, with only n, row %, and χ2 test

SPSS

    CROSSTABS
      /TABLES=q1 BY q2
      /STATISTICS=CHISQ
      /CELLS=COUNT ROW.

Stata

    tabulate q1 q2, row chi2

SAS

    PROC FREQ data=test;
      table q1*q2 / NOCOL NOPERCENT CHISQ;
    RUN;

R

    mytable <- table(q1, q2)
    mytable
    prop.table(mytable, 1)
    chisq.test(mytable)
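To make explicit what the four commands above compute, here is a minimal sketch of the same three outputs (counts, row percentages, and the Pearson chi-square statistic). It is written in plain Python only as a neutral illustration, not as a fifth option from the talk; the data and the helper names `crosstab`, `row_percents`, and `pearson_chi2` are made up for this example.

```python
from collections import Counter

# Hypothetical responses to two survey questions (made-up data).
q1 = ["a", "a", "a", "b", "b", "b", "b", "a"]
q2 = ["x", "y", "x", "y", "y", "x", "y", "x"]

def crosstab(rows, cols):
    """Count each (row level, column level) pair and lay the counts out as a table."""
    counts = Counter(zip(rows, cols))
    row_levels = sorted(set(rows))
    col_levels = sorted(set(cols))
    table = [[counts[(r, c)] for c in col_levels] for r in row_levels]
    return row_levels, col_levels, table

def row_percents(table):
    """Convert each row of counts to percentages of that row's total."""
    return [[100.0 * n / sum(row) for n in row] for row in table]

def pearson_chi2(table):
    """Pearson chi-square statistic (no continuity correction) and degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

row_levels, col_levels, counts = crosstab(q1, q2)
percents = row_percents(counts)
chi2, df = pearson_chi2(counts)
```

The sketch stops at the statistic and its degrees of freedom; the packages also report a p-value. Note that R's `chisq.test` applies a continuity correction to 2x2 tables by default, so its statistic can differ from the uncorrected value computed here.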

Page 16: Learning Curve

http://guides.nyu.edu/quant/statsoft#s-lib-ctab-6295863-7

Page 17: Important Differences

Page 18: Working with multiple files

SPSS   Multiple datasets allowed, active data can be specified
Stata  One dataset at a time, allows multiple instances
SAS    Data always specified, no datasets in memory
R      Data always specified, multiple objects in memory

Page 19: Directories & Data Files

       Directory                         Data file
SPSS   cd "directory"                    filename.sav
Stata  cd "directory"                    filename
SAS    libname name "directory"          name.filename
R      setwd("directory") (use / or \\)  filename.RData

Page 20: Labeled/Categorical Variables

SPSS   separate  VALUE LABELS assigns labels to levels
Stata  shared    label define creates a 'label'
SAS    shared    PROC FORMAT step creates label 'formats'
R      separate  defining a 'factor' creates labels for levels
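The "shared" versus "separate" distinction above can be sketched in a few lines. This is an interpretation, not something the slide spells out: it assumes "shared" means one named label set that several variables point to (as with Stata's `label define` or a SAS format), and "separate" means each variable carries its own mapping (as with SPSS value labels or an R factor). The data and all names below are hypothetical, and Python is used only as a neutral illustration.

```python
# Hypothetical coded survey data: codes 1/2 for two yes/no questions.
data = {"q1": [1, 2, 2], "q2": [2, 2, 1]}

# "Shared" style: define one named label set, then attach it to many variables.
label_sets = {"yesno": {1: "Yes", 2: "No"}}
attached = {"q1": "yesno", "q2": "yesno"}

def labeled_shared(variable):
    """Apply the label set attached to the variable; unlabeled codes pass through."""
    labels = label_sets[attached[variable]]
    return [labels.get(code, code) for code in data[variable]]

# "Separate" style: each variable carries its own code-to-label mapping.
own_labels = {
    "q1": {1: "Yes", 2: "No"},
    "q2": {1: "Yes", 2: "No"},
}

def labeled_separate(variable):
    """Apply the variable's own mapping; unlabeled codes pass through."""
    labels = own_labels[variable]
    return [labels.get(code, code) for code in data[variable]]
```

Under this reading, editing the shared `yesno` set changes the labels of every variable attached to it, while editing one variable's own mapping leaves the others untouched.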

Page 21: Missing Values

       Symbol  Treated as                > #    < #
SPSS   .       no value or user defined  FALSE  FALSE
Stata  .       highest possible value    TRUE   FALSE
SAS    .       lowest possible value     FALSE  TRUE
R      NA      no value, comparable      TRUE   TRUE
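The practical consequence of the Stata and SAS rows is that a missing value can slip into a group selected by comparison. A minimal sketch, assuming we model Stata's missing value as positive infinity and SAS's as negative infinity (a stand-in for "highest" and "lowest possible value", not how either package stores them); the variable names and data are made up, and Python is used only as a neutral illustration:

```python
import math

# Assumption: stand-ins for how each package orders its numeric missing value.
STATA_MISSING = math.inf    # Stata: missing sorts above every number
SAS_MISSING = -math.inf     # SAS: missing sorts below every number

def select_greater(values, threshold):
    """Keep values strictly above the threshold."""
    return [v for v in values if v > threshold]

def select_less(values, threshold):
    """Keep values strictly below the threshold."""
    return [v for v in values if v < threshold]

def drop_missing(values):
    """Remove the stand-in missing values before comparing."""
    return [v for v in values if math.isfinite(v)]

ages_stata = [25, 40, STATA_MISSING]
ages_sas = [25, 40, SAS_MISSING]

# The missing value lands in the "over 30" group under Stata-style ordering...
stata_over_30 = select_greater(ages_stata, 30)
# ...and in the "under 30" group under SAS-style ordering.
sas_under_30 = select_less(ages_sas, 30)

# Excluding missing values explicitly avoids both surprises.
stata_over_30_clean = select_greater(drop_missing(ages_stata), 30)
```

SPSS and R are left out of the sketch because their missing values do not behave like an ordinary number in a comparison; in R, for instance, comparing `NA` with a number returns `NA`.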

Page 22: Code Characteristics

       Code File    Code Prompt  Command End    Case Sensitive  Code Comment
SPSS   Syntax File  [nothing]    .              No              *
Stata  Do file      .            [line break]   Yes             *
SAS    Program      [line #]     ;              No              *
R      R Script     > or +       [interpreted]  Yes             #

Page 23: Data Files

Page 24: Files

       Data           Syntax     Output        Others
SPSS   .sav           .sps       .spo / .spv   .por
Stata  .dta           .do        .smcl / .log  .dct
SAS    .sas7bdat      .sas       .lst / .log   .sas7???
R      .RData / .rda  .R / .txt  .txt          .R??

Page 25: Opening other File Types in…

SPSS   Can open Stata and SAS directly
Stata  Use usespss, R, or Stat/Transfer (commercial)
SAS    Can import SPSS and Stata directly
R      Use packages foreign or haven to convert

Page 26: Resources

Help transitioning, links to help for each software
http://dataservices.gmu.edu/resources/software

Single Statistical Software Initiative
https://wikis.uit.tufts.edu/confluence/display/SSSI/Home