stata class

Upload: ainhoa-aparicio-fenoll

Post on 06-Apr-2018


  • 8/2/2019 Stata Class

    1/22

    STATA LESSON

    Applied Economics

    March 2012

    Instructor: Ainhoa Aparicio-Fenoll

    Outline:

    1. Introduction

    2. Dataset management

    3. Generating and recoding arrays

    4. Control flows

    5. Data overview

    6. Stata for econometrics

    Materials:

    http://intranet.barcelonagse.eu

    http://www.stata.com/links/resources1.html

    1. INTRODUCTION

    STATA is a general command-driven package for statistical analysis, data management and
    graphics. It can be considered a statistics package, like SAS, SPSS, RATS, or EViews. STATA
    operates in a graphical (windowed) environment.

    Basic features

    When to use STATA? STATA has three major strengths, namely, data manipulation,
    statistics and graphics.

    Which STATA to use? There are different versions, which differ in the maximum number of
    variables and matrix size.

    In general, Small STATA

    Interface description

    STATA opens by default with four windows:

    Results - where all of your commands and their results are displayed (with the exception
    of graphs, which are displayed in their own window). Anything displayed in blue can be
    clicked on to get help or other information. When the results are too long, STATA pauses
    and shows the word more; click on it to continue viewing the results. If you want all the
    results to be displayed without pausing, even if you miss the upper part, just type:
    set more off, permanently

    Review - only your commands are displayed here. You can click on any command in the
    window and it will be pasted to the Command window. The Review window has one extra
    option in its window icon menu: "Save Review Contents." This allows you to save
    everything in the Review window to a file for later use, although this is not a substitute
    for using log and do files.

    Command - this is where you type your commands when working in interactive mode.
    Any command typed here is executed by pressing the Enter key. Everything
    you type here is echoed in the Results window as well as the Review window. The
    "Page Up" and "Page Down" keys can be used to scroll back and forth through
    commands you have executed previously. You can also copy and paste between this
    window and your "do" file.

    Variables - a list of all your variables and their labels is displayed here. You can click on a
    variable here and it will be pasted to the Command window.

    [Figure: screenshot of the STATA interface showing the Results, Command, Review and
    Variables windows]

    There are three other windows in STATA:

    Do-file editor - this is STATA's simple text editor for writing do-files, or programs. You
    should do all of your work in a do-file so you can reproduce what you did later on. It is
    accessible through Window>Do-file editor>New Do-file or by clicking on the envelope
    icon. You can execute all commands or just some of them, and you may choose whether
    the results are displayed in the Results window.

    Viewer - the viewer is used for displaying help and log files. Like the Results window,
    anything displayed in blue can be clicked on for more information. This window is reached
    by clicking on Help>Content or by typing help in the Command window.

    Graph - as you may have guessed, this is where all of your graphs will be displayed.

    The tool bar:

    (1) (2) (3) (4) (5) (6) (7) (8) (9) (10)

    (1): Opens a dataset

    (2): Saves a dataset

    (3): Prints graphs or contents of viewers

    (4): Begins, closes or suspends a log-file

    (5): Brings viewer in front

    (6): Brings the result window in front

    (7): Opens a do-file

    (8): Opens the data editor

    (9): Opens the data browser

    (10): Breaks/stops the operation

    Menu bar:

    File: To open, save, view, launch do-files, save graphs, print graphs or results

    You can add more results to an already closed log by typing:

    log using /logfilename.log, append

    Or you can simply replace an existing log with the command:

    log using /logfilename.log, replace

    To take a look at a log-file go to File>Log>View and browse it.
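A minimal sketch of a logging session may help here. The log name mysession.log is hypothetical and is written to the current working directory; the auto dataset shipped with STATA stands in for your own data.

```stata
* Start a new plain-text log; replace overwrites an existing file of that name
log using mysession.log, replace text
sysuse auto, clear
summarize price
log close

* Reopen the same log later and add further results at the end
log using mysession.log, append text
describe
log close
```

After the second log close, mysession.log should contain the output of both summarize and describe.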

    Do-file

    The do-file is an ASCII file that contains a sequence of commands to be executed in order. It is
    very useful because it makes it easy for everybody to replicate your results.

    When starting a do-file, you should specify:

    That the old data should be removed from memory, otherwise you will not be able
    to load your dataset. Use the command clear

    The directory where you will be working (that directory should contain your data,
    dictionary, etc.). The corresponding command is cd C:/EXAMPLE

    The amount of memory to be allocated to STATA. You must make sufficient memory
    available to STATA to load the entire file, since STATA's speed is largely derived from
    holding the entire dataset in memory. Use the command set memory (from Stata 12
    onward memory is managed automatically and this setting is no longer needed).

    Additionally, you may want to set the maximum number of variables (set maxvar), the
    maximum matrix size (set matsize) and set the more option off (set more off).

    Now you can load your new dataset and work on it.

    If you want to see the results of the operations in the do-file displayed in the Results window,
    type do dofilename or click the corresponding icon. If you do not want the results
    displayed in the Results window, type run dofilename or click the corresponding icon.

    If you want to execute only some of the commands in the do-file, select them and click on
    the icon for do or run.
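As an illustration, the opening lines of a do-file might look like the sketch below. The file names example.do and example.log are hypothetical, and the auto dataset shipped with STATA is used so that the sketch runs without external files.

```stata
* example.do (hypothetical file name)
clear
set more off
* cd C:/EXAMPLE
* (uncomment the line above and adapt it to your own working directory)

* Close any log left open by an earlier run, then start a fresh one
capture log close
log using example.log, replace text

sysuse auto, clear
summarize price mpg

log close
```

Typing do example executes the file and shows the output; typing run example executes it without showing the output.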

    (1)(2) (3)(4) (5)(6) (7)(8)

    (1): New do-file

    (2): Open a folder where you can find an old do-file

    (3): Save the do-file

    (4): Print the do-file

    (5): Find the do-file

    (6): Un-do

    (7): Run the commands of the do-file

    (8): Do the commands of the do-file

    (9): Search

    2. DATASET MANAGEMENT

    Getting data into STATA

    You can input data by hand:

    set obs

    edit

    Type the data in the bar above and press enter.

    Or alternatively,

    input v1 v2 ... vn

    and type the numbers in the Command window, one observation per line, finishing with

    end

    Note: For missing values, type .
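A short sketch of entering data by hand with input. The variable names id, wage and zone and all values are made up for illustration.

```stata
clear
* str10 declares zone as a string variable of up to 10 characters
input id wage str10 zone
1 120 north
2 .   south
3 85  north
end
list
```

The second observation has a missing wage, entered as a single dot.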

    There are several commands to input already created datasets. You will need to use one

    or another depending on the format in which the dataset is provided.

    - insheet: This is used to input ASCII files that are delimited by tabs or commas. For
    this command to work, the data must contain only one observation per line.
    For instance, you may type: insheet using mydata.txt. You may also specify
    the delimiter by typing: insheet using mydata.txt, delimiter(";")

    - infix: You may need this command if the variables in your data are stored in fixed format,
    that is, each variable occupies fixed column positions. You should declare their positions by
    typing infix v1 1-4 v2 6-7 v3 9-11 using mydata.txt

    - infile: It is useful when inputting complex ASCII files where several observations appear on
    the same line or where the variables are separated by tabs, commas or spaces.
    For instance, you may type infile var1 var2 ... varn using filename.txt, automatic
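To try insheet without an external file, one possibility is to write a small comma-delimited file first and read it back. The file name auto_small.csv is hypothetical; outsheet is the export counterpart of insheet. In recent versions of STATA, import delimited and export delimited do the same job.

```stata
* Write three variables of the shipped auto dataset to a comma-delimited file
sysuse auto, clear
keep make price mpg
outsheet using auto_small.csv, comma replace

* Read the file back in; clear drops the data currently in memory
insheet using auto_small.csv, comma clear
describe
```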

    Joining, expanding and collapsing datasets

    Joining datasets

    1. Merge joins corresponding observations from the dataset currently in memory (called the

    master dataset) with those from the STATA-format dataset stored as filename (called the using

    dataset) into single observations.

    Syntax:

    merge [varlist] using filename [, keep(varlist) unique uniqmaster uniqusing nolabel update
    replace nokeep _merge(varname)]

    After each merge a new variable is created (the default name is _merge) whose values are:

    _merge==1 obs. from master data

    _merge==2 obs. from only one using dataset

    _merge==3 obs. from at least two datasets, master or using

    2. Append appends a Stata-format dataset stored on disk to the end of the dataset in memory.

    Syntax:

    append using filename [, nolabel keep(varlist) ]
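The sketch below builds two tiny datasets and joins them, to show how _merge is filled in. All file names, variable names and values are made up. Note one assumption: it uses the merge 1:1 form introduced in Stata 11 rather than the older syntax shown above, which newer versions only accept under version control.

```stata
* Master dataset: wages for ids 1 to 3
clear
input id wage
1 100
2 150
3 90
end
save master_tmp, replace

* Using dataset: ages for ids 2 to 4
clear
input id age
2 31
3 45
4 28
end
save using_tmp, replace

* One-to-one merge on id; master_tmp is the dataset in memory
use master_tmp, clear
merge 1:1 id using using_tmp
tabulate _merge
list

* append stacks the observations of using_tmp below those in memory
use master_tmp, clear
append using using_tmp
list
```

In the merged data, id 1 appears only in the master (_merge==1), id 4 only in the using dataset (_merge==2), and ids 2 and 3 in both (_merge==3).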

    Expanding and collapsing datasets

    1. collapse [(stat)] varlist [ [(stat)] ... ] [if] [in] [weight] [, options]
    It transforms the dataset in memory into a dataset of sums, means, medians, etc.
    Example: "collapse (sum) wage, by(zone education age)"

    2. expand [=]exp [if] [in]
    It multiplies the number of lines a certain number of times or according to the value of
    one variable in that line. Examples: "expand 2" (each line appears twice) / "expand pop"
    (each line appears as many times as the value of pop in that line).
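A small worked example of expand followed by collapse, on made-up data with the hypothetical variables zone, wage and pop:

```stata
clear
input zone wage pop
1 100 2
1 200 1
2 50  3
end

* Each line is repeated pop times: 2 + 1 + 3 = 6 lines in total
expand pop
count

* One line per zone, holding the sum of wage over the expanded lines
collapse (sum) wage, by(zone)
list
```

After collapse, zone 1 holds 100 + 100 + 200 = 400 and zone 2 holds 50 + 50 + 50 = 150.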

    3. GENERATING AND RECODING ARRAYS

    Variables

    generate varname=x to create a variable

    replace varname=x to replace the value of a variable

    label variable varname "description" to give a variable a certain description

    Note: these commands follow the common syntax

    [by varlist:] command [varlist] [if exp] [in range] [, options]

    and thus can be applied to a restricted sample or as an extended version.

    Where:

    by varlist:
    It is used to repeat the command for each group of the data (the data must first be sorted
    according to the varlist with the command sort varlist)

    command:
    You can find a list of commands below

    in:
    It is used to restrict the sample to a certain range of observations (e.g. in 4 restricts to
    case 4; in 1/4 restricts to cases 1 to 4)

    options:
    Various options are possible depending on the command (see help),
    e.g. , replace replaces a variable/dataset etc. in case it already exists

    if:
    You can restrict the sample by using the following expressions:

    Arithmetic           Logical     Relational (numeric and string)
    + addition           ~ not       > greater than
    - subtraction        | or        < less than
    * multiplication     & and       >= greater or equal
    / division

    0, 0 if x=0 and -1 if x
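The pieces of the common syntax can be tried on the auto dataset shipped with STATA. This is only a sketch; the variables price, mpg, foreign, rep78 and make belong to that dataset.

```stata
sysuse auto, clear

* if: restrict to foreign cars with mileage above 25
summarize price if foreign == 1 & mpg > 25

* in: restrict to observations 1 to 4
list make price in 1/4

* by: repeat the command for each group; the data must be sorted first
sort foreign
by foreign: summarize price

* bysort sorts and groups in one step
bysort rep78: summarize mpg if !missing(rep78)
```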

    Furthermore you might want to use some system variables. Here is a selection:

    Command   System variable
    _n        Index of current observation
    _N        Total number of observations
    _all      All variables
    _b        Vector of regression coefficients
              Vector of standard errors of regression coefficients

    Note: Depending on the mathematical/statistical function you want to use, you might have to use
    an extension of generate called egen.

    egen varname=function(varx) is used to create a variable which is the mean, median, etc. of
    another variable

    Below is a selection of functions for which you have to use egen:

    Command        Mathematical & Statistical Function
    mean(varx)     Mean of variable x
    median(varx)   Median of variable x
    sum(varx)      Sum of variable x (Note: gen varz=sum(varx) gives varz = cumulative sum)
    rank(varx)     Rank of variable x; by default the lowest value gets rank 1 (add the option
                   field to give the highest value rank 1)
    count(varx)    The number of nonmissing observations of x

    Examples of discrete variables

    "generate byte rich=(wage>100)": It generates a dummy variable. Note that STATA treats
    missing values as larger than any number, so rich equals one when wage is missing.

    "generate byte rich=(wage>100) if !missing(wage)": It generates missing values for rich if wage is
    missing.

    "generate young=inrange(age,18,25)". young takes value one if age is between 18 and 25.

    "generate cs=2 if inlist(csp,1,2,3,4)". The variable cs takes value 2 if csp takes any of the listed
    values.

    "tabulate csp, gen(csp)". This generates as many dummy variables as values are taken by csp.

    "generate size=recode(nb,0,5,10,15)". This command generates a new variable size that takes
    the values 0, 5, 10 and 15 whenever nb is zero, between 1 and 5, between 6 and 10 and 11 or
    more, respectively.

    "separate treff, by(csp)". This generates as many variables as values are taken by csp. Each
    variable is equal to treff for a certain value of csp and missing otherwise.

    "egen szone=group(size zone)" is used for generating identifiers. It gives a code to each
    combination of the values of size and zone.

    "egen sizegr=cut(size), group(5)". It divides the variable size into 5 different categories with
    approximately equal frequencies.

    "egen sizegr=cut(size), at(5,10,15)". It divides size into two categories, one from 5 to 9, and
    another from 10 up to (but not including) 15. Values outside these ranges are set to missing.

    "egen delta=diff(v1 v2 v3)". delta is equal to one if the three variables do not all take the same
    value, and zero if they do.

    Examples of continuous variables

    "gen suma=sum(wage)". This gives the cumulative sum of the variable wage. This is different
    from "egen suma=sum(wage)" (written "egen suma=total(wage)" in recent versions), which
    generates the total sum.

    "generate id=_n". It gives the values 1, 2, 3,... to each observation following the order in which
    they appear in the data.

    "(bysort year:) egen avg=mean(wage)" / "egen mini=min(wage)" / "egen maxi=max(wage)" /
    "egen sd=sd(wage)" / "egen moda=mode(wage)". Each generates a variable equal to the given
    statistic for all observations (or for all observations in the same group when bysort is used).

    "egen avgr=rowmean(wage1 wage2 wage3)" calculates the mean of the specified variables for

    each observation.

    "egen sumr=rowtotal(wage1 wage2 wage3)" calculates the sum of the specified variables for

    each observation.
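The generate and egen examples for discrete and continuous variables above can be tried on the auto dataset shipped with STATA. This is a sketch only: the new variable names are made up, and total() is used as the current name of the egen function the text calls sum().

```stata
sysuse auto, clear

* Discrete variables
generate byte expensive = (price > 6000) if !missing(price)
generate byte midmpg = inrange(mpg, 20, 30)
* One dummy per value of rep78: rep1, rep2, ...
tabulate rep78, generate(rep)
* Four groups of roughly equal size
egen pricegr = cut(price), group(4)
* One code per combination of foreign and rep78
egen cell = group(foreign rep78)

* Continuous variables
generate id = _n
* Running (cumulative) sum versus the overall total
generate cumprice = sum(price)
egen totprice = total(price)
* Group mean, repeated for every observation in the group
bysort foreign: egen avgprice = mean(price)
* Mean across variables within each observation
egen rowavg = rowmean(headroom trunk)

list id price cumprice totprice avgprice in 1/5
```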

    Examples of string variables

    "generate str4 zone4=substr(zone,1,4)". It takes the first 4 characters of the variable zone.
    Note: zone must be a string.

    "egen abc=concat(a b c)". It generates one variable that results from concatenating the three

    variables a, b and c.

    Variable elimination

    Once we have created the variables we might want to get rid of some of them. We can
    either do this by eliminating the variables that we are no longer interested in (var1 var2 ... varn)

    drop var1 var2 ... varn

    or by keeping the ones that we still want to use (var1 var2 ... varn)

    keep var1 var2 ... varn

    Furthermore we can change the name of a variable (from oldname to newname)

    rename oldname newname
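A brief sketch of drop, keep and rename on the auto dataset shipped with STATA; the new name origin is made up for illustration.

```stata
sysuse auto, clear

* Remove two variables we no longer need
drop trunk turn

* Alternatively, state which variables to retain; all others are removed
keep make price mpg foreign

* Change the name of foreign to origin
rename foreign origin
describe
```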

    Examples of data elimination

    "rename sex gender": It renames the variable sex to gender.

    "replace zone=4 if zone==2 | zone==3". It will write 4 whenever the variable zone was 2 or 3.

    "recode zone (2 3=4) (6 7=5), gen(zone2)". This way you generate a new variable with values 4
    and 5 according to whether the values of zone were 2 / 3 or 6 / 7.

    Macros

    Macros store information as strings. You can collect several variables under a common
    macro name.

    global macroglobalname var1 var2 ... varn

    local macrolocalname var1 var2 ... varn

    Both kinds of macro work the same way: you assign a string to a macro name and can call it
    later, as $macroname for a global macro and `macroname' for a local macro. The main
    difference is that a global macro may be used by any program whereas a local macro is only
    for private use of the program in which you define it.

    When you want to use the variables contained in a macro, you should use the following syntax:

    sum $macroglobalname

    sum `macrolocalname' (be careful with the quotes: a backtick ` opens the name and an
    apostrophe ' closes it)
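For instance, with the auto dataset shipped with STATA (the macro names controls and outcomes are made up):

```stata
sysuse auto, clear

* A global macro is visible everywhere; a local macro only in this do-file
global controls mpg weight
local outcomes price length

summarize $controls
summarize `outcomes'
```

Run these lines together from a do-file: a local macro defined in one run is no longer available in the next.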

    Scalars

    Scalars are variables that contain one single element. To handle them, you must use a
    specific syntax:

    - To define the content of the scalar:

    scalar a=normal(0.7)

    scalar b=199876

    scalar c=V[1,3]

    - To list the contents of scalars:

    scalar list

    scalar list a b
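A runnable version of these lines needs a matrix V to exist, so the sketch below defines a small one with made-up entries.

```stata
scalar a = normal(0.7)
scalar b = 199876

* A 2 x 3 matrix; rows are separated by a backslash
matrix V = (1, 2, 3 \ 4, 5, 6)
* Element in row 1, column 3, which is 3 here
scalar c = V[1,3]

scalar list a b c
display a + c
```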

    }

    else {

    command

    }

    The if command evaluates expression1 (e.g. var1>0). If the result is true, then it executes the

    command inside the first set of brackets. If not, it evaluates expression2 (e.g. vr1
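Since the beginning of this passage is missing, the following is only a sketch of an if / else if / else block in a do-file. The scalar x and the messages are made up for illustration.

```stata
clear
scalar x = 5

if x > 0 {
    display "x is positive"
}
else if x < 0 {
    display "x is negative"
}
else {
    display "x is zero"
}
```

With x equal to 5 the first condition is true, so only the first display command is executed.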

    5. DATA OVERVIEW

    Descriptive statistics

    describe

    It provides basic information about the file and the variables (number of observations, number and

    names of the variables, format of the variables, labels of variables if attached)

    browse

    It opens the data-browser where you can see the dataset

    list varlist

    It displays all the data for the listed variables on the screen

    summarize varlist

    It provides summary statistics for the variables listed (number of observations, mean,
    standard deviation, min, max).

    Note: a possible option is , detail which provides more details such as percentiles, kurtosis etc.

    tabulate varlist

    It shows all different values of the variables with their frequencies

    table varlist

    It displays higher dimensional tables (see help for options)

    correlate varlist

    It gives the correlation coefficient between the variables named
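These commands can be tried in sequence on the auto dataset shipped with STATA; a sketch:

```stata
sysuse auto, clear

describe
list make price mpg in 1/5

summarize price mpg
summarize price, detail

* One-way and two-way frequency tables
tabulate foreign
tabulate foreign rep78
table foreign rep78

correlate price mpg weight
```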

    Graphs

    An easy way to create graphs is to use the graph option in the menu. There you can find the

    whole variety of graphs offered by STATA.

    There are two different graph commands in STATA:

    graph plottype [var1] [var2]

    graph twoway plottype [var1] [var2]

    The first command draws (in the announced plot-type) a one dimensional graph of the means of

    the variable listed. The second graph command shows the two-dimensional relationship between

    the variables listed. [var1] appears on the y-axis, and [var2] on the x-axis of the graph. The default

    extension for Stata graph files is .gph which Stata will automatically add to the file name. You can

    select the font, colors, line thickness and other options by selecting Graph Preferences in the

    Prefs menu.

    The following table gives an overview of the types of graphs you can draw in STATA.

    Graphics                               Description
    -------------------------------------  ------------------------------------------------
    twoway [plottype] vary varx            2-dimensional family of plots, all of which fit
      plottype: scatter, line, bar etc.    on numeric y and x scales
    graph bar (mean) vary, over(varx)      vertical bar charts (y axis is numerical, and
                                           the x axis is categorical)
    graph dot (mean) vary, over(varx)      horizontal dot charts (categorical axis is
                                           presented vertically, the numerical axis is
                                           presented horizontally)
    graph pie vary, over(varx)             pie charts
    graph matrix                           matrix plot of two-way scattergrams
    graph save graphname                   saves a graph under graphname with the default
                                           extension .gph
    graph use graphname.gph                redisplays the graph graphname
    graph combine graphname1 graphname2    combines several graphs into one
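
    A minimal sketch of how drawing, saving and combining graphs fit together is given next. It assumes the example dataset auto shipped with STATA; the graph names g_scatter and g_bar and the file names are made up for this illustration.

```stata
* Load the example dataset shipped with Stata (an assumption for this sketch)
sysuse auto, clear

* Two-dimensional plot: price on the y-axis, mpg on the x-axis
twoway scatter price mpg, name(g_scatter, replace)

* Save the graph to disk (hypothetical file name; .gph is the default extension)
graph save g_scatter "scatter_price_mpg.gph", replace

* Bar chart of the mean of price for each category of foreign
graph bar (mean) price, over(foreign) name(g_bar, replace)
graph save g_bar "bar_price.gph", replace

* Redisplay a graph that was saved to disk
graph use "scatter_price_mpg.gph"

* Put the two graphs held in memory into one figure
graph combine g_scatter g_bar
```

    Giving each graph a name() keeps it in memory, so that a later graph command does not overwrite it before graph combine is called.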

    Example of graphs

    graph hbar (asis) pop, over(voter) title("Population by city") ytitle("Number of inhabitants") ///
        note("The source of the data is unknown")

    [Figure: horizontal bar chart titled "Population by city". Numeric axis: "Number of inhabitants" (0 to 200,000); categorical axis: categories 1 to 15. Note: "The source of the data is unknown".]

    twoway (scatter price mpg, mlabel(mpg) xline(20) xscale(range(10(10)45))) (lfit price mpg), ///
        legend(off) title("Price over mileage") ytitle("Price") xtitle("Mileage") ///
        graphregion(fcolor(green) lcolor(green))

    -------------+------------------------------           F(  3,   236) =    1.08
           Model |  .728611679     3   .24287056           Prob > F      =  0.3571
        Residual |  52.9338883   236  .224296137           R-squared     =  0.0136
    -------------+------------------------------           Adj R-squared =  0.0010
           Total |     53.6625   239  .224529289           Root MSE      =   .4736

    ------------------------------------------------------------------------------
            male |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |  -.0153606   .0162436    -0.95   0.345    -.0473615    .0166404
           grade |   -.030674   .0201378    -1.52   0.129    -.0703469    .0089989
           toefl |  -.0003715   .0013527    -0.27   0.784    -.0030364    .0022934
           _cons |   1.372898   .5609794     2.45   0.015      .267731    2.478065

    Anova block: This block (top left) shows the sum of squares for the explained part of
    the model, for the residuals and in total. df shows the degrees of freedom and
    MS the mean square, i.e. the sum of squares divided by its degrees of freedom.

    Modelfit block: This block (top right) shows the number of investigated units, the F-statistic
    and its p-value, the R squared and adjusted R squared values and the root MSE.

    Coefficient block: This block provides the estimation results for all explanatory variables
    and the constant, incl. the coefficients, the standard errors, the value of the
    t-statistic and the p-value as well as the confidence interval.
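
    The numbers in the three blocks are also stored by STATA after the regression and can be retrieved with e(), _b[] and _se[]. The following lines are a minimal sketch of this, assuming the example dataset auto shipped with STATA instead of the data used in the output above.

```stata
* Load the example dataset shipped with Stata (an assumption for this sketch)
sysuse auto, clear

regress price mpg weight

* Anova block: sums of squares and degrees of freedom
display "Model SS    = " e(mss) "  df = " e(df_m)
display "Residual SS = " e(rss) "  df = " e(df_r)

* MS is the sum of squares divided by its degrees of freedom
display "Residual MS = " e(rss) / e(df_r)

* Modelfit block
display "Number of obs = " e(N)
display "F statistic   = " e(F)
display "R-squared     = " e(r2)
display "Adj R-squared = " e(r2_a)
display "Root MSE      = " e(rmse)

* Coefficient block: coefficient and standard error of one regressor
display "Coefficient on mpg = " _b[mpg] "  Std. Err. = " _se[mpg]
```

    The root MSE is the square root of the residual MS, which is a quick way to check that the blocks have been read correctly.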

    For further advanced estimation procedures the following commands might be helpful:

    Regression procedures    Description
    -----------------------  ------------------------------------------------
    rreg                     Robust regression
    ivreg                    Instrumental variables regression
    heckman                  Heckman's selection model
    probit/logit             Probit/Logit analysis
    mlogit                   Multinomial logit
    tobit                    Censored-normal and Tobit regression
    arima*                   Autoregressive integrated MA models
    arch*                    AR conditional heteroscedasticity estimators
    xt...*                   Panel analysis
    st...*                   Survival time data

    * data must be time-series/panel data/survival data; use the command tsset panelvar timevar or
    stset.

    You may want to type quietly in front of the regression command to suppress the output table,
    especially if you only want to make predictions or are only interested in the coefficients.
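
    As a minimal sketch of this use of quietly, again assuming the example dataset auto shipped with STATA (the variable name price_hat is made up for the illustration):

```stata
* Load the example dataset shipped with Stata (an assumption for this sketch)
sysuse auto, clear

* Fit the model without printing the output table
quietly regress price mpg weight

* The results are still stored and can be used afterwards
predict price_hat, xb
display "Coefficient on mpg: " _b[mpg]
matrix list e(b)
```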

    Post-estimation commands

    After having done a regression you might want to use the results for further analysis, e.g. some

    significance test or graphical analysis. In the following table you can find some useful commands

    to save regression results, to calculate marginal effects or to do certain types of tests. For more

    detailed information see the HELP function.

    Post estimation command   Description
    ------------------------  ----------------------------------------------------------
    predict name, options     saves predictions (xb), residuals (res), influence
                              statistics, etc.
    predictnl name, options   point estimates, standard errors, testing, and inference
                              for generalized predictions
    mfx                       marginal effects or elasticities in a nonlinear equation
                              (e.g. probit, logit etc.)
    estimates store name      saves estimation results under name
    estat                     AIC, BIC, VCE, and estimation sample summary
    mat name = e()            saves the estimation results such as the coefficients
                              (e(b)) or the variance covariance matrix (e(V))
    test                      Wald tests for simple and composite linear hypotheses
    testnl                    Wald tests of nonlinear hypotheses
    lrtest                    Likelihood-ratio test (to conduct the test, both the
                              unrestricted and the restricted models must be fitted
                              using the ML method)
    hausman                   Hausman's specification test: tests if an efficient and a
                              consistent estimator are significantly different.
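
    A few of these commands are put together in the following minimal sketch. It assumes the example dataset auto shipped with STATA; the variable, matrix and model names (yhat, uhat, b, V, unrestricted, restricted) are made up for the illustration. The last line uses margins, which newer versions of STATA (11 and later) provide in place of mfx.

```stata
* Load the example dataset shipped with Stata (an assumption for this sketch)
sysuse auto, clear

regress price mpg weight foreign

* Save fitted values and residuals as new variables
predict yhat, xb
predict uhat, residuals

* Wald test of the joint hypothesis that both coefficients are zero
test mpg weight

* Information criteria (AIC, BIC) and variance covariance matrix
estat ic
estat vce

* Copy the coefficients and their variance covariance matrix
matrix b = e(b)
matrix V = e(V)
matrix list b

* Likelihood-ratio test: both models are fitted by ML (logit)
quietly logit foreign mpg weight
estimates store unrestricted
quietly logit foreign mpg
estimates store restricted
lrtest unrestricted restricted

* Average marginal effects after the last logit
margins, dydx(*)
```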
