christina gustafsson introductory guide to sas enterprise ...lipas.uwasa.fi › ~chg ›...

38
Christina Gustafsson Introductory Guide to SAS Enterprise Guide 6.1 Part I 1. Introduction and Main Windows .................................................................................................. 1 2. Help? ............................................................................................................................................. 5 3. SAS datasets ................................................................................................................................. 6 3.1. Libraries ................................................................................................................................... 6 3.2. How to Type in a SAS Dataset in EG ...................................................................................... 7 3.3. How to Open a Microsoft Excel File in EG ........................................................................... 10 3.4. How to Open a SAS Dataset in EG....................................................................................... 13 3.5. How to Import Data from a Text File .................................................................................... 13 3.6. How to Save Datasets ............................................................................................................ 14 4. About SAS Tasks and Results .................................................................................................... 15 5. About Modifying Sas Datasets .................................................................................................. 19 5.1. Creating Display Formats ...................................................................................................... 19 5.2. Transforming Variables and Creating Computed Variables .................................................. 21 5.3. Classifying Continuous Variables And Recoding Data Values ............................................. 27 5.4. Sorting And Filtering Data ..................................................................................................... 31 5.5. How to Merge Datasets ......................................................................................................... 36 5.6. Some Other Tasks to Modify SAS Datasets .......................................................................... 37

Upload: others

Post on 24-Jun-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

Christina Gustafsson

Introductory Guide to

SAS Enterprise Guide 6.1

Part I

1. Introduction and Main Windows .................................................................................................. 1

2. Help? ............................................................................................................................................. 5

3. SAS datasets ................................................................................................................................. 6

3.1. Libraries ................................................................................................................................... 6

3.2. How to Type in a SAS Dataset in EG ...................................................................................... 7

3.3. How to Open a Microsoft Excel File in EG ........................................................................... 10

3.4. How to Open a SAS Dataset in EG....................................................................................... 13

3.5. How to Import Data from a Text File .................................................................................... 13

3.6. How to Save Datasets ............................................................................................................ 14

4. About SAS Tasks and Results .................................................................................................... 15

5. About Modifying Sas Datasets .................................................................................................. 19

5.1. Creating Display Formats ...................................................................................................... 19

5.2. Transforming Variables and Creating Computed Variables .................................................. 21

5.3. Classifying Continuous Variables And Recoding Data Values ............................................. 27

5.4. Sorting And Filtering Data ..................................................................................................... 31

5.5. How to Merge Datasets ......................................................................................................... 36

5.6. Some Other Tasks to Modify SAS Datasets .......................................................................... 37

Page 2: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

1

1. INTRODUCTION AND MAIN WINDOWS

SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows client application. SAS Enterprise

Guide is a graphical user interface to SAS 9.4 (= Statistical Analysis Software) software. When you

use SAS Enterprise Guide, you are also using SAS software behind the scenes. As you access data

and build tasks, SAS Enterprise Guide generates SAS code. When you run a task, the generated

code is sent to SAS for processing and the results are returned to SAS Enterprise Guide.

The first time you start SAS EG, a welcoming window is opened. You can select one of the options

or just close the window.

If you close the welcoming window and you are using SAS EG for the first time, the windows are

arranged in the default application layout which consists of the Project Tree, the Server List

window and the workspace area. The workspace area is the main area of the SAS EG application

and it is used to display for example your data, task results and process flows. At first, the Process

Flow is the only window that is open in the workspace area. When you generate reports or open

datasets, other windows open in the workspace with a tabbed interface. You can use the recently

viewed items menu in the upper-left corner of the workspace to navigate between the windows.

If you customize your layout by closing, opening, or changing the position of windows, your

changes are saved when you exit SAS Enterprise Guide. If you want to restore the default layout,

you can select from the menu bar Tools > Options, and then click Restore Window Layout button

and then the OK button. If you close one of the application windows and want to restore it, you can

select the window name from the View menu.

The File menu contains commands relating to handling of files, such as open, save, import and

print. The Edit menu contains commands relating to text editing, such as copy and paste. With this

menu you can also protect or unprotect your dataset. With the commands in the View menu you can

select for example which windows are open in your project. With the commands in the Tasks menu

you can modify your dataset, perform statistical analysis and create graphs. The Program menu

contains commands relating to creating and running SAS code. With the commands in the Tools

menu you can change EG’s options and assign libraries. With the commands in the Help menu you

can access comprehensive help.

Page 3: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

2

In SAS Enterprise Guide, the work that you do is saved in projects. A project is a collection of data,

tasks and results. You can save a project and its contents to any location by selecting File > Save

Project As. The Save window opens, then select a location for the project and type a nice name for

your project in the File name box, and click “Save”. Project file is saved with the extension .egp

and the project file icon looks like this:

The Project Tree displays a hierarchical view of the active project. You can use the Project Tree

window to manage the objects in your project. You can delete, rename, and reorder the items in the

project. You can also modify and re-run your tasks via the Project Tree window. In front of the

name of the task there is a little icon to show what kind of result the task generated. For example the

icon means that the result of the task One-Way Frequencies is (at least) one table and the icon

illustrates that with the task Linear Regression the results are graphs (and tables, too). The icon

represents a SAS dataset.

Page 4: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

3

You can have one or more process flows in your project. You can copy or move objects between

process flows. When you create a new project, an empty Process Flow window opens. As you add

data, run tasks, etc., an icon for each object is added to the process flow. The process flow displays

the objects, any relationships that exist between the objects, and the order in which the objects have

been generated.

In the previous Process Flow window, the SAS dataset questionnairedata is the input of several

tasks, for example the One-Way Frequencies task, and the SAS Report -One-Way Frequencies

object is the output of the task. So, this Process Flow window shows perhaps a little clearer picture

how your tasks have proceeded than the Project Tree window.

Page 5: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

4

In SAS Enterprise Guide you use tasks for example to transform data and to run specific analytical

procedures. You choose tasks by using the Task List window or by using menus. The Task List

window is hidden by default, but you can display it by clicking the Task List button in the

Resources section in the lower-left corner of the workspace.

The Task List has four different views. The Task Gallery view shows individual tasks, grouped by

categories. The Tasks by Category view lists individual tasks, grouped by type. The Tasks by

Name view lists individual tasks alphabetically and it also lists the SAS procedures that are related

to the task. The Task Templates view shows your task templates if you have done some.

The Task Status window displays messages about the status of tasks as they are processing. The

Task Status window is not displayed by default, but you can view it by selecting from the menu bar

View > Task Status.

Project Log and Log Summary windows/tabs contain information about your SAS tasks: which

procedures have been used and did everything go as planned.

Page 6: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

5

2. HELP?

There are many ways that you can get help with your work in SAS EG. Comprehensive help is

available by selecting from the menu bar Help > SAS Enterprise Guide Help. From this window

you can browse through the table of contents and index, or you can use the search feature.

There is a context help area at the bottom of each task window that provides targeted help for the

options that you can select within a SAS task window. In some task windows there is an animated

help to assign the variables. And in each task window there is Help button that provides primary

help. Sometimes it is even necessary to use the manual of the SAS 9.4 software

(http://support.sas.com/documentation/).

Page 7: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

6

3. SAS DATASETS

3.1. Libraries

Libraries are shortcut names for directory locations or folders on your computer. Libraries contain

for example SAS datasets and variable display formats. You need the library reference to define the

folder where the new SAS datasets will be saved in, or where the existing SAS datasets are. Some

libraries are defined by SAS, and some are defined by SAS EG. The first time you start the SAS

EG, two folders will be created automatically: My SAS Files and its subfolder 9.4. In SAS software

a permanent SAS library called SASUSER is the shortcut name for the folder 9.4. Beware of the

SAS library called WORK because it is a temporary library that exists only during your SAS

session.

You can also create your own libraries. When you are assigning an own library, first you have to

create a new folder on your computer (e.g. SASDATAT). When you are assigning your own library

reference, it stands only the SAS session that is running. So, every time you start SAS EG or your

project again, the temporary library reference has to be assigned again. You can start assigning our

own library by selecting Tools > Assign Project Library… A four step tool window opens. In the

first step, type in a name for your library (e.g. MYOWN), and then select the server (usually the

only option is Local). Click “Next” to enter the second step.

In the second step, specify the path of your folder. Click “Next”. The next two steps you can just

skip by clicking “Next”. Lastly click “Finish”.

Page 8: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

7

You do not necessarily have to use your own libraries, because as a permanent library you can use

the SASUSER library.

3.2. How to Type in a SAS Dataset in EG

SAS is a bit clumsy tool to type in a dataset, but here are some ideas how you might proceed. To

type in a new dataset select from the menu bar File > New > Data. A two-step tool window opens.

In the first step, type in a name for your dataset (e.g. Data1) in the Name box, and at the Locations

zone select the library (e.g. MYOWN) to save your data in. Click “Next” to enter the second step.

Page 9: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

8

In the second step here you can define properties of the variables. Variables have properties such as

name, type, length, label, informat, and format.

Click the default name A of the first variable (=column). You can type in a new short name for the

first variable in the Name box (for example age). The name of the variable can contain up to 32

characters, but it’s recommended to use shorter names. The characters you can use are mainly

letters (not Scandinavian) and numbers. Do not use special characters or space. Start the variable

name with a letter.

You can type a label of the variable in the Label box (for example Age in years) to explain what the

variable really measures. You can use whatever characters you want to.

Column (variable) Type can be either character or numeric. The column's type is important because

it affects how the column/variable can be used in a SAS Enterprise Guide task. It’s recommended to

use numeric variables where every you can use them. Numeric variables can contain only numeric

values (as the variable age does). When the Type is numeric, the Group can be either numeric,

currency, date, or time. If your variable is numeric, currency, date, or time, you can select from a

selection of Read-in formats, how to type in the data values. You can also select from a selection

on Display formats, how to show the data values in your dataset. Unfortunately, the formats for

date and time do not always work as they should. (So, when you are typing in your data, check that

your data looks the way it should. If it doesn’t, you can try to use another format.) Most of

statistical variables are numeric, so the values are usually integers or real numbers and you do not

have to select any special read-in or display formats. But for special purposes (usually for

categorical variables) you can even define your own formats.

Page 10: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

9

Character variables can contain any values. When the Type is character, the Group is also

character.

Page 11: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

10

When you have defined your variables, click “Finish”. A data grid opens and you can type in the

data values. For numerical variables the decimal separator is period. If you do not have any value

for a certain numeric variable just leave the cell empty, and automatically there will appear a period

in the cell as a mark of a missing value if the variable is a numeric one. Missing character values are

represented by a blank. In the data grid one column is one variable, and one row is usually one case

(= observation). On the top of a column you can see the variable name and also information about

the type or group of that variable.

When you have finished typing in your data values, you can protect the data by selecting Edit >

Protect Data. If you want to edit your data again, you have to unprotect it first by unselecting Edit

> Protect Data. If you want to edit the properties of your variables, you can do it by right-clicking

the variable name and then select Properties.

Now this dataset is ready, the name of the dataset is DATA1 and it is stored in a library called

MYOWN which refers to the folder called SASDATAT. The file extension for a SAS dataset is

.sas7bdat and the icon for a dataset looks like this: .

3.3. How to Open a Microsoft Excel File in EG

Before you convert the Excel file into a SAS dataset, it’s a good idea to maybe edit the Excel sheet

a bit. The first row of the Excel sheet might contain just the short names of the variables (but do not

use special characters and Scandinavian letters), and starting from the second row you then have the

variable values of your data. It’s also a good idea to look through the data: how many variables

there are, how many cases there are, are the variables character or numeric ones, …

You can start to open a data file in SAS Enterprise Guide simply by selecting File > Open > Data.

Then you can browse the folder structure of your computer to locate your data file. You can open

any type of data file that SAS Enterprise Guide can read (for example SAS data files (with

extension .sas7bdat, .sas7bvew, .sd7, .sd2), Microsoft Excel data files (.xls, .xlsx, .xslm, .xlsb) and

other kind of data files (.txt, .html)).

Page 12: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

11

Double-click the file icon you want to open. A four step tool window “Import Data from …”

opens. You can use this task to convert non-SAS data into SAS dataset. In the first step you can

define a name for your SAS dataset and the destination library (e.g. SASUSER). The source data

file (Consumptions.xls) is the file that you are using as a basis for your output SAS dataset

(Consumptions_in_Europe). Click “Next” to enter to the next step.

In the second step you select which Excel worksheet contains the data. You can also define which

columns and rows in the Excel sheet has the data you want to convert in to a SAS dataset. If you

don’t use this option, all the rows and columns in the worksheet are going to be converted. If there

are variables names in the first row of the selected range in your Excel sheet, the names will be used

in SAS also, when you check “First row of range contains field names..”. If you suspect that the

variable names in Excel sheet do not comply the SAS variable naming conventions, you should also

check “Rename columns to comply…”.

In the third step you can browse through the variables and their properties. If you do not need some

variable in you SAS dataset, you can uncheck “Inc” (=Included). You can rename the variables and

change their properties by clicking first the variable you want to modify and then Modify. Date and

time data might cause some problems, so sometimes it’s necessary to modify the formats of those

variables. Sometimes you even have to modify your Excel sheet beforehand to get the data as it

should be.

The fourth step you can usually skip. Lastly click “Finish”. A data grid opens and it consists of the

observations in your Excel file. Check that the dataset looks at it should look. If it doesn’t, you

might have to edit your Excel sheet or/and import the file again with some modifications.

Page 13: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

12

Page 14: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

13

3.4. How to Open a SAS Dataset in EG

You can open an existing SAS dataset in EG simply by selecting File > Open > Data. In the Open

Data window select the folder where you have saved your dataset and the double-click the dataset

icon. If you have saved your data with the library reference (for instance SASUSER), then you can

open the dataset in the Open Data window by first clicking Servers and then open Local by

double-clicking the icon, open the Libraries by double-clicking, open the proper library by double-

clicking the library icon, and open your SAS dataset by double-clicking the icon.

3.5. How to Import Data from a Text File

If your data is saved as a text file (extension .txt), first you have to figure out how the data values

are arranged in your file. If the variable fields are aligned so that each variable field starts in the

same column for each row, then your text format is Fixed Width. If there is a specific character (for

instance such as comma, tab, semicolon) that separates the variable fields, then your text format is

Delimited.

You start to import the data from a text file by selecting File > Import Data. Again you have to

select the proper folder to access your text file. Double-click your text file icon and again a four

step tool window opens. The first step is similar to cases when you’re converting an Excel sheet to a

SAS dataset. In the first step you select your destination library and type in a name for your SAS

dataset

In the second step, in the Text format zone, you select either Delimited fields or Fixed columns

depending on how the variable values are arranged in your text file. If you select Delemited fields,

you need to define which character is the delimiter.

If you select Fixed columns, you need to specify column breaks by clicking on the ruler at the start

of each variable (except first one). If the variable names are included in your text file for instance

on the first row, you should check “File contains field names on record number”.

Page 15: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

14

In the third step, you can browse through the variables and their properties. If you do not need some

variable in you SAS dataset, you can uncheck “Inc” (=Included). You can rename the variables and

change their properties by clicking first the variable you want to modify and then “Modify”.

In the last step, you can accept the default selections and click “Finish”.

3.6. How to Save Datasets

Whether you create your SAS dataset by typing it in, opening an Excel file, or importing a text file,

your data is saved automatically as a SAS dataset file in your destination library (e.g. SASUSER).

So, even if you do not save your projects, you still have your SAS datasets (unless you have been

using WORK library as your destination library).

You can also save the dataset in a SAS data file format (file extension is then .sas7bdat) or a

different file format (for example as an Excel file) by selecting from the menu bar File > Export >

Export [Datasetsname]. Then choose the folder where you want to save in the file and select file

type (for example as Microsoft Excel) and type your file a name. You can do the saving also via

project tree or process flow by right-clicking the dataset icon and then from the pop-up menu select

Export [Datasetsname] and etc.

Page 16: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

15

4. ABOUT SAS TASKS AND RESULTS In SAS Enterprise Guide, task windows have a common format, so once you are familiar with

running one task, running other tasks is easy.

So, let’s perform some simple statistics. I have a SAS dataset questionnairedata. There are 322

cases (= rows) and 10 variables (= columns). The cases represent students in the University of

Vaasa. One case is one student. When the SAS dataset is open in the workspace, you can browse

the properties of the dataset by clicking Properties button.

When the Properties window opens, you can see on the left side a selection pane. When you click

Columns tab, you can browse the properties of the variables.

In the project tree or process flow, select your dataset to make it active by double-clicking the

dataset icon. (I know and maybe you know, too that) the variable math is measured on ordinal level,

so a nice way to present the distribution of math, is to create a frequency table. Select from the

main menu bar Tasks > Describe > One-Way Frequencies. You can do the same task by selecting

from the dataset’s own menu bar (or from the Task List window) Describe > One-Way

Frequencies. In the selection pane, select Data. From the Variables to assign list select math and

drag it to the Analysis variables role. (You can select the proper role also by using the arrow

button). Click Run to perform the task. Save saves the task without running it, Cancel cancels the

task and Help displays help for the task. (As you select options in the task window, SAS Enterprise

Page 17: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

16

Guide generates the SAS code that creates the results. If you want to see the code, click the Preview

code button in the task window.)

Wait a while and you can view the result and other tabs related to One-Way Frequencies task:

Page 18: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

17

If you want to present the distribution as a bar graph, too, you can modify your task by clicking

Modify Task button. In the selection pane, select Plots. Then check either Vertical or Horizontal

and click Run. You can replace your previous results by answering Yes to the next question.

And again you can view the results.

Tasks generate SAS Report output by default. You can change the output type by selecting Tools >

Options. In the selection pane, select Results General and check one of the Result Formats. If

you use the default format: SAS Report, then you can copy result tables and graphs into a word

processing software and the copied tables are editable, however, the graphs are uneditable.

You can also change the output type of an individual task by changing the task properties. In the

project tree, right-click the task which output format you want to change, then select Properties

from the pop-up menu. In the selection pane, select Results. Select Customize result formats,

styles, and behavior and then select the desired type and click “OK”.

Page 19: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

18

Page 20: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

19

5. ABOUT MODIFYING SAS DATASETS

Before you can modify your SAS dataset, you have to make your dataset active (for instance

double- click the data set icon in the project tree).

You can start modifying your data by selecting from the main menu bar Tasks > Data. Then select

a task from the pop-up menu. You can select the modifying task also from the Task List window.

Some tasks you can even select from the dataset’s menu bar.

5.1. Creating Display Formats

When you have categorical variables in your dataset, you usually have some code numbers or

characters to present the data values. For example, in the questionnairedata, I have originally typed

in integers 1 and 2 as code numbers for gender. So, at first, the gender column looked like this:

Page 21: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

20

In order to explain what the code numbers or character mean in a dataset, you can create a special

display format for a specific variable.

You start creating your own display format by selecting Tasks > Data > Create Format. In the

selection pane, select Options tab. Type a name (for example gend) for the format in the Format

name box, in the Location to store format zone select always Currently assigned libraries and

the SASUSER library. In the Format type zone select now Numeric (because integers 1 and 2 are

numbers and they have been used as code numbers for gender).

In the selection pane, select Define formats. Click the upper New button and replace Label1 with

the word male and type in 1 in the Values box to explain that number code 1 means male. Again

click the upper “New” and replace Label2 with the word female and type in 2 in the Values box to

explain that code 2 means female. Click “ Run“.

Page 22: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

21

Activate your dataset, unprotect it and in the dataset tab right-click variable name gender to link

up the format with the right variable. Select Properties. In the selection pane, select Formats tab

and from the Categories select User Defined and from the Formats select the GEND and click

“OK”. In this Formats list you can only such display formats where the type is the same than the

current variable’s type. If you have created a display format for a character variable, SAS

automatically adds the prefix $ sign in the format’s name.

And now the gender column looks like this:

So, first you create a display format and then you link it up with the proper variable.

5.2. Transforming Variables and Creating Computed Variables

When you want to transform your variables or want to create new variables based on some existing

variables, you can use Query Builder tool. With this tool, the original input dataset does not

change at all, but as a result you get a new dataset with all the modifications you have defined.

By default, no variables are included in the result set of a query. If you want to add all the variables

that exist in your input dataset to the result dataset, you can do that beforehand by selecting from

the main menu bar Tools > Options and select in the selection pane Query and then check

“Automatically add columns from input tables to result set of query” and click “OK”.

Page 23: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

22

Let’s make a new SAS dataset that has all the same variables as questionnairedata and some new

variables, too. Let’s assume that you have beforehand defined that all the variables in the input

dataset are included in the result dataset. Select Tasks > Data > Query Builder. You can type in a

descriptive name for your query task in the Query name box (e.g. Transformations and new

variables). You can change the result dataset name and library by clicking “Change” and typing in

a descriptive name (e.g. modified_questionnairedata) and selecting the desired library. In the

Output name box you see the name of the selected library (e.g. SASUSER) in front of the

dataset’s name. On the left side of the window, you can see the variables in the input dataset (=

QUESTIONNAIREDATA, Query Builder uses shortened name t1 (= table1). All the variables in

the Select Data tab are included in the result dataset.

Let’s make a new variable which measures the percentage of attendance.. The new variable is based

on two existing variables: attend1 and attend2. Click “Computed Columns” button and “New”.

A six (or four) step tool window opens. Select Advanced expression to build a mathematical

expression and click “Next”.

Page 24: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

23

In the second step, you can enter for instance mathematical expression by using functions,

constants, mathematical operators and variables. In the selection pane, you can see three main

items: Functions, Tables and Selected Columns. If you double-click “Functions”, you can look at

all the functions (in alphabetical order) that are available for building an expression. If you click a

function (e.g. ABS) once, you can see a short description of that function. If you double-click a

function, you are adding that function to your expression. If you click “ Tables”, you can browse all

the datasets that you can use in expression building. If you click “t1”, you can see all the variables

that you can use in expression building. If you double-click a variable, you are adding that variable

to your expression. In that case the “t1” name will appear in front of the variable’s name. If you

click “Selected Columns”, you can see all the variables in the output dataset.

You can also enter an expression by just typing if you know for example the names of the variables

and functions you need in your expression.

Now, let’s enter the expression into the expression box at the top of the window in order to

calculate percentage of attendance. The first version was created by double-clicking the variable

names on t1 table’s name list and the latter version was created by just simply typing the names of

the variables. The result is the same whichever method you use. Then click “Next”.

Page 25: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

24

In the third step, you can rename your new variable by typing in a new name in Column box (e.g.

attperc) and you can type a Label, too (e.g. Percentage of attendance).You can Change the

display Format if you need to. Click “Next”.

In the fourth step, you can see properties of the new variable. Click “Finish” and the Computed

Columns window opens.

Page 26: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

25

In the previous expression, only variables, arithmetical operators and a constant were used. Let’s

create a new expression with a function. Many statistical tests have some special assumptions about

the distribution of the variable to be analyzed (e.g. normal distribution). If the assumptions are not

satisfied, you usually have to try out some kind of transformation to satisfy the assumptions.

If you examine the distribution age, you can see, that the distribution is positively skewed (= right-

skewed). To get the distribution less skewed, you could try out to do a logarithmical transformation.

So, click “New”. Select Advanced expression and continue with “Next”. Double-click Functions

and double-click the function LOG10 to add it into the expression box. Function LOG10 requires

one numeric argument in round brackets, so replace <numvalue> with the variable age by first

double-clicking the table “t1” and then double-clicking age to add it into the expression box OR

you can just type age to replace <numvalue>.

Click “Next” and again you can rename your new variable (e.g. Log10Age). And click “Finish”.

The Computed Columns window opens and now you can close it by clicking “Close”.

In the Query Builder task window, you can see the two new variables. You can check what your

dataset looks like by clicking “Preview” and then selecting Results tab. Close the window. If you

have to do some corrections, you can edit you expression by double-clicking your new variable in

the Select Data tab.

Page 27: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

26

When everything is correct, click “Run” and now you can look your new dataset and admire your

two new variables along with original ones.

So, if you want to create a new variable with a (mathematical) expression, you start to create a

query where you give your result dataset a nice name and you specify the library you want to save

the dataset in and you can give a name for your query task. Then you compute the new column

(variable) by advanced expression where you enter the expression and then you give a name (and

maybe label and display format, too) for your new variable.

Page 28: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

27

5.3. Classifying Continuous Variables And Recoding Data Values

With Query Builder you can replace individual values with values that you specify, or classify

values of continuous variable into specified classes.

Again, you start by selecting Tasks > Data > Query Builder. You can type in a descriptive name

for your query task in the Query name box and you can change the result dataset name (and

library) by clicking “Change” and typing in a nice name (and selecting the desired library).

Let’s classify exam into four classes: 1) at most 9 , 2) 10 - 19, 3) 20 - 29 and 4) 30 or more. Click

“Computed Columns” button and click “New”.

A six (or four) step tool window opens. Select Recoded column and click “Next”.

In the second step, select the variable you want to classify or recode (e.g. exam). Click “Next”.

Page 29: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

28

In the third step, you can specify the type for the new variable and you can specify the classes (or

the new code numbers) by clicking “Add”.

T

Now, a new Specify a Replacement window opens. You can choose whether you want to replace a

single value, a range of values, or values that meet a specific condition.

Since exam is a continuous variable, click the Replace a Range tab. For the first class, you don’t

have to set a lower limit, so check the Set an upper limit check box. Then type the upper limit

value (e.g. 9) or select the value from the drop-down list. In the With this value box, type the value

that you want to assign (e.g. 1) and click OK to add the replacement value to the list. So, now the

first class is done.

Page 30: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

29

Click “Add” again to create the second class, click the Replace a Range tab, check the Set a lower

limit check box, type the lower limit value (e.g. 10), check the Set an upper limit check box, type

the upper limit value (e.g. 19), in the With this value box, type the value that you want to assign

(e.g. 2) and click OK.

To create the third class click “Add” again, click the Replace a Range tab, check the Set a lower

limit check box, type the lower limit value (e.g. 20), check the Set an upper limit check box, type

the upper limit value (e.g. 29), in the With this value box, type the value that you want to assign

(e.g. 3) and click OK. To complete the fourth class, click “Add” again, click the Replace a Range

tab, check the Set a lower limit check box, type the lower limit value (e.g. 30). For the last

(=fourth) class, you do not have to set an upper limit. In the With this value box, type the last value

that you want to assign (e.g. 4) and click OK.

Page 31: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

30

In the third step window, you can see all your new classes. In the Other values zone, you can select

the type of value with which you want to replace all other values that are not included in the list of

replacements. The current value option uses the current value of the selected variable, A missing

value option replaces the values with a missing value and Specify a value option replaces the

values with a new value that you can type in the value box. When you have done all the

replacements that you want, click “Next”.

In the fourth step, you can specify a name and a label for your new variable (e.g. examcl, Classified

exam) and you can change display format, too. Click “Next”. In the fifth step, you can see

properties of the new variable. Click “Finish” and the Computed Columns window opens and you

can close it by clicking “Close”.

Page 32: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

31

In the Query Builder task window, you can see the new variable. Again, you can check what your

dataset looks like by clicking “Preview” and then selecting Results tab. If you have to do some

corrections, you can edit you expression by double-clicking your new variable in the Select Data

tab. When everything is correct, click “Run” and now you can look your new dataset and admire

your new variable.

To replace a single value, you have to click the Replace Values tab in the third step. Click “Add”

and type in the value you want to replace (you can also select the value from the list of values by

clicking the button next to “Add”). Then enter the new value in the With this value box and click

OK. You can repeat these steps to add more replacement expressions. After you have created all of

the replacement expressions that you want, click “Next” and continue in the same way you did

earlier when classifying a continuous variable.

So, if you want to create a new variable with classification or recoding, you start to create a query

where you give your result dataset a nice name and you specify the library you want to save the

dataset in and you can give a name for your query task. Then you compute the new column

(variable) by recoded columns where you enter replacements for your data values and then you give

a name (and maybe label and display format, too) for your new variable.

5.4. Sorting And Filtering Data

If you want to sort the data by one or more variables from the original dataset or you want to create

a filter in order to select only the cases that meet certain criteria, select Tasks > Data > Filter and

Sort. (You can also filter or sort data with Query Builder.)

Page 33: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

32

Filter and Sort window opens. In the Variables tab, you have to select always whether you are

filtering or sorting, which variables to include in the output dataset. You add or remove the

variables by using the arrow buttons. In the next example three variables (math, prog and random)

are not included in the output dataset, however, seven variables are selected to the result dataset.

When you want to sort your data, select the Sort tab. From the Sort by drop-down list, select the

variable (e.g. age) that you want to use to sort the cases and the select whether you want to sort the

values for the variable in ascending or descending order. To sort by additional variables, select

another variable (e.g. exam) from the Then by drop-down list and select whether you want to sort

the values for the variable in ascending or descending order. You can repeat this step to sort your

data by additional variables.

Then select the Results tab in order to give your Task and your new Output dataset nice names

(e.g. Sort_dataset and sorted_by_age_exam) and also to specify desired library to save the dataset

in (e.g. SASUSER).

Page 34: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

33

Now, let’s filter your data. Let’s create an output dataset that is a subset of the original dataset in

the sense, that the output dataset contains only those cases that meet certain criteria. To just filter

your data, select Filter tab. You can use just one variable in a filter, or you can use multiple

variables to create for instance several comparison expressions. Let’s try out now the latter version

of filtering.

Let’s create an output dataset where the cases are such female students who are at most 25 years

old. Select a variable from the first drop-down list (e.g. age). Select an operator from the second

drop-down list (e.g. Less than or equal to). Enter a value in the third box (e.g. 25). You can add

additional filter conditions by selecting AND or OR from the last drop-down list (e.g. AND). Then

select a variable from the drop-down list (e.g. gender) and select an operator from the drop-down

list (e.g. Equal to). Enter a value in the box (e.g. 2). You can repeat these steps to create additional

filter elements. Click “OK” to create the output dataset. Then select the Results tab in order to give

your Task and your new Output dataset nice names (e.g. Filter_dataset and females_under_26)

and also to specify desired library to save the dataset in (e.g. SASUSER).

If you want to filter your dataset and need to add some computed variables to the dataset at the

same time, you should do the filtering with Query Builder. Select Tasks > Data > Query Builder.

You can (and your should) type in a descriptive name for your query task in the Query name box

and you can change the result dataset name (and library) by clicking “Change” and typing in a

descriptive name. Then select the Filter Data tab. Click the New Filter button.

A four step tool window opens. If you are using just one variable in a filter, select in the first step

Basic Filter and click “Next”. Let’s create such an output dataset where the cases are such students

whose math values are at least 4.

Page 35: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

34

In the second step, select the variable on which you want to filter the data (e.g. math) and click

Next.

In the third window, select a comparison operator (e.g. Greater than or equal to) from the

Operator drop-down list. If the operator that you have selected requires a value, you can enter the

value in the Value box (e.g. 4). Click “Next” or “Finish”. Now, the output dataset contains only

those students whose knowledge of basic mathematics has been either good (4) or very good (5).

Page 36: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

35

If you are using multiple variables in creating a filter, select in the first step Advanced Filter and

click “Next”. In the second step, you can build your filtering expression by using mathematical and

logical operators (e.g. >=, =, OR), functions, variables (e.g. math, age), and constant values (e.g. 4,

22). Click “Next” or “Finish”. Now, the output dataset contains only students whose knowledge of

basic mathematics has been either good (4) or very good (5) or whose age is exactly 22.

Page 37: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

36

5.5. How to Merge Datasets

Let’s assume that you have several datasets which you want to merge into a single dataset to

perform statistical tasks.

If the datasets have the same variables but the cases are different, then you can concatenate the

datasets. Activate one of your datasets and select Tasks > Data > Append Table. The Tables to

append area lists the tables that will be concatenated, in the order in which they will be

concatenated. First, the only dataset is the one you have just activated. To add another dataset, click

“Add Table”. Open Data window opens. Select the location and the name of the dataset to add. If

you want to select several datasets at the same time, hold down the CTRL key while selecting the

datasets that you want to add. Click “Open” to add the dataset to the Tables to append area. In the

selection pane click Results tab, and in order to create a SAS dataset, select Data table. Click

“Browse” and type in a name for your new dataset and select the location to save the data.

If the datasets have the same cases but at least some of the variables are different, you can join the

datasets by using Query Builder. But at first, activate one of your datasets and then select Tasks >

Data > Query Builder. Again, you can change the names of this task and the result dataset. Click

Join Tables button. Table and Joins window opens. To add another dataset to this task, click Add

Tables button. Open Data window opens. Select the location and the name of the dataset to add. If

both of your datasets have a key variable (e.g. Sukunimi) to identify and correctly match cases from

the two datasets, then EG automatically joins the datasets in a way that the cases in the result

dataset include those cases for which the value of a key variable in the first dataset matches the

value of a key variable in the second dataset. This join type is called an Inner join. There are other

kind of joins, too.

Page 38: Christina Gustafsson Introductory Guide to SAS Enterprise ...lipas.uwasa.fi › ~chg › IntroGuide_SAS _EG_6_ch_1_5.pdf · SAS Enterprise Guide 6.1 (EG) is a menu-driven Windows

37

When you have defined the join type, click “Close”. In the Query Builder task window, it’s always

a good idea to check what your dataset looks like by clicking “Preview” and then selecting Results

tab.

5.6. Some Other Tasks to Modify SAS Datasets

When you want to delete SAS datasets or your own display formats, the best way to do it is to use

Delete Datasets and Formats tool.

To standardize your variable values for example in order to look if there are any outliers, you can

use Standardize Data tool.

With Compare data tool, you can compare whether two SAS datasets have the same variables and

cases. With this tool, you can also compare whether two variables have the same values.

With Import SPSS (or Stata) file, you can easily open SPSS (or Stata) datasets.