Biostatistics 600
Supplement 2 Packet Contents
Fall, 2011
- How to Use a Permanent SAS Data Set
- How to Create a Permanent SAS Data Set
- Overview of Data Management Tasks
- Basic One-Sample and Two-Sample Statistical Tests Using SAS
How to Use a Permanent SAS Data Set
(commands=useperm.sas)
Introduction:
This chapter discusses using permanent SAS data sets from different releases of SAS on Windows. In general, SAS is backward compatible across releases (e.g., SAS 9, SAS 8, SAS 7, SAS 6): a later release of SAS can generally read data sets created by an earlier release, provided the correct engine is specified.
The SAS data sets discussed in this handout are contained in two Zipped Files: SASDATA1.ZIP (Version 6 SAS data sets) and SASDATA2.ZIP (Version 8/9 SAS data sets). These two zipped files can be found on my web page: http://www.umich.edu/~kwelch.
Download and unzip these zip files to two folders on your desktop: SASDATA1 and SASDATA2.
SASDATA1.ZIP contains the following version 6 SAS data sets:
- FITNESS.SD2
- GPA.SD2
- MARCH.SD2
- SURVEY.SD2
Plus a version 6 formats catalog, FORMATS.SC2, which will not be discussed in this handout.
SASDATA2.ZIP contains the following SAS version 8/9 data sets:
- autism_demog.sas7bdat
- autism_socialization.sas7bdat
- bank.sas7bdat
- baseball.sas7bdat
- business.sas7bdat
- cars.sas7bdat
- employee.sas7bdat
- iris.sas7bdat
- tecumseh.sas7bdat
- wave1.sas7bdat
- wave2.sas7bdat
- wave3.sas7bdat
Plus, a SAS transport file:
- owen.xpt
And, a SAS dataset with a short file extension:
- ship.sd7
Some definitions:
Library:
A library is a location on your computer (i.e., a folder or directory) where SAS data sets are stored. Because a library refers to the entire folder (not to an individual data set), one library can have several data sets stored in it, and it is possible for them to be of mixed types. However, a particular engine assigned to a given folder will only "SEE" files of one type. It is good practice to keep SAS data sets of different types in different folders.
Default Library: Work, the temporary library, is the default library that SAS assumes if no libref is specified for a data set.
Engine:
An engine tells SAS the type of files it is to read. Some engines that you can use are:
- V9 the default engine for SAS release 9 (.sas7bdat) data sets (the V8 engine works, too)
- V8 the default engine for SAS release 8 (.sas7bdat) data sets
- V6 the engine used to read/write SAS release 6.08 through 6.12 (.sd2) data sets
- V604 to read (but not write) data sets created using PC SAS release 6.04 (.ssd) data sets
Default engine:
If you do not assign an engine to a library, the default engine will be the engine corresponding to the release of the data sets found in the folder. If there are no data sets in a folder, the default engine will be that of the current version of SAS; i.e., if you are running SAS 9.1 or 9.2, SAS will automatically use the V9 engine.
Mixed engines:
If there are data sets of mixed types within a folder, SAS will assign the engine corresponding to the highest version compatible with any of the data sets in the folder. So if there are both V6 and V9 data sets in a folder, SAS will assign the V9 engine to read data from that folder. To read V6 data sets from a folder containing both V6 and V9 data sets, the V6 engine must be used explicitly.
To avoid confusion, we highly recommend including only one type of data sets within any given folder/library.
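If a folder does end up holding mixed types, the explicit-engine approach described above can be sketched as follows. This is only an illustration: the folder name mixed and the librefs mixv6 and mixv9 are hypothetical and are not part of the course files. Two librefs point at the same folder, each with its own engine, so each libref sees only one file type.

```sas
/* Hypothetical folder holding both V6 (.sd2) and V9 (.sas7bdat) data sets */
libname mixv6 v6 "C:\Users\kwelch\Desktop\mixed";
libname mixv9 v9 "C:\Users\kwelch\Desktop\mixed";

/* Each libref lists only the members its engine can read. */
/* NODS prints the directory listing without per-data-set details. */
proc contents data=mixv6._all_ nods;
run;

proc contents data=mixv9._all_ nods;
run;
```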
Step-By-Step Instructions for Using a Permanent SAS Data Set:
There are three steps necessary to use a permanent SAS dataset.
- Determine the file type
- Assign a libref and engine
- Use the data set with a two-level name
Step 1: Determine the File type
The first step in using a SAS data set is to determine what type of file it is (i.e., the operating system from which it originated and the SAS release or engine used to create it). Check the file extension to determine which type of SAS data set(s) you have.
Operating System   SAS Release       Extension     Example
Windows            V7.0 to V9.0      .sas7bdat*    business.sas7bdat
Windows            V7.0 to V8.0      .sd7**        ship.sd7
Windows            V6.08 to V6.12    .sd2          fitness.sd2
Unix               V6.06 to V6.12    .ssd01        mydata.ssd01
Macintosh          V6.10 to V6.12    .ssd01        mydata.ssd01
DOS                V6.04 (PC SAS)    .ssd          mydata.ssd
* .sas7bdat is the default data set extension for SAS Windows release 7, 8, and 9. In newer releases of SAS, these datasets are compatible across all operating systems.
** .sd7 extension files cannot be read by SAS Windows release 9. If you have SAS data sets that end in the .sd7 file extension, rename them to .sas7bdat before trying to use them in SAS.
How to make file extensions visible
Windows XP:
Go to Windows Explorer or My Computer. Select Tools…Folder Options…View. Make sure that “Hide file extensions for known file types” is NOT selected.
Windows 7:
Click on Organize > Folder and search options.
In the Folder Options pop-up box, select the View tab. Make sure that “Hide extensions for known file types” is NOT selected.
Step 2: Assign a libref and engine
You can assign a libref using one of the methods listed below. We discuss each in turn.
- Libname Statement
- Assign New Libraries Icon
How to assign a library using a Libname statement:
The libname statement assigns an alias (called a libref) to a directory that you specify. The directory must already exist. The libref that you assign must be 8 characters or fewer to be valid in SAS. You can submit any number of libname statements in a given SAS session.
Example:
- Download sasdata1.zip and sasdata2.zip and unzip them to two desktop folders: sasdata1 and sasdata2.
- Submit SAS commands to assign a libname to each of these folders, as shown below
libname sasdata1 v6 "C:\Users\kwelch\Desktop\sasdata1";
libname sasdata2 v9 "C:\Users\kwelch\Desktop\sasdata2";
Note: no engine needs to be specified for sasdata2, because it uses the default (V9), but we include it here for completeness. Be sure you highlight and submit both commands.
- View the datasets in each library using the SAS Explorer window.
Click on the libraries icon in the SAS Explorer window to view the libraries.
Double-click Sasdata1 to view the datasets in that library. Double-click Sasdata2 to view the datasets in that library.
Note that the SAS datasets have an icon that looks like a spreadsheet. The formats catalog looks like a folder. We will not use the formats catalog at this time. Double-click on any dataset to browse it in viewtable mode.
You can keep the viewtable window open while you use a dataset, but it is not necessary. The viewtable window must be closed before you can sort or modify the dataset. Click on the small x to close just this dataset.
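As an alternative to browsing in the Explorer window, a library assignment can also be checked from the Log. The sketch below uses the list option of the libname statement and assumes the sasdata1 libref has already been assigned as shown earlier.

```sas
/* Write the engine and physical path of one libref to the SAS Log */
libname sasdata1 list;

/* Or list every libref currently assigned in this session */
libname _all_ list;
```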
Yikes! My Explorer Window Disappeared!
To open the Explorer window again, click on View > Contents Only…
How to assign a library using the New Library icon:
Click on the New Library icon in the menu bar
Type the name of your new libref in the Name: box. Choose the appropriate Engine from the drop-down menu. Browse to the folder you want to assign. Click on Enable at startup if you want to have this library defined each time you run SAS. (This option will work on your personal computer, but will not take effect at the Public Computing Sites.)
Step 3: Use the data set with a two-level name
Specify a two-level name for the data set in any procs or data steps. The two-level name is of the form libref.datasetname. Note that there are no spaces between the libref and the data set name.
Examples:
Once the libraries are assigned, you can use any SAS datasets in either the Sasdata1 or Sasdata2 library, simply by using its two-level name, as shown in the examples below.
You will need to specify the data set to use with the data = option for each procedure. The libname statement will be in effect for the entire SAS session, and so it only needs to be submitted once.
title "Business data set";
proc means data=sasdata2.business;
run;
title "Iris data set";
proc means data=sasdata2.iris;
run;
title "GPA data set";
proc means data=sasdata1.gpa;
run;
title "Fitness data set";
proc means data=sasdata1.fitness;
run;
How to get the contents of all datasets in a library:
To get the contents of all datasets in a library, use the keyword _all_ in the data = option.
proc contents data=sasdata2._all_ ;
run;
proc contents data=sasdata1._all_;
run;
Make sure there are no blanks in the libref._all_ portion of your command (e.g., sasdata2._all_).
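When a library holds many data sets, the full contents listing can get long. As a hedged sketch, the nods option of Proc Contents (valid only together with _all_) prints just the directory of library members and skips the variable-level details for each data set.

```sas
/* NODS: list the members of the library, without details on each data set */
proc contents data=sasdata2._all_ nods;
run;
```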
How to assign a default data set:
A default data set can be assigned with an options _last_= statement after a libname statement. (Be sure you have no blanks in _last_.) This allows you to utilize the same data set without having to specify it for each procedure. In the example below, sasdata2.baseball will be used for all procedures.
options _last_= sasdata2.baseball;
title "SASDATA2.BASEBALL Data Set";
proc means;
run;
proc freq;
tables team;
run;
proc reg;
model salary = cr_home;
run;
quit;
The default data set will be in effect until a new one is specified with another options statement, or until another new data set is created.
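The way the default shifts can be illustrated with a short sketch. The temporary data set name iris_copy is hypothetical and used only for illustration. The automatic macro variable &syslast, which holds the name of the most recently created data set, is written to the Log so that you can check which data set is the current default.

```sas
options _last_=sasdata2.baseball;

proc means;                 /* analyzes sasdata2.baseball */
run;

/* Creating any new data set changes the default */
data iris_copy;             /* hypothetical temporary data set in WORK */
  set sasdata2.iris;
run;

/* Write the name of the most recently created data set to the Log */
%put Most recently created data set: &syslast;

proc means;                 /* now analyzes work.iris_copy */
run;
```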
Note on Using Permanent Data Sets in SAS Release 8 and 9:
You can specify permanent SAS data sets to use by giving the complete path and file name in quotes, starting with SAS release 8. This avoids the libname statement, but does not allow a default data set to be specified. Data set options (e.g., obs = ) can still be specified in parentheses after the quoted file name.
title "SASDATA2.IRIS Data Set";
proc freq data="C:\Users\kwelch\Desktop\sasdata2\iris.sas7bdat";
tables species;
run;
proc print data="C:\Users\kwelch\Desktop\sasdata2\iris.sas7bdat"(obs=10);
run;
How to create a temporary SAS data set from a permanent one:
Many SAS users simply create a temporary SAS data set to use in a given session. This temporary data set becomes the default automatically.
data business;
set sasdata2.business;
run;
title "Business data set";
proc means data=business;
run;
This method has the advantage of allowing you to work with a temporary SAS data set, which is often simpler than working with a permanent one. But it can be cumbersome if you have a large data set, because it creates a whole new copy of the data in the WORK library.
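One way to limit the size of the copy while you are still developing a program is to read only part of the permanent data set. The sketch below uses the obs= data set option to copy the first 100 observations. The name business_test and the cutoff of 100 are arbitrary choices for illustration.

```sas
/* Copy only the first 100 observations into the WORK library */
data business_test;                  /* hypothetical data set name */
  set sasdata2.business(obs=100);
run;

title "First 100 observations of the Business data set";
proc means data=business_test;
run;
```

Remember that results from the reduced copy describe only those observations; rerun the analysis on the full data set once the program works.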
How to de-assign a library:
Use the libname statement with the option clear to de-assign a library. The library assignment will be cleared, but the data sets in the library will not be affected. Do not specify an engine here.
libname sasdata1 clear;
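To de-assign all of the librefs that you have assigned in the current session in one step, the keyword _all_ can be used in place of a single libref, as sketched below. The SAS Log reports which librefs were de-assigned.

```sas
/* De-assign every libref that can be cleared in this session */
libname _all_ clear;
```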
How to automatically assign libraries using the Autoexec.sas file:
The library or libraries that you wish to use must be re-assigned for each session, if you assign them using a libname statement. To have SAS remember your libraries from one run to another, you can create a file called autoexec.sas, and place the libname statements in it, as shown below. Each time SAS starts up, it will read the autoexec.sas file, and assign the appropriate libraries.
libname sasdata1 V6 "C:\Users\kwelch\Desktop\sasdata1";
libname sasdata2 v9 "C:\Users\kwelch\Desktop\sasdata2";
If you place the autoexec.sas file in the directory from which SAS is running, SAS will read it and execute the commands it contains each time it starts up. Alternatively, if you save the autoexec.sas file in another location, you can point to it with the -AUTOEXEC option in the SAS shortcut. An example SAS shortcut is shown below, followed by the notes in the SAS Log.
"C:\Program Files\SAS\SAS 9.1\sas.exe" -CONFIG "C:\Program Files\SAS\SAS 9.1\nls\en\SASV9.CFG" -AUTOEXEC "c:\temp\autoexec.sas"
NOTE: AUTOEXEC processing beginning; file is c:\temp\autoexec.sas.
NOTE: Libref SASDATA1 was successfully assigned as follows:
Engine: V6
Physical Name: c:\Users\kwelch\Desktop\sasdata1
NOTE: Libref SASDATA2 was successfully assigned as follows:
Engine: V9
Physical Name: c:\Users\kwelch\Desktop\sasdata2
NOTE: AUTOEXEC processing completed
How to Create a Permanent SAS Data Set
(commands=saveperm.sas)
Introduction:
A permanent SAS data set is saved to a location where it can be retrieved and used later without having to recreate it each time you restart SAS. In addition, transformations, recodes and other data manipulations are saved and do not need to be re-run every time the data set is used. Several people can share the same permanent data set over a network.
There are two steps necessary to create a permanent SAS data set:
- Assign a library and engine.
- Create the data, giving it a two-level name.
A library is a location on your computer (e.g. a folder or directory) where SAS data sets and other SAS files are stored. A library usually refers to the entire folder and not to individual data sets. One library can have several data sets stored in it. It is highly recommended that you store only one type of SAS data set in a given folder.
How to create a permanent SAS data set using a Data Step:
Suppose you wish to store your SAS data sets in a folder on your desktop, such as C:\Users\kwelch\Desktop\sasdata2. First submit a libname statement. Then use a data step to create the dataset, using the two-level name.
libname sasdata2 V9 "C:\Users\kwelch\Desktop\sasdata2";
data sasdata2.pulse;
infile "pulse.dat";
input pulse1 pulse2 ran smokes sex height weight activity;
run;
Sasdata2.pulse will contain all variables originally read into SAS using the input statement, plus any new variables that you create. It will now be the default data set.
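As a sketch of how a new variable becomes part of the permanent data set, the data step can be extended with an assignment statement. The variable pulsediff is a hypothetical example and is not part of the original pulse data. The sasdata2 libref is assumed to be assigned already.

```sas
data sasdata2.pulse;
  infile "pulse.dat";
  input pulse1 pulse2 ran smokes sex height weight activity;
  /* Hypothetical derived variable: change in pulse rate */
  pulsediff = pulse2 - pulse1;
run;

/* Confirm that the new variable was saved with the permanent data set */
proc contents data=sasdata2.pulse;
run;
```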
How to create a permanent SAS data set using the Import Wizard:
To make a permanent SAS data set using the Import Wizard, you must first submit a libname statement from the Program Editor Window.
Follow all the usual steps to import the dataset. The data set can then be saved in the (pre-defined) library in the “Select library and member” window of the Import Wizard.
For example, the libname statement below can be submitted from the Program Editor Window to define the sasdata2 library.
libname sasdata2 V9 "C:\Users\kwelch\Desktop\sasdata2";
From the pull-down menu in the Library box, choose SASDATA2 as the library. Then type the data set name, PULSE, in the Member box:
The data set sasdata2.pulse will now be the default, because it was the most recent one created by SAS in this current session. It can be used without referring to its name in the current session.
proc means;
run;
proc freq;
tables sex ran smokes;
run;
Or you can refer to the data set using its two-level name by specifying a data= option.
proc means data=sasdata2.pulse;
run;
proc freq data=sasdata2.pulse;
tables sex ran smokes;
run;
How to create a permanent SAS data set using Proc Import:
You can also import an Excel file using Proc Import syntax. Type the two-level name as the value for the out= keyword, as shown below. (The syntax below was saved from importing the file PULSE.XLS using the Import Wizard.)
libname sasdata2 V9 "C:\Users\kwelch\Desktop\sasdata2";
PROC IMPORT OUT= SASDATA2.PULSE
DATAFILE= "C:\Users\kwelch\Desktop\labdata\PULSE.XLS"
DBMS=EXCEL REPLACE;
SHEET="pulse$";
GETNAMES=YES;
MIXED=NO;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;
How to create a permanent SAS Data Set as output from another procedure:
Many SAS procedures can create output data sets to be used later. For example, when running Proc Reg, an output data set can be created containing the predicted values and residuals from a fitted model. The commands below show how to create a permanent SAS data set, named sasdata2.resids, as output from Proc Reg. Note that the libname statement must be submitted first:
libname sasdata2 V9 "C:\Users\kwelch\Desktop\sasdata2";
proc reg data=sasdata2.pulse;
model pulse2 = pulse1 ran;
output out = sasdata2.resids p=predict r=resid rstudent=rstudent;
run;
quit;
The following note is produced in the SAS Log:
180 proc reg data=sasdata2.pulse;
181 model pulse2 = pulse1 ran;
182 output out = sasdata2.resids p=predict r=resid rstudent=rstudent;
183 run;
183 quit;
NOTE: The data set SASDATA2.RESIDS has 92 observations and 15 variables.
NOTE: PROCEDURE REG used (Total process time):
real time 0.06 seconds
cpu time 0.06 seconds
The sasdata2.resids data set can now be used to check the distribution of the residuals, using Proc Univariate, as shown below:
proc univariate data=sasdata2.resids;
var resid;
histogram;
qqplot / normal (mu=est sigma=est);
run;
Note that the data set sasdata2.resids will now be the default data set, because it was the most recently created data set in the current session of SAS.
How to use a permanent SAS data set in later runs of SAS:
To use a permanent SAS data set in later runs of SAS, you must submit a libname statement, and refer to the data set by its two-level name:
libname sasdata2 V9 "C:\Users\kwelch\Desktop\sasdata2";
proc means data=sasdata2.pulse;
run;
proc freq data=sasdata2.pulse;
tables ran smokes;
run;
How to delete a permanent SAS data set:
There are three basic ways to delete a permanent SAS data set.
- Go to the SAS Explorer Window and delete the files by right-clicking a data set name and choosing delete.
- Go to the Windows Explorer and delete the data sets.
- Use Proc Datasets, as shown in the example below:
libname sasdata2 V9 "C:\Users\kwelch\Desktop\sasdata2";
proc datasets library=sasdata2;
delete pulse;
delete resids;
run;
quit;
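A slightly more compact variant is sketched below. It assumes the sasdata2 libref is already assigned. A single delete statement can name several data sets, and the nolist option keeps Proc Datasets from writing the library directory to the Log.

```sas
proc datasets library=sasdata2 nolist;   /* nolist: do not print the directory */
  delete pulse resids;                   /* delete both data sets in one statement */
run;
quit;
```

As with the other methods, the deletion is permanent, so check the data set names before submitting the commands.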