SPSS for Windows: Reading SAS Data Sets

Ed Greenberg

College of Nursing

Arizona State University

Revised November 5, 2001

Permission to freely use or adapt any portion of this document is granted, provided that the author is cited.

Preface

This document contains instructions for converting SAS data sets for use in SPSS. The procedures described herein apply to SPSS for Windows Version 10 and SAS for Windows Version 8. For different versions or operating system editions of either package, the procedures may differ.

For “quick and dirty” instructions, see Appendix B at the end of this document.

SPSS Data Files

An SPSS data file contains dictionary information and data. The dictionary information includes variable names, variable labels, value labels, missing value specifications, and other attributes of the variables in the data file. The data portion of the SPSS data file contains data values, logically organized into rows and columns containing cases and variables, respectively. In the Microsoft Windows environment, an SPSS data file is named name.SAV, where name is any name that conforms to Windows file naming rules and “SAV” is the file extension.

SAS Data Sets

A SAS data set consists of descriptor information and data values. The descriptor information describes the contents of the data set to SAS. The data values are organized into rows and columns containing observations (cases) and variables, respectively.

The Structure of SAS Data Libraries

SAS data sets are stored in SAS data libraries and are referred to as members of a library. In the Microsoft Windows environment, a SAS data library consists of a Windows directory that contains one or more SAS data sets, each in a separate file. The SAS System identifies SAS files by using unique file extensions. For example, a file containing a permanent SAS Version 7 or 8 data set has a file name of the form name.SAS7BDAT where name is used in a SAS statement such as a DATA or PROC statement to refer to the SAS data set, and “SAS7BDAT” is the file extension. Optionally, a SAS data set can have a shorter, three-character, extension of SD7.

A “permanent” SAS data set is one that is retained after you end your SAS session. In contrast, a “temporary” SAS data set exists only for the duration of a SAS session. Temporary SAS data sets are stored in a subdirectory created by SAS in C:\WINDOWS\TEMP or another temporary Windows directory.
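The distinction can be illustrated with a minimal sketch. It assumes the C:\PROJECTS directory used in the examples later in this document; the libref MYDATA and the data set names SCRATCH and KEEPME are hypothetical. A one-level data set name is stored in the temporary WORK library, while a two-level name that begins with a libref is stored permanently. The LIBNAME statement used here is described in a later section.

```sas
/* Temporary: a one-level name is stored in the WORK library */
/* and is deleted when the SAS session ends.                 */
DATA SCRATCH;
   X = 1;
RUN;

/* Permanent: a two-level name (libref.name) is written to   */
/* the directory assigned to the libref, here C:\PROJECTS.   */
LIBNAME MYDATA 'C:\PROJECTS';

DATA MYDATA.KEEPME;
   X = 1;
RUN;
```

After the session ends, only KEEPME remains on disk, as a file with the SAS7BDAT extension in C:\PROJECTS.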

SAS data libraries can contain materials other than SAS data sets. For example, they can contain SAS catalogs. A SAS catalog is a special type of SAS file that can contain multiple entries. You can keep different types of entries in the same SAS catalog. For example, catalogs can contain SAS/GRAPH graphs, SAS/IML matrices, SAS formats, etc.

A SAS format is an instruction that SAS uses to write data values, and is largely equivalent to value labels in SPSS. In the Microsoft Windows environment, a SAS catalog has a file name of the form name.SAS7BCAT where name is used in a SAS command to refer to the catalog and SAS7BCAT is the file extension. Optionally, a SAS catalog can have a three-character extension of SC7.
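As a rough illustration of how a format relates to a catalog, the sketch below defines a format and stores it in a catalog named MYFMTS in the library C:\PROJECTS. The format name YESNO, the data set SURVEY, and the variable ANSWER are hypothetical and are not part of the examples in this document.

```sas
LIBNAME MYDATA 'C:\PROJECTS';

/* Store a user-defined format in the catalog MYDATA.MYFMTS */
PROC FORMAT LIBRARY=MYDATA.MYFMTS;
   VALUE YESNO 0 = 'No'
               1 = 'Yes';
RUN;

/* Tell SAS to search this catalog for user-defined formats */
OPTIONS FMTSEARCH=(MYDATA.MYFMTS);

/* A small temporary data set with a coded variable */
DATA SURVEY;
   INPUT ID ANSWER;
   CARDS;
1 0
2 1
;
RUN;

/* ANSWER is displayed as No or Yes; the stored values stay 0 and 1 */
PROC PRINT DATA=SURVEY;
   FORMAT ANSWER YESNO.;
RUN;
```

The format changes only how values are displayed. This is why it plays roughly the same role as value labels in SPSS.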

When converting SAS data sets to SPSS, any SAS formats that are associated with the SAS data set are not converted. Thus, the variables in the resulting SPSS data file will not have any value labels.

Note that earlier versions of SAS used different extensions for SAS files. For example, a SAS Version 6 data set has an extension of SD2 and a SAS Version 6 catalog has an extension of SC2.

In the figure below, a SAS data library is contained in the directory C:\PROJECTS, and it contains two SAS data sets, ONE.SAS7BDAT and TWO.SAS7BDAT, and a SAS catalog, MYFMTS.SAS7BCAT.

(C:)
    PROJECTS
        ONE.SAS7BDAT
        TWO.SAS7BDAT
        MYFMTS.SAS7BCAT

Using SAS Data Sets in SAS Programs

When using a permanent SAS data set in a SAS program, it must be referenced via a LIBNAME statement. The basic format of the LIBNAME statement is as follows:

LIBNAME libref 'SAS-data-library';

You can choose any name for libref (library reference), as long as it conforms to the rules for SAS names. The parameter SAS-data-library, enclosed in single quotes, specifies the path to the data library. The libref is used in subsequent SAS commands to refer to the SAS data library. These references will be of the form libref.name.

In the sample LIBNAME statement below, MYDATA is the libref, which is associated with the SAS data library C:\PROJECTS. The PROC PRINT statement refers to one of the SAS data sets in this library, MYDATA.ONE, so the contents of the data set C:\PROJECTS\ONE.SAS7BDAT will be printed.

LIBNAME MYDATA 'C:\PROJECTS';

PROC PRINT DATA=MYDATA.ONE;

RUN;
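Before printing or converting a data set, it can help to confirm what the library actually contains. The following sketch, which assumes the same MYDATA libref and C:\PROJECTS directory as the example above, uses PROC CONTENTS to list the members of the library and then to describe one data set.

```sas
LIBNAME MYDATA 'C:\PROJECTS';

/* List the members of the library without per-data-set details */
PROC CONTENTS DATA=MYDATA._ALL_ NODS;
RUN;

/* Show the variables and attributes of the data set ONE */
PROC CONTENTS DATA=MYDATA.ONE;
RUN;
```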

LIBNAMEs can also be assigned interactively in the SAS Display Manager (the shell which provides an interactive interface to SAS). See Appendix A of this document for instructions on how to do this.

Conversion Method 1: Reading a SAS XPORT Format Data Set with SPSS

Versions of SPSS through Version 10 cannot read native-format SAS Version 8 data sets. However, SPSS can read SAS data sets that have been stored in several other formats. One of these is XPORT format, a SAS portable format that enables SAS data sets to be transferred from one type of system to another, e.g., Windows to Unix.

The translation of a SAS data set from one format to another is accomplished by invoking the appropriate access method, or “engine,” on the LIBNAME statement.

When invoking a SAS library engine, the format of the LIBNAME statement is as follows:

LIBNAME libref engine 'SAS-data-library';

As a first step, the SAS data set must be saved in XPORT format. This can be done using PROC COPY.

LIBNAME ABC 'C:\PROJECTS';

LIBNAME DEF XPORT 'C:\PROJECTS\ONE.TPT';

PROC COPY IN=ABC OUT=DEF;

SELECT ONE;

RUN;

The librefs ABC and DEF on the two LIBNAME statements are arbitrary, although they must conform to the rules for SAS names. The XPORT parameter on the second LIBNAME statement invokes the SAS XPORT engine. The extension used for the output data set name, TPT, is the one SPSS expects for SAS files in transport format. PROC COPY copies the contents of the SAS data library referenced by the libref ABC to the XPORT format data set referenced by the libref DEF. The SELECT statement specifies that only the data set named ONE is to be copied. This statement is necessary only if the original SAS data library contains more than one SAS data set.

The same result could have been accomplished in a SAS Data step, as follows:

LIBNAME ABC 'C:\PROJECTS';

LIBNAME DEF XPORT 'C:\PROJECTS\ONE.TPT';

DATA DEF.ONE;

SET ABC.ONE;

RUN;

The SET statement reads the records from the SAS data set ONE.SAS7BDAT and the DATA statement causes them to be written to the SAS XPORT file ONE.TPT.
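One practical point to check before either approach: the transport format written by the XPORT engine limits variable names to eight characters, whereas SAS Version 7 and 8 data sets may contain longer names. The sketch below is one possible way to list any long names before conversion. The temporary data set names VARS and LONGNAMES are hypothetical.

```sas
LIBNAME ABC 'C:\PROJECTS';

/* Write one row per variable of ABC.ONE to a temporary data set */
PROC CONTENTS DATA=ABC.ONE OUT=VARS(KEEP=NAME) NOPRINT;
RUN;

/* Keep only the variable names longer than eight characters */
DATA LONGNAMES;
   SET VARS;
   IF LENGTH(NAME) > 8;
RUN;

/* If no names are listed, nothing needs to be renamed */
PROC PRINT DATA=LONGNAMES;
RUN;
```

Any names that are listed could be shortened with a RENAME statement in the Data step that writes the transport file.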

If your data are included in-stream within your SAS program, a permanent SAS data library need not be referenced. In this case, your program may appear as follows:

LIBNAME DEF XPORT 'C:\PROJECTS\ONE.TPT';

DATA DEF.ONE;

INPUT ID A B C;

CARDS;

1 1 2 3

2 4 5 6

3 7 8 9

4 10 11 12

RUN;

Only one LIBNAME statement is needed, the one that references the SAS XPORT data file to be written via the SAS Data step. Note the use of the libref “DEF” on the LIBNAME and DATA statements.
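To confirm that the transport file was written as intended before moving to SPSS, it can be read back in SAS through the same XPORT libref. This is a minimal sketch that assumes the file C:\PROJECTS\ONE.TPT created above.

```sas
LIBNAME DEF XPORT 'C:\PROJECTS\ONE.TPT';

/* Describe the data set stored in the transport file */
PROC CONTENTS DATA=DEF.ONE;
RUN;

/* Print its observations to check the data values */
PROC PRINT DATA=DEF.ONE;
RUN;
```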

The next step is to open SPSS and choose Open, then Data..., from the File menu.

In the Open File window that pops up, locate the directory containing the SAS portable file, in this case C:\PROJECTS. In the drop-down list for “Files of Type:” select “SAS portable (*.tpt)”. The SAS portable file ONE.TPT should be displayed in the window.

Click on the file name to copy it into the “File name:” box. Then click on the “Open” button.

SPSS will read the SAS portable file ONE.TPT and place its contents into the Data Editor window.

You can also accomplish the same result via the SPSS command language. Type the following command in an SPSS syntax window and run it:

GET SAS DATA='C:\PROJECTS\ONE.TPT' DSET(ONE).

Note that this method will also work on other systems, such as Unix. However, the file specifications (path and filename) must conform to that system's rules.

Conversion Method 2: Reading a SAS Version 6 Data Set with SPSS

SPSS for Windows Version 10 can read SAS data sets in Version 6 format. If your SAS data sets are stored in Version 7 or Version 8 format, they must first be converted to this older format. This can be accomplished with a short SAS program, as in the following example:

LIBNAME SASV8 'C:\PROJECTS';

LIBNAME SASV6 V6 'C:\PROJECTS';

DATA SASV6.ONE;

SET SASV8.ONE;

RUN;

This example assumes that a SAS Version 8 data set ONE.SAS7BDAT is a member of the SAS data library in C:\PROJECTS. It is referenced via the libref SASV8 on the first LIBNAME statement and the SET statement. The SAS Version 6 data set ONE.SD2 will be written to the same SAS data library, C:\PROJECTS. This data set is referenced on the second LIBNAME statement and the DATA statement. Note the inclusion of the V6 engine specification on this LIBNAME statement, specifying that SAS is to write this data set using the V6 library engine. The Data step is executed once for each observation in the input data set. The SET statement causes a record (observation) to be read from the Version 8 data set ONE.SAS7BDAT and the DATA statement causes the record to be written to the data set ONE.SD2 in Version 6 format.
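As a quick check that the conversion produced a Version 6 data set, the output can be described with PROC CONTENTS. This sketch assumes the SASV6 libref and the C:\PROJECTS directory from the example above. The engine reported in the PROC CONTENTS output should be V6.

```sas
LIBNAME SASV6 V6 'C:\PROJECTS';

/* Describe the converted data set, which is stored as ONE.SD2 */
PROC CONTENTS DATA=SASV6.ONE;
RUN;
```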

If your data are included in-stream within your SAS program, a permanent SAS data library need not be referenced. In this case, your program may appear as follows:

LIBNAME SASV6 V6 'C:\PROJECTS';

DATA SASV6.ONE;

INPUT ID A B C;

CARDS;

1 1 2 3

2 4 5 6

3 7 8 9

4 10 11 12

RUN;

Only one LIBNAME statement is needed, the one that references the SAS Version 6 data set to be written via the SAS Data step. Note the use of the libref “SASV6” on the LIBNAME and DATA statements and the use of the V6 engine specification on the LIBNAME statement.

The next step is to open SPSS and choose Open, then Data..., from the File menu.

In the Open File window that pops up, locate the directory containing the SAS Version 6 data set, in this case C:\PROJECTS. In the drop-down list for “Files of Type:” select “SAS for Windows (*.sd2)”. The SAS Version 6 data set ONE.SD2 should be displayed in the window.

Click on the file name to copy it into the “File name:” box. Then click on the “Open” button.

SPSS will read the SAS Version 6 data set ONE.SD2 and place its contents into the Data Editor window.

A concise example of this method is shown in Appendix B of this document.

Good news, we hope!

On their web site, SPSS, Inc. claims that the forthcoming Version 11 of SPSS for Windows will be able to “read current versions of SAS data files and SAS portable files.” I'm hoping that this means that SAS data sets in Version 7 and Version 8 formats can be read without prior conversion.

APPENDIX A

Assigning Libraries and LIBNAMES Interactively in SAS

1. Start the SAS System and wait for its main window to be displayed.
2. In the Explorer pane, select the Libraries icon.
3. Click on the New tool on the toolbar.
4. Type the libref (in this example, MYDATA) into the Name: box.
5. Using the drop-down list, select the desired library engine (in this example, V6).
6. Using the browse button, locate the directory where the SAS library is located, in this case C:\PROJECTS.
7. Click on the OK button.

When you double-click on the Libraries icon in the Explorer window, you'll see the libraries that are defined in the current SAS session. The new library, MYDATA, is among them.

An advantage of this method is that you can see at a glance which libraries are assigned to your SAS session. You can also browse their contents interactively.
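For comparison, the same assignment can be made with program statements instead of the dialog. The sketch below assumes the MYDATA libref, the V6 engine, and the C:\PROJECTS directory used in this appendix. The second statement writes the attributes of all currently assigned librefs to the SAS log.

```sas
/* Assign the libref MYDATA with the Version 6 engine */
LIBNAME MYDATA V6 'C:\PROJECTS';

/* List every libref assigned in the current session in the log */
LIBNAME _ALL_ LIST;
```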

APPENDIX B

FAST TRACK: Exporting Data from SAS Version 8 for Windows to SPSS for Windows Version 10

Run the following in SAS:

PROGRAM / COMMENTS
LIBNAME SASV6 V6 'C:\PROJECTS'; / SASV6 is the libref, also used on the DATA statement. V6 specifies the Version 6 library engine. A Version 6 SAS data set will be written into the directory C:\PROJECTS.
DATA SASV6.ONE; / SASV6 is the libref. ONE is the SAS data set name, corresponding to the name of the file that will contain it. This file, ONE.SD2, will be written into the directory specified on the LIBNAME statement above.
INPUT ID A B C; / The INPUT statement causes four variables, ID, A, B and C, to be read from each data line.
CARDS; / CARDS; signals the start of in-stream data lines.
1 1 2 3
2 4 5 6
3 7 8 9
4 10 11 12 / Four data lines are read.
RUN; / The RUN statement executes the above SAS statements.

Do the following in SPSS:

1. Start SPSS for Windows.

2. From the File menu, select Open, and then Data...

3. Locate the directory containing the SAS data set, in this example C:\PROJECTS.

4. In the Files of Type: drop-down list, select SAS for Windows (*.SD2).

5. Click on the file name, ONE.SD2, and then click on Open.
