Building a SAS Data Set --- 2

INFILE and FILENAME – reading data from an external file

Example 2.1a (identify the external file using INFILE)

data external;

infile ‘a:\dataset2\china#1.dat’;

input year total exports imports;

run;

proc print data=external;

title 'Example 2.1a';

run;

Example 2.1a 10:15 Wednesday, August 9, 2000 1

OBS YEAR TOTAL EXPORTS IMPORTS

1 1955 3.15 1.41 1.73

2 1956 3.21 1.65 1.56

3 1957 3.10 1.60 1.51

34 1988 80.49 40.64 39.85

35 1989 82.58 43.44 39.14

Example 2.1b (refer to an external file by FILENAME)

filename china#1 'a:\dataset2\china#1.dat';

data external;

infile china#1;

input year total exports imports;

run;

proc print data=external;

title 'Example 2.1b';

run;

The output of this SAS program is identical to the last one.

INFILE and Options statements – reading parts of a data

Example 2.2

filename china#1 'a:\dataset2\china#1.dat';

data external;

infile china#1 firstobs=20 obs=25;

input year total exports imports;

run;

proc print data=external;

title 'Example 2.2';

run;

Example 2.2 10:15 Wednesday, August 9, 2000 3

OBS YEAR TOTAL EXPORTS IMPORTS

1 1974 14.57 6.95 7.62

2 1975 14.75 7.26 7.49

3 1976 13.43 6.86 6.58

4 1977 14.80 7.59 7.21

5 1978 20.64 9.75 10.89

6 1979 29.33 13.66 15.68

Subsetting IF – reading in observations satisfying ‘IF’ condition

Example 2.3

filename car 'a:\dataset2\car.dat';

data external;

infile car;

input model & $34. price time1 time2 maxspeed brakdist

mpg grip;

if 10 < price < 15000;

run;

Example 2.3 10:15 Wednesday, August 9, 2000 5

OBS MODEL PRICE TIME1 TIME2 MAXSPEED BRAKDIST MPG GRIP

1 Chevrolet Cavalier 13417 8.8 16.7 113 209 25 0.76

2 Honda Civic DX 13210 8.7 16.7 110 198 35 0.72

3 Hyundai Elantra GLS 12965 10.0 17.3 113 193 21 0.81

4 Kia Sephia GS 12065 10.2 17.6 109 197 27 0.75

5 Nissan Sentra GXE 13914 8.5 16.7 109 208 30 0.78

6 Toyota Corolla DX 13498 9.3 17.1 110 197 27 0.77

7 Toyota Tercel 12595 9.2 17.0 110 195 33 0.75

‘IF … THEN … ELSE’ – creat a new variable

Example 2.4

filename car 'a:\dataset2\car.dat';

data external;

infile car;

input model & $34. price time1 time2 maxspeed brakdist mpg grip;

if 0 <= price < 15000 then grade='cheap';

else if 15000 <= price < 22000 then grade='average';

else grade='expensi';

drop price time2 maxspeed grip;

run;

proc print data=external;

title 'Example 2.4';

run

Example 2.4 10:15 Wednesday, August 9, 2000 10

OBS MODEL TIME1 BRAKDIST MPG GRADE

1 Acura Integra GS-R 7.0 185 25 avera

2 Acura Integra LS 7.5 189 25 avera

3 Acura NSX-T 5.2 173 18 expen

4 AM General Hummer 18.1 253 10 expen

5 Audi A6 Quattro 8.3 191 18 expen

6 Audi Cabriolet 9.7 189 18 expen

7 BMW M3 5.3 165 19 expen

126 Volswagen Passat GLX 8.6 178 18 expen

127 Volvo 850 Turbo 6.1 172 19 expen

128 Volvo 850 T-5R 6.7 181 19 expen

129 Volvo 960 8.6 190 17 expen

Proc SORT – rearrange the order of the observations

Example 2.5

filename car 'a:\dataset2\car.dat';

data external;

infile car;

input model & $34. price time1 time2 maxspeed brakdist mpg grip;

drop time2 maxspeed brakdist grip;

run;

proc sort data = external;

by price mpg;

proc print data=external;

title 'Example 2.5';

run;

Example 2.5 09:22 Thursday, August 10, 2000 10

OBS MODEL PRICE TIME1 MPG

1 Kia Sephia GS 12065 10.2 27

2 Toyota Tercel 12595 9.2 33

3 Hyundai Elantra GLS 12965 10.0 21

4 Honda Civic DX 13210 8.7 35

5 Chevrolet Cavalier 13417 8.8 25

6 Toyota Corolla DX 13498 9.3 27

7 Nissan Sentra GXE 13914 8.5 30

124 Porsche 911 Turbo 106465 3.7 13

125 BMW 850CSi 108395 5.3 12

126 Ferrari F355 128800 4.5 10

127 Guldstrand GS90 144695 4.6 17

128 Rolls-Royce Silver Spur III 211520 9.3 10

129 Lamborghini Diablo VT 275056 5.9 9

SET statement – identify a data set when there are more than one

Example 2.6

filename car 'a:\dataset2\car.dat';

data external;

infile car;

input model & $34. price time1 time2 maxspeed brakdist mpg grip;

drop time2 maxspeed brakdist grip;

run;

data milepg;

set external;

if 0 < price <= 15000;

run;

proc sort data = milepg;

by decending mpg;

proc print data=milepg;

title 'Example 2.6';

run;

Example 2.6 09:22 Thursday, August 10, 2000 17

OBS MODEL PRICE TIME1 MPG

1 Honda Civic DX 13210 8.7 35

2 Toyota Tercel 12595 9.2 33

3 Nissan Sentra GXE 13914 8.5 30

4 Kia Sephia GS 12065 10.2 27

5 Toyota Corolla DX 13498 9.3 27

6 Chevrolet Cavalier 13417 8.8 25

7 Hyundai Elantra GLS 12965 10.0 21