SAS univariate plots

INPUT

***************************************************

* SAS program to examine a distribution of data

* for a variable of interest.

* Import data from MedianHousePrice.xls

* (Modify the statements for your needs.)

**************************************************;
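
* A minimal sketch of the import step, which this listing leaves out;
* The relative file path, DBMS=XLS, and GETNAMES=YES are assumptions;
* DBMS=XLS needs SAS/ACCESS Interface to PC Files, so adjust as needed;
proc import datafile="MedianHousePrice.xls" out=d1 dbms=xls replace;
getnames=yes;
run;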

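proc print data=d1;
run;

* Moments, quantiles, normality tests, and line-printer plots;
proc univariate data=d1 plot normal;
var p2011q1;
run;

* Line-printer scatter plot of 2011 Q1 prices against 2008 prices;
proc plot data=d1;
plot p2011q1*p2008;
run;

* Label the variable so the SGPLOT axis is descriptive;
data d1; set d1;
label p2011q1 = 'Thousands of Dollars Q1 2011';
run;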

proc sgplot data=d1;

title "Metropolitan Area Median House Prices";

histogram p2011q1;

density p2011q1;

density p2011q1 / type=kernel;

keylegend / location=inside position=topright;

run;

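* Simulate 500 standard normal draws for comparison;
* A seed of 0 uses the system clock, so each run differs;
data d2;
do i = 1 to 500;
y = normal(0);
output;
end;
run;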

proc sgplot data=d2;

title "Random numbers generated from normal distribution";

histogram y;

density y;

density y / type=kernel;

keylegend / location=inside position=topright;

run;
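
* A sketch of a step that would produce the PROC UNIVARIATE output for d2 shown below;
* It is an assumption, since the listing above does not include this step;
* Without a VAR statement, all numeric variables (i and y) are analyzed;
proc univariate data=d2 plot;
run;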

OUTPUT

[Proc Print data listing is omitted]

The SAS System 12:00 Monday, July 18, 2011 8

The UNIVARIATE Procedure

Variable: p2011q1 (p2011q1)

                            Moments

N                         153    Sum Weights                153
Mean               165.608228    Sum Observations    25338.0589
Std Deviation      91.5055228    Variance             8373.2607
Skewness           2.25802997    Kurtosis            5.83982768
Uncorrected SS     5468926.67    Corrected SS        1272735.63
Coeff Variation    55.2542127    Std Error Mean      7.39778305

              Basic Statistical Measures

    Location                    Variability

Mean     165.6082     Std Deviation            91.50552
Median   136.8000     Variance                     8373
Mode     114.4000     Range                   523.85887
                      Interquartile Range      70.40000

Note: The mode displayed is the smallest of 5 modes with a count of 2.

           Tests for Location: Mu0=0

Test           -Statistic-    -----p Value------

Student's t    t  22.3862     Pr > |t|    <.0001
Sign           M     76.5     Pr >= |M|   <.0001
Signed Rank    S   5890.5     Pr >= |S|   <.0001

                   Tests for Normality

Test                  ---Statistic----    -----p Value------

Shapiro-Wilk          W       0.762199    Pr < W      <0.0001
Kolmogorov-Smirnov    D       0.181955    Pr > D      <0.0100
Cramer-von Mises      W-Sq     1.96616    Pr > W-Sq   <0.0050
Anderson-Darling      A-Sq    10.95824    Pr > A-Sq   <0.0050

Quantiles (Definition 5)

Quantile Estimate

100% Max 579.3000

99% 545.0000

95% 372.0000

90% 292.7000

75% Q3 184.0000

50% Median 136.8000

25% Q1 113.6000

10% 91.8000

5% 80.7000

1% 64.4000

0% Min 55.4411

          Extreme Observations

------Lowest------        -----Highest-----

  Value      Obs            Value      Obs

55.4411      160            439.3      102
64.4000       83            465.9      134
64.9000      149            511.8        7
68.7000      140            545.0      135
74.9000        2            579.3       72

              Missing Values

                        -----Percent Of-----
Missing                              Missing
  Value      Count      All Obs          Obs

      .          7         4.38       100.00

    Histogram                                #     Boxplot
575+*                                        1        *
   .*                                        2        *
   .*                                        1        *
   .*                                        1        *
   .***                                      5        0
325+**                                       3        0
   .***                                      5        0
   .******                                  12        |
   .***************                         30     +--+--+
   .***********************************     70     *-----*
 75+************                            23        |
    ----+----+----+----+----+----+----+
    * may represent up to 2 counts

Normal Probability Plot

[Line-printer plot is omitted: p2011q1 (75 to 575) against normal quantiles (-2 to +2)]

Plot of p2011q1*P2008. Legend: A = 1 obs, B = 2 obs, etc.

[Line-printer scatter plot is omitted: p2011q1 on the vertical axis (0 to 600) against P2008 on the horizontal axis (0 to 700)]

NOTE: 8 obs had missing values.


The UNIVARIATE Procedure

Variable: i

                            Moments

N                         500    Sum Weights                500
Mean                    250.5    Sum Observations        125250
Std Deviation      144.481833    Variance                 20875
Skewness                    0    Kurtosis                  -1.2
Uncorrected SS       41791750    Corrected SS          10416625
Coeff Variation    57.6773784    Std Error Mean      6.46142399

              Basic Statistical Measures

    Location                    Variability

Mean     250.5000     Std Deviation           144.48183
Median   250.5000     Variance                    20875
Mode      .           Range                   499.00000
                      Interquartile Range     250.00000

           Tests for Location: Mu0=0

Test           -Statistic-    -----p Value------

Student's t    t  38.76854    Pr > |t|    <.0001
Sign           M       250    Pr >= |M|   <.0001
Signed Rank    S     62625    Pr >= |S|   <.0001

The UNIVARIATE Procedure

Variable: y

      Histogram                                #     Boxplot
 2.75+*                                        3        0
     .****                                    11        |
     .*******                                 19        |
     .************                            35        |
     .***********************                 68     +-----+
 0.25+*******************************         93     |     |
     .************************************   107     *--+--*
     .**************************              76     +-----+
     .*****************                       49        |
     .***********                             32        |
-2.25+***                                      7        |
      ----+----+----+----+----+----+----+-
      * may represent up to 3 counts