SAS univariate plots
INPUT
***************************************************
* SAS program to examine a distribution of data
* for a variable of interest.
* Import data from MedianHousePrice.xls
* (Modify the statements for your needs.)
**************************************************;
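/* A minimal sketch of the import step mentioned in the header comment,
   which this listing does not show. Assumptions: MedianHousePrice.xls is
   in the current working directory, its first row holds the column names,
   and the SAS/ACCESS Interface to PC Files is licensed (needed for
   DBMS=XLS). Adjust the path and DBMS value for your installation. */
proc import datafile="MedianHousePrice.xls" out=d1 dbms=xls replace;
getnames=yes;
run;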
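proc print data=d1;
run;
* PLOT requests the histogram, box plot and normal probability plot;
* NORMAL requests the tests for normality;
proc univariate data=d1 plot normal;
var p2011q1;
run;
proc plot data=d1;
plot p2011q1*p2008;
run; quit;
data d1; set d1;
label p2011q1 = 'Thousands of Dollars Q1 2011';
run;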
proc sgplot data=d1;
title "Metropolitan Area Median House Prices";
histogram p2011q1;
density p2011q1;
density p2011q1 / type=kernel;
keylegend / location=inside position=topright;
run;
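data d2;
do i = 1 to 500;
* Standard normal variate. A seed of 0 is taken from the system clock;
y = normal(0);
output;
end;
run;
* The output below includes PROC UNIVARIATE results for i and y, so a step like this was presumably also run;
proc univariate data=d2 plot;
run;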
proc sgplot data=d2;
title "Random numbers generated from normal distribution";
histogram y;
density y;
density y / type=kernel;
keylegend / location=inside position=topright;
run;
OUTPUT
[Proc Print data listing is omitted]
The SAS System 12:00 Monday, July 18, 2011 8
The UNIVARIATE Procedure
Variable: p2011q1 (p2011q1)
Moments
N                        153    Sum Weights              153
Mean              165.608228    Sum Observations  25338.0589
Std Deviation     91.5055228    Variance           8373.2607
Skewness          2.25802997    Kurtosis          5.83982768
Uncorrected SS    5468926.67    Corrected SS      1272735.63
Coeff Variation   55.2542127    Std Error Mean    7.39778305
Basic Statistical Measures
    Location                    Variability
Mean     165.6082     Std Deviation            91.50552
Median   136.8000     Variance                     8373
Mode     114.4000     Range                   523.85887
                      Interquartile Range      70.40000
Note: The mode displayed is the smallest of 5 modes with a count of 2.
Tests for Location: Mu0=0
Test           -Statistic-    -----p Value------
Student's t    t   22.3862    Pr > |t|    <.0001
Sign           M      76.5    Pr >= |M|   <.0001
Signed Rank    S    5890.5    Pr >= |S|   <.0001
Tests for Normality
Test                  --Statistic---    -----p Value------
Shapiro-Wilk          W     0.762199    Pr < W     <0.0001
Kolmogorov-Smirnov    D     0.181955    Pr > D     <0.0100
Cramer-von Mises      W-Sq   1.96616    Pr > W-Sq  <0.0050
Anderson-Darling      A-Sq  10.95824    Pr > A-Sq  <0.0050
Quantiles (Definition 5)
Quantile Estimate
100% Max 579.3000
99% 545.0000
95% 372.0000
90% 292.7000
75% Q3 184.0000
50% Median 136.8000
25% Q1 113.6000
10% 91.8000
5% 80.7000
1% 64.4000
0% Min 55.4411
Extreme Observations
------Lowest------    -----Highest-----
  Value        Obs      Value       Obs
55.4411        160      439.3       102
64.4000         83      465.9       134
64.9000        149      511.8         7
68.7000        140      545.0       135
74.9000          2      579.3        72
Missing Values
                        -----Percent Of-----
Missing                              Missing
  Value      Count      All Obs          Obs
      .          7         4.38       100.00
    Histogram                               #     Boxplot
575+*                                       1        *
   .*                                       2        *
   .*                                       1        *
   .*                                       1        *
   .***                                     5        0
325+**                                      3        0
   .***                                     5        0
   .******                                 12        |
   .***************                        30     +--+--+
   .***********************************    70     *-----*
 75+************                           23        |
    ----+----+----+----+----+----+----+
    * may represent up to 2 counts
Normal Probability Plot
[Normal probability plot is omitted. Vertical axis: p2011q1, 75 to 575; horizontal axis: normal quantiles, -2 to +2]
Plot of p2011q1*P2008. Legend: A = 1 obs, B = 2 obs, etc.
[Plot is omitted. Vertical axis: p2011q1, 0 to 600; horizontal axis: P2008, 0 to 700]
NOTE: 8 obs had missing values.
The UNIVARIATE Procedure
Variable: i
Moments
N                        500    Sum Weights              500
Mean                   250.5    Sum Observations      125250
Std Deviation     144.481833    Variance               20875
Skewness                   0    Kurtosis                -1.2
Uncorrected SS      41791750    Corrected SS        10416625
Coeff Variation   57.6773784    Std Error Mean    6.46142399
Basic Statistical Measures
    Location                    Variability
Mean     250.5000     Std Deviation           144.48183
Median   250.5000     Variance                    20875
Mode            .     Range                   499.00000
                      Interquartile Range     250.00000
Tests for Location: Mu0=0
Test           -Statistic-    -----p Value------
Student's t    t  38.76854    Pr > |t|    <.0001
Sign           M       250    Pr >= |M|   <.0001
Signed Rank    S     62625    Pr >= |S|   <.0001
The UNIVARIATE Procedure
Variable: y
      Histogram                                #     Boxplot
 2.75+*                                        3        0
     .****                                    11        |
     .*******                                 19        |
     .************                            35        |
     .***********************                 68     +-----+
 0.25+*******************************         93     |     |
     .************************************   107     *--+--*
     .**************************              76     +-----+
     .*****************                       49        |
     .***********                             32        |
-2.25+***                                      7        |
      ----+----+----+----+----+----+----+-
      * may represent up to 3 counts