Homework 5 Solutions
Assigned: 29 September 2004
Due: 06 October 2004
1. Refer to the four-group (packaging method) log-bacterial growth study. In this exercise, you will use SAS/GRAPH to construct figures displaying the means and standard deviations.
a. Construct a figure containing side-by-side boxplots using the INTERPOL=BOX option associated with a SYMBOL statement used with PROC GPLOT.
ODS RTF file='D:\baileraj\Classes\Fall 2003\sta402\hw\hw5-prob1a.rtf';
proc gplot data=class.meat;
title h=1.5 'Plot of log(Bacterial count) vs. packaging condition';
title2 h=1 '[boxplot is plotted for each condition]';
symbol1 interpol=box /* draws a box plot of the responses at each condition */
value=dot;
plot logcount*condition;
run;
ODS RTF CLOSE;
b. Construct a figure with the data points plotted. Superimpose the mean plus/minus 2 std. deviation segments on this plot.
ODS RTF file='D:\baileraj\Classes\Fall 2003\sta402\hw\hw5-prob1b.rtf';
proc gplot data=class.meat;
title h=1.5 'Plot of log(Bacterial count) vs. packaging condition';
title2 h=1 '[mean +/- 2SD is plotted for each condition]';
symbol1 interpol=STD2T /* plots +/- 2 SD from the mean at each condition */
/* T= add top and bottom to each 2 SD diff */
value=dot;
plot logcount*condition;
run;
ODS RTF CLOSE;
c. Construct a horizontal bar chart with the end of the bar at the value of the mean with a 1 SD whisker extending from this bar.
/* from
If you want the error bars to represent a given number C of standard errors instead of a confidence interval, and if the number of observations assigned to each midpoint is the same, then you can find the appropriate value for the CLM= option by running a DATA step. For example, if you want error bars that represent one standard error (C=1) with a sample size of N, you can run the following DATA step to compute the appropriate value for the CLM= option and assign that value to a macro variable &LEVEL:
data _null_;
c = 1;
n = 10;
level = 100 * (1 - 2 * (1 - probt( c, n-1)));
put _all_;
call symput('level',put(level,best12.));
run;
*/
data _null_;
c = 1;
n = 3;
level = 100 * (1 - 2 * (1 - probt( c, n-1)));
put _all_;
call symput('level',put(level,best12.));
run;
c=1 n=3 level=57.735026919 _ERROR_=0 _N_=1
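The logged value can be checked by hand. The following is a worked sketch only: the closed form for the t distribution function with 2 degrees of freedom is a standard result that is not stated in the assignment, and the symbol F_nu (the Student's t CDF with nu degrees of freedom) is notation introduced here for illustration.

```latex
\begin{align*}
\text{level} &= 100\,\bigl[\,1 - 2\,\bigl(1 - F_{n-1}(c)\bigr)\bigr] \\[4pt]
F_{2}(t) &= \frac{1}{2} + \frac{t}{2\sqrt{2 + t^{2}}} \\[4pt]
F_{2}(1) &= \frac{1}{2} + \frac{1}{2\sqrt{3}} \approx 0.788675 \\[4pt]
\text{level} &= 100\,\bigl[\,1 - 2\,(0.211325)\bigr] = \frac{100}{\sqrt{3}} \approx 57.735
\end{align*}
```

With c=1 and n=3, a 57.735% confidence interval for the mean has half-width equal to one standard error, which is the value passed to CLM= through &level.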
ODS RTF file='D:\baileraj\Classes\Fall 2003\sta402\hw\hw5-fig1c.rtf';
goptions reset=global gunit=pct border cback=white
colors=(black blue green red) ftext=swiss
ftitle=swissb htitle=5 htext=3.5;
title1 'Average log(Bacteria count) by Packaging Condition';
axis1 label=('log(count)') minor=(number=1);
axis2 label=('Packaging' j=r 'Condition');
pattern1 color=cyan;
proc gchart data=class.meat;
hbar condition / type=mean
sumvar=logcount
/* freqlabel='Number in Group' */
/* meanlabel='Mean Number Young' */
errorbar=bars
clm=&level
raxis=axis1
maxis=axis2
noframe
coutline=black;
run;
ODS RTF CLOSE;
ODS RTF file='D:\baileraj\Classes\Fall 2003\sta402\hw\hw5-fig1c-2.rtf';
goptions reset=global gunit=pct border cback=white
colors=(black blue green red) ftext=swiss
ftitle=swissb htitle=5 htext=3.5;
title1 'Average log(Bacteria count) by Packaging Condition';
axis1 label=('log(count)') minor=(number=1);
axis2 label=('Packaging' j=r 'Condition');
pattern1 color=cyan;
proc gchart data=class.meat;
vbar condition / type=mean
sumvar=logcount
/* freqlabel='Number in Group' */
/* meanlabel='Mean Number Young' */
errorbar=bars
clm=&level
raxis=axis1
maxis=axis2
noframe
coutline=black;
run;
ODS RTF CLOSE;
2. Refer to the nitrofen data. Generate a plot of the nitrofen data (sqrt total as response) vs. concentration and superimpose the fitted quadratic model. Use only GPLOT to generate this figure; that is, don't use PROC REG to generate fits first. I want you to use the SYMBOL INTERPOL regression options.
libname class 'D:\baileraj\Classes\Fall 2003\sta402\data';
options ls=75;
Data new_nitro; set class.nitrofen;
Sqrt_total = sqrt(total);
run;
ODS RTF file='D:\baileraj\Classes\Fall 2003\sta402\hw\hw5-prob2.rtf';
proc gplot data=new_nitro;
title h=1.4 'Plot of the number of C. dubia young versus Nitrofen Concentration';
title2 h=1 '[Regression line for quadratic model displayed]';
symbol1 interpol=rq
/* r=regression, q=quadratic (l=linear and c=cubic are also possible); append clm or cli and a confidence level, e.g. rqclm95, to add confidence limits */
value=diamond
height=3
cv=red
ci=blue
co=green
width=2;
plot sqrt_total*conc /
hminor=1
overlay
regeqn; /* adds regression eqn to bottom left of plot */
run;
ODS RTF CLOSE;
3. Refer to the nitrofen data. Fit a polynomial regression model (center the concentration variable). Use ODS GRAPHICS to generate graphical summaries of this analysis.
libname class "\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\data";
data nitro;
infile
"\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\SAS-programs\ch2-dat.txt"
firstobs=16 expandtabs missover pad; * referencing M drive directly;
input animal conc brood1 brood2 brood3 total;
stotal = sqrt(total);
c_conc = conc - 157; * center the concentration;
c_conc2 = c_conc**2;
/* proc means;
var conc;
run;
*/
ods html;
ods graphics on;
proc reg;
model stotal = c_conc c_conc2;
run;
ods graphics off;
ods html close;
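The constant 157 is hard-coded in the DATA step above; given the commented-out PROC MEANS, it is presumably the sample mean of conc. As a minimal sketch of an alternative, the centering value could be computed rather than typed in. The macro variable mean_conc and the data set name nitro_c are hypothetical names introduced here, and the sketch assumes the nitro data set created above.

```sas
/* Sketch: compute the mean concentration and center on it.        */
/* Assumes data set NITRO with numeric variables CONC and STOTAL.  */
/* MEAN_CONC and NITRO_C are illustrative names, not from the hw.  */
proc sql noprint;
   select mean(conc) into :mean_conc   /* store the mean in a macro variable */
   from nitro;
quit;

data nitro_c;
   set nitro;
   c_conc  = conc - &mean_conc;        /* centered concentration */
   c_conc2 = c_conc**2;                /* squared centered term  */
run;
```

Running PROC REG on nitro_c with the same MODEL statement should reproduce the fit whenever the computed mean equals the hard-coded value.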
The SAS System
The REG Procedure
Model: MODEL1
Dependent Variable: stotal

Number of Observations Read    50
Number of Observations Used    50

Analysis of Variance
Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               2          81.29572       40.64786     122.57    <.0001
Error              47          15.58676        0.33163
Corrected Total    49          96.88248

Root MSE           0.57588    R-Square    0.8391
Dependent Mean     4.57628    Adj R-Sq    0.8323
Coeff Var         12.58393

Parameter Estimates
Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept     1               5.25168           0.12697      41.36      <.0001
c_conc        1              -0.01068        0.00074385     -14.36      <.0001
c_conc2       1           -0.00005621        0.00000811      -6.93      <.0001
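Reading the parameter estimates back into the model statement gives the fitted quadratic on the centered scale. The equation below simply restates the reported estimates together with the centering constant 157 from the DATA step; the hat notation is introduced here for the fitted value.

```latex
\[
\widehat{\sqrt{\text{total}}}
  = 5.25168
  \;-\; 0.01068\,(\text{conc} - 157)
  \;-\; 0.00005621\,(\text{conc} - 157)^{2}
\]
```

Because the concentration is centered, the intercept is the fitted square-root response at conc = 157, which corresponds to roughly 5.25168^2, or about 27.6 young.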
4. Find a data set that interests you and generate a display that best represents it using SAS/GRAPH.