Converting a Multivariate Data Setup to a Univariate Data Setup
Bob Wendling in Leisure System Studies collected data designed to determine whether persons prefer interpretative messages (explanatory text accompanying a display of some interesting piece of natural history) with many personal words (like 'you'), or no personal words. For each of several such displays each subject read a personal interpretative message, a nonpersonal interpretative message, and a blank message (no text accompanying the display), and then ordered the three in terms of preference (1 = most preferred, 2 = intermediate, 3 = least preferred). Type of message is a three-level repeated measures variable. Preference ranks were summed across several objects for each subject. The data are embedded in the program below. In cols 1-2 are subject id's (0 to 36), in cols 3-4 are preference sums for the nonpersonal messages, in cols 5-6 for the blank messages, and in cols 7-8 for the personal messages. The higher the score the lower the preference.
If you run the program below and look at the output, you will see that there are problems with skewness that cannot be easily resolved with data transformations (positive skewness in one group, negative in another), so I decided to use a nonparametric analysis, the Friedman test. At the time I was involved in the research (long ago), I preferred to use Minitab for such an analysis, but Minitab wants the data to be in a univariate setup for a Friedman test. The setup of the data in the program is multivariate, all of each subject's scores (for all three conditions) being on a single line. In the second part of the program I use a DO OVER ARRAY to transform the multivariate data setup to a univariate setup. PROC PRINT is used just to print the univariate setup data so you can see what it looks like in that form. The last part of the program PUTs the univariate setup data into a file (that goes to the reader) that can be read into another program, such as Minitab.
====== The SAS Program ======
options formdlim='-';
data bob; input id 1-2 (y1-y3) (2.0);
cards;
00091809
01091610
02170613
03071811
04130617
05081810
06091512
07081711
08061812
09160614
10091512
11101808
12091216
13111510
14091809
15111807
16110916
17071811
18091512
19130617
20180711
21061515
22061515
23081810
24091809
25081414
26131310
27061812
28081810
29111015
30151011
31101808
32081810
33081612
34121014
35121311
36071811
proc univariate; var y1-y3; proc glm; model y1-y3 = / nouni;
repeated condtn 3 / printe;
*Convert the data from multivariate setup to univariate setup;
data bob2; set bob;
array ys (X) y1-y3; do over ys; y=ys; output; end; drop y1-y3; proc print; run;
*Write the data to an external file for use by another program;
data bob3; set bob2; file 'C:\D\univariate.dat'; put id x y; run;
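For readers who will do the follow-up analysis outside SAS, the same multivariate-to-univariate conversion can be sketched in Python. This is only an illustration and not part of the original program: the function name `to_univariate` is made up for the example, and only the first three records from the program are used, for brevity. The column positions follow the layout described above (cols 1-2 for the id, then three two-digit scores).

```python
# Multivariate (wide) records: cols 1-2 = id, cols 3-8 = three 2-digit scores.
# Only the first three records from the SAS program are used here, for brevity.
wide_records = ["00091809", "01091610", "02170613"]


def to_univariate(records, n_conditions=3, width=2):
    """Return one (id, x, y) row per subject-by-condition combination.

    x is the condition index (1, 2, 3), mirroring the index variable X
    of the SAS array; y is the score for that condition.
    """
    rows = []
    for rec in records:
        subject_id = int(rec[0:2])
        for x in range(1, n_conditions + 1):
            start = 2 + (x - 1) * width
            y = int(rec[start:start + width])
            rows.append((subject_id, x, y))
    return rows


long_rows = to_univariate(wide_records)

# One line per score, space separated, like the SAS statement "put id x y".
for subject_id, x, y in long_rows:
    print(subject_id, x, y)
```

Each subject's single line becomes three lines, one per condition, which is the univariate setup that the Friedman test in Minitab expects.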
Here is a summary of the results of the Friedman test:
We collected data designed to determine whether persons prefer interpretative messages (explanatory text accompanying a display of some interesting piece of natural history) with many personal words (like 'you'), or no personal words. For each of several such displays each subject read a personal interpretative message, a nonpersonal interpretative message, and a blank message (no text accompanying the display), and then ordered the three in terms of preference (1 = most preferred, 2 = intermediate, 3 = least preferred). Type of message is a three-level repeated measures variable. Preference ranks were summed across several objects for each subject. The higher the score the lower the preference.
Since the data were not normally distributed, we decided to use a nonparametric analysis of variance, the Friedman test. The Friedman test indicated a significant effect of type of interpretative message, H(2, n = 37) = 14.52, p < .001. Wilcoxon's signed-ranks test was used to compare each type of message with each other type (see the medians in Table 1). Nonpersonal messages were preferred over personal messages, T(n = 34) = 136.5, z = 2.74, p = .006, and over blank messages, T(n = 36) = 133.5, z = 3.13, p = .002. Personal messages were preferred over blank messages, T(n = 34) = 429, z = 2.26, p = .024.
Table 1
Type of Message    Median    Interquartile Range
------------------------------------------------
Nonpersonal           9      3.5
Personal             11      4.0
Blank                15      7.0
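The Friedman statistic reported above can be checked outside Minitab with a short Python sketch. It assumes the usual tie-corrected form of the statistic, which is an assumption about what the original analysis reported. It uses the 37 records embedded in the SAS program, and the helper names `midranks` and `friedman` are made up for this illustration. The p value line relies on the fact that, for 2 degrees of freedom, the chi-square upper tail equals exp(-x/2); it does not carry over to other numbers of conditions.

```python
import math

# The 37 records from the SAS program: cols 1-2 = id, then three 2-digit
# preference sums (nonpersonal, blank, personal).
RECORDS = """
00091809 01091610 02170613 03071811 04130617 05081810 06091512 07081711
08061812 09160614 10091512 11101808 12091216 13111510 14091809 15111807
16110916 17071811 18091512 19130617 20180711 21061515 22061515 23081810
24091809 25081414 26131310 27061812 28081810 29111015 30151011 31101808
32081810 33081612 34121014 35121311 36071811
""".split()


def midranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    ranks = []
    for v in values:
        below = sum(1 for w in values if w < v)
        equal = sum(1 for w in values if w == v)
        ranks.append(below + (equal + 1) / 2)
    return ranks


def friedman(blocks):
    """Return (tie-corrected Friedman statistic, rank sum per condition)."""
    n, k = len(blocks), len(blocks[0])
    rank_sums = [0.0] * k
    tie_term = 0.0
    for scores in blocks:
        # Rank the k scores within each subject and accumulate by condition.
        for j, r in enumerate(midranks(scores)):
            rank_sums[j] += r
        # Each group of t tied scores within a subject contributes t^3 - t.
        for v in set(scores):
            t = scores.count(v)
            tie_term += t ** 3 - t
    stat = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums)
    stat -= 3 * n * (k + 1)
    stat /= 1 - tie_term / (n * k * (k * k - 1))
    return stat, rank_sums


blocks = [[int(r[2:4]), int(r[4:6]), int(r[6:8])] for r in RECORDS]
stat, rank_sums = friedman(blocks)
p_value = math.exp(-stat / 2)  # chi-square upper tail, valid for 2 df only

print("rank sums (nonpersonal, blank, personal):", rank_sums)
print("Friedman statistic:", round(stat, 2))
print("p value:", p_value)
```

Run as written, the rank sums come out as 58, 90, and 74 for the nonpersonal, blank, and personal messages, and the statistic rounds to 14.52, which agrees with the value in the write-up.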
Karl L. Wuensch, Dept. of Psychology, East Carolina University, February, 2009.