Converting a Multivariate Data Setup to a Univariate Data Setup

Bob Wendling in Leisure System Studies collected data designed to determine whether persons prefer interpretative messages (explanatory text accompanying a display of some interesting piece of natural history) with many personal words (like 'you'), or no personal words. For each of several such displays each subject read a personal interpretative message, a nonpersonal interpretative message, and a blank message (no text accompanying the display), and then ordered the three in terms of preference (1 = most preferred, 2 = intermediate, 3 = least preferred). Type of message is a three-level repeated measures variable. Preference ranks were summed across several objects for each subject. The data are embedded in the program below. In cols 1-2 are subject IDs (0 to 36), in cols 3-4 are preference sums for the nonpersonal messages, in cols 5-6 for the blank messages, and in cols 7-8 for the personal messages. The higher the score, the lower the preference.

If you run the program below and look at the output, you will see that there are problems with skewness that cannot be easily resolved with data transformations (positive skewness in one group, negative in another), so I decided to use a nonparametric analysis, the Friedman test. At the time I was involved in the research (long ago), I preferred to use Minitab for such an analysis, but Minitab wants the data to be in a univariate setup for a Friedman test. The setup of the data in the program is multivariate, all of each subject's scores (for all three conditions) being on a single line. In the second part of the program I use a DO OVER ARRAY to transform the multivariate data setup to a univariate setup, with one line per subject per condition. PROC PRINT is used just to print the univariate setup data so you can see what it looks like in that form. The last part of the program PUTs the univariate setup data into an external file that can be read into another program, such as Minitab.

====== The SAS Program ======

options formdlim='-';

data bob; input id 1-2 (y1-y3) (2.0);
cards;
00091809
01091610
02170613
03071811
04130617
05081810
06091512
07081711
08061812
09160614
10091512
11101808
12091216
13111510
14091809
15111807
16110916
17071811
18091512
19130617
20180711
21061515
22061515
23081810
24091809
25081414
26131310
27061812
28081810
29111015
30151011
31101808
32081810
33081612
34121014
35121311
36071811
;

*Look at the distributions and run the repeated measures analysis;
proc univariate; var y1-y3; run;
proc glm; model y1-y3 = / nouni;
repeated condtn 3 / printe; run;

*Convert the data from multivariate setup to univariate setup;
data bob2; set bob;
array ys (x) y1-y3;
do over ys; y=ys; output; end;
drop y1-y3;
proc print; run;

*Write the data to an external file for use by another program;
data bob3; set bob2; file 'C:\D\univariate.dat'; put id x y; run;
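DO OVER works with an implicit array, which is older syntax that SAS still accepts. As far as I know, current SAS documentation covers only explicit arrays, so here is a minimal sketch of the same conversion done that way. The data set names demo and demo_long are made up for illustration, and only the first five subjects from the program above are included to keep it short.

```sas
/* Sketch: multivariate (wide) to univariate (long) setup with an explicit array. */
/* demo and demo_long are illustrative names, not part of the original program.   */
data demo;
  input id 1-2 (y1-y3) (2.0);
datalines;
00091809
01091610
02170613
03071811
04130617
;
run;

data demo_long;
  set demo;
  array ys{3} y1-y3;      /* explicit array over the three condition scores */
  do x = 1 to 3;          /* x: 1 = nonpersonal, 2 = blank, 3 = personal */
    y = ys{x};
    output;               /* one record per subject per condition */
  end;
  drop y1-y3;
run;

proc print data=demo_long noobs; run;
```

With five subjects and three conditions, demo_long should hold 15 records, each with id, x, and y. Pointing the SET statement at bob instead of demo would convert the full data set.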

Here is a summary of the results of the Friedman test:

We collected data designed to determine whether persons prefer interpretative messages (explanatory text accompanying a display of some interesting piece of natural history) with many personal words (like 'you'), or no personal words. For each of several such displays each subject read a personal interpretative message, a nonpersonal interpretative message, and a blank message (no text accompanying the display), and then ordered the three in terms of preference (1 = most preferred, 2 = intermediate, 3 = least preferred). Type of message is a three-level repeated measures variable. Preference ranks were summed across several objects for each subject. The higher the score the lower the preference.

Since the data were not normally distributed, we decided to use a nonparametric analysis of variance, the Friedman test. The Friedman test indicated a significant effect of type of interpretative message, H(2, n = 37) = 14.52, p < .001. Wilcoxon's signed-ranks test was used to compare each type of message with each other type (see the medians in Table 1). Nonpersonal messages were preferred over personal messages, T(n = 34) = 136.5, z = 2.74, p = .006, and over blank messages, T(n = 36) = 133.5, z = 3.13, p = .002. Personal messages were preferred over blank messages, T(n = 34) = 429, z = 2.26, p = .024.

Table 1

Type of Message    Median    Interquartile Range
------------------------------------------------
Nonpersonal             9    3.5
Personal               11    4.0
Blank                  15    7.0
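
The Friedman test can also be obtained without leaving SAS. One commonly documented route is PROC FREQ with the CMH2 and SCORES=RANK options, using subject as the stratification variable. In that output the "Row Mean Scores Differ" statistic is Friedman's chi-square. The sketch below is only an illustration under stated assumptions: msg_long is a made-up data set name, and it uses just the first five subjects, so its output will not match the results reported above. How ties are handled may also differ from other packages.

```sas
/* Sketch only: five subjects from the program above, read and reshaped in one step. */
/* msg_long is an illustrative name, not part of the original program.               */
data msg_long;
  input id 1-2 (y1-y3) (2.0);
  array ys{3} y1-y3;
  do x = 1 to 3;          /* x: 1 = nonpersonal, 2 = blank, 3 = personal */
    y = ys{x};
    output;
  end;
  keep id x y;
datalines;
00091809
01091610
02170613
03071811
04130617
;
run;

/* Friedman's test: stratify on subject and use rank scores within each subject. */
proc freq data=msg_long;
  tables id*x*y / cmh2 scores=rank noprint;
run;

/* Median and interquartile range of the preference sums for each condition. */
proc means data=msg_long median qrange maxdec=1;
  class x;
  var y;
run;
```

To analyze all 37 subjects, use the univariate setup data set from the program (bob2) in place of msg_long in both procedures.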


Karl L. Wuensch, Dept. of Psychology, East Carolina University, February, 2009.