Have ExamMonitor running before you start your assignments. When you are done with the assignments, stop ExamMonitor. ExamMonitor will then save a file with the extension ENC.

When you are done, send your Word-file and your ENC-file to . As the subject of your mail, use “AMMBR exam April 2013”.

Add syntax and output in your Word file so that I can understand what it is that you are doing, and what you base your conclusions on. Do not add separate do-files or output files.

You are obviously allowed to use Stata and Word, including Stata’s help files and manuals.

You cannot: use your own notes or slides, use home-made do-files, have contact with others, use the Internet, etc. When in doubt, ask!

You can answer in Dutch if you want.

Best of luck, don’t forget to enjoy the new skill you have mastered …

AMMBR, Tuesday, April 9, 2013 Chris Snijders

As you know, we use the file DermaApps_AMMBR.dta for both assignments. This is the file that was based on our experimental data. We have 173 participants who each completed the experiment from start to finish. Hence: every participant is represented by exactly 11 rows of data in the data matrix, with all the other data as you have seen before.

Explain all your answers.

Tip: if your laptop is slow, try creating most of your syntax with a smaller part of the data first.
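
One way to follow this tip is to keep a random subset of participants, with all of their rows, while you develop your syntax. The lines below are only a sketch: the variable name participant is a hypothetical stand-in for the participant ID in the data, the seed and the share of 25% are arbitrary, and the file is assumed to be in the working directory. Reload the full data before you run your final analyses.

```stata
* Sketch only: "participant" is a hypothetical name for the participant ID variable.
use DermaApps_AMMBR.dta, clear
set seed 12345

* Tag one row per participant and draw a random number on that row only
egen byte firstrow = tag(participant)
generate double u = runiform() if firstrow

* Copy the draw to all rows of the same participant (missing values sort last)
bysort participant (u): replace u = u[1]

* Keep roughly a quarter of the participants, including all of their rows
keep if u < 0.25
drop firstrow u
```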

Another tip to make Stata run faster, if you need it to, is to avoid including categorical variables through the “i.” factor-variable prefix. Make dummies instead.
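
As an illustration of making dummies by hand, the sketch below assumes a hypothetical categorical variable called educ with three categories; that name is not taken from the data file, so replace it with whichever categorical variable you want to include.

```stata
* Sketch only: "educ" is a hypothetical categorical variable with three categories.
* tabulate with generate() creates one dummy per category: educ_d1, educ_d2, educ_d3.
tabulate educ, generate(educ_d)

* Leave one dummy out as the reference category (here educ_d1).
* This takes the place of "i.educ" in the model:
logit ownsmartphone educ_d2 educ_d3
```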

ASSIGNMENT 0: CLEANING UP (10 points)

There might be respondents in our data file that we do not wish to include in our subsequent analyses. Get rid of these respondents – if you think this is necessary – and explain why you exclude the ones you exclude.

ASSIGNMENT 1: LOGISTIC REGRESSION (45 points)

For this part, we only consider respondent data: we need only one line per respondent. To achieve this, type

keep if tasknumber==1

This leaves you with only one line per person. We now consider the variables medical1 through medical9. You can run a factor analysis on them by copy-pasting the following commands in your do-file (note: the question mark in the below code tells Stata to use all variables that start with medical and then have one character behind it):

factor medical?, pcf

rotate, promax

predict MedA MedB MedC

What this does is create three variables, [MedA], [MedB] and [MedC]. As you can probably infer from Stata’s output right after these commands, [MedA] mainly contains questions 1, 2, 4 and 9. [MedB] mainly contains 7 and 8, and [MedC] mainly contains 3 and 6. The factor analysis itself is not that great, but we will nevertheless use the three newly created variables in the subsequent analyses of this assignment.
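
If you want to check which variables the pattern medical? picks up before running the factor analysis, something like the following should work. It is a small sketch that only lists variables and does not change the data.

```stata
* Lists the variables matched by the pattern: "medical" followed by exactly one character.
describe medical?

* By contrast, "medical*" matches any number of trailing characters (including none).
describe medical*
```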

In addition, we create a scale score out of the [CompLiteracy] variables, excluding number 7, by

alpha CompLiteracy1-CompLiteracy6 CompLiteracy8 CompLiteracy9, gen(complit)

Our target variable in this assignment is the variable [ownsmartphone].

a)  (10 points) Analyze a logistic regression model with [female], [age], [MedA], [MedB], [MedC], and [complit]. Interpret your results.

b)  (5 points) How often do we predict a 1 when we should have predicted a 0?

c)  (10 points) Test the assumptions of the logistic regression model on the model under a). What do you conclude?

d)  (20 points) Make a model as good as you can get it to predict [ownsmartphone], using the variables mentioned under a) and any other variables in your data set that might make sense. Think about possible interaction effects, transformations, and outliers. Test the model assumptions. Present a final model, and a conclusion.

ASSIGNMENT 2: MULTI-LEVEL REGRESSION (45 points)

For this assignment, we use the original data set once again. So first reload the original data and remove any respondents that you do not want included (as you already did in assignment 0).

The variable [reliable] is your target variable. Note that this is actually not really an interval-valued variable (it takes on values from 1-5 only), but let us not be bothered by that today.

a.  (5 points) You could argue that the data are clustered in several ways. Calculate the percentage of the variance at the cluster level in the following cases:

1)  when you assume that the data are clustered within participants

2)  when you assume that the data are clustered within the student that has invited the participant

For both cases, test whether the percentage you find differs significantly from zero. What do you conclude?

b.  (5 points) Give a reason why clustering at the level of the student that has invited the participant might occur.

c.  (5 points) Now consider clustering at the participant level only. Run a multi-level regression with the variables [female], [age], [vig_graphic], [vig_author], [vig_price], [vig_stars], and [vig_downloads] as predictors. Make sure to include the variables the right way. Interpret the results.

d.  (5 points) In the model under c., test whether there is evidence for a random slope for the variable [vig_author]. Calculate the percentage of the variance at the slope-level. Interpret the result.

e.  (10 points) For the five variables [vig_graphic] through [vig_downloads], create the mean of the variable across participants, and the deviation from the mean within participants. Add the variables you created to the model under c., instead of the original [vig_graphic] through [vig_downloads] variables. Explain: what are you now testing? If you could choose, what would you hope to happen: that the means across participants are significant, or the deviations from the mean within participants? Interpret the results.

f.  (15 points) Now, if possible, improve your model using the rest of the data. Think about interactions, outliers, transformations, etc. Test the multi-level regression assumptions.

g.  (Bonus: +5 points) You could correctly argue that this is actually a cross-classified multi-level model. Explain why and what this is, and run the model under c. as a cross-classified multi-level model. Briefly interpret your findings.

When you are done, send your Word-file and your ENC-file to . Do not forget the ENC-file! Use “AMMBR exam April 2013” as the subject of your mail.
