Additional Notes: Single-Level to ML Models

Additional Notes

From Single-Level to Multilevel Models

Multilevel models necessitate some changes in the way we specify our models. We are usually trying to investigate a set of theoretical relations that are thought to exist in the population. Decisions about data analysis are embedded in research questions, designs, and the data structures themselves.

We talked last week about how MLMs attend to the sampling schemes of many large-scale studies, as well as to the specification of processes that exist at multiple levels of an educational system.

Often the first step is determining whether a multilevel analysis is indeed necessary. Typically we first partition the variance in an outcome into its between-group and within-group parts and compute the intraclass correlation (ICC):

$\rho = \sigma^2_B / (\sigma^2_B + \sigma^2_W)$

The intraclass correlation can also be understood as the expected correlation between two randomly chosen individuals in the same group. Suppose the variance components in the first situation are the following:

Between = 20

Within = 60

ICC = 20/80 = .25

Suppose in the second it is:

Between = 40

Within = 40

ICC = 40/80 = .50

In the second case the groups are more homogeneous; that is, people within each group are more alike. If the ICC is at or near zero, there is little reason to conduct a multilevel analysis. You would just analyze individuals (as randomly selected and independent from each other). If the ICC is very high (approaching 1), there is also little need for a multilevel analysis, since the individuals within each group are nearly identical. In that case you would just conduct the analysis at the group level.
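The two ICC calculations above can be checked with a few lines of code. This is only an illustrative sketch; the function name `intraclass_correlation` is our own and does not come from any statistics package.

```python
def intraclass_correlation(between_var: float, within_var: float) -> float:
    """Return the share of total variance that lies between groups."""
    total = between_var + within_var
    if total <= 0:
        raise ValueError("Total variance must be positive.")
    return between_var / total


# Variance components from the two situations described in the notes.
icc_first = intraclass_correlation(between_var=20, within_var=60)
icc_second = intraclass_correlation(between_var=40, within_var=40)

print(f"First situation:  ICC = {icc_first:.2f}")
print(f"Second situation: ICC = {icc_second:.2f}")
```

The total variance is the same (80) in both situations; only its split between and within groups changes the ICC.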

Let’s look at a simple regression analysis. The model is typically described like this:

$Y = \mathbf{B}X$,

where the bold indicates a p × 1 vector of B coefficients.

Suppose we wish to explain students’ math test scores from their SES background (lowSES, coded 1 = participates in free/reduced lunch; 0 = else) and gender (female, coded 1 = female; 0 = male). Sometimes people will add the subscript i to refer to individuals. We have the following model:

$Y_i = \beta_0 + \beta_1(lowSES_i) + \beta_2(female_i) + e_i$,

where $\beta_0$ is the intercept, $\beta_1$ is the unstandardized beta for SES, $\beta_2$ is the unstandardized beta for female, and $e_i$ represents errors in predicting values of $Y_i$. Here is the set of estimates.

We can plug these coefficients into the equation as follows:

$Y_i = 650.600 - 19.091(lowSES_i) + 5.491(female_i) + e_i$

The intercept can be interpreted as the estimate for an individual whose status is 0 on both predictors (i.e., lowSES = 0 and female = 0). Hence, a male student who is not low SES (i.e., not participating in the federal free/reduced lunch program) would be expected to score 650.6 on the math test. Holding SES constant at lowSES = 0, females would be expected to score the following:

650.6 + 5.491(1) = 656.091
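The same arithmetic extends to all four combinations of the two dummy-coded predictors. The sketch below uses the coefficients reported above; the function name `predicted_math` is our own label for illustration.

```python
# Coefficients taken from the single-level estimates in the notes.
INTERCEPT = 650.600
B_LOWSES = -19.091
B_FEMALE = 5.491


def predicted_math(lowses: int, female: int) -> float:
    """Predicted math score for a given pair of 0/1 predictor codes."""
    return INTERCEPT + B_LOWSES * lowses + B_FEMALE * female


# Print the predicted score for every combination of the two dummy codes.
for lowses in (0, 1):
    for female in (0, 1):
        score = predicted_math(lowses, female)
        print(f"lowSES={lowses}, female={female}: {score:.3f}")
```

Because the model has no interaction term, the female effect (5.491) is the same size at both levels of lowSES.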

The key part of the single-level regression analysis is that the estimates for lowSES and female are fixed—that is, they are considered to be non-varying averages. Moreover, the prediction errors are assumed to be independent with mean = 0 and some variance.

Now, suppose we believe that the relationship between lowSES and math might vary across schools; that is, the relationship might be stronger in some schools and weaker in others. When we allow either the intercept (the average math score) or a slope to vary across units, it is called a “random” effect, since it can take on different values for different units.

We might conduct a multilevel analysis. We can devise a series of steps. The proposed model might look something like this at the moment:

You can see that the figure takes into account the nesting of individuals within schools; that is, there is a within-school portion of the model and a between-school portion. At this point there are no school (or level-2) variables, but they could be added subsequently.

1. Unconditional Model (partitioning variance components within and between schools)

At level 1, we can define students’ average achievement:

$Y_{ij} = \beta_{0j} + e_{ij}$

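At level 2 (school level), we can allow the average achievement intercept ($\beta_{0j}$) to vary randomly across schools. The random component is indicated by the level-2 random effect ($u_{0j}$), whose variance is the level-2 variance component:

$\beta_{0j} = \gamma_{00} + u_{0j}$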

Through substitution of the level-2 intercept equation into the level-1 equation, we can arrive at the combined single equation:

$Y_{ij} = \gamma_{00} + u_{0j} + e_{ij}$.

This suggests there are three parameters to estimate: the fixed intercept, the variance of the random effect (i.e., the randomly varying intercept), and the level-1 residual variance. We can confirm that in the Model Dimension table.

We can also examine the variance components.

How would we calculate the variance components for the math variable?
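One way to see where variance components come from is to compute them by hand for a small balanced data set. The sketch below uses the one-way ANOVA (method-of-moments) estimator, which is a swapped-in technique and not the likelihood-based estimation a mixed-model routine would normally use. The three schools and their scores are made-up toy values, and the function name is our own.

```python
from statistics import mean


def anova_variance_components(groups):
    """Method-of-moments variance components for a balanced one-way design.

    groups: dict mapping a group label to a list of scores (equal sizes).
    Returns (between_var, within_var).
    """
    sizes = {len(scores) for scores in groups.values()}
    if len(sizes) != 1:
        raise ValueError("This sketch assumes equal group sizes.")
    n = sizes.pop()   # students per school
    k = len(groups)   # number of schools

    all_scores = [y for scores in groups.values() for y in scores]
    grand_mean = mean(all_scores)

    # Sums of squares between and within groups.
    ss_between = sum(n * (mean(s) - grand_mean) ** 2 for s in groups.values())
    ss_within = sum((y - mean(s)) ** 2 for s in groups.values() for y in s)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (k * (n - 1))

    within_var = ms_within
    # Truncate at zero: the moment estimator can go negative in small samples.
    between_var = max((ms_between - ms_within) / n, 0.0)
    return between_var, within_var


# Hypothetical toy data: three schools, four students each.
toy_scores = {
    "school_a": [590, 610, 630, 650],
    "school_b": [620, 640, 660, 680],
    "school_c": [650, 670, 690, 710],
}

between_var, within_var = anova_variance_components(toy_scores)
icc = between_var / (between_var + within_var)
print(f"between = {between_var:.3f}, within = {within_var:.3f}, ICC = {icc:.3f}")
```

With unbalanced groups or real data, the estimates from a mixed-model routine will generally differ from these hand calculations.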

2. Within-School Model

Now let’s look at the same analysis with two predictors at the student level, but this time we are adjusting the estimates for the nesting of individual students within schools. At level 1 we have the following model:

$Y_{ij} = \beta_{0j} + \beta_{1j}(lowses_{ij}) + \beta_{2j}(female_{ij}) + e_{ij}$.

At level 2 (between schools) the intercept model remains the same.

We can first declare the slope coefficients for lowses and female to be fixed (not varying randomly) across schools:

$\beta_{1j} = \gamma_{10}$, $\beta_{2j} = \gamma_{20}$.

We can see that the estimates of the slope coefficients are not proposed to vary in size across schools, since there are no random components ($u_{1j}$ and $u_{2j}$, respectively). We can substitute the school-level models (i.e., describing the intercept and the two level-1 predictors that are fixed at level 2) into the level-1 model. We can see that the fixed level-1 predictors now also have gamma coefficients.

$Y_{ij} = \gamma_{00} + \gamma_{10}(lowses_{ij}) + \gamma_{20}(female_{ij}) + u_{0j} + e_{ij}$.

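We can then count up the fixed and random effects (five in all: three fixed effects, plus the random-intercept variance and the level-1 residual variance) and compare them to the Model Dimension table.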

Let’s look at the fixed effects. We can see there are differences from the single-level estimates in the intercept, the low SES effect, and the female effect, since the estimates now represent school-level averages.

Now we look at the random effects:

We can also examine the amount of variance accounted for at each level.
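A common way to express variance accounted for is the proportional reduction in each variance component relative to the unconditional model. The sketch below assumes that approach. The unconditional values (60 within, 20 between) are borrowed from the first ICC illustration earlier in these notes, and the conditional values (54 and 15) are purely hypothetical placeholders, not results from the math data.

```python
def variance_explained(null_var: float, model_var: float) -> float:
    """Proportional reduction in a variance component versus the null model."""
    if null_var <= 0:
        raise ValueError("The unconditional variance must be positive.")
    return (null_var - model_var) / null_var


# Unconditional (null) model components: illustrative values only.
null_within, null_between = 60.0, 20.0
# Components after adding predictors: hypothetical placeholders.
model_within, model_between = 54.0, 15.0

r2_level1 = variance_explained(null_within, model_within)
r2_level2 = variance_explained(null_between, model_between)
print(f"Level 1: {r2_level1:.2f}, Level 2: {r2_level2:.2f}")
```

Level-1 predictors can reduce the level-2 variance as well, because schools differ in student composition. A component can also increase when predictors are added, which makes this measure negative.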

3. Specifying a Random Slope

We can also estimate a random slope. We specify the slope for lowSES to vary randomly. We only have to make one change:

$\beta_{1j} = \gamma_{10} + u_{1j}$.

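When we substitute the slope model into the previous combined model, we obtain the following “cross-level” effect for the level-1 slope at level 2:

$Y_{ij} = \gamma_{00} + \gamma_{10}(lowses_{ij}) + \gamma_{20}(female_{ij}) + u_{1j}(lowses_{ij}) + u_{0j} + e_{ij}$.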

Note that when we substitute $\beta_{1j}$ into the combined equation, we must multiply what it is equal to ($\gamma_{10} + u_{1j}$) by the level-1 predictor (lowses). This results in two terms ($\gamma_{10}(lowses_{ij})$ and $u_{1j}(lowses_{ij})$) in the combined equation. We can see there are now two random effects (the slope for lowses and the intercept).
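The substitution can be verified numerically: building a score in two steps (level-2 equations first, then the level-1 equation) should match the single combined equation. In the sketch below, all parameter and random-effect values are hypothetical numbers chosen for easy arithmetic, and the function names are our own.

```python
import math


def two_step_score(lowses, female, g00, g10, g20, u0j, u1j, e_ij):
    """Build the score from the level-2 equations, then the level-1 equation."""
    b0j = g00 + u0j   # random intercept for school j
    b1j = g10 + u1j   # random lowses slope for school j
    b2j = g20         # female slope is fixed across schools
    return b0j + b1j * lowses + b2j * female + e_ij


def combined_score(lowses, female, g00, g10, g20, u0j, u1j, e_ij):
    """Same score from the combined single equation after substitution."""
    return (g00 + g10 * lowses + g20 * female
            + u1j * lowses + u0j + e_ij)


# Hypothetical values for one low-SES female student in one school.
params = dict(g00=650.0, g10=-19.0, g20=5.0, u0j=4.0, u1j=-3.0, e_ij=2.0)

y_two_step = two_step_score(lowses=1, female=1, **params)
y_combined = combined_score(lowses=1, female=1, **params)
print(y_two_step, y_combined, math.isclose(y_two_step, y_combined))
```

For a student with lowses = 0, the $u_{1j}$ term drops out, so the random slope only affects predictions for low-SES students.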

Notice the fixed effects are different from the previous model.

Also we can examine the variance components. We can see that the slope varies significantly across schools.

Some Notes on Defining Equations

  1. Typically, we refer to level-1 coefficients with the Greek letter beta (β). We refer to the intercept as $\beta_0$ and to the predictors at level 1 as X variables, and number them from 1 to q. We use the subscript i to refer to individuals. The subscript j refers to groups.
  2. At level 2, we typically refer to the coefficients as gamma (γ). We refer to the intercept as $\gamma_{00}$ and to the level-2 predictors as W (or Z), and number their coefficients from $\gamma_{01}$ to $\gamma_{0q}$.
  3. Level-1 variables that are referred to at level 2 keep their number from level 1 but add a zero behind. For example, $\beta_1$ becomes $\gamma_{10}$ (note that if $\beta_1$ is randomly varying, the coefficients for the predictors explaining the random slope will be numbered $\gamma_{11}$, $\gamma_{12}$, etc.).