Multilevel Situations

Individual observations are grouped.

Each group is randomly selected from a collection of groups.

Classrooms, schools, organizations are common examples of such groups.

Randomly here means that if the experiment were repeated, different groups would be included.

These groups would have different compositions and different values of group variables.

Example: A huge sample of students is taken and the Math achievement score of each student is obtained. In addition, the Socioeconomic Status (SES) of each student is computed. The interest is in the relationship of Math scores to SES scores. But the students are grouped by school, with from 10 to almost 30 students from each school. More than 400 schools are represented.

Ch3multilevel.sav

So what?

1. Differences between the groups may affect the estimates of regression coefficients.

2. The standard errors of the estimates, used for tests of significance, are (usually) larger when grouping is taken into account, leading to different conclusions regarding the significance of relationships than would be reached if the grouping were ignored.

3. There may be group characteristics that influence the dependent variable, accounting for variance that would not be accounted for if grouping were not recognized.

4. Variation in estimates associated with differences between groups, e.g., variation in intercepts or variation in slopes, may be of interest in its own right.

All of these are reasons to take grouping into account, if it’s present, when performing analyses.
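Point 2 can be illustrated with a small simulation. The sketch below is only an illustration under assumed values (50 groups of 20 persons, a between-group standard deviation of 4 and a within-group standard deviation of 6; none of these come from the text's data). To keep it short it looks at a sample mean rather than a regression coefficient. It compares the standard error you would compute if grouping were ignored with how much the mean actually varies when the whole study is repeated with newly drawn groups.

```python
import random
import statistics

random.seed(1)

N_GROUPS, N_PER_GROUP = 50, 20      # assumed design, for illustration only
SD_BETWEEN, SD_WITHIN = 4.0, 6.0    # assumed spread of groups and of persons within a group

def one_study():
    """Draw a fresh set of groups and return every individual score."""
    scores = []
    for _ in range(N_GROUPS):
        group_effect = random.gauss(0, SD_BETWEEN)   # shared by everyone in the group
        for _ in range(N_PER_GROUP):
            scores.append(50 + group_effect + random.gauss(0, SD_WITHIN))
    return scores

# Standard error of the mean computed as if all scores were independent.
sample = one_study()
naive_se = statistics.stdev(sample) / len(sample) ** 0.5

# How much the mean really varies when the study is repeated with new groups.
means = []
for _ in range(300):
    scores = one_study()
    means.append(sum(scores) / len(scores))
actual_se = statistics.stdev(means)

print(f"SE ignoring grouping:             {naive_se:.3f}")
print(f"SD of the mean over replications: {actual_se:.3f}")
```

Under these assumptions the second number should come out noticeably larger than the first, which is the sense in which ignoring grouping understates the standard error.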

Contextual variables

The groups may have characteristics that are important for the analyses. These are often called contextual variables. So a contextual variable is a characteristic of a group.

Examples: If individuals were grouped in schools, school size would be a contextual variable.

If individuals were grouped in organizations, organization verticality might be a contextual variable.

They’re called contextual because each group creates a context in which the relationships of the dependent variable to independent variable(s) exist. Those contexts vary from group to group.

Random vs. Fixed characteristics

We’ve considered grouping previously.

Each qualitative factor in an analysis – sex, race, graduate program, age group, training program – represents a set of groups. Are these the kinds of groups we’re thinking about here?

No.

Most characteristics we’ve previously studied are fixed from one study to the next. Two training programs, for example, would be the same two training programs if the experiment were replicated.

These are called fixed factors. The characteristics associated with the two or more groups would be the same, i.e., fixed, if the experiment were repeated.

So the training programs are not groups in the above sense because their characteristics are not randomly chosen but are fixed.

On the other hand, if we’re dealing with randomly selected groups, then the characteristics of the groups would change randomly if the experiment were repeated. Characteristics of random groups are called random factors.

Examples are percentage of males in a classroom, average SES of a school, student/faculty ratio of a school, verticalness of an organization.

Multilevel modeling

Multilevel modeling is a collection of analyses designed to take into account the nesting of observations within randomly selected groups and also the random nature of the grouping variables typically found.

Multilevel models are also called random coefficients models, mixed-effects models, multilevel regression models, hierarchical linear models, and multilevel covariance structure or multilevel structural equation models. (Text, p. 1).

Example of a multilevel data file: ch3multilevel from the text’s data files.

Gender, SES, FEMSES, and MATH are individual variables. In this data set MATH, representing Math Achievement, is the dependent variable. Its relationship to other variables, such as student SES is being investigated. FEMSES is a product variable. More on it later.

SES_MEAN = mean SES of the school, PER4YRC = percentage of students in the school going on to a 4-year school, PUBLIC = whether the school is public or private.

Levels in multilevel analyses

The “lowest” level in multilevel analyses, usually the individual persons in the analyses, is called Level 1. Level 1 characteristics are characteristics of individual persons, so they may vary from person to person, both within a group and between different groups.

The groups, in which the level 1 individuals are nested, are at Level 2.

The vast majority of multilevel analyses involve only two levels. However, analyses with three or more levels are possible.

For example, individuals could be nested within classes, which could be nested within schools.

Level 2 characteristics: Composites of Level 1 characteristics vs. Pure group characteristics

A Level 2 characteristic has the same value for every person within a group. It may take a different value in another group, where it is again the same for every person in that group, and so on.

Composites of Level 1 characteristics:

A composite of characteristics of individuals may be treated as a group characteristic.

Mean of SESs of individual students within a school can be used as a group characteristic.

The percentage of students going on to a 4-year college within a school can be used as a group characteristic.
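Here is a minimal sketch of how composites like these could be built from individual records. The student records and school ids are made up for illustration; only the variable names SES_MEAN and PER4YRC echo the data file described above.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical student records: (school id, student SES, going on to a 4-year college?)
students = [
    ("A", -0.5, False), ("A", 0.1, True), ("A", 0.4, True),
    ("B", 1.2, True), ("B", 0.8, True),
    ("C", -1.0, False), ("C", -0.2, False), ("C", 0.0, True), ("C", -0.4, False),
]

# Collect the Level 1 values for each school.
by_school = defaultdict(list)
for school, ses, four_year in students:
    by_school[school].append((ses, four_year))

# Build the Level 2 composites: one pair of values per school.
composites = {}
for school, rows in by_school.items():
    ses_mean = mean(ses for ses, _ in rows)
    per4yrc = 100 * sum(four_year for _, four_year in rows) / len(rows)
    composites[school] = (ses_mean, per4yrc)

# Attach the composites back to each student: the values repeat within a school.
for school, ses, _ in students:
    ses_mean, per4yrc = composites[school]
    print(school, ses, round(ses_mean, 2), round(per4yrc, 1))
```

The printed rows show the pattern described above: the individual SES changes from student to student, while the two composite columns stay constant within a school and change only when the school changes.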

Pure Group characteristics:

Pure group characteristics are characteristics of the group that are a property of the group entity only, not “built up” by combining scores of the elements.

Whether a school is public or private, as in the above data set, is an example.

The number of years experience of the principal of the school is another example.

Gross sales of an organization and the debt rating of an organization are further examples.

Longitudinal Models as multilevel analyses

Much research can be characterized as described above, with persons as the Level 1 entity and groups within which the persons exist (classrooms, schools, organizations) as the Level 2 entities. Let’s call them cross-sectional multilevel designs.

However, there is another situation to which multilevel analyses apply.

This is the situation in which a measurement of an individual at one of a series of times is the Level 1 entity. So the level 1 observation is a measurement at a time period.

In this situation, the person is the Level 2 entity.

So if we observe people for three years, once per year, the three observations will be the Level 1 observations and the person will be the Level 2 entity.

That is, time periods are grouped within persons.
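In data terms, this usually means arranging the file so that each row is one measurement occasion. The sketch below uses made-up scores and hypothetical field names (person, year1 to year3, time, score) to show that restructuring; it is an illustration, not a procedure taken from the text.

```python
# Hypothetical "wide" records: one row per person, one score per year.
wide = [
    {"person": 1, "year1": 10, "year2": 12, "year3": 15},
    {"person": 2, "year1": 8, "year2": 9, "year3": 9},
]

# Restructure to "long" form: one row per measurement occasion (Level 1),
# with the person id acting as the grouping (Level 2) variable.
long_rows = []
for rec in wide:
    for time in (1, 2, 3):
        long_rows.append(
            {"person": rec["person"], "time": time, "score": rec[f"year{time}"]}
        )

for row in long_rows:
    print(row)
```

Each person now contributes three rows, just as each school contributes many students in the cross-sectional case.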

The first type of situation

The second type of situation

Examples of Research with level 1 and level 2 data – 1st type of analysis only, not longitudinal

1. Assessing the relationship of household income to race and education of the head of the household and to percentage of Blacks in the state and the state-level mean educational attainment. (Bickel, p. 10)

Individual elements are heads of households

Level 1 DV INCOME Income of head of household

Level 1 IV BLACK1 Whether head of household is Black or not

Level 1 IV EDUCATION1 Level of education of head of household

Grouping variable is State

Level 2 IV BLACK2 Percentage of Blacks in the state in which the household resides. (Composite.)

Level 2 IV EDUCATION2 Mean education level in the state in which the household resides. (Composite?)

2. Assessing the relationship of child reading achievement to income of the child’s family, quality of the individual child’s home neighborhood, amount of education of the parents of the child, child’s ethnicity, Head Start participation, mean vocabulary achievement of the children in the school the child attends, and quality of the neighborhood in which the school is located.

Individual elements are individual children.

Level 1 DV Vocab Achievement Vocabulary achievement score of individual child

Level 1 IV INCOME1 Income level of individual child’s family

Level 1 IV NEIGHBORHOOD1 Quality of individual child’s home neighborhood

Level 1 IV ETHNICITY1 Individual child’s ethnicity

Level 1 IV HEADSTART1 Whether individual child participated in Head Start

Grouping variable is School

Level 2 IV VOCABULARY2 Mean of vocabulary achievement levels of kids in the school attended by the child. (A composite group characteristic.)

Level 2 IV NEIGHBORHOOD2 Quality of neighborhood in which the school resides (A pure group characteristic.)

3. Bickel p. 28. Assessing the relationship of child math achievement to child’s SES, whether the child is in a public or private school and mean SES of all the kids in the school.

Individual elements are individual children

Level 1 DV Math Achievement

Level 1 IV SES1 Child’s SES

Grouping variable is school

Level 2 IV PRIVATE2 Whether child is in a public or private school. (Pure.)

Level 2 IV SES2 Mean SES of kids in the child’s school. (A composite.)

A general procedure

Specify a Level 1 model which gives the relationship of the DV to Level 1 IVs.

This will typically be a regression equation relating the DV to one or more Level 1 IVs.

Specify a Level 2 model of the relationship of the intercept of the Level 1 model to Level 2 IVs.

Specify a Level 2 model of the relationship(s) of the slope(s) of the Level 1 model to Level 2 IVs.

Write a Combination model which incorporates both the Level 1 and Level 2 models specified above.

Two ways of doing this.

1. Separate Level 1 and Level 2 approaches.

Primary example: HLM by Scientific Software International

2. Combination model approach.

Primary Example: SPSS Mixed procedure.

We’ll use the 2nd.

Chapter 1 examples – used to introduce notation.

Notation is especially important in understanding multilevel analyses.

We’ll spend an inordinate amount of time on it.

In the interests of expedience, I’ll make some substitutions.

Text symbol / Name / Used here
β / Greek beta / B
γ / Greek gamma / g
ε / Greek epsilon / e

Text Example 2. (Example 1 was confusing, so it’s skipped.)

A model which assumes that both the intercept (B0j) and the slope (B1j) of the Level 1 model might vary from group to group.

Level 1 Model

Yij = B0j + B1jXij + eij     (Eq. 1.5)

Here Yij is the DV, B0j is the Level 1 intercept, B1jXij is the Level 1 slope term, and eij is the residual.

The i subscript represents the person, person i, in this case.

The j subscript represents the group in which person i resides, group j.

This model says that Y is linearly related to X within each group, but that the intercept (B0j) and the slope (B1j) of that relationship might vary from group to group. That possibility is indicated by the j in the subscript.

If there were no grouping, the equation would be Yi = B0 + B1Xi + ei.

Note that the authors use the subscript 0 on B to represent the intercept of a relationship and the subscript 1 to represent the slope.

Note that B1j has a j subscript, indicating that the Level 1 slope could vary from group to group. I believe there is a typo in the text. They listed only B1. It should have been B1j.
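To make B0j and B1j concrete, here is a small sketch that fits a separate least-squares line within each of two groups. The data are made up, and fitting each group on its own is not how a multilevel analysis estimates anything; it is only meant to show what "an intercept and a slope for group j" refers to.

```python
from statistics import mean

# Hypothetical (X, Y) pairs for the persons i within each group j.
groups = {
    "group 1": [(1, 12.0), (2, 14.1), (3, 15.9), (4, 18.0)],
    "group 2": [(1, 20.2), (2, 21.0), (3, 22.1), (4, 22.9)],
}

def fit_line(pairs):
    """Ordinary least squares for Y = B0 + B1*X + e within one group."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_bar, y_bar = mean(xs), mean(ys)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in pairs) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    b0 = y_bar - b1 * x_bar
    return b0, b1

estimates = {name: fit_line(pairs) for name, pairs in groups.items()}
for name, (b0j, b1j) in estimates.items():
    print(f"{name}: B0j = {b0j:.2f}, B1j = {b1j:.2f}")
```

In these made-up data the two groups end up with different intercepts and different slopes, which is exactly the possibility the j subscripts allow for.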

Level 2 model of intercept.

Simplest possible model of the intercept: B0j = g00.

2nd simplest model is . . .

B0j = g00 + u0j     (Eq. 1.6)

This is new material. This model specifies how the intercept varies from group to group.

Subscripts: The 1st subscript, the first 0 in g00, means that this g is part of the model of the Level 1 intercept.

The 2nd subscript, the second 0 in g00, means that it’s the intercept of that model of the Level 1 intercept.

It is the 2nd simplest type of Level 2 model: One that simply says that the intercept varies randomly (u0j) about a mean value (g00).

It could be even simpler. It could be B0j = g00. This model would say that the intercept is the same for every group. This is not commonly done, however. Most Level 2 models of the intercept allow for some variation of the intercept from group to group.

Level 2 model of slope.

B1j = g10     (Eq. 1.7)

This model specifies that the Level 1 slope is the same for every group.

The 1 in g10 says that this g is a parameter of a slope model (as opposed to an intercept model).

The 0 indicates that it’s the intercept of the Level 2 model.

Note that there is no “slope residual” in the Level 2 slope model. There could be. If there were, the model of the slope would be B1j = g10 + u1j. More on this later.

The Combined model, based on the above.

Yij = B0j + B1jXij + eij
Yij = (g00 + u0j) + g10Xij + eij     (Eq. 1.8)

The model of individual scores is above.
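One way to see how Eqs. 1.5 through 1.8 fit together is to generate scores from them. The sketch below does that with assumed values for g00, g10, and the standard deviations of u0j and eij (all invented for illustration). It builds each score from the Level 1 and Level 2 pieces, which gives the same number as plugging directly into the Combined model.

```python
import random

random.seed(2)

G00, G10 = 50.0, 5.0      # assumed grand mean intercept and common slope
SD_U0, SD_E = 4.0, 6.0    # assumed standard deviations of u0j and eij

def simulate(n_groups=5, n_per_group=4):
    """Generate (j, i, Xij, Yij, u0j, eij) rows from the two-level model."""
    rows = []
    for j in range(1, n_groups + 1):
        u0j = random.gauss(0, SD_U0)     # one intercept residual per group
        b0j = G00 + u0j                  # Level 2 model of the intercept (Eq. 1.6)
        b1j = G10                        # Level 2 model of the slope (Eq. 1.7)
        for i in range(1, n_per_group + 1):
            xij = random.gauss(0, 1)
            eij = random.gauss(0, SD_E)  # one residual per person
            yij = b0j + b1j * xij + eij  # Level 1 model (Eq. 1.5)
            rows.append((j, i, xij, yij, u0j, eij))
    return rows

rows = simulate()
for j, i, xij, yij, _, _ in rows[:6]:
    print(f"group {j}, person {i}: X = {xij:.2f}, Y = {yij:.2f}")
```

Note that u0j is drawn once per group and shared by everyone in that group, while eij is drawn fresh for every person.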

Implicit in this model are quantities to be estimated.

1) The grand mean intercept, g00. This represents the mean of all the individual group intercepts.

2) The grand mean slope, g10. This is the common slope across all the groups.

These are what we’re primarily interested in. But the Level 2 model has created the opportunity to estimate one other characteristic, one that has emerged from the Level 2 model of the intercept. That is

3) The variance of intercepts from group to group.

Finally, the procedure used for the analysis prints one final quantity, one which has always been available but which we haven’t paid much attention to. It’s

4) The variance of the residuals, that is, the variance of the individual Ys about the within-group regression lines.
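To show what these four quantities are, here is a rough sketch that recovers them from simulated data. Two loud caveats: the "true" values and the design (200 groups of 20) are assumptions made up for the illustration, and the calculation is a simple moment-based shortcut, not the likelihood-based estimation that a mixed-model procedure uses. It works tolerably here only because the groups are all the same size.

```python
import random
from statistics import mean, variance

random.seed(3)

# Assumed "true" values used to generate illustrative data.
G00, G10, VAR_U0, VAR_E = 50.0, 5.0, 16.0, 36.0
N_GROUPS, N_PER_GROUP = 200, 20

data = {}   # group j -> list of (Xij, Yij)
for j in range(N_GROUPS):
    u0j = random.gauss(0, VAR_U0 ** 0.5)
    data[j] = []
    for _ in range(N_PER_GROUP):
        x = random.gauss(0, 1)
        y = G00 + u0j + G10 * x + random.gauss(0, VAR_E ** 0.5)
        data[j].append((x, y))

# 2) Common slope: pool the within-group cross-products over all groups.
sxy = sxx = 0.0
centers = {}
for j, pairs in data.items():
    x_bar, y_bar = mean(x for x, _ in pairs), mean(y for _, y in pairs)
    centers[j] = (x_bar, y_bar)
    sxy += sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    sxx += sum((x - x_bar) ** 2 for x, _ in pairs)
g10_hat = sxy / sxx

# 1) Grand mean intercept: average of the group intercepts implied by that slope.
intercepts = [y_bar - g10_hat * x_bar for x_bar, y_bar in centers.values()]
g00_hat = mean(intercepts)

# 4) Residual variance: squared deviations about the within-group lines.
ss_res = 0.0
for j, pairs in data.items():
    x_bar, y_bar = centers[j]
    ss_res += sum(((y - y_bar) - g10_hat * (x - x_bar)) ** 2 for x, y in pairs)
var_e_hat = ss_res / (N_GROUPS * N_PER_GROUP - N_GROUPS - 1)

# 3) Variance of intercepts: observed variance of the group intercepts,
#    less the part that is only sampling error in each group's mean.
var_u0_hat = variance(intercepts) - var_e_hat / N_PER_GROUP

print(f"1) grand mean intercept g00: {g00_hat:.2f}")
print(f"2) common slope g10:         {g10_hat:.2f}")
print(f"3) variance of intercepts:   {var_u0_hat:.2f}")
print(f"4) residual variance:        {var_e_hat:.2f}")
```

With this many groups the four printed values should land reasonably close to the assumed ones. The point is simply that quantities 1 and 2 describe the average regression line, quantity 3 describes how far group lines sit above or below it, and quantity 4 describes scatter of persons around their own group's line.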

Graphically, this model could be represented as follows