ALDA Chapter 5

Treating TIME more flexibly

Irregular Spacing

The spacing between times at which one individual is observed may not be the same as the spacing between times at which some other individual is observed.

For example, TIME=0, 1, 2 may not apply to everyone in the sample.

Reasons –

Design

People are late, get sick, have accidents, jobs, issues

Problem: Both regular GLM Repeated measures and Amos analyses assume equal spacing between times.

Data sets with variably spaced measurement occasions.

Data example in Ch 5 . . .

A small sample extracted from the Children of the National Longitudinal Study of Youth (CNLSY).

Dependent variable: Scores on the Peabody Individual Achievement Test (PIAT).

3 waves of data for 89 African-American children.

Initially observed in 1986 at age 6.

Second observation scheduled for 1988 at age 8.

Third observation scheduled for 1990 at age 10.

Three possible ways of identifying TIME in this data set.

WAVE is simply equally spaced integers representing the time period. If the actual ages of the kids at each observation time were equally spaced, WAVE could be used as the TIME variable with no difficulty in interpretation.

AGEGRP is the nominal age that each person was “supposed” to be at each time period. Note that it also is equally spaced and could be used interchangeably with WAVE.

AGE, on the other hand, is the actual age of the child, to the nearest month, when the observation was made. The AGE values are not equally spaced. They are the most precise representation of age.

MIXED, unlike GLM Repeated Measures and Amos, does not require that time periods be equally spaced.

This is the distribution of AGE-6.5 (that’s AGE minus 6.5). Since each child was not observed at precisely the correct age, the actual values of the TIME variable (AGE-6.5 in this case) will vary from child to child.

This is the distribution of AGEGRP-6.5. AGEGRP has exactly equal spacing between each age for each child so the distributions are spikes at 0, 2, and 4.

Following is a comparison of analyses involving the two types of representation of time – idealized and actual.

Mixed Model Analysis

[DataSet1] G:\MdbO\html\myweb\PSY5950C\ALDACh5_CNLSY_reading_pp.sav

mixed piat with cagegrp

/print = solution testcov /method=ml

/fixed = intercept cagegrp

/random = intercept cagegrp |subject(id) covtype(un).

mixed piat with cage

/print = solution testcov /method=ml

/fixed = intercept cage

/random = intercept cage |subject(id) covtype(un).


Information Criteria (a)
-2 Restricted Log Likelihood / 1819.780
Akaike's Information Criterion (AIC) / 1827.780
Hurvich and Tsai's Criterion (AICC) / 1827.934
Bozdogan's Criterion (CAIC) / 1846.099
Schwarz's Bayesian Criterion (BIC) / 1842.099
The information criteria are displayed in smaller-is-better forms.
a. Dependent Variable: piat.

Fixed Effects – cagegrp as the Level 1 time variable – idealized time

Estimates of Fixed Effects (a)
Parameter / Estimate / Std. Error / df / t / Sig. / 95% CI Lower Bound / 95% CI Upper Bound
Intercept / 21.162921 / .617747 / 88 / 34.258 / .000 / 19.935280 / 22.390563
cagegrp / 5.030899 / .297295 / 88 / 16.922 / .000 / 4.440087 / 5.621711
a. Dependent Variable: piat.

Fixed Effects – cage as the Level 1 time variable – actual time

Estimates of Fixed Effects (a)
Parameter / Estimate / Std. Error / df / t / Sig. / 95% CI Lower Bound / 95% CI Upper Bound
Intercept / 21.062143 / .562996 / 75.733 / 37.411 / .000 / 19.940774 / 22.183512
cage / 4.539923 / .262166 / 87.757 / 17.317 / .000 / 4.018904 / 5.060942
a. Dependent Variable: piat.

Note the rather large difference in slope estimates – 5.03 for cagegrp versus 4.54 for cage, about a 10% difference.

The difference is apparently due to the fact that cage values are more widely spread (see graph above) than are cagegrp values. The Y values are the same, so the slope will be shallower for the more widespread x-values.
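The spread argument can be checked with a small sketch. The scores and times below are made up for illustration (they are not CNLSY data): the same three scores are paired first with idealized times and then with slightly wider "actual" times, and an ordinary least-squares slope is computed for each. The last lines compute the relative difference between the two MIXED slope estimates reported above.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical scores for one child (illustration only, not CNLSY data).
scores = [20.0, 30.0, 40.0]
ideal_time = [0.0, 2.0, 4.0]     # like cagegrp: exactly 2 years apart
actual_time = [0.0, 2.2, 4.4]    # like cage: same order, wider spread

slope_ideal = ols_slope(ideal_time, scores)
slope_actual = ols_slope(actual_time, scores)
print(round(slope_ideal, 3), round(slope_actual, 3))

# Relative difference between the two MIXED slope estimates reported above.
rel_diff = (5.030899 - 4.539923) / 5.030899
print(round(rel_diff, 3))
```

With the same Y values, stretching the X values lowers the slope, which is the direction of the difference seen between the cagegrp and cage analyses.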

Covariance Parameters

CAGEGRP is time variable

Estimates of Covariance Parameters (a)
Parameter / Estimate / Std. Error / Wald Z / Sig. / 95% CI Lower Bound / 95% CI Upper Bound
Residual / 27.043071 / 4.053928 / 6.671 / .000 / 20.158374 / 36.279101
Intercept + cagegrp [subject = id], UN (1,1) / 11.427477 / 6.134238 / 1.863 / .062 / 3.990504 / 32.724498
Intercept + cagegrp [subject = id], UN (2,1) / 1.588536 / 2.089759 / .760 / .447 / -2.507316 / 5.684389
Intercept + cagegrp [subject = id], UN (2,2) / 4.485838 / 1.289609 / 3.478 / .001 / 2.553502 / 7.880448
a. Dependent Variable: piat.

CAGE is time variable

Estimates of Covariance Parameters (a)
Parameter / Estimate / Std. Error / Wald Z / Sig. / 95% CI Lower Bound / 95% CI Upper Bound
Residual / 27.400436 / 4.356643 / 6.289 / .000 / 20.063988 / 37.419476
Intercept + cage [subject = id], UN (1,1) / 5.507288 / 6.079511 / .906 / .365 / .632840 / 47.927150
Intercept + cage [subject = id], UN (2,1) / 2.304759 / 1.821561 / 1.265 / .206 / -1.265434 / 5.874952
Intercept + cage [subject = id], UN (2,2) / 3.376830 / 1.028103 / 3.285 / .001 / 1.859320 / 6.132878
a. Dependent Variable: piat.

The text does not present the full variance-covariance matrix of the random parameters. The values above are nearly identical to those presented in Table 5.2. The major difference between the two representations of time is in the variance of the intercepts: 11.43 with cagegrp versus 5.51 with cage.

Varying the number of measurement times from one person to the next, aka missing values.

ALDA 5.2, p. 146.

How many measurements must be taken on each person?

In general, we need at least 3 measurements on some of the people. This allows for realistic estimation of Var(eij). If we had no more than 2 measurements on each person, each person’s data would be fit perfectly by a straight line and Var(eij) would be 0. This would prevent statistical tests.

If there are some persons with 3 measures, then there can be others with fewer than 3, even as few as 1. Their data won’t contribute to Var(eij) but it will contribute to estimation of the fixed effects.
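A minimal sketch of why two measurements per person are not enough: a straight line through two points leaves no residual, so nothing is left from which to estimate Var(eij). The data values and the function name are hypothetical, chosen only for illustration.

```python
def line_fit_sse(times, scores):
    """Fit a least-squares line and return the residual sum of squares."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(scores) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    slope = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, scores)) / sxx
    intercept = mean_y - slope * mean_t
    return sum((y - (intercept + slope * t)) ** 2 for t, y in zip(times, scores))

# Hypothetical person with 2 waves: the line passes through both points.
sse_two = line_fit_sse([0.0, 2.0], [20.0, 31.0])

# Hypothetical person with 3 waves: the line cannot hit every point.
sse_three = line_fit_sse([0.0, 2.0, 4.0], [20.0, 31.0, 36.0])

print(sse_two, sse_three)
```

Only the three-wave person leaves residual variation, which is why at least some people need three or more measurements.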

The data to illustrate these concepts is wages.sav, shown in Table 5.3 on p. 147.

id exper lnw black hgc uerate

206.00 1.87 2.03 .00 10.00 9.20

206.00 2.81 2.30 .00 10.00 11.00

206.00 4.31 2.48 .00 10.00 6.30

332.00 .13 1.63 .00 8.00 7.10

332.00 1.63 1.48 .00 8.00 9.60

332.00 2.41 1.80 .00 8.00 7.20

332.00 3.39 1.44 .00 8.00 6.20

332.00 4.47 1.75 .00 8.00 5.60

332.00 5.18 1.53 .00 8.00 4.60

332.00 6.08 2.04 .00 8.00 4.30

332.00 7.04 2.18 .00 8.00 3.40

332.00 8.20 2.19 .00 8.00 4.40

332.00 9.09 4.04 .00 8.00 6.70

1028.00 .00 .87 1.00 8.00 9.30

1028.00 .04 .90 1.00 8.00 7.40

1028.00 .52 1.39 1.00 8.00 7.30

1028.00 1.48 2.32 1.00 8.00 7.40

1028.00 2.14 1.48 1.00 8.00 6.30

1028.00 3.16 1.71 1.00 8.00 5.90

1028.00 4.10 2.34 1.00 8.00 6.90

The data tracks high school dropouts over several years.

The time variable is exper – time in years to the nearest day since the participant dropped out of high school.

The dependent variable, lnw, is the logarithm of the participant's hourly wage in 1990 dollars.

The variable, black, is 1 if the respondent is black, 0 otherwise.

The variable, hgc is the highest grade completed. hgc_9 is hgc minus 9.

The analysis is quite straightforward.

The level 1 model is

Yij = p0j + p1j*EXPER + eij

Model A, the unconditional growth model, is (note z0j and z1j, in keeping with ALDA)

p0j = g00 + z0j

p1j = g10 + z1j

So the slope of each individual growth curve varies randomly across people.

The combined model, then, is

Yij = g00 + z0j + (g10 + z1j)*EXPER + eij

Yij = g00 + z0j + g10*EXPER + z1j*EXPER + eij

The syntax is

Mixed lnw with exper

/print = solution /method = ml

/fixed = intercept exper

/random = intercept exper | subject(id) covtype(un).

The output for Model A is

Estimates of Fixed Effects (a)
Parameter / Estimate / Std. Error / df / t / Sig. / 95% CI Lower Bound / 95% CI Upper Bound
Intercept / 1.715604 / .010797 / 803.249 / 158.903 / .000 / 1.694411 / 1.736797
exper / .045681 / .002342 / 545.978 / 19.509 / .000 / .041081 / .050280
a. Dependent Variable: lnw.

Covariance Parameters

Estimates of Covariance Parameters (a)
Parameter / Estimate / Std. Error
Residual / .095105 / .001944
Intercept + exper [subject = id], UN (1,1) / .054268 / .005001
Intercept + exper [subject = id], UN (2,1) / -.002915 / .000869
Intercept + exper [subject = id], UN (2,2) / .001726 / .000220
a. Dependent Variable: lnw.

The results are identical to those presented in Table 5.4, p. 149 in ALDA.

Note that there is no mention of the fact that different persons were measured on different numbers of time periods ranging from 1 to ??.

Model B – Is the wage trajectory affected by the highest grade the person completed before dropping out (hgc_9) and by whether or not the person is black (black)?

Level 1 Model

Yij = p0j + p1j*EXPER + eij

The Level 2 model: It will be assumed that hgc_9 and black affect both the intercept and slope, even though the primary interest is in the trajectory – the slope.

p0j = g00 + g01*hgc_9 + g02*black + z0j

p1j = g10 + g11*hgc_9 + g12*black + z1j

The combined model is

Yij = g00 + g01*hgc_9 + g02*black + z0j + (g10 + g11*hgc_9 + g12*black + z1j)*exper + eij

Yij = g00 + g01*hgc_9 + g02*black + z0j + g10*exper + g11*hgc_9*exper + g12*black*exper + z1j*exper + eij

The syntax is

Mixed lnw with exper hgc_9 black

/print=solution

/method=ml

/fixed = intercept exper hgc_9 hgc_9*exper black black*exper

/random = intercept exper |subject(id) covtype(un).

The arrows (not reproduced here) emphasize that each term in the composite model is represented in the syntax invoking the model.

Yij = g00 + g01*hgc_9 + g02*black+ z0j +g10*exper + g11*hgc_9*exper + g12*black*exper + z1j*exper + eij

The output

Estimates of Fixed Effects (a)
Parameter / Estimate / Std. Error / df / t / Sig. / 95% CI Lower Bound / 95% CI Upper Bound
Intercept / 1.717139 / .012542 / 799.383 / 136.907 / .000 / 1.692519 / 1.741758
exper / .049343 / .002632 / 514.460 / 18.750 / .000 / .044173 / .054513
hgc_9 / .034920 / .007881 / 837.509 / 4.431 / .000 / .019450 / .050390
exper * hgc_9 / .001279 / .001723 / 571.926 / .742 / .458 / -.002105 / .004664
black / .015395 / .023927 / 816.218 / .643 / .520 / -.031569 / .062360
exper * black / -.018213 / .005499 / 618.695 / -3.312 / .001 / -.029012 / -.007414
a. Dependent Variable: lnw.

Note that two effects are not significant (NS) in the above output:

1. The effect of black on the intercept.

2. The effect of hgc_9 on the slope.
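To see what these estimates imply, the fixed-effect part of the composite model can be evaluated directly. The sketch below plugs the Model B estimates from the table above into the composite equation with z0j = z1j = 0. The function names and the example inputs are illustrative choices, not part of the original analysis.

```python
# Fixed-effect estimates for Model B, copied from the table above.
G00, G01, G02 = 1.717139, 0.034920, 0.015395    # intercept, hgc_9, black
G10, G11, G12 = 0.049343, 0.001279, -0.018213   # exper, exper*hgc_9, exper*black

def slope_for(hgc_9, black):
    """Yearly change in log wage for a given level-2 profile."""
    return G10 + G11 * hgc_9 + G12 * black

def predicted_lnw(exper, hgc_9, black):
    """Population-average log wage (random effects set to zero)."""
    intercept = G00 + G01 * hgc_9 + G02 * black
    return intercept + slope_for(hgc_9, black) * exper

# Compare ninth-grade dropouts (hgc_9 = 0) who differ only on black.
print(round(slope_for(0, 0), 6), round(slope_for(0, 1), 6))
print(round(predicted_lnw(5, 0, 0), 4), round(predicted_lnw(5, 0, 1), 4))
```

The two profiles start at nearly the same intercept (the black effect on the intercept is NS) but their slopes differ by g12.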

Covariance Parameters

Estimates of Covariance Parameters (a)
Parameter / Estimate / Std. Error
Residual / .095194 / .001946
Intercept + exper [subject = id], UN (1,1) / .051748 / .004868
Intercept + exper [subject = id], UN (2,1) / -.002851 / .000844
Intercept + exper [subject = id], UN (2,2) / .001636 / .000214
a. Dependent Variable: lnw.

The variance estimates are little changed from the unconditional growth model, even though the intercepts were related to hgc_9 and the slopes were related to black.

Model C drops the two nonsignificant predictors.

Note that I’ve always been taught that if a variable is part of an interaction term, as black is, it should be kept in the model. I’m not sure why they dropped it here, although I’ll defer to their expertise.

Model C – dropping the two nonsignificant relationships from Model B

Level 1 Model

Yij = p0j + p1j*EXPER + eij

The Level 2 model: hgc_9 is now assumed to affect only the intercept, and black only the slope.

p0j = g00 + g01*hgc_9 + z0j

p1j = g10 + g12*black + z1j

The combined model is

Yij = g00 + g01*hgc_9 + z0j + (g10 + g12*black + z1j)*exper + eij

Yij = g00 + g01*hgc_9 + z0j +g10*exper + g12*black*exper + z1j*exper + eij

The syntax

Mixed lnw with exper hgc_9 black

/print=solution

/method=ml

/fixed = intercept exper hgc_9 black*exper

/random = intercept exper |subject(id) covtype(un).

Yij = g00 + g01*hgc_9 + z0j +g10*exper + g12*black*exper + z1j*exper + eij

Model C output

Mixed Model Analysis

[DataSet2] G:\MdbO\html\myweb\PSY5950C\ALDACh5_NLSY_wages_pp.sav

Fixed Effects

Estimates of Fixed Effects (a)
Parameter / Estimate / Std. Error / df / t / Sig. / 95% CI Lower Bound / 95% CI Upper Bound
Intercept / 1.721475 / .010697 / 808.322 / 160.929 / .000 / 1.700478 / 1.742472
exper / .048847 / .002513 / 562.835 / 19.435 / .000 / .043910 / .053784
hgc_9 / .038361 / .006433 / 849.979 / 5.963 / .000 / .025734 / .050988
exper * black / -.016115 / .004511 / 638.291 / -3.572 / .000 / -.024974 / -.007256
a. Dependent Variable: lnw.

Covariance Parameters

Estimates of Covariance Parameters (a)
Parameter / Estimate / Std. Error
Residual / .095174 / .001945
Intercept + exper [subject = id], UN (1,1) / .051831 / .004873
Intercept + exper [subject = id], UN (2,1) / -.002880 / .000845
Intercept + exper [subject = id], UN (2,2) / .001647 / .000214
a. Dependent Variable: lnw.

Many people say that if there is an interaction in the model, all of the variables that are part of that interaction must be in the model also. Thus, I reran the model with black included. The results:

Estimates of Fixed Effects (a)
Parameter / Estimate / Std. Error / df / t / Sig. / 95% CI Lower Bound / 95% CI Upper Bound
Intercept / 1.717310 / .012543 / 800.819 / 136.914 / .000 / 1.692689 / 1.741931
exper / .049351 / .002636 / 516.993 / 18.723 / .000 / .044173 / .054530
hgc_9 / .038295 / .006433 / 849.836 / 5.953 / .000 / .025669 / .050920
black / .015203 / .023931 / 816.078 / .635 / .525 / -.031770 / .062176
exper * black / -.018119 / .005505 / 621.304 / -3.291 / .001 / -.028931 / -.007308
a. Dependent Variable: lnw.

In this case, the significance results are the same, although the interaction coefficient changed by about 12% (from -.016 to -.018).

Time varying Covariates, Section 5.3, p. 159 – Characteristics that change from one time to the next.

The unemployment data set

Dependent variable is Center for Epidemiologic Studies Depression (CES-D) scale (Radloff, 1977).

Scale scores range from 0 (minimal symptoms) to 80 (maximum symptoms).

TIME is measured in Months to the nearest day.

TIME is the time of an interview in which the CES-D was administered.

A time-varying covariate, a characteristic of the persons – UNEMP – is 1 if the respondent was unemployed at the time of the interview or 0 if employed.

So UNEMP is not like gender, which (usually) remains the same across time periods; it is a characteristic that may vary from one time period to the next – a time-varying covariate.

As we’ll see, such a covariate must be treated as a Level 1 factor.

Two research questions . . .

1. Nature of the overall trajectories over time – do the unemployed become more or less depressed over time.

2. The relationship of depression to whether or not the respondents were unemployed at the time of their interview.

Model A – an unconditional growth model

Level 1

Yij = p0j + p1j*TIME + eij

Level 2

p0j = g00 + z0j

p1j = g10 + z1j

The composite model: Yij = g00 + z0j + g10*TIME + z1j*TIME + eij

The syntax

mixed cesd with months

/print=solution

/method=ml

/fixed= intercept months

/random=intercept months | subject(id) covtype(un).

The output . . .

Yij = g00 + z0j + g10*TIME + z1j*TIME + eij

Mixed Model Analysis

[DataSet1] G:\MdbO\html\myweb\PSY5950C\ALDA_Ch5_unemployment_pp.sav

Fixed Effects

Estimates of Fixed Effects (a)
Parameter / Estimate / Std. Error / df / t / Sig. / 95% CI Lower Bound / 95% CI Upper Bound
Intercept / 17.669363 / .775564 / 250.433 / 22.783 / .000 / 16.141904 / 19.196822
months / -.421994 / .082979 / 217.973 / -5.086 / .000 / -.585538 / -.258450
a. Dependent Variable: cesd.

Covariance Parameters

Estimates of Covariance Parameters (a)
Parameter / Estimate / Std. Error
Residual / 68.850189 / 6.602701
Intercept + months [subject = id], UN (1,1) / 86.848287 / 14.963139
Intercept + months [subject = id], UN (2,1) / -3.057239 / 1.384615
Intercept + months [subject = id], UN (2,2) / .355031 / .184494
a. Dependent Variable: cesd.

Note that depression lessens over time.

Model B – Adding a Level 1 time-varying covariate, UNEMPij

Level 1 Model

Yij = p0j + p1j*TIME + p2j*UNEMP + eij

Level 2 Model

p0j = g00 + z0j

p1j = g10 + z1j

p2j = g20

Composite: Yij = g00 + z0j + (g10 + z1j)*TIME + g20*UNEMP + eij

Yij = g00 + z0j + g10*TIME + z1j*TIME + g20*UNEMP + eij

Note that there is no random component associated with UNEMP.

The syntax . . .

Title Model B: Main effect of unemployment.

mixed cesd with months unemp

/print=solution

/method=ml

/fixed=intercept months unemp

/random=intercept months | subject(id) covtype(un).

The output

Model B: Main effect of unemployment

Yij = g00 + z0j + g10*TIME + z1j*TIME + g20*UNEMP + eij


Fixed Effects

Estimates of Fixed Effects (a)
Parameter / Estimate / Std. Error / df / t / Sig. / 95% CI Lower Bound / 95% CI Upper Bound
Intercept / 12.665595 / 1.242071 / 665.025 / 10.197 / .000 / 10.226742 / 15.104448
months / -.201983 / .093316 / 288.957 / -2.165 / .031 / -.385649 / -.018318
unemp / 5.111308 / .988844 / 464.745 / 5.169 / .000 / 3.168148 / 7.054467
a. Dependent Variable: cesd.

Covariance Parameters

Estimates of Covariance Parameters (a)
Parameter / Estimate / Std. Error
Residual / 62.387512 / 6.013229
Intercept + months [subject = id], UN (1,1) / 93.518880 / 14.820168
Intercept + months [subject = id], UN (2,1) / -3.894108 / 1.370258
Intercept + months [subject = id], UN (2,2) / .464706 / .179787
a. Dependent Variable: cesd.

Depression decreased over time.

Note two major changes . . .

1. The average change in depression shrank from about -.4/month to about -.2/month, suggesting that some of the decrease seen previously may have been due to respondents becoming employed again after the initial interview. Those differences in employment are controlled for in this analysis.

2. There is a substantial difference in mean depression between those unemployed at the time of interview and those employed at the time of interview – about 5 points.
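The two points above can be illustrated by evaluating the fixed-effect part of Model B. This is a sketch with the random effects set to zero; the coefficients are copied from the table above, while the interview months and the employment pattern are hypothetical.

```python
# Fixed-effect estimates for Model B, copied from the table above.
G00 = 12.665595   # intercept
G10 = -0.201983   # months
G20 = 5.111308    # unemp

def predicted_cesd(months, unemp):
    """Population-average CES-D score (random effects set to zero)."""
    return G00 + G10 * months + G20 * unemp

# Hypothetical respondent interviewed at months 1, 5 and 10 who is
# unemployed at the first two interviews and employed at the third.
pattern = [(1, 1), (5, 1), (10, 0)]
for months, unemp in pattern:
    print(months, unemp, round(predicted_cesd(months, unemp), 2))

# At any fixed month the unemployed-minus-employed gap equals G20.
gap = predicted_cesd(5, 1) - predicted_cesd(5, 0)
print(round(gap, 3))
```

Because UNEMP can change between interviews, a person's predicted score drops by g20 at the interview where employment resumes, on top of the steady g10 change per month.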

Model C – Looking at the interaction of unemployment status and time – a Level 1 interaction.

Level 1 model

Yij = p0j + p1j*TIME + p2j*UNEMP + p3j*TIME*UNEMP + eij

Level 2 models

p0j = g00 + z0j

p1j = g10 + z1j

p2j = g20 (no random residual)

p3j = g30 (no random residual)

Composite model: Yij = g00 + z0j + (g10 + z1j)*TIME + g20*UNEMP + g30*TIME*UNEMP + eij

Yij = g00 + z0j + g10*TIME + z1j*TIME + g20*UNEMP + g30*TIME*UNEMP + eij

The syntax . . .

Title Model C: Effect of unemployment on initial status and growth rate.

mixed cesd with months unemp

/print=solution

/method=ml

/fixed=intercept months unemp unemp*months

/random=intercept months | subject(id) covtype(un).

Model C: Yij = g00 + z0j + g10*TIME + z1j*TIME + g20*UNEMP + g30*TIME*UNEMP + eij

Mixed Model Analysis

Fixed Effects

Estimates of Fixed Effects (a)
Parameter / Estimate / Std. Error / df / t / Sig. / 95% CI Lower Bound / 95% CI Upper Bound
Intercept / 9.616744 / 1.889308 / 372.482 / 5.090 / .000 / 5.901697 / 13.331792
months / .162036 / .193662 / 436.474 / .837 / .403 / -.218591 / .542662
unemp / 8.529059 / 1.877874 / 278.749 / 4.542 / .000 / 4.832443 / 12.225674
months * unemp / -.465222 / .217215 / 427.868 / -2.142 / .033 / -.892162 / -.038282
a. Dependent Variable: cesd.

Covariance Parameters

Estimates of Covariance Parameters (a)
Parameter / Estimate / Std. Error
Residual / 62.031186 / 5.965541
Intercept + months [subject = id], UN (1,1) / 93.713196 / 14.777097
Intercept + months [subject = id], UN (2,1) / -3.873161 / 1.358797
Intercept + months [subject = id], UN (2,2) / .451209 / .177345
a. Dependent Variable: cesd.

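Note that the months coefficient is not significant in this model. Because the model contains the months*unemp interaction, that coefficient is the slope when UNEMP = 0, that is, for respondents employed at the time of interview. It is not an overall change in depression over time.

Note the months*unemp interaction. It suggests that depression does change over time, but that the change depends on whether the respondent was unemployed or not.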
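The interaction can be unpacked by computing the intercept and slope separately for UNEMP = 0 and UNEMP = 1. This is a small sketch using the Model C fixed-effect estimates from the table above; the function names are illustrative.

```python
# Fixed-effect estimates for Model C, copied from the table above.
G00, G10 = 9.616744, 0.162036     # intercept, months
G20, G30 = 8.529059, -0.465222    # unemp, months*unemp

def intercept_for(unemp):
    """Predicted CES-D at months = 0 for the given employment status."""
    return G00 + G20 * unemp

def slope_for(unemp):
    """Predicted change in CES-D per month for the given employment status."""
    return G10 + G30 * unemp

for unemp, label in [(0, "employed"), (1, "unemployed")]:
    print(label, round(intercept_for(unemp), 3), round(slope_for(unemp), 3))
```

The slope for the employed is the (nonsignificant) months coefficient, while the slope for the unemployed is g10 + g30, which is negative.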

The text discusses this in Figure 5.4.

We compared Models B and C here. The difference in the relationship of depression to time between those employed and those unemployed is shown in the center of the figure.