ESM1: Measurement Integration 1

Electronic Supplementary Material 1

Measurement Integration

We created commensurate measures of alcohol use and of socioeconomic status (SES) based on items drawn from four independent longitudinal intervention studies. Our method of measurement integration was that described by Bauer and Hussong (1) and Curran et al. (2) and involved application of the moderated nonlinear factor analysis (MNLFA) model. The MNLFA model allows for the creation of scale scores that account for differences in both the latent factor and individual items as a function of study or other important covariates (1). MNLFA allows for the simultaneous testing of multiple moderators, which may be continuous or categorical, and for items of different scale types (continuous, count, discrete); these are important advantages over traditional psychometric models (1).

The four studies integrated were the Health Improvement Project-Rochester (HIP-R; 3), the HIP-R2 (4), In the Know (ITK; 5), and Project ACHIEVE (Alcohol Counseling and HIV Intervention; 6). These studies were appropriate for integrative data analysis (IDA) because they sampled similar populations (patients at sexually transmitted infection [STI] clinics), assessed similar constructs using some common and some unique items, and had overlap in the times at which their follow-ups were collected. Participants in these studies (N = 3,752; study ns = 1,536, 1,009, 602, 605) were enrolled in trials with baseline assessments as well as follow-ups at 3, 6, 9, and/or 12 months. Collectively, the four studies administered alcohol use items across 13,200 total individual assessments.

Alcohol Use

Method

Measures

Alcohol use. Multiple self-report items assessed alcohol use within each study. Core indicators of alcohol use assessed across studies included frequency of alcohol use, engagement in heavy episodic drinking, and drinks per drinking day. Supplementary indicators included peak alcohol consumption, drinks per week, and frequency of intoxication.

All studies assessed frequency of drinking over the past 3 months. In the HIP-R2, ITK, and ACHIEVE, frequency measures were recoded to 5 categories representing approximately the following: never, monthly, weekly, several times a week, and most days.[1] One half of the HIP-R2 participants completed a measure of past-year drinking frequency (rather than past 3 months) at baseline only. In the HIP-R, the assessment of frequency was days per week of drinking (0-7).

All studies assessed heavy episodic drinking (HED, past 3 months); measures were dichotomized to indicate any engagement in HED.[2] The HIP-R and a HIP-R2 assessment completed by one half of the sample defined HED as 5+ drinks for men and 4+ for women. The HIP-R2 assessment completed by the other half of the sample defined HED as 5+ drinks for men and women and assessed HED in the past year at baseline only. ITK defined HED as 6+ drinks for men and women. ACHIEVE defined HED as 6+ drinks for men and 4+ for women.

All studies assessed drinks per drinking day (past 3 months); measures were dichotomized to indicate 3+ drinks per drinking day.[3] One half of the HIP-R2 participants completed a measure of drinks per drinking day in the past year at baseline only.

Two supplementary measures were completed by participants in the HIP-R2: a continuous measure of drinks per week (past 3 months) and, for half the sample, a measure of intoxication (past 3 months) which was recoded to indicate frequencies of never, once a month or less, or more than once a month.[4]

Moderating factors. Study membership, sex, and time point were considered as moderators. Some alcohol use measures differed in the two halves of the HIP-R2; thus, the two halves of the study are treated separately in the MNLFA model (as HIP-R2A and HIP-R2B).

Data Analysis

Given the assumption of independence in the MNLFA model, a calibration sample consisting of a single randomly selected observation per participant was selected. After assuring the alcohol use construct was unidimensional in each study and for the entire sample, using SAS NLMIXED, we first tested for differences in the factor means and variances based on study, sex, and time point, as well as the interactions between study and sex, study and time point, and sex and time point. Accounting for differences in factor means and variances, we tested for item intercept and loading differences on an item-by-item basis, again including interaction terms. All significant moderators were then included in a final complete MNLFA model. To derive scale scores, we used the complete set of parameter estimates from this final model to score the full, integrated data set with all repeated assessments, generating maximum a posteriori (MAP) scores for alcohol use for future hypothesis testing. These scale scores simultaneously account for differences in the factor mean and variance, item intercepts, and factor loadings due to study membership, sex, and time point.
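The calibration-sample step can be illustrated with a minimal sketch. It assumes long-format data with one record per assessment; the field names `participant_id` and `month` and the toy records are hypothetical, and this is not the code used in the original analysis (which was run in SAS).

```python
import random
from collections import defaultdict


def draw_calibration_sample(records, seed=0):
    """Pick one randomly chosen assessment per participant.

    `records` is a list of dicts; the keys "participant_id" and "month"
    are hypothetical names used only for this illustration.
    """
    rng = random.Random(seed)
    by_participant = defaultdict(list)
    for record in records:
        by_participant[record["participant_id"]].append(record)
    # One independent draw per participant, so no person contributes
    # more than one row to the calibration sample.
    return [rng.choice(rows) for rows in by_participant.values()]


# Toy data: three participants with 3, 1, and 2 assessments.
records = [
    {"participant_id": "a", "month": 0},
    {"participant_id": "a", "month": 3},
    {"participant_id": "a", "month": 6},
    {"participant_id": "b", "month": 0},
    {"participant_id": "c", "month": 0},
    {"participant_id": "c", "month": 12},
]
calibration = draw_calibration_sample(records, seed=1)
```

The resulting sample has exactly one row per participant, which is what the independence assumption requires; the full data set with all repeated assessments is only used later, at the scoring stage.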

Results

Descriptive Analyses and Dimensionality Testing

As shown in Table 1, all alcohol use items were correlated with one another in all samples, rs = .49-.71, ps < .001.

First, we determined dimensionality of the alcohol use items within the calibration samples from each of the four studies. For this exploratory step only, we transformed the two count indicators (drinking days per week and drinks per week) with a natural log transformation due to restrictions on exploratory factor analysis (EFA) with count indicators. We initially estimated a nonlinear EFA in each calibration sample using the WLSMV estimator in Mplus. A model with one factor was supported in each sample, with all items loading highly (all ps < .0001).

We next reestimated the nonlinear EFA in the pooled calibration sample. Because of missing data (i.e., items not assessed in certain studies), we here used the MLR estimator, which provides factor loadings and standard errors but not traditional fit indices. A one-factor solution was again supported, with all items significantly loading on the factor. The factor loadings indicated that all items contributed significantly to defining the alcohol use factor (all ps < .0001; see Table 2).

Testing for Factor and Item Differences Through MNLFA

We next fitted MNLFA models to the calibration sample to test for differences in the factor means, factor variances, item intercepts, and item factor loadings as a function of study membership, sex, and time point. An initial model tested whether these covariates predicted mean and variance differences in the latent factor of alcohol use. To define the scale of the latent factor, the conditional mean and variance of the factor were fixed at 0 and 1, respectively, when all covariates were equal to zero.
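For orientation, the moderated parameters described above can be written in the form commonly used for MNLFA. The symbols below are assumptions that follow common MNLFA notation; they do not appear in this supplement. Here $x_{ik}$ is covariate $k$ for person $i$, $\eta_i$ is the latent factor, and $g_j$ is the link function appropriate to item $j$.

```latex
\begin{align}
\eta_i &\sim N(\alpha_i, \psi_i) \\
\alpha_i &= \alpha_0 + \sum_k \gamma_k x_{ik}, \qquad \alpha_0 = 0 \\
\psi_i &= \psi_0 \exp\Big(\sum_k \beta_k x_{ik}\Big), \qquad \psi_0 = 1 \\
\nu_{ij} &= \nu_{0j} + \sum_k \kappa_{jk} x_{ik} \\
\lambda_{ij} &= \lambda_{0j} + \sum_k \omega_{jk} x_{ik} \\
g_j(\mu_{ij}) &= \nu_{ij} + \lambda_{ij} \eta_i
\end{align}
```

Setting $\alpha_0 = 0$ and $\psi_0 = 1$ corresponds to fixing the conditional factor mean and variance at 0 and 1 when all covariates equal zero.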

Accounting for differences in the latent factor mean and variance, we tested whether covariates predicted intercept and factor loading differences in the specific items. We tested for item intercept and loading differences on an item-by-item basis. We initially tested two-way interactions between covariates but found no significant interactions (all ps > .05) and thus dropped these terms. We used the Benjamini-Hochberg procedure (7, 8) to control the false discovery rate. We then estimated a full model that included all significant covariate effects for all items. From this full model, we removed those effects that were no longer significant in the combined model.
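As a reminder of how the Benjamini-Hochberg step-up rule works, here is a minimal sketch. The false discovery rate level `q = 0.05` and the example p-values are assumptions for illustration only; the supplement does not report the level used or the individual p-values.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the test is rejected at FDR level q."""
    m = len(p_values)
    # Indices of the p-values in ascending order of p.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * q.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values from eight item-level tests.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
flags = benjamini_hochberg(p_values, q=0.05)
```

With these hypothetical values only the two smallest p-values are retained as significant, even though five of the eight fall below an unadjusted .05 cutoff.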

The results from the final model incorporating both factor and item differences are presented in Table 3 (factor mean and variance) and Table 4 (item intercepts and factor loadings). The significant effects of study membership on the factor mean of alcohol use indicated that participants in ITK reported lower levels of alcohol use (estimate = -0.31, t = -4.19) and participants in one half of the HIP-R2 and in ACHIEVE reported higher levels of alcohol use (estimate = 0.14, t = 2.09 and estimate = 0.74, t = 8.37, respectively) than did those in the HIP-R. Additionally, men reported higher levels of alcohol use than did women (estimate = 0.44, t = 9.06). Mean levels of alcohol use decreased over time (estimate = -0.05, t = -8.43).

Common items functioned differently across study and sex, but not across time. All significant differences were in item intercepts, reflecting differences in probability of endorsing these items at equivalent levels of underlying alcohol use. In terms of drinks per drinking day (Figure 1), on average, higher levels of underlying alcohol use were required for those in the HIP-R and HIP-R2 (conducted in Central New York) to indicate they consumed 3+ drinks per day compared with those in ITK and ACHIEVE (conducted in Milwaukee, WI). For HED, higher levels of underlying alcohol use were required for those in the HIP-R2B and ACHIEVE to indicate HED as compared to those in the other studies (Figure 2) and for men to indicate HED as compared to women (perhaps reflecting differential definitions of HED for men and women in some studies; Figure 3). For drinking frequency, higher levels of underlying alcohol use were required for those in all other studies to endorse each level of drinking frequency as compared to the HIP-R2B (Figure 4). The drinking frequency item was invariant across the HIP-R2A, ITK, and ACHIEVE; the drinks per drinking day item across the HIP-R and the HIP-R2 and across ITK and ACHIEVE; and the HED item across the HIP-R, the HIP-R2A, and ITK and across the HIP-R2B and ACHIEVE. Thus, all studies were linked by some items functioning identically.
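To make the interpretation of an intercept difference concrete, the sketch below assumes a logit link for a binary item (a common choice in MNLFA, but an assumption here) and uses hypothetical intercept and loading values that are not taken from Table 4. It shows that a lower intercept raises the level of the underlying factor needed to reach a given endorsement probability.

```python
import math


def endorsement_probability(eta, intercept, loading):
    """P(item endorsed | factor level eta) under an assumed logit link."""
    return 1.0 / (1.0 + math.exp(-(intercept + loading * eta)))


def threshold_level(intercept, loading):
    """Factor level at which the endorsement probability equals .50."""
    return -intercept / loading


# Hypothetical values: same loading, different intercepts for two groups.
loading = 1.5
intercept_reference = 0.0
intercept_shifted = -0.75

level_reference = threshold_level(intercept_reference, loading)
level_shifted = threshold_level(intercept_shifted, loading)
# At the same factor level, the shifted group is less likely to endorse.
p_reference = endorsement_probability(0.0, intercept_reference, loading)
p_shifted = endorsement_probability(0.0, intercept_shifted, loading)
```

In this toy case the group with the lower intercept needs a factor level half a standard deviation higher to reach a 50% chance of endorsing the item, which is the sense in which "higher levels of underlying alcohol use were required."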

Estimating Final Alcohol Use Scores in the Integrated Longitudinal Data Set

The MAPs over time by study are presented in Figure 5. Originally, the conditional mean and variance of these scores were fixed at 0 and 1, respectively, when all covariates were equal to 0 (thus, for women in the HIP-R at baseline). In order to aid in interpretation of the final scores, scores were rescaled such that the overall sample mean and variance at baseline were 0 and 1, respectively. Thus, alcohol use scores can be interpreted as being in standard deviation units with reference to baseline drinking in the entire integrated sample.
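The rescaling described here is a linear transformation of the scores. A minimal sketch follows, assuming the baseline assessments are flagged and using the population standard deviation; the choice of population versus sample standard deviation, the function name, and the toy scores are assumptions for illustration.

```python
from statistics import fmean, pstdev


def rescale_to_baseline(scores, is_baseline):
    """Center and scale all scores by the baseline mean and SD."""
    baseline = [s for s, b in zip(scores, is_baseline) if b]
    center = fmean(baseline)
    spread = pstdev(baseline)  # population SD; an assumption in this sketch
    # The same shift and scale are applied to every assessment, so
    # follow-up scores stay comparable to baseline scores.
    return [(s - center) / spread for s in scores]


# Hypothetical MAP scores: three baseline and three follow-up assessments.
scores = [-0.4, 0.2, 0.9, -0.6, 0.1, 0.5]
is_baseline = [True, True, True, False, False, False]
rescaled = rescale_to_baseline(scores, is_baseline)
```

After rescaling, the baseline scores have mean 0 and standard deviation 1, and a follow-up score of, say, -0.5 reads as half a baseline standard deviation below the baseline mean.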

Socioeconomic Status

Method

Measures

Multiple self-report items assessed SES within each study. There were dichotomous indicators of both education (0 = high school or less, 1 = more than high school) and employment (0 = unemployed, 1 = employed full- or part-time) available in all studies. Studies assessed income using different categorical measures. The HIP-R and HIP-R2 assessed annual income on a 4-point scale (1 = less than $15,000, 2 = $15,000 to $30,000, 3 = $30,000 to $45,000, 4 = more than $45,000). ITK assessed annual income on a different 4-point scale (0 = $0 to $10,000 per year, 1 = $11,000 to $20,000 per year, 2 = $21,000 to $30,000 per year, 3 = over $30,000 per year). ACHIEVE assessed monthly income on a 10-point scale (0 = $0 to $399 a month, 1 = $400 to $599 a month, 2 = $600 to $799 a month, 3 = $800 to $999 a month, 4 = $1000 to $1199 a month, 5 = $1200 to $1399 a month, 6 = $1400 to $1599 a month, 7 = $1600 to $1799 a month, 8 = $1800 to $1999 a month, 9 = $2000 or more a month).

Moderating factors. Study membership, sex, age (standardized), and race (White vs. non-White) were considered as moderators.[5]

Data Analysis

After assuring the SES construct was unidimensional in each study and for the entire sample, using SAS NLMIXED, we first tested for differences in the factor means and variances based on study membership, sex, age, and race. Accounting for differences in factor means and variances, we tested for item intercept and loading differences on an item-by-item basis. All significant moderators were then included in a final complete MNLFA model, and maximum a posteriori (MAP) scores for SES were generated for future hypothesis testing. These scale scores simultaneously account for differences in the factor mean and variance, item intercepts, and factor loadings due to study membership, sex, age, and race.

Results

Descriptive Analyses and Dimensionality Testing

As shown in Table 5, in general, indicators of SES were correlated with one another across studies, rs = .17-.50, ps < .001 (in ITK, education and employment were not correlated, r = .07, p = .11).

First, we assured that the three SES indicators loaded on a single factor within each of the four studies through a confirmatory factor analysis (CFA) using the WLSMV estimator in Mplus. A single factor was supported in each sample, with all items loading highly (all ps < .003).

We next estimated a CFA in the pooled calibration sample. Because of missing data (i.e., items not assessed in certain studies), we here used the MLR estimator, which provides factor loadings and standard errors but not traditional fit indices. A single factor was again supported, with all items significantly loading on the factor. The factor loadings indicated that all items contributed significantly to defining the SES factor (all ps < .0001; see Table 6).

Testing for Factor and Item Differences Through MNLFA

We next fitted MNLFA models to the calibration sample to test for differences in the factor means, factor variances, item intercepts, and item factor loadings as a function of study membership, sex, age, and race. An initial model tested whether these covariates predicted mean and variance differences in the latent factor of SES. To define the scale of the latent factor, the conditional mean and variance of the factor were fixed at 0 and 1, respectively, when all covariates were equal to zero.

Accounting for differences in the latent factor mean and variance, we tested whether covariates predicted intercept and factor loading differences in the specific items. We tested for item intercept and loading differences on an item-by-item basis. We used the Benjamini-Hochberg procedure (7, 8) to control the false discovery rate. We then estimated a full model that included all significant covariate effects for all items. In this full model, we removed those contributing effects that were no longer significant in the combined model.

The results from the final model incorporating both factor and item differences are presented in Table 7 (factor mean and variance) and Table 8 (item intercepts and factor loadings). The significant effects of study membership on the factor mean of SES indicated that participants in ACHIEVE were of lower SES (estimate = -1.33, t = -4.35). Additionally, men were of higher SES than were women (estimate = 0.35, t = 5.00). Non-White participants had lower SES (estimate = -0.80, t = -8.07), and SES decreased with age (estimate = -0.13, t = -3.80).

Additionally, items functioned differently dependent on study membership, sex, age, and race. Most differences were in item intercepts, reflecting differences in probability of endorsing these items at equivalent levels of underlying SES, although there were also some differences in item loadings, reflecting a differential strength of the relationship between indicators and underlying SES. In terms of education (Figure 6), on average, higher levels of underlying SES were required for those in ITK and ACHIEVE (conducted in Milwaukee, WI) to report greater than a high school education than for those in the HIP-R and HIP-R2 (conducted in Central New York). Additionally, educational attainment was less strongly associated with SES in ACHIEVE, which recruited hazardous drinkers. Also related to education, non-White participants (Figure 7), men (Figure 8), and younger participants (Figure 9) were less likely to have more than a high school education than were White participants, women, and older participants at equivalent levels of SES.

In terms of employment, being employed was less strongly associated with underlying SES in ITK than in the other three studies (Figure 10). Finally, in terms of income, in the HIP-R and HIP-R2, women and non-White participants reported lower levels of income at equivalent underlying levels of SES than did men and White participants (Figures 11 and 12). Income was more strongly related to underlying SES for older participants in these studies (Figure 13). In ACHIEVE, women reported higher levels of income than did men at the lowest levels of SES but lower levels of income than did men at above-average levels of SES (Figure 14). Additionally, higher levels of SES were required for younger participants to report equivalent levels of income to older participants (Figure 15). The education item was invariant across the HIP-R and HIP-R2 and across ITK and ACHIEVE, while the employment item was invariant across the HIP-R, HIP-R2, and ACHIEVE. There was also additional partial invariance. Thus, all studies were linked by some items functioning identically.

Estimating Final SES Scores in the Integrated Data Set

The MAPs by study are presented in Figure 16. Originally, the conditional mean and variance of these scores were fixed at 0 and 1, respectively, when all covariates were equal to 0 (thus, for White women of average age in the HIP-R). In order to aid in interpretation of the final scores, scores were rescaled such that the overall sample mean and variance were 0 and 1, respectively. Thus, SES scores can be interpreted as being in standard deviation units with reference to the entire integrated sample.


Table 1.

Correlations Between Alcohol Use Indicators by Study.

HIP-R
Indicator / Freq. Drink. / 3+ Drinks / HED / DPW
3+ Drinks / .57***
HED / .57*** / .59***
DPW / -- / -- / --
Freq. Intox. / -- / -- / -- / --

HIP-R2
Indicator / Freq. Drink. / 3+ Drinks / HED / DPW
3+ Drinks / .55***
HED / .58*** / .61***
DPW / .61*** / .52*** / .51***
Freq. Intox. / .59*** / .53*** / .62*** / .49***

ITK
Indicator / Freq. Drink. / 3+ Drinks / HED / DPW
3+ Drinks / .64***
HED / .63*** / .62***
DPW / -- / -- / --
Freq. Intox. / -- / -- / -- / --

ACHIEVE
Indicator / Freq. Drink. / 3+ Drinks / HED / DPW
3+ Drinks / .66***
HED / .61*** / .71***
DPW / -- / -- / --
Freq. Intox. / -- / -- / -- / --

Note. Freq. Drink. = drinking frequency (past 3 months); 3+ Drinks = 3+ drinks per drinking day (past 3 months); HED = heavy episodic drinking (past 3 months); DPW = drinks per week (past 3 months); Freq. Intox. = frequency of intoxication (past 3 months); HIP-R = Health Improvement Project-Rochester; HIP-R2 = Health Improvement Project-Rochester 2; ITK = In the Know; ACHIEVE = Alcohol Counseling and HIV Intervention.