Testing Latent Variable Models with Survey Data

TABLE OF CONTENTS FOR THIS SECTION

As a compromise between formatting and download time, the chapters below are in Microsoft Word. You may want to use "Find" to go to the chapters below by pasting the chapter title into the "Find what" window.

FOREWORD

INTRODUCTION

THEORETICAL MODEL TESTING STUDIES

STEP I IN UNOBSERVED VARIABLE-SURVEY DATA (UV-SD) MODEL TESTING-- DEFINING MODEL CONCEPTS

FIRST-ORDER CONSTRUCTS

SECOND-ORDER CONSTRUCTS

INTERACTIONS AND QUADRATICS

DEFINING CONCEPTS

IDENTIFYING IMPORTANT ANTECEDENTS

SUGGESTIONS FOR STEP I-- DEFINING MODEL CONCEPTS

© 2002 Robert A. Ping, Jr. 9/20/02

FOREWORD

This book critically reviews the process of testing (or validating, as it is sometimes called) theoretical models involving unobserved or latent variables and survey data, and selectively suggests improvements in this process. Because I am a captive of my discipline, a branch of social science research that investigates socio-economic exchanges between organizations (also known as marketing channels research), the book's examples and many of its comments about theoretical model testing practices using survey data involve research in Marketing. Nevertheless, because research in Marketing follows the same conventions and practices for theoretical model testing as the other branches of the social sciences, this book and its suggestions have application across the social sciences. Thus, this book is intended for researchers who test theoretical models involving latent variables[1] and survey data, and its purpose is to help these researchers reliably test these models with survey data.

I am a theory tester, and my first experience with covariance structure analysis or structural equation analysis (I shall use the more popular term structural equation analysis) was while testing a theoretical model with multiple dependent or endogenous variables and survey data. After trying Ordinary Least Squares regression, then Canonical Correlation, to estimate the path coefficients in my structural model, I settled on using structural equation analysis, specifically LISREL, for its ability to jointly estimate simultaneous equations and its ability to model the effects of measurement error. I was subsequently surprised by how difficult structural equation analysis was to use. In those days LISREL was available only on a mainframe computer. Using it to estimate path coefficients in a structural model required learning the LISREL programming "language," then writing and debugging a comparatively lengthy computer program using this language. LISREL is now available on PCs, and programming has been somewhat simplified (see Hayduk, 1996:xiii) with its SIMPLIS code (LISREL program) generator.

There were (and are) other structural equation analysis programs besides LISREL[2] (e.g., EQS and AMOS), but in comparison to OLS regression, for example, structural equation analysis still seems as much an "art" as an estimation technique. Thus, this book is also intended to help make analyzing survey data using structural equation analysis a little easier.

The book is organized around the process of testing theoretical models involving latent variables and survey data (i.e., first define the model constructs; then state the relationships among these constructs, develop appropriate measures of the constructs, and gather data using these measures; next validate these measures and test the stated relationships among the constructs). It brings together what is known about this process, and it selectively adds to this body of knowledge. The book begins with a discussion of the process of testing theoretical models involving latent variables and survey data, then it details each step in this process using several examples involving real-world survey data.

Although the book assumes the reader is familiar with the terminology of covariance structure analysis or structural equation analysis, and with a software package for the analysis of structural equations (e.g., LISREL, EQS, or AMOS), I have tried to make it as accessible as possible.

The list of those I should thank in this book is long and certainly incomplete. My first exposure to structural equation analysis was while working with Bob Dwyer at the University of Cincinnati. Neil Ritchie, also at UC, helped refine that first exposure. My thinking about structural equations has been heavily influenced by the writings of James Anderson, Richard Bagozzi, David Gerbing and John Hunter; Peter Bentler; Kenneth Bollen; Michael Browne and Robert Cudeck; John Fox and Michael Sobel; Leslie Hayduk; Karl Jöreskog and Dag Sörbom; and John Kenny.

This book is on my web site for several reasons. It allows me to use my recent experiences to periodically revise the book without the rigors of publishing a revised edition. Because the book is searchable using standard "Find" functions, I did not have to spend time building an index, and this searchability seems to make it more useful than a printed version. However, for several reasons the book is in Microsoft Word, rather than Acrobat Reader (PDF) or HTML, so it may download rather slowly. It was also "auto" formatted from WordPerfect to Word, and so, in addition to my own errors of omission and commission, there are probably reformatting errors.

If you see anything you like, or dislike, errors, etc., please e-mail me with the details.

Robert A. Ping, Jr.

Department of Marketing

Wright State University

Dayton, Ohio 45435-0001

INTRODUCTION

This monograph selectively suggests improvements in the process of testing theoretical models involving unobserved variables[3] and survey data. Because my social science research involves theoretical research in Marketing, the book's examples and much of its discussion of theoretical model testing is focused there. Nevertheless, because the conventions and processes of theoretical model testing using survey data are generally the same across the social sciences, the book's suggestions have application throughout the social sciences. Thus, the book is intended for researchers in the social sciences who test theoretical models involving latent variables and survey data.

The suggestions in the monograph are prompted by two reviews: a review that qualitatively judged a sample of substantive articles in the social sciences for compliance with generally accepted procedures for testing latent variables using survey data, and a review of the recent social science methods literature.[4] The book also selectively extends recent results in the methods literature, and proposes novel applications of several others. It provides numerous explanations and examples, and overall is intended as a contribution to continuous improvement in the use of generally accepted procedures for theoretical model tests involving unobserved variables and survey data in the social sciences.

THEORETICAL MODEL TESTING STUDIES

Perhaps Bollen (1989:268) best states the objective of theoretical model testing studies that involve unobserved variables and survey data: "In virtually all cases we do not expect to have a completely accurate description of reality. The goal is more modest. If the model... helps us to understand the relations between variables and does a reasonable job of matching (fitting) the data, we may judge it (the model) as partially validated. The assumption that we have identified the exact process generating the data would not be accepted."

Reasonableness or adequacy in model testing studies involving unobserved variables and survey data is addressed by first determining measure adequacy, then determining model adequacy. Measure adequacy is typically determined using conceptual definitions of the unobserved concepts, observed items that "tap into" or measure the unobserved concepts, and, increasingly, model-to-data fit and parameter estimates from measurement models that utilize structural equation analysis. Model adequacy is determined using hypotheses, and model-to-data fit and parameter estimates from structural models that utilize structural equation analysis.

Specifically, social science researchers appear to agree that specifying and testing models using unobserved variables with multiple item measures of these unobserved variables and survey data (UV-SD models) involve i) defining model constructs, ii) stating relationships among these constructs, iii) developing appropriate measures of these constructs, iv) gathering data using these measures, v) validating these measures, and vi) validating the model (i.e., testing the stated relationships among the constructs). However, based on the articles I reviewed, there also appears to be considerable latitude in some cases, and confusion in others, regarding how these steps are carried out in UV-SD model tests.

For example, in response to calls for increased psychometric attention to measures in theoretical model tests in Marketing (e.g., Churchill, 1979; Churchill and Peter, 1984; Cote and Buckley, 1987, 1988; Heeler and Ray, 1972; Peter, 1979, 1981; Peter and Churchill, 1986; among others), reliability and validity now receive more attention in these tests, when compared to the study results. However, the articles reviewed exhibited significant variation in what constitutes an adequate demonstration of valid and reliable measures when unobserved variables and survey data were involved. For instance, in some articles steps v) (measure validation) and vi) (model validation) involved separate data sets. In other articles a single data set was used to validate both the measures and the model. Further, in some articles the reliabilities of measures used in previous studies were reassessed. In other articles, however, reliabilities were assumed to be constants that, once assessed, should be invariant in subsequent studies. Similarly, in some articles many facets of validity for each measure were examined, even for previously used measures. In other articles few facets of measure validity were examined, and validities for existing measures were also assumed to be constants (i.e., once judged acceptably valid, a measure should be acceptably valid in subsequent studies).

Further, methodologists in the social sciences have long warned about regression's potential for coefficient bias and sample-to-sample coefficient variation (inefficiency) because of measurement error (Bohrnstedt and Carter, 1971; see Rock, Werts, Linn and Jöreskog, 1977; Warren, White and Fuller, 1974; and demonstrations in Cohen and Cohen, 1983). Nevertheless, based on the articles I reviewed, regression still appears to be generally acceptable in some areas as an estimation technique for survey data with variables that contain measurement error. In addition, although many of the studies I reviewed acknowledged the risk of generalizing from a single study,[5] in general there was little subsequent concern about the appropriateness or amount of generalizing from a single study. Because there are other examples, major and minor, such as little apparent concern about violations of the assumptions underlying the estimation techniques used in UV-SD model tests (e.g., the use of ordinal data with covariance structure analysis, which assumes continuous data), it seems fair to say there appear to be fewer generally accepted principles of model validation using unobserved variables and survey data than there could be.[6]
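
The coefficient bias from measurement error mentioned above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not a procedure from the works cited: it assumes classical measurement error (random, and uncorrelated with the true score and with the dependent variable), a single predictor, and normally distributed scores. The function names, sample size, and variances are hypothetical choices made for illustration. Under these assumptions the regression slope on the error-laden predictor is expected to shrink toward zero by a factor equal to the predictor's reliability (here 1/(1 + 1) = 0.5).

```python
import random


def ols_slope(x, y):
    """Ordinary least squares slope of y on a single predictor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_attenuation(n=20000, beta=1.0, error_sd=1.0, seed=1):
    """Return (slope using true scores, slope using error-laden scores).

    Assumption: classical measurement error, i.e. observed = true + noise,
    with the noise independent of everything else in the model.
    """
    rng = random.Random(seed)
    true_x = [rng.gauss(0.0, 1.0) for _ in range(n)]           # true score, variance 1
    y = [beta * t + rng.gauss(0.0, 0.5) for t in true_x]       # structural relation
    observed_x = [t + rng.gauss(0.0, error_sd) for t in true_x]  # reliability 0.5
    return ols_slope(true_x, y), ols_slope(observed_x, y)


slope_true, slope_observed = simulate_attenuation()
print(f"slope with true scores:     {slope_true:.3f}")
print(f"slope with observed scores: {slope_observed:.3f}")
```

With these hypothetical settings the first slope should land near 1.0 and the second near 0.5, which is the kind of bias that error-adjusted techniques such as structural equation analysis are meant to address.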

Fortunately there have been important advances in validating UV-SD models. These include new results in developing, testing and evaluating multiple item measures, and estimating models employing these measures. However, some of these developments have appeared in literatures not widely read or easily understood by all substantive researchers in the social sciences.

Thus, this monograph is intended for these researchers, and one of its objectives is to selectively identify areas for continuous improvement in the process of testing UV-SD models. The book provides a qualitative review of UV-SD model testing practices.[7] It provides selective discussions of the errors of omission and commission in the process of testing theoretical models involving latent variables and survey data, and it suggests, and selectively extends, remedies from recent applicable methods research. For example, it suggests an additional procedure for achieving model-to-data fit, or consistency in a measure, using covariance structure analysis (I will use the more popular term, structural equation analysis, e.g., analysis using LISREL, EQS, or AMOS), and it includes a suggestion for easily executed pretests using scenario analyses. The monograph provides accessible discussions of several overlooked but valuable statistics such as Average Variance Extracted (AVE) and Root Mean Square Error of Approximation (RMSEA), and it suggests an estimator of AVE that does not rely on structural equation analysis. It discusses matters that may be well known to methodologists but may not be as well known to substantive researchers, such as error-adjusted regression, the use of single summed indicators in structural equation analysis, and the use of a nonrecursive model to investigate directionality or causality. This research selectively discusses recent advances in the detection of interactions and quadratics, and provides a rationale for the more frequent inclusion of interactions and quadratics in UV-SD model tests. It calls for additional attention to measure consistency, and thus model-to-data fit, in structural equation analysis, and argues for a higher threshold for acceptable reliability based on average variance extracted.

This research also renews calls for caution in generalizing from a single study because of the unavoidable risks from violations of methodological assumptions and the use of inter-subject research designs to test intra-subject hypotheses. It also discusses the implications of reliability and facets of validity as sampling statistics with unknown sampling distributions. It suggests techniques such as easily executed experiments that could be used to pretest measures, and bootstrapping for reliabilities and facets of validity. In addition, it suggests an alternative to omitting items in structural equation analysis to improve model-to-data fit that should be especially useful for older measures established before structural equation analysis became popular.
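
To make the AVE statistic mentioned above concrete, the sketch below computes it alongside a composite reliability from a set of loadings. It uses the widely cited Fornell and Larcker (1981) formulas and assumes standardized loadings, a unidimensional measure, and uncorrelated measurement errors, so that each item's error variance is one minus its squared loading. It is not the alternative AVE estimator this monograph proposes, and the function names and the loadings of 0.6 are hypothetical values chosen for illustration.

```python
def average_variance_extracted(loadings):
    """AVE for standardized loadings: the mean of the squared loadings."""
    squared = [lam ** 2 for lam in loadings]
    return sum(squared) / len(squared)


def composite_reliability(loadings):
    """Composite reliability for standardized loadings.

    Assumption: uncorrelated errors, so each error variance is 1 - loading**2.
    """
    sum_loadings_sq = sum(loadings) ** 2
    error_var = sum(1.0 - lam ** 2 for lam in loadings)
    return sum_loadings_sq / (sum_loadings_sq + error_var)


# Hypothetical five-item measure with modest, equal loadings.
hypothetical_loadings = [0.6, 0.6, 0.6, 0.6, 0.6]
ave = average_variance_extracted(hypothetical_loadings)
cr = composite_reliability(hypothetical_loadings)
print(f"AVE = {ave:.3f}, composite reliability = {cr:.3f}")
```

For these hypothetical loadings the composite reliability is about 0.74 while AVE is only 0.36. A measure can therefore pass a conventional reliability cutoff such as 0.7 even though its items share less than half their variance with the construct, which is one way to see why a reliability threshold tied to AVE would be more demanding.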

The first step in UV-SD model testing is discussed next.

STEP I IN UNOBSERVED VARIABLE-SURVEY DATA MODEL TESTING-- DEFINING MODEL CONCEPTS

Models with unobserved variables, multiple item measures of these unobserved variables, and survey data (UV-SD models) involve so-called latent or unobserved variables because we observe only indirect evidence or indications of these model variables. For example, we can directly observe or measure a concept such as household income, but we can measure only indirect evidence or indications of the concept or construct overall satisfaction. Thus, it is the practice in social science to measure several indications, or what are termed indicators, of each latent variable using multiple-item measures.
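
The idea of several indicators of one latent variable can be sketched with simulated data. The example below is an illustration under stated assumptions rather than anything taken from the studies reviewed: it assumes each observed item equals a loading times a standard normal latent score plus independent normal error, it uses four hypothetical items with equal loadings of 0.7, and it summarizes the items with coefficient alpha. All names and values are made up for the illustration.

```python
import random


def variance(values):
    """Unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def simulate_items(n_cases=5000, loadings=(0.7, 0.7, 0.7, 0.7), seed=7):
    """Simulate item responses: item = loading * latent + error.

    Assumption: the error standard deviation is set so that each item
    has unit variance in the population.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_cases):
        latent = rng.gauss(0.0, 1.0)  # the unobserved score, never used directly below
        rows.append([lam * latent + rng.gauss(0.0, (1.0 - lam ** 2) ** 0.5)
                     for lam in loadings])
    return rows


def coefficient_alpha(rows):
    """Coefficient alpha from a cases-by-items table of responses."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)


items = simulate_items()
alpha = coefficient_alpha(items)
print(f"coefficient alpha for the simulated four-item measure: {alpha:.3f}")
```

Only the items are passed to `coefficient_alpha`; the latent score itself stays hidden, which mirrors the situation in survey research where the construct is known only through its indicators. With these hypothetical settings alpha should come out near 0.79.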

The construction of these indicators or multiple items is guided by the definition of the construct or concept. As a rule, these definitions were clearly stated in the articles I reviewed. Because these matters have received attention previously (e.g., Bollen, 1989:180 and Churchill, 1979), later in this section I will simply summarize the two definitional requirements for the unobserved variables typically involved in theoretical model tests: conceptual definition and operational definition.