How to Use Survey Data From the Community Tracking Study
December 11, 2008
By Paul Leigh and Dan Tancredi
Center for Healthcare Policy and Research, University of California, Davis
Outline
1. Description of CTS Data
2. How to obtain data from ICPSR, including precautions
3. How to up-load CTS data onto personal computer
4. Data analysis
5. References
6. Appendix # 1. Letter to Peter Granda at ICPSR to request data
7. Appendix # 2. Data Protection Plan
1. Description of CTS Data (partially drawn from the website; see references)
The Community Tracking Study (CTS) is the core research effort of the Center for Studying Health System Change (HSC), which aims to describe how the accessibility, cost and quality of locally delivered health care are determined by the interactions of providers, insurers and policy makers. In addition to in-depth site visits, the CTS comprises nationally representative surveys, including ongoing biennial national surveys of households and physicians, as well as employer surveys and a single health insurer follow-back survey conducted in the first two rounds. The first four rounds were conducted beginning in 1996 and were concentrated in a nationally representative probability sample of 60 communities. The first three of these rounds also included a “national supplement” probability sample, which surveyed respondents from throughout the nation and which can be used by itself or in combination with the site sample.
Data from the first four rounds are semi-longitudinal; i.e., roughly 50% of the households/physicians in a later round were also respondents in the previous round. The physician survey data in these rounds can be used in panel data analysis of individual respondents. Because the sampling unit for the household survey is the telephone number, not household members, survey designers decided that panel data analysis was not feasible with that survey. Household and physician survey data collected over the first four rounds can be used to draw conclusions for the nation as well as for individual sites, either for individual rounds or by pooling multiple rounds. Having data from multiple surveys for a common set of sites permits analysts to relate the individual-level measures obtained from one survey to market-level health system characteristics obtained from the other surveys. Because of changes to the survey design, the 2007-08 Physician and Household Surveys will support analyses at the national level only. Note that there are no links between any of the survey respondents in the household, physician and employer surveys (e.g., respondents to the Household Survey are not patients of physicians in the Physician Survey).
Household Survey
The sample for the most recent Household Survey, in 2007, comprises 17,800 individuals in 9,400 families. The survey focuses on tracking changes in health care access, utilization, insurance, perceptions of care quality and problems paying medical bills. The response rate was 43% in 2007. Particular areas of inquiry include access, satisfaction, use of services and insurance coverage. Information about health status, sociodemographic characteristics and employment is also collected. Mathematica Policy Research conducts the Household Survey for HSC. The first four household surveys were conducted in 1996-97, 1998-99, 2000-01, and 2003. The fifth survey was conducted primarily in calendar year 2007.
Physician Survey
Physicians respond to a series of questions about problems they face in practicing medicine, quality of care, access to services, information technology, and sources of practice revenue and compensation, as well as questions about their practice arrangements and care practices. More than 12,000 practicing physicians across the country provided perspectives on how health care delivery is changing in the first three rounds of the survey (1996-97, 1998-99, 2000-01), while more than 6,600 physicians were interviewed in round four (2004-05). The 2008 Physician Survey is currently underway and over 4,500 physicians are expected to respond. Unlike previous rounds of the survey, which were administered over the telephone, this round uses a mail questionnaire.
2. How to obtain data from ICPSR, including precautions
CTS survey data are available through the Health and Medical Care Archive (HMCA) at the University of Michigan’s Inter-university Consortium for Political and Social Research (ICPSR). UC Davis is a member of ICPSR, so public-use versions of the survey data are freely available to UC Davis researchers and can be downloaded from the web. Web downloading requires that the user complete a simple registration form and agree to reasonable terms regarding data use. The public-use data are very helpful for preliminary analysis. However, these files do not include certain data fields whose inclusion could jeopardize the confidentiality of survey participants. In particular, the public-use data do not include site identifiers and other survey design variables used by survey data analysis procedures to account for the complex survey design, nor do they include physician identifier codes that can be used to link data from adjacent survey rounds in order to conduct panel data analysis.
The so-called restricted-use data contain more complete information, including the data fields needed for design-adjusted and panel data analysis. Access to these data requires that the user apply for permission and agree to strict terms contained in a data use agreement. Practically speaking, the user needs to write a letter, fill out a Data Protection Plan, and then snail-mail these to the person in charge of the CTS at the ICPSR. See the appendices, below, for the letter and Plan we sent in 2006. In response, the requested data arrived on a single CD about two weeks after we sent our letter.
The ICPSR staff are quite serious about safeguarding the data while it is in use and about destroying it once you have finished. They contacted Paul in July 2008 and asked whether we had destroyed the data. We said “no” and asked for an extension, which they granted, but only until January 31, 2009.
3. How to up-load CTS data onto personal computer
The data products in the CTS series are similar to other titles in ICPSR’s highly regarded archives in that they are well documented and include data definition statements that allow the data to be easily used by a person with basic experience using any of the three major statistical analysis applications: SAS, SPSS or Stata. Data products are identified by the combination of version (public-use vs. restricted-use), survey type (household vs. physician vs. employer vs. health plan) and survey round. Each product includes a rectangular ASCII data file, data definition statements specific to SAS, SPSS or Stata, a very useful and complete user’s guide and separate codebook (in PDF format) and a small collection of short text files with miscellaneous notes of potential interest to the user.
For example, to read the data into SAS, one would simply need to load the accompanying SAS data definition file into the SAS Program Editor and update the INFILE statement so that it refers to the location of the ASCII data file. It’s also a good idea to modify the DATA statement by supplying a dataset name (otherwise SAS will give the dataset a default name, such as DATA1). Once the data definition statements (i.e., the SAS program) have been updated, they can be submitted in SAS for execution. Execution of the data definition statements will create user-defined formats and a SAS dataset with variables properly labeled and formatted, ready for additional analyses. It’s a good idea to save the statements as a program file, which can then be reused and expanded to include additional data and proc steps needed for particular analyses.
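For readers who want to see what a data definition file does behind the scenes, the idea can be sketched in a few lines of Python: each variable is simply a named range of columns in the rectangular ASCII file. This is only an illustration, not a substitute for the supplied SAS, SPSS or Stata statements. The variable names, column positions and sample records below are all made up; the real layout must be taken from the codebook.

```python
import io

# Hypothetical layout: (variable name, first column, last column), 1-based and
# inclusive, in the style of a codebook. These names and positions are invented
# for illustration and do not come from the CTS documentation.
LAYOUT = [
    ("PHYSIDX", 1, 6),
    ("SITEID", 7, 8),
    ("CARSAT", 9, 9),
    ("WEIGHT", 10, 17),
]

# Variables to convert to numbers; the rest are kept as text identifiers.
NUMERIC = {"CARSAT", "WEIGHT"}


def parse_record(line, layout=LAYOUT):
    """Slice one fixed-width record into a dict of named fields."""
    record = {}
    for name, start, end in layout:
        raw = line[start - 1:end].strip()
        if name in NUMERIC:
            # Blank numeric fields are treated as missing.
            record[name] = float(raw) if raw else None
        else:
            record[name] = raw
    return record


def read_fixed_width(handle, layout=LAYOUT):
    """Read every non-blank line of an open text file into a list of dicts."""
    return [parse_record(line.rstrip("\n"), layout)
            for line in handle if line.strip()]


# Two made-up records standing in for the ASCII data file.
SAMPLE = io.StringIO(
    "000001124  152.25\n"
    "000002071  098.50\n"
)
records = read_fixed_width(SAMPLE)
```

With a real file, `SAMPLE` would be replaced by `open(path)` on the downloaded ASCII data file, and `LAYOUT` by the column positions listed in the codebook.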
4. Data analysis
We highly recommend that users read the user guides that accompany the CTS survey data in order to become acquainted with important features of the data. Most notably, the CTS surveys follow complex probability sampling plans that involve such features as unequal response probabilities, stratification and clustering. To account for these features when making point and variance estimates for statistical parameters of interest, SUDAAN software is the preferred and sometimes the only option. However, the survey design adjustments available in SAS and Stata can provide reasonable adjustments for many analyses. Guidance on these matters is available in the user guides and in the technical publications they reference, which are available from the CTS website.
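To make concrete why the design variables matter, here is a minimal Python sketch of a design-adjusted estimate: a weighted mean whose standard error is computed by Taylor linearization, treating primary sampling units (PSUs) as sampled with replacement within strata. This is a generic textbook approximation, not the CTS-specific procedure described in the user guides, and it is no replacement for SUDAAN. The function name and the data are invented for illustration.

```python
from collections import defaultdict
from math import sqrt


def design_based_mean(rows):
    """Weighted mean and linearized standard error.

    rows: iterable of (stratum, psu, weight, y) tuples.
    Assumes PSUs are sampled with replacement within strata, a common
    approximation; it ignores finite population corrections.
    """
    rows = list(rows)
    total_w = sum(w for _, _, w, _ in rows)
    mean = sum(w * y for _, _, w, y in rows) / total_w

    # Sum each respondent's linearized contribution within its PSU.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, w, y in rows:
        psu_totals[stratum][psu] += w * (y - mean) / total_w

    # Between-PSU variability within each stratum drives the variance.
    variance = 0.0
    for stratum, psus in psu_totals.items():
        n = len(psus)
        if n < 2:
            raise ValueError(f"stratum {stratum!r} needs at least two PSUs")
        zbar = sum(psus.values()) / n
        variance += n / (n - 1) * sum((z - zbar) ** 2 for z in psus.values())

    return mean, sqrt(variance)


# Made-up data: (stratum, PSU, sampling weight, 1 = very satisfied).
toy = [
    ("A", 1, 1.0, 1), ("A", 1, 1.0, 0),
    ("A", 2, 1.0, 1), ("A", 2, 1.0, 1),
    ("B", 3, 2.0, 0),
    ("B", 4, 2.0, 1),
]
mean, se = design_based_mean(toy)
```

Because respondents within the same PSU tend to resemble one another, a standard error computed this way is often larger than one that treats the data as a simple random sample, which is why ignoring the design can overstate precision.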
In addition, the user should be aware that missing values for many key variables are replaced by imputed values. Another notable feature about the data is that respondents in multiple rounds of a survey are not identifiable by a single ID field. Instead, the CTS supplies adjacent-round panel ID fields. These allow, say, the linking of round 4 data to round 3 data. In order to link round 4 data to round 2 data, though, one has to make use of this ID as well as the separate panel ID field on the round 3 survey that allows linking to the round 2 survey.
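The two-step linking described above can be sketched in Python as follows. The ID values and the function name are hypothetical; the actual panel ID field names are given in the user guides. The sketch assumes each round's data has already been reduced to a lookup from a respondent's ID in that round to his or her ID in the previous round, with no value for respondents who were new that round.

```python
def link_to_earlier_round(later_to_middle, middle_to_earlier):
    """Chain two adjacent-round panel ID lookups into one.

    later_to_middle:   dict of later-round ID -> middle-round ID (or None)
    middle_to_earlier: dict of middle-round ID -> earlier-round ID (or None)
    Returns a dict of later-round ID -> earlier-round ID containing only
    respondents who can be traced through both lookups.
    """
    linked = {}
    for later_id, middle_id in later_to_middle.items():
        if middle_id is None:
            continue  # not a panel member in the middle round
        earlier_id = middle_to_earlier.get(middle_id)
        if earlier_id is not None:
            linked[later_id] = earlier_id
    return linked


# Made-up IDs for illustration only.
round4_to_round3 = {"R4-001": "R3-010", "R4-002": None, "R4-003": "R3-011"}
round3_to_round2 = {"R3-010": "R2-100", "R3-011": None, "R3-012": "R2-101"}

round4_to_round2 = link_to_earlier_round(round4_to_round3, round3_to_round2)
```

In this made-up example only one respondent is traced from round 4 back to round 2. Anyone missing from the middle round drops out of the chain, so the linked sample shrinks with each additional round.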
These and similar features can present serious pitfalls to a user who ignores the information contained in the user guides! Additional details and guidance are available in the extensive (but somewhat formidable) collection of publications available at the HSC/CTS website.
5. References
6. Appendix # 1. Letter to ICPSR to request data
DEPARTMENT OF PUBLIC HEALTH SCIENCES
UNIVERSITY OF CALIFORNIA
ONE SHIELDS AVENUE
DAVIS, CALIFORNIA 95616-8638
(530) 752-2793
FAX: (530) 752-3239
August 3, 2006
Peter Granda
Health and Medical Care Archive
ICPSR
330 Packard, Room 2132
Ann Arbor, MI 48104
Dear Mr. Granda,
I am writing to ask for some Restricted Data from the Community Tracking Study Physician Survey. I have enclosed three copies of my application. If I have forgotten anything from the application would you please let me know.
Thank you for considering this request.
Sincerely,
Paul Leigh
Professor of Health Economics
Department of Public Health, TB168
UC Davis Medical School
One Shields Avenue
Davis, CA 95616-8638
530-754-8605
7. Appendix # 2. Data Protection Plan
Date: August 3, 2006
Application for Community Tracking Study Physician Survey, 2000-2001 Restricted Data File
Contents:
- Application
- Vitas
- Detailed Data Protection Plan
1. Application
For Project Entitled: “New Estimates of Career Satisfaction Across Specialties”
Name of Principal Investigator: J. Paul Leigh
Title: Professor
Department (if applicable): Public Health Sciences
Organization: University of California Davis Medical School
Street Address: One Shields Ave, TB-168
City, State, ZIP: Davis, California, 95616-8638
Phone: 530-754-8605
Fax: 530-752-3239
Email:
Name of Co-Principal Investigator (if applicable): None
2. Title of research project for which the CTS Physician Survey, 2000-2001 restricted data file is requested.
New Estimates of Career Satisfaction Across Physician Specialties
3. Short description of research project including research questions, primary methodology, categories of variables to be used (attach additional sheets if required).
The proposed project will rely on a prior study entitled:
Physician Career Satisfaction Across Specialties
And written by J. Paul Leigh, PhD; Richard L. Kravitz, MD, MSPH; Mike Schembri, MS; Steven J. Samuels, PhD; Shanaz Mobley, BS
Arch Intern Med. 2002;162:1577-1584.
ABSTRACT
Background The career satisfaction and dissatisfaction physicians experience likely influence the quality of medical care.
Objective To compare career satisfaction across specialties among US physicians.
Methods We analyzed data from the Community Tracking Study of 12,474 physicians (response rate, 65%) for the late 1990s. Data are cross-sectional. Two satisfaction variables were created: very satisfied and dissatisfied. Thirty-three specialty categories were analyzed.
Results After adjusting for control variables, the following specialties are significantly more likely than family medicine to be very satisfying: geriatric internal medicine (odds ratio [OR], 2.04); neonatal-perinatal medicine (OR, 1.89); dermatology (OR, 1.48); and pediatrics (OR, 1.36). The following are significantly more likely than family medicine to be dissatisfying: otolaryngology (OR, 1.78); obstetrics-gynecology (OR, 1.61); ophthalmology (OR, 1.51); orthopedics (OR, 1.36); and internal medicine (OR, 1.22). Among the control variables, we also found nonlinear relations between age and satisfaction; high satisfaction among physicians in the West North Central and New England states and high dissatisfaction in the South Atlantic, West South Central, Mountain, and Pacific states; positive associations between income and satisfaction; and no differences between women and men.
Conclusions Career satisfaction and dissatisfaction vary across specialty as well as age, income, and region. These variations are likely to be of interest to residency directors, managed care administrators, students selecting a specialty, and physicians in the groups with high satisfaction and dissatisfaction.
This prior study used the first wave of the CTS data from the 1990s. We would simply like to update the original study by using the 2000-2001 data.
4. What types of data from other sources will be merged with the CTS Physician Survey, 2000-2001 restricted data file?
We do not plan to merge any data with the CTS restricted file.
5. State reasons why the CTS Physician Survey, 2000-2001 public use data file is not adequate for conduct of the research project.
The public use file contains only 6-7 very broad specialties. Since the focus of our study will be on differences across specialties, we need more than 7 specialties. Our prior study used the restricted file for 1996-97.
6. Describe all the ways that you intend to use the results of the research, including plans for public dissemination.
We plan to publish the paper in a good medical journal.
7. Provide names, titles, and affiliations of other members of the research team who will have access to the restricted data or to output derived from these data. If not all members have been selected, please list as "unassigned" and indicate the job titles. Include individuals who are employed by different organizations.
Richard Kravitz, MD, Director, Center for Health Services Research in Primary Care, and Professor of Medicine, University of California Davis Medical Center; Dan Tancredi, PhD, Senior Research Statistician, Center for Health Services Research in Primary Care, University of California Davis Medical Center.
8. If employed at an organization that has a current NIH Multiple Project Assurances (MPA) Certification Number or Federal Wide Assurances (FWA) Certification Number, please provide the number and expiration date.
FWA 00004557
9. If a member of the proposed research team, including subcontractors, is employed at an organization that does not have an NIH Multiple Project Assurances (MPA) Certification Number or Federal Wide Assurances (FWA) Certification Number, please respond to the following questions:
Not applicable. Both Tancredi and Kravitz are employed by the University of California, Davis.