MAINTAINING DATA QUALITY ~ GUIDELINES FOR NCI STATES
Overall Considerations
- Cleaning data is an expensive process
- Large degree of heterogeneity among states requires generating a lot of data transformations
- The more the data deviates from NCI protocol, the longer it takes to complete the analysis
- It’s better to give us more information about your data so that we don’t have to guess
- Complying with NCI protocol will increase data reliability
CONSUMER SURVEY
When Collecting Data:
Rule #1: Collect all information.
- All information is critical
- Do not dismiss anything (e.g., interviewer feedback)
- Missing data is a problem!
- Reduces N = less power
- Findings are non-representative…even of the sample
- Inferences are invalid
- Estimates are unstable, biased
- Errors are large – Type II errors
Rule #2: Follow skip patterns in the survey.
Rule #3: Provide ongoing training and monitoring of interviewers.
- Refresher trainings each year to review any changes
- Conduct “shadow” interviews to ensure reliability
- Conduct follow-up validation calls on a sample of cases to ensure that the interview took place and that proper protocols were followed
- SCREEN COMPLETED SURVEYS TO MAKE SURE THERE IS NO MISSING INFORMATION (see Rule #1 above)
- Create an email distribution list for Q&A
- Contact HSRI staff if you have any questions
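The screening step above (checking completed surveys for missing information) can be partly automated once the data are keyed. The following is a minimal sketch, not an NCI tool: the field names (`case_id`, `q1`, `q2`) and the idea that a blank is stored as an empty string are assumptions for illustration. It does not know about skip patterns, so a flagged blank may be legitimate and should be reviewed by a person.

```python
# Values treated as "left blank" (an assumption; adjust to how your file stores blanks).
BLANK_VALUES = {"", None}

def find_missing(records, required_fields):
    """Return {case_id: [fields left blank]} for every record with gaps."""
    gaps = {}
    for record in records:
        blank = [f for f in required_fields if record.get(f) in BLANK_VALUES]
        if blank:
            gaps[record["case_id"]] = blank
    return gaps

# Hypothetical keyed surveys, one dict per completed interview.
surveys = [
    {"case_id": "001", "q1": "1", "q2": "2"},
    {"case_id": "002", "q1": "", "q2": "1"},
]

gaps = find_missing(surveys, ["q1", "q2"])
print(gaps)  # {'002': ['q1']}
```

Running a check like this soon after each interview makes it easier to go back to the interviewer while the visit is still fresh.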
When Entering Data:
Rule #4: Maintain compliance with NCI formats provided in the codebooks, which are located in the “tools” section of the website.
- Use NCI variable names and codes
- If you must change them, please send HSRI staff a list of changes
- If you need help recoding or constructing a crosswalk, please let us know
Example: One state pulled diagnostic information from a state database. Instead of having separate variables for level of MR and for each type of other disability, the file contained a field that looked like this.
When Cleaning Data:
Rule #5: Check your N.
- Make sure the file is complete and contains the correct number of cases (e.g., if you have ten regions that collected surveys, make sure all ten are included in the file)
- Background Section N should match survey N
- Verify you don’t have duplicate cases
- Check for cases under age 18
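The checks under Rule #5 can be sketched in a few lines. This is an illustrative example only: the field names (`case_id`, `region`, `age`) and the sample records are hypothetical, and your file layout will differ.

```python
from collections import Counter

def check_file(records, expected_regions, min_age=18):
    """Summarize N, duplicate case IDs, missing regions, and under-age cases."""
    id_counts = Counter(r["case_id"] for r in records)
    regions_present = {r["region"] for r in records}
    return {
        "n": len(records),
        "duplicates": sorted(cid for cid, n in id_counts.items() if n > 1),
        "missing_regions": sorted(set(expected_regions) - regions_present),
        "under_age": sorted(r["case_id"] for r in records if r["age"] < min_age),
    }

# Hypothetical file: case "A2" appears twice, region "West" sent nothing,
# and case "A3" is under 18.
cases = [
    {"case_id": "A1", "region": "North", "age": 34},
    {"case_id": "A2", "region": "South", "age": 52},
    {"case_id": "A2", "region": "South", "age": 52},
    {"case_id": "A3", "region": "North", "age": 17},
]

report = check_file(cases, expected_regions=["North", "South", "West"])
print(report)
```

Comparing `report["n"]` for the Background Section file against the same count for the survey file covers the "N should match" check.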
When Submitting Data:
Rule #6: Provide a description of how the sample was drawn, especially if sampling design deviated from NCI specifications, for example:
- You used a sampling design other than a simple random sample
- Your sample excludes certain portions of your adult service population (e.g., only Waiver recipients were included, people who live in institutional settings were not included)
Rule #7: Name your file appropriately.
- Don’t use a generic name like “HSRI consumer survey” – put year and state somewhere in the name, e.g., “Consumer survey 05-06 DE”
Rule #8: Tell us if you made any modifications to the survey (even minor changes!).
- Send along a copy of your version
- Tell us if any coding differs from the NCI codebooks
- If necessary provide a crosswalk
- Remove extra questions, unless you have arranged with us to analyze them
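If your variable names or codes differ from the NCI codebooks, a crosswalk is simply a lookup table from your names and codes to the NCI ones. Here is a minimal sketch of what applying one might look like. Every name and code below (`st_home`, `NCI_Q1`, `Y`/`N`) is made up for illustration; the real targets come from the NCI codebooks.

```python
# Hypothetical crosswalk: state variable name -> NCI variable name.
VARIABLE_CROSSWALK = {"st_home": "NCI_Q1"}

# Hypothetical crosswalk: per NCI variable, state code -> NCI code.
CODE_CROSSWALK = {"NCI_Q1": {"Y": "1", "N": "2"}}

def apply_crosswalk(record, var_map, code_map):
    """Rename variables and recode values; anything unmapped passes through unchanged."""
    converted = {}
    for name, value in record.items():
        nci_name = var_map.get(name, name)
        codes = code_map.get(nci_name, {})
        converted[nci_name] = codes.get(value, value)
    return converted

state_record = {"case_id": "001", "st_home": "N"}
nci_record = apply_crosswalk(state_record, VARIABLE_CROSSWALK, CODE_CROSSWALK)
print(nci_record)  # {'case_id': '001', 'NCI_Q1': '2'}
```

Sending the two lookup tables along with the data file lets the analyst see exactly how each value was converted.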
Rule #9: Tell us if you are planning to request additional analyses from HSRI.
- Extra questions
- Sub-state analyses
When You Receive Your Individual State Report:
Rule #10: Review the results to see if anything stands out as unusual or inaccurate.
FAMILY SURVEY
When Entering Data:
- Please make sure your data entry folks follow the coding provided in the codebooks. Our biggest problem: people who enter “0”s instead of “2”s for “No” responses. (See Rule #4 above)
- For questions that allow the respondent to “Mark all that apply”, data entry people should enter “1” for checked (yes) responses and “2” for unchecked (no) responses. Our biggest problem: people who enter “9” (missing data) for unchecked responses. “9” should only be used when ALL responses (every single one, even the one that says “none of the above”) are left blank.
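The two coding rules above can be written out as a short sketch. The function names are hypothetical, and the input format (one true/false flag per response option) is an assumption about how the paper form might be keyed; the codes 1, 2 and 9 are the ones described above.

```python
MISSING = 9  # used only when every option, including 'none of the above', is blank

def code_mark_all(checked):
    """Code one 'Mark all that apply' question.

    `checked` holds one boolean per response option, including
    'none of the above'. Returns 1 (checked) or 2 (unchecked) per option,
    or all 9s when nothing at all was marked.
    """
    if not any(checked):
        return [MISSING] * len(checked)
    return [1 if is_checked else 2 for is_checked in checked]

def fix_no_code(value):
    """Recode a Yes/No answer keyed as 0 to the codebook's 2 for 'No'."""
    return 2 if value == 0 else value

print(code_mark_all([True, False, True]))    # [1, 2, 1]
print(code_mark_all([False, False, False]))  # [9, 9, 9]
print(fix_no_code(0))                        # 2
```

Note that `fix_no_code` should be applied only to Yes/No items, since 0 may be a valid value elsewhere.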
Instructions for entering qualitative comments:
- It is very important that we receive all qualitative comments. Please do not summarize the comments. We need the comments verbatim.
- It is best if you enter the comments in a Word document instead of the Access, Excel or SPSS file used for quantitative data. This is particularly important if you use SPSS, which may cut off some of the longer comments. If you are using the most current versions of the Access or Excel files, this should not be a problem.
- Do NOT type text comments in all capital letters.
- Translate all non-English comments. Indicate that this text has been translated. Specify what language was translated.
- Remove identifying information. For example, enter “[name]” instead of someone’s name or “[provider]” instead of the name of the provider.
When Cleaning Data:
- Many states add their own state-specific questions to the NCI surveys. Please take out the responses to your state-specific questions (i.e., delete these columns) before sending us your data. The only exception to this would be if you’d like to request that we do some additional analysis for your state using the added questions. If this is the case, please let us know this when you send the data in.
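Removing state-specific columns before submission might look like the sketch below. The column names (`state_q1`, `state_q2`) are hypothetical; list whichever columns your state added. Work on a copy so the original file, extra questions included, is still available if you later ask for additional analysis.

```python
def drop_state_columns(records, state_columns):
    """Return new records without the state-specific columns; the input is not modified."""
    return [
        {name: value for name, value in record.items() if name not in state_columns}
        for record in records
    ]

# Hypothetical records with two state-added questions.
original = [
    {"case_id": "001", "q1": "1", "state_q1": "3", "state_q2": "1"},
    {"case_id": "002", "q1": "2", "state_q1": "2", "state_q2": "2"},
]

to_submit = drop_state_columns(original, {"state_q1", "state_q2"})
print(to_submit)  # [{'case_id': '001', 'q1': '1'}, {'case_id': '002', 'q1': '2'}]
```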
When Sending in Data:
- Let us know this information so we can accurately calculate response rates:
- How many surveys were originally sent out (e.g., 1,000 Child Surveys, 600 Adult Surveys)
- How many surveys were returned undelivered (bad addresses, people who had moved, people who had died)
- Let us know if surveys were sent only to a specific sub-population (e.g., only to adults receiving vocational/employment services, or only to children receiving waiver-funded services). ***By the way, we STRONGLY recommend that your sample be representative of ALL the folks receiving services, and NOT solely those in a sub-population. BUT, if this is how you’re sampling, please let us know, as we need to make note of it in our reports. (See Rule #6 above)
- Let us know if you’d like us to do any analysis of “extra” state-specific questions. This way, we can clean your data before removing any additional survey questions. If we clean your data (i.e., remove your extra questions, remove non-qualifying responses) and THEN you ask for additional analysis, it can be very time-consuming to go back and re-clean the raw data. So, let us know ahead of time if you even THINK you might want some additional analysis completed by us. (See Rule #9 above)
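To show why the mailing counts requested above matter, here is a sketch of one common way to compute a response rate: completed surveys divided by surveys that could have reached a respondent. This formula is an assumption for illustration, not necessarily the exact calculation HSRI uses, and the numbers are made up.

```python
def response_rate(sent, undeliverable, completed):
    """Completed surveys as a share of surveys that could have been delivered."""
    eligible = sent - undeliverable
    if eligible <= 0:
        raise ValueError("No deliverable surveys; check the counts.")
    return completed / eligible

# Hypothetical mailing: 1,000 sent, 50 returned undelivered, 380 completed.
rate = response_rate(sent=1000, undeliverable=50, completed=380)
print(f"{rate:.1%}")  # 40.0%
```

Without the undeliverable count, the same mailing would be reported as 38.0%, which understates how many reachable families actually responded.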
March 1, 2007