1st QC
1)Copy all files for quality check from the processor’s /qc directory to a temporary directory on your work space. These files can be removed after the study has been QC’ed and turned over.
2)Review the processing history file(s) (e.g., ph01234-0001.txt) AND the processing checklist to check what the processor did to the data and documentation.
a)Watch for any recodes and/or reformatting.
3)Ensure that pre- and post-Hermes files match, without any loss of data in the conversion (use Robert's script; a sketch of this kind of comparison appears after this checklist).
4)Open each setup file and check the following:
a)Check that there are no extra blank lines or analysis commands (such as frequencies or display dictionary) at the end of the file.
b)Check the LRECL from the data list/input statement against the metadata record.
c)Run the setup_tst script to test the setups, and examine the resulting log files.
5)Test each setup, supplementary syntax, and ready-to-go file.
a)Check for errors and warnings.
b)Check frequencies and data dictionary for each variable.
i)Are there any confidential variables?
ii)Does the case count match the metadata record?
iii)Are there variable labels for all variables?
iv)Are there any unstandardized missing values or wild codes?
v)Do variable and value labels make sense?
vi)Are there any typos?
vii)Incorrect capitalization/punctuation?
viii)Do the frequencies make sense? (e.g., variables are not entirely missing; code categories are mutually exclusive).
ix)Is there a unique record identifier?
x)Are variable labels being truncated?
xi)Are variable labels unique?
c)Test weight variables, if present.
d)Check the SPSS dictionary for insufficient widths (displayed as asterisks) and for value labels that do not appear in the frequencies.
e)Check the SAS log to make sure all formats have been output.
f)Check the variable count, case count, and LRECL in the SAS log against the metadata record.
6)Check the Codebook.
a)Check metadata record LRECL against the codebook column locations.
b)Check bookmarks to ensure that they work.
c)Check that the title and study # are correct.
d)Check overall layout and display. For example, if you expect numbers in a frequency table but letters or symbols are displayed, this is a potential problem.
e)Check for any unusual truncated variable or value labels.
f)Are all of the variables present in the codebook?
g)Are the variables listed in the same order as they appear in the data file?
h)Is the display showing extremely long lists of frequencies for continuous variables?
i)Do the frequencies and descriptive statistics look reasonable based on what you know about the data?
j)Do the processing notes make sense? Is anything missing?
k)What other documentation will be included with the study (e.g., questionnaire, appendices, other original documentation)?
7)Check the Metadata Record.
a)Do all of the fields make sense?
b)Do all of the fields follow the conventions of the metadata record manual?
c)Are the fields consistent (e.g., Does the data source match the design? Do date fields match the title?)
d)Check for spelling errors.
e)Make sure file specifications are accurate.
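Steps 3, 4b, and 5f above can be spot-checked with a short script. The sketch below (Python) is purely illustrative and is not Robert's script: it assumes fixed-width ASCII data files with hypothetical names and reports the case count, LRECL, and a checksum for the pre- and post-Hermes versions, so the two files can be compared against each other and against the metadata record.

# Illustrative pre-/post-Hermes comparison; file names are hypothetical.
import hashlib

def summarize(path):
    """Return (case count, LRECL, MD5 digest) for a fixed-width ASCII data file."""
    cases, lrecl = 0, 0
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for line in f:
            record = line.rstrip(b"\r\n")
            cases += 1
            lrecl = max(lrecl, len(record))
            digest.update(record + b"\n")   # normalize line endings before hashing
    return cases, lrecl, digest.hexdigest()

pre = summarize("da01234-0001_pre.txt")     # hypothetical pre-Hermes file
post = summarize("da01234-0001_post.txt")   # hypothetical post-Hermes file

for label, (cases, lrecl, md5) in (("pre", pre), ("post", post)):
    print(f"{label:>4}: cases={cases}  LRECL={lrecl}  md5={md5}")

if pre != post:
    print("WARNING: pre- and post-Hermes files differ -- investigate before signing off.")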
Create a QC processing history file (e.g., ph01234-0001_qc1.log) listing QC results and submit it to the processor. (The processor will add this 1st QC processing history file to her/his /turn directory.) Meet with the processor to explain the corrections that need to be made to the study.
2nd QC
1)Copy all files for quality check from the processor’s /turn directory to a temporary directory on your work space. These files can be removed after the study has been QC’ed and turned over.
2)Review the notes from the 1st QC log file (which should be included by the processor in the /turn directory) and verify that corrections have been made.
3)Check setup, supplemental syntax, and ready-to-go files.
4)Review documentation (metadata record, codebook(s), and other files).
5)Run turnqa, the automated QC tool.
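turnqa's actual checks and interface are internal and are not reproduced here. The sketch below (Python) only illustrates the general kind of automated consistency check such a tool performs; the directory path, file extensions, and naming assumptions are hypothetical.

# Illustrative automated check (not turnqa): confirm that every SPSS/SAS setup
# file in a hypothetical /turn directory has a companion data file with the
# same base name.
from pathlib import Path

turn_dir = Path("/studies/01234/turn")      # hypothetical path
setups = sorted(turn_dir.glob("*.sps")) + sorted(turn_dir.glob("*.sas"))

problems = []
for setup in setups:
    data_file = setup.with_suffix(".txt")   # assumes fixed-width data stored as .txt
    if not data_file.exists():
        problems.append(f"{setup.name}: no matching data file {data_file.name}")

print(f"Checked {len(setups)} setup file(s); {len(problems)} problem(s) found.")
for problem in problems:
    print("  " + problem)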
Create a new QC processing history file (or add to the existing QC log), listing QC results and submit it to the processor. (The processor will add this QC processing history file to her/his /turn directory and turn over the QC file(s) with all other final data and documentation.) Meet with the processor to explain the corrections that need to be made to the study.
SDA 1st and 2nd QC
The SDA QC takes place after the files have been turned over and approved for Web release by the Release Management department.
Once the data and HTML codebook are ready, run submitsda from the directory just above VARS. This script will allow you to copy your SDA files to an internal testing area and will e-mail you an automatically generated test URL. Be prepared to specify the names (if any) of the weight, stratum, and/or cluster variables.
Processors (using the following checklist) will perform the 1st QC. If there are any errors or files that need to be replaced, make the necessary changes and re-run submitsda. Once the files are ready, e-mail the test URL to the processor performing the 2nd QC, who will also perform the same checks below.
In order to QC the SDA files, you will need to review the HTML codebook. In general, look for any typos, missing variables, missing variable names, and missing value labels. Variables and labels should match the Hermes-generated PDF codebook.
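One rough way to catch missing variables before reading the codebook page by page is to confirm that every expected variable name appears somewhere in the HTML file. The sketch below (Python) assumes a plain-text list of expected variable names (one per line, e.g., taken from the PDF codebook or the SPSS dictionary) and uses a hypothetical placeholder for the #####.htm file name; it supplements, but does not replace, the visual checks that follow.

# Rough cross-check of the SDA HTML codebook against an expected variable list.
# Both file names are hypothetical placeholders.
import re

with open("varlist.txt") as f:                          # one expected variable name per line
    expected = [v.strip().upper() for v in f if v.strip()]

with open("01234.htm", encoding="latin-1", errors="replace") as f:
    text = re.sub(r"<[^>]+>", " ", f.read()).upper()    # crude tag stripping
words = set(re.findall(r"[A-Z0-9_]+", text))

missing = [v for v in expected if v not in words]
print(f"{len(expected)} expected variables; {len(missing)} not found in the HTML codebook.")
for v in missing:
    print("  missing:", v)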
1) Open the HTML codebook via an Internet browser such as Internet Explorer, Netscape, or Mozilla Firefox. This can be done in either of two ways:
a.In the browser: Choose File → Open File, browse to locate the correct file directory, and double-click on #####.htm.
b.Via Windows Explorer: Find the directory containing the SDA codebook and double-click on #####.htm. This will automatically open the file in the default browser.
2) Click on each of the following links (located in the far left frame) and check that the correct information is being displayed. Specifically check:
GROUP HEADINGS
- Do the variable group headings make sense? Are the headings too broad or general?
- Click on each heading. Do the variables listed fit their group heading?
- For each variable listed, is the question text complete or is it truncated? Are there any typos in the question text?
- Is every variable in the data file listed? Each variable should be listed only once in a particular group.
STANDARD VARIABLE LIST
- Is every variable in the data file listed? Each variable should be listed only once.
- Click on each variable and check the frequency tables listed (NOTE: If the data file contains weights, these frequencies will be unweighted). Are there any errors in the frequencies, such as blank lines/undocumented codes, missing percentages, missing labels? Are variables that should be blanked for confidentiality correctly blanked? Is the data type listed correctly? Are the missing data codes designated and listed correctly?
ALPHABETICAL ORDER
- Is every variable in the data file listed?
3) Continue checking the metadata links (Title Page, Data Collection Description, etc.). Is the correct information being displayed? Make sure the file specifications are accurate.
4) Check the processing notes. Do they make sense? Is anything missing?
5) Should there be any other documentation included that currently is not, such as detailed weighting information or appendices?
6) In the frame where users can select an action, test some of the analysis options, such as running a frequency or crosstabulation (NOTE: If a weight variable is present, it will automatically be applied to all analyses unless you turn the weight off via the drop-down menu). Run some test analyses with and without the weight variable applied. Are accurate results being generated?
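The weighted output in step 6 can also be checked by hand: compute unweighted and weighted frequencies for one test variable directly from the ASCII data file and compare them with the SDA frequency table. In the sketch below (Python), the file name, column positions, and weight location are hypothetical placeholders to be replaced with values from the study's own record layout.

# Hand check of SDA's weighted vs. unweighted frequencies for one variable.
# File name and column positions are hypothetical placeholders.
from collections import defaultdict

DATA_FILE = "da01234-0001.txt"   # hypothetical fixed-width data file
VAR_COLS = (10, 12)              # 1-based start/end columns of the test variable
WT_COLS = (50, 58)               # 1-based start/end columns of the weight variable

unweighted = defaultdict(int)
weighted = defaultdict(float)

with open(DATA_FILE) as f:
    for line in f:
        code = line[VAR_COLS[0] - 1:VAR_COLS[1]].strip()
        wt_str = line[WT_COLS[0] - 1:WT_COLS[1]].strip()
        weight = float(wt_str) if wt_str else 0.0
        unweighted[code] += 1
        weighted[code] += weight

print(f"{'code':>8} {'unweighted':>12} {'weighted':>12}")
for code in sorted(unweighted):
    print(f"{code:>8} {unweighted[code]:>12} {weighted[code]:>12.1f}")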
Approved by the General Archive on 10/6/2006