Questionnaires

Introduction

Robin Beaumont

25/02/2009 17:03

Contents

1.Learning outcomes check list for the session

2.Introduction

3.Overview of the process

4.Literature Review

5.Decide Aims

6.Operationalise Concepts

7.Formulate Hypotheses

8.Choose Sample

9.Design Instrument / Coding frame

9.1Web based and email questionnaires

10.Pilot

11.Review

12.Administer instrument

13.Follow-up to attain adequate response

14.Process data

14.1Preparation

14.2Data cleaning

14.3Analyse

15.Produce report

16.Checking what you have learnt and finding out more

17.References

1.Learning outcomes check list for the session

Each of the sessions aims to provide you with both skills (the 'be able to's' below) and useful information (the 'know what's' below). These are listed below. After you have completed this session you should come back to these points ticking off those you feel happy with.

Learning outcome / Tick box
List the thirteen main stages in questionnaire development / 
Be aware of the two main aims of a literature review when planning a questionnaire / 
Understand the difference between the aims of a one off survey and a longitudinal or cohort study / 
Understand the process of Operationalisation / 
Be able to formulate appropriate hypotheses that could be tested by a questionnaire / 
Be aware of the different types of sampling / 
Know how to create a systematic random sample / 
Understand the stages that you must go through before actually developing a questionnaire / 
Know what a coding frame is / 
Be able to produce a simple coding frame from a questionnaire / 
Be aware of the coding frame production facility in SPSS / 
Understand the purpose of the pilot stage / 
Be aware of the possibility of converting open questions into closed questions during the pilot stage / 
Be aware of the issues to consider when administering a questionnaire / 
List the 6 important details to provide in a cover letter or introduction / 
Be aware of the importance of follow-up / 
Be aware of the three stages of data processing / 
Be able to name the three stages of data cleaning / 
Be aware of some of the issues to consider regarding report production / 
Be aware of the stages when a statistician must be involved / 
Understand the importance of statistical involvement at the initial stages of the research design / 

2.Introduction

Questionnaires have gained universal popularity as a method of collecting information in many areas of research, and this chapter presents an overview of the process. This introduction does not attempt to be at all rigorous in terms of the activities suggested or the discussion of the various problems associated with questionnaire development. Rather it is intended to be a practical guide for those researchers who wish to use a questionnaire as part of their research. Specifically it is not a guide for those research units that specialise in questionnaire development.

An excellent online resource is the Online Evaluation Resource Library at: (active on 25/02/2009).

3.Overview of the process

The diagram below gives an overview of the thirteen stages in questionnaire design for someone who may be planning to develop a questionnaire as part of a research project. If you were developing a national survey there would be additional stages to those given below.


4.Literature Review

Doing a thorough literature review can mean you have almost finished before you have begun. Numerous well designed, validated questionnaires are available for most areas you would be interested in researching. One major advantage of using a well known questionnaire is that your results can be compared with those of others. It also ensures that you measure concepts in an acceptable manner. For example it would be unwise to attempt to assess the incidence of nausea and vomiting in a group of patients unless you knew that your measures for the concept used the same criteria as previous or concurrent nausea and vomiting research. This issue is discussed further below in the operationalisation section.

Therefore the two main aims of the literature review are:

To identify appropriate questionnaires + published results

To identify measures which have already been developed

5.Decide Aims

This may be the result of the literature review or imposed upon you by some higher authority. Two common uses of questionnaires are investigative surveys (i.e. the one off cross sectional study) and longitudinal studies, that is following a group of people who have a common shared experience over a time period ('Cohort Study'). If you are re-administering a survey to a group over time you will need to consider two conflicting aspects:

Learning/practice effects: If you re-administer a survey to the same sample there is a possibility that responses will change due to learning (e.g. IQ tests). Often equivalent tests are developed in these circumstances. Somewhat paradoxically, where this problem is not considered to exist questionnaires are often subjected to test-retest reliability testing. However there are alternative (possibly better) methods of testing reliability (see Oppenheim p160).

Time effects: If you are planning to use a questionnaire repeatedly over time there are numerous problems. For example, while it is often recommended to keep questions identical over time to ensure that responses can be compared, this is not always possible. Specialist terms, such as diagnostic terms, can change meaning, as can words in the general vocabulary (e.g. the words gay, tea, coffee, bedroom - see Oppenheim 1992 p125).

Questionnaires are also often used in less academic settings such as in Evaluation, QA (Quality Assurance) and Audit. Often in these situations part of deciding the aims involves standard setting, and standard setting itself often involves deciding acceptable levels of response to specific questions. For example you may decide that you will want more than 80% of respondents to indicate they are happy or very happy with the service for your service to be classed as acceptable. Another approach is tolerance bands (see: ), for example you may say that when measuring waiting time for patients in a GP practice acceptable waiting times are between 5 and 20 minutes.

6.Operationalise Concepts

Operationalisation means turning a concept into one or more measures. Intelligence and health are well known examples; there are also more exotic ones, such as exhaustion (the concept), measured by heart rate and Borg scale rating (the measures).

This is often the most difficult part of the process. Poorly operationalised concepts mean that you fail to measure what you think you are measuring (poor construct validity).

Often complex concepts such as health, personality and aspects such as motivation have been analysed statistically to define particular sets of items that measure them most accurately, rather than just a list of 'I thinks'. This process is not easy. For example the attempt to measure intelligence has been going on for over a hundred years (see: ) and there is still no real agreement between the different approaches to its measurement. The measurement of health has a similar if not longer history! For examples and discussion of over one hundred health questionnaires see Measuring Health: A Guide to Rating Scales and Questionnaires by Ian McDowell 2006 OUP. You can see a preview of the book on Google books. A good description of the process of developing a one page 17 item health questionnaire (the Duke health profile) from a 63 item version is given in Parkerson, Broadhead and Tse 1991 (need login password).

7.Formulate Hypotheses

This is similar to defining the aims. One off, investigatory studies often do not have explicit hypotheses other than finding out more about certain characteristics of the sample. The hypotheses should be testable i.e. able to be analysed from the dataset. Hypotheses within questionnaires often concern:

Differences across groups: Was there a statistically significant lower incidence of X in group Y? To achieve this type of analysis you must make sure you collect the relevant data to allow you to divide the data up appropriately (e.g. if you wish to divide the data up across specialty you must collect the specialty value as a numerical code for each case).

Relationships between variables: Incidence of gastric ulceration symptoms against age. Again, make sure you collect the relevant data (possibly actual age rather than age bands?) which will allow you to obtain the appropriate information.

8.Choose Sample

This is often overlooked, with the effect that the results cannot be generalised to the group that the researcher originally hoped they could be. For example collecting information about falls at an A&E department in a relatively wealthy area may mean that the results do not generalise to a poorer neighbouring district.

If you were undertaking a methodologically rigorous trial (e.g. as part of an RCT) this process would be greatly expanded, with statistical procedures used to develop a sample frame (something from which you could draw an appropriate sample) and sampling procedures to ensure you had the correct sample and that it was of the appropriate size.

There are two broad types of sample: random (probability) samples and non-random samples. Simple Random Samples (SRS) should theoretically be typical of the population from which they have been drawn, whereas with non-random samples you do not have this assurance.

The process of selecting a simple random sample (SRS) is frequently very time consuming. For this reason a systematic random sample is often used instead. In this process you select every nth element of a sampling frame. Rodeghier M 1996 (p29) describes the process thus:

Decide how many elements should be in the sample.

Calculate the sampling fraction as 1/n. For example, the hospital that plans to survey its patients has a list of 10,000 but needs to contact only 1000. They must sample 1/10 of the population, or one in every 10 patients. 1/n is the sampling fraction, where n=10.

Use a random number table (just once) to pick a number within the sampling interval from 1 to n (for the hospital, the interval must be 1 to 10). Let's call this number Sn.

Pick the Snth element in the sampling frame as the first person in the sample. If the researcher for the hospital had chosen 5 as the random number starting point, he or she would pick the fifth patient on the list.

Now add n to Sn (5+10=15 for the hospital) and choose that element next. And so on until you reach the end of the list.
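The steps above can be sketched in a few lines of Python. This is only an illustration: the function name systematic_sample, its parameters and the list of patient numbers are assumptions made for the example, and a pseudo-random number generator stands in for the random number table.

```python
import random


def systematic_sample(frame, sample_size, start=None, rng=None):
    """Pick every n-th element of `frame`, beginning at position Sn.

    n is the sampling interval (frame size // sample size), so the
    sampling fraction is 1/n. `start` is Sn, counted from 1 as in the
    text; if it is not given, it is drawn once at random from 1..n.
    """
    n = len(frame) // sample_size
    if n < 1:
        raise ValueError("sample_size cannot exceed the size of the frame")
    if start is None:
        rng = rng if rng is not None else random.Random()
        start = rng.randint(1, n)
    # Positions in the text are 1-based, list indexes are 0-based.
    return frame[start - 1::n][:sample_size]


# Hypothetical sampling frame: patients numbered 1 to 10,000.
patients = list(range(1, 10_001))

# The hospital example: n = 10 and the random start Sn happened to be 5.
hospital_sample = systematic_sample(patients, 1000, start=5)

# The same procedure with the start chosen at random.
random_sample = systematic_sample(patients, 1000)
```

With start=5 the first three patients chosen are numbers 5, 15 and 25, matching the hospital example. Note that if the list itself is ordered in a cycle that matches the interval, a systematic sample can be biased, so the ordering of the frame is worth checking first.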

Cluster sampling involves selecting a random sample from a geographically defined group. The sampling process may involve multiple sub-samples to choose from, resulting in multi-stage sampling (note this term is used sometimes in place of cluster sampling, whereas at other times it is considered to be distinct from it - read on). An example of multistage sampling would be the situation where we wanted to sample therapists in rehabilitation departments across the country. It is unreasonable to expect we could easily compile such a list. We therefore need some strategy that will allow us to link members of the population to some already established grouping that can be sampled. Suppose we want to generate a random sample of 500 therapists. In stage 1, we choose a random sample, or cluster, of 20 states. In stage 2, we select a random sample of 100 hospitals from the 20 states. In stage 3, we randomly select 5 therapists from each hospital (taken from Portney & Watkins 1993 p119). Note that this is also called a three stage cluster sample. Multistage sampling can involve any type of sampling technique at each stage.

A common type of non-random sampling is that of 'convenience sampling' e.g. volunteers. An extension of this is snowball sampling, where respondents provide further contacts (e.g. members of weight-watchers provide other respondents). The problems with these types of non-random sample can be seen immediately.
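The three stage example can be mimicked with a short Python sketch. The frame below (50 states, 30 hospitals per state, 12 therapists per hospital) is entirely made up for illustration; only the stage sizes of 20, 100 and 5 come from the example above.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical frame: state -> hospital -> list of therapists.
frame = {
    f"state{s}": {
        f"s{s}-hosp{h}": [f"s{s}-h{h}-therapist{t}" for t in range(12)]
        for h in range(30)
    }
    for s in range(50)
}

# Stage 1: a random sample (cluster) of 20 states.
states = rng.sample(sorted(frame), 20)

# Stage 2: a random sample of 100 hospitals pooled from those 20 states.
hospitals = {name: staff for s in states for name, staff in frame[s].items()}
chosen_hospitals = rng.sample(sorted(hospitals), 100)

# Stage 3: 5 randomly selected therapists from each chosen hospital.
therapists = [t for h in chosen_hospitals for t in rng.sample(hospitals[h], 5)]
```

The final list holds 100 x 5 = 500 therapists, and at no point was a national list of therapists needed, only a list of states, then of hospitals in the chosen states, then of staff in the chosen hospitals.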

One very important factor to consider is the possible prevalence of a particular factor you may be interested in investigating. Suppose you were interested in studying albinos, or male anorexics, a random sample of the entire population would have to include thousands before you even found one. A more logical approach would be to use the non-random sampling technique of 'rarity' or 'expert' sampling whereby you obtain a list of all the subjects with a particular characteristic (i.e. a sample frame). You may then decide to take a random sample from the sample frame.

Another type of sample is a stratified sample; this can be either random or non-random. The population is divided into strata (e.g. different specialties in a hospital). Stratified random sampling can be proportionate, so that the sizes of the strata in the sample correspond to the sizes of the groups in the population, or it can be disproportionate. Disproportionate stratified random sampling is where the researcher deliberately produces random samples of the required size (e.g. possibly equal) to facilitate comparison across strata (e.g. specialty). A non-random sampling equivalent to this is quota sampling, where each stratum is sampled by convenience sampling or some other non-random method.
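The difference between proportionate and disproportionate stratified random sampling can be illustrated with a small Python sketch. The three specialties, their sizes and the sample sizes are invented for the example, and the helper stratified_sample is a hypothetical name.

```python
import random


def stratified_sample(strata, sizes, rng=None):
    """Draw a simple random sample of sizes[name] from each stratum."""
    rng = rng if rng is not None else random.Random()
    return {name: rng.sample(members, sizes[name])
            for name, members in strata.items()}


# Hypothetical strata: staff lists for three hospital specialties.
strata = {
    "medicine": [f"med{i}" for i in range(600)],
    "surgery": [f"surg{i}" for i in range(300)],
    "paediatrics": [f"paed{i}" for i in range(100)],
}
total = sum(len(members) for members in strata.values())
target = 100  # overall sample size wanted

# Proportionate: each stratum contributes in proportion to its size.
proportionate = {name: round(target * len(members) / total)
                 for name, members in strata.items()}

# Disproportionate: equal sized samples to ease comparison across strata.
disproportionate = {name: 30 for name in strata}

rng = random.Random(7)
prop_sample = stratified_sample(strata, proportionate, rng)
disp_sample = stratified_sample(strata, disproportionate, rng)
```

Here the proportionate allocation is 60, 30 and 10, whereas the disproportionate one takes 30 from every specialty. With less tidy stratum sizes the rounded proportionate allocations may not add up exactly to the target, and results from a disproportionate sample normally need weighting before they are generalised to the whole population.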

A particular type of time sampling called historical control is sometimes used in medical research. A historical control group is a sample formed from subjects which will have presented in the past and are considered to be 'similar' to some other group currently under investigation. The dangers of such an approach are readily apparent.

The aim of most sampling techniques is to allow the researcher to reduce the sample size as much as possible without compromising the validity of the research significantly. For example a disproportionate stratified random sample can reduce the sample size required from 963 to 175 (Wilson 1975 p.123).

Whichever sampling method you choose you should consider how you wish to generalise your results and how you intend to analyse the data. Statistical advice should always be sought AT THIS STAGE.

H W Smith p105 - 30 provides an excellent chapter on sampling methods.

I have deliberately avoided mentioning sample size calculations. Rodeghier provides practical advice on sample size, with further details in Czaja & Blair 1996. They also provide details of developing a sampling frame from a telephone directory. Rodeghier provides the following advice:

Base the sample size on a minimally adequate sample size for the important subgroups you may wish to analyse. You should strive to include at least 50 respondents, preferably 100 in each important subgroup. A hospital surveying patients interested in looking at differences across disease categories could plan to include at least 50 patients who had cancer, 50 with heart disease, 50 women who gave birth, and so on.

The actual sample size to be drawn should be:
sample size = number of respondents / planned response rate
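As a worked illustration of this formula, the short Python sketch below computes how many questionnaires to send out. The function name and the figures (three subgroups of 50 respondents and a planned response rate of 50%) are assumptions chosen for the example, not recommendations.

```python
import math


def sample_size_to_draw(respondents_needed, planned_response_rate):
    """Number of people to sample so that the expected returns are enough.

    planned_response_rate is a proportion, e.g. 0.5 for 50%.
    """
    if not 0 < planned_response_rate <= 1:
        raise ValueError("planned_response_rate must be between 0 and 1")
    # Round up: part of a questionnaire cannot be sent out.
    return math.ceil(respondents_needed / planned_response_rate)


# Hypothetical plan: 3 important subgroups, at least 50 respondents in each.
respondents_needed = 3 * 50
to_draw = sample_size_to_draw(respondents_needed, 0.5)  # 150 / 0.5 = 300
```

So to end up with 150 completed questionnaires at a 50% response rate, 300 would need to be sent out. If the subgroups are expected to respond at different rates, the same calculation can be done separately for each one.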

One last type of sample which is used in clinical trials but rarely in questionnaire design is a sequential sample. A sequential sample is one where observations are made (i.e. subject recruited) until enough data have been collected to make a decision. In this situation the data set is constantly being re-analysed until the statistician says no more.

Concerning questions of sample size always consult a statistician at the design stage NOT after you have collected the data.

9.Design Instrument / Coding frame

The design of the actual questionnaire should be relatively painless if all the above stages have been carried out. All questions should be clear and unbiased. You should also have decided the method of delivery (see section 'administer instrument' below). Actual questionnaire design will be covered in greater depth in another handout.

A coding frame should also be developed. This is essential no matter how small the project. Each question will be either pre or post coded, that is coded before administration or coded after a sub-sample of the completed questionnaires has been analysed. Post-coding is often used for open ended responses, such as 'any comments': a researcher will go through a sub-sample of the completed questionnaires noting responses and producing codes for them which will be used for the entire sample. Below is given an example of part of a questionnaire and the relevant part of the coding frame:



SPSS has a facility to produce a coding frame of sorts which you can find from the main menu: Utilities -> File Info. Obviously you need to set the datafile up in the appropriate way for this facility to be of any use (see Rodeghier 1996 for details).
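Outside SPSS, a coding frame can also be kept as a simple lookup table. The Python sketch below is a minimal illustration only: the two questions, their codes, the missing value code of 9 and the helper code_response are all hypothetical.

```python
# Hypothetical coding frame: variable name -> question text and value codes.
CODING_FRAME = {
    "sex": {
        "question": "Q1. Sex",
        "codes": {"male": 1, "female": 2},
    },
    "satisfied": {
        "question": "Q2. How happy are you with the service?",
        "codes": {"very unhappy": 1, "unhappy": 2, "happy": 3, "very happy": 4},
    },
}
MISSING = 9  # assumed code for a blank or unrecognised answer


def code_response(variable, answer):
    """Translate a raw answer into its numeric code, or MISSING."""
    codes = CODING_FRAME[variable]["codes"]
    return codes.get(answer.strip().lower(), MISSING)


# One completed questionnaire as it might be typed in from the paper form.
raw_case = {"sex": "Female", "satisfied": " very happy "}
coded_case = {var: code_response(var, ans) for var, ans in raw_case.items()}
```

The coded case comes out as sex = 2 and satisfied = 4. Keeping the frame in one place like this means the same codes are applied to every case, and a post-coded open question can be added later simply by adding its codes to the table.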


Often people collect mountains of data much of which is unnecessary and impossible to analyse. A good idea is to plan what you intend to put into the report at the end of the research, and check that you can get it out of the questionnaire. What you do not intend reporting on in the report should not be in the questionnaire!

If you are measuring complex concepts such as motivation, fear etc. you will need to use either complex statistical procedures (factor analysis, Reliability analysis) or a well tried questionnaire. As a last resort you can use an 'expert in the area' to help define the questions for a pilot study which would then be analysed using the above techniques to provide the structure to the final version.

9.1Web based and email questionnaires

The new millennium has seen the rise of the ubiquitous internet. Now most people have internet access and use it as a part of life. Web based or emailed questionnaires are now extremely common. Most web based questionnaires make use of a database back end and you can use free web sites to help you set up such questionnaires, however it does require some IT skills.

Adobe Acrobat provides a method (via a program called Livecycle) whereby you can create PDF forms; the returned forms can be collated on the machine that sent them out, and the program also offers rudimentary form tracking facilities. Microsoft provides a similar facility. Both these approaches cost money, around £9000 for the full version of Adobe Acrobat, but for this amount they also provide a web server facility to which the completed forms can be sent instead of back to you, and from which you can download the collated datafile.