Sampling

Studies are rarely carried out on everyone in a population. In most field data collection activities, irrespective of the methodology, a sample is drawn from a specific group of people with the objective of obtaining a sub-group that is representative of the larger population. In the end, decisions about samples will be a compromise between cost, accuracy (including how accurate the information has to be to satisfy the M/E stakeholders), the nature of the research question(s), and the art of the possible.

Basic elements of designing sampling procedures

Determine whether it is appropriate/feasible to collect data from all the sources or if a sample needs to be studied.

Define the population to be studied.

Decide whether a control group is needed.

The choice of sampling techniques and sample size depends on the:

§  use/users of the M/E activity, and especially the users’ demands in terms of type and level of precision and when results are needed;

§  data collection techniques to be used – quantitative or qualitative;

§  personnel available, e.g. number and skills;

§  resources and time constraints – logistics, access (seasonal, insecurity).

Decide whether to use a probabilistic or non-probabilistic sample.

Identify the sampling frame.

Decide the sample plan and select elements. This depends on:

§  the degree of accuracy required;

§  the degree of variability in the total population with respect to the issues under study.

Categories of sampling:

Probabilistic and non-probabilistic

There are two categories of sampling: probabilistic and non-probabilistic.

Category / Characteristics / Examples
Probability samples / In all of these, every unit in the studied population has a known, non-zero chance of being selected in the sample. / Simple random; systematic; stratified; cluster
Non-probability samples / In these sampling methods, population units have unknown, and typically unequal, chances of being selected. / Purposive; quota sampling/maximum variation; convenience
PROBABILITY SAMPLES

There are four types of probability samples:

Simple Random Sampling (SRS)
[Illustration: a population of 30 units, numbered 1 to 30.]
/ Everyone in the population is assigned a number.
Random numbers are chosen using a calculator, computer or random number tables.
Alternatively, all individuals’ names are written on pieces of paper and the lottery method is used to pick names.
The selection of one individual is always independent of the selection of another individual.
/ ·  Many statistical tests are designed for this type of sample.
·  Evaluators need a complete list of everyone in the studied population.
Systematic sampling
[Illustration: a numbered list divided into equal sections, with units numbered 1 to 9 within each section.]
/ A list of individuals/households is numbered.
The list is divided into ‘n’ equal sections (‘n’ being the sample size).
One unit is selected from the first section using SRS.
The unit in the corresponding position is then taken from every other section (e.g. if the sixth unit is chosen in the first section, the sixth unit is taken from each of the other sections).
/ ·  The population can be listed in a meaningful order (such as grouping each county together), which ensures that samples are taken from each county.
Stratified sampling
/ Divide the population into homogeneous sub-groups.
Select sub-samples from each sub-group.
(If one selects them in proportion to the size of the sub-group, the sample does not need to be weighted later.)
/ ·  Maintains representation with small samples, even when the population is quite large.
·  Ensures representation if the population is highly heterogeneous and some groups might otherwise be left out of the sample.
·  Is cost-effective and time-saving.
Cluster sampling
/ The population is divided into different groups (such as counties, provinces, and communities), some of which are chosen at random.
Sometimes, a second layer of selection is done: a few clusters are selected from the chosen groups.
All households or people in the selected clusters form the sample.
/ ·  Used where there are budget constraints, less time, the population is large and spread over a large area, the survey group is small, and it is easier to divide the population into heterogeneous groups. In practice, virtually all surveys UNICEF becomes involved with employ some form of cluster sampling.
·  Used in large surveys to measure goal indicators (MICS survey) where complete lists of households or individuals are not available and would be too expensive to construct.
·  Used in cases where a random sample of households or individuals would be logistically problematic for a survey because of the distances one would need to travel.
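
The first two procedures in the table above, simple random and systematic selection from a numbered list, can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the 30-unit population and the fixed seeds are assumptions made for the example, and the systematic version assumes the list divides exactly into ‘n’ equal sections.

```python
import random


def simple_random_sample(units, n, seed=None):
    """Draw n units without replacement; every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(units, n)


def systematic_sample(units, n, seed=None):
    """Split the list into n equal sections and take the same position from each."""
    if n <= 0 or len(units) % n != 0:
        raise ValueError("this sketch assumes the list divides into n equal sections")
    section_size = len(units) // n
    rng = random.Random(seed)
    # Pick one position at random within the first section ...
    position = rng.randrange(section_size)
    # ... and take the unit at that position from every section.
    return [units[section * section_size + position] for section in range(n)]


# Hypothetical population: 30 numbered units.
population = list(range(1, 31))
srs = simple_random_sample(population, 5, seed=1)
systematic = systematic_sample(population, 5, seed=1)
print("Simple random sample:", srs)
print("Systematic sample:   ", systematic)
```

In the systematic sample the selected numbers are always a fixed interval apart (here, six), which is why the order of the list matters.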

The four types can be combined in multistage sampling:

·  A larger population is first divided into different groups/sectors using pre-determined criteria and a sample is selected.

·  In subsequent stages, these samples are treated as the population and divided further using a set of criteria to select the desired sample.

·  One or more types of sampling methods could be used at different stages.
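
The stages described above can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the nested data layout, the names (`multistage_sample`, `province_A`, `community_A0`, and so on) and the sample sizes are all hypothetical. It combines stratified selection (every province is included), random selection of clusters within each province, and SRS of households within each selected cluster.

```python
import random


def multistage_sample(strata, clusters_per_stratum, households_per_cluster, seed=None):
    """strata maps stratum name -> {cluster name: [household ids]} (hypothetical layout)."""
    rng = random.Random(seed)
    selected = []
    # Stage 1: every stratum is represented (stratified sampling).
    for stratum_name, clusters in strata.items():
        # Stage 2: a few clusters are drawn at random within the stratum.
        chosen_clusters = rng.sample(sorted(clusters), clusters_per_stratum)
        for cluster_name in chosen_clusters:
            households = clusters[cluster_name]
            k = min(households_per_cluster, len(households))
            # Stage 3: simple random sample of households within the cluster.
            for household in rng.sample(households, k):
                selected.append((stratum_name, cluster_name, household))
    return selected


# Hypothetical frame: 2 provinces x 4 communities x 10 households.
strata = {
    f"province_{p}": {
        f"community_{p}{c}": [f"hh_{p}{c}_{h}" for h in range(10)]
        for c in range(4)
    }
    for p in "AB"
}
sample = multistage_sample(strata, clusters_per_stratum=2, households_per_cluster=3, seed=7)
print(len(sample), "households selected")
```

In this toy frame all strata and clusters are the same size. Where they differ, units end up with different selection probabilities and the results would generally need to be weighted.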

NON-PROBABILITY SAMPLES

There are three types of non-probability samples:

Purposive / In this sampling method, a place is chosen for the sample “on purpose,” hence “purposive”. Purposive criteria can be context-related (e.g. gender and age, cultural/ethnic composition, wealth, location factors) as well as project-related (e.g. project/non-project sites, good and poor performers, stage of implementation). For example, some sentinel communities might be chosen because the programme is known to have been implemented very well there, while others are chosen because they are similar to the programme communities in many respects except that no programme implementation has taken place there.
Quota sampling/maximum variation / This method includes as much expected variation as possible, thus seeking to provide representative samples without random selection of cases. For example, an evaluation of programme implementation might include interviews of representatives of all programme stakeholders, including staff, beneficiaries, donors, government officials, and others affected by the programme.
Convenience sampling / This approach is designed to answer evaluation questions that may not be best served by small random clusters. It addresses a different set of issues than random sampling. For pilot testing or exploratory research, selecting a sample of willing respondents is acceptable even though the sample is not representative. Although this method of sampling is an expedient way of doing some kinds of evaluation research, it is the least rigorous.
Convenience sampling techniques include:
·  Semi-random fashion: Spin a pen or a bottle to choose the direction to follow from the centre, then choose a number to predetermine which house or person will be asked for an interview (be careful about centre or “tarmac” bias).
·  Ad hoc manner: Based on spur-of-the-moment, arbitrary selection (consider unconscious biases).
·  Location: For example those attending a facility such as a clinic or a market (consider gender or age biases).
·  Snowball sampling: Early interviewees are asked to name acquaintances who will then be interviewed next, and so forth (consider bias of the sub-group, ethnic or by location, which may be naturally over-represented).


Guarding against sampling bias

RISKS IN NON-RANDOM DATA GATHERING

It is important to be aware that an arbitrarily chosen sample can be biased. The following table presents examples of location bias of non-randomly-gathered data in assessing children’s health in a crisis situation.

If you sample people… / Data can be biased by… / Because…
On the streets or in the market / Under-reporting / Ill children are less likely to be outside. Without household mortality data, you will only see survivors.
At feeding centres / Under-reporting & over-reporting / Those present are getting food, while others who need food may not be getting it. In some situations, only the worst cases are allowed into feeding centres.
In hospitals/health centres / Over-reporting / Sicker people are in health facilities.
Near the centre of the camp/village / Under-reporting / “Wealthier” or more powerful people may live there.
In any single area of the camp / Under-reporting & over-reporting / People of similar status (or physical condition) tend to live together.
Along roads / Under-reporting / “Wealthier” or more powerful people may live there.

(Source: MSF.)

CHECKING YOUR SAMPLE
Sampling representative of a larger population:
·  What might have occurred to make the sample atypical of the wider group?
·  Could certain types of participants be less likely to be selected than others?
·  Could pragmatic criteria such as cost or time constraints introduce bias into the sample selection?

Sampling particular groups or people, linked to qualitative aspects:
·  Does the sample cover those whose views and opinions are particularly important or normally overlooked, in particular women and the poorest groups?
·  Whose views and opinions will not be covered by a given sample, and does exclusion matter?
·  Does the sample cover all the groups likely to have differing opinions or views?
·  Does the sample help us to understand the linkages between different units of analysis (such as individuals and organisations)?

Source: Roche (1999).


Issues in sample size calculations

SAMPLE SIZE FOR MEASURING TRENDS

Sample size calculations for estimating a proportion with a given precision are adequate for many indicators of typical interest to UNICEF-supported programmes. Quite often, programme stakeholders have a strong interest in estimating trends over time.

However, sample size calculations for measuring trends are somewhat more complex. Though these are not dealt with in depth in this session, here are three useful rules of thumb:

  1. If the two samples are completely unrelated (e.g. different clusters are used in the two surveys) and each estimate has a margin of error of plus or minus five percentage points, then the smallest trend difference one should be able to detect is about seven percentage points, i.e. 1.4 times the original margin of error.
  2. If one uses the same clusters in both surveys, precision improves, making it possible to measure a change of the same size as the margin of error (for example, five percentage points). This assumes a correlation of 0.5 between the cluster-specific estimates in the two surveys.
  3. If the households are exactly the same, as in sentinel community surveillance, a precise estimate of trends should be possible. However, precision will be less if there is a lot of migration in and/or out of the clusters, or if an accurate record is not kept of which households are part of the cluster.
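
The first two rules of thumb agree with the standard formula for the margin of error of a difference between two estimates, sqrt(e1² + e2² − 2·r·e1·e2), where e1 and e2 are the two margins of error and r is the correlation between the estimates. The text does not state which formula it relies on, so the sketch below is an assumption offered only as a consistency check; the function name is hypothetical.

```python
import math


def margin_of_error_of_difference(e1, e2, correlation=0.0):
    """Margin of error of (estimate 2 - estimate 1), given each margin and their correlation."""
    return math.sqrt(e1 ** 2 + e2 ** 2 - 2 * correlation * e1 * e2)


# Rule 1: unrelated samples (correlation 0), each with +/- 5 points.
independent = margin_of_error_of_difference(5, 5, correlation=0.0)
# Rule 2: same clusters in both surveys, assumed correlation of 0.5.
same_clusters = margin_of_error_of_difference(5, 5, correlation=0.5)

print(f"Unrelated samples: +/- {independent:.1f} points")   # about 7.1
print(f"Same clusters:     +/- {same_clusters:.1f} points")  # 5.0
```

With a correlation of zero the result is 5 × √2, roughly seven points; with a correlation of 0.5 it falls back to five points, matching the figures quoted in the list.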

Using the same clusters in repeated surveys has advantages and disadvantages, which each country should weigh when considering repeated surveys. In addition to reducing sample size, it simplifies fieldwork because the areas will already have been mapped in the first survey. On the negative side, surveys may include educational messages or raise the community’s awareness of health problems and therefore might lead to changes in health behaviour (known as the “Hawthorne effect”). Subsequent surveys in the same areas may, therefore, be misleading because these communities may no longer be representative of the country. Usually, the dangers of the Hawthorne effect are very much overstated.

Sample sizes for sub-group analysis

Sub-group analyses may include breakdowns of the indicators by gender, socio-economic group and so forth. Indicators based on sub-groups will be less precise than those calculated for the whole sample.

The smaller the sub-group, the less precise the estimate.

The examples below show how the margins of error increase for smaller sub-groups. Based on an overall sample with a margin of error plus or minus five percentage points for a given indicator, one would have:

§  A margin of error of approximately ± 6.3 percentage points for gender-specific indicators (given 50 percent boys and 50 percent girls in the sample).

§  A margin of error of approximately ± 8.6 percentage points for a sub-group making up 20 percent of the sample (for example, a given socio-economic category).

These results demonstrate that if the overall margin of error is about five percentage points, reasonably precise results will also be obtained for gender-specific indicators, as well as for other sub-groups making up 20 percent or more of the whole sample.
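
The text does not say how the sub-group figures above were derived. Under simple random sampling the margin of error would scale with 1/√(share), giving about ± 7.1 and ± 11.2 points; the smaller figures quoted are consistent with a clustered design in which the design effect shrinks for sub-groups spread across all clusters. The sketch below illustrates this under two assumptions that are ours, not the source's: an overall design effect of 2, and a sub-group design effect of approximately 1 + share × (design effect − 1).

```python
import math


def subgroup_margin(overall_margin, share, design_effect=1.0):
    """Approximate margin of error for a sub-group making up `share` of the sample.

    Assumes the sub-group is spread across all clusters, so its design effect
    is roughly 1 + share * (design_effect - 1). With design_effect=1 this
    reduces to the simple-random-sampling scaling overall_margin / sqrt(share).
    """
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    subgroup_deff = 1 + share * (design_effect - 1)
    return overall_margin * math.sqrt(subgroup_deff / (share * design_effect))


for share in (0.5, 0.2):
    srs = subgroup_margin(5, share)                          # no clustering
    clustered = subgroup_margin(5, share, design_effect=2)   # assumed design effect
    print(f"share {share:.0%}: SRS +/- {srs:.1f}, clustered +/- {clustered:.1f}")
```

Under these assumptions the clustered margins come out at roughly ± 6.1 and ± 8.7 points, close to but not identical to the ± 6.3 and ± 8.6 quoted above, so the sketch should be read as an approximation of the reasoning rather than a reproduction of the original calculation.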
