Statistics for Marketing & Consumer Research (M. Mazzocchi)

SAGE Publications

Multiple choice questions

Chapter 1

  1. The difference between random and systematic errors is:

A)Systematic errors are always larger than random errors

B)Random errors are subject to probability laws, while systematic errors are not

C)Systematic errors can be eliminated by taking repeated measurements

  1. If measurement error follows a Normal curve centred at zero, then:

A)Errors in excess compensate for errors in defect

B)Errors in excess are more likely than errors in defect

C)The average error is negative

  1. Which of the following is a qualitative ordinal variable?

A)Monthly income

B)Quantity of coffee drunk in a week

C)Customer satisfaction measured on a Likert scale from one to seven

  1. High reliability for a set of questionnaire items measuring a latent construct means that:

A)The items are bipolar

B)The items are not internally consistent

C)The items are internally consistent

  1. What is the difference between a Likert and a Semantic Differential (SD) scale?

A)The Likert scale requires two bipolar attributes, the SD scale one

B)The SD scale requires two bipolar attributes, the Likert scale one

C)There is no difference

Chapter 2

  1. Primary data are:

A)Data already collected by others for different purposes

B)Data collected before secondary data

C)Data explicitly collected for the specific purpose of the research

  1. Secondary data are always preferable to primary data when:

A)Secondary data are more expensive than primary data

B)The target population and sampling fit with the research objective and quality is acceptable

C)The sample size of secondary data is larger than that of primary data

  1. What is the difference between household budget surveys and panel household surveys?

A)Household budget surveys include the same households over time

B)Panel household surveys include the same households over time

C)Panel household surveys do not record budgets

  1. What is the COICOP classification?

A)A classification of households according to the economic status of the reference person

B)A classification of surveys according to the type of sampling procedure

C)A classification of expenditure items according to their purpose

  1. What is scan data?

A)Purchase data collected through a bar-code scanner

B)Old data transcribed into electronic files through scanning facilities

C)Machine-readable questionnaires

Chapter 3

  1. Which of the following is not a non-response error?

A)Sampling frame error

B)Not-at-home error

C)Refusal

  1. What is the difference between sampling and non-sampling error?

A)Sampling error can be quantified with probabilistic sampling

B)Non-sampling error can be quantified with large sample sizes

C)Sampling error is always larger than non-sampling error

  1. What is the reference population?

A)A population used for comparison with the selected sample

B)The list of subjects included in the sample

C)The complete set of subjects relevant to the research

  1. Which survey method has the highest non-response rate?

A)Telephone

B)Mall intercept

C)Mail surveys

  1. What is the duration of a telephone interview to ensure adequate quality of responses?

A)15-20 minutes

B)30-40 minutes

C)45-60 minutes

  1. Sensitive questions:

A)Should never be at the beginning of a questionnaire

B)Should never be asked through multiple choice questions

C)Are better recorded through face-to-face interview

Chapter 4

  1. What is a dummy variable?

A)A metric variable with no decimals

B)A binary variable

C)A useless variable which should be discarded

  1. Which of the following is not the definition of an outlier?

A)A value with a distance from the mean higher than 2.5 times the standard deviation

B)A value with a distance from the mean higher than 1.5 times the interquartile range

C)A value with distance from the mean higher than the median

  1. Listwise or casewise deletion of missing data implies:

A)That observations (cases) with missing data are omitted from the analysis

B)That observations (cases) enter the analysis only with their non-missing values

C)That variables with missing data are omitted from the analysis

  1. In statistical analysis with pairwise deletion of missing data:

A)Observations (cases) with one or more missing values are omitted from the analysis

B)In each estimation step all valid cases are exploited

C)If a variable is omitted in an estimation step because of missing data, then it does not enter estimation any more
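
Worked illustration (not part of the original question set): a minimal Python sketch of how the two deletion rules select cases. The data, the variable names and the helper functions `listwise` and `pairwise` are hypothetical and serve only to show the mechanics.

```python
# Hypothetical toy data set: None marks a missing value.
rows = [
    {"income": 30.0, "age": 25.0, "spend": 5.0},
    {"income": None, "age": 40.0, "spend": 7.0},
    {"income": 50.0, "age": None, "spend": 9.0},
    {"income": 45.0, "age": 35.0, "spend": 8.0},
]


def listwise(rows):
    """Keep only the cases with no missing value on any variable."""
    return [r for r in rows if all(v is not None for v in r.values())]


def pairwise(rows, x, y):
    """Keep every case that is valid for the two variables in use."""
    return [(r[x], r[y]) for r in rows
            if r[x] is not None and r[y] is not None]


complete_cases = listwise(rows)
income_spend = pairwise(rows, "income", "spend")
age_spend = pairwise(rows, "age", "spend")

print(len(complete_cases))                # cases left after listwise deletion
print(len(income_spend), len(age_spend))  # cases used for each pair of variables
```

In this toy example the pairwise rule retains more cases for each pair of variables than the listwise rule retains overall, at the price of a different case base in each estimation step.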

  1. Missing responses should be treated as non-random when:

A)They are more than 5% of total responses

B)The mean of relevant variables is very different between respondents and non-respondents

C)The mean of relevant variables for non-respondents is equal to the one for respondents

  1. Which of the following is not a good strategy for imputing missing data?

A)Substituting missing values with the sample mean for that variable

B)Substituting missing values with the standard deviation for that variable

C)Imputing missing values with several approaches, then taking the average

  1. Which of the following is a measure of central tendency?

A)Standard error

B)Coefficient of variation

C)Mode

  1. What is the median?

A)The value which splits the observations in a data-set into two halves, those below and those above the median value

B)The most likely value

C)The average value

Chapter 5

  1. What is a statistical sample?

A)A measure of variability

B)A group of variables measured on all the subjects of interest

C)A subset of the target population expected to represent the whole population

  1. What is a sampling frame?

A)A list of all subjects included in the sample

B)A list of statistics to be computed in the sample

C)A list of all subjects included in the reference population

  1. Which of the following is a form of probability sampling?

A)Convenience sampling

B)Quota sampling

C)Stratified sampling

  1. Which of the following statistics is a measure of sample precision?

A)Standard error of the mean

B)Mode

C)Mean

  1. Which of the following parameters influence the choice of the sample size?

A)Variability of the target variable in the population

B)Size of the population

C)Both

  1. What is the advantage of the sampling error over the non-sampling error?

A)Sampling error can be quantified according to probability laws

B)Sampling error is much smaller than non-sampling error

C)Sampling error can be eliminated by training interviewers

  1. The rationale behind stratified sampling is:

A)to maximise heterogeneity within each stratum

B)to minimise heterogeneity between different strata

C)to maximise homogeneity within each stratum

  1. A population parameter is estimated in a sample through:

A)A sample statistic

B)The precision level

C)The sampling frame

  1. Which of the following statistics is an estimate of population variability?

A)Sample average

B)Population average

C)Sample standard deviation

  1. Which of the following situations does not favor the use of a census?

A)There is high variance in the characteristic to be measured.

B)The cost of nonsampling errors is low.

C)The population is large.

  1. All of the following statements are limitations of simple random sampling except:

A)It is often difficult to construct a sampling frame that will permit a simple random sample to be drawn.

B)Simple random sampling often results in lower precision with larger standard errors than other probability sampling techniques.

C)The sample results may be projected to the target population.

  1. Which probability sampling technique selects a random starting point and then picks every i-th element in succession from the sampling frame?

A)Simple random sampling

B)Systematic sampling

C)Cluster sampling
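
Worked illustration (not part of the original question set): a minimal Python sketch of selecting a random starting point and then every i-th element from a frame. The function name `systematic_sample`, the frame of 100 unit IDs and the seed are hypothetical, and the sketch assumes the frame size is at least the sample size.

```python
import random


def systematic_sample(frame, n, rng=random):
    """Pick a random start, then every i-th element of the frame."""
    step = len(frame) // n        # sampling interval i
    start = rng.randrange(step)   # random starting point in [0, step)
    return [frame[start + k * step] for k in range(n)]


frame = list(range(1, 101))       # hypothetical frame of 100 unit IDs
sample = systematic_sample(frame, 10, random.Random(0))
print(sample)
```

Only the starting point is random; once it is drawn, the rest of the sample is fully determined by the sampling interval.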

  1. In which one of the following ways do cluster sampling and stratified sampling differ:

A)the former is probabilistic and the latter is non probabilistic

B)there is no difference

C)with respect to homogeneity and heterogeneity within/across subgroups

  1. All of the factors listed below favor the use of probability sampling except:

A)Nonsampling errors are likely to be an important factor

B)The nature of the research is conclusive

C)The population is heterogeneous with respect to variables of interest

  1. The difference between the mean value for the sample and the true mean value of the population is a measure of:

A)Precision

B)Accuracy

C)Randomness

  1. The probability sampling technique where each element in the population has exactly the same probability of extraction is:

A)Stratified sampling

B)Simple random sampling

C)Quota sampling

  1. The sampling technique which divides the population into sub-populations which are expected to be similar to one another is:

A)Stratified sampling

B)Simple random sampling

C)Cluster sampling

  1. Which of the following is an estimate of the variability of estimates of the mean in different samples?

A)Standard error of the mean

B)Variance

C)Standard deviation
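
Worked illustration (not part of the original question set): a small simulation, assuming a Normal population with hypothetical mean 50 and standard deviation 10, that draws many samples of the same size and compares the spread of their means with the textbook formula sigma / sqrt(n). All numeric settings are assumptions chosen for the sketch.

```python
import math
import random
import statistics

rng = random.Random(42)
POP_MEAN, POP_SD = 50.0, 10.0   # hypothetical population parameters
N = 25                          # size of each sample
DRAWS = 2000                    # number of repeated samples

# Mean of each repeated sample drawn from the same population
sample_means = [
    statistics.mean([rng.gauss(POP_MEAN, POP_SD) for _ in range(N)])
    for _ in range(DRAWS)
]

empirical_se = statistics.stdev(sample_means)   # spread of the sample means
theoretical_se = POP_SD / math.sqrt(N)          # sigma / sqrt(n)

print(round(empirical_se, 2), round(theoretical_se, 2))
```

The two figures should be close to each other, which is why the variability of the mean across samples can be estimated from a single sample.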

Chapter 6

  1. What is the probability distribution for sample means extracted through simple random sampling?

A)The uniform distribution

B)The Normal distribution

C)The F distribution

  1. The 95% confidence interval for a mean from a large sample has

A)A width of about twice the standard error of the mean

B)A width of about four times the standard error of the mean

C)A width of about ten times the standard error of the mean
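
Worked illustration (not part of the original question set): a minimal Python sketch that builds a 95% confidence interval for a mean and expresses its width in units of the standard error. The simulated sample and the seed are hypothetical; the critical value 1.96 is the usual standard Normal value for large samples.

```python
import math
import random
import statistics

rng = random.Random(1)
sample = [rng.gauss(100.0, 15.0) for _ in range(400)]  # hypothetical large sample

mean = statistics.mean(sample)
sem = statistics.stdev(sample) / math.sqrt(len(sample))  # standard error of the mean

z = 1.96  # standard Normal critical value for 95% confidence
lower, upper = mean - z * sem, mean + z * sem

width_in_sems = (upper - lower) / sem  # interval width in standard-error units
print(round(lower, 2), round(upper, 2), round(width_in_sems, 2))
```

The interval extends 1.96 standard errors on each side of the sample mean, so its total width is 2 × 1.96 standard errors.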

  1. The critical values for hypothesis testing are:

A)A measure of variability of the target variable

B)The values which separate the acceptance region from the rejection region

C)The area under the probability distribution

  1. What is the level of significance α of a test?

A)The probability of rejecting the null hypothesis when it is actually true

B)The probability of non-rejecting the null hypothesis when it is actually true

C)The probability of non-rejecting the null hypothesis when it is actually false

  1. What is the relation between the level of confidence and the level of significance (α)?

A)The level of confidence is 1 + α

B)The level of confidence is 1 – α

C)There is no difference

  1. What is the power of a test?

A)The probability of rejecting the null hypothesis when it is actually true

B)The probability of non-rejecting the null hypothesis when it is actually true

C)The probability of rejecting the null hypothesis when it is actually false

  1. What is the difference between parametric and non-parametric tests?

A)Parametric tests do not require assumptions about the probability distribution of the variables being tested

B)Non-parametric tests do not require assumptions about the probability distribution of the variables being tested

C)Parametric tests are more powerful

  1. If the same sample of individuals is interviewed before and after an advertising campaign, a mean comparison test about the purchasing habits before and after the campaign is a test for:

A)Independent samples

B)Related samples

C)Paired samples

  1. Which test is more appropriate for mean comparison with related samples?

A)The t-test

B)The F-test

C)The Wilcoxon test

  1. What is the distribution of the ratio of two variances under the null hypothesis of variance equality?

A)The t-distribution

B)The Normal distribution

C)The F-distribution

  1. A t-test on two means returns a t-statistic with a p-value of 0.03. Which of the following is correct?

A)The means of the two populations are equal at 97% confidence level

B)The means of the two populations differ from each other at the 95% confidence level but not at the 99% confidence level

C)The means of the two populations show a 3% difference

Chapter 7

  1. Which of the following is a consequence of running multiple mean comparison tests?

A)It is more likely to reject the null hypothesis when it is true

B)It is impossible to compute the F-statistic

C)It is less likely to reject the null hypothesis when it is true
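
Worked illustration (not part of the original question set): a minimal Python sketch of how the chance of at least one false rejection grows with the number of tests. It assumes the tests are independent and each is run at the same significance level; the function name `familywise_error` is hypothetical.

```python
def familywise_error(alpha, k):
    """Probability of at least one Type I error in k independent tests."""
    return 1 - (1 - alpha) ** k


# Familywise error for an increasing number of tests at the 5% level
for k in (1, 3, 6, 10):
    print(k, round(familywise_error(0.05, k), 3))
```

With three groups there are three pairwise comparisons, and under these assumptions the overall error rate is already well above the nominal 5%.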

  1. When does one-way ANOVA reject the hypothesis of mean equality?

A)When all means are different

B)When at least two means are different

C)When all means are equal

  1. What is the main difference between planned comparisons and post-hoc tests?

A)Planned comparisons take into account the familywise error, post-hoc tests don’t

B)Post-hoc tests take into account the familywise error, planned comparisons don’t

C)Planned comparisons are decided prior to the analysis, post-hoc tests afterwards

  1. What is three-way ANOVA?

A)An ANOVA with three variables and one factor

B)An ANOVA with one variable and three factors

C)An ANOVA with both categorical and scale variables

  1. What is the effect size in ANOVA?

A)It is the number of factors being considered

B)It corresponds to the F-value

C)It is a measure of the relative weight of the variability attributable to the factor

  1. What is the difference between random and fixed effects?

A)Fixed effects are measured with no error, random effects are the outcome of a random variable

B)Fixed effects are larger than random effects

C)Random effects are under the control of researchers, fixed effects are not

  1. Which of the following is not an assumption of one-way ANOVA?

A)Independence between the units sampled in different treatments

B)Absence of large discrepancy in variance across different treatments

C)Different treatments are measured on the same units

  1. What is the difference between multi-way (factorial) ANOVA and multivariate ANOVA?

A)Multivariate ANOVA has more than one target variable, factorial ANOVA only one

B)Factorial ANOVA has more than one target variable, multivariate ANOVA only one

C)There is no difference

  1. What is the General Linear Model (GLM)?

A)A regression model with the target variable(s) on the left-hand side and the factors on the right-hand side

B)A general model which includes ANOVA, factorial ANOVA, MANOVA and ANCOVA as special cases

C)Both of the above

  1. Which of the following statements on ANCOVA is true?

A)ANCOVA is a GLM where all the right-hand side variables are categorical

B)ANCOVA is a GLM where the dependent variable is categorical

C)ANCOVA is a model which can contain both metric and non-metric variables on the right-hand side

Chapter 8

  1. If the F-test in a multiple regression equation returns a probability of 0.22, it means that:

A)None of the coefficients is significantly different from 0

B)All of the coefficients are significantly different from 0

C)At least one coefficient is significantly different from 0

  1. The t-test on regression coefficients tests the null hypothesis that:

A)There is no collinearity

B)The residuals are normally distributed

C)The coefficient is equal to 0

  1. Which of the following statements on the relationship between the bivariate correlation coefficient r and bivariate regression is true?

A)The regression coefficient is equal to the bivariate correlation coefficient

B)The regression R-square is the squared correlation coefficient

C)The F-test on the regression is equal to the bivariate correlation coefficient
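
Worked illustration (not part of the original question set): a minimal Python sketch that fits a bivariate least-squares regression by hand and compares its R-square with the squared correlation coefficient. The five data points are hypothetical.

```python
import math

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]   # hypothetical observations

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)

r = sxy / math.sqrt(sxx * syy)           # bivariate correlation coefficient

slope = sxy / sxx                        # least-squares slope
intercept = mean_y - slope * mean_x      # least-squares intercept
residual_ss = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
r_square = 1 - residual_ss / syy         # regression goodness of fit

print(round(slope, 4), round(r, 4))
print(round(r ** 2, 6), round(r_square, 6))
```

The slope and the correlation coefficient are different numbers in general, while the two quantities on the last line coincide.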

  1. What is the difference between covariance and correlation?

A)Covariance depends on the measurement unit, correlation is standardized to be between zero and one

B)Correlation depends on the measurement unit, covariance is standardized to be between zero and one

C)Covariance depends on the measurement unit, correlation is standardized to be between minus one and one
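
Worked illustration (not part of the original question set): a minimal Python sketch that recomputes covariance and correlation after changing the measurement unit of one variable. The data and the variable names (`visits`, `spend_eur`, `spend_cents`) are hypothetical.

```python
import statistics


def covariance(x, y):
    """Sample covariance with the n - 1 denominator."""
    mx, my = statistics.mean(x), statistics.mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def correlation(x, y):
    """Covariance standardized by the two sample standard deviations."""
    return covariance(x, y) / (statistics.stdev(x) * statistics.stdev(y))


visits = [1, 2, 3, 4, 5]
spend_eur = [10.0, 12.5, 11.0, 15.0, 16.5]
spend_cents = [v * 100 for v in spend_eur]   # same spending, different unit
reversed_spend = [-v for v in spend_eur]     # sign flipped to show a negative value

print(covariance(visits, spend_eur), covariance(visits, spend_cents))
print(round(correlation(visits, spend_eur), 4),
      round(correlation(visits, spend_cents), 4))
print(round(correlation(visits, reversed_spend), 4))
```

Rescaling one variable multiplies the covariance by the same factor and leaves the correlation unchanged; flipping the sign of one variable gives a correlation below zero.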

  1. A bivariate correlation coefficient equal to minus one means that:

A)There is no correlation between the two variables

B)The two variables are perfectly correlated, as one increases by n%, the other also increases by n%

C)The two variables are perfectly correlated, as one increases by n%, the other decreases by n%

  1. Which of the following is an appropriate equation for multiple regression?

A)

B)

C)

  1. The R-square indicator measures the goodness of fit of a regression model. What does an R-square value of zero mean?

A)There is no relationship between the dependent and explanatory variables

B)The model performance is very poor

C)The model fits the data perfectly

  1. What does the partial correlation coefficient measure?

A)The correlation between two variables after controlling for the effects of one or more additional variables

B)The correlation among three variables

C)The percentage of correlation between two variables which depends upon a third variable

  1. What is the difference between stepwise and forward model selection methods?

A)The stepwise method allows the removal of variables after they have been included, the forward method does not

B)The stepwise method goes backward (removing variables), the forward method goes forward (adding variables)

C)There is no difference

Chapter 9

  1. When are two categorical variables said to be associated?

A)When frequencies in the contingency table are equally distributed across cells

B)When a relationship exists between frequencies in each category of the first variable and frequencies in categories of the second variable

C)When they have the same number of categories

  1. What is the null hypothesis of the Pearson chi-square test?

A)That two categorical variables are independent

B)That two categorical variables are associated

C)That two categorical variables have the same distribution
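
Worked illustration (not part of the original question set): a minimal Python sketch of the Pearson chi-square statistic for a 2×2 contingency table, computed from the frequencies expected when the row and column variables are unrelated. The observed counts are hypothetical; 3.841 is the standard 5% critical value of the chi-square distribution with one degree of freedom.

```python
# Hypothetical 2x2 contingency table of observed frequencies
observed = [[30, 20],
            [20, 30]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency of each cell: row total * column total / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_square = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(2)
    for j in range(2)
)

CRITICAL_5PCT_1DF = 3.841   # chi-square critical value, 1 degree of freedom
print(chi_square, chi_square > CRITICAL_5PCT_1DF)
```

A statistic above the critical value means the observed frequencies are too far from the expected ones for the null hypothesis to be retained at the 5% level.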

  1. Which of the following tests works with strictly nominal variables?

A)Goodman and Kruskal's Lambda

B)Somers' d statistic

C)Tau statistic

  1. What is the dependent variable in a log-linear model?

A)The exponent of the cell frequencies in a contingency table

B)The logarithm of the cell frequencies in a contingency table

C)The ratio of the cell frequencies in a contingency table

  1. What is the relation between the Pearson chi-square test and log-linear analysis in a 2×2 table?

A)The Pearson chi-square test is more powerful

B)Log-linear analysis does not work with 2×2 tables

C)The Pearson chi-square test corresponds to the test on the 2nd order interaction in log-linear analysis

  1. What is the model selection process in hierarchical log-linear analysis?

A)Higher-order interactions are tested first, the process stops when deletion of a term makes the frequencies predicted by the model significantly different from the frequencies of the saturated model

B)Lower-order interactions are tested first, the process stops when deletion of a term makes the frequencies predicted by the model significantly different from the frequencies of the saturated model