Information on the QR Midterm

11.220: Quantitative Reasoning and Statistics for Planning I

Information about the test-out exam

The exam will be held on Friday, February 3rd; the time and room can be found at [link]. Please bring pens/pencils and a calculator to the exam. You may also bring up to 2 textbooks and 1 binder/set of *your* notes. We will supply copies of any statistical tables needed to complete the exam (although you’re welcome to use tables in your textbooks if you prefer).

The QR staff will grade the exams over the weekend, and we hope to place the results in your mail files on Monday, February 6th. Please note that we will give partial credit for problems that you attempt but do not complete or make an error in completing. Thus, as you prepare for the exam, get in the habit of showing your work.

The possible outcomes of the test-out exam are that (1) you pass out of 11.220 entirely; (2) you partially pass out and must complete one or more assignments and/or attend one or more specific lectures to address a particular deficiency; or (3) you must take QR.

Information about the brush-up sessions

The brush-up sessions will take place on January 30th, January 31st, and February 1st; the times and rooms can be found at [link]. The brush-up sessions are meant to refresh your memory and give you an opportunity to ask questions that arise while you prepare for the exam over IAP. Please note that they are not designed to teach you QR/stats that you haven’t had before! The instructor will go through the topics covered in the QR syllabus (and the review below) and will have a few examples prepared for you. However, the instructor will also respond to student demand for assistance on particular topics, so the sessions will be much more valuable to you if you come prepared with questions. By questions, we mean questions about QR/stats principles and tools, not questions about the test-out exam. For information about the exam itself, review the sample test-out exam, as well as past midterm and final exams, on the class website.

Each day covers different material (what is covered on the 30th will not be repeated on the 31st), so to take full advantage of the brush-up sessions you will need to come on all three days. You are welcome, however, to come for as little or as much of the sessions as you like. We know that some of you are taking other IAP classes during this week. Feel free to leave the brush-up and come back when your other class finishes.

Brush-up review

The following list of questions is meant to guide your preparation for the test-out exam. The answers to all these questions can be found by reviewing the class textbooks and the solution sets to past problem sets and exams. Note that a big part of 11.220 is developing the ability to explain a QR/stats concept in a way that someone with no training in statistics could understand. This is important for planners, and at least one-third of the points on the test-out exam will be related to your ability to express concepts in clear, simple language. We know that this is not easy to do well, and it is a skill that many stats classes overlook. Make sure you understand the ideas behind the formulas and symbols, and that you can express them clearly for the exam.

Argumentation:

What is a premise? What is a conclusion?

What is the difference between an inductive and a deductive argument? Which is more relevant to statistics and why?

Types of data:

Know the difference between ordinal, nominal, and interval/ratio data.

Know which quantitative and statistical tools can be used with which kind of data.

Measurement:

What is a construct? What is an indicator?

What is a valid indicator? What about a reliable indicator? Can you have one without the other?

What is a biased indicator?

Univariate data summary and analysis:

Be able to look at summary statistics and/or a graph and say something about the argument summarized therein.

Understand what the mean, median, quartiles, and mode are and how to calculate them for grouped and ungrouped data.

Understand what an outlier is and how to detect it.

What kind of analysis might be affected by outliers? How might you deal with outliers in such cases?

Understand what variance & standard deviation are and how to calculate them for grouped and ungrouped data.

Know how to compute and interpret a Z-score.
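As a rough illustration of the summary measures listed above, here is a minimal Python sketch using only the standard library. The data values are hypothetical (they are not taken from any course material), and the grouped-data formulas divide by n; your textbook may divide by n - 1 for sample data, so check which convention applies.

```python
import statistics

# Hypothetical ungrouped data: commute times in minutes (made up for illustration).
times = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(times)        # arithmetic average
median = statistics.median(times)    # middle value of the sorted data
mode = statistics.mode(times)        # most frequent value
pop_sd = statistics.pstdev(times)    # population standard deviation (divides by n)
sample_sd = statistics.stdev(times)  # sample standard deviation (divides by n - 1)

# Z-score: how many standard deviations a value lies above or below the mean.
z_for_9 = (9 - mean) / pop_sd

# Hypothetical grouped data: class midpoints and the number of cases in each class.
midpoints = [5, 15, 25]
freqs = [2, 5, 3]
n = sum(freqs)
grouped_mean = sum(f * m for f, m in zip(freqs, midpoints)) / n
grouped_var = sum(f * (m - grouped_mean) ** 2 for f, m in zip(freqs, midpoints)) / n

print("mean:", mean, "median:", median, "mode:", mode)
print("population SD:", pop_sd, "sample SD:", round(sample_sd, 3))
print("z-score of 9:", z_for_9)
print("grouped mean:", grouped_mean, "grouped variance:", grouped_var)
```

In plain language, the Z-score here says how unusual a 9-minute commute is relative to the typical spread of commute times in this (made-up) data set.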

Bivariate data summary & analysis:

Understand the conceptual distinctions between dependent and independent variables.

Know how to interpret scatterplots, graphs, and contingency tables and say something about the argument summarized therein.

Understand (Pearson’s) correlation: what it means, how to ‘eyeball’ values for a set of data, and the principle behind how it’s computed (note that you will not have to compute it yourself).
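You will not have to compute the correlation on the exam, but seeing the principle in code may help. The sketch below is one common way to write Pearson’s r, using hypothetical data; the function name `pearson_r` is our own, not something from the course.

```python
import math

# Hypothetical paired observations (made up for illustration).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def pearson_r(xs, ys):
    """Pearson's r: co-variation of x and y, scaled by the variation in each."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations: positive when x and y move together.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    # Sums of squared deviations for each variable.
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(x, y)
print("r =", round(r, 3))
```

Because the numerator is scaled by the spread of both variables, r always falls between -1 and +1 and does not depend on the units of measurement.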

Understand the conceptual distinctions between causation and association or correlation. How is statistics related to “proving causality”?

What is confounding and why is it a problem for quantitative analysis?

Regression analysis:

What is the purpose of OLS regression analysis? For what kind of data can this tool be used?

What is the principle behind choosing where to put the OLS regression line through a set of data?

Know how to interpret the values derived from a regression equation (slope, intercept, y-hat, error or residual). (You will not have to compute the slope or intercept values for a regression line.)

Know how to compute and interpret predicted values of the dependent variable, as well as impacts of changes in the independent variable value, using a regression equation.

Know how to compute and interpret errors (residuals) using a regression equation.

Know what r² is, the principle behind its computation (you will not have to compute it yourself), and how to interpret its value.
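To make the regression ideas above concrete, here is a minimal sketch with hypothetical data. It fits the least-squares line with the textbook formulas (which you will not need to do on the exam), then computes predicted values, residuals, and r². The helper name `fit_ols` is our own.

```python
# Hypothetical data (made up for illustration).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def fit_ols(xs, ys):
    """Return (intercept, slope) of the line that minimizes the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

a, b = fit_ols(x, y)

# Predicted values (y-hat) and residuals (observed minus predicted).
predicted = [a + b * xi for xi in x]
residuals = [yi - yh for yi, yh in zip(y, predicted)]

# r-squared: share of the variation in y accounted for by the regression line.
mean_y = sum(y) / len(y)
sse = sum(e ** 2 for e in residuals)       # unexplained variation
sst = sum((yi - mean_y) ** 2 for yi in y)  # total variation
r_squared = 1 - sse / sst

# Predicted impact of a 2-unit increase in x: slope times the change in x.
impact_of_2_units = b * 2

print("intercept:", round(a, 3), "slope:", round(b, 3))
print("residuals:", [round(e, 3) for e in residuals])
print("r-squared:", round(r_squared, 3))
```

Note that the residuals sum to (approximately) zero, which is a property of the least-squares line whenever an intercept is included.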

Indices:

Know what indices are and how to interpret index values relative to the base and to one another.
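A small sketch may help with reading index values. It assumes the simplest kind of index (each value divided by the base-period value, times 100) and uses hypothetical figures; the indices in your textbook may be built differently.

```python
# Hypothetical median monthly rents by year (made up for illustration).
rents = {2000: 800, 2005: 1000, 2010: 1200}
base_year = 2000

# Index value = (value / base value) * 100, so the base year always equals 100.
index = {year: rent / rents[base_year] * 100 for year, rent in rents.items()}

# Comparing to the base: an index of 150 means 50% above the base year.
pct_above_base_2010 = index[2010] - 100

# Comparing two non-base years: the gap in index points is NOT the percent change.
point_change = index[2010] - index[2005]
pct_change = (index[2010] - index[2005]) / index[2005] * 100

print(index)
print("2005 to 2010:", point_change, "index points, or", pct_change, "percent")
```

The last two lines show a common pitfall: between 2005 and 2010 the index rises by 25 points, but rents rise by 20 percent, because the percent change is measured relative to the 2005 value rather than the base.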

Sampling:

Know the terminology of sampling: target population, parent population, sample, census.

What are the most frequent sources of bias in sampling? Know how to evaluate a sampling strategy described to you in terms of these types of bias.

Be able to recognize the elements that comprise the most common sampling strategies: random sampling, systematic sampling, the use of stratification, etc., and their objectives.

Why is the simple random sample often considered the “gold standard” in research that uses statistical analysis?

Probability:

Understand the concept of probability (i.e., be able to express probability figures in simple words).

Be able to compute simple, joint, and conditional probabilities and interpret your findings. (Note that you will be able to use either the empirically based probability approach or the probability formulas, whichever is easier for you.)
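The empirically based approach mentioned above can be sketched as follows. The counts are hypothetical survey results invented for this example.

```python
# Hypothetical survey counts: housing tenure by usual commute mode.
counts = {
    ("renter", "transit"): 30,
    ("renter", "car"): 20,
    ("owner", "transit"): 10,
    ("owner", "car"): 40,
}
total = sum(counts.values())

# Simple probabilities: share of all respondents in one category.
p_transit = sum(n for (tenure, mode), n in counts.items() if mode == "transit") / total
p_renter = sum(n for (tenure, mode), n in counts.items() if tenure == "renter") / total

# Joint probability: share of all respondents who are renters AND use transit.
p_renter_and_transit = counts[("renter", "transit")] / total

# Conditional probability: among renters only, the share who use transit.
p_transit_given_renter = p_renter_and_transit / p_renter

print("P(transit) =", p_transit)
print("P(renter and transit) =", p_renter_and_transit)
print("P(transit | renter) =", p_transit_given_renter)
```

In plain words: 4 in 10 respondents use transit overall, but 6 in 10 renters do, so in this made-up sample knowing that someone rents changes our expectation about how they commute.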

The binomial, normal, and Student’s t probability distributions:

What is a probability distribution generally?

Know how to compute (or find in a table) and interpret binomial probabilities in plain language.

Know how to use the normal approximation of the binomial distribution correctly.

What are the features of the normal probability distribution? Know how to compute and interpret Z-scores to answer questions about a normally-distributed variable. Given the description of a variable (e.g., rental prices for 2-bedroom apartments in Cambridge), be able to reason about its distribution.

What is the central limit theorem? Why is it useful in statistical analysis?

What is the Student’s t distribution? When and why do we use it?
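For the binomial questions earlier in this section, the sketch below computes an exact binomial probability and compares it with the normal approximation. The scenario (20 trials, success probability 0.5) is hypothetical, and the continuity correction of 0.5 is the usual textbook adjustment; confirm the rule of thumb your textbook uses for when the approximation is acceptable.

```python
import math
from statistics import NormalDist

def binomial_pmf(k, n, p):
    """Exact probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Hypothetical setting: 20 independent trials, each with success probability 0.5.
n, p = 20, 0.5
exact = binomial_pmf(10, n, p)

# Normal approximation: match the binomial's mean and standard deviation.
mu = n * p
sigma = math.sqrt(n * p * (1 - p))
approx_dist = NormalDist(mu, sigma)

# Continuity correction: "exactly 10" becomes the interval from 9.5 to 10.5.
approx = approx_dist.cdf(10.5) - approx_dist.cdf(9.5)

print("exact binomial P(X = 10):", round(exact, 4))
print("normal approximation:   ", round(approx, 4))
```

The two answers should be close here because n·p and n·(1 - p) are both reasonably large; with a small n or a p near 0 or 1 the approximation gets worse.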

Confidence intervals:

What information does a confidence interval express?

Given a confidence interval, be able to explain in plain language what it means.

Given a sample statistic (mean or proportion), know how to construct and interpret a confidence interval for the relevant population parameter at any confidence level.
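The construction described above can be sketched in a few lines. This version uses the normal (z) critical value, which assumes a large sample; for a small sample mean you would substitute a t critical value from a table. The function names and the sample figures are hypothetical.

```python
import math
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value from the standard normal (e.g., about 1.96 for 95%)."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def ci_mean(xbar, s, n, confidence=0.95):
    """Large-sample confidence interval for a population mean."""
    margin = z_critical(confidence) * s / math.sqrt(n)
    return xbar - margin, xbar + margin

def ci_proportion(p_hat, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion."""
    margin = z_critical(confidence) * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical sample: mean of 50, standard deviation of 10, n = 100.
print("95% CI for the mean:", ci_mean(50, 10, 100))
print("99% CI for the mean:", ci_mean(50, 10, 100, confidence=0.99))

# Hypothetical sample: 40% of 600 respondents answered "yes".
print("95% CI for the proportion:", ci_proportion(0.4, 600))
```

Notice that the 99% interval is wider than the 95% interval: more confidence costs precision when the sample stays the same.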

Know how to use two confidence intervals to discuss the difference in means and proportions in two populations.

Understand the relationship between sample size and margin of error in a confidence interval. Given the necessary information, know how to find the required minimum sample size needed to obtain a particular margin of error value at a particular level of confidence.

What are the two most feasible strategies for increasing the precision of (i.e., ‘shrinking’) your confidence interval?
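The sample-size calculation mentioned above can be sketched by rearranging the margin-of-error formula and rounding up. The function names and inputs are hypothetical, and the z critical value assumes a large sample; using p = 0.5 for a proportion is the conservative default when you have no prior estimate.

```python
import math
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value from the standard normal distribution."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def sample_size_proportion(margin, confidence=0.95, p=0.5):
    """Minimum n so that z * sqrt(p(1-p)/n) is no larger than the margin."""
    z = z_critical(confidence)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

def sample_size_mean(margin, sigma, confidence=0.95):
    """Minimum n so that z * sigma / sqrt(n) is no larger than the margin."""
    z = z_critical(confidence)
    return math.ceil((z * sigma / margin) ** 2)

# Hypothetical targets: +/- 3 percentage points, and +/- 2 units with sigma = 10.
n_prop = sample_size_proportion(0.03)
n_mean = sample_size_mean(2, 10)
print("n for a proportion:", n_prop)
print("n for a mean:", n_mean)

# Halving the margin of error requires roughly four times the sample size.
n_prop_half = sample_size_proportion(0.015)
print("n for half the margin:", n_prop_half)
```

Always round the result up, not to the nearest whole number, since rounding down would leave the margin of error slightly larger than the target.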

Research design:

What do the concepts of internal validity and external validity mean in research design?

Know the major types of study design features (e.g., randomization) and why they might be used.

Know the strengths and weaknesses of different design features. For example, what purpose does a control group serve? Why might a before-and-after design be preferred to a "one time" design?

Hypothesis testing:

In general, what does hypothesis testing mean in statistics (what are we trying to achieve)?

What is a null hypothesis? An alternative hypothesis? How are they used? Know how to construct these in words and with formulas given a description of a particular research effort.

What are one-tailed and two-tailed tests, and when would you apply each?

What is a significance level? How does it relate to a confidence level in constructing confidence intervals (and what is the relationship between hypothesis testing and confidence intervals more generally)?

What does it mean, in plain English, to reject your null hypothesis? What about failing to reject your null hypothesis?

What does the phrase “statistical significance” mean? How does it differ in meaning from the term “important” or the phrase “substantive significance”?

Know how to set up, compute, and interpret a test statistic associated with a hypothesis test of a mean value (one- or two-tailed). (Know when you should employ a z or a t table for such a test.)

Know how to find the p-value associated with that test statistic and how to interpret it in plain English. (Remember, don’t get so focused on the formulas that you forget what the ideas behind the tools mean.)
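The steps for a test of a mean can be sketched as follows. This is a large-sample z version with hypothetical numbers; with a small sample you would compare the same statistic to a t table instead. The function name `z_test_mean` and its `tail` argument are our own.

```python
import math
from statistics import NormalDist

def z_test_mean(xbar, mu0, s, n, tail="two"):
    """Return (z statistic, p-value) for a large-sample test of a mean."""
    standard_error = s / math.sqrt(n)
    z = (xbar - mu0) / standard_error
    std_normal = NormalDist()
    if tail == "two":      # H1: mean differs from mu0
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    elif tail == "upper":  # H1: mean is greater than mu0
        p_value = 1 - std_normal.cdf(z)
    elif tail == "lower":  # H1: mean is less than mu0
        p_value = std_normal.cdf(z)
    else:
        raise ValueError("tail must be 'two', 'upper', or 'lower'")
    return z, p_value

# Hypothetical study: H0 says the mean commute is 30 minutes;
# a sample of 64 commuters has a mean of 32 and a standard deviation of 8.
z, p_two = z_test_mean(32, 30, 8, 64, tail="two")
_, p_upper = z_test_mean(32, 30, 8, 64, tail="upper")

print("z =", z)
print("two-tailed p-value:", round(p_two, 4))
print("one-tailed (upper) p-value:", round(p_upper, 4))
```

In plain English, the two-tailed p-value answers: if the true mean really were 30 minutes, how often would a sample of this size produce a mean at least this far from 30 in either direction, just by chance?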

Know how to set up and compute a test statistic associated with a hypothesis test of a proportion (one- or two-tailed). Know how to find the p-value associated with that test statistic and interpret it in plain English.

Know how to set up and compute a test statistic associated with a hypothesis test of two means (one- or two-tailed).

Know how to set up and compute a test statistic associated with a hypothesis test of two proportions (one- or two-tailed). Be able to express the results of your test in clear language.

In each case, you should be able to set up the problem correctly; compute the test statistic; find the associated p-value; and interpret your findings in plain English.
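As one worked instance of these steps, here is a sketch of a two-proportion test with hypothetical counts. It uses the pooled proportion for the standard error, which is the usual choice when the null hypothesis is that the two population proportions are equal; check that this matches the formula in your textbook.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Return (z statistic, two-tailed p-value) for H0: p1 = p2."""
    p1 = x1 / n1
    p2 = x2 / n2
    # Under H0 the two groups share one proportion, so pool the data to estimate it.
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical survey: 60 of 200 residents in neighborhood A and
# 45 of 200 residents in neighborhood B support a proposed bike lane.
z, p_value = two_proportion_z(60, 200, 45, 200)
alpha = 0.05

print("z =", round(z, 3), "p-value =", round(p_value, 3))
if p_value < alpha:
    print("Reject H0: the difference is statistically significant at the 5% level.")
else:
    print("Fail to reject H0: the data do not show a significant difference at the 5% level.")
```

With these made-up numbers the sample proportions differ (30% versus 22.5%), yet the p-value is above 0.05, so a difference this large could plausibly arise from sampling variation alone.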

Understand what Type I and Type II errors are, both in general and in the context of a particular example supplied to you.

Analyzing categorical data:

Know what types of data are appropriate for what types of analysis generally. (What kind of data can be used with the χ² test?)

Know what the χ² test is used for and how to implement it.

What is the null hypothesis in a χ² test?

Given a set of information, know how to set up a contingency table of observed values, compute expected values, set up and compute a χ² statistic, find the associated p-value (with the appropriate degrees of freedom) and interpret your results. (Be sure you know what information a χ² test does not give you.)
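The chi-square procedure above can be sketched step by step. The observed counts are hypothetical. The critical value 3.841 is the standard table entry for 1 degree of freedom at the 5% level, and the p-value shortcut at the end works only for 1 degree of freedom; for larger tables you would look up the p-value in a chi-square table.

```python
import math
from statistics import NormalDist

# Hypothetical observed counts: rows = renter/owner, columns = transit/car.
observed = [
    [30, 20],
    [10, 40],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count if rows and columns were independent:
# (row total * column total) / grand total.
expected = [[r * c / total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over all cells of (observed - expected)^2 / expected.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Decision using a table value (df = 1, alpha = 0.05).
critical_value = 3.841
reject_null = chi2 > critical_value

# For df = 1 ONLY: chi-square is the square of a standard normal variable,
# so the p-value can be read from the normal distribution.
p_value = 2 * (1 - NormalDist().cdf(math.sqrt(chi2)))

print("expected counts:", expected)
print("chi-square =", round(chi2, 3), "with df =", df)
print("reject H0 of independence at 5%?", reject_null)
```

A significant result here says only that tenure and commute mode are associated in this made-up table; it says nothing about how strong the association is or which variable influences the other.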

Regression analysis:

Understand how regression analysis can be used in inferential statistics. How is hypothesis testing used in regression models? What is being tested?

Know how to set up and compute a test statistic for a hypothesis about β, and to find and interpret the p-value associated with your test statistic value. You should also be able to construct a confidence interval for β. (Note that the techniques for hypothesis testing/confidence intervals with β are analogous to those used for population means; just understand how the concepts transfer to regression coefficients and how the formulas change.)
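A brief sketch of the test and interval for a slope coefficient follows. All of the numbers are hypothetical stand-ins for values you would read from regression output, and the critical value 2.101 is the standard t-table entry for 18 degrees of freedom at the 5% level (two-tailed). The degrees of freedom are taken as n - 2, which applies to a simple regression with one independent variable.

```python
# Hypothetical regression output (made up for illustration).
b = 0.6     # estimated slope coefficient
se_b = 0.2  # standard error of the slope
n = 20      # number of observations

# Simple regression estimates two parameters (intercept and slope).
df = n - 2

# Test statistic for H0: the population slope equals 0.
t_stat = (b - 0) / se_b

# Critical value from a t table: two-tailed, alpha = 0.05, df = 18.
t_critical = 2.101
reject_null = abs(t_stat) > t_critical

# 95% confidence interval for the population slope.
margin = t_critical * se_b
ci_low, ci_high = b - margin, b + margin

print("t =", round(t_stat, 3), "with df =", df)
print("reject H0 (slope = 0) at 5%?", reject_null)
print("95% CI for the slope:", (round(ci_low, 4), round(ci_high, 4)))
```

The test and the interval tell the same story: because the 95% interval excludes 0, the two-tailed test at the 5% level rejects the hypothesis that the slope is 0.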

Know how to interpret results from a simple or multiple linear regression model. How are the intercept value (a) and the coefficients (the bs) explained in simple language?

How does the interpretation of the coefficients and of the r² value change when more than one explanatory (independent) variable is in the model?

Know how to calculate predicted values (of the dependent variable) for regression models and to explain what they mean. Know how to compare the predicted impact of a particular change in one independent variable to the predicted impact of a particular change in a second independent variable.

Know how to interpret coefficient values associated with dummy variables in regression analysis. Know how to create a set of dummy variables from a single ordinal or nominal variable correctly.
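Creating dummy variables correctly mostly comes down to leaving one category out as the reference group. The sketch below shows one way to do that; the category names, the coefficient values, and the helper names `make_dummies` and `predict` are all hypothetical.

```python
def make_dummies(values, reference):
    """Turn one categorical variable into k - 1 dummy (0/1) variables.

    The reference category gets no dummy of its own: it is the case
    where every dummy equals 0.
    """
    if reference not in values:
        raise ValueError("reference category does not appear in the data")
    categories = sorted(set(values) - {reference})
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]

# Hypothetical nominal variable with three categories.
location = ["urban", "suburban", "rural", "urban"]
dummies = make_dummies(location, reference="rural")
for loc, row in zip(location, dummies):
    print(loc, row)

# Hypothetical fitted model: rent = 900 + 200 * is_suburban + 500 * is_urban
intercept = 900
coefficients = {"is_suburban": 200, "is_urban": 500}

def predict(row):
    """Predicted rent for one observation's dummy values."""
    return intercept + sum(coefficients[name] * value for name, value in row.items())

print("predicted rent, rural:", predict(dummies[2]))
print("predicted rent, urban:", predict(dummies[0]))
```

Each dummy coefficient is read as a difference from the reference group: in this made-up model, urban units are predicted to rent for 500 more than rural units, holding anything else in the model constant.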

Know the assumptions that we’re making whenever we use linear regression techniques.