Excel Exercises for
Marketing Research Essentials, Canadian Edition
PREFACE
This Excel User Guide is provided for students who will be working with the Excel version of the SPSS Exercises to accompany McDaniel, Gates, and Sivaramakrishnan, Marketing Research Essentials, Canadian Edition.It consists of detailed step-by-step instructions for each exercise, accompanied by relative visual aids, such as “screen shots,” to further serve as “road map” indicators ensuring that you are on the right track.The guide also includes periodic “troubleshooting” tips, such as, “If you see “#####” in cell A3, then you need to…”
NOTE: All Exercises must be completed in chronological order, omitting none, for successful completion of this data analysis project.
NOTE: You must have the "Data Analysis Toolpack" installed for your Microsoft Excel program. To check to see if you have it, open Excel and click "Tools" on the toolbar at the top of the screen.If you see "Data Analysis" in the tools menu, then you are ready to go!If you do not see "Data Analysis" you will have to select "Add-Ins..." from the menu.Click the "Data Analysis ToolPack" box, and hit "OK."
If something happens where the computer can't find the file needed to install the pack, then ask your professor or lab assistant for help!
TABLE OF CONTENTS
Chapter 11:Sample Size Determination 4
Exercise #1: Sample Size Determination Using the Sample Means Method 4
Exercise #2: Determining the Reliability/Confidence of Sample Results 8
Chapter 12:Data Processing, Fundamental Data Analysis, and the Statistical Testing of Differences 10
Exercise #1: Machine Cleaning Data10
Exercise #2: Analysis of Data with Frequency Distributions14
Exercise #3: Analysis of Data with Descriptive Statistics17
Exercise #4: Analysis of Demographic Characteristics Using Charts19
Exercise #5: Analyzing Data using Cross Tabulation Analysis21
Exercise #6: T/Z Test for Independent Samples31
Exercise #7: ANOVA Test for Independent Samples34
Chapter 13:Bivariate Correlation and Regression38
Exercise #1: Pearson’s Product-Moment Correlation38
Exercise #2: Bivariate Regression41
Chapter 11– Sample Size Determination
Exercise #1: Sample Size Determination Using the Sample Means Method
1
(1)Go to the Wiley website at and download the “Segmenting the College Student Market for Movie Attendance” database for Excel.Also download a copy of the “Segmenting the College Student Market for Movie Attendance” questionnaire so that you can understand the database contents.The most important items in the survey are in question #5.It contains 9 movie items in which respondents rate their relative importance.When you open the database, take note of the variable descriptions and the computer coding for each of the variables, which are contained in the “Variable Labels” and the “Value Labels” worksheets of the Excel Workbook.
1
(2)As you’ve learned in class and from reading the course text, the Sample Means method of sample size determination consists of:
i.required confidence level (Z)
ii.level of tolerable error (e)
- estimated population variance (o)
- estimated sample size (n)
- Formula: n = (z2 * o2)/e2
(3)Of the various methods of deriving sample size, estimated population standard deviation can be estimated based on prior studies, expert judgement, or by conducting a pilot sample.For this problem, we are going to estimate population standard deviation using a pilot sample.To do this you will use only the first 200 cases in the Segmenting the College Student Market for Movie Attendance database.We are assuming that these are responses to a pilot sample, and we will use them to estimate the needed sample size.
1
1
Do This: In the database, you’ll notice that questionnaire number 200 falls into line 201 because the field titles are in row number 1.So, highlight row “202” in the database, click “insert” on the toolbar at the top of your screen, and select “row.”You do this so that you can have a blank working space in which to calculate the standard deviation of the pilot sample.(If you mess up and insert a row in the wrong place, just select the row, click “edit,” and select “delete”; then insert the row in the right place.)
1
(4) Now, Do This: select cell F202 and enter this formula: =STDEV(F2:F201)
This formula computes the standard deviation for all 200 responses for variable Q5a.
If you have #NAME instead of some decimal number in cell F202 of your database worksheet, then you forgot to put the colon (:) between the F2 and F201 in your formula.When you have something to the nature of 0.5 in cell F202, then you’re good to go.Now select that cell and position the cursor until you have the fill tool indicator that looks like a cross.Now drag the tool cursor all the way to variable Q5i and drop it.
Congratulations!You just calculated the standard deviation for all of the components of question five of the survey (the most important questions that we are interested in, as stated above).We are assuming that each of the nine variables is equally important with respect to the research objective.
(5)As you have learned from class lectures and from reading your course text, sample size determination can be reached by selecting the variable with the largest computed standard deviation.
Answer the following questions:
1.Which of the nine movie theatre items had the largest standard deviation?______
2.Now, using the formula for the sample means method of sample size determination that you learned in class, make the necessary computations for each of the following:
a.Compute sample size given the following:
i.required confidence level (Z) is 95.44%.
ii.tolerable error (e) is .1 or 1/10 of a response point.
iii.standard deviation (o) = ______
iii.sample size (n) = ______
- Compute sample size given the following:
- required confidence level (Z) is 99.72%.
- tolerable error (e) is .1 or 1/10 of a response point.
- standard deviation (o) = ______
- sample size (n) = ______
3.How do your computed sample sizes in the problems above compare to the total number of cases in the Segmenting the College Student Market for Movie Attendance database?
______
4.We are going to assume that the objective of our research concerning students attendance at movies can be expressed as a dichotomy (greater or lesser, etc.). For example, it doesn’t matter how much one group attends movies over another group, but just who attends the most.To accomplish this we can use the much less complicated sample proportions formula.We are going to assume that we have no prior studies, hence, in the sample proportions formula P = .5 and (1 – P) = .5.You will not need Excelto assist you with this computation.
a.Compute sample size given the following:
i.required confidence level (Z) is 95.44%.
ii.tolerable error (e) is .05 or accuracy within 5% of the true
population mean.
iii.standard deviationP = .5 and (1 – P)= .5
iv.sample size (n) = ______
b.Compute sample size given the following:
- required confidence level (Z) is 99.72%.
- tolerable error (e) is .03 or accuracy within 3% of the true
population mean.
- standard deviation P = .5 and (1 – P)= .5
- sample size (n) = ______
Exercise #2: Determining the Reliability/Confidence of Sample Results
(1)Instead of determining the needed sample size, we will now evaluate the confidence level of results derived from the entire Segmenting the College Student Market for Movie Attendance database.To evaluate this type of confidence, using the sample means formula, solve for Z instead of n.Hence, use the formula Z2 = n * e2/o2.Then take the square root of Z2.You can go to the normal distribution table in the appendix of your text to determine the confidence level associated with the database.For the sample proportions formula, solve for Z using the formula Z2 = (n * e2)/[P(1 – P)], then take the square root of Z2.
You can also use Excel to calculate the confidence level for the entire database.As you know from class, a confidence level for a normal distribution requires two tails.So you multiply 1-α by two.But don’t worry about that by itself… just enter the following formula into a blank cell, such as 503A: =2*NORMSDIST(your Z value)-1.
A good Z-value is 1.75, since it corresponds to a confidence level of a little over 90% (the general cut-off for significance).
1
(2)Now, remember that we assume that question #5 has the most important questions in the questionnaire, with respect to the research objectives.Let’s calculate the standard deviation for these questions, but this time for the entire database instead of just the first 200 responses.
Do This: select cell F504 and enter this formula: =STDEV(F2:F501) in order to have Excel compute the standard deviation for all responses for question 5a.It should be something close to 0.59.
1
Now select that cell and position the cursor until you have the fill tool indicator that looks like a cross.Now drag the tool cursor all the way to variable Q5i and drop it.You have now computed the standard deviation for all responses to Q5a through Q5i.Choose the question with the largest standard deviation and use it in the formula for computing the confidence level in the following problem.
(3)Compute the confidence level associated with the Segmenting the College Student Market for Movie Attendance database, given the answer to the preceding question and the following:
a.tolerable error is .1 or 1/10 of a response point.
b.sample size = 500.
c.standard deviation______
- Confidence Level = ______%
(4)How do the results in question #3 above compare to the results in #2 in Exercise 1?
______
(5)Sample Proportions Formula: Given the information below, compute the confidence level associated with the Segmenting the College Student Marketing for Movie Attendance database.You will not need Excel to make this computation.
a.tolerable error is .05 or 5%
b.sample size = 500
c.standard deviationP = .5 and (1 – P) = .5
d.Confidence Level = ______%
(6)How do the results in this problem compare the confidence level in question #3above?
______
Chapter 12 – Data Processing, Fundamental Data Analysis, and the Statistical Testing of Differences
Exercise #1: Machine Cleaning Data
(1)Go to the Wiley website at and download the “Segmenting the College Student Market for Movie Attendance” Excel database.This database will have several errors for you to correct.Click on the “Value Labels” tab at the bottom of the Excel worksheet and notice the computer coding for each variable.
(2)Also from the Wiley website, download a copy of the “Segmenting the College Student Market for Movie Attendance” questionnaire.Notice the computer coding for each of the variables; it is the same as that in the “Value Labels” in the Excel worksheet.This information will be important in finding errors in the database.
(3)Now it’s time to check the database for errors.As you noticed in the “Value Labels”
worksheet, there are a limited number of possible answers to each question.For example, for Q1, “Did you attend at least one movie at a movie theatre in the past year?” a respondent is only able to answer “Yes” or “No.”Therefore, the minimum value possible for that question is “0,” the numerical value label assigned to that response for the “No” response for that question. The maximum value possible is “1,” the numerical label assigned to the “Yes” response for that question.In Excel, we can use formulas to find the minimum and maximum values among all 500 responses to any given question.So, if we find that the maximum value among all of the responses for Q1 is “2,” then we know that there was an error in the data entry process, because the only possible maximum value is “1,” meaning that the respondent answered “Yes” to the question.
Do This:In a blank cell, such as B503, enter this formula: =MIN(B2:B501)
You should get a calculated value of zero, which is what we expected.
Now, Do This:In the blank cell directly beneath the one you put the MIN formula in, say B504, enter this formula: =MAX(B2:B501)
You should get a calculated value of 1, theoretically, but you’ll see that you actually get a calculated value of 2, which means that someone messed up in entering the data.So, we need to find the error in the Q1 column.
1
Do This: Select the Q1 column (column B) to highlight it.Then click “Edit” on the toolbar at the top of the screen and select “find” (for a shortcut, hold the “ctrl” button and hit “f,” which is the “find” command).The “find” window will pop up.Enter “2” in the box and hit enter.Excel will go to the location of the first 2 that it finds in the column for Q1.
1
Follow this procedure for finding errors in the columns for the rest of the variables.For example, for Q2, “Indicate how important you consider going to the movies at a movie theatre, relative to other leisure activities,” the minimum possible value is “1” for “Very Unimportant,” and the maximum possible value is “4” for “Very Important.”Since you have already entered the MIN and MAX formulas for Q1, all you really have to do is highlight cells B503 and B504, position the cursor so that you have the fill tool that looks like a cross, and drag the cursor all the way to the cells AE503 and AE504.
When you analyze what you have just done, you see that:
- we have an error in the column for Q1 (as we already discussed)
- there are no errors in the column for Q2
- the MIN and MAX for Q3 don’t matter because it’s an open-ended question
- we have an error in the column for Q5f (the MAX value is 7, which is not a possible response)
- we have an error in the column for Q6
- the MIN and MAX for Q8a–d don’t matter because they are open-ended questions (however, we can check these for errors using another strategy, which we will do in a minute)
- we have an error in the column for Q9
- we have an error in the column for Q10
- there are no errors in the column for Q11
- we have an error in the column for Q12
- there are no errors in the column for Q13
- there are no errors in the column for Q14
(4)When you have determined which variables have input errors, summarize the errors using the template below as a guide:
Questionnaire Number / Variable Containing Error / Incorrect Value / Correct Value(5)Now, as we stated above, another possible source of errors is in question 8.Notice that in this question, the sum of the answers should be 100%.To check that these add up to 100% for each respondent, just enter the following formula into a blank cell for Questionnaire Number 1 (such as cell AF2): =SUM(V2:Y2) (there are other ways to calculate the sum of certain cells using Excel; play around with the tools or see if you can figure out varying but equivalent formulas that will do the trick!)
1
1
Now all that you have to do to calculate the sum of Q8a–d for the rest of the database is to use the “fill” tool that you have most likely mastered by now! (Highlight cell AF2, position the cursor to get the “fill” tool, then “drag” the cursor all the way to the last Questionnaire entry.)
When you look over the calculated sums, you will notice that there is an error in the row for Questionnaire number 238, because the sum of Q8a–d is 110, when it should only be 100.
1
(6)Once you have completed summarizing the variables containing errors, position the cursor on each of the variable columns containing errors (one at a time) to highlight the column.Use the ctrl-f function to find the questionnaire numbers where the errors occurred.At this point, you will need the corrected database, or database with no errors.Your professor has access to this database.After getting the corrected database, finish filling in the Table in part (4) above with the correct values.Then make the changes in your database, so that you have a database with no errors.Be sure to re-save your database after correcting it for errors.
(7)After machine cleaning your data, you will notice that the values for your MIN and MAX formulas are now in accordance with the correct range for the value labels.
Exercise #2: Analysis of Data with Frequency Distributions
1
Now that you have a clean database, let’s obtain frequencies for all of the variables!First, let’s create multiple workspaces: Position your cursor on the topmost left corner of the worksheet and click it to highlight the entire worksheet.
1
Now hold “Ctrl” and hit “c” for the “copy” command.Click Sheet 2 on the bottom of the screen to open a blank worksheet, position the cursor in cell A1, and hold “Ctrl” and hit “v” for the “paste” command (you can also use the copy and paste icons on the toolbar at the top of the screen for these commands).You’ll see that you just created an exact replica of your original clean database.Do this again in about 5 or 6 other worksheets (you may not use them all), and name them each something unique whenever you start to use them by double-clicking on the word “Sheet 1” (etc.) to highlight it and replace the text.
To make it easier to work with your worksheets, you can “freeze” the panes that indicate the Questionnaire number and the Variables:Highlight cell B2, click “Window” on the toolbar at the top of the screen, and select “Freeze Panes.”From now on, no matter where you are in the database, you will still be able to see the variable labels and the questionnaire numbers.