TWELVE DATABASES

Shown here are abbreviated versions of twelve large, real data sets. Each of these is included on the student CD in a separate folder called Big Data Sets so they are easy to find. Your instructor may assign these for individual or team projects that are not tied to any specific textbook exercises. They are grouped according to the most likely statistical applications or analyses that might be performed.

Group 1: Appropriate choices for sampling, descriptive statistics, or one-sample or two-sample tests of means and proportions

Database A: ATM Transactions (14,913 transactions, 8 variables)

Transaction / Type / Time / Date / DayCode / Weekday / Location / Hour / Amount
1 / Deposit / 19 / 1 / 3 / Tue / DriveUp / 0 / 359
2 / Deposit / 20 / 1 / 3 / Tue / DriveUp / 0 / 357
3 / Deposit / 126 / 1 / 3 / Tue / DriveUp / 1 / 141
… / … / … / … / … / … / … / … / …
14,911 / Withdrawal / 2325 / 30 / 4 / Wed / DriveUp / 23 / 30
14,912 / Withdrawal / 2326 / 30 / 4 / Wed / DriveUp / 23 / 80
14,913 / Transfer / 2336 / 30 / 4 / Wed / DriveUp / 23 / 300

Source: Credit union transactions for 30 days at one drive-up ATM and two walk-up ATMs located on campus in student union buildings at two different university campuses.

Database B: Crime Rates in U.S. Metropolitan Areas (350 cities, 9 variables)

City / Metropolitan Area / Murder / Rape / Robbery / … / Burglary / Larceny / Car Theft
1 / Abilene, TX / 4.3 / 50.9 / 96.8 / … / 1097.9 / 2529.7 / 217.8
2 / Akron, OH / 5.4 / 47.3 / 116.6 / … / 854.5 / 2504.2 / 303.4
3 / Albany, GA / 4.8 / 30.5 / 139.3 / … / 1306.8 / 2502.3 / 248.1
… / … / … / … / … / … / … / … / …
348 / York-Hanover, PA / 4.5 / 36.0 / 98.9 / … / 394.7 / 1935.8 / 141.2
349 / Yuba City, CA / 7.2 / 34.8 / 70.2 / … / 1017.7 / 2265.8 / 572.2
350 / Yuma, AZ / 6.6 / 23.1 / 64.8 / … / 847.5 / 2261.7 / 561.9

Source: (crime rates are per 100,000 population).

Database C: Post-Anesthesia Recovery Time (3,511 patients, 12 variables)

Patient / Month / Time In / Time Out / … / RecovMins / Hour of Day / Shift / Weekday
1 / 1 / 1:51:00 / 2:20:00 / … / 29 / 1 / 1 / Wed
2 / 1 / 10:55:00 / 12:20:00 / … / 85 / 10 / 2 / Wed
3 / 1 / 13:35:00 / 14:15:00 / … / 40 / 13 / 2 / Wed
… / … / … / … / … / … / … / … / …
3509 / 2 / 19:26:00 / 20:20:00 / … / 54 / 19 / 3 / Fri
3510 / 2 / 20:31:00 / 22:30:00 / … / 119 / 20 / 3 / Fri
3511 / 2 / 22:51:00 / 23:40:00 / … / 49 / 22 / 3 / Fri

Source: Random sample of all hospital patients during 59 consecutive days. Each observation is one patient.
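In the rows shown, RecovMins matches the elapsed minutes between Time In and Time Out, and Hour of Day matches the hour of Time In. The following Python sketch checks that reading against three of the listed rows. The function names are illustrative and are not part of the data file, and the sketch assumes a stay does not cross midnight, which none of the excerpted rows do.

```python
def to_minutes(hms):
    """Convert an 'H:MM:SS' clock time to minutes after midnight."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 60 + minutes + seconds / 60

def recovery_minutes(time_in, time_out):
    """Elapsed minutes from Time In to Time Out (assumes same calendar day)."""
    return round(to_minutes(time_out) - to_minutes(time_in))

def hour_of_day(time_in):
    """Hour component of Time In (assumed definition of Hour of Day)."""
    return int(time_in.split(":")[0])

# (Time In, Time Out, listed RecovMins, listed Hour of Day) from the excerpt
rows = [
    ("1:51:00", "2:20:00", 29, 1),
    ("10:55:00", "12:20:00", 85, 10),
    ("20:31:00", "22:30:00", 119, 20),
]

for time_in, time_out, listed_mins, listed_hour in rows:
    print(time_in, time_out,
          recovery_minutes(time_in, time_out), listed_mins,
          hour_of_day(time_in), listed_hour)
```

If the full file contains stays that run past midnight, the subtraction would need to add 24 hours when Time Out is earlier than Time In.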

Database D: Annual Wages (11 occupations, 8 variables, up to 407 cities)

Area Name / Occupation / 25th Pct / 50th Pct / 75th Pct / … / Employed
Anniston-Oxford, AL / Accountant / 34,350 / 43,230 / 56,460 / … / 180
Auburn-Opelika, AL / Accountant / 31,140 / 35,830 / 45,220 / … / 200
… / … / … / … / … / … / …
Wausau, WI / Nurse / 47,710 / 55,130 / 65,010 / … / 1110
Casper, WY / Nurse / 42,040 / 48,670 / 54,740 / … / 780
Cheyenne, WY / Nurse / 44,090 / 52,540 / 61,260 / … / 770

Source: Number of cities varies: accountants/auditors (407), actuaries (81), avionics technicians (80), firefighters (203), electricians (396), mechanical engineers (337), nurses (341), physical therapists (355), middle school teachers (302).

Database E: CEO Compensation in 2005 (362 CEOs, 4 variables)

Name / Company / Comp / 5-Yr Comp / Age / Log(Comp)
W James McNerney Jr / 3M / 10,312 / 28,573 / 55 / 4.0133
Miles D White / Abbott Laboratories / 4,417 / 25,860 / 50 / 3.6451
Bruce Chizen / Adobe Systems / 18,005 / 49,990 / 49 / 4.2554
… / … / … / … / … / …
David C Novak / Yum Brands / 25,970 / 46,654 / 52 / 4.4145
J Raymond Elliott / Zimmer Holdings / 16,523 / 24,365 / 55 / 4.2181
Harris H Simmons / Zions Bancorp / 1,837 / 7,724 / 50 / 3.2641

Source: Compensation (Comp) is in thousands of dollars.
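The Log(Comp) values in the excerpt are consistent with the base-10 logarithm of Comp (for example, the base-10 log of 10,312 is about 4.0133). A minimal Python sketch, using only rows shown above, checks this reading. The function name and the four-decimal rounding are assumptions made for illustration, not definitions taken from the data file.

```python
import math

def log_comp(comp_thousands):
    """Base-10 log of compensation in thousands of dollars (assumed definition)."""
    return round(math.log10(comp_thousands), 4)

# (Name, Comp in thousands of dollars, listed Log(Comp)) from the excerpt
rows = [
    ("W James McNerney Jr", 10312, 4.0133),
    ("Miles D White", 4417, 3.6451),
    ("Bruce Chizen", 18005, 4.2554),
]

for name, comp, listed in rows:
    print(f"{name}: computed {log_comp(comp):.4f}, listed {listed:.4f}")
```

A log transformation like this is commonly used to reduce the right skew of compensation data before fitting a regression or drawing a histogram.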

Group 2: Appropriate choices for correlation, chi-square tests, or one-sample or two-sample tests of means and proportions

Database F: Statistics Student Survey (193 students, 20 variables)

Student / Gender / GPA / Work Hrs / … / Relig Serv / News Read? / Foreign Lang? / Exercise?
1 / 1 / 3.0 / 20 / … / 3 / 1 / 2 / 1
2 / 1 / 3.6 / 12 / … / 2 / 1 / 0 / 2
3 / 0 / 3.0 / 0 / … / 0 / 1 / 0 / 1
… / … / … / … / … / … / … / … / …
191 / 0 / 2.5 / 25 / … / 0 / 0 / 1 / 1
192 / 0 / 3.0 / 4 / … / 10 / 1 / 1 / 1
193 / 0 / 3.3 / 20 / … / 0 / 2 / 1 / 1

Source: In-class survey of introductory statistics students.

Database G: Web Survey Responses (158 respondents, 31 variables)

Respondent / Living / CellPhone / CellMinutes / CreditCard / Balance / … / PS3? / Xbox?
1 / Dorm / Verizon / 179 / Visa / 194.85 / … / 0 / 0
2 / Dorm / Verizon / 400 / Visa / 700.00 / … / 0 / 0
3 / Dorm / Cingular / 1,000 / Discover / 70.00 / … / 0 / 0
… / … / … / … / … / … / … / … / …
156 / Solo Apt / Cingular / 500 / Visa / 3000.00 / … / 0 / 0
157 / Dorm / T-Mobile / 700 / Visa / 135.66 / … / 0 / 1
158 / Dorm / Verizon / 200 / Visa / 200.00 / … / 0 / 0

Source: Web survey. Includes only complete responses. Detailed variable definitions and question wording are included in the file.

Group 3: Appropriate choices for correlation, regression, or descriptive statistics and histograms

Database H: Noodles & Company Database (74 restaurants, 45 variables)

Rice Krispie Sales
Obs / SqFt / Sales/Person / PunchCard% / Sales/SqFt / … / Oct / Nov / Dec
1 / 2,354 / 6.81 / 2.07 / 701.97 / … / 29.1 / 28.5 / 28.9
2 / 2,604 / 7.57 / 2.54 / 209.93 / … / 10.0 / 10.4 / 10.0
3 / 2,453 / 6.89 / 1.66 / 364.92 / … / 12.3 / 13.8 / 13.1
… / … / … / … / … / … / … / … / …
72 / 2,450 / 7.37 / 1.09 / 339.94 / … / 17.8 / 16.1 / 15.4
73 / 2,575 / 6.76 / 0.64 / 400.82 / … / 14.3 / 14.2 / 14.5
74 / 2,400 / 7.97 / 1.77 / 326.54 / … / 8.7 / 10.1 / 10.4

Source: Noodles & Company management.

Database I: U.S. States (50 observations, 170 variables)

State / Age 5% / Age 65% / Median Age / … / Divorce / Income / Union% / Metro%
Alabama / 6.5 / 13.3 / 37.4 / … / 5.0 / 29,136 / 10.2 / 89.2
Alaska / 7.7 / 6.6 / 33.9 / … / 4.3 / 35,612 / 22.8 / 74.7
Arizona / 7.7 / 12.8 / 34.5 / … / 4.3 / 30,267 / 6.1 / 96.7
… / … / … / … / … / … / … / … / …
W. Virginia / 5.6 / 15.3 / 40.7 / … / 5.0 / 27,215 / 14.4 / 75.0
Wisconsin / 6.1 / 13.0 / 37.9 / … / 3.1 / 33,565 / 16.1 / 85.9
Wyoming / 6.1 / 12.2 / 39.1 / … / 5.3 / 36,778 / 7.9 / 71.5

Source: Statistical Abstract of the U.S., 2007.

Database J: County Data (3,140 counties, 29 variables)

Obs / County / State / Popul / PopDen / PopChg% / … / Unempt / Services% / WaterUse
1 / Autauga / AL / 44,876 / 73.3 / 2.8 / … / 3.7 / 17.3 / 911
2 / Baldwin / AL / 145,799 / 88.0 / 3.8 / … / 3.1 / 17.8 / 3,101
3 / Barbour / AL / 28,947 / 32.8 / -0.3 / … / 5.1 / 17.7 / 1,066
… / … / … / … / … / … / … / … / … / …
3138 / Uinta / WY / 19,572 / 9.5 / -0.9 / … / 5.5 / 21.5 / 10,742
3139 / Washakie / WY / 8,102 / 3.7 / -2.3 / … / 5.3 / 21.4 / 7,307
3140 / Weston / WY / 6,533 / 2.8 / -1.7 / … / 4.5 / 21.1 / 405

Sources: City-County Data Book and U.S. Census

Database K: MySpace Friends (48 friends, 11 variables)

Friend / Name* / Friends / Age / Blogs / … / Comments / Marital / College / Gender
1 / Michele / 129 / 28 / 10 / … / 294 / 0 / 1 / 0
2 / Nancy / 349 / 21 / 7 / … / 540 / 0 / 0 / 0
3 / Theresa / 51 / 27 / 0 / … / 317 / 0 / 1 / 0
… / … / … / … / … / … / … / … / … / …
46 / Rachel / 62 / 24 / 0 / … / 131 / 0 / 1 / 0
47 / Joe / 11 / 28 / 0 / … / 20 / 0 / 1 / 1
48 / James / 5 / 28 / 3 / … / 4 / 0 / 0 / 1

Source: MySpace files of a student, used in an independent statistics project (*friend names are changed).

Database L: Fortune 1000 Largest Companies in 2006 (1,000 firms, 4 variables)

Company / Revenue (millions) / Net Income (millions) / Profit (1) or Loss (0) / % Profit
3M (MMM) / 21167.0 / 3199.0 / 1 / 15.11
A.G. Edwards (AGE) / 2611.8 / 186.5 / 1 / 7.14
A.O. Smith (AOS) / 1689.2 / 46.6 / 1 / 2.76
… / … / … / … / …
Zale (ZLC) / 2383.1 / 106.8 / 1 / 4.48
Zimmer Holdings (ZMH) / 3286.1 / 732.5 / 1 / 22.29
Zions Bancorp (ZION) / 2349.1 / 480.1 / 1 / 20.44

Source: Data are from the April 17th, 2006 issue. Revenue and income are in millions of dollars.
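In the rows shown, % Profit matches net income as a percentage of revenue, and the Profit (1) or Loss (0) column is 1 whenever net income is positive. The Python sketch below reproduces both columns for the first three listed firms. The function names are illustrative, and the handling of a net income of exactly zero is an assumption, since no such row appears in the excerpt.

```python
def pct_profit(revenue, net_income):
    """Net income as a percentage of revenue, rounded to 2 decimals."""
    return round(100 * net_income / revenue, 2)

def profit_flag(net_income):
    """1 for a profit, 0 for a loss (zero is treated as 0 here by assumption)."""
    return 1 if net_income > 0 else 0

# (Company, Revenue, Net Income, listed flag, listed % Profit), in millions of dollars
rows = [
    ("3M (MMM)", 21167.0, 3199.0, 1, 15.11),
    ("A.G. Edwards (AGE)", 2611.8, 186.5, 1, 7.14),
    ("A.O. Smith (AOS)", 1689.2, 46.6, 1, 2.76),
]

for company, revenue, income, listed_flag, listed_pct in rows:
    print(company, profit_flag(income), listed_flag,
          pct_profit(revenue, income), listed_pct)
```

The binary flag is the kind of variable suited to tests of proportions, while % Profit is a ratio variable suited to correlation and regression.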
