Displaying and Describing Categorical Data

Important Characteristics of Data:

Center: A representative or average value that indicates where the middle of the data is located.

Variation: A measure of the amount that the data values change (vary) among themselves.

Distribution: The nature or shape of the spread of the data (such as bell-shaped, uniform, or skewed).

Outliers: Sample values that are very far away from the vast majority of the other sample values.

Time: Changing characteristics of the data over time.

A good way to organize and make sense of your data is to make a picture. Choosing the correct picture can greatly assist your analysis.

One way to get started is to make a ______.

A ______ lists data values (either individually or by groups of intervals), along with their corresponding counts (frequencies).
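The tallying described above can be sketched in a few lines of Python. This is only an illustration: the list `values` holds made-up numbers (not the survey data), and the helper name `bin_label` and the interval width of 100 are our own assumptions.

```python
from collections import Counter

# Hypothetical measurements for illustration only; these are NOT the survey data.
values = [0, 15, 130, 250, 99, 100, 480, 210]

BIN_WIDTH = 100  # assumed interval width: 0-99, 100-199, 200-299, ...

def bin_label(value, width=BIN_WIDTH):
    """Return the interval label (for example '100-199') that contains value."""
    low = (value // width) * width
    return f"{low}-{low + width - 1}"

# Count how many values fall in each interval.
counts = Counter(bin_label(v) for v in values)

# Print the intervals in numeric order, one "interval / count" row per line.
for label in sorted(counts, key=lambda s: int(s.split("-")[0])):
    print(f"{label} / {counts[label]}")
```

Intervals with no values simply do not appear in `counts`; a finished table would normally list them with a count of 0.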

Are nonsmokers really affected by others who are smoking cigarettes, or is the effect of secondhand smoke really a myth?

The following data come from a survey conducted by the National Institutes of Health. The data values are the measured levels of cotinine (ng/ml) in people. Cotinine is a metabolite of nicotine: when nicotine is absorbed by the body, cotinine is produced. Because nicotine is known to be absorbed through cigarette smoking, cotinine gives us an indirect way to measure exposure to cigarette smoke. (All data were rounded to the nearest whole number, so a value of zero does not necessarily indicate the total absence of cotinine.)

Smoker: Subjects report tobacco use.

ETS: Environmental Tobacco Smoke. Subjects are non-smokers who are exposed to environmental tobacco smoke (“secondhand smoke”) at home or work.

NOETS: No Environmental Tobacco Smoke. Subjects are non-smokers who are not exposed to environmental tobacco smoke (“secondhand smoke”) at home or work.

A visual comparison of the data arranged into three groups offers some insight, but we can display it in other ways to provide greater understanding.

Was there a relationship between cotinine levels and whether the person was a smoker, a non-smoker exposed to secondhand smoke, or a non-smoker not exposed?

We can look at the data distributed along each variable. This is called a ______.

Frequency Table of Cotinine Levels of the Three Groups
Cotinine Level / Smokers / ETS / NOETS
0-99 / 11 / 34 / 38
100-199 / 12 / 2 / 0
200-299 / 14 / 1 / 1
300-399 / 1 / 1 / 1
400-499 / 2 / 0 / 0
500-599 / 0 / 2 / 0

We can organize the ______ into relative frequencies within each category.

Relative Frequency Table of Cotinine Levels of the Three Groups
Cotinine Level / Smokers / ETS / NOETS
0-99 / 28% / 85% / 95%
100-199 / 30% / 5% / 0%
200-299 / 35% / 3% / 3%
300-399 / 3% / 3% / 3%
400-499 / 5% / 0% / 0%
500-599 / 0% / 5% / 0%
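As a check on the table above, here is a minimal sketch of how each relative frequency can be computed: divide a count by the total for its own group (its column). The counts are copied from the frequency table; the helper `percent_half_up` is our own name, and it rounds halves up because Python's built-in `round()` rounds halves to the nearest even number (so `round(2.5)` gives 2, not 3).

```python
# Counts copied from the frequency table of the three groups.
levels = ["0-99", "100-199", "200-299", "300-399", "400-499", "500-599"]
counts = {
    "Smokers": [11, 12, 14, 1, 2, 0],
    "ETS":     [34, 2, 1, 1, 0, 2],
    "NOETS":   [38, 0, 1, 1, 0, 0],
}

def percent_half_up(count, total):
    """Whole-number percent of count out of total, with halves rounded up."""
    return (200 * count + total) // (2 * total)

# Relative frequency = count / column total, computed group by group.
relative = {}
for group, column in counts.items():
    total = sum(column)  # 40 subjects in each group
    relative[group] = [percent_half_up(c, total) for c in column]

print("Cotinine Level / " + " / ".join(counts))
for i, level in enumerate(levels):
    row = " / ".join(f"{relative[g][i]}%" for g in counts)
    print(f"{level} / {row}")
```

Because each entry is rounded separately, a column of rounded percentages need not add to exactly 100%. The Smokers column, for example, adds to 101%.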

Conditional Distribution:

Conditional Distribution of Cotinine Levels of the Three Groups
Cotinine Level / Smokers / ETS / NOETS / TOTAL
0-99 / 11 (13%) / 34 (41%) / 38 (46%) / 83 (100%)
100-199 / 12 (86%) / 2 (14%) / 0 (0%) / 14 (100%)
200-299 / 14 (88%) / 1 (6%) / 1 (6%) / 16 (100%)
300-399 / 1 (33%) / 1 (33%) / 1 (33%) / 3 (100%)
400-499 / 2 (100%) / 0 (0%) / 0 (0%) / 2 (100%)
500-599 / 0 (0%) / 2 (100%) / 0 (0%) / 2 (100%)
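The percentages in this table come from dividing each count by its row total, so each row shows how the people at one cotinine level are split among the three groups. A minimal sketch of that arithmetic follows; the counts are copied from the table, and the helper `percent_half_up` is our own name (it rounds halves up to whole percents).

```python
groups = ["Smokers", "ETS", "NOETS"]

# Counts copied from the table: one row per cotinine level.
rows = {
    "0-99":    [11, 34, 38],
    "100-199": [12, 2, 0],
    "200-299": [14, 1, 1],
    "300-399": [1, 1, 1],
    "400-499": [2, 0, 0],
    "500-599": [0, 2, 0],
}

def percent_half_up(count, total):
    """Whole-number percent of count out of total, with halves rounded up."""
    return (200 * count + total) // (2 * total)

# Conditional percentage = count / row total, computed level by level.
conditional = {}
print("Cotinine Level / " + " / ".join(groups) + " / TOTAL")
for level, row in rows.items():
    total = sum(row)
    conditional[level] = [percent_half_up(c, total) for c in row]
    cells = " / ".join(f"{c} ({p}%)" for c, p in zip(row, conditional[level]))
    print(f"{level} / {cells} / {total}")
```

Compare this with the relative frequency table, where each count is divided by its column (group) total instead. The same counts answer a different question depending on which total is used.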

Looking only at the 40 smokers surveyed:

Frequency Table of Cotinine Levels of Smokers
Cotinine Level / Frequency
0-99 / 11
100-199 / 12
200-299 / 14
300-399 / 1
400-499 / 2

Relative Frequency Table of Cotinine Levels of Smokers
Cotinine Level / Frequency / Relative Frequency
0-99 / 11 / 28%
100-199 / 12 / 30%
200-299 / 14 / 35%
300-399 / 1 / 3%
400-499 / 2 / 5%

A pie chart is another way to represent data visually. It is a graph depicting categorical (qualitative) data as slices of a pie, with the size of each slice proportional to that category's share of the total.

Cellular Phone Customer Complaints
Dropped Calls / 181
Internet Connection / 596
Texting Issues / 45
Cost/Fees / 344
Device Issues / 121
Miscellaneous / 90
Total / 1377
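To draw a pie chart from these counts, each category's slice is its share of the total, and the slice's central angle is that share of 360 degrees. The following sketch works through the arithmetic; the variable names are our own, and the counts are copied from the table.

```python
# Counts copied from the customer complaints table.
complaints = {
    "Dropped Calls": 181,
    "Internet Connection": 596,
    "Texting Issues": 45,
    "Cost/Fees": 344,
    "Device Issues": 121,
    "Miscellaneous": 90,
}

total = sum(complaints.values())  # 1377, matching the Total row

# For each category: (share of the whole, central angle in degrees).
slices = {}
for category, count in complaints.items():
    share = count / total
    slices[category] = (share, share * 360)

# List the slices from largest to smallest.
for category, (share, angle) in sorted(
    slices.items(), key=lambda item: item[1][0], reverse=True
):
    print(f"{category:<20} {share:6.1%} {angle:6.1f} degrees")
```

The shares add to 1 and the angles add to 360 degrees, which is what lets the slices fill the whole circle.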

Bar Chart

M&M Color Distribution
Blue / 13
Red / 7
Orange / 11
Green / 9
Yellow / 8
Brown / 7
TOTAL / 55

We can also display these counts as relative frequencies.
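As a rough sketch, the M&M counts can be converted to relative frequencies by dividing each count by the total of 55. The text bars printed here (one `#` per candy) are only a stand-in for a drawn bar chart, and the variable names are our own.

```python
# Counts copied from the M&M color table.
colors = {"Blue": 13, "Red": 7, "Orange": 11, "Green": 9, "Yellow": 8, "Brown": 7}

total = sum(colors.values())  # 55, matching the TOTAL row

# Relative frequency = count / total.
relative = {color: count / total for color, count in colors.items()}

for color, share in relative.items():
    bar = "#" * colors[color]  # one mark per candy
    print(f"{color:<7} {bar:<13} {share:.1%}")
```

The bars keep the same relative heights whether the axis shows counts or relative frequencies; only the scale on the axis changes.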

Side-by-Side Bar Chart

Drugs Used By College Students
Drug / Male / Female
Alcohol / 84% / 78%
Marijuana / 31% / 25%
Stimulants / 22% / 36%
Sedatives / 17% / 26%
Street Drugs / 9% / 4%
None / 12% / 18%
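A side-by-side bar chart places the bars for the two groups next to each other within every category, so the groups can be compared category by category. The sketch below prints a text version of that layout; the scale of one `#` per 2 percentage points is an arbitrary choice of ours, and the percentages are copied from the table.

```python
# Percentages copied from the table: (male, female) for each category.
usage = {
    "Alcohol": (84, 78),
    "Marijuana": (31, 25),
    "Stimulants": (22, 36),
    "Sedatives": (17, 26),
    "Street Drugs": (9, 4),
    "None": (12, 18),
}

SCALE = 2  # assumed: one '#' per 2 percentage points, to keep the bars short

# Two adjacent bars per category: male first, then female.
lines = []
for drug, (male, female) in usage.items():
    lines.append(f"{drug:<12} M {'#' * (male // SCALE)} {male}%")
    lines.append(f"{'':<12} F {'#' * (female // SCALE)} {female}%")
print("\n".join(lines))

# Categories where the female percentage is higher than the male percentage.
higher_for_women = [d for d, (m, f) in usage.items() if f > m]
print("Higher for women:", ", ".join(higher_for_women))
```

Each column of this table adds to more than 100%, presumably because a student can report more than one substance, so these percentages are not parts of a single whole and would not suit a pie chart.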

Segmented Bar Chart

Games Won
Sport / Home / Visitor
Baseball / 58% / 42%
Football / 64% / 36%
Hockey / 52% / 48%
Basketball / 74% / 26%

Segmented Bar Chart
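In a segmented bar chart every bar has the same total length (100%), and each bar is divided into segments in proportion to the percentages in its row. A minimal text sketch using the Games Won table follows; the bar width of 50 characters and the letters H and V are our own choices.

```python
# Percentages copied from the Games Won table: (home, visitor) for each sport.
games = {
    "Baseball": (58, 42),
    "Football": (64, 36),
    "Hockey": (52, 48),
    "Basketball": (74, 26),
}

WIDTH = 50  # assumed bar width in characters; each character is 2 percentage points

bars = {}
for sport, (home, visitor) in games.items():
    # Segmenting only makes sense when the parts make up the whole bar.
    assert home + visitor == 100
    home_len = round(home * WIDTH / 100)
    bars[sport] = "H" * home_len + "V" * (WIDTH - home_len)
    print(f"{sport:<10} |{bars[sport]}| {home}% home, {visitor}% visitor")
```

Because all the bars are the same length, the eye compares the position of the dividing line between segments, which makes it easy to see that the home share of wins is largest in basketball and smallest in hockey.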