Business Statistics 240

This course is an introduction to "quantitative reasoning" (thinking with numbers). The main objective of the course is to present the foundations of problem solving using quantitative approaches. These approaches are referred to by the generic term "statistical methods", and their purpose is to enable us to reach conclusions about “problems” from observable data sets called "samples".

I. Steps in problem solving using statistical methods:

1) Identification of a problem:

a) Recognition.

b) Definition.

2) Taking inventory of informational requirements obtained from (1)(b).

3) Gathering data. Complication: not all data is observable. Solution: take a representative subset of all data, a "sample".

4) Organizing available data. COVERED IN COURSE

5) Analyzing available data. COVERED IN COURSE

6) Reaching conclusions from data. COVERED IN COURSE

7) Prescribing “solution” to “problem”.
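
Step 3 can be illustrated with a short sketch. It assumes simple random sampling as one possible way to obtain a representative subset (these notes do not prescribe a sampling method), and the population values and the `draw_sample` helper are made up for illustration.

```python
import random

# Hypothetical population: 1,000 invoice amounts that we pretend
# cannot all be observed in practice.
rng = random.Random(240)  # fixed seed so the sketch is reproducible
population = [round(rng.uniform(10, 500), 2) for _ in range(1000)]

def draw_sample(data, size, seed=0):
    """Return a simple random sample (without replacement) of `size` items."""
    return random.Random(seed).sample(data, size)

sample = draw_sample(population, 50)
print(len(sample), "observations drawn from", len(population))
```

Whether 50 observations are enough is exactly the first "checks and balances" question in these notes; the sketch does not answer it.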

II. Most important checks and balances (debugging):

  1. Is the sample large enough (to be representative of any/all data that could have been used in the above steps)?
  2. Do the conclusions from the sample of data used extend/generalize to the “problem” over any/all data?

III. Taxonomy of Data:

Statistical methods are applicable to many kinds of data[1]. These varieties of data are classified into two categories (Qualitative or Quantitative) and into four levels of measurement (Nominal, Ordinal, Interval and Ratio).

Categories of Data

Qualitative data answers the question “What kind?” while Quantitative answers the question “How much?”. There is a greater variety of quantitative data than qualitative data because:

(1) the level of detail required of our measurements determines the numeric scale/system employed (continuous or discrete), and

(2) typically one cannot account for all the possible values that Quantitative data may take, while it is typically easy to count the different observed values of Qualitative data.

By Category / Discrete, Continuous, Both? / Countable, Not Countable, Both?
Qualitative / Discrete / Countable
Quantitative / Both / Both

Levels of Measurement

By level of measurement, data of interest may take any of four forms, depending on what we want the data for: labeling, sorting, measuring intensity, or gaining perspective or a sense of relation. Data can be Nominal, Ordinal, Interval or Ratio. Nominal data enables us to label or name observations. Ordinal data can be used to sort, rank or order the elements of a sample or population. Interval data captures the intensity of some phenomenon by giving us an idea of the distance between the lowest and highest values an item of interest takes. Ratio data adds a reference point: expressed as a ratio, a value can be read relative to the value in the denominator, which gives us a clue as to how much variability there is in the data.

By Level of Measurement / Organization of Data
Nominal / None
Ordinal / Sorting Possible
Interval / Sorting and Intensity Possible
Ratio / Sorting, Intensity and Reference Point Possible
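
The table above can be read as a cumulative list of capabilities. The sketch below encodes it directly and shows one consequence for Ordinal data: sorting requires an explicit rank order. The `supports` helper, the `RANK` scale and the sample ratings are hypothetical names and values chosen for illustration.

```python
# Capabilities per level of measurement, copied from the table above.
CAPABILITIES = {
    "Nominal": set(),
    "Ordinal": {"sorting"},
    "Interval": {"sorting", "intensity"},
    "Ratio": {"sorting", "intensity", "reference point"},
}

def supports(level, capability):
    """True if data at `level` supports `capability`."""
    return capability in CAPABILITIES[level]

# Ordinal data: sorting needs an explicit rank order (hypothetical scale).
RANK = {"low": 0, "medium": 1, "high": 2}
ratings = ["high", "low", "medium", "low"]
sorted_ratings = sorted(ratings, key=RANK.get)

print(sorted_ratings)
print(supports("Nominal", "sorting"), supports("Ratio", "reference point"))
```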

The following table presents a full organization of all descriptors of data, by both methods of data classification:


By Level of Measurement / Qualitative ("discrete" and "countable") / Quantitative (may be "continuous" or not; may be "countable" or not)
Nominal (no order necessary) / Names/Labels; Attributes / (none listed)
Ordinal: "a", "b", "c", "d", "e", … (sorting) / Rankings; Lexicographic (culture) / All "number" systems used for counting
Interval: from "a" to "b" (sorting; intensity) / (none listed) / Temperature; Stock Prices; Statistical Classes
Ratio: "a":"b" or "a"/"b" (sorting; intensity; reference point) / (none listed) / All existing divisions of all "number" systems used for counting

Treatment of Data: Organizing Samples in Chapters 4 and 6.

There are two questions we will always seek to answer when analyzing data using statistical methods:

  1. What is happening? What values are observed in the gathered samples of data? (observed data values, ODV)
  2. How often do we observe the values that are occurring in the data? Do these values repeat themselves in identifiable or predictable patterns? (FREQUENCY)

Two of the first substantive chapters in your textbook for the course, Chapters 4 and 6, elaborate on the treatment of data by offering you information on how we use statistics to organize sample data in ways that permit us to come up with a meaningful arrangement of data values by category as well as by level of measurement.

Chapters 4 and 6 describe visual tools that help you organize data via charts, graphs or plots that provide insights into the data of interest to you. Chapter 6 goes into more detail and offers definitions of what we formally call "statistics": numbers that describe features of interest about the observed data values in samples.

However, both chapters offer us complementary views of observed data values that permit us to meaningfully arrange samples of data into organized formats that allow us to tell what values occur and how often they happen.

Visual Organization of Data

In the case of chapter four, we are ultimately interested in charting (picturing) different varieties of data to obtain a clearer understanding or perspective about the content of data. As the saying goes: "a picture is worth a thousand words".

Effective charting tools for Chapter 4


By Level of Measurement / Qualitative ("discrete" and "countable") / Quantitative (may be "continuous" or not; may be "countable" or not)
Nominal (no order necessary) / Pareto Chart; Pie Chart / (none listed)
Ordinal: "a", "b", "c", "d", "e", … (sorting) / Bar Graph / Histogram/Polygon; Ogive; Stem-Leaf Plot; Run Chart/Time Series
Interval: from "a" to "b" (sorting; intensity) / (none listed) / Histogram/Polygon; Ogive; Stem-Leaf Plot; Run Chart/Time Series
Ratio: "a":"b" or "a"/"b" (sorting; intensity; reference point) / (none listed) / Histogram/Polygon; Ogive; Stem-Leaf Plot; Run Chart/Time Series

All of the charting tools below use one fundamental vehicle of data organization: the frequency table. A frequency table is an array of rows and columns that organizes data into "classes" (rows) that share some common element(s).

The columns of a frequency table then contain counts (absolute and relative) of how often different values arise in each class of data, hence the name frequency table.
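
As a minimal sketch of this idea for Qualitative data, the code below builds a frequency table with one row per class and columns for absolute and relative counts. The `frequency_table` helper and the payment data are hypothetical and only serve as an illustration.

```python
from collections import Counter

def frequency_table(values):
    """Return rows of (class, absolute frequency, relative frequency)."""
    counts = Counter(values)
    total = len(values)
    # most_common() lists the classes from most to least frequent
    return [(cls, n, n / total) for cls, n in counts.most_common()]

# Hypothetical sample of a qualitative (nominal) variable.
payment = ["card", "cash", "card", "check", "card", "cash"]
table = frequency_table(payment)

for cls, absolute, relative in table:
    print(f"{cls:<6} {absolute:>3} {relative:>6.2f}")
```

Listing the classes from most to least frequent is a design choice, not a requirement; it happens to be the ordering a Pareto chart uses.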

Depending on the type of data --by category or by level of measurement-- we are using for purposes of our analysis, the process of constructing frequency tables will vary, as will the kind of charting tool that most effectively organizes the data visually --see the last table above, and the slides below.

Building Frequency Tables

For Qualitative data
• classes (rows) are by types of attributes
• first column describes attributes
• all other columns count frequencies in different ways

For Quantitative data
• classes (rows) are ranges of values of equal magnitude (class width)
• first column details ranges for each class
• all other columns count frequencies...

Classes? Class Width? Frequency-table Construction Steps for Quantitative Data
• Sturges' rule: #classes = 1 + 3.3*log10(n), where n is the sample size
• class width = (max - min) / #classes
• start first class close to min
• start second class close to min+(class width), and so on…
• all other columns count frequencies
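
The construction steps for Quantitative data can be sketched as follows. This is one possible reading of the steps, not a definitive procedure: it assumes a base-10 logarithm, rounds the number of classes up to a whole number, starts the first class exactly at the minimum, and places the maximum in the last class. The function names and sample values are hypothetical.

```python
import math

def sturges_classes(n):
    """Number of classes, 1 + 3.3*log10(n), rounded up (an assumption)."""
    return math.ceil(1 + 3.3 * math.log10(n))

def quantitative_frequency_table(values):
    """Return rows of (class start, class end, absolute, relative)."""
    k = sturges_classes(len(values))
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; class width would be zero")
    width = (hi - lo) / k
    counts = [0] * k
    for v in values:
        # The maximum falls on the last boundary; keep it in the last class.
        i = min(int((v - lo) / width), k - 1)
        counts[i] += 1
    n = len(values)
    return [(lo + i * width, lo + (i + 1) * width, c, c / n)
            for i, c in enumerate(counts)]

# Hypothetical sample of 10 quantitative observations.
values = [12, 15, 21, 22, 25, 31, 34, 38, 40, 47]
table = quantitative_frequency_table(values)

for start, end, absolute, relative in table:
    print(f"{start:5.1f} to {end:5.1f} {absolute:>3} {relative:>6.2f}")
```

With 10 observations the rule gives 4.3, which is rounded up to 5 classes of width (47 - 12) / 5 = 7.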


[1] A more formal definition of what we mean by data will be given in class.