2 - Data/Variable Types
DATA TYPE DETERMINES WHAT STATISTICAL METHODS WE USE TO SUMMARIZE AND DRAW CONCLUSIONS FROM OUR DATA!
There are three main variable types:
Numerical/Quantitative
Ordinal
Categorical/Qualitative
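To see why the type matters, here is a minimal Python sketch of one sensible summary for each of the three types. The variable names and values are made up for illustration and are not taken from any study in these notes.

```python
from collections import Counter
from statistics import mean, median

# Made-up illustrative values (hypothetical, not real data).
depth_ft = [12.0, 35.0, 20.0, 81.0, 52.0]              # numerical: arithmetic is meaningful
clarity_rank = [1, 3, 2, 2, 3]                         # ordinal: 1 = low, 2 = medium, 3 = high
region = ["north", "south", "north", "east", "north"]  # categorical: labels only

# Numerical: averaging makes sense because differences between values are meaningful.
numerical_summary = mean(depth_ft)

# Ordinal: the order is meaningful but the spacing between codes is not,
# so a middle value (median) is a safer summary than an average.
ordinal_summary = median(clarity_rank)

# Categorical: the values are just names, so we count how often each occurs.
categorical_summary = Counter(region)

print(numerical_summary, ordinal_summary, categorical_summary)
```

Note that a variable stored as numbers is not automatically numerical; what matters is whether arithmetic on the values is meaningful.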
Example: Assessing Mercury Levels Found in Fish in Maine Lakes
Mercury is a toxic metal sometimes found in fish consumed by humans. The state of Maine conducted a field study of 115 lakes to characterize mercury levels in fish, measuring mercury and 10 variables on lake characteristics. From these data, we can investigate the following four questions of potential interest:
- Are mercury levels high enough to be of concern in Maine lakes?
- Do dams and other man-made flowage controls increase/decrease mercury levels?
- Do different types of lakes have different mercury levels?
- Which lake characteristics best predict mercury levels?
Data File in JMP: Maine Mercury Study
The variables measured by the researchers as part of this field study are listed below. Classify each variable according to type.
- Merc (ppm): Mercury level found in fish fillets in parts per million
- N: number of fish in the composite sample
- Elevation: elevation of the lake (feet)
- Surf Area: surface area of the lake (acres)
- Z: maximum depth (feet)
- Lake type: 1 = oligotrophic, 2 = eutrophic, 3 = mesotrophic
- ST: lake stratification indicator. (1 = yes, 0 = no)
This refers to whether or not there is temperature stratification within a lake. In summer, the lake surface warms up and a decreasing temperature gradient may exist with the bottom remaining cold. A lake is considered stratified if a temperature decrease of 1 degree per meter or greater exists with depth.
- DA: drainage area (square miles)
Area of land which collects and drains the rainwater which falls on it, such as the area around the lake.
- RF: runoff factor = (total runoff during year)/(total precipitation during year)
Runoff is the amount of rainwater or melted snow which flows in rivers and streams. In general, higher runoff factors may lead to more surface water from the lake's watershed reaching the lake. If contaminants come from a local source, this may influence the concentration found in fish.
- FR: flushing rate = (total inflow volume during year)/(total volume of lake)
This gives the number of times all water is theoretically exchanged during a year.
- DAM: Dept. of Inland Fisheries and Wildlife impoundment class
0 = no functional dam present; all natural flowage
1 = some man-made flowage in the drainage area
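Three of the variables above are defined by a formula or rule: RF, FR, and ST. The Python sketch below shows how each could be computed. The function names, the input values, and the reading of the stratification rule (any two adjacent readings cooling by at least 1 degree per meter) are assumptions made for illustration; they are not the procedure used in the Maine study.

```python
def runoff_factor(total_runoff, total_precipitation):
    """RF = (total runoff during year) / (total precipitation during year)."""
    return total_runoff / total_precipitation


def flushing_rate(total_inflow_volume, lake_volume):
    """FR = (total inflow volume during year) / (total volume of lake)."""
    return total_inflow_volume / lake_volume


def stratification_indicator(profile):
    """Return 1 if the lake looks stratified, otherwise 0.

    profile: list of (depth_in_meters, temperature) pairs ordered from the
    surface down. Assumed rule: stratified if any two adjacent readings show
    a temperature decrease of at least 1 degree per meter.
    """
    for (d1, t1), (d2, t2) in zip(profile, profile[1:]):
        if (t1 - t2) / (d2 - d1) >= 1.0:
            return 1
    return 0


# Hypothetical inputs, in consistent units within each ratio.
rf = runoff_factor(20.0, 40.0)    # half of the precipitation ran off
fr = flushing_rate(300.0, 100.0)  # lake water exchanged three times in the year
st = stratification_indicator([(0, 22.0), (2, 21.5), (4, 18.0), (6, 17.5)])

print(rf, fr, st)
```

RF and FR are ratios, so their values can fall anywhere in a range, while ST takes only the codes 0 and 1. Keep this in mind when classifying these variables.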
What other information (variables) would be potentially useful in a study like this?