Chapter 3: Descriptive Statistics: Bivariate Correlation

Bivariate Correlation: Relationship between two variables

Concerned with two things:

  1. Is there a relationship?
  2. If there is one, how strong or weak is it?

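The relationship can take one of three forms:

  1. High-high, low-low (high scores on one variable go with high scores on the other, and low with low)
  2. High-low, low-high (high scores on one variable go with low scores on the other)
  3. Little systematic tendency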

Correlation Coefficient:

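-A single number that summarizes the direction and strength of the relationship between two variables

-Ranges from -1 to +1

-Toward -1 means high-low (inverse relationship; negative correlation)

-Toward +1 means high-high (direct relationship; positive correlation)

-Near 0 means little systematic tendency

-Depending on its absolute value, the relationship is described with words such as weak, moderate, or strong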
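
As a rough illustration of how the coefficient behaves, the sketch below computes Pearson's r from its definition (the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations), using only the Python standard library. The function name `pearson_r` and the small data sets are hypothetical, made up for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson's product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the two means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data (not from the chapter)
hours_studied = [1, 2, 3, 4, 5]
hours_tv = [5, 4, 3, 2, 1]
exam_score = [52, 58, 65, 70, 80]

r_direct = pearson_r(hours_studied, exam_score)   # high-high: close to +1
r_inverse = pearson_r(hours_tv, exam_score)       # high-low: close to -1

print("studied vs. score:", round(r_direct, 2))
print("tv vs. score:     ", round(r_inverse, 2))
```

Because `hours_tv` is simply `hours_studied` reversed, the second coefficient has the same size as the first but the opposite sign.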

Correlation Matrix

-Table of correlation coefficients between different pairs of variables
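
A correlation matrix can be sketched by computing the coefficient for every pair of variables and laying the results out in a square table. The helper names `pearson_r` and `correlation_matrix`, and the three variables, are hypothetical and chosen only for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson's product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(variables):
    """Return a nested dict: matrix[a][b] is r between variables a and b."""
    names = list(variables)
    return {a: {b: pearson_r(variables[a], variables[b]) for b in names}
            for a in names}

# Hypothetical data (not from the chapter)
data = {
    "height": [150, 160, 165, 172, 180],
    "weight": [50, 58, 63, 70, 82],
    "absences": [9, 4, 7, 2, 5],
}

matrix = correlation_matrix(data)
names = list(data)

# Print the matrix with one row and one column per variable
print(" " * 10 + "".join(f"{name:>10}" for name in names))
for a in names:
    print(f"{a:<10}" + "".join(f"{matrix[a][b]:>10.2f}" for b in names))
```

The diagonal is always 1 (each variable correlates perfectly with itself) and the matrix is symmetric, which is why published tables often show only the upper or lower triangle.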

Different correlation coefficients are used depending on the type of variables:

Correlation coefficient / Variable 1 / Variable 2 / Remarks
Pearson’s Product-Moment (r) / Quantitative (interval or ratio); represents raw scores / Quantitative (interval or ratio); represents raw scores / -
Spearman’s Rho / Quantitative; represents ranks / Quantitative; represents ranks / -
Kendall’s Tau / Quantitative; represents ranks / Quantitative; represents ranks / Similar to Rho but differs in the treatment of ties (Tau handles ties better)
Point Biserial / Raw scores / Dichotomous (1 or 0) / -
Biserial / Raw scores / Artificial dichotomy / -
Phi / Dichotomous (1 or 0) / Dichotomous (1 or 0) / -
Tetrachoric / Artificial dichotomy / Artificial dichotomy / -
Cramer’s V / Nominal / Nominal / With only two categories per variable, Cramer’s V is identical to Phi
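
Three of the coefficients in the table are Pearson's r applied to differently coded data: Spearman's Rho is r computed on ranks, Point Biserial is r with one variable coded 1 or 0, and Phi is r with both variables coded 1 or 0. The sketch below shows this with standard-library Python; the function names and all data are hypothetical. Kendall's Tau, the biserial and tetrachoric coefficients, and Cramer's V use different formulas and are not covered here.

```python
import math

def pearson_r(x, y):
    """Pearson's product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of rank positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's Rho: Pearson's r computed on the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y always rises with x, but not in a straight line
x = [10, 20, 30, 40, 50]
y = [1, 4, 9, 16, 1000]
rho = spearman_rho(x, y)          # ranks agree perfectly
r_raw = pearson_r(x, y)           # raw scores are not perfectly linear

# Point Biserial: raw scores vs. a 1/0 variable (hypothetical pass/fail)
scores = [55, 60, 62, 70, 75, 80]
passed = [0, 0, 0, 1, 1, 1]
r_pb = pearson_r(scores, passed)

# Phi: two 1/0 variables (hypothetical)
smoker = [1, 1, 1, 0, 0, 0, 1, 0]
cough = [1, 1, 0, 0, 0, 1, 1, 0]
phi = pearson_r(smoker, cough)

print("rho:", round(rho, 2), " r on raw scores:", round(r_raw, 2))
print("point biserial:", round(r_pb, 2))
print("phi:", round(phi, 2))
```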

Warnings about correlation:

  1. Correlation does not imply causation
  2. Coefficient of determination (the square of the correlation coefficient, r^2)
     -Indicates the proportion of variability in either variable that is explained by the other variable
     -For example, r = 0.50 gives r^2 = 0.25, so only 25% of the variability is explained
  3. Outliers
     -A few extreme scores can raise or lower the coefficient substantially
  4. Linearity
     -Check whether the two variables have a linear relationship to begin with
     -If the relationship is curvilinear, Pearson’s correlation will underestimate its strength
  5. Correlation and independence
     -The two instruments used to measure the variables should (ideally) be independent
     -Hence, the correlation between them should be low
  6. Relationship strength
     -The choice of words (weak, moderate, strong) is still subjective; check the raw data if necessary and possible
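
Three of the warnings above (the coefficient of determination, linearity, and outliers) can be made concrete with a small sketch. It uses only the Python standard library; `pearson_r` is a hypothetical helper and every data set is invented for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson's product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Coefficient of determination: square the correlation coefficient.
r = 0.50
r_squared = r ** 2  # 0.25, i.e. 25% of the variability is explained

# Linearity: y is completely determined by x (y = x squared), yet the
# relationship is curvilinear, so Pearson's r misses it entirely here.
x_curve = [-3, -2, -1, 0, 1, 2, 3]
y_curve = [v ** 2 for v in x_curve]
r_curve = pearson_r(x_curve, y_curve)

# Outliers: a weakly related set of scores, then the same scores with
# one extreme point added.
x_base = [1, 2, 3, 4, 5, 6]
y_base = [3, 1, 4, 2, 5, 3]
r_base = pearson_r(x_base, y_base)
r_with_outlier = pearson_r(x_base + [30], y_base + [30])

print("r squared for r = 0.50:", r_squared)
print("r for the curvilinear data:", round(r_curve, 2))
print("r without outlier:", round(r_base, 2))
print("r with outlier:   ", round(r_with_outlier, 2))
```

In this invented data a single extreme point turns a weak correlation into a very strong one, which is why checking the raw data (for example with a scatter plot) is worthwhile before interpreting a coefficient.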