Correlation

Use a Pearson correlation to examine the linear relationship between two interval or ratio variables.

Before calculating the Pearson correlation coefficient, it is a good idea to create a scatterplot.

  • The scatterplot can demonstrate if you indeed have a linear relationship (instead of a curvilinear one).
  • The scatterplot can help identify outliers that don’t fit with the pattern of your data. If you have an extreme outlier, you may have entered data incorrectly and you will need to correct it. Alternatively, one person may have misinterpreted the question and you might decide to omit their data from further analyses.

Example: Let’s say we want to examine the relationship between years of education and salary. We might expect a linear relationship between these variables, and hypothesize that salary is positively correlated with education. In other words, we would expect that people with more education tend to have higher salaries.

To create a scatterplot:

Graphs → Legacy Dialogs → Scatter/Dot

Select “Simple Scatter” and then “Define”

Send your Y variable to the “Y axis” (the vertical axis) box and your X variable to the “X axis” (the horizontal axis) box. Use your best judgment to decide which variable is X and which is Y; the choice does not change the correlation coefficient.

Click OK

Correlation Coefficient = effect size (or strength of relationship)

The effect size tells you the magnitude of the relationship between two variables. A correlation coefficient is an effect size that tells you not only the strength of the relationship, but also its direction. There are many different types of correlations; the Pearson r is just one of them.

A Pearson correlation coefficient (r) describes the strength and direction of the relationship between two interval and/or ratio variables (x and y). The Pearson r assumes that this relationship is linear, which is why it is worth checking the scatterplot first.

A correlation coefficient ranges from –1 to 1. The number indicates the strength of the relationship and the sign (+ or -) indicates the direction.

For example: r = 0 indicates no linear relationship.

r = -.48 is a negative relationship.

r = .09 is a positive relationship.

A correlation of -.48 is much stronger than a correlation of .09.
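To make the range and sign of r concrete, here is a minimal Python sketch that computes r from its definition: the sum of the cross-products of the deviations from each mean, divided by the square root of the product of the two sums of squared deviations. SPSS does this calculation for you; the function name pearson_r and the small data set are invented for illustration only.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations (the numerator of r)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has no variance")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: years of education and salary (in thousands)
education = [10, 12, 12, 14, 16, 16, 18, 20]
salary = [28, 35, 31, 40, 52, 47, 60, 58]

r = pearson_r(education, salary)
print(f"r = {r:.2f}")
```

Swapping the two arguments gives the same r, which is why the choice of x and y does not matter for a correlation.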

If you square the correlation coefficient (r²) you get the coefficient of determination, which tells you the proportion of variance in y accounted for by x. The coefficient of determination tells you the strength of the relationship, but doesn’t tell you the direction of the effect.

If r = -.48, r² = .23 meaning that 23% of the variance in y is accounted for by x.

If r = .09, r² = .008 meaning that less than 1% of the variance in y is accounted for by x.
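The arithmetic behind these two examples can be checked with a short Python sketch. The function name variance_accounted_for is a made-up helper, not an SPSS feature.

```python
def variance_accounted_for(r):
    """Return the coefficient of determination (r squared) for a correlation r."""
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and 1")
    return r ** 2

# The two correlations used in the examples above
for r in (-0.48, 0.09):
    r_squared = variance_accounted_for(r)
    print(f"r = {r:+.2f}, r squared = {r_squared:.3f} ({r_squared:.1%} of the variance)")
```

Because squaring removes the sign, r = -.48 and r = .48 account for the same proportion of variance.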

Describe a correlation as weak, moderate, or strong:

Strength / Correlation Coefficient / Coefficient of Determination and Proportion of Variance Accounted For
Weak (or small) / r = ±.10 / r² = .01 or 1% of the variance accounted for
Moderate (or medium) / r = ±.30 / r² = .09 or 9% of the variance accounted for
Strong (or large) / r = ±.50 / r² = .25 or 25% of the variance accounted for

Keep in mind that the guidelines are approximations; they are NOT intended to be clear cut-off scores.

Examples:

r = -.60 is a strong negative relationship.

r = -.03 is a weak negative relationship.

r = .12 is a small positive correlation.

r = .32 is a moderate positive relationship.

r = -.40 is a moderate to strong negative relationship.

r = .17 is a small to medium positive correlation.
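One way to apply these guidelines consistently is sketched below in Python. The benchmarks (.10, .30, .50) come from the guidelines above, but the tolerance of .05 that decides when a value is "close enough" to a benchmark is purely an assumption made for this sketch, as is the function name describe_correlation. Treat the labels as a starting point, not as cut-off scores.

```python
def describe_correlation(r, tolerance=0.05):
    """Describe r using the weak/moderate/strong guidelines (.10, .30, .50).

    `tolerance` is an assumed margin, not part of the guidelines: values
    within it of a benchmark get a single label, and values that fall
    between two benchmarks get a combined label.
    """
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and 1")
    if r == 0:
        return "no relationship"
    direction = "positive" if r > 0 else "negative"
    size = abs(r)  # strength ignores the sign
    benchmarks = [(0.10, "weak"), (0.30, "moderate"), (0.50, "strong")]

    if size <= 0.10:
        strength = "weak"
    elif size >= 0.50:
        strength = "strong"
    else:
        strength = None
        for value, label in benchmarks:
            if abs(size - value) <= tolerance:
                strength = label
        if strength is None:
            # Not close to any single benchmark: use a combined label
            strength = "weak to moderate" if size < 0.30 else "moderate to strong"
    return f"{strength} {direction} relationship"

for r in (-0.60, -0.03, 0.12, 0.32, -0.40, 0.17):
    print(f"r = {r:+.2f} is a {describe_correlation(r)}")
```

With the assumed tolerance, the six example values above receive the same descriptions as in the text (using "weak" for "small" and "moderate" for "medium").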

Statistical Significance Testing

In addition to describing the correlation coefficient, you should also determine whether the correlation meets the criteria for statistical significance. A significance test asks how likely it is that you would obtain a correlation at least as strong as yours by chance alone if there were really no relationship in the population (the null hypothesis). Researchers usually choose p < .05 as a reasonable cutoff (meaning that, if the null hypothesis were true, results this extreme would occur less than 5% of the time).

Statistical significance testing is based on probability theory: you test your results against a population in which the null hypothesis is true. If your results fall at the extreme (the rejection region) of that population, then you may conclude that they likely belong to a different population.

You do NOT have to calculate critical values. SPSS will provide the EXACT significance value as a p value. All you have to do is understand what the p value means and report it in your results.

Interpreting significant p values:

You can report p < .05, p < .01, or p < .001 as it fits your results. You can also simply report the exact p value.

For example:

If your results are p = .034, this means that, if there were no relationship in the population, a correlation at least this strong would occur by chance only 3.4% of the time.

 You can report p = .034 OR p < .05

If your results are p = .008, a correlation at least this strong would occur by chance less than 1% of the time if there were no relationship in the population.

 You can report p = .008 OR p < .01

If your results are p = .000, it does NOT mean that there is no chance of error. The p value is simply too small to show at three decimal places; in other words, a correlation this strong would occur by chance less than 0.1% of the time if there were no relationship in the population.

 Report p < .001

Interpreting non-significant p values:

You can report p > .05, p = ns, or the exact p value.

If your results are p = .21, this means that a correlation at least this strong would occur by chance 21% of the time if there were no relationship in the population. This probability is too high and does not meet the typical criterion of less than 5%. Thus, the results are considered non-significant.

 Report p = .21, p > .05, or p = ns

What if p = .05 or is really close to being significant (e.g., p = .06)? In such cases, you might say that the results “approach significance” and report the exact p value.
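The reporting rules above can be collected into a small Python helper. This is only a sketch: the function name format_p and the default cutoff alpha=0.05 are assumptions made for illustration, and the helper reports exact p values without a leading zero, as in the examples above.

```python
def format_p(p, alpha=0.05):
    """Return (report string, is_significant) for a p value."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    significant = p < alpha
    if p < 0.001:
        # SPSS displays such values as .000; report them as p < .001
        return "p < .001", significant
    # Round to three decimals, then drop trailing zeros (0.210 -> 0.21)
    text = f"{p:.3f}".rstrip("0").rstrip(".")
    if text.startswith("0."):
        text = text[1:]  # drop the leading zero, e.g. 0.034 -> .034
    return f"p = {text}", significant

for p in (0.034, 0.008, 0.0002, 0.21, 0.06):
    report, significant = format_p(p)
    print(report, "(significant)" if significant else "(non-significant)")
```

Note that the helper treats p = .05 and p = .06 as non-significant; whether to describe such values as "approaching significance" is left to your judgment.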

To calculate r: Analyze → Correlate → Bivariate

SPSS OUTPUT

Correlations
Variable / Statistic / yrseducation / salary
yrseducation / Pearson Correlation / 1.000 / .650*
yrseducation / Sig. (2-tailed) / / .042
yrseducation / N / 10 / 10
salary / Pearson Correlation / .650* / 1.000
salary / Sig. (2-tailed) / .042 /
salary / N / 10 / 10
*. Correlation is significant at the 0.05 level (2-tailed).
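For readers curious where the Sig. (2-tailed) value comes from, the sketch below reproduces it from r and N using the standard t test for a correlation: t = r × sqrt((N − 2) / (1 − r²)) with N − 2 degrees of freedom. This is not necessarily how SPSS computes the value internally. The function names and the simple numerical integration (Simpson's rule) are choices made for this illustration so that it runs with the Python standard library alone.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def correlation_p_value(r, n, steps=2000):
    """Return (t, two-tailed p) for a Pearson r from a sample of n pairs.

    `steps` must be even (required by Simpson's rule).
    """
    if n < 3 or not -1 < r < 1:
        raise ValueError("need n >= 3 and -1 < r < 1")
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule for the area under the t density between 0 and |t|
    h = t / steps
    area = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    # The area left over in both tails is the two-tailed p value
    return t, 1 - 2 * area

# Values taken from the SPSS output above: r = .650, N = 10
t, p = correlation_p_value(0.650, 10)
print(f"t({10 - 2}) = {t:.3f}, p = {p:.3f}")
```

With r = .650 and N = 10, the p value rounds to .042, matching the Sig. (2-tailed) value in the output.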

Writing up Results

When you write up results from a Pearson correlation, you will include the following information:

  • The statistical analysis used (Pearson correlation)
  • The variables examined (never use SPSS codes such as Q1, etc.)
  • The r, r², or the proportion of variance accounted for (choose one)
  • The direction of the relationship (positive or negative)
  • The strength of the relationship (weak, moderate, or strong)
  • The results of statistical significance testing (e.g., the p value)
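As an illustration of how these pieces fit into one sentence, here is a small Python sketch that assembles a write-up from the values in the SPSS output above (r = .650, p = .042). The function name write_up and the exact sentence wording are only one possible phrasing, not a required format.

```python
def write_up(x_name, y_name, r, p_text, strength, direction):
    """Assemble a one-sentence summary of a Pearson correlation."""
    # Two decimals without the leading zero: 0.65 -> .65, -0.48 -> -.48
    r_text = f"{r:.2f}".replace("0.", ".", 1)
    return (
        f"A Pearson correlation showed a {strength} {direction} relationship "
        f"between {x_name} and {y_name}, r = {r_text}, {p_text}."
    )

sentence = write_up("years of education", "salary", 0.650, "p = .042", "strong", "positive")
print(sentence)
# prints: A Pearson correlation showed a strong positive relationship between years of education and salary, r = .65, p = .042.
```

Note that the sentence names the variables in words ("years of education") instead of using the SPSS variable name (yrseducation).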