Journal Club Stats Moment
Correlation Coefficients
Scatter plots are similar to line graphs in that they use horizontal and vertical axes to plot data points. However, they have a very specific purpose. Scatter plots show how the values of one variable change along with the values of another. The relationship between the two variables is called their correlation.
A correlation coefficient is a quick way to express the direction and strength of the relationship between two variables. There are several formulas that can be used to calculate the correlation between two variables.
Some facts about correlation coefficients:
- The methods of correlation describe the relationship between two variables. Most methods of correlation were designed to measure the degree to which two variables are linearly related, that is, how closely the relationship resembles a straight line.
- The correlation coefficient (symbolized rxy) indicates the degree of association between two variables.
- The values of rxy can range from -1.00 to +1.00.
- The sign of the correlation indicates the direction of the relationship. Positive values of rxy indicate that low values on one variable go with low values on the other, and high values go with high values. Look at the first figure for an example of positive correlation. Negative values of rxy indicate an inverse relationship: high values on one variable go with low values on the other. Look at the second figure for an example of negative correlation.
- When rxy = 0, no linear relationship exists between the two variables. They are uncorrelated, although they may still be related in a nonlinear way.
- The absolute value of rxy (when you ignore the plus or minus sign) indicates how closely two variables are related. Values close to -1.00 or +1.00 reflect a strong relationship between the two variables; values near zero indicate a weak relationship.
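The facts above can be checked with a short calculation. The following is a minimal Python sketch of the Pearson formula (the sum of the products of the deviations from the means, divided by the square root of the product of the sums of squared deviations). The function name pearson_r and the data values are made up for illustration and do not come from the article.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations from the means (numerator)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (used in the denominator)
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, invented for this example
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 5, 4, 5]      # tends to rise with x
y_down = [10, 8, 6, 4, 2]   # falls in a straight line as x rises

r_pos = pearson_r(x, y_up)
r_neg = pearson_r(x, y_down)
print(round(r_pos, 2), round(r_neg, 2))  # prints: 0.77 -1.0
```

The first pair gives a positive value below 1 because the points rise together but do not fall exactly on a line. The second pair gives -1.00 because the points lie exactly on a downward-sloping line.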
The peds appy score article used two different correlation tests: the Pearson test and the Spearman test.
The Pearson correlation (also referred to as the Pearson r) is the most commonly used measure of correlation. It is used to describe the linear relationship between two quantitative variables. This is a parametric measure of correlation.
The Spearman rank correlation (also referred to as Spearman's rho) is a non-parametric measure of correlation. It does not assume a linear relationship, so it can be used when two variables are monotonically related. When two variables are monotonically related, the values of one variable tend to increase when the values of the other variable increase, but not necessarily linearly. In addition, it can be used with ranked data.
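One way to see the difference between the two measures is to compute Spearman's rho as the Pearson correlation of the ranks. The Python sketch below does this under stated assumptions: the helper names (pearson_r, average_ranks, spearman_rho) and the data are hypothetical, and tied values are given the average of their ranks, which is a common convention.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y always increases with x, but along a curve
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(round(r, 2), round(rho, 2))  # prints: 0.94 1.0
```

Because y always increases when x increases, the ranks match perfectly and rho is 1.00. The Pearson r is lower because the points follow a curve rather than a straight line.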
“Correlation does not imply causation.” This important dictum means that correlation alone cannot be validly used to infer a causal relationship between the variables. It does not mean that correlations never reflect causal relationships, only that being correlated is not enough to establish a causal relationship (in either direction). For example, if X and Y are positively correlated, it does not prove that X causes Y any more than it proves Y causes X. X and Y might be correlated with each other due to the common influence of a third variable.
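The third-variable situation can be illustrated with a small simulation. In the Python sketch below, which uses invented numbers and a hypothetical setup, X and Y have no effect on each other; both are simply a shared variable Z plus independent random noise. The seed, sample size, and noise level are arbitrary choices.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000

# Z is the third variable; X and Y each depend on Z but not on each other
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 0.5) for zi in z]
y = [zi + rng.gauss(0, 0.5) for zi in z]

# X and Y are strongly correlated even though neither causes the other
r_xy = pearson_r(x, y)

# After subtracting Z's contribution, only the independent noise is left
x_rest = [xi - zi for xi, zi in zip(x, z)]
y_rest = [yi - zi for yi, zi in zip(y, z)]
r_rest = pearson_r(x_rest, y_rest)

print(round(r_xy, 2), round(r_rest, 2))
```

In this setup the correlation between X and Y should come out high (around 0.8 in theory), while the correlation of what remains after removing Z should be close to zero. The association between X and Y comes entirely from Z.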
Q: Describe the main difference between parametric and non-parametric statistics.
Q: Name three common parametric and three common non-parametric tests.