Testing the Difference Between Two Independent r's:

Fisher's r to z transformation:

When ρ ≠ 0, the sampling distribution of r becomes more radically skewed the closer the correlation is to ±1, and the standard error becomes difficult to estimate. Using Fisher's r to z transformation, both correlation coefficients are converted to r':

r' = (1/2) ln[(1 + r) / (1 - r)]

where r' is approximately normally distributed. (r' is often used instead of z in this formula to avoid confusion with z, the standard normal deviate.)
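
For instance, the transformation is a one-liner in Python; math.atanh is mathematically identical to (1/2) ln[(1 + r) / (1 - r)] (the function name fisher_z is ours):

import math

def fisher_z(r):
    # Fisher's r-to-z transformation: r' = 0.5 * ln((1 + r) / (1 - r))
    return math.atanh(r)

print(round(fisher_z(.75), 3))  # 0.973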

Standard Error for r':

Compute the standard error for each of the transformed correlation coefficients using:

s_r' = 1 / sqrt(n - 3)

Test of the H_0 that ρ_1 = ρ_2 (i.e., ρ_1 - ρ_2 = 0):

To test the difference between the two independent r' values, use the following formula, where critical values can be found in a z-table (e.g., for α = .05, two-tailed, z_crit = ±1.96):

z = (r'_1 - r'_2) / sqrt( 1/(n_1 - 3) + 1/(n_2 - 3) )
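
A minimal sketch of the whole test in Python (the function name independent_r_z is ours):

import math

def independent_r_z(r1, n1, r2, n2):
    # transform each r to r', then divide the difference by its standard error
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se_diff = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (z1 - z2) / se_diff  # compare to +/-1.96 for alpha = .05, two-tailed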

Example:

Assume that for a sample of 63 males, the correlation between level of education and income is .75. Further, for a sample of 63 females, the correlation between level of education and income is .60. A researcher is interested in determining if these two correlation coefficients are significantly different from one another.

      Males (m)   Females (f)
r     .75         .60
n     63          63

First, the correlation coefficients are transformed using Fisher’s r to z transformation. This can be done using the formula above or using an r to z table:

For males: r'_m = (1/2) ln[(1 + .75) / (1 - .75)] = (1/2) ln(7) = .973

For females: r'_f = (1/2) ln[(1 + .60) / (1 - .60)] = (1/2) ln(4) = .693

Although it is not needed to calculate the significance of the difference between the two transformed correlation coefficients, the standard error for each transformed correlation coefficient can be computed as:

For males: s_r'm = 1 / sqrt(63 - 3) = 1 / sqrt(60) = .129

For females: s_r'f = 1 / sqrt(63 - 3) = 1 / sqrt(60) = .129

Test if the difference between the two transformed correlation coefficients is statistically significant:

Let’s say the researcher predicts that there is no difference in the true population correlations for males and females.

H_0: ρ_m = ρ_f  *predicted
H_1: ρ_m ≠ ρ_f

Using α = .05, z_crit = ±1.96.

Find z_obs:

z_obs = (r'_m - r'_f) / sqrt( 1/(n_m - 3) + 1/(n_f - 3) ) = (.973 - .693) / sqrt(1/60 + 1/60) = .280 / .183 = 1.53

Since 1.53 is greater than -1.96 and less than 1.96, it does not fall in the rejection region. Therefore, the researcher fails to reject the null and concludes that ρ_m = ρ_f. It appears that, on average, the correlation between level of education and income is equal for males and females. From here, the researcher can go on to determine the effect size and 95% confidence interval.
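
The example's arithmetic can be checked end to end with a standalone script (variable names are ours):

import math

r_m, n_m, r_f, n_f = .75, 63, .60, 63
z_m, z_f = math.atanh(r_m), math.atanh(r_f)          # .973 and .693
se_diff = math.sqrt(1 / (n_m - 3) + 1 / (n_f - 3))   # .183
z_obs = (z_m - z_f) / se_diff
print(round(z_obs, 2))  # 1.53 -> inside (-1.96, 1.96), fail to reject H_0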

Testing the Difference Between Two Dependent r's:

One example of two dependent r's would be looking at the correlation between two variables in a sample at Time 1 and then again in the same sample at Time 2. Another example would be two correlations that both share a common variable.

Determinant of the 3 x 3 matrix of intercorrelations:

Calculated as:

|R| = 1 - r_12^2 - r_13^2 - r_23^2 + 2(r_12)(r_13)(r_23)
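
As a quick check that this closed form matches a direct determinant computation, here is a short NumPy snippet (the values .50, .30, .20 are illustrative, not from the example below):

import numpy as np

r12, r13, r23 = .50, .30, .20  # illustrative values
R = np.array([[1.0, r12, r13],
              [r12, 1.0, r23],
              [r13, r23, 1.0]])
closed_form = 1 - r12**2 - r13**2 - r23**2 + 2 * r12 * r13 * r23
print(np.isclose(np.linalg.det(R), closed_form))  # True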

Test of the H_0 that ρ_12 = ρ_13 (i.e., ρ_12 - ρ_13 = 0):

To test the difference between the two r values, use the following formula (Williams' test), where critical values can be found in a t-table. The resulting t-value has n - 3 df. The two dependent correlation coefficients of interest are r_12 and r_13, and r_23 is the correlation between the two predictors.

t = (r_12 - r_13) sqrt[ ((n - 1)(1 + r_23)) / ( 2|R|((n - 1)/(n - 3)) + ((r_12 + r_13)^2 / 4)(1 - r_23)^3 ) ]
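
A minimal sketch of Williams' test as given above (the function name williams_t is ours):

import math

def williams_t(r12, r13, r23, n):
    # r12 and r13 share variable 1; r23 links the two remaining variables
    det_R = 1 - r12**2 - r13**2 - r23**2 + 2 * r12 * r13 * r23
    num = (n - 1) * (1 + r23)
    den = 2 * det_R * (n - 1) / (n - 3) + ((r12 + r13)**2 / 4) * (1 - r23)**3
    return (r12 - r13) * math.sqrt(num / den)  # evaluate on n - 3 df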

Example:

Assume that a researcher administers a measure of relationship satisfaction, a measure of honesty and a measure of kindness to a sample of 58 adults who are currently in a committed relationship. The following correlations are obtained:

                                Satisfaction (r)   Honesty (h)   Kindness (k)
Relationship Satisfaction (r)   1.00               .78           .56
Honesty (h)                                        1.00          .49
Kindness (k)                                                     1.00

The researcher is interested in determining if the correlation between relationship satisfaction and honesty (.78) is significantly larger than the correlation between relationship satisfaction and kindness (.56). Since both correlations share relationship satisfaction, they are dependent.

First, compute the determinant of the 3 x 3 matrix of intercorrelations:

|R| = 1 - (.78)^2 - (.56)^2 - (.49)^2 + 2(.78)(.56)(.49) = 1 - .6084 - .3136 - .2401 + .4281 = .2660

Test if the two correlation coefficients of interest (r_rh = .78 and r_rk = .56) are significantly different from one another:

Let’s say the researcher predicts that the true population correlation between relationship satisfaction and honesty is greater than the true population correlation between relationship satisfaction and kindness.

H_0: ρ_rh = ρ_rk
H_1: ρ_rh > ρ_rk  *predicted

df = n - 3 = 58 - 3 = 55

Using α = .05, t_crit = +2.00.

Find t_obs:

t_obs = (.78 - .56) sqrt[ (57)(1 + .49) / ( 2(.2660)(57/55) + ((.78 + .56)^2 / 4)(1 - .49)^3 ) ] = (.22)(11.79) = 2.59

Since 2.59 is greater than +2.00, it falls in the rejection region. Therefore, the researcher rejects the null and concludes that ρ_rh > ρ_rk. It appears that, on average, the correlation between relationship satisfaction and honesty is significantly greater than the correlation between relationship satisfaction and kindness. From here, the researcher can go on to calculate the effect size and 95% confidence interval.
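
The example's arithmetic can be verified with a standalone script (variable names are ours):

import math

r_rh, r_rk, r_hk, n = .78, .56, .49, 58
det_R = 1 - r_rh**2 - r_rk**2 - r_hk**2 + 2 * r_rh * r_rk * r_hk   # .266
num = (n - 1) * (1 + r_hk)                                         # 84.93
den = 2 * det_R * (n - 1) / (n - 3) + ((r_rh + r_rk)**2 / 4) * (1 - r_hk)**3
t_obs = (r_rh - r_rk) * math.sqrt(num / den)
print(round(t_obs, 2))  # 2.59 on 55 df -> greater than +2.00, reject H_0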

Testing the Difference Between Two Independent b's

The t test for the difference between two independent regression coefficients is equivalent to the t test for the difference between two independent means. Use these equations when you have two separate sets of data with the same variables rather than one large set.

Standard error:

If H_0 is true, the sampling distribution of b_1 - b_2 is normal and has a mean of 0, so we need an estimate of its standard error, s_(b1 - b2).

The ratio is then calculated using:

t = (b_1 - b_2) / s_(b1 - b2)

t is distributed on n_1 + n_2 - 4 df.

Standard error of b:

Compute the standard error for each of the regression coefficients using:

s_b = s_(y.x) / (s_x sqrt(n - 1))

Once both standard errors have been found (for b_1 and b_2), the equation can be written as follows, where s^2_(y.x1) and s^2_(y.x2) are the error variances for the two samples:

s_(b1 - b2) = sqrt( s^2_(y.x1) / ((n_1 - 1) s^2_x1) + s^2_(y.x2) / ((n_2 - 1) s^2_x2) )

If we assume homogeneity of error variances, the two estimates can then be pooled, weighting each by its degrees of freedom:

s^2_(y.x) = [ (n_1 - 2) s^2_(y.x1) + (n_2 - 2) s^2_(y.x2) ] / (n_1 + n_2 - 4)
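
Putting the pieces together, a minimal sketch in Python (the function name independent_b_t and its argument order are ours; arguments are each sample's slope, standard error of estimate, predictor variance, and n):

import math

def independent_b_t(b1, s_yx1, s2_x1, n1, b2, s_yx2, s2_x2, n2):
    # pooled error variance, weighting each sample by its degrees of freedom
    s2_pooled = ((n1 - 2) * s_yx1**2 + (n2 - 2) * s_yx2**2) / (n1 + n2 - 4)
    # standard error of the difference between the two slopes
    se_diff = math.sqrt(s2_pooled * (1 / ((n1 - 1) * s2_x1) +
                                     1 / ((n2 - 1) * s2_x2)))
    return (b1 - b2) / se_diff  # evaluate on n1 + n2 - 4 df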

Example:

Assume we have two sets of data examining the association between the amount someone smokes and their life expectancy. One data set is all females, while the other data set is all males. We are using two different data sets rather than one large data set because we do not want the results to be affected by the normal differences in life expectancy between males and females.

            Males   Females
b           -0.40   -0.20
s_(y.x)     2.10    2.30
s^2_x       2.50    2.80
n           101     101

It can be seen from the data that the regression line for males is steeper than the regression line for females. If this difference were significant, it would mean that males decrease their life expectancy more than females do for any given amount of smoking. To find out, we test the difference between b_1 and b_2.

First, pool the two error variances:

s^2_(y.x) = [ (101 - 2)(2.10)^2 + (101 - 2)(2.30)^2 ] / (101 + 101 - 4) = (436.59 + 523.71) / 198 = 4.85

Substituting this pooled estimate into the equation, we get:

s_(b1 - b2) = sqrt( 4.85 ( 1/((101 - 1)(2.50)) + 1/((101 - 1)(2.80)) ) ) = 0.192

Now that we have s_(b1 - b2), we can solve for t:

t = (-0.40 - (-0.20)) / 0.192 = -1.04

with 198 df. Because t_.025(198) = 1.97 and |-1.04| < 1.97, we fail to reject H_0 and conclude that we have no reason to doubt that life expectancy decreases as a function of smoking at the same rate for both males and females.
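
The example's numbers can be verified with a standalone script (variable names are ours):

import math

b_m, s_yx_m, s2x_m, n_m = -0.40, 2.10, 2.50, 101
b_f, s_yx_f, s2x_f, n_f = -0.20, 2.30, 2.80, 101
s2_pooled = ((n_m - 2) * s_yx_m**2 + (n_f - 2) * s_yx_f**2) / (n_m + n_f - 4)
se_diff = math.sqrt(s2_pooled * (1 / ((n_m - 1) * s2x_m) +
                                 1 / ((n_f - 1) * s2x_f)))
t_obs = (b_m - b_f) / se_diff
print(round(se_diff, 3), round(t_obs, 2))  # 0.192 -1.04 on 198 df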

Testing the Difference Between Two Dependent βs

For causal analysis with standardized variables (path analysis), β_i is X_i's direct effect on the dependent variable. The comparison of the causal effects in the same model is accomplished via their βs. The standard error of the difference between two dependent βs is:

s_(β1 - β2) = sqrt[ (1 - R^2_Y)(r^11 + r^22 - 2r^12) / (n - k - 1) ]

where r^11, r^22, and r^12 are elements of the inverse of the matrix of correlations among the predictors.

Dividing the observed difference by the previous equation then gives:

t = (β_1 - β_2) / s_(β1 - β2)

with df = n - k - 1.
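
A minimal sketch of this ratio in Python, under the formula above (the name dependent_beta_t and its argument names are ours; r11_inv, r22_inv, and r12_inv are the relevant elements of the inverse predictor correlation matrix):

import math

def dependent_beta_t(beta1, beta2, r11_inv, r22_inv, r12_inv, r2_y, n, k):
    # standard error of beta1 - beta2 from the inverse correlation matrix
    se = math.sqrt((1 - r2_y) * (r11_inv + r22_inv - 2 * r12_inv) / (n - k - 1))
    return (beta1 - beta2) / se  # evaluate on n - k - 1 df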

Example:

Assume the following matrix, the inverse (R^-1) of the correlations among four predictors: time since Ph.D. (X_1), number of publications (X_2), sex (X_3), and citations (X_4). (The diagonal entries exceed 1.00 because these are inverse-matrix elements, not correlations.)

      X_1: Time since Ph.D.   X_2: Publications   X_3: Sex   X_4: Citations
X_1   1.8481                  -1.0731             -0.1714    -0.3058
X_2                           1.7638              -0.0277    -0.1837
X_3                                               1.0528     -0.0878
X_4                                                          1.1878
Each β_i is obtained by multiplying row i of R^-1 by the vector of predictor-criterion correlations:

β_1 = 1.12344 - .54315 - .03444 - .16813 = .37772
β_2 = -.65234 + .89273 - .00556 - .10101 = .13382
β_3 = -.10418 - .01401 + .21158 - .04612 = .04727
β_4 = -.18591 - .09300 - .01764 + .65302 = .35726

Do the data suggest a population difference in the sizes of the direct effects for time since Ph.D. (β_1) and the number of publications (β_2)? The standard error of their difference is found by substituting n = 62, k = 4, and R^2_Y = .5032 into the equation above:

s_(β1 - β2) = .113029

Then find t_obs:

t_obs = (.37772 - .13382) / .113029 = 2.158

With df = n - k - 1 = 62 - 4 - 1 = 57, the observed t = 2.158 exceeds t_.025(57) ≈ 2.00, so p < .05.
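
As a standalone arithmetic check of the final ratio (variable names are ours):

# verify the final ratio from the values above
beta1, beta2, se_diff = .37772, .13382, .113029
print(round((beta1 - beta2) / se_diff, 3))  # 2.158, evaluated on 57 df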