The data set “pingo.txt” is used to explain species richness. The data are from pingos, ice-cored mounds found in certain regions of continuous permafrost. Pingos range in height from about 3 or 4 m to as much as 50 m, and their diameters may exceed half a kilometer.

1.  Type: One of two morphological types of pingo: a steep-sided classic form or a giant, broad-based form (an ordinal variable)

2.  Height: Height in m

3.  Diameter: Diameter in m

4.  Mean pH: Mean pH of a series of soil samples

5.  SD of pH: Standard deviation of pH of a series of soil samples

6.  Distance to coast: Distance to the coast, in km, measured along the primary wind vector; this is very highly correlated with thawing-degree days and is therefore an indirect estimate of summer temperature

7.  Distance to nearest neighbor: Distance to the nearest pingo in km

8.  Distance to nearest river: Distance to the nearest river, in km

Three variables for species diversity:

1.  Number of vascular species: Total number of vascular plant species (= a measure of diversity)

2.  Number of lichen species: Total number of lichen species

3.  Number of bryophyte species: Total number of bryophyte species

The analysis uses the following variables in log-transformed form: number of vascular species, number of lichen species, height, distance to nearest river, distance to nearest neighbor, and diameter. Mean pH is square-root transformed. Please apply these transformations before running any statistical analysis.
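The transformations described above can be sketched as follows. This is a minimal illustration, not a prescribed workflow: the column names are hypothetical (the header of pingo.txt is not shown here), the natural logarithm is assumed because the base is not stated, and the example record contains made-up values.

```python
import math

# Hypothetical column names; rename them to match the header of pingo.txt.
LOG_COLUMNS = ("vascular", "lichen", "height", "river", "neighbor", "diameter")
SQRT_COLUMNS = ("mean_ph",)


def transform_row(row):
    """Return a copy of one record with the requested transformations applied."""
    out = dict(row)
    for name in LOG_COLUMNS:
        # Natural log is an assumption; the base is not stated in the text.
        out[name] = math.log(float(row[name]))
    for name in SQRT_COLUMNS:
        out[name] = math.sqrt(float(row[name]))
    return out


# Made-up values for illustration only, not taken from pingo.txt.
example = {"vascular": 20, "lichen": 8, "bryophyte": 12, "height": 10.0,
           "diameter": 150.0, "river": 2.5, "neighbor": 0.4,
           "mean_ph": 6.25, "coast": 30.0}
transformed = transform_row(example)
print(transformed["mean_ph"])  # 2.5, the square root of 6.25
```

Columns not named in either tuple (here the bryophyte count and the distance to the coast) pass through unchanged. Note that `math.log` raises `ValueError` for a value of zero, so any zero counts or distances would need a decision that the text above does not specify.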

Question 1: Is there a significant relationship between the two sets of variables (the eight pingo descriptors and the three species-diversity measures)? Would you say that the answer is ambiguous or unambiguous?
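One possible reading of "ambiguous" is that the usual multivariate test statistics disagree with one another. As a hedged sketch, the four standard statistics can be written in terms of the canonical correlations; the function name and the input values below are illustrative assumptions, and the p-values (which rely on F approximations supplied by statistical software) are not computed here.

```python
import math


def multivariate_statistics(canonical_corrs):
    """Four standard test statistics built from canonical correlations.

    Assumes every correlation is strictly below 1; a value of exactly 1
    would make the Hotelling-Lawley term divide by zero.
    """
    squared = [r * r for r in canonical_corrs]
    return {
        "wilks_lambda": math.prod(1.0 - s for s in squared),
        "pillai_trace": sum(squared),
        "hotelling_lawley_trace": sum(s / (1.0 - s) for s in squared),
        # Some packages report Roy's root as r^2 / (1 - r^2) instead.
        "roy_largest_root": max(squared),
    }


# Made-up canonical correlations, for illustration only.
stats = multivariate_statistics([0.6, 0.3])
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

A small Wilks' lambda or a large value of the other three statistics points toward a relationship between the sets; whether the tests agree is what the software output would show.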

Question 2: What is the shared variance between the two variates, as expressed by the first canonical function?
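The shared variance for a canonical function is the square of its canonical correlation. The sketch below shows one standard way to obtain the canonical correlations with NumPy (a QR decomposition of each centered set followed by a singular value decomposition). It assumes both sets have full column rank, it runs on randomly generated data rather than on pingo.txt, and it is not necessarily the procedure your statistical package uses.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations between the columns of X and the columns of Y."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Center every variable so correlations, not raw cross-products, result.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the two centered sets.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # The singular values of Qx^T Qy are the canonical correlations,
    # returned in decreasing order.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)


# Synthetic data for illustration only: 40 cases, 3 predictors, 2 responses.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 3))
Y_demo = np.column_stack([X_demo[:, 0] + rng.normal(size=40),
                          rng.normal(size=40)])
r = canonical_correlations(X_demo, Y_demo)
print("first canonical correlation:", r[0])
print("shared variance (squared):", r[0] ** 2)
```

With the real data, X would hold the transformed pingo descriptors and Y the species-diversity measures, one row per pingo.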

Question 3: What is the order of the independent variables in terms of their influence on the canonical variate? Please provide documentation of your answer (i.e., the numbers on which you based it and a brief explanation).

Question 4: For this particular study, do you think that the canonical correlation was more or less helpful than the regression analysis? What reasons might someone have for preferring one over the other? Explain the potential advantages and disadvantages of each technique.