Water Chemistry of Lake Carcoar

Introduction

Data on chemical composition of the water of Lake Carcoar, near Cowra, were entered as part of a project on water quality management conducted at the University of Canberra.

Lake Carcoar is a relatively small storage in an agricultural district, so its water quality is of particular concern to the New South Wales Sydney Catchment Authority.

The data are held in disk file CARCOAR.DAT and the measurements have been selected because of their known relationship to algal production, particularly production by diatoms. These algae are single-celled and secrete elaborate silica skeletons. Blooms of these microscopic organisms can cause severe deterioration of water quality.

The Data
Variable / Columns / Units
STATION NUMBER / 1-7
DATE / 8-13 / ddmmyy
NITRATE / 16-22 / mg/l
SILICA / 25-30 / mg/l
SOLUBLE PHOS / 33-37 / mg/l
TOTAL PHOSPHORUS / 40-44 / mg/l
AMMONIA / 47-51 / mg/l
CHLOROPHYLL-A / 54-56 / UNESCO units
CONDUCTIVITY / 59-61 / microsiemens/cm
TURBIDITY / 64-69 / NTU
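
The column positions above describe a fixed-width record. As an illustration only (the analysis itself is done in SAS), the following minimal Python sketch shows how one such record could be split by column position. The short variable names follow the SAS listing at the end of this document; the station identifier and the values in the sample record are hypothetical, and treating a blank or "." field as missing is an assumption about the file.

```python
# Column layout from the table above: (name, first column, last column), 1-based inclusive.
LAYOUT = [
    ("STATION", 1, 7),
    ("DATE", 8, 13),
    ("NITRATE", 16, 22),
    ("SILICA", 25, 30),
    ("SOLPO4", 33, 37),
    ("PO4", 40, 44),
    ("NH3", 47, 51),
    ("CHLOROA", 54, 56),
    ("CONDUCT", 59, 61),
    ("TURBID", 64, 69),
]
TEXT_FIELDS = {"STATION", "DATE"}  # kept as text; all other fields are numeric


def parse_record(line):
    """Split one fixed-width record into a dict; blank or '.' numeric fields become None."""
    line = line.ljust(69)
    record = {}
    for name, first, last in LAYOUT:
        raw = line[first - 1:last].strip()
        if name in TEXT_FIELDS:
            record[name] = raw
        else:
            record[name] = None if raw in ("", ".") else float(raw)
    return record


def make_record(values):
    """Build a synthetic fixed-width line (for demonstration only)."""
    chars = [" "] * 69
    for name, first, last in LAYOUT:
        width = last - first + 1
        text = str(values.get(name, "")).rjust(width)[:width]
        chars[first - 1:last] = text
    return "".join(chars)


# Hypothetical record: station id and measurements are invented for the example.
sample = make_record({"STATION": "4100001", "DATE": "140781", "SILICA": 5.0, "PO4": 0.03})
parsed = parse_record(sample)
print(parsed["DATE"], parsed["SILICA"], parsed["NITRATE"])
```

Fields left blank in the sample (NITRATE, for example) come back as None, which corresponds to a missing value in the SAS data set.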
The Problem

The Sydney Catchment Authority would like to design a monitoring programme for this lake, based on knowledge of the typical concentrations of each of these key measurements. They would also like forewarning of algal blooms, and information that can be used to define upper acceptable limits for each of these variables would be most welcome. Silica and Total Phosphorus are of greatest interest.

Perform the appropriate analyses for SILICA, and provide a brief report for the Catchment Authority, using the proforma supplied.

Analysis

Undertake the appropriate analyses to determine whether silica concentration is normally distributed. Present the outcomes of the analysis below. Be sure to include a histogram.

Normal Probability Plot

[Line-printer normal probability plot of SILICA. Vertical axis: silica concentration, 1 to 41 mg/l. Horizontal axis: normal quantiles, -2 to +2. Asterisks mark the data and plus signs the normal reference line. The plot layout could not be recovered from the source.]

Figure 1. Normal probability plot of 324 measurements of Silica concentrations (in mg/l) collected from 7 different sites in Lake Carcoar during the period between 14 July 1981 and 8 January 1985.

Table 1. Test for normality for Silica concentrations (mg/l) for Lake Carcoar.

Tests for Normality

Test                  ---Statistic----    -----p Value------
Shapiro-Wilk          W       0.666368    Pr < W     <0.0001
Kolmogorov-Smirnov    D       0.277731    Pr > D     <0.0100
Cramer-von Mises      W-Sq    6.588178    Pr > W-Sq  <0.0050
Anderson-Darling      A-Sq    36.07851    Pr > A-Sq  <0.0050

Figure 2. Histogram for 324 measurements of Silica concentrations (in mg/l) collected from 7 different sites in Lake Carcoar during the period between 14 July 1981 and 8 January 1985.

Compute a comprehensive set of summary statistics for the variable SILICA. Present the full set of statistics below in tabular form.

Table 2. Descriptive statistics for Silica concentrations (mg/l) for Lake Carcoar.

N                         324    Sum Weights                324
Mean               6.98352778    Sum Observations      2262.663
Std Deviation      7.01269648    Variance             49.177912
Skewness           2.56413518    Kurtosis            6.48595258
Uncorrected SS     31685.8355    Corrected SS        15884.4656
Coeff Variation    100.417679    Std Error Mean      0.38959425
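
Several entries in Table 2 are linked by simple identities, which gives a quick check that the table has been transcribed correctly. The following is a minimal Python sketch, for illustration only and not part of the SAS analysis, that recomputes the derived entries from N, the mean and the standard deviation as printed in the table.

```python
import math

# Values copied from Table 2
n = 324
mean = 6.98352778
sd = 7.01269648

variance = sd ** 2                 # Variance = (Std Deviation) squared
std_error = sd / math.sqrt(n)      # Std Error Mean = Std Deviation / sqrt(N)
coeff_var = 100 * sd / mean        # Coeff Variation = 100 * Std Deviation / Mean
total = n * mean                   # Sum Observations = N * Mean
corrected_ss = variance * (n - 1)  # Corrected SS = Variance * (N - 1)

print(round(variance, 4), round(std_error, 4), round(coeff_var, 3))
print(round(total, 3), round(corrected_ss, 2))
```

A coefficient of variation of about 100% means the standard deviation is as large as the mean, which is itself a hint that a symmetric distribution is unlikely for a variable that cannot be negative.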

Table 3. Basic summary statistics for Silica concentrations (mg/l) for Lake Carcoar.

      Location                     Variability
Mean      6.983528    Std Deviation            7.01270
Median    5.000000    Variance                49.17791
Mode      2.400000    Range                   40.50000
                      Interquartile Range      4.16550

Table 4. Hypothesis test for location for Silica concentrations (mg/l) for Lake Carcoar.

Tests for Location: Mu0=0

Test           -Statistic-    -----p Value------
Student's t    t  17.92513    Pr > |t|    <.0001
Sign           M     161.5    Pr >= |M|   <.0001
Signed Rank    S     26163    Pr >= |S|   <.0001
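
The statistics in Table 4 can be reproduced from the other tables, which also shows why tests against Mu0=0 say little about a concentration. The sketch below is in Python, for illustration only. It assumes the usual definitions of the sign statistic, M = (n+ - n-)/2, and of the signed rank statistic, S = (sum of the ranks of the positive values) - nt(nt+1)/4, where nt is the number of non-zero values. It also relies on Table 6 showing exactly one observation equal to zero.

```python
import math

n = 324            # non-missing observations (Table 2)
mean = 6.98352778  # Table 2
sd = 7.01269648    # Table 2
n_zero = 1         # only one observation equals 0.0 mg/l (Table 6)

# Student's t for H0: mu = 0
t_stat = (mean - 0) / (sd / math.sqrt(n))

# Concentrations cannot be negative, so every non-zero value lies above Mu0 = 0
n_pos = n - n_zero
n_neg = 0
sign_m = (n_pos - n_neg) / 2

# Signed rank: sum of the ranks of the positive values minus its expectation under H0.
# With no negative values, the positive ranks are simply 1..n_t.
n_t = n_pos + n_neg
signed_rank_s = n_t * (n_t + 1) / 2 - n_t * (n_t + 1) / 4

print(round(t_stat, 3), sign_m, signed_rank_s)
```

Because every non-zero value is positive, M and S take their largest possible values for this sample size, so the small p values in Table 4 are guaranteed rather than informative.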

Table 5. Quantiles for Silica concentrations (mg/l) for Lake Carcoar.

Quantile       Estimate
100% Max        40.5000
99%             33.6000
95%             25.2000
90%             14.3000
75% Q3           7.1655
50% Median       5.0000
25% Q1           3.0000
10%              2.0000
5%               1.6000
1%               0.4000
0% Min           0.0000

Table 6. Extreme values and missing values for Silica concentrations (mg/l) for Lake Carcoar.

----Lowest----    ---Highest----
Value      Obs    Value      Obs
  0.0      188     33.4      200
  0.3      102     33.6      199
  0.4      103     33.7      284
  0.4      101     34.3      286
  0.5      107     40.5      282

Missing Values

                       -----Percent Of-----
Missing                             Missing
Value      Count       All Obs          Obs
.             18          5.26       100.00

Results

What do you conclude regarding the normality of the variable SILICA? Be sure to include supporting statistics or cross-references to diagrams and tables produced during the analysis.

Silica concentrations (mg/l) in water samples collected from seven different sites in Lake Carcoar between 14 July 1981 and 8 January 1985 were not normally distributed (Shapiro-Wilk test, W=0.666, p<0.0001; Table 1; Figures 1 & 2). Indeed, their distribution is strongly skewed to the right (skewness = 2.56; Table 2). This is confirmed by the mean of 7.0 mg/l being larger than the median of 5.0 mg/l, which in turn is larger than the mode of 2.4 mg/l (Table 3). Together these indicate a right-skewed distribution and the likely presence of some extreme high values.

The key issue here is that you recognise that there are multiple indications of non-normality when it exists: from a test statistic (Shapiro-Wilk W), from the graphical representation of the data as a probability plot and histogram, and from the non-coincidence of the mean, median and mode.
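
The non-coincidence of the mean, median and mode can be put into numbers. The following Python sketch, for illustration only, checks the ordering expected for a right-skewed unimodal distribution and computes Pearson's second skewness coefficient, 3(mean - median)/SD. That coefficient is not part of the SAS output; it is introduced here as one simple way of summarising the gap between mean and median.

```python
# Values copied from Table 3
mean, median, mode, sd = 6.983528, 5.0, 2.4, 7.01270

# For a right-skewed unimodal distribution the usual ordering is mode < median < mean
right_skew_ordering = mode < median < mean

# Pearson's second skewness coefficient: zero for a symmetric distribution,
# positive when the mean is pulled above the median by a long right tail
pearson_skew = 3 * (mean - median) / sd

print(right_skew_ordering, round(pearson_skew, 2))
```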

Provide a concise summary of the results, such as might appear in the results section of a manuscript or report. Include in your summary a description of the distribution of SILICA values, only those descriptive statistics appropriate to the data, and a working definition of an extreme SILICA value.

Silica concentrations for Lake Carcoar ranged from 0.0 to 40.5 mg/l, with a mean of 6.98 ± 0.39 mg/l (mean ± standard error, n = 324), during the period of the study. This variable was not normally distributed, but rather had a unimodal distribution with a pronounced skew to the right (W = 0.67, p < 0.0001; Figures 1 & 2). The median and mode were 5.0 and 2.4 mg/l respectively, and the interquartile range was 4.17 mg/l. An extreme event was defined by the 99th percentile as any value greater than 33.6 mg/l.

The key issue here is that you recognised the need to adjust your description because the data are not normal, and to include more detail than would be necessary if the data had been normal. You need, for example, to include the mean, median and mode, as the three will not coincide; to describe the distribution in more detail (unimodal or bimodal? skewed to the right or left? strongly leptokurtic or platykurtic? etc.); and to define extreme events in terms of percentiles, not the mean ± 3 standard deviations.
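
To see why limits based on the mean and standard deviation are a poor definition of an extreme value for these data, the limits can be computed and compared with the quantiles in Table 5. This is a minimal Python sketch for illustration only, using the statistics printed in Tables 3 and 5.

```python
mean, sd = 6.983528, 7.01270  # Table 3
p95, p99 = 25.2, 33.6         # 95% and 99% quantiles, Table 5
minimum = 0.0                 # Table 5

lower_3sd = mean - 3 * sd
upper_3sd = mean + 3 * sd

# The lower limit falls below zero, which no concentration can reach
lower_limit_impossible = lower_3sd < minimum

# The upper limit falls between the 95% and 99% quantiles, so between 1% and 5%
# of the observations lie above it
upper_limit_between_p95_and_p99 = p95 < upper_3sd < p99

print(round(lower_3sd, 2), round(upper_3sd, 2))
print(lower_limit_impossible, upper_limit_between_p95_and_p99)
```

For a normal distribution only about 0.13% of values would exceed the mean plus 3 standard deviations, so the limit flags far more of these observations than the nominal rate suggests. A percentile has no such difficulty because it is read directly from the observed distribution.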

Discussion

With regard to normality, are your results consistent with expectation for a variable such as SILICA? Why?

One might usually expect concentrations of chemicals, like many other variables, to be normally distributed. The strongly non-normal behaviour exhibited by the concentrations of silica in the lake probably results from periodic algal blooms or from episodic influx of silica leached from the catchment during storm events. Diatoms secrete silica skeletons. Therefore high concentrations of silica in the water indicate high levels of algal production and hence possible deterioration of water quality. Since these events of high algal production are episodic, and since they result in high concentrations of silica in the water body, we should not be surprised to find that the distribution of silica in the lake is highly right skewed.

Any plausible explanation of your expectation, whether you expected normality or not, will do.

What advice would you give to anyone planning further statistical analyses on SILICA?

As the data are clearly not normal, the usual parametric analyses, such as construction of a 95% confidence interval, cannot be directly applied. One approach is to try a normalising transformation such as log(x) or log(x+1) (the latter because there are concentrations of zero in the data).
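
The effect of a log(x+1) transformation on a right-skewed variable can be illustrated with three of the quantiles from Table 5. The following Python sketch, for illustration only, compares the length of the upper tail (maximum minus median) with the length of the lower tail (median minus minimum) before and after the transformation. The tail ratio is an informal measure introduced for this example, not a SAS statistic.

```python
import math

# Selected quantiles of SILICA from Table 5 (mg/l)
minimum, median, maximum = 0.0, 5.0, 40.5


def log_x_plus_1(x):
    """log10(x + 1): defined at x = 0, unlike log10(x)."""
    return math.log10(x + 1)


def tail_ratio(lo, mid, hi):
    """Length of the upper tail relative to the lower tail; about 1 if symmetric."""
    return (hi - mid) / (mid - lo)


raw_ratio = tail_ratio(minimum, median, maximum)
log_ratio = tail_ratio(*(log_x_plus_1(v) for v in (minimum, median, maximum)))

print(round(raw_ratio, 2), round(log_ratio, 2))
```

On the raw scale the upper tail is several times longer than the lower tail; on the transformed scale the two are of similar length, which is what a normalising transformation is meant to achieve.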

Below are a histogram and normal probability plot of the log-transformed silica concentrations. Clearly, in this case, the distribution of log silica is nearly normal.


(c) Arthur Georges, 2002

Normal Probability Plot

[Line-printer normal probability plot of the log-transformed silica concentrations. Vertical axis: -0.55 to 1.65. Horizontal axis: normal quantiles, -2 to +2. Asterisks mark the data and plus signs the normal reference line. The plot layout could not be recovered from the source.]

What recommendations would you like to make to the Sydney Catchment Authority?

Since high concentrations of silica in the water body may be an indicator of high algal production, continued measurement of this parameter may provide an index for monitoring the progress of algal blooms. As only 1% of the observed concentrations of silica in the water body exceeded 33.6 mg/l (the 99th percentile, Table 5) during the study period, one might use this value as a trigger for management intervention.

Any reasonable recommendation will suffice for the purposes of this exercise, not necessarily the one above.
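
A percentile-based trigger is simple to apply in a monitoring programme. The following Python sketch, for illustration only, flags measurements above a trigger level set at the 99th percentile from Table 5, and uses the five highest values in Table 6 to confirm how often that level was exceeded during the study. The function name and the choice of a strict "greater than" rule are assumptions made for the example.

```python
# Five highest silica values from Table 6 (mg/l) and the sample size from Table 2
highest = [33.4, 33.6, 33.7, 34.3, 40.5]
n = 324
trigger = 33.6  # 99th percentile from Table 5, used here as a hypothetical trigger level


def exceeds_trigger(value, trigger=trigger):
    """Flag a measurement that is strictly above the trigger level."""
    return value > trigger


# Every other observation is at or below the smallest value listed, so it cannot be flagged
flagged = [v for v in highest if exceeds_trigger(v)]
percent_flagged = 100 * len(flagged) / n

print(flagged, round(percent_flagged, 2))
```

Three of the 324 measurements exceed the trigger, a little under 1%, which is consistent with its definition as the 99th percentile.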

Program Listing

Append a full SAS program listing, cleaned up and free from error or redundant code.

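/* Read the water chemistry data for Lake Carcoar */
DATA CHEM;
  INFILE "K:\1\CARCOAR.DAT";
  INPUT STATID DATE $ NITRATE SILICA SOLPO4 PO4 NH3
        CHLOROA CONDUCT TURBID;
RUN;

/* Summary statistics, plots and tests for normality for raw silica */
PROC UNIVARIATE DATA=CHEM PLOT NORMAL;
  VAR SILICA;
RUN;

/* Histogram of raw silica */
GOPTIONS RESET=ALL;
PROC GCHART DATA=CHEM;
  VBAR SILICA / SPACE=0 MIDPOINTS=0.0 TO 40 BY 5;
RUN;

/* Log-transform silica; adding 1 allows for concentrations of zero */
DATA CHEM;
  SET CHEM;
  SILICA=LOG10(SILICA+1);
RUN;

/* Repeat the analyses on the transformed variable */
PROC UNIVARIATE DATA=CHEM PLOT NORMAL;
  VAR SILICA;
RUN;

GOPTIONS RESET=ALL;
PROC GCHART DATA=CHEM;
  VBAR SILICA / SPACE=0;
RUN;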
