GenStat - Descriptive Statistics: Unclean data

Basic data manipulation - GenStat

Summary statistics

⇒ Restart the session and reopen the file sheepdata1.gsh. The data in the spreadsheet are passed to the GenStat server when you click anywhere outside the spreadsheet or the Spread menu.

⇒ Summary information about the columns will appear in the output window, showing the minimum, mean and maximum values, the number of values and the number of those that are missing.

⇒ For further statistical summaries use the Stats menu, as shown below. Choose Stats ⇒ Summary Statistics ⇒ Summarize Contents of Variates. Select the variates required in the resulting dialogue, shown in Fig. 1b, then click [OK].

Fig. 1a Choosing the dialogue

Fig. 1b Selecting the column

Fig. 1c Results

Alternatively, the same summaries can be requested by submitting the following commands to GenStat:

INPUT FILE

TALLY [PRINT=All] BREED; FREPRESENTATION=Labels
TALLY [PRINT=All] SEX; FREPRESENTATION=Labels
TALLY [PRINT=All] PARITY; FREPRESENTATION=Labels
TALLY [PRINT=All] YEAR; FREPRESENTATION=Labels
TALLY [PRINT=All] BIRTH_TY; FREPRESENTATION=Labels
DESCRIBE [SELECTION=nobs,mean,median,min,max,range,var,sd,sem,%cv,skew,kurtosis] BIRTHWT,\
  WEANWT,YRLNGWT

OUTPUT

Tally of BREED

======

Value  Frequency  Percentage  Cumulative  Cumulative %
    H       1990        45.3        1990          45.3
    M       2402        54.7        4392         100.0

Tally of SEX

======

Value  Frequency  Percentage  Cumulative  Cumulative %
    F       2145        48.8        2145          48.8
    M       2247        51.2        4392         100.0

Tally of PARITY

======

Value  Frequency  Percentage  Cumulative  Cumulative %
   -9         16         0.4          16           0.4
    0          1         0.0          17           0.4
    1       1456        33.2        1473          33.5
    2       1222        27.8        2695          61.4
    3        844        19.2        3539          80.6
    4        535        12.2        4074          92.8
    5        239         5.4        4313          98.2
    6         71         1.6        4384          99.8
    7          6         0.1        4390         100.0
    8          2         0.0        4392         100.0

Tally of YEAR

======

Value  Frequency  Percentage  Cumulative  Cumulative %
 1992        288         6.6         288           6.6
 1993        692        15.8         980          22.3
 1994        843        19.2        1823          41.5
 1995        957        21.8        2780          63.3
 1996       1123        25.6        3903          88.9
 1997        489        11.1        4392         100.0

Tally of BIRTH_TY

======

Value  Frequency  Percentage  Cumulative  Cumulative %
   -9         11         0.3          11           0.3
    1       3059        69.6        3070          69.9
    2       1267        28.8        4337          98.7
    3         55         1.3        4392         100.0
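
To see how the tally columns relate to one another, the BIRTH_TY table can be rebuilt from its frequencies alone. The following is a minimal Python sketch written outside GenStat as a cross-check; it is not how TALLY is implemented. The function name tally_rows is hypothetical, and rounding percentages to one decimal place is an assumption based on the printed output.

```python
from itertools import accumulate

# Value -> frequency, copied from the "Tally of BIRTH_TY" output.
freq = {-9: 11, 1: 3059, 2: 1267, 3: 55}

def tally_rows(freq):
    """Return (value, frequency, percentage, cumulative, cumulative %) rows."""
    total = sum(freq.values())
    values = sorted(freq)
    # Running total of the frequencies, in order of the sorted values.
    cumulative = list(accumulate(freq[v] for v in values))
    return [
        (v, freq[v], round(100 * freq[v] / total, 1), c, round(100 * c / total, 1))
        for v, c in zip(values, cumulative)
    ]

rows = tally_rows(freq)
for row in rows:
    print(row)
```

The last cumulative count equals the number of records (4392), so a cumulative percentage of 100.0 on the final row is a quick check that no record was dropped from the tally.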

Summary statistics for BIRTHWT

======

Number of observations = 4392

Mean = 2.448

Median = 2.500

Minimum = 0.300

Maximum = 4.800

Range = 4.500

Standard deviation = 0.567

Standard error of mean = 0.009

Variance = 0.321

Coefficient of variation = 23.150

Skewness = -0.055

Kurtosis = 0.037
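
Several of the BIRTHWT statistics follow from the others, which gives a simple consistency check on the output. The Python sketch below is a cross-check made outside GenStat. It assumes the usual textbook definitions (variance = sd squared, standard error = sd / sqrt(n), coefficient of variation = 100 * sd / mean); the handout does not state which formulas GenStat uses. The inputs are the rounded values printed above, so small differences in the last digit are expected.

```python
import math

# Values copied from the "Summary statistics for BIRTHWT" output (already rounded).
n, mean, sd = 4392, 2.448, 0.567
minimum, maximum = 0.300, 4.800

value_range = maximum - minimum   # Range
variance = sd ** 2                # Variance (assumed: sd squared)
sem = sd / math.sqrt(n)           # Standard error of mean (assumed: sd / sqrt(n))
cv = 100 * sd / mean              # Coefficient of variation in % (assumed definition)

print(f"range={value_range:.3f} variance={variance:.3f} sem={sem:.3f} cv={cv:.2f}")
```

The recomputed values agree with the printed range, variance and standard error, and with the coefficient of variation to within the error introduced by using rounded inputs.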

Summary statistics for WEANWT

======

Number of observations = 4392

Mean = 7.52

Median = 8.90

Minimum = 0.00

Maximum = 14.90

Range = 14.90

Standard deviation = 5.23

Standard error of mean = 0.08

Variance = 27.39

Coefficient of variation = 69.60

Skewness = -0.43

Kurtosis = -1.29

Summary statistics for YRLNGWT

======

Number of observations = 4392

Mean = 7.19

Median = 0.00

Minimum = 0.00

Maximum = 35.00

Range = 35.00

Standard deviation = 9.16

Standard error of mean = 0.14

Variance = 83.86

Coefficient of variation = 127.32

Skewness = 0.66

Kurtosis = -1.20
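
A median of 0.00 for YRLNGWT, together with a minimum of 0.00, means that at least half of the 4392 records hold a yearling weight of exactly zero. PARITY and BIRTH_TY likewise contain the value -9. If these values are placeholders for unrecorded data (an assumption; the output alone does not confirm it), they pull the summaries away from the recorded animals. The Python sketch below illustrates the effect on a small hypothetical list, not on the sheep data; the name drop_placeholders and the set of codes are illustrative only.

```python
from statistics import mean, median

# Hypothetical weights, NOT taken from sheepdata1.gsh.
weights = [0.0, 0.0, 0.0, 0.0, 18.5, 21.0, 24.5]
MISSING_CODES = {0.0, -9.0}   # assumed placeholder codes for missing records

def drop_placeholders(values, codes=MISSING_CODES):
    """Return only the values that are not placeholder codes."""
    return [v for v in values if v not in codes]

recorded = drop_placeholders(weights)
print("all values:    mean", round(mean(weights), 2), "median", median(weights))
print("recorded only: mean", round(mean(recorded), 2), "median", median(recorded))
```

With the placeholders included, the median of the toy list is zero, as in the YRLNGWT output; once they are excluded, the mean and median describe only the animals that were actually weighed.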