GenStat - Descriptive Statistics: Unclean data
Basic data manipulation - GenStat
Summary statistics
⇒ Restart the session and reopen the file sheepdata1.gsh. The data in the spreadsheet are passed to the GenStat server when you click anywhere outside the spreadsheet or the Spread menu.
⇒ Summary information about the columns will appear in the output window, showing the minimum, mean and maximum values, the number of values and the number of those that are missing.
⇒ For further statistical summaries use the Stats menu. Choose Stats ⇒ Summary Statistics ⇒ Summarize Contents of Variates. Select the required variates in the resulting dialogue (Fig. 1b), then click [OK].
Fig. 1a Choosing the dialogue
Fig. 1b Selecting the column
Fig. 1c Results
Alternatively, the following commands can be submitted to GenStat directly:
INPUT FILE
TALLY [PRINT=All] BREED; FREPRESENTATION=Labels
TALLY [PRINT=All] SEX; FREPRESENTATION=Labels
TALLY [PRINT=All] PARITY; FREPRESENTATION=Labels
TALLY [PRINT=All] YEAR; FREPRESENTATION=Labels
TALLY [PRINT=All] BIRTH_TY; FREPRESENTATION=Labels
DESCRIBE [SELECTION=nobs,mean,median,min,max,range,var,sd,sem,%cv,skew,kurtosis] BIRTHWT,\
WEANWT,YRLNGWT
OUTPUT
Tally of BREED
======
Value  Frequency  Percentage  Cumulative  Cumulative %
H           1990        45.3        1990          45.3
M           2402        54.7        4392         100.0
Tally of SEX
======
Value  Frequency  Percentage  Cumulative  Cumulative %
F           2145        48.8        2145          48.8
M           2247        51.2        4392         100.0
Tally of PARITY
======
Value  Frequency  Percentage  Cumulative  Cumulative %
-9            16         0.4          16           0.4
0              1         0.0          17           0.4
1           1456        33.2        1473          33.5
2           1222        27.8        2695          61.4
3            844        19.2        3539          80.6
4            535        12.2        4074          92.8
5            239         5.4        4313          98.2
6             71         1.6        4384          99.8
7              6         0.1        4390         100.0
8              2         0.0        4392         100.0
Tally of YEAR
======
Value  Frequency  Percentage  Cumulative  Cumulative %
1992         288         6.6         288           6.6
1993         692        15.8         980          22.3
1994         843        19.2        1823          41.5
1995         957        21.8        2780          63.3
1996        1123        25.6        3903          88.9
1997         489        11.1        4392         100.0
Tally of BIRTH_TY
======
Value  Frequency  Percentage  Cumulative  Cumulative %
-9            11         0.3          11           0.3
1           3059        69.6        3070          69.9
2           1267        28.8        4337          98.7
3             55         1.3        4392         100.0
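The -9 rows in the PARITY (16 records) and BIRTH_TY (11 records) tallies look like placeholder codes for unrecorded values, not real categories. A minimal sketch of recoding them as missing before tallying again, assuming that -9 is indeed a missing-value code and that both columns are held as variates (neither is confirmed by the output above):

```genstat
"Sketch only. Assumptions: -9 codes a missing value; PARITY and BIRTH_TY are variates."
"MVINSERT sets a value to missing wherever the logical expression is true."
CALCULATE PARITY = MVINSERT(PARITY; PARITY .EQ. -9)
CALCULATE BIRTH_TY = MVINSERT(BIRTH_TY; BIRTH_TY .EQ. -9)
"Repeat the tallies on the recoded columns."
TALLY [PRINT=All] PARITY; FREPRESENTATION=Labels
TALLY [PRINT=All] BIRTH_TY; FREPRESENTATION=Labels
```

These commands overwrite the columns held in the GenStat server, so it may be safer to work on a copy of the spreadsheet. The single PARITY value of 0 may also be worth checking against the original records.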
Summary statistics for BIRTHWT
======
Number of observations = 4392
Mean = 2.448
Median = 2.500
Minimum = 0.300
Maximum = 4.800
Range = 4.500
Standard deviation = 0.567
Standard error of mean = 0.009
Variance = 0.321
Coefficient of variation = 23.150
Skewness = -0.055
Kurtosis = 0.037
Summary statistics for WEANWT
======
Number of observations = 4392
Mean = 7.52
Median = 8.90
Minimum = 0.00
Maximum = 14.90
Range = 14.90
Standard deviation = 5.23
Standard error of mean = 0.08
Variance = 27.39
Coefficient of variation = 69.60
Skewness = -0.43
Kurtosis = -1.29
Summary statistics for YRLNGWT
======
Number of observations = 4392
Mean = 7.19
Median = 0.00
Minimum = 0.00
Maximum = 35.00
Range = 35.00
Standard deviation = 9.16
Standard error of mean = 0.14
Variance = 83.86
Coefficient of variation = 127.32
Skewness = 0.66
Kurtosis = -1.20
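WEANWT has a minimum of 0.00 and YRLNGWT has a median of 0.00, so at least half of the yearling weights are zero. If those zeros stand for animals that were never weighed, they pull the means down and inflate the coefficients of variation (69.60 and 127.32, against 23.150 for BIRTHWT). A minimal sketch of summarising only the positive weights, assuming that a zero means "not recorded" (the output alone cannot confirm this); the names WEANWT_POS and YRLNGWT_POS are hypothetical and introduced only for illustration:

```genstat
"Sketch only. Assumption: a weight of 0 means the animal was not weighed."
"Copy each weight into a new variate, setting non-positive values to missing."
CALCULATE WEANWT_POS = MVINSERT(WEANWT; WEANWT .LE. 0)
CALCULATE YRLNGWT_POS = MVINSERT(YRLNGWT; YRLNGWT .LE. 0)
"nobs counts the non-missing values; nmv counts those now set to missing."
DESCRIBE [SELECTION=nobs,nmv,mean,median,min,max,sd,%cv] WEANWT_POS,YRLNGWT_POS
```

Working on copies leaves the original columns untouched, so the two sets of summaries can be compared side by side.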