Amanda Johnson
22S:030
May 3, 2004
Proportion Testing of USDF Award Winners
INTRODUCTION
Dressage is an equestrian sport that began some 2,000 years ago with classical Greek horsemanship and was developed further during the Renaissance. Today it is an Olympic sport in which horse and rider perform a test, a prescribed series of very difficult movements, in an arena. The test performed at the Olympic Games is the Grand Prix, the test with the highest degree of difficulty. Dressage is much like ballet on horseback: horse and rider appear to move as one and dance around the arena. The Grand Prix level is so difficult that it takes a minimum of five years to train a horse in the Grand Prix movements.
The United States Dressage Federation (USDF) was founded in 1973 to promote dressage in the United States. Today it has over 45,000 members and provides educational opportunities, competitions, and rider and year-end awards for its members. The year-end and rider awards are based on test percentages earned at USDF competitions. This project focuses on the three major rider awards that can be earned through USDF: the bronze medal, the silver medal, and the gold medal. USDF has various levels of dressage, beginning with the basic training level, then first through fourth level, then the FEI (Fédération Equestre Internationale) levels, which include Prix St. Georges, Intermediaire 1 and 2, and finally the highest level, Grand Prix. The USDF levels, training through fourth, are shown only in America, whereas the FEI levels are shown in international competitions and are the same in every country. The bronze medal is won through achievements in first level through fourth level, the silver medal is won through achievements at fourth level and Prix St. Georges, and the gold medal is won through achievements at Intermediaire and Grand Prix. A list of all award winners can be viewed at http://www.usdf.org/Programs/awards/RiderAwards.asp.
DATA COLLECTION
Lists of bronze, silver, and gold medal award winners were retrieved from www.usdf.org and compiled into a single Excel file; a sample is shown in Table 1. The first column lists the rider's name, the second column the bronze medal, the third the silver, and the fourth the gold. In each medal column a Y or N was entered depending on whether that rider won that award. Some people have won only one award and some have won all three. Normally people earn their bronze medal first and then their silver and gold medals as they progress through the levels. It would be unusual for a rider to hold only the gold medal, because it takes a lot of skill to ride at that level and the rider would have to be competent at the lower levels to be successful enough to earn the gold medal. One reason a rider may have earned only the gold medal is that they have not sent in the paperwork to claim the silver or bronze medal.
Table 1. Sample of data set
Name (membership number) / Bronze / Silver / Gold
James, Erika T (46703) / Y / Y / N
Jansen, Ann (42497) / Y / N / N
Jaraczewski, Kathleen A (50540) / Y / Y / N
Jaskiel Brown, Linda D (43) / Y / Y / Y
Jeffers, Deri (640) / N / Y / N
Jendrowski, Lynn M (14776) / Y / Y / N
Jenkins, Breen (50818) / Y / N / N
Jenkins, Valerie L (18655) / Y / Y / N
Jensen, Bent (8796) / N / N / Y
Jensen-Bennett, Cindy (52562) / Y / Y / N
Jerdeman, Sharon (22179) / Y / Y / N
Jerman, Carol L (26425) / Y / N / N
Jewell, Roxanne E (17998) / Y / Y / N
Jo, Cassandra (40640) / Y / N / N
Joaquin, Tricia Ziebell (30026) / Y / Y / N
Johansen, Kay (22004) / Y / N / N
Johns, Kathleen (30104) / Y / Y / N
Johnson, Alice A (7287) / N / Y / N
Johnson, Amanda L (42779) / Y / Y / Y
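The Y/N coding lends itself to a simple tally of medal combinations, which is the same kind of count a cross-tabulation produces. As a minimal sketch in Python (not part of the original SAS analysis), the 19 sample rows of Table 1 are written as three-letter strings in bronze, silver, gold order. The variable names are illustrative assumptions, and the counts describe only this sample, not the full data set.

```python
from collections import Counter

# Bronze/Silver/Gold flags for the 19 sample rows of Table 1, in table order.
sample_flags = [
    "YYN", "YNN", "YYN", "YYY", "NYN", "YYN", "YNN", "YYN", "NNY", "YYN",
    "YYN", "YNN", "YYN", "YNN", "YYN", "YNN", "YYN", "NYN", "YYY",
]

# Count how many riders fall into each exact medal combination.
combo_counts = Counter(sample_flags)

# Marginal counts: riders holding each medal regardless of the other two.
medal_counts = {
    medal: sum(1 for flags in sample_flags if flags[i] == "Y")
    for i, medal in enumerate(("bronze", "silver", "gold"))
}

for combo, count in sorted(combo_counts.items()):
    print(combo, count)
print(medal_counts)
```

Keeping the three flags together per rider, rather than counting each column separately, is what allows the "only bronze" and "all three" groups to be separated later.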
METHODOLOGY
1. Find the proportion of riders who won the bronze, silver, and gold medals, and each combination of them (bronze and silver, bronze and gold, silver and gold), from the data set using the proc freq procedure in SAS. The SAS input is:
proc freq data = usdfYN;
tables bronze * silver * gold / chisq;
run;
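2. Create a pie chart of bronze, silver, and gold medal winners.
The SAS input will be:
data usdf;
input type $ count;
datalines;
G 416
S 1667
B 2858
;
run;
/* Assumed charting step: the original listing shows only the data step. Requires SAS/GRAPH. */
proc gchart data = usdf;
pie type / sumvar = count;
run;
quit;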
3. Carry out a significance test for a proportion to determine whether more than 80% of the population of award winners are not gold medal winners. Ho: 80% of the population are not gold medal winners. Ha: More than 80% of the population are not gold medal winners.
Ho: p = 0.8
Ha: p > 0.8
This is tested with the z statistic, where p̂ is the sample proportion, po is the hypothesized proportion, and n is the sample size:
z = (p̂ - po)/√(po(1 - po)/n)
The SAS input will be:
proc freq data = usdfYN;
tables gold / binomial (p = 0.8);
run;
4. Find a 99% confidence interval for the proportion of gold medal winners. Find the 99% confidence intervals for silver and bronze medal winners for comparison.
The approximate level C confidence interval for p, where p̂ is the sample proportion and z* is the standard normal critical value for confidence level C, is
p̂ ± z*√(p̂(1 - p̂)/n)
The SAS input (for the variable gold) is:
proc freq data = usdfYN;
tables gold / binomial alpha = .01;
run;
RESULTS
The SAS output below shows that 2858 people won the bronze medal, 1667 won the silver medal, and 416 won the gold medal, out of a total of 3364 people receiving awards. There are 51 people who received only their gold medal, 393 who received only their silver, and 1635 who received only their bronze. These numbers are consistent with the gold medal being more difficult to earn than the bronze. This is very plausible considering the amount of time it takes to train a horse to the Grand Prix level. I have earned my bronze, silver, and gold medals, as shown in the last row of Table 1. The proportion of people who have won all three medals is 8.68%. Table 2 shows the numbers in a more readable format.
Table 2. Awards
Awards Won / Number of People Who Won Award / Proportion
Bronze, Silver & Gold / 292 / 8.68%
Bronze & Silver only / 920 / 27.35%
Bronze & Gold only / 11 / 0.33%
Silver & Gold only / 62 / 1.84%
Bronze only / 1635 / 48.60%
Silver only / 393 / 11.68%
Gold only / 51 / 1.52%
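The per-medal totals and the proportions in Table 2 can be recovered from the seven combination counts alone. The following Python sketch is offered as a cross-check rather than as part of the original SAS workflow; the dictionary layout and names are assumptions made for illustration, with B, S, and G standing for bronze, silver, and gold.

```python
# Counts for each exact medal combination, copied from Table 2.
combo_counts = {
    ("B", "S", "G"): 292,
    ("B", "S"): 920,
    ("B", "G"): 11,
    ("S", "G"): 62,
    ("B",): 1635,
    ("S",): 393,
    ("G",): 51,
}

# Every award winner falls into exactly one combination.
total = sum(combo_counts.values())

# A rider counts toward a medal's total if that medal is in their combination.
medal_totals = {
    medal: sum(n for combo, n in combo_counts.items() if medal in combo)
    for medal in ("B", "S", "G")
}

# Share of all award winners in each combination, as a percentage.
proportions = {combo: 100 * n / total for combo, n in combo_counts.items()}

print(total, medal_totals)
for combo, pct in proportions.items():
    print(" & ".join(combo), f"{pct:.2f}%")
```

Because the seven rows are mutually exclusive, the proportions sum to 100%, whereas the per-medal totals overlap and sum to more than 3364.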
The FREQ Procedure
Table 1 of Silver by Gold
Controlling for Bronze=N
Silver(Silver)     Gold(Gold)
Frequency|
Percent  |
Row Pct  |
Col Pct  |N       |Y       |  Total
---------+--------+--------+
N        |      0 |     51 |     51
         |   0.00 |  10.08 |  10.08
         |   0.00 | 100.00 |
         |   0.00 |  45.13 |
---------+--------+--------+
Y        |    393 |     62 |    455
         |  77.67 |  12.25 |  89.92
         |  86.37 |  13.63 |
         | 100.00 |  54.87 |
---------+--------+--------+
Total         393      113      506
            77.67    22.33   100.00
Statistics for Table 1 of Silver by Gold
Controlling for Bronze=N
Statistic DF Value Prob
------------------------------------------------------
Chi-Square 1 197.2529 <.0001
Likelihood Ratio Chi-Square 1 175.1595 <.0001
Continuity Adj. Chi-Square 1 192.3045 <.0001
Mantel-Haenszel Chi-Square 1 196.8631 <.0001
Phi Coefficient -0.6244
Contingency Coefficient 0.5296
Cramer's V -0.6244
Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F) 0
Left-sided Pr <= F 1.179E-38
Right-sided Pr >= F 1.0000
Table Probability (P) 1.179E-38
Two-sided Pr <= P 1.179E-38
Sample Size = 506
The FREQ Procedure
Table 2 of Silver by Gold
Controlling for Bronze=Y
Silver(Silver)     Gold(Gold)
Frequency|
Percent  |
Row Pct  |
Col Pct  |N       |Y       |  Total
---------+--------+--------+
N        |   1635 |     11 |   1646
         |  57.21 |   0.38 |  57.59
         |  99.33 |   0.67 |
         |  63.99 |   3.63 |
---------+--------+--------+
Y        |    920 |    292 |   1212
         |  32.19 |  10.22 |  42.41
         |  75.91 |  24.09 |
         |  36.01 |  96.37 |
---------+--------+--------+
Total        2555      303     2858
            89.40    10.60   100.00
Statistics for Table 2 of Silver by Gold
Controlling for Bronze=Y
Statistic DF Value Prob
------------------------------------------------------
Chi-Square 1 404.0990 <.0001
Likelihood Ratio Chi-Square 1 462.1276 <.0001
Continuity Adj. Chi-Square 1 401.6313 <.0001
Mantel-Haenszel Chi-Square 1 403.9576 <.0001
Phi Coefficient 0.3760
Contingency Coefficient 0.3520
Cramer's V 0.3760
Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F) 1635
Left-sided Pr <= F 1.0000
Right-sided Pr >= F 6.046E-102
Table Probability (P) 2.557E-100
Two-sided Pr <= P 6.046E-102
Sample Size = 2858
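The Chi-Square and Phi Coefficient rows in the two Silver by Gold tables above can be reproduced by hand from the four cell counts. The sketch below uses the standard shortcut formula for a 2x2 table, chi-square = n(ad - bc)²/(product of the four margins), and is a cross-check written in Python rather than the computation SAS performs internally; the function name is a hypothetical helper.

```python
import math

def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square and phi coefficient for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    cross = a * d - b * c
    # Product of the two row totals and the two column totals.
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = n * cross ** 2 / margins
    phi = cross / math.sqrt(margins)
    return chi2, phi

# Rows are Silver = N, Y; columns are Gold = N, Y.
chi2_no_bronze, phi_no_bronze = pearson_chi2_2x2(0, 51, 393, 62)    # Bronze = N
chi2_bronze, phi_bronze = pearson_chi2_2x2(1635, 11, 920, 292)      # Bronze = Y

print(f"Bronze=N: chi-square {chi2_no_bronze:.4f}, phi {phi_no_bronze:.4f}")
print(f"Bronze=Y: chi-square {chi2_bronze:.4f}, phi {phi_bronze:.4f}")
```

The sign of phi carries the direction of the association: it is negative among riders without a bronze medal, where every rider lacking silver holds gold, and positive among bronze medalists, where gold is far more common for those who also hold silver.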
As shown in the pie chart, there are more bronze medal winners than silver or gold medal winners. B stands for the bronze medal award, S for the silver medal award, and G for the gold medal award.
Table 3. Pie chart of bronze, silver, and gold medal awards
In the significance test for a proportion, Ho is rejected if the z-value is greater than z*. For the one-sided alternative at the 5% significance level, z* is equal to 1.645 (1.96 is the corresponding two-sided critical value). From the results below, the z-value is 11.069, which exceeds either critical value, so Ho is rejected. More than 80% of award winners are not gold medal winners; equivalently, fewer than 20% of award winners received gold medals. The SAS output also gives a 95% confidence interval: we are 95% confident that the proportion of non-gold medal winners is between 86.52% and 88.75%.
Gold
Cumulative Cumulative
Gold Frequency Percent Frequency Percent
---------------------------------------------------------
N 2948 87.63 2948 87.63
Y 416 12.37 3364 100.00
Binomial Proportion for Gold = N
--------------------------------
Proportion 0.8763
ASE 0.0057
95% Lower Conf Limit 0.8652
95% Upper Conf Limit 0.8875
Exact Conf Limits
95% Lower Conf Limit 0.8647
95% Upper Conf Limit 0.8873
Test of H0: Proportion = 0.8
ASE under H0 0.0069
Z 11.0690
One-sided Pr > Z <.0001
Two-sided Pr > |Z| <.0001
Sample Size = 3364
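The Z value and the "ASE under H0" in the output above follow directly from the z statistic given in the methodology. As a hedged cross-check in Python (independent of SAS, with a hypothetical function name), the counts 2948 non-gold winners out of 3364 are plugged into that formula, and the one-sided p-value is taken from the upper tail of the standard normal distribution.

```python
import math

def one_sample_proportion_z(successes, n, p0):
    """z test of Ho: p = p0 against Ha: p > p0 for a single proportion."""
    p_hat = successes / n
    # Standard error computed under the null hypothesis, as in the z formula.
    se0 = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se0
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return p_hat, se0, z, p_value

# 2948 of the 3364 award winners have not won the gold medal.
p_hat, se0, z, p_value = one_sample_proportion_z(2948, 3364, 0.8)

print(f"p-hat = {p_hat:.4f}, SE under Ho = {se0:.4f}, z = {z:.4f}")
print(f"one-sided p-value = {p_value:.3e}")
```

Since 3364 is 58 squared, the standard error under Ho is exactly 0.4/58, which makes the arithmetic easy to verify by hand.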
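The following SAS results show that we are 99% confident that the proportion of gold medal winners is between 10.90% and 13.83%. Because SAS reports the limits for the first level of each variable (for example, Gold = N), these numbers are found by taking 1 minus each 99% confidence limit. In the same way, the 99% confidence interval for the proportion of silver medal winners is 47.32% to 51.79%, and for bronze medal winners it is 83.37% to 86.55%. The gold and bronze intervals use the asymptotic limits; the quoted silver interval corresponds to the exact limits, which differ from the asymptotic ones only in the fourth decimal place.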
Gold
Cumulative Cumulative
Gold Frequency Percent Frequency Percent
---------------------------------------------------------
N 2948 87.63 2948 87.63
Y 416 12.37 3364 100.00
Binomial Proportion for Gold = N
--------------------------------
Proportion 0.8763
ASE 0.0057
99% Lower Conf Limit 0.8617
99% Upper Conf Limit 0.8910
Exact Conf Limits
99% Lower Conf Limit 0.8610
99% Upper Conf Limit 0.8906
Test of H0: Proportion = 0.5
ASE under H0 0.0086
Z 43.6552
One-sided Pr > Z <.0001
Two-sided Pr > |Z| <.0001
Sample Size = 3364
Silver
Cumulative Cumulative
Silver Frequency Percent Frequency Percent
-----------------------------------------------------------
N 1697 50.45 1697 50.45
Y 1667 49.55 3364 100.00
Binomial Proportion
for Silver = N
--------------------------------
Proportion 0.5045
ASE 0.0086
99% Lower Conf Limit 0.4823
99% Upper Conf Limit 0.5267
Exact Conf Limits
99% Lower Conf Limit 0.4821
99% Upper Conf Limit 0.5268
Test of H0: Proportion = 0.5
ASE under H0 0.0086
Z 0.5172
One-sided Pr > Z 0.3025
Two-sided Pr > |Z| 0.6050
Bronze
Cumulative Cumulative
Bronze Frequency Percent Frequency Percent
-----------------------------------------------------------
N 506 15.04 506 15.04
Y 2858 84.96 3364 100.00
Binomial Proportion
for Bronze = N
--------------------------------
Proportion 0.1504
ASE 0.0062
99% Lower Conf Limit 0.1345
99% Upper Conf Limit 0.1663
Exact Conf Limits
99% Lower Conf Limit 0.1349
99% Upper Conf Limit 0.1669
Test of H0: Proportion = 0.5
ASE under H0 0.0086
Z -40.5517
One-sided Pr < Z <.0001
Two-sided Pr > |Z| <.0001
Sample Size = 3364
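The asymptotic 99% limits above can be reproduced with the confidence interval formula from the methodology. The Python sketch below is a cross-check under the assumption that the interval is the usual normal-approximation (Wald) interval; it is applied directly to the winner counts, which gives the same result as subtracting the non-winner limits from 1 because the interval is symmetric about the sample proportion. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def wald_interval(successes, n, confidence):
    """Normal-approximation confidence interval for a single proportion."""
    p_hat = successes / n
    # Two-sided critical value z*, e.g. about 2.576 for 99% confidence.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

n = 3364
winners = {"gold": 416, "silver": 1667, "bronze": 2858}
intervals = {
    medal: wald_interval(count, n, 0.99) for medal, count in winners.items()
}

for medal, (low, high) in intervals.items():
    print(f"{medal}: {100 * low:.2f}% to {100 * high:.2f}%")
```

For silver, this approach gives limits that differ slightly from the 47.32% and 51.79% quoted in the text, which match the exact limits in the SAS output rather than the asymptotic ones.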
CONCLUSION
For the USDF rider awards, proportionally more people have won the bronze medal than the silver or gold medal, as was shown in Table 3. This is because it takes a relatively long time to train a horse to Grand Prix and an extraordinary amount of skill on the rider’s part to achieve the gold medal. There are usually fewer people at the very high skill level in a sport than at the basic levels.
The analysis also showed that fewer than 20% of award winners are gold medal winners. We are 99% confident that gold medal winners make up between 10.90% and 13.83% of award winners, silver medal winners between 47.32% and 51.79%, and bronze medal winners between 83.37% and 86.55%.
Figure 4. Amanda Johnson and her horse Glissade