Amanda Johnson

22S:030

May 3, 2004

Proportion Testing of USDF Award Winners

INTRODUCTION

Dressage is an equestrian sport that originated about 2000 years ago with classical Greek horsemanship and developed further during the Renaissance. Today it is an Olympic sport in which horse and rider perform a test, a prescribed series of very difficult movements, in an arena. The test performed at the Olympic Games is the Grand Prix, the test with the highest degree of difficulty. Dressage is much like ballet on horseback: horse and rider appear to move as one and dance around the arena. The Grand Prix level is so difficult that it takes a minimum of five years to train a horse in the Grand Prix movements.

The United States Dressage Federation (USDF) was founded in 1973 to promote dressage in the United States. Today it has over 45,000 members and provides educational opportunities, competitions, and rider and year-end awards for its members. The year-end and rider awards are based on test percentages earned at USDF competitions. This project focuses on the three major rider awards that can be earned through USDF: the bronze medal, the silver medal, and the gold medal. USDF has various levels of dressage, beginning with the basic training level, followed by levels 1 through 4, then the FEI (Fédération Equestre Internationale) levels, which include Prix St. Georges, Intermediaire 1 and 2, and finally the highest level, Grand Prix. The USDF levels, training through fourth, are shown only in America, while the FEI levels are shown in international competitions and are the same in every country. The bronze medal is won through achievements at first level through fourth level, the silver medal is won through achievements at fourth level and Prix St. Georges, and the gold medal is won through achievements at Intermediaire and Grand Prix. A list of all award winners can be viewed at http://www.usdf.org/Programs/awards/RiderAwards.asp.

DATA COLLECTION

Lists of bronze, silver, and gold medal award winners were retrieved from www.usdf.org and compiled into a single Excel file (Table 1). The first column lists the rider's name, the second column the bronze medal award, the third the silver, and the fourth the gold. In each medal column, a Y or N was entered depending on whether that rider won that award. Some people have won only one award and some have won all three. Normally people earn their bronze medal first and then their silver and gold medals as they progress through the levels. It would be unusual for a rider to have only their gold medal, because it takes a lot of skill to ride at that level and the rider would have to be competent at the lower levels in order to be successful enough to earn the gold medal. One reason a rider may have earned only their gold medal is that they have not sent in the paperwork for the silver or bronze medal.

Table 1. Sample of data set

Name (membership number) / Bronze / Silver / Gold
James, Erika T (46703) / Y / Y / N
Jansen, Ann (42497) / Y / N / N
Jaraczewski, Kathleen A (50540) / Y / Y / N
Jaskiel Brown, Linda D (43) / Y / Y / Y
Jeffers, Deri (640) / N / Y / N
Jendrowski, Lynn M (14776) / Y / Y / N
Jenkins, Breen (50818) / Y / N / N
Jenkins, Valerie L (18655) / Y / Y / N
Jensen, Bent (8796) / N / N / Y
Jensen-Bennett, Cindy (52562) / Y / Y / N
Jerdeman, Sharon (22179) / Y / Y / N
Jerman, Carol L (26425) / Y / N / N
Jewell, Roxanne E (17998) / Y / Y / N
Jo, Cassandra (40640) / Y / N / N
Joaquin, Tricia Ziebell (30026) / Y / Y / N
Johansen, Kay (22004) / Y / N / N
Johns, Kathleen (30104) / Y / Y / N
Johnson, Alice A (7287) / N / Y / N
Johnson, Amanda L (42779) / Y / Y / Y
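
The Y/N coding in Table 1 lends itself to a simple tally of medal combinations. The following is a minimal Python sketch of that idea, not the procedure used in this project (which used Excel and SAS). The function name parse_row and the choice of five rows copied from Table 1 are assumptions made only for illustration.

```python
from collections import Counter

# Five rows copied from Table 1; the full data set is much larger.
rows = [
    "James, Erika T (46703) / Y / Y / N",
    "Jansen, Ann (42497) / Y / N / N",
    "Jaraczewski, Kathleen A (50540) / Y / Y / N",
    "Jaskiel Brown, Linda D (43) / Y / Y / Y",
    "Jeffers, Deri (640) / N / Y / N",
]

def parse_row(line):
    """Split one 'Name (number) / B / S / G' record into a name and three flags."""
    name, bronze, silver, gold = line.rsplit(" / ", 3)
    return name, (bronze == "Y", silver == "Y", gold == "Y")

# Tally how many riders hold each (bronze, silver, gold) combination.
combos = Counter(flags for _, flags in map(parse_row, rows))
for flags, count in combos.items():
    print(flags, count)
```

Splitting from the right keeps names that contain commas or spaces intact, since only the last three fields are medal flags.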

METHODOLOGY

1.  Find the proportion of riders who hold each medal (bronze, silver, gold) and each combination of medals (bronze and silver, bronze and gold, silver and gold) from the data set using PROC FREQ in SAS. The SAS input is:

proc freq data = usdfYN;
  tables bronze * silver * gold / chisq;
run;

2.  Create a pie chart of bronze, silver, and gold medal winners.

The SAS input will be:

data usdf;
  input type $ count;
  datalines;
G 416
S 1667
B 2858
;
run;
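
To make explicit what the pie chart slices represent, the short Python sketch below computes each slice's share from the three counts in the data step above. It is only an illustration of the arithmetic, not the SAS charting step. Note that a rider with more than one medal is counted once per medal, so the slices add up to the number of awards rather than the number of riders.

```python
# Medal counts taken from the SAS data step above.
counts = {"G": 416, "S": 1667, "B": 2858}

# Total number of awards (a rider with several medals is counted several times).
total_awards = sum(counts.values())

# Each slice's share of the pie and the corresponding angle in degrees.
shares = {medal: n / total_awards for medal, n in counts.items()}
for medal, share in shares.items():
    print(f"{medal}: {share:.2%} of awards, {360 * share:.1f} degrees")
```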

3.  Carry out a significance test for a proportion to determine whether more than 80% of the population are not gold medal winners. Ho: 80% of the population are not gold medal winners. Ha: More than 80% of the population are not gold medal winners.

Ho: p = 0.8

Ha: p > 0.8

This is tested by the Z statistic, where p̂ is the sample proportion of non-gold medal winners, po is the hypothesized proportion, and n is the sample size:

z = (p̂-po)/√(po(1-po)/n)

The SAS input will be:

proc freq data = usdfYN;
  tables gold / binomial (p = 0.8);
run;
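
The Z statistic in step 3 can also be checked by hand. The sketch below is a minimal Python version of the formula, assuming the counts that appear in the SAS output in the RESULTS section (2948 riders without a gold medal out of 3364). The function name one_sample_z is hypothetical, and this is not the SAS implementation.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(successes, n, p0):
    """Return (z, upper-tail p-value) for Ho: p = p0 against Ha: p > p0."""
    p_hat = successes / n                # sample proportion
    se0 = sqrt(p0 * (1 - p0) / n)        # standard error computed under Ho
    z = (p_hat - p0) / se0
    p_value = 1 - NormalDist().cdf(z)    # one-sided, upper tail
    return z, p_value

# 2948 of 3364 riders have no gold medal (counts from the SAS output in RESULTS).
z, p_value = one_sample_z(2948, 3364, 0.8)
print(f"z = {z:.3f}, one-sided p-value = {p_value:.4g}")
```

The standard error uses po rather than the sample proportion because the test statistic is computed assuming Ho is true.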

4.  Find a 99% confidence interval for the proportion of Gold medal winners. Find the 99% confidence interval for silver and bronze medal winners for comparison.

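The approximate level C confidence interval for p, where p̂ is the sample proportion and z* is the standard normal critical value for confidence level C (z* = 2.576 for C = 99%), is

p̂ ± z*√(p̂(1-p̂)/n)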

The SAS input (for the variable gold) is:

proc freq data = usdfYN;
  tables gold / binomial alpha = .01;
run;
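
The interval formula in step 4 can be sketched in a few lines of Python. This is an illustrative calculation under the assumption that the counts are those shown in the SAS output in the RESULTS section (2948 non-gold riders out of 3364); the function name wald_interval is hypothetical. Because the proportion is computed for the N level, the interval for gold medal winners is obtained by subtracting each limit from 1.

```python
from math import sqrt
from statistics import NormalDist

def wald_interval(successes, n, confidence):
    """Approximate level-C interval: p_hat +/- z* * sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# 99% interval for the proportion of riders WITHOUT a gold medal.
low_n, high_n = wald_interval(2948, 3364, 0.99)

# Interval for the proportion WITH a gold medal: take complements and swap the ends.
low_y, high_y = 1 - high_n, 1 - low_n

print(f"Gold = N: {low_n:.4f} to {high_n:.4f}")
print(f"Gold = Y: {low_y:.4f} to {high_y:.4f}")
```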

RESULTS

From the output below, 2858 people won the bronze medal, 1667 people won the silver medal, and 416 people won the gold medal, out of a total of 3364 people receiving awards. There are 51 people who received only their gold medal, 393 people who received only their silver medal, and 1635 people who received only their bronze medal. These numbers show that it is more difficult to earn the gold medal than the bronze. This is very plausible considering the amount of time it takes to train a horse to the Grand Prix level. I have earned my bronze, silver, and gold medals, as shown in the last row of Table 1. The proportion of people who have won all three medals is 8.68% (292 of 3364). Table 2 shows the numbers in a more readable format.

Table 2. Awards

Awards Won / Number of People Who Won Award / Proportion
Bronze, Silver & Gold / 292 / 8.68%
Bronze & Silver only / 920 / 27.35%
Bronze & Gold only / 11 / 0.33%
Silver & Gold only / 62 / 1.84%
Bronze only / 1635 / 48.60%
Silver only / 393 / 11.68%
Gold only / 51 / 1.52%
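
As a check on Table 2, the proportions and the per-medal totals quoted in the text can be recomputed from the seven combination counts. The Python sketch below is only an arithmetic illustration; the dictionary layout, keyed by (bronze, silver, gold) flags, is an assumption made for the example.

```python
# Riders holding each exact combination of medals (counts from Table 2).
# Keys are (bronze, silver, gold) flags.
combos = {
    (True, True, True): 292,
    (True, True, False): 920,
    (True, False, True): 11,
    (False, True, True): 62,
    (True, False, False): 1635,
    (False, True, False): 393,
    (False, False, True): 51,
}

total = sum(combos.values())  # total number of award winners
proportions = {flags: n / total for flags, n in combos.items()}

# Per-medal totals: add every combination that includes the medal.
bronze = sum(n for (b, s, g), n in combos.items() if b)
silver = sum(n for (b, s, g), n in combos.items() if s)
gold = sum(n for (b, s, g), n in combos.items() if g)

print(total, bronze, silver, gold)
for flags, p in proportions.items():
    print(flags, f"{p:.2%}")
```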

The FREQ Procedure

Table 1 of Silver by Gold
Controlling for Bronze=N

Silver(Silver)     Gold(Gold)

Frequency|
Percent  |
Row Pct  |
Col Pct  |N       |Y       |  Total
---------+--------+--------+
N        |      0 |     51 |     51
         |   0.00 |  10.08 |  10.08
         |   0.00 | 100.00 |
         |   0.00 |  45.13 |
---------+--------+--------+
Y        |    393 |     62 |    455
         |  77.67 |  12.25 |  89.92
         |  86.37 |  13.63 |
         | 100.00 |  54.87 |
---------+--------+--------+
Total         393      113      506
            77.67    22.33   100.00

Statistics for Table 1 of Silver by Gold
Controlling for Bronze=N

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1    197.2529    <.0001
Likelihood Ratio Chi-Square    1    175.1595    <.0001
Continuity Adj. Chi-Square     1    192.3045    <.0001
Mantel-Haenszel Chi-Square     1    196.8631    <.0001
Phi Coefficient                      -0.6244
Contingency Coefficient               0.5296
Cramer's V                           -0.6244

Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)         0
Left-sided Pr <= F       1.179E-38
Right-sided Pr >= F         1.0000

Table Probability (P)    1.179E-38
Two-sided Pr <= P        1.179E-38

Sample Size = 506

The FREQ Procedure

Table 2 of Silver by Gold
Controlling for Bronze=Y

Silver(Silver)     Gold(Gold)

Frequency|
Percent  |
Row Pct  |
Col Pct  |N       |Y       |  Total
---------+--------+--------+
N        |   1635 |     11 |   1646
         |  57.21 |   0.38 |  57.59
         |  99.33 |   0.67 |
         |  63.99 |   3.63 |
---------+--------+--------+
Y        |    920 |    292 |   1212
         |  32.19 |  10.22 |  42.41
         |  75.91 |  24.09 |
         |  36.01 |  96.37 |
---------+--------+--------+
Total        2555      303     2858
            89.40    10.60   100.00

Statistics for Table 2 of Silver by Gold
Controlling for Bronze=Y

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1    404.0990    <.0001
Likelihood Ratio Chi-Square    1    462.1276    <.0001
Continuity Adj. Chi-Square     1    401.6313    <.0001
Mantel-Haenszel Chi-Square     1    403.9576    <.0001
Phi Coefficient                       0.3760
Contingency Coefficient               0.3520
Cramer's V                            0.3760

Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)      1635
Left-sided Pr <= F          1.0000
Right-sided Pr >= F     6.046E-102

Table Probability (P)   2.557E-100
Two-sided Pr <= P       6.046E-102

Sample Size = 2858
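
The Chi-Square and Phi Coefficient rows in the two Silver by Gold tables above can be reproduced from the four cell counts of each table. The Python sketch below uses the standard shortcut formulas for a 2x2 table; it is an illustrative check, not the code SAS runs, and the function names are hypothetical.

```python
from math import sqrt

def pearson_chi_square(table):
    """Pearson chi-square for a 2x2 table [[a, b], [c, d]] (1 degree of freedom)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)  # product of row and column totals
    return n * (a * d - b * c) ** 2 / margins

def phi_coefficient(table):
    """Signed phi coefficient for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    return (a * d - b * c) / sqrt(margins)

# Rows are Silver = N, Y and columns are Gold = N, Y, as in the SAS output.
no_bronze = [[0, 51], [393, 62]]          # Controlling for Bronze=N
with_bronze = [[1635, 11], [920, 292]]    # Controlling for Bronze=Y

for label, table in (("Bronze=N", no_bronze), ("Bronze=Y", with_bronze)):
    chi2 = pearson_chi_square(table)
    phi = phi_coefficient(table)
    print(f"{label}: chi-square = {chi2:.4f}, phi = {phi:.4f}")
```

The negative phi for Bronze=N reflects that, among riders without a bronze medal, holding a silver medal goes with not holding a gold medal; for Bronze=Y the association is positive.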

As shown in this pie chart, there are more bronze medal winners than gold medal winners. B is equal to bronze medal award, S is for silver medal award and G stands for gold medal award.

Table 3. Pie chart of bronze, silver, and gold medal awards

In the significance test for a proportion, Ho is rejected if the z-value is greater than z*. Because Ha is one-sided, z* is equal to 1.645 at the 5% significance level. From the results below, the z-value found is 11.069 (one-sided p-value < .0001), therefore Ho is rejected. More than 80% of award winners are not gold medal winners; equivalently, fewer than 20% of award winners received gold medals. This SAS result also gives us a 95% confidence interval, which tells us that we are 95% confident that the proportion of non-gold medal winners is between 86.52% and 88.75%.

Gold

                                  Cumulative    Cumulative
Gold    Frequency     Percent     Frequency       Percent
---------------------------------------------------------
N            2948       87.63          2948         87.63
Y             416       12.37          3364        100.00

Binomial Proportion for Gold = N
--------------------------------
Proportion                0.8763
ASE                       0.0057
95% Lower Conf Limit      0.8652
95% Upper Conf Limit      0.8875

Exact Conf Limits
95% Lower Conf Limit      0.8647
95% Upper Conf Limit      0.8873

Test of H0: Proportion = 0.8

ASE under H0              0.0069
Z                        11.0690
One-sided Pr >  Z         <.0001
Two-sided Pr > |Z|        <.0001

Sample Size = 3364

The following SAS results show that we are 99% confident that the proportion of gold medal winners is between 10.90% and 13.83%. Because SAS reports the binomial proportion for the N level of each variable, these numbers are found by subtracting each reported 99% confidence limit from 1. The 99% confidence interval for the proportion of silver medal winners is between 47.32% and 51.79%, and the 99% confidence interval for the proportion of bronze medal winners is between 83.37% and 86.55%.

Gold

                                  Cumulative    Cumulative
Gold    Frequency     Percent     Frequency       Percent
---------------------------------------------------------
N            2948       87.63          2948         87.63
Y             416       12.37          3364        100.00

Binomial Proportion for Gold = N
--------------------------------
Proportion                0.8763
ASE                       0.0057
99% Lower Conf Limit      0.8617
99% Upper Conf Limit      0.8910

Exact Conf Limits
99% Lower Conf Limit      0.8610
99% Upper Conf Limit      0.8906

Test of H0: Proportion = 0.5

ASE under H0              0.0086
Z                        43.6552
One-sided Pr >  Z         <.0001
Two-sided Pr > |Z|        <.0001

Sample Size = 3364

Silver

                                    Cumulative    Cumulative
Silver    Frequency     Percent     Frequency       Percent
-----------------------------------------------------------
N              1697       50.45          1697         50.45
Y              1667       49.55          3364        100.00

Binomial Proportion for Silver = N
--------------------------------
Proportion                0.5045
ASE                       0.0086
99% Lower Conf Limit      0.4823
99% Upper Conf Limit      0.5267

Exact Conf Limits
99% Lower Conf Limit      0.4821
99% Upper Conf Limit      0.5268

Test of H0: Proportion = 0.5

ASE under H0              0.0086
Z                         0.5172
One-sided Pr >  Z         0.3025
Two-sided Pr > |Z|        0.6050

Bronze

                                    Cumulative    Cumulative
Bronze    Frequency     Percent     Frequency       Percent
-----------------------------------------------------------
N               506       15.04           506         15.04
Y              2858       84.96          3364        100.00

Binomial Proportion for Bronze = N
--------------------------------
Proportion                0.1504
ASE                       0.0062
99% Lower Conf Limit      0.1345
99% Upper Conf Limit      0.1663

Exact Conf Limits
99% Lower Conf Limit      0.1349
99% Upper Conf Limit      0.1669

Test of H0: Proportion = 0.5

ASE under H0              0.0086
Z                       -40.5517
One-sided Pr <  Z         <.0001
Two-sided Pr > |Z|        <.0001

Sample Size = 3364

CONCLUSION

For the USDF rider awards, proportionally more people have won the bronze medal than the silver or gold medal, as shown in Table 3. This is because it takes a relatively long time to train a horse to Grand Prix and an extraordinary amount of skill on the rider's part to achieve the gold medal. There are usually fewer people at the very high skill levels of a sport than at the basic levels.

The analysis also showed that fewer than 20% of award winners are gold medal winners. We are 99% confident that the proportion of gold medal winners is between 10.90% and 13.83%, the proportion of silver medal winners is between 47.32% and 51.79%, and the proportion of bronze medal winners is between 83.37% and 86.55%.

Figure 4. Amanda Johnson and her horse Glissade