Statistical and Physiological Modelling of Athletic Race Times 491947
Abstract
During the recent Olympic Games held in London, David Rudisha broke the world record for the men’s 800m with a time of 1:40.91. It seems for now that the world record for this event can still be beaten, but the question is at what time, if any, the record will become ‘unbeatable’. This project builds upon earlier work by Shaw (2012) by focusing on the men’s and women’s 800m. Purely statistical analysis of results from 1890 up to the present day indicates that the men’s times are best fitted by an exponential function applied to the world record times, while the women’s times are best fitted by a logistic function applied to the Olympic times. Forecasts using the best fits suggest that the 100 second barrier for men will be broken in the year 2018 ± 1. The fits forecast limiting times of 94.5 ± 0.5 seconds for men and 112.25 ± 0.25 seconds for women. The analysis of world age-group records derived by Shaw (2012), which uses a physiological model by Péronnet and Thibault (1989), has also been updated. The model suggests a linear decline in VO2 max and anaerobic power between the ages of 35 and 85 for both men and women. The endurance parameter for men appears to increase until 40 years of age and then decline linearly to 80 years of age. The endurance parameter for women is more irregular and shows fewer trends.
Contents
Abstract
Chapter 1 – Introduction to Modelling and forecasting of Running Records
Chapter 2 – History and Progression of the 800m
Chapter 3 – Statistical Models and Excel Macro Implementation
Chapter 4 – Data Fitting and Forecasting for 800m: Linear
Chapter 5 – Data Fitting and Forecasting for 800m: Exponential
Chapter 6 – Data Fitting and Forecasting for 800m: Log Quotient
Chapter 7 – Data Fitting and Forecasting for 800m: Logistic
Chapter 8 – Data Fitting and Forecasting for 800m: Gompertz
Chapter 9 – Data Fitting and Forecasting for 800m: Chapman Richards
Chapter 10 – Conclusion for Data Fitting of the 800m
Chapter 11 – Physiology models on Running Performance
Chapter 12 – Conclusion for Physiological Models on Running Performance
Appendix
Bibliography
Chapter 1 – Introduction to Modelling and forecasting of Running Records
At the London 2012 Olympics a total of 26 world records were broken. For most sportsmen and women a world record is one of the highest accolades they can achieve within their sport. A ‘big’ question is whether there is a limit to human performance in sport: will there come a time when new world records are a thing of the past? In events such as archery there is a limit to the score a competitor can achieve; for example, in the 72 arrow ranking round the perfect score is 720. In athletics it is unclear whether there are limiting times and, if so, what these times are. Humans are always trying to push themselves to their maximum capabilities, and this is evident in top athletes. Athletes are always striving to be the best they can be, so the idea of limits is unlikely to enter an athlete’s mind. Olympic gold medallist and former 800m world record holder Sir Sebastian Coe once said, ‘World records are only borrowed.’
The theory that there are limits to human performance in sport is of high interest to mathematicians, especially the idea of limits on running times. Many mathematicians have written studies attempting to model the development of running times using historical data for both genders. Notable studies include Kuper and Sterken (2007), who focused their research on the Gompertz curve and applied this model to 16 events. Nevill and Whyte (2005) applied flattened ‘S-shaped’ logistic curves to the data, as they believed records would reach some limiting time in the future. Whipp and Ward (1992) used linear approximations and came to the conclusion that women will outrun men. Wainer, Njue and Palmer (2000) found that men’s and women’s development in athletics showed a similar pattern, although the women lagged behind their male equivalents. This project will attempt to fit existing statistical models to 800m times for both genders. There will be two sets of data to analyse for each gender. The first set is the world record times since records began, and the second set is the gold medal winning times from each Olympic Games.
The final part of the project is to analyse running performances based on physiological factors such as VO2 max and endurance capability. Simple models created in previous work by researchers such as Joyner (1992) and Morton (2012) showed how physiological factors affect running performance. A more complex model created by Péronnet and Thibault (1989) showed the effects of physiological factors on running performance for every distance from 100m to the marathon. The Péronnet and Thibault model will then be used to establish whether these factors are affected by age.
Chapter 2 – History and Progression of the 800m
800m
The 800m is a common track running event and is regarded as the shortest middle distance track event. It is run over two laps of the standard outdoor 400m running track, and during the indoor season it is run over four laps of a 200m running track. This event has been part of the Summer Olympics since the first Games in 1896. It combines two energy systems, aerobic endurance and anaerobic conditioning, unlike events such as the 100m, which is solely an anaerobic event. The 800m is known as the most tactical racing event within athletics, as it is the fastest event in which all runners converge into the same lane; gaining an advantageous position early in the race will therefore usually put the athlete in good stead for the rest of the race. This event has created many legends within athletics, such as Sir Sebastian Coe, David Rudisha, Kelly Holmes and Wilson Kipketer.
Men’s 800m World Records
Fig 2.1 shows the progression of the men’s 800m world record since 1912, the year the International Association of Athletics Federations was founded and world records were first recorded officially. Observing the graph, without using any mathematics, the chart appears to show two different trends. From 1912 up to 1939 a new world record was set more often and by a larger time margin than from 1955 up to the present day. The years from 1939 to 1955 were a period when athletics may have taken a back seat in the lives of those who competed at the highest level. World War 2 took place during this period, so athletes may have been away fighting for their countries and unable to train and compete; two Olympic Games were cancelled during this time. The world record was broken on average every 3.375 years between 1912 and 1939, with an average time reduction per year of 0.196 seconds. From 1955 onwards the world record was broken every 4.07 years, with an average time reduction per year of 0.08 seconds.
Men’s 800m Olympic Times
Fig 2.2 shows the gold medal winning times from each Olympic Games since 1896. The gold medal winning times for the first 5 Olympics improved at a rapid rate; during these 16 years the time improved by 19.1 seconds. The Olympic times from then onwards were still improving, but by a much smaller margin. This progression is explained well by Wainer, Njue and Palmer (2000, p.12), who found that ‘The trends over times of athletic record setting are nonlinear. When a sport (or event) is new, records improve quickly, but as the sport matures the records change much more slowly. There are many causes for this, but the most obvious one is participation rate. When a sport is new the record holder might be the best among hundreds, whereas for a mature sport the world record holder might be the best among millions; as participation increases the record improves apace’.
Women’s 800m World Records
Fig 2.3 shows the women’s 800m world records since 1922. The women’s world records also seem to follow two linear trends, the first from 1922 up to 1928 and the second from 1944 up to 1983. The world records in the first period were broken more often and by a greater margin than in the second period. The current world record, 1:53.28, was set by Jarmila Kratochvílová in 1983 and is currently the longest standing individual world record in track and field.
Women’s 800m Olympic Times
Fig 2.4 shows the women’s 800m gold medal winning times since the first Olympic Games in which women could compete in the event. The first appearance of the women’s 800m came at the Summer Olympics in 1928, which was 32 years after the very first Games in Athens. In this momentous race the winner, Lina Radke, ran a time of 2:16.8. However, after the race rumours spread that many competitors had failed to complete the race and that many had collapsed after the finishing line. A reporter from the New York Evening Post wrote after the race, "Below us on the cinder path were 11 wretched women, 5 of whom dropped out before the finish, while 5 collapsed after reaching the tape." As a result of this race the event was taken off the women’s programme by the IOC for 32 years, as they felt women were too frail to compete over 800m. Since being reinstated in 1960 it has been a key track and field event at the Olympics.
Chapter 3 – Statistical Models and Excel Macro Implementation
Model Functions
The table below lists the functions which will be fitted to the data by the macro in Excel. This table was provided by Dr Michael McCabe.
Model Function / r(t) / r(0) / Limit t / Comments
Linear / / / - / Equal annual progression; 0 after t = a/b; Whipp and Ward 1992, Tatem et al. 2004
Exponential / / /
Log Quotient / / / / Could use log10
Logistic / / /
Gompertz / / / a / Kuper and Sterken 2006
Chapman Richards / / a / a - b / Generalises logistic
Macro in Excel
A macro programme built in Excel can be used to fit any function to a set of data points; the better the fit, the smaller the sum of squares will be. Fig 3.1 below is the programme that has been built in Excel.
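(Fig 3.1)
Sub test()
    ' Start with a large sum of squares so that the first trial improves on it
    Range("X51").Value = 100000
    ' Step each parameter from its lower bound (column L) to its upper bound
    ' (column M) using the step size in column N
    For a = Range("L4").Value To Range("M4").Value Step Range("N4").Value
        For b = Range("L5").Value To Range("M5").Value Step Range("N5").Value
            For c = Range("L6").Value To Range("M6").Value Step Range("N6").Value
                For d = Range("L7").Value To Range("M7").Value Step Range("N7").Value
                    ' Write the trial parameters to column Q; X50 is assumed to hold
                    ' the sum of squares the worksheet computes from these values
                    Range("Q4").Value = a
                    Range("Q5").Value = b
                    Range("Q6").Value = c
                    Range("Q7").Value = d
                    ' Keep the trial if it beats the best sum of squares so far
                    If Range("X50").Value < Range("X51").Value Then
                        Range("X51").Value = Range("X50").Value
                        Range("O4").Value = a
                        Range("O5").Value = b
                        Range("O6").Value = c
                        Range("O7").Value = d
                    End If
                Next d
            Next c
        Next b
    Next a
    ' Copy the best parameters back to column Q so the sheet shows the best fit
    Range("Q4").Value = Range("O4").Value
    Range("Q5").Value = Range("O5").Value
    Range("Q6").Value = Range("O6").Value
    Range("Q7").Value = Range("O7").Value
End Sub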
At the start of the programme the macro sets the best sum of squares to a high positive number. A range of values is then selected for each parameter from a to d, with a chosen step size. The number of parameters actually used depends on the function being fitted.
For each combination of parameter values, the squared differences between the world record times and the model are totalled. If the total is less than the smallest sum of squares found so far, these parameter values are stored.
When the macro finishes, column O holds the value of each parameter. These values give the lowest sum of squares found when they are used in the model and fitted against the data. The least squares value is given in cell X51.
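The search described above can also be sketched outside Excel. The Python below is a minimal illustration only, not the macro used in this project: the function names (`linear_model`, `frange`, `grid_search`), the assumed linear form r(t) = a - b*t and the data points are all hypothetical and synthetic.

```python
from itertools import product


def linear_model(t, a, b):
    """Assumed linear form r(t) = a - b*t, with t in years after a start year."""
    return a - b * t


def sum_of_squares(model, params, times, records):
    """Total squared difference between the data and the model."""
    return sum((r - model(t, *params)) ** 2 for t, r in zip(times, records))


def frange(start, stop, step):
    """Inclusive list of values from start to stop, like For ... To ... Step."""
    n = int((stop - start) / step + 1e-9)
    return [start + i * step for i in range(n + 1)]


def grid_search(model, grids, times, records):
    """Try every combination of parameter values and keep the best one."""
    best_params, best_ss = None, float("inf")
    for params in product(*grids):
        ss = sum_of_squares(model, params, times, records)
        if ss < best_ss:
            best_params, best_ss = params, ss
    return best_params, best_ss


# Synthetic data lying exactly on r(t) = 120 - 0.2*t (not real record times)
times = [0, 10, 20, 30, 40]
records = [120.0, 118.0, 116.0, 114.0, 112.0]

# One grid per parameter: (lower bound, upper bound, step), as in columns L to N
grids = [frange(110.0, 130.0, 1.0), frange(0.0, 0.5, 0.05)]

best_params, best_ss = grid_search(linear_model, grids, times, records)
print(best_params, best_ss)
```

As with the macro, the result can only be as precise as the chosen step sizes, and the number of trials grows with the product of the grid lengths, so a coarse search followed by a finer search around the best values is a common compromise.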
Chapter 4 – Data Fitting and Forecasting for 800m: Linear
Linear Function
Model Function / r(t) / r(0) / Limit t / Comments
Linear / / / - / Equal annual progression; 0 after t = a/b; Whipp and Ward 1992, Tatem et al. 2004
Men’s World Records
Model Function / Year of 100 seconds / 2050 / 2150 / 2250 / Limit t / Least Squares
Linear / 2009 / 94.88 / 82.68 / 70.48 / - / 21.88
Fig 4.1 shows the linear fit through the men’s world records. The men’s world records appear linear over this period of time. To confirm this, the coefficient of determination (R²) can be calculated, as it measures how well a straight line describes the data. In this case R² = 0.9352, which confirms the data is fairly linear. This model forecasted that the 100 second barrier would be broken in 2009, and as time tends to infinity the model forecasts a negative time.
Fig 4.1 also includes a linear fit to the data points produced with the linear fitting feature in Excel. This was done to check that the macro was working accurately.
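The same kind of check can be made by hand, since the least squares line has a closed form. The Python below is a sketch under stated assumptions: the helper names (`fit_line`, `r_squared`) are hypothetical, the data are synthetic rather than the world record times, and R² is computed as one minus the ratio of the residual sum of squares to the total sum of squares.

```python
def fit_line(x, y):
    """Closed-form least squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Synthetic, nearly linear data (not real record times)
x = [0, 1, 2, 3, 4]
y = [10.0, 8.1, 5.9, 4.2, 2.0]

intercept, slope = fit_line(x, y)
r2 = r_squared(x, y, intercept, slope)
print(intercept, slope, r2)
```

If a grid search over the same data returns parameters close to this closed-form line, that is reasonable evidence that the search is working as intended.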
Men’s Olympic Times
Model Function / Year of 100 seconds / 2050 / 2150 / 2250 / Limit t / Least Squares
Linear / 2013 / 94.06 / 78.41 / 62.76 / - / 287.58
Fig 4.2 shows the linear trend through the men’s Olympic times. Looking at the graph, the Olympic times do not seem to follow a linear trend, and R² = 0.7415 indicates that the data is not very linear. This trend forecasts that the 100 second barrier will be broken this year (2013). As time tends to infinity the trend forecasts a negative time.
Women’s World Records
Model Function / Year of 110 seconds / 2050 / 2150 / 2250 / Limit t / Least Squares
Linear / 1986 / 74.86 / 19.46 / -35.94 / - / 117.88
Fig 4.3 shows the linear fit through the women’s world records. For this set of data R² = 0.9609, which indicates linearity. Although the data is linear, using this trend to forecast future times is questionable. The trend forecasted that the 110 second barrier should have been beaten in 1986, yet in the current year (2013) the world record is still over three seconds slower. By the year 2250 the linear fit also suggests the distance will be completed in a negative time, which is impossible.
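The forecasts in the table above are enough to recover the fitted line and to see where it breaks down. The Python below is a rough sketch using only the tabulated 2050 and 2150 values for the women’s world records; the helper names (`line_through`, `year_reaching`) are hypothetical, and because the tabulated values are rounded the results are approximate.

```python
def line_through(p1, p2):
    """Intercept and slope of the straight line through two (year, time) points."""
    (t1, r1), (t2, r2) = p1, p2
    slope = (r2 - r1) / (t2 - t1)
    intercept = r1 - slope * t1
    return intercept, slope


def year_reaching(target, intercept, slope):
    """Year at which the line intercept + slope * year equals the target time."""
    return (target - intercept) / slope


# Forecasts for 2050 and 2150 taken from the women's world record table
intercept, slope = line_through((2050, 74.86), (2150, 19.46))

barrier_year = year_reaching(110.0, intercept, slope)  # 110 second barrier
zero_year = year_reaching(0.0, intercept, slope)       # line reaches 0 seconds
forecast_2250 = intercept + slope * 2250               # compare with the table

print(slope, barrier_year, zero_year, forecast_2250)
```

On these figures the line falls by a little over half a second per year, passes 110 seconds in the mid-1980s and reaches zero well before 2250, which is why the 2250 entry in the table is negative.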
Women’s Olympic Times
Model Function / Year of 110 seconds / 2050 / 2150 / 2250 / Limit t / Least Squares
Linear / 2021 / 103.8 / 81.8 / 59.8 / - / 142.24
Fig 4.4 shows the linear fit through the women’s Olympic times. The linear fit does not appear very accurate for this set of data, and R² = 0.693 confirms that the women’s Olympic times do not follow a linear trend. Even so, the trend forecasts that the 110 second barrier will be broken in 2021. This is an optimistic prediction; however, it cannot be ruled out at this stage.
Linear Conclusions
The results show that a linear trend is a suitable model for showing the development of the men’s world records. The linear trend forecasted that the 100 second barrier should have been broken in 2009. The current world record in 2013 is 100.91 seconds, so this forecasted year was too early. As the linear trend has no asymptote it will carry on through 0, so forecasting distant future times using this model would be inappropriate.
The linear trend is not a suitable model for showing the development of the men’s Olympic times. The linear model forecasts that the 100 second barrier will be broken this year. This prediction seems optimistic, as it is a 0.91 second improvement on the current world record, which is a large margin in the 800m. The model predicts that within the next 30 years more than 6 seconds will be knocked off the current time. Forecasting men’s future times using this model would therefore give optimistic but unrealistic results.
The linear trend fits well for the women’s world records, especially for the more recent times. However, using this model to forecast future times would not be appropriate. The linear trend forecasted that the 110 second barrier should have been broken in 1986; this estimate is far from the current world record. The linear trend decreases at a fixed rate, whereas the current world record has stood for roughly 30 years, and this may explain the poor estimate.
The linear fit is not suitable for the women’s Olympic times; however, this may be due to the times being slightly clustered. Although the model is not suitable for showing the development of the Olympic times, its predictions for the near future may be worth taking into account.
The linear trend may be useful for showing the progression in running times up to the current date. For forecasting times in the distant future, however, it is inadequate, as a straight line will carry on through 0 and indicate a negative time. Baxter (2005, p.31) remarked that ‘A surprising amount of time has been devoted to fitting linear models of the form y = α + βx to this kind of data...’ and believed that, as the results lead to unbounded estimates of performance, linear models are inappropriate when forecasting running times.
Chapter 5 – Data Fitting and Forecasting for 800m: Exponential
Exponential Function
Model Function / r(t) / r(0) / Limit t / Comments
Exponential / / /
Men’s World Records