Car Accessibility in Inner Cities and Residential Areas - Discrete Choice Analysis of Stated Preference Data
Mattias Haraldsson and Tomas Svensson
Swedish National Road and Transport Research Institute
SE-581 95 Linköping
Introduction and purpose
An attractive and efficient urban design requires a balance between the individual benefit of car accessibility and the public benefits of a good urban environment. The Swedish National Road and Transport Research Institute has carried out a study in this field, which has been reported earlier (Grudemo & Svensson, 2000; Gustavsson, 2000; Svensson, 2000). The research question was addressed by letting individuals choose, by means of a questionnaire, among different scenarios for the design of inner cities and suburban residential areas. The purpose of this paper is to use logit models for further analysis. This approach makes it possible to extract more detail and gain a deeper knowledge of the preferences. One important step in this process is to find a method for handling missing values, a problem present in many survey-based analyses. Another step, of great importance for the ease of communicating the results, is to present the model parameters in a proper way. The analysis has generated a large number of numerical measures, such as model parameters. The main tendencies are summarised in the paper and simulated choices are visualised in graphics, which will hopefully give the reader a satisfactory picture of the results. Readers who are interested in a more detailed description and further discussion are referred to Haraldsson (2000).
Data acquisition
Since there is no functioning market for urban design, stated preference methods are the only available option if the purpose is to investigate the relevant individual preferences (stated preference methods are presented in Ivehammar, 1996). The data analysed in this study were collected by means of postal questionnaires in two different surveys. The first survey contains questions about inner city design and the second deals with suburban residential areas. The questionnaires contain three and four scenarios respectively, which are illustrated by pictures and explained briefly in words. Additionally, the questionnaires contain questions about socio-economic characteristics such as income and type of housing, about car usage and usage of other transport modes, and about travel behaviour in general.
The questionnaire concerned with inner cities describes three different scenarios, or alternatives, with the following headings:
- Increase in motor traffic and street space for cars
- Less space for cars and road pricing
- Lower speed limits on smaller streets
The first alternative entails an accommodation of the design of inner cities to a greater use of cars. Street capacity and parking facilities are expanded to cater for more cars. The number of pedestrian streets remains at the present level but public transport experiences a falling number of passengers, which will decrease the quality of the supplied services. Cyclists will find it more difficult to travel safely in increasingly car-adapted surroundings. The second alternative’s major ingredient is a pricing system that imposes a charge on car usage in the inner city. The revenues are used to subsidise bus transport with lower fares as a result. Travelling by car decreases and travelling by public transport increases as a consequence of the system. Some streets and roads are converted to pedestrian streets and bicycle lanes. The third alternative is a so-called “traffic calming” scenario. Car traffic is limited by means of lower speed limits, more pavements and bicycle lanes, and special lanes for public transport only. By calming motor traffic, the competitiveness of other means of transportation is increased and they gain a larger share of the travellers, at the expense of the car.
The questionnaire used in the other part of the research project describes four different designs, or alternatives, of a suburban residential area. It is important to note that in such residential areas there are generally no serious problems due to through traffic. The traffic resulting from car usage consists almost entirely of travel in the particular area by residents and visitors. The four alternatives are identical with respect to number of inhabitants, size, composition of different types of housing, rents, house prices etc. The only characteristic that varies between the four alternatives is the conditions for car usage and parking, which have consequences for public transport, pedestrians, cyclists and children’s play on streets and other common areas. The four alternatives are assigned the following headings:
- More street space for cars
- Lower speed limits
- All parking on the outskirts
- Free of cars
Scenarios two, three and four have reduced accessibility for cars. The second alternative is the least car restrictive and the fourth alternative has the most severe car restrictions. In the first alternative car access is unlimited and the area is designed to improve accessibility for cars. The fourth alternative is primarily designed for individuals and households without cars, or households that use cars very seldom in their everyday life. Ambulances, removal vans and other heavy vehicles can, however, use the pedestrian and cycle lanes in the area. In the second alternative unlimited car access is retained, but with the help of speed limits, street design and other measures, the street space intended for relatively safe use by pedestrians, cyclists, children at play etc., is increased at the expense of car traffic, i.e. traffic calming. In the third alternative, car traffic is limited further. All car parking is located on the outskirts of the area and the pedestrian and cycle lanes can only be used by cars under certain circumstances. The services supplied by public transport are varied with respect to frequencies of bus departures during rush hours: departures at 20, 15, 10, and 5 minute intervals for the alternatives one to four.
Econometric method
In this paper the multinomial logit model in (1) is used to model the individuals' choices among the scenarios presented above (see for example Ben-Akiva et al, 1987).
(1)
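A standard way of writing the multinomial logit with individual-specific explanatory variables is sketched below. The notation is an assumption made here for illustration and not necessarily the paper's own: x_n is the variable vector of respondent n, \beta_j is the parameter vector of scenario j, J is the number of scenarios, and the first scenario is taken as the reference alternative.

```latex
P_n(i) = \frac{\exp(\beta_i' x_n)}{\sum_{j=1}^{J} \exp(\beta_j' x_n)},
\qquad i = 1, \dots, J, \qquad \beta_1 = 0 .
```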
Missing values
The questionnaire used in the survey is comprehensive and contains a large number of questions. In some cases the respondent has answered only a subset of the questions, which results in missing values. A common way to handle this problem (usually the default method in statistical packages) is to delete incomplete observations, a method called listwise deletion. The consequence of this method, however, is that a lot of information is wasted. A further problem is that the estimated parameters are biased unless the values are missing completely at random (MCAR), which is a very strong assumption. MCAR means that the pattern of missing values is not a function of other information in the data and is thus impossible to predict (King et al, 2000a). These weaknesses imply that more sophisticated methods should be substituted for listwise deletion when this is possible. Several methods for substituting an estimated value for a missing value are described in the literature. One state-of-the-art approach, which is also coherent with statistical theory, is multiple imputation (MI), which is used in the research discussed in this paper. With MI, a number of data sets in which estimates are substituted for the missing values are generated with an iterative method based on Bayesian principles. The estimates vary somewhat between the data sets, to represent the uncertainty in the imputations. Each imputed data set is then used to estimate the model, and in a second step the resulting estimates are synthesised into a final model. We thus generate as many model estimations as we have data sets, and the final model is derived by averaging the single model estimations (King et al, 2000a). The parameters and their associated variances are computed according to the formulas below (Montalto et al, 1996; Rubin, 1997; Schafer et al, 1998).
The parameter vector is computed according to (2), and is simply an average of the parameter vectors estimated from the m imputed data sets.
(2)
Computation of the covariance matrix is somewhat more complicated and involves two components that are subsequently combined into the final covariance matrix. The first component, the within imputation covariance, is a simple average of the m model covariance matrices and is shown in (3). The second component, the between imputation covariance, measures the variation of the m parameter vectors around their average and is presented in (4). The two components are used to generate the multiple imputation covariance matrix, which is given by (5).
(3)
(4)
(5)
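In the multiple imputation literature cited above, these combining rules are usually written as in the sketch below. The symbols \hat{\beta}_i (parameter vector from imputed data set i), U_i (its estimated covariance matrix), W and B are assumptions introduced here for illustration. T is the multiple imputation covariance matrix, under the name the text uses for it in the simulation section.

```latex
\begin{align}
\bar{\beta} &= \frac{1}{m} \sum_{i=1}^{m} \hat{\beta}_i \tag{2} \\
W &= \frac{1}{m} \sum_{i=1}^{m} U_i \tag{3} \\
B &= \frac{1}{m-1} \sum_{i=1}^{m} (\hat{\beta}_i - \bar{\beta})(\hat{\beta}_i - \bar{\beta})' \tag{4} \\
T &= W + \Bigl(1 + \frac{1}{m}\Bigr) B \tag{5}
\end{align}
```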
The number of degrees of freedom depends on the number of imputations and the variance components, as is apparent from (6). If the within imputation variance is the major part, the number of degrees of freedom reaches high levels, and can in fact approach infinity. If this is the case one can conclude that the number of imputations is sufficiently large. When the opposite is true, i.e. when the between imputation variance is the largest part, the number of degrees of freedom approaches m-1, which indicates that more imputations are needed. The number of imputations is considered sufficient if the number of degrees of freedom is larger than 10 (Schafer et al, 1998).
(6)
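The combining step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `combine_imputations` and its argument names are hypothetical, and the degrees of freedom are computed per parameter with the commonly used form df = (m-1)(1 + W_kk / ((1 + 1/m) B_kk))^2, which reproduces the limiting behaviour described in the text.

```python
import numpy as np

def combine_imputations(estimates, covariances):
    """Pool m parameter vectors and covariance matrices (hypothetical helper).

    estimates   : array-like, shape (m, k), one parameter vector per imputed data set
    covariances : array-like, shape (m, k, k), the matching covariance matrices
    Returns the pooled parameter vector, the total covariance matrix T and
    the degrees of freedom for each parameter.
    """
    Q = np.asarray(estimates, dtype=float)
    U = np.asarray(covariances, dtype=float)
    m = Q.shape[0]

    beta_bar = Q.mean(axis=0)            # average of the m parameter vectors
    W = U.mean(axis=0)                   # within imputation covariance
    dev = Q - beta_bar
    B = dev.T @ dev / (m - 1)            # between imputation covariance
    T = W + (1.0 + 1.0 / m) * B          # multiple imputation covariance

    # Degrees of freedom per parameter, based on the diagonal elements.
    # A zero between imputation variance yields infinite degrees of freedom.
    w, b = np.diag(W), np.diag(B)
    with np.errstate(divide="ignore", invalid="ignore"):
        df = (m - 1) * (1.0 + w / ((1.0 + 1.0 / m) * b)) ** 2
    return beta_bar, T, df
```

When the between imputation variance dominates, the ratio inside the parentheses goes to zero and the degrees of freedom fall towards m-1, in line with the discussion above.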
Simulations
In this section a graphical presentation technique will be used to illustrate the model results through simulated choices of two type respondents. Type respondents are analytical constructions, constituted by a set of variable values that can be used to show how different characteristics influence individual choices. The two type respondents used in this analysis differ with respect to sex, family, type of housing and car usage. The purpose has been to represent two rather different individuals, one who is a frequent car user and another who uses public transport or walks more frequently. To make the differences in characteristics more apparent, the two respondents differ in type of housing and in the purpose of their travel to the inner city as well. The number of explanatory variables differs somewhat between the inner city model and the model describing suburban residential areas, which means that the respondents are not identical in the two approaches, though the underlying principle is the same. The first respondent is a 40-year-old man who lives in a detached house with his wife and two children. His work is located in the inner city and he commutes by car. He drives about 20 000 kilometres annually. He is satisfied living in his house, i.e. his preference for other types of housing is weak. He seldom uses public transport and walks once or twice a week for recreational purposes. The second respondent is a 35-year-old woman. She lives in an apartment with her two children. Her choice of housing is in part explained by low earnings and she would rather live in a detached house if she had the financial means. She owns a car, which is not used very often. She drives only about 5 000 kilometres each year. Her main transport mode is bus, which she uses when commuting to her work outside the city. Both respondents visit the inner city about once a week for shopping purposes.
When the features of the type respondents have been established, these are used to simulate a large number of choices, i.e. a set of probabilities, for the two respondents. The probabilities for the scenarios are visualised in plots where the location indicates the preference structure (this simulation based presentation technique is elaborated in King, 2000b). For each type respondent 1 000 simulations were made, and the dispersion between the simulated probabilities reveals the precision of the estimated parameters. In each simulation one parameter vector is drawn from a multivariate normal distribution with expected value equal to the combined parameter vector from (2) and covariance matrix T from (5). Multiplying this parameter vector by the set of variable values that represents the type respondent, and inserting the result in (1), generates the probabilities for choosing the different scenarios. The number of potential type respondents is of course practically unlimited. The two respondents outlined above are just examples with the purpose of presenting some of the main characteristics of the models. Besides this they serve as a means to illustrate how the graphical presentation technique works. It is of course possible to add a broader spectrum of type respondents, where the differences in characteristics are smaller, which makes ceteris paribus analysis possible. The figure below illustrates the probabilities of the different scenarios in the inner city model. The sum of the probabilities is always unity, which implies that the points are located on the plane x+y+z=1. Centrally located points represent a situation where every alternative has the same probability.
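The simulation procedure described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name `simulate_choice_probabilities`, the argument names and the stacking order of the parameters are assumptions, and the first alternative is taken as the reference alternative with all parameters equal to zero.

```python
import numpy as np

def simulate_choice_probabilities(beta_hat, T, x, n_alternatives,
                                  n_draws=1000, seed=0):
    """Simulate choice probabilities for one type respondent (hypothetical helper).

    beta_hat       : pooled parameters of alternatives 2..J stacked in one vector,
                     length (J-1)*K (assumed layout)
    T              : covariance matrix of beta_hat
    x              : variable values of the type respondent, length K
    n_alternatives : number of scenarios J
    Returns an array of shape (n_draws, J) whose rows sum to one.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    k = x.shape[0]

    # One parameter vector per simulation, drawn from N(beta_hat, T).
    draws = rng.multivariate_normal(beta_hat, T, size=n_draws)
    betas = draws.reshape(n_draws, n_alternatives - 1, k)

    # Linear index for each non-reference alternative; the reference gets zero.
    v = betas @ x
    v = np.concatenate([np.zeros((n_draws, 1)), v], axis=1)

    # Logit transformation, shifted by the row maximum for numerical stability.
    v -= v.max(axis=1, keepdims=True)
    expv = np.exp(v)
    return expv / expv.sum(axis=1, keepdims=True)
```

Each row of the returned array is one simulated point. With three alternatives the rows lie on the plane x+y+z=1, and the spread of the point cloud reflects the uncertainty in the parameter estimates.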
Figure 1. Simulated choice probability, inner city
The blue points represent the male respondent, and the red points the female respondent. It is obvious that the male respondent tends to choose “increase in motor traffic and street space for cars” or “lower speed limits on smaller streets”, while the female respondent has a more unambiguous preference for the alternative “lower speed limits on smaller streets”. Besides the location of the point clusters per se, the respondents also differ in terms of dispersion. The blue points are more dispersed than the red ones, which implies that the predictions for the male respondent are more uncertain than those for the female.
Figure 2. Simulated choice probability (i), suburban residential areas
The probabilities for the choices of the four alternatives in the suburban residential area model are shown in figure 2 and figure 3. Every corner of the volume x+y+z ≤ 1 (with x, y and z non-negative) represents one alternative with unity probability. In the figure above only one of the sides of the tetrahedron is shown, and therefore only three alternatives can be explicitly visualised. The fourth alternative, which is marked by the intersection of the dotted lines, is shown in the next figure.
The type respondents are similar in that they prefer the alternative “lower speed limits”, although their second best choices differ. The female respondent has a stronger preference for the alternative “all parking on the outskirts”, while the male respondent has a stronger preference for the car friendly alternative “more street space for cars”. The dispersion inside the clusters is similar and the certainty of the predictions can thus be considered similar.
Figure 3. Simulated choice probability (ii), suburban residential areas
In figure 3 we have rotated the figure above to visualise the fourth scenario of the model. The inner corner represents a choice of the alternative “free of cars” with a 100 percent probability. As the two clusters are located at approximately the same distance from the inner corner, we can conclude that both respondents have similar attitudes towards that alternative.
Elasticities
To increase the understanding of marginal effects it is useful to interpret econometric models in terms of elasticities. The logit model, contrary to ordinary regression approaches, produces elasticity measures that vary with the explanatory variable values. Elasticities may well change over the entire explanatory variable range, which makes the analysis quite complicated (Cramer, 1991). In this analysis we have calculated elasticities for continuous explanatory variables. The elasticity, i.e. the percentage change of the probability for alternative i, P(i), that follows from a one percent change in an explanatory variable, is computed according to the formula below (Greene, 1997).
(7)
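For a multinomial logit with individual-specific variables, the elasticity of P(i) with respect to a continuous variable x_k has the standard form sketched below. The symbols \beta_{ik} (parameter of variable k for alternative i) and J (number of alternatives) are assumptions introduced here for illustration.

```latex
E_{ik} = \frac{\partial P(i)}{\partial x_k}\,\frac{x_k}{P(i)}
       = x_k \Bigl( \beta_{ik} - \sum_{j=1}^{J} P(j)\,\beta_{jk} \Bigr)
```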
It is evident that the elasticity depends not only on the parameter value in question but also on all other parameter values and on each probability. The probability in turn depends on the variable values. The elasticities must therefore be evaluated for each and every relevant explanatory variable vector. In this analysis the variable vectors constituting the two type respondents will be employed. The sum of the probabilities is always unity and, consequently, the elasticities, weighted by the corresponding probabilities, add up to zero. The reason is that a probability increase in one alternative must be counterbalanced by an equal decrease in the probabilities for other alternatives.
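Evaluating the elasticities at a given variable vector can be sketched as below. This is a minimal illustration that assumes a multinomial logit with individual-specific variables, for which the elasticity of P(i) with respect to x_k is x_k times the difference between the parameter of alternative i and the probability-weighted average of all alternatives' parameters. The function name `mnl_elasticities` and the parameter layout are hypothetical.

```python
import numpy as np

def mnl_elasticities(betas, x):
    """Choice probabilities and elasticities at one variable vector (hypothetical helper).

    betas : array-like, shape (J, K), one row of parameters per alternative
            (the reference alternative has a row of zeros)
    x     : array-like, shape (K,), variable values of one type respondent
    Returns (p, e): p has shape (J,), e has shape (J, K), where e[i, k] is the
    elasticity of P(i) with respect to variable k.
    """
    betas = np.asarray(betas, dtype=float)
    x = np.asarray(x, dtype=float)

    v = betas @ x
    p = np.exp(v - v.max())          # shifted for numerical stability
    p /= p.sum()

    beta_bar = p @ betas             # probability-weighted average parameters
    e = x * (betas - beta_bar)       # elasticity of each P(i) w.r.t. each x_k
    return p, e

# Made-up parameter values, only to show the call.
p, e = mnl_elasticities([[0.0, 0.0], [0.5, -0.2], [1.0, 0.3]], [1.0, 2.0])
weighted_sum = p @ e                 # zero for every variable, up to rounding
```

Because the same parameters are combined with different probabilities for different respondents, the sign of an elasticity can differ between two variable vectors even though the model parameters are unchanged.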
The ability to draw general conclusions about the way that the explanatory variables influence the probabilities for different scenarios depends on the stability of the elasticities. We are interested in results that can be generalised and, therefore, primarily focus on elasticities that are similar, or at least have the same sign, for both type respondents. In this paper the main tendencies which can be identified are discussed briefly. Readers who are interested in the numerical values are referred to Haraldsson (2000). It seems that the elasticity measures in the inner city model are stable, because the variation between the two type respondents is fairly limited. Independently of the chosen values of the explanatory variables, the change of probability points in the same direction. We can observe, for example, that an increase in the number of school or work related errands in the city means that the probability for “less space for cars and road pricing” decreases. The increase in errand frequency increases the probability for the less car restrictive alternatives. We can also observe that frequent visits to the city for education or work purposes influence the probability for “increase in motor traffic and street space for cars” in a positive direction. People who visit the city for shopping purposes only, however, have a stronger preference for “lower speed limits on smaller streets”.
Compared with the inner city model, we can observe much more variation in the elasticities in the suburban residential area model, which implies that conclusions about probability changes must be drawn with greater care. A couple of elasticities differ between the respondents not only in magnitude but also in sign. For example, according to the results the male respondent should get a stronger preference for “free of cars” if the frequency of bus travelling is increased. The female respondent, though, should get a weaker preference for the same alternative if the frequency of bus travelling increases. The same model parameter thus generates two completely different elasticities depending on the values of the explanatory variables. To convey a deeper understanding of these effects is a difficult task, but the example reveals the complexity of the problem and shows that conclusions drawn from the logit parameters alone can be false. Although this kind of problem is evident for a few variables, most of the variables have stable elasticities, which is a prerequisite for unambiguous interpretations. For example, we can see that an increase in car usage frequency, and/or annual driven distance, implies a stronger preference for “more street space for cars”. All “car user variables” give the same picture: when individuals become more frequent car users, their preference for car accessibility in suburbs strengthens. Following this tendency we can observe that people who often travel by bus or walk have preferences against the scenario “more street space for cars”.