Sensor Locations for Reliable Travel Time Prediction and Dynamic Management of Traffic Networks

How important is capturing congestion dynamics in dynamic OD estimation?

Rodric Frederix

Department of Mechanical Engineering

Center for Industrial Management - Traffic and Infrastructure

Katholieke Universiteit Leuven

Celestijnenlaan 300A - PO Box 2422,

3001 Heverlee, Belgium

Francesco Viti[*]

Department of Mechanical Engineering

Center for Industrial Management - Traffic and Infrastructure

Katholieke Universiteit Leuven

Celestijnenlaan 300A - PO Box 2422,

3001 Heverlee, Belgium

Chris M.J. Tampère

Department of Mechanical Engineering

Center for Industrial Management - Traffic and Infrastructure

Katholieke Universiteit Leuven

Celestijnenlaan 300A - PO Box 2422,

3001 Heverlee, Belgium

Keywords: Dynamic OD estimation, congestion patterns, Dynamic Network Loading

Abstract

In this paper we highlight some important conditions for a proper handlingof congestion effects in dynamic origin-destination (OD) estimation. A first condition is related with the choice of the network loading model and in particular to the modelling of queues. It is shown that an incorrect representation of queuing leads to an incorrect interpretation of information from the detectors, and thus leads to biased results. Another source of error is caused by the presence of multiple local minima in the objective function. These are partly due to an incorrect interpretation of the information from the detectors. Since flow measurements can correspond to two distinguished traffic regimes, free flow and congestion, when the assigned initial OD matrix reproduces a different traffic regime at the detectors compared to reality, the measurements can be interpreted incorrectly. We show these findings through theoretical considerations and using synthetic examples. Finally a case study is presented that illustrates how OD estimation results can be faulty if these shortcomings are not considered.

Introduction

The dynamic OD estimation problem represents a fundamental research subject in transportation theory and practice. The large majority of transportation problems require preliminarily the estimation of OD flows, and errors in this procedure are often carried over in the following steps, yielding to incorrect findings and conclusions. Being a widely acknowledged fundamental problem, the dynamic OD estimation procedure has been studied extensively under various application domains and focusing on different aspects of the problem. Past studies focus on a large variety of topics, dealing with the randomness and underdeterminedness of the problem, solution algorithms, the use of different data sources, and so forth.

Traditional OD estimation methods use link traffic flows, or counts, as input, since they are the most common data measures available in practice. However, traffic counts contain aggregate information about multi-commodity flows, i.e. they need to be decomposed into a number of flow portions depending on the number of routes connecting any OD pair. As one can easily understand there can be many combinations of these flow fractions that result in the same link flow values and the set of possible solutions usually grows with the size of the network in consideration, and the travel alternatives available for each OD pair, while it usually reduces by increasing the number of detectors. Two clarifying examples of this issue can be found in Wu et al. [1]and in Yang and Zhou [2]. Generally speaking, the under-determinedness is expected in all cases where the information collected to estimate the OD flows is insufficient to determine them unambiguously, which is the most likely scenario in practice ([3]-[4]).

A general way to obviate this under-determinedness is to add extra information to the counts, traditionally in the form of a prior estimate of the OD table (e.g., an outdated OD matrix), or by including route information, for instance through route choice models or using route information sources (e.g., floating car data, automatic vehicle identification systems). An overview of the most relevant approaches to solve this problem is given in e.g. [5]. However, it is quite well known that, by imposing these conditions, the solution will become tied to these input variables, whose reliability is often hard to be checked. For instance, imposing an initial matrix can be a valid solution for a specific time of the day, while completely different traffic patterns may be observed at other times or even in other days of the week. Also route information can be affected by the same issue, and this information type is in general very expensive to be collected.

The underdeterminedness issue affects indiscriminately static and dynamic OD estimation problems. However, in congested networks and for a within-day dynamic context, this is not the only factor causingbiases between the estimated and real OD matrix. Congestion causes the relationship between the link flows and the OD flows to become highly non-linear in time, because of spillback and rerouting effects. This non-linearity makes the problem non-convex, and it is possible that the estimated OD matrix converges to a local minimum and not to a global minimum of the goal function. In many cases these local optima result in a congestion pattern that differs strongly from reality. An example of this misinterpretation of the dataisvisualized in Figure 1. This figure was obtained by performing the OD estimation on the ring-way around Antwerp using only flow measurements. The left picture in Figure 1 shows the real speed contour plot, while the right picture displays the speed contour plot that is obtained by assigning the estimated OD matrix. As one can see the congestion pattern of the estimated OD matrix deviates strongly from the actual congestion pattern. Apparently the OD estimation process converged to a local minimum. This can be easily explained observing that the same number of vehicles can be observed under light conditions of traffic (low density and high speeds), and during heavy traffic (high density and low speeds).

Figure 1: Real traffic pattern (left) in terms of spatio-temporal diagram of speeds and traffic pattern calculated with the estimated OD matrix (right)

To reduce the risk of local minima many studies stress, again, the importance of having a good initial matrix that should not deviate too much from the actual OD matrix. Very often this is referred to as the OD adjustment process. However, as said, only in rare cases the available initial OD matrix yields to a reliable adjusted matrix. Moreover, even having a good initial matrix might not be a sufficient condition for obtaining a good adjusted matrix, as we will show later in this paper.

We therefore argue in this paper that congestion is a determinant factor leading to poor-quality estimates of the dynamic OD matrix. These errors are transferred onto the network flows, leading to errors in network performance measures. Since practitioners are often confronted with highly congested networks and adopt traffic models that need to be calibrated in order to be used in traffic management systems or for planning purposes, there is an urgent need for a practical OD estimation methodology that is applicable to heavily congested networks and that does not rely on the quality of the initial guess.To obtain this, we argue that extra information is necessary, for instance one can use information on speeds or link densities to identify unambiguously the state of the network, and choose a proper dynamic traffic model that models congestion accurately. The scope of this study is to show, through some clarifying example, how identifying and modelling congestion dynamics using one or both conditions can improve reliability of the OD estimation.

This paper is structured as follows. The next section provides an overview of the within-day assignment-based dynamic OD estimation problem and the relevant literature on this topic. Later we identify some fundamental requirements to obtain a reliable OD estimate that reproduces the observed congestion dynamics using synthetic examples. Then we apply our findings to a real dataset. Finally we conclude the paper sketching the next steps of our research.

The within-day dynamic OD estimation problem

Key issue in the estimation of an OD matrix is the identification of the origin-destination pairs whose flow portions use a particular link in which traffic is monitored. The common ground is the relationship between any origin-destination flow distributed on each (used) route alternative and each link flow in the network. Let

Mathematically speaking, OD estimators can be formulated in a general way as follows:

(1)

where f1 and f2 are performance functions controlling the performance of the OD estimator, T is the estimated OD matrix of elements for each origin-destination pair (r,s), is the corresponding element in the a priori matrix, and respectively and are the estimated and the measured link counts.

By definition, the vector of link flows ya satisfies the generic relationship[†]:

(2)

where is the assignment matrix, which controls the fraction of flows from any OD pair which uses link . This matrix can be further subdivided into a crossing fraction matrix and a route fraction matrix (see also [6]). The elements of this crossing fraction matrix express the proportion of a route flow that passes a link in time, thus describing the spatio-temporal propagation of the route flows throughout the network. The elements of route fraction matrix express the proportion an OD flow choosing a certain route in time. Both are used in the lower level, in which a dynamic OD matrix estimated in the upper level is assigned using a traffic loading modeltaking the route proportions from the output of an assignment model. The output of the traffic model, in terms of link flows, is then used to derive a relationship between the measurements and the OD flows. In general in this procedure a linear relationship is assumed. This relationship is then used again in the minimization problem of the upper level to obtain a new estimate of the OD matrix. This process repeats until convergence is reached and the two levels are mutually consistent.

Different forms of the optimization problem (1) can be found whereas the assignment relationship is used as constraint to the problem. Popular functions used at the upper level are the Maximum Likelihood estimator [7]and the Generalized Least Squares method (e.g., [8]-[9]). Alternative formulations combine trip-distribution with assignment and require the computation of the marginal functions between these two estimation parameters. For instance, in gradient-based estimation approaches the target OD matrix is “adjusted” to reproduce the traffic counts by iteratively calculating directions based on the gradient of the objective function (see e.g., [10]-[11]). The difficulty to derive the gradient in an exact way has suggested recently the use of gradient approximation methods, among which the Simultaneous Perturbation Stochastic Approximation (SPSA, [12]) has been used extensively in dynamic OD estimation problems (e.g. [13]-[14]). In contrast with more traditional gradient approximation methods like e.g. the Finite Difference method, which requires 2p measurements (p being the number of variables) of the objective function to obtain one gradient approximation, SPSA only takes 2 measurements per iteration, regardless of the number of variables. For the OD estimation problem, where the number of variables rapidly increases with the network size and the considered time period, this is a very attractive feature. In analogy with the other approximation methods another important feature of this approach is that eq. (1) can be formulated with other variables for which normally one cannot calculate an exact gradient (e.g. speed measurements).

Focusing on the lower level, originally two types of (static) assignment approaches have been proposed: proportional assignment (e.g., [15]-[18]) and equilibrium assignment (e.g., [19]-[21]). In proportional assignment approaches, route choices are assumed independent from the traffic volumes, which can be acceptable in uncongested networks. In case of congested networks, drivers choose which routes to take in proportion to the current (and/or past) information of route travel times. We will not go further into this literature search direction since this paper does not focus on the impact of route choice.

All the above methods have been proposed for solving static OD estimation problems, and adapted in some way to the dynamic context (e.g.,[22]). The result of this adaptation is however often unsatisfactory in practice. In fact, traditional bilevel formulations assume the assignment matrix in (2) as fixed with respect to the OD flows. As a consequence the effects of congestion may not be modelled correctly in the upper level.Therefore relationship (2) should be reformulated specifying that is both time-dependent and OD-flow dependent. We have shown from the second expression in eq. (2) that the assignment matrix can be decomposed into two elements, i.e. one controlling the effect of a change in OD flows onto the spatio-temporal propagation pattern due to the time-varying travel times and the other on the route choice matrix, which controls the temporal changes in route preferences of the users [23].

If we then focus on calculating the derivative of the distance between measured and estimated link flows, and specifying now the time dimension explicitly, we would obtain:

(3)

In expression (3), the first term consists of the assignment fraction, which represents the fraction of OD flow that passes link i during interval m. The second term consists of the sensitivity of each element of the crossing fraction matrix (that was non-zero for link i and interval m) to OD flow , multiplied with the associated route flow. The third term consists of the sensitivity of all route flows that pass link i during interval m to OD flow , multiplied with the associated crossing fraction.

It is normal procedure in practice to consider only the first term on the right-hand side of eq. (3) when solving the upper level (1). However, to incorporate in this level the dynamic effects of congestion, we cannot neglect these two terms so easily. By neglecting them, the link flows are assumed separable in space and time with respect to the OD flows. In other words, it assumes that changing an OD flow of a certain time period only influences the flow on the space-time interval at which it originally passes. This assumption is incompatible with some typical phenomena in congested networks, such as spillback of congestion, time lags due to congestion effects, and interdependencies between crossing or opposing flows through intersections.

This error has already been addressed in past studies. In Yang [24]explicit derivatives of the link flows to the OD flows are used in the OD estimation. Since the paper deals with static OD estimation, the crossing fraction matrix simplifies to the link-incidence matrix, which is not sensitive to the OD flows. Lindveld [23]investigates the derivative of the link flows to the OD flows, but decides not to use the last two terms because of computational complexity. Tavana [25]acknowledges the theoretical importance of including both derivatives. Although he does not find a substantial improvement compared to the standard linear relationship, the case study considers a non-congested network, and is thus not well chosen to draw any conclusions on its performance in congested networks. He argues that one of the reasons for the small amount of improvement is that the initial guess of OD flows is very close to the desired solution, because else the process may converge to another solution. We show in the next section, through some synthetic example, that this is not the only reason for a bad performance of the OD estimation procedure if a congested case is considered.

Modelling congestion dynamics in the estimation procedure

In this section we aim at acquiring insight into the possible factors affecting the reliability of the OD estimation procedure, and that are related to the calculation of the second term of equation (3). Note again that the current analysis does not consider the effects on route choice, and therefore on the above third term. These conclusions are thus directly applicable to OD estimation problems where route choice dynamics are not significant, for instance in case of motorway corridors.

Therefore, in this section we identify and discuss some conditions that should be met for obtaining an unbiased OD estimation,

The traffic model (or Dynamic Network Loading model) that is used in the estimation process should be able to represent queuing behavior in a proper way, both in terms of queue size and in terms of spatio-temporal propagation;
During the OD estimation process the correct bottlenecks should be activated and the simulated traffic state of the network (free flow or congestion) should match with the actual traffic state of the network.

These two elements are not sufficient conditions for guaranteeing a correct dynamic OD estimation; yet represent important requirements for avoiding certain biases, as it is shown with the following synthetic examples.

The effect of Dynamic Network Loading

The traffic model adopted in the OD estimation procedure is partly responsible of the calculation of the second term in equation (3). A traffic model that assumes infinite buffer, and therefore no spillback effects, implies link costs separability, thus all elements in the second term become zero. Errors in the calculation can be also caused by an incorrect propagation of queues in time and space, even when link travel times are not separable. We show this with the help of synthetic examples.

Dynamic Network Loading models used in dynamic OD estimation can be classified into two main categories. The first group consists of analytical models, which describe the behaviour of traffic with macroscopic traffic flow variables, such as inflow rates and travel times. Very popular analytical models adopt simple queuing mechanisms, such as vertical or point-queue models, or horizontal, spatial-queue models. However, traditional analytical approaches fail in capturing the spatio-temporal effects of congestion in a network. Frederix et al. [26] show the significant errors caused by vertical and horizontal queues in assignment-based OD estimation problems. This has motivated the development of more advanced traffic flow propagation models based on kinematic wave theory, which have the advantage of capturing the effects of congestion more realistically, but at the expense of a certainly higher mathematical and algorithmic complexity. These models describe the physical propagation of flows on the links and across the nodes as a compressible fluid control the variation of flow, density and speed using the conservation law of fluid-dynamics. The main elements in these models are the forward and backward shockwaves emerging when a change in one of the three fundamental parameters is observed. Widely adopted are the Cell Transmission Model (CTM, [27]), and the Link Transmission Model (LTM, [28]-[29]). LTM models outperform CTM models in their computational efficiency, since links are not subdivided into discrete cells, while keeping consistent track of kinematic waves.