Proceedings Template - WORD s17

An Efficient Method for Indexing Future Positions of Continuously Moving Objects

- 2 -

ABSTRACT

Over the last decade, we have witnessed an increasing use of location-aware devices in a number of different applications. A critical and common requirement in many of these applications is to efficiently query on the locations of moving objects. Indexing methods are a natural solution to support this key operation. Since these applications are characterized by a large number of continuously moving objects, a key requirement for indexing methods in this domain is to efficiently support both updates and queries. Previous work on indexing such database can be broadly divided into two categories: indexing the past positions and indexing the future predicted positions. In this paper we focus on an efficient indexing method for indexing the future positions of moving objects.

The indexing method proposed in this paper indexes the predicted trajectories in a dual-transformed space. Trajectories for objects in d-dimensional space become points in a higher-dimensional 2d-space. In this dual transformed space, a regular hierarchical grid decomposition indexing structure is used, which results in leaf nodes that are the size of disk block and non-leaf nodes that are much smaller. An efficient technique is used for clustering the non-leaf nodes on disk pages. This new indexing method can evaluate a range of queries including timestamp, window, and moving queries. We have compared the performance of the proposed index with a recently proposed indexing method (TPR*-tree), and show that our approach has significant performance advantages for both updates and search queries.

1. INTRODUCTION

Past work on indexing predicted trajectories have produced a range of indexing structures which either index the trajectories in native space, or index the trajectories in dual-transformed space. Most of the recent work

Importance of moving object database. Two classes of indexing issues, past and predicted trajectories. Need to be efficient w.r.t. updates and queries. A number of techniques. Our proposal based on dual transformation to a higher dimensional space. Trajectories in two-dimension become points in a four-dimensional space. The higher dimensional space is indexed using a regular hierarchical decomposition with leaf pages being the size of a disk block, and the internal nodes being small records that are clustered by the children

STRIPES = A Scalable Trajectory Index for Predicted Positions in Moving Object Databases. The STRIPES index maps queries which results in producing a pattern that looks like stripes when projected on a two-dimensional space.

2. BACKGROUND AND MODEL

<Prasad: can you take a crack at this section too, modeling it after the way the SETI paper introduces the background and model. The only caveat is we need to use the same terminology as is used in Section 4, see the notations table. Explain the background of trajectories – represented as a sequence of moving points. Historical trajectories a time series of points in a the physical space.

Future trajectories – different models for the predicted trajectories. The Sistla model, which is widely used. We use that here too.

Explain the notion of horizon.

2.1 Query Types

Explain the query types

Figure 1: Query Examples for Objects Moving in a One-dimensional Space

3. TPR AND TPR*-TREE INDICES

In this section we review the two popular predicted indices, TPR-tree index [20] and the TPR*-tree index [26].

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Conference’04, Month 1–2, 2004, City, State, Country.

3.1 TPR-Tree

The TPR-tree [20] is essentially a time parameterized R*-Tree.
The index stores velocities of the elements along with their positions in nodes. Since the elements are not static, the corresponding MBRs are dynamic (see Figure 2).

The index structure and as well as the algorithms for search, insert and delete used are very similar to that of R*-tree[2]. The R*-tree uses a number of static parameters such as the area, perimeter, distance from the centroid, and the intersection between the two MBRs. The TPR-tree uses time parameterized metrics for these parameters. The time parameterized metric is computed using the formulae , where M(t) is some metric that is used in the original R*-tree (for example the area), and H is the life time of the index. The life time of an index is the time for which the index is used and queried.

Figure 2 shows an example of the time-parameterized area metric for four object a, b, c, and d, moving in a two-dimensional space.. The MBR of the index node at time T0 is labeled as A, and the MBR of the node at time T0+H is labeled as B. The size and the position of B are calculated by extrapolating (position, velocity) of the entries with in the node. Then the area metric used is the volume of the trapezoid that is formed by moving MBR of the node from time T0 to T0+H. All the other metrics are similarly computed.

Figure 2: Area Computation in the TPR-Tree: The actual data objects are shown as shaded boxes. The “area” computation is the volume of the three-dimensional shape shown in the figure.

The insert algorithm chooses a node such that the expansion in volume is the smallest at non-leaf nodes and the expansion in integrated perimeter is the smallest at the leaf node level. When such a node is full, it is split similar to R*-Tree. Now instead of just sorting boundaries of elements, the velocity vectors are also sorted to choose the best distribution of the elements.

The TPR-Tree inherits all the issues related to the R*-Tree such as overlap and dead space. Since the positions and the velocities are estimated and can change, the optimal combination of elements can not be maintained at all times in the future.

3.2 TPR*-Tree

The recently proposed TPR*-tree [26] provides a number of optimization to the basic TPR-tree algorithms.

Prasad: This part still needs some work, as it needs to clear convey the key insight of the TPR*-tree algorithm regarding using the PQ to find the optimal leaf. Feel free to use a figure to explain this clearly. The insert algorithm of TPR*-tree recognizes that a local optimal solution at a level can be from a broken-tie (two elements have same deterioration) and that the sub-elements of that element may not be optimal. It proposes to traverse down the tree to leaf level by maintaining a priority queue of deteriorations, an optimal node can be determined. The authors argue that the extra cost incurred in traversing can be offset by the benefits of finding an optimal node for insertion. This algorithm leads to a tighter packing of elements in nodes and thus better query performance.

The algorithm to deal with overflow nodes in TPR*-Tree is to first force reinsert and then split the node. The nodes are first sorted along all the 8(4*d) possible dimensions and first l (=30%) entries from the best possible sort are chosen for reinsert. If during the reinsert, if a node overflows then the node is split. The authors propose a heuristic to reduce the number of sorts to just one, by recognizing that the elements at leaf nodes can be assumed to be uniformly distributed, and the largest extent of all the dimension (positions and velocities) would give the best benefit.

The authors propose two complex algorithms to optimize TPR-tree algorithms at the cost of extra I/O and CPU. The authors of TPR*-tree also propose a cost model as well as hypothetical optimal tree for predictive indices using the TPR-tree style of indexing. They show that their proposed algorithmic improvement produces an index structure that is close to the optimal index. Consequently, one can conclude that the TPR*-tree is currently the best know practical indexing technique for predicted trajectories.

Figure 3: TPR*-tree figure goes here

4. STRIPES

This section presents the techniques to represent moving objects, the index structure and algorithms used to query, insert and delete data points in STRIPES. We start with one-dimensional moving objects, and then extend to two dimensions and further to d-dimensions. In this section, we will use the notations described in Table 1.

Table 1. Notations

Notation / Description
/ number of dimensions in real space
D / Number of dimensions in dual (transformed) space (=2d)
L / Index lifetime
/ Reference time, a.k.a. initialization time of an index
/ Vector of maximum velocities
/ Maximum absolute velocity value in dimension
/ Maximum position value of a moving object in the dimension
/ Position vector of an object in original space
/ Velocity vector of an object in original space
/ Reference position of an object in dimension of original space
/ Position of an object in dimension of original space
/ Velocity of an object in dimension of original space
/ Position vector of an object in transformed dual space
/ Velocity vector of an object in transformed dual space
/ Reference position of an object in plane of transformed dual space
/ Total number of mobile objects
/ Page capacity (number of objects per page)
= / Minimum number of pages to store the mobile objects
K / Number of objects reported by a query
/ Minimum number of I/O’s to report the query answer
/ Non-leaf node fanout

4.1 Representing Moving Objects using Dual Transform

The STRIPES index represents the moving object in a dual transformed space.

The basic idea of a dual transform in this case is to transform a linear trajectory defined by equation in d+1-dimensional space ( being the additional dimension) into a point () in 2d-dimensional dual space, where and are the transformed velocity and reference position vectors. We incorporate both negative and positive values for velocity by applying the following transform: given , the velocity and position vectors of an object, the corresponding transformed velocity and reference position vectors are calculated as follows:

Thus in the range for is and the range for is .

Since time is monotonically increasing, the value of is not bounded, and also in the discussion of queries in the next section, the slope of the lines bounding query regions can be potentially infinite as time advances far into the future relative to . This presents a potential problem with the accuracy and functionality of the index structure.

To solve this problem, we apply the assumptions that objects must (i) issue an update when they cross over the boundaries (ii) issue an update periodically to keep alive with the system, and employ a dual-index system where we keep two distinct index structures in the system, starting from time . We empirically assign the same lifetime value to each of the index structures, which also becomes the enforced update period. We then enforce the following rules: starting from , the reference time of the first index is , and it indexes objects that issue updates within the period , and the second index has a reference time that contains the objects that issue updates within the period . Since an update consists of the deletion of the old entry and the insertion of the new entry, when an update with timestamp > comes in, we are sure that the first index is either empty, given all objects in it have at least issued an update with timestamp within , which means they are removed from the first index and inserted into the second one, given the periodic update rules, or that it is not empty because it contains objects that failed to issue an update within the required time frame, which we treat as expired. At this time, we clear the first index structure and update its to be and insert the new updates with timestamps within , so on and so forth. This way each time we create a new index structure we remove an empty or expired one. It is easy to see that now the range for in each of the indexes is . To simplify the computation of index entry coordinates, we add to at transform time and convert the range to , thus the transform equation becomes:

And the linear motion equation becomes:

4.2 Index Structure

The STRIPES index is essentially a disk-based multi-dimensional bucket quadtree. Each of the dual planes is equally partitioned into 4 quads, resulting in altogether partitions that we call grids, the fanout of non-leaf nodes is thus . Each leaf node consists of an integer level indicator (root node has level 0), a pointer to its parent non-leaf node, the grid it is associated with, and an array of objects that each consists of a unique object ID and the transformed tuple. Each non-leaf node consists of an integer level indicator, a pointer to its parent non-leaf node (that of the root node is null), and an array of length of pointers to its children nodes, be them non-leaf or leaf nodes.