The DyGeo Model – Dynamic Geospatial Database Model: Advancing Spatio-Temporal Database Model

Karine Reis Ferreira1, Gilberto Camara1, Antonio Miguel Vieira Monteiro1

1DPI – Image Processing Division, INPE – National Institute for Space Research,

São José dos Campos, Brazil

{karine, gilberto, miguel}@dpi.inpe.br

Abstract. Most existing spatio-temporal database models are designed for a particular set of applications. There is therefore a need for a more general model that is not application-oriented and can represent and handle different kinds of dynamic geospatial data, including data generated by geosensors, mobile devices and remote sensing image platforms. This work proposes such a model, called the Dynamic Geospatial Database Model (the DyGeo Model), to represent and handle spatio-temporal data. In this paper, a DyGeo algebra is defined based on a set of six data types and a set of operators. The DyGeo data types (observations, interpolator, time series, trajectory, field and dynamic field) cope with different dynamic geospatial data sources, and the DyGeo operators can express complex spatio-temporal queries on these data.

Keywords: spatio-temporal database model; algebra for spatio-temporal data; spatio-temporal queries; dynamic geospatial data.

1 Introduction

The recent technological advances in geospatial data collection, such as Earth observation and GPS satellites, wireless and mobile computing, radio-frequency identification (RFIDs) and sensor networks, have motivated new types of applications which handle spatial information. Examples include recording of animal movements, transportation systems, monitoring oil slicks on the ocean, and tracking changes in the landscape. To meet this demand, it is necessary to represent spatio-temporal information in spatial databases and geographical information systems (GIS). According to Worboys[1], there are four stages in introducing temporal capacity into GIS: (0) static GIS, (1) temporal snapshots, (2) object change, and (3) events, actions and processes. He argues that most current proprietary technologies are in stage zero, that is, they do not deal with spatio-temporal data. We consider that this situation is partly due to the lack of consensus on how to represent spatio-temporal data in computational systems.

Static 2D geospatial information is represented in GIS following well-established ideas rooted in the centuries-old tradition of cartography. We have grown familiar with the abstractions involved in map-making, which include projecting the Earth's surface into two dimensions and assigning fixed boundaries to geospatial objects. Additionally, important research results provided the principles for dealing with 2D static data. The scientific basis for 2D static GIS includes object-based and field-based models[2] [3], topological operators[4], as well as spatial indexing and spatial joins[5]. In recent years, database management systems (DBMS) have been extended to handle 2D static geospatial information and there has been a major effort to standardize the basic components for such data[6].

However, when we deal with spatio-temporal data, there is no consensus on how to model and handle it. There are many proposals of spatio-temporal database models, including STOM - Spatio-Temporal Object Model[7], ESTDM - Event Oriented ST Data Model[8], Three-Domain Model [9], Moving Object Model[10], GEM - Geospatial Events Model [11] and the recent standard for moving features, called Moving Feature Model[12]. Nevertheless, Pelekis et al.[13] consider that existing spatio-temporal models are mostly application-specific, each of them being focused on one aspect of spatio-temporal data. For example, these models are specific to represent either fields, such as ESTDM and Hierarchal, or objects which vary over time. Besides that, there is a subset of models specialized in representing objects whose geometries change continuously over time, that is, objects in movement, such as Moving Object and Moving Feature.

One of the challenges of handling spatio-temporal data is the diversity of applications. An important subset of these applications deals with moving objects, for uses such as transportation, location-based services, and animal-tracking systems. Given the commercial importance of location-based services, there has been much research in this area. One of the most promising approaches is the formal algebra for moving objects proposed by Erwig et al.[14], which uses the core idea of trajectory and associated operations. However, according to Paton et al. [15], spatio-temporal databases are not just about moving objects. Many applications need to model changes that are not directly associated with movement, such as cadastral systems, geosensors, and environmental change monitoring and modelling.

Considering the current research status, this work proposes a more general and not application-oriented model, called the Dynamic Geospatial Database Model (DyGeo Model), to represent and handle spatio-temporal data. In this paper, the DyGeo data types and operators are defined by using an algebraic formalism. The DyGeo algebra is based on six data types (observations, interpolator, time series, trajectory, field and dynamic field), which are able to represent different kinds of dynamic geospatial data, including those generated by geosensors, mobile devices, remote sensing images and environmental change monitoring. Besides the data types, the new model proposes a set of operators that are able to express many kinds of important queries on these dynamic geospatial data.

In what follows, we review related work on spatio-temporal database models in Section 2. Section 3 defines the DyGeo data types by using an algebraic formalism. Section 4 presents the DyGeo operators, their syntactic specifications and axioms, as well as examples of use. Finally, the last section concludes this work.

2 Related Work

There are many proposals of spatio-temporal database models in GIS literature, including STOM model[7], ESTDM model[8], Three-Domain model [9], Moving Object model[10], GEM model [11] and the recent standard for moving features, called Moving Feature model[12].

The STOM model defines two basic spatio-temporal data types, ST-simplexes and ST-complexes, and a set of operations over them, such as ST-Union, ST-Intersection and ST-Difference. It focuses on representing objects whose spatial attributes change over time in a discrete way, without considering changes in non-spatial attributes. The ESTDM model is specific to representing changes in raster data. Its main idea is to group changes by time of occurrence, ordering changes in locations within a predetermined geographical area. The time associated with each change, called an event, is stored in increasing order from the initial time t0 to the latest time tn. The set of changes Ci recorded for any time ti consists of each location (x, y) which changed since ti-1, together with its new value v.

The Three-Domain model focuses mainly on how to represent objects which vary over time in a relational database system, by using normalized tables and a spatial graph, and on how to query them by using SQL. The proposed database schema consists of four tables, one for each domain (semantic, temporal and spatial) and another for the domain link. It is a simple model, which does not define spatio-temporal data types and operations. It only uses the data types and query language provided by database management systems.

The Moving Object model defines a robust algebra, with data types and operations, to deal with moving objects such as cars, aircraft, ships, mobile phone users, polar bears, and oil spills in the sea. The authors propose an algebra with two main data types, moving points and moving regions, and a set of auxiliary types, such as moving real and moving int. Besides that, this algebra defines a set of operators over these data types, such as trajectory, distance, direction, and velocity. The GEM model introduces an event concept and relationships between events and objects in a model based on spatial objects. It defines two kinds of relationships, object-event and event-event, following the idea that an event can affect or be associated with one or more objects or events of different types. Examples of object-event relationships are splitting and merger, and examples of event-event relationships are initiation and termination. However, it is a model which defines only data types but not operations over them.

The Moving Feature Model, proposed by the International Organization for Standardization (ISO), defines a conceptual schema for moving features. The term feature refers to an abstraction of real world phenomena, and moving feature refers to features whose geometries move over time. This schema includes a set of classes, attributes, associations, and operations which provides a common conceptual framework to deal with feature geometry which moves as a rigid body. Therefore, it supports changes of location, translation and rotation of a feature, but not other types of change, such as deformation of the feature or changes in its non-spatial attributes.

Each of these models focuses on particular aspects of spatio-temporal data. For example, they are specific to representing either fields, such as ESTDM, or objects which vary over time. Besides that, there is a subset of models specialized in representing objects whose geometries change continuously over time, that is, objects in movement, such as Moving Object and Moving Feature.

3 The DyGeo – Dynamic Geospatial Database Model – Data Types

This work presents an algebra to formally define the data types and operators proposed by the Dynamic Geospatial Database Model. In this section, each DyGeo data type is defined and illustrated with simple examples.

3.1 Primitive Data Types

In the DyGeo model, there are four low-level groups of data types: basic, temporal, spatial and null. Basic data types, called A, are Integer, Real, Boolean, and String. Temporal data types, called T, are Instant, which represents a single time instant, and Period, which represents a time interval composed of two time instants. Spatial data types, called S, are Point, Line, Polygon, Cell, Multipoint, Multiline, Multipolygon, and Multicell. Finally, the null data type group, called N, contains only Null. Formally:

A = {Integer, Real, String, Boolean}

S = {Point, Line, Polygon, Cell, Multipoint, Multiline, Multipolygon, Multicell}

Instant = Integer

Period = (Instant, Instant)

T = {Instant, Period}

N = {Null}

Based on these low-level data types, the DyGeo model proposes six data types which are defined in the following sections: Observations, Interpolator, TimeSeries, Trajectory, Field and DynamicField.
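As an informal illustration, the primitive type groups of Section 3.1 could be sketched in Python as follows. This is only a sketch under stated assumptions: the names Basic, Temporal, Spatial and the helper period are hypothetical, only two of the eight spatial types are shown, and the check that a period ends no earlier than it starts is our own addition, not part of the DyGeo definition.

```python
from dataclasses import dataclass
from typing import Tuple, Union

# Basic types A: Python built-ins stand in for Integer, Real, String, Boolean.
Basic = Union[int, float, str, bool]

# Temporal types T: Instant = Integer and Period = (Instant, Instant).
Instant = int
Period = Tuple[Instant, Instant]
Temporal = Union[Instant, Period]


@dataclass(frozen=True)
class Point:
    """A 2D point; one of the spatial types in S."""
    x: float
    y: float


@dataclass(frozen=True)
class Polygon:
    """A polygon given by its boundary vertices; another spatial type in S."""
    ring: Tuple[Point, ...]


# Spatial types S (only Point and Polygon are sketched here).
Spatial = Union[Point, Polygon]


def period(start: Instant, end: Instant) -> Period:
    """Build a Period from two instants (the ordering check is an assumption)."""
    if end < start:
        raise ValueError("a period must not end before it starts")
    return (start, end)
```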

3.2 Observations

In the real world, continuous phenomena are often measured through a set of discrete observations. For example, in order to study the temperature variation in a specific city, we can observe the temperature at different time instants and in different locations within the city limits. Another example is animal tracking, which is often represented by a set of observations, each of which records the animal location at a specific time instant. Likewise, the Amazon deforestation process is measured by observing deforested regions at distinct times.

According to Fowler[16], “an observation is an act associated with a discrete time instant or period through which a number, term or other symbol is assigned to a phenomenon.” Based on this definition, this work defines an observation as “an act associated with a discrete position in time or space through which a number, term or other value is assigned to a phenomenon”. Therefore, the DyGeo model proposes an Observation data type, which is a tuple composed of a position (Position) and a value (Value), where the position can be of a spatial (S) or temporal (T) data type and the value can be of any data type. Besides that, the Observations data type is defined as a set of Observation:

Observation(Position, Value) = (Position, Value) | Position ∈ {S, T}

Observations(Position, Value) = {Observation(Position, Value)}

A constructor operation is responsible for building any instance of the type being defined, and a type can have more than one constructor[17]. So, the DyGeo model proposes the following constructors for the Observation and Observations types:

observation: (Position, Value) → Observation(Position, Value)

observations: {(Position, Value)} → Observations(Position, Value)

Regarding the DyGeo algebra notation, all data type names begin with an uppercase letter, whereas all operator names begin with a lowercase letter.
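The Observation and Observations types and their constructors can be sketched in Python as shown below. This is a minimal sketch, not a reference implementation: it assumes that positions and values are hashable so that a set of observations can be held in a frozenset, and it follows the naming convention above (uppercase type names, lowercase constructor names).

```python
from dataclasses import dataclass
from typing import FrozenSet, Generic, Hashable, Iterable, Tuple, TypeVar

P = TypeVar("P", bound=Hashable)  # Position: a spatial or temporal value
V = TypeVar("V", bound=Hashable)  # Value: any (hashable) value


@dataclass(frozen=True)
class Observation(Generic[P, V]):
    """A tuple (Position, Value)."""
    position: P
    value: V


def observation(position: P, value: V) -> Observation[P, V]:
    """Constructor: (Position, Value) -> Observation(Position, Value)."""
    return Observation(position, value)


def observations(pairs: Iterable[Tuple[P, V]]) -> FrozenSet[Observation[P, V]]:
    """Constructor: {(Position, Value)} -> Observations(Position, Value)."""
    return frozenset(Observation(p, v) for p, v in pairs)


# Temperature readings indexed by time instant (illustrative values only).
readings = observations([(4, 20.5), (8, 22.0), (4, 20.5)])
```

Because Observations is a set, the duplicated pair (4, 20.5) is stored only once.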

3.3 Interpolator

Since continuous processes are represented by a set of discrete observations, interpolation functions are widely used in order to estimate values associated to non-observed positions in time or space. Thus, the DyGeo model proposes a data type called Interpolator which is an interpolation function able to estimate a value in any non-observed position.

Given a set of observations (Observations) and a position (Position), the interpolator returns a value (Value) associated with this position. If there is an observation that associates a value with the given position, it returns the observed value. Otherwise, it computes an estimated value for the given, non-observed position:

Interpolator(Position, Value): Observations(Position, Value) × Position → Value

The central idea in this work is to separate interpolators from observations. Therefore, different kinds of interpolators can be defined and used over the same set of observations. For example, given two car locations, one observed at time instant 4 and another at instant 8, shown in Fig. 1(a), we can use different kinds of interpolators in order to estimate the car location at the non-observed time 6. We can use a linear interpolator, which considers the car to be moving along a straight line between the two observed locations at a constant velocity, as shown in Fig. 1(b). On the other hand, we can use another interpolator which takes a street map into account in its estimation, as illustrated in Fig. 1(c).

Fig. 1. Interpolators.
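The separation between observations and interpolators can be sketched in Python as follows. Both functions below share the signature Observations × Position → Value and are applied to the same observations. The coordinates of the car are hypothetical, and nearest_observed is our own stand-in for a second interpolator; the street-map interpolator of Fig. 1(c) is not reproduced here.

```python
from typing import Dict, Tuple

Instant = int
Point = Tuple[float, float]


def linear(obs: Dict[Instant, Point], t: Instant) -> Point:
    """Return the observed location at t, or a straight-line, constant-velocity estimate."""
    if t in obs:
        return obs[t]
    earlier = [u for u in obs if u < t]
    later = [u for u in obs if u > t]
    if not earlier or not later:
        raise ValueError("t lies outside the observed time range")
    t0, t1 = max(earlier), min(later)
    (x0, y0), (x1, y1) = obs[t0], obs[t1]
    w = (t - t0) / (t1 - t0)  # fraction of the way from t0 to t1
    return (x0 + w * (x1 - x0), y0 + w * (y1 - y0))


def nearest_observed(obs: Dict[Instant, Point], t: Instant) -> Point:
    """Return the location observed closest in time to t (ties go to the earlier instant)."""
    nearest = min(obs, key=lambda u: (abs(u - t), u))
    return obs[nearest]


# Two observed car locations, at instants 4 and 8 (hypothetical coordinates).
car = {4: (0.0, 0.0), 8: (4.0, 2.0)}

print(linear(car, 6))            # estimate halfway along the segment
print(nearest_observed(car, 7))  # location observed at instant 8
```

Swapping one interpolator for the other changes the estimate at non-observed instants without touching the observations themselves.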

3.4 Time Series

In this work, we define a time series as a function from time instant (Instant) to a value of a basic type (A), such as integer, real and string (Instant→A). For example, the temperature variation over time can be considered a time series. In this case, this variation can be represented by a time series from time instant (Instant) to values of real type (Real).

So, the DyGeo model proposes a type called TimeSeries that is composed of a set of observations, where the position is a time instant (Instant) and the value is of a basic data type (A), and of an interpolator (Interpolator). This interpolator is able to estimate a value at any non-observed time instant of the time series:

TimeSeries()= (Observations(Instant, ),

Interpolator(Instant, ) )  A

Besides that, a TimeSeries constructor operation is defined:

timeSeries: Observations(Instant, α) × Interpolator(Instant, α) → TimeSeries(α)

Fig. 2 illustrates an example of a time series generated by an egg trap from SAUDAVEL, a Dengue Fever monitoring and control project. This project brings together universities and research institutes in Brazil in order to build a surveillance system for the monitoring, control, warning and intervention of Dengue Fever[18]. The central experiment of this project was carried out in Recife, Brazil. It consists mainly of distributing egg traps for Aedes aegypti and Aedes albopictus mosquitoes in different locations around the city and counting the number of mosquito eggs found in each trap weekly. Then, this data is processed together with environmental information, resulting in risk maps for public health interventions. In this example, each egg trap has a fixed location and a time series mapping each week to the number of eggs collected in that week.

Fig. 2 shows a set of egg traps (represented by black points) in a district of Recife called “Engenho do Meio” and a time series generated by one egg trap. This time series represents the number of collected eggs (y axis) over time (x axis). In the DyGeo model, this time series can be represented by a TimeSeries(Integer) with an Interpolator able to estimate the number of mosquito eggs at any non-observed time.

Fig. 2. A time series associated to an egg trap.
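A TimeSeries(Integer) such as the egg trap series could be sketched in Python as below. The sketch rests on several assumptions: the weekly egg counts are invented for illustration and are not SAUDAVEL data, the method value_at is a hypothetical helper rather than a DyGeo operator, and last_observed (carry the most recent count forward) is just one possible interpolator for integer counts.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Instant = int  # here: week number
Interpolator = Callable[[Dict[Instant, int], Instant], int]


def last_observed(obs: Dict[Instant, int], t: Instant) -> int:
    """Return the value observed at t, or else the most recent earlier observation."""
    if t in obs:
        return obs[t]
    earlier = [u for u in obs if u < t]
    if not earlier:
        raise ValueError("no observation at or before t")
    return obs[max(earlier)]


@dataclass
class TimeSeries:
    """TimeSeries(Integer): a set of observations plus an interpolator."""
    observations: Dict[Instant, int]
    interpolator: Interpolator

    def value_at(self, t: Instant) -> int:
        # Delegate to the interpolator, which handles observed instants too.
        return self.interpolator(self.observations, t)


def time_series(observations: Dict[Instant, int],
                interpolator: Interpolator) -> TimeSeries:
    """Constructor: Observations x Interpolator -> TimeSeries."""
    return TimeSeries(dict(observations), interpolator)


# Invented weekly egg counts for one trap; week 3 was not observed.
eggs = time_series({1: 12, 2: 30, 4: 18}, last_observed)
print(eggs.value_at(3))  # estimated from the count of week 2
```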

3.5 Trajectory

In this work, a trajectory represents a spatial variation over time, that is, it is a function from time instant (Instant) to space (S) data type (Instant→S). An example of a trajectory is animal tracking, where the spatial location of an animal changes over time. Another example is the evolution over time of a deforested area in Amazonia. In the first example, each animal location is represented by a point data type (Point), whereas, in the second one, each deforested region is represented by a polygon (Polygon). Fig. 3 illustrates an animal trajectory and a deforested area trajectory.

In order to represent trajectories, the DyGeo model proposes a type called Trajectory, which is composed of a set of observations, where the position is a time instant (Instant) and the value is of a spatial data type (S), and of an interpolator (Interpolator). An interpolator associated with a trajectory must be able to estimate a spatial value at any non-observed time instant.

Trajectory()= (Observations(Instant, ),

Interpolator(Instant, ) )  S

A Trajectory constructor operation is also defined:

trajectory: Observations(Instant, α) × Interpolator(Instant, α) → Trajectory(α)

In Fig. 3, the animal trajectory can be represented by Trajectory(Point) and the deforested area trajectory by Trajectory(Polygon). In these examples, the animal trajectory is composed of locations observed at times 10, 20, 30, 40, 50, 60, 70, 80 and 90, while the deforested area trajectory is composed of three observations, at times 2, 4, and 6. Besides that, both have an Interpolator able to estimate spatial values at any non-observed time. Fig. 1 presents two examples of interpolators for trajectories of points.

In both examples of Fig. 3, besides the trajectories, we have included areas of interest which are used to illustrate some operations.

Fig. 3. Examples of trajectories: (a) an animal trajectory and (b) a deforested area trajectory.
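A Trajectory(Point) such as the animal trajectory could be sketched in Python as follows. The sketch is given under stated assumptions: only the first three observation times (10, 20, 30) are included, the coordinates are hypothetical, observations are kept as a list sorted by instant, and location_at is an illustrative helper rather than a DyGeo operator.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Callable, List, Tuple

Instant = int
Point = Tuple[float, float]
Obs = List[Tuple[Instant, Point]]  # observations, sorted by instant


def linear(obs: Obs, t: Instant) -> Point:
    """Observed location at t, or a linear estimate between its two neighbours."""
    times = [u for u, _ in obs]
    i = bisect_left(times, t)  # index of the first instant >= t
    if i < len(times) and times[i] == t:
        return obs[i][1]
    if i == 0 or i == len(times):
        raise ValueError("t lies outside the observed time range")
    (t0, (x0, y0)), (t1, (x1, y1)) = obs[i - 1], obs[i]
    w = (t - t0) / (t1 - t0)
    return (x0 + w * (x1 - x0), y0 + w * (y1 - y0))


@dataclass
class Trajectory:
    """Trajectory(Point): a set of observations plus an interpolator."""
    observations: Obs
    interpolator: Callable[[Obs, Instant], Point]

    def location_at(self, t: Instant) -> Point:
        return self.interpolator(self.observations, t)


def trajectory(observations: Obs,
               interpolator: Callable[[Obs, Instant], Point]) -> Trajectory:
    """Constructor: Observations x Interpolator -> Trajectory."""
    return Trajectory(sorted(observations), interpolator)


# Hypothetical animal locations observed at instants 10, 20 and 30.
animal = trajectory(
    [(10, (0.0, 0.0)), (20, (2.0, 4.0)), (30, (6.0, 4.0))], linear
)
print(animal.location_at(15))  # estimate between the first two observations
```

A Trajectory(Polygon) would keep the same structure and differ only in the value type and in the interpolator supplied to the constructor.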

3.6 Dynamic Field

Before defining a dynamic field, this section defines a field. According to Worboys and Duckham[19], a field is a mapping between a location in a spatial framework and an attribute domain, that is, Field:S→A. In this definition, fields are spatially continuous, so every location in the spatial framework is associated with a set of attributes.

In order to represent fields, the DyGeo model proposes a type called Field that is composed of three parts: (1) a set of observations, where the position is of a space (S) data type and the value is of a basic (A) data type; (2) a boundary (Bound) which limits the spatial extent of the field; and (3) an interpolator (Interpolator) which is able to estimate a value associated with any non-observed location: