OSM Geoprocessing Tools and Editor Extension for ArcGIS

Purpose:

ESRI intends to develop geoprocessing tools and extensions for the ArcGIS platform that allow users to participate in the OpenStreetMap community. The primary purpose of the tools is to prepare data for the desktop editor environment. Users will be able to download data for user-specified areas of interest to the desktop and use the core ArcGIS desktop tools to manipulate the data. The created and/or modified data can then be uploaded back to the OSM server.

This is a draft document and does not guarantee a specific type of implementation. It is merely a suggestion reflecting the current point in the development cycle.

Outline of tools and extensions:

There are a number of geoprocessing tools to facilitate data download and upload operations and data preparation for the editing environment as well as for cartographic display. The goal for the tools is to streamline the data delivery and to make the editing experience of the OSM data as ‘native’ as possible to the core ArcGIS environment.

Currently there are 6 tools planned as outlined in the screen capture below.

  1. Download OSM Data
    The user specifies the download location (URL) and the area of interest. For more information about the concept of extents within the geoprocessing framework please refer to http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#/Output_Extent/001w00000009000000/

In addition, a target workspace needs to be specified where the downloaded data will be stored. Currently we plan to support file-based and enterprise geodatabases. The resulting data is sorted into 3 feature classes: points, lines, and polygons. (There is a fourth entity for relations; for more information on relations, please take a look at the discussions section.)
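As a rough illustration of the download step, the sketch below builds a bounding-box request in the style of the OSM API 0.6 `map` call and sorts a small, hand-written OSM XML sample into the three geometry groups plus relations. The function names, the sample data, and the rule "a closed way is a polygon" are assumptions made for this example only; the actual tool may classify features differently.

```python
import xml.etree.ElementTree as ET

def build_map_url(base_url, left, bottom, right, top):
    """Build an OSM API 0.6 style 'map' request for an extent in degrees."""
    return f"{base_url.rstrip('/')}/api/0.6/map?bbox={left},{bottom},{right},{top}"

# Hand-written sample, not real OSM data.
SAMPLE_OSM = """<osm version="0.6">
  <node id="1" lat="48.20" lon="11.70"/>
  <node id="2" lat="48.21" lon="11.71"/>
  <node id="3" lat="48.22" lon="11.70"/>
  <way id="10"><nd ref="1"/><nd ref="2"/><nd ref="3"/></way>
  <way id="11"><nd ref="1"/><nd ref="2"/><nd ref="3"/><nd ref="1"/></way>
  <relation id="20"><member type="way" ref="10" role=""/></relation>
</osm>"""

def sort_elements(osm_xml):
    """Sort OSM elements into points, lines, polygons, and relations."""
    root = ET.fromstring(osm_xml)
    result = {"points": [], "lines": [], "polygons": [], "relations": []}
    for node in root.findall("node"):
        result["points"].append(int(node.get("id")))
    for way in root.findall("way"):
        refs = [nd.get("ref") for nd in way.findall("nd")]
        # Simplifying assumption: a way that ends where it starts is a polygon.
        closed = len(refs) >= 4 and refs[0] == refs[-1]
        result["polygons" if closed else "lines"].append(int(way.get("id")))
    for relation in root.findall("relation"):
        result["relations"].append(int(relation.get("id")))
    return result

print(build_map_url("https://example.org/", 11.7, 48.2, 11.8, 48.3))
print(sort_elements(SAMPLE_OSM))
```

The host `example.org` is a placeholder for whatever download location the user specifies.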

I am considering allowing the additional download of all members referenced by a relation that are not part of the initial download request, but I am not sure whether the referenced data of a relation is really needed, as it is outside of the area of interest. I would like to get some feedback on that.

Good idea to at least offer this as an option. Also relevant when downloading nodes (ie. get all ways containing this node) - Mikel
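To make the question about relation members concrete, the following sketch lists the members a relation refers to that are absent from a download. The sample XML and the function name are hypothetical; the only assumption about the data is the usual `member` element with `type` and `ref` attributes.

```python
import xml.etree.ElementTree as ET

# Hand-written sample: relation 20 refers to way 11 and node 2,
# neither of which is part of this download.
SAMPLE_DOWNLOAD = """<osm version="0.6">
  <node id="1" lat="48.20" lon="11.70"/>
  <way id="10"><nd ref="1"/></way>
  <relation id="20">
    <member type="way" ref="10" role="outer"/>
    <member type="way" ref="11" role="inner"/>
    <member type="node" ref="2" role=""/>
  </relation>
</osm>"""

def missing_relation_members(osm_xml):
    """Return (type, id) pairs referenced by relations but not downloaded."""
    root = ET.fromstring(osm_xml)
    present = {(el.tag, el.get("id")) for el in root
               if el.tag in ("node", "way", "relation")}
    missing = set()
    for relation in root.findall("relation"):
        for member in relation.findall("member"):
            key = (member.get("type"), member.get("ref"))
            if key not in present:
                missing.add(key)
    return sorted(missing)

print(missing_relation_members(SAMPLE_DOWNLOAD))
```

An optional follow-up download would then only need to request the elements in this list.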

  1. Upload OSM Data
    This tool is essentially the reverse of the download tool. (Since this tool is not written yet, there are still a number of open questions concerning the implementation.)
    Once the user stops an edit session and saves the changes, the custom editor extension will determine the differences in the data (adds/updates/deletes) and store the findings in a revision table.

Rows in this table indicate that there are outstanding edits that have not been reconciled with the server. The user account will be authenticated through OAuth.
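The idea of a revision table can be sketched as a comparison of two snapshots of the data, one taken before and one after the edit session. The dictionary layout and the action names below are assumptions for illustration, not the extension's actual schema.

```python
def diff_features(before, after):
    """Compare two {osm_id: attributes} snapshots and list the edits."""
    revisions = []
    for osm_id, attributes in after.items():
        if osm_id not in before:
            revisions.append((osm_id, "add"))
        elif attributes != before[osm_id]:
            revisions.append((osm_id, "update"))
    for osm_id in before:
        if osm_id not in after:
            revisions.append((osm_id, "delete"))
    return sorted(revisions)

# Snapshot at download time and snapshot after the edit session.
before = {101: {"highway": "path"}, 102: {"amenity": "cafe"}, 103: {"shop": "bakery"}}
after = {101: {"highway": "track"}, 102: {"amenity": "cafe"}, -1: {"amenity": "bench"}}

# Each row stands for an outstanding edit not yet reconciled with the server.
for row in diff_features(before, after):
    print(row)
```

The new feature carries a negative placeholder ID here, on the assumption that locally created features are numbered that way until the server assigns a final ID.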

My current question with respect to the upload functionality is whether edits “expire” at some point in time. Imagine that you download a chunk of data and take it offline with you. After 2 weeks of working on a specific area you decide that it is time to upload and synchronize the data back to the server. What if the data content has changed dramatically within these two weeks? Is there a concept of a data lease time with respect to edits? Should updates back to the server be committed within a certain timeframe after the download operation?

There is no enforced timeframe. But when you upload, the API will compare the revision of each object, and if that has changed since download, it will report a conflict. This could happen in just under a minute, or not happen at all over the course of weeks … so conflict resolution will need to be a part of the workflow. -Mikel
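The version comparison described in this comment could be checked on the client before an upload is attempted. The sketch below assumes that the version of each object at download time is kept locally and that the current server versions have been fetched separately; both inputs and the function name are hypothetical.

```python
def find_conflicts(downloaded_versions, server_versions):
    """Return (osm_id, downloaded, current) for objects changed on the server."""
    conflicts = []
    for osm_id, downloaded in sorted(downloaded_versions.items()):
        current = server_versions.get(osm_id)
        if current is not None and current != downloaded:
            conflicts.append((osm_id, downloaded, current))
    return conflicts

# Object 2 was edited by someone else after the download.
downloaded_versions = {1: 3, 2: 1}
server_versions = {1: 3, 2: 2}
print(find_conflicts(downloaded_versions, server_versions))
```

Objects that no longer exist on the server are ignored in this sketch and would need their own handling.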


What happens if an update post is not successful? What if there is a posting conflict? Do I ignore this single upload attempt or do I abort the whole changeset operation?

You can check in JOSM for how conflicts are handled … perhaps purposely create one against a test OSM api. It's not the best interface, so thoughts there welcome generally. -Mikel

Currently there are no plans to reconcile the data that the server returns after an upload back into the geodatabase. Let’s assume that a point feature is created and assigned an initial ID of -10. After the upload the server returns the new (and final) feature ID of 324998. There are no plans to enter this information back into the database. After one upload iteration I would rather have the user do a complete data refresh and get the latest and greatest data from the server. This means that I would delete all the features as they currently exist on the desktop client and ask the user to re-download the area of interest for a new edit session.
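To make the placeholder ID example concrete, the sketch below writes a newly created point into an osmChange style document and reads the ID mapping out of a diffResult style response. The attribute set is simplified, the changeset number and coordinates are invented, and the response string is hand-written for illustration; under the approach described above the mapping would simply be discarded in favor of a fresh download.

```python
import xml.etree.ElementTree as ET

def build_osm_change(changeset_id, created_nodes):
    """Serialize newly created point features as an osmChange document."""
    root = ET.Element("osmChange", version="0.6")
    create = ET.SubElement(root, "create")
    for node in created_nodes:
        element = ET.SubElement(
            create, "node",
            id=str(node["id"]), changeset=str(changeset_id),
            lat=str(node["lat"]), lon=str(node["lon"]))
        for key, value in node["tags"].items():
            ET.SubElement(element, "tag", k=key, v=value)
    return ET.tostring(root, encoding="unicode")

def parse_diff_result(diff_xml):
    """Map placeholder IDs to the final IDs reported by the server."""
    return {int(el.get("old_id")): int(el.get("new_id"))
            for el in ET.fromstring(diff_xml)
            if el.get("new_id") is not None}

# Hypothetical new point with the placeholder ID from the text.
change_xml = build_osm_change(
    42, [{"id": -10, "lat": 48.25, "lon": 11.75, "tags": {"amenity": "bench"}}])
print(change_xml)

# Hand-written response in the shape of a diffResult.
response = ('<diffResult version="0.6">'
            '<node old_id="-10" new_id="324998" new_version="1"/>'
            '</diffResult>')
print(parse_diff_result(response))
```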

  1. OSM Feature Symbolizer

This tool is designed to prepare the data for the editing experience. One of the enhancements for ArcGIS 10 is the introduction of feature templates http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#//00qp00000006000000.htm

The tool sets up a default symbology that drives the feature templates and, as such, the editing experience. Inputs are the OSM feature dataset holding the downloaded data and three pre-defined template layers containing the starting symbology. The users can modify this symbology to their own needs and requirements. The results are layers that can be fed directly into the editor.

Ideally all tools that are symbolizing OSM data could start to share those in a common, open format. Even within OSM, the tools like JOSM, Mapnik, and the wiki manually replicate these settings. I don't expect a solution now, but just something to keep in mind. -Mikel

  1. OSM Attribute Selector
    One of the strong features of OSM is its flexible data structure. This is a bit of an issue, as most GIS databases are optimized for normalized data structures and in general follow a common data model. The approaches I have seen for importing OSM data into a classic GIS structure flatten all key/value pairs and treat them as unique attributes throughout a common theme. If the theme is sorted by geometry, it is not uncommon to have more than 100 attributes in a single feature class. In my opinion (from the point of view of a developer and a simple-minded user) this is not acceptable.

The OSM download tool will auto-generate certain attributes with the editor in mind. The proposed approach for auto-generated fields is derived from user-defined common OSM themes (http://wiki.openstreetmap.org/wiki/Map_Features) like:

-  Highway

-  Amenities

-  Railway, etc.

On top of that the tools store common information like version, user, uid, changeset, etc. on each feature. This information is considered metadata and is of little use to most users.

All of the OSM key/value pairs are stored in a single field, either a BLOB field for local geodatabases or an XML field for enterprise geodatabases.

That's fine, but you will need to send back all key/value pairs when uploading, else they will be lost. -Mikel
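The round trip raised in this comment can be sketched as follows: all key/value pairs are serialized into one container field, and on upload the stored tags are merged with whatever the user edited in the regular attribute fields, so that unedited tags are sent back unchanged. The XML layout of the container and the function names are assumptions; the actual BLOB or XML field format is not specified here.

```python
import xml.etree.ElementTree as ET

def tags_to_xml(tags):
    """Serialize all key/value pairs into a single container string."""
    root = ET.Element("tags")
    for key in sorted(tags):
        ET.SubElement(root, "tag", k=key, v=tags[key])
    return ET.tostring(root, encoding="unicode")

def xml_to_tags(xml_text):
    """Read the key/value pairs back out of the container string."""
    return {tag.get("k"): tag.get("v")
            for tag in ET.fromstring(xml_text).findall("tag")}

def tags_for_upload(stored_xml, edited_fields):
    """Merge edited attribute fields into the stored tags for upload."""
    tags = xml_to_tags(stored_xml)
    for key, value in edited_fields.items():
        if value in (None, ""):
            tags.pop(key, None)  # a cleared field removes the tag
        else:
            tags[key] = value
    return tags

stored = tags_to_xml({"highway": "residential", "name": "Main Street", "maxspeed": "30"})
# Only 'highway' was edited; 'name' and 'maxspeed' must still be sent back.
print(tags_for_upload(stored, {"highway": "tertiary"}))
```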

This table shows the currently planned auto-generated fields:

Attribute Name / Attribute Type
OBJECTID / ESRI unique ID field
SHAPE / ESRI Geometry field
highway / OSM Theme type (user defined)
barrier / OSM Theme type (user defined)
waterway / OSM Theme type (user defined)
railway / OSM Theme type (user defined)
aeroway / OSM Theme type (user defined)
aerialway / OSM Theme type (user defined)
power / OSM Theme type (user defined)
man_made / OSM Theme type (user defined)
building / OSM Theme type (user defined)
leisure / OSM Theme type (user defined)
amenity / OSM Theme type (user defined)
shop / OSM Theme type (user defined)
tourism / OSM Theme type (user defined)
historic / OSM Theme type (user defined)
landuse / OSM Theme type (user defined)
military / OSM Theme type (user defined)
natural / OSM Theme type (user defined)
geological / OSM Theme type (user defined)
route / OSM Theme type (user defined)
boundary / OSM Theme type (user defined)
place / OSM Theme type (user defined)
OSMID / Unique OSM ID
TagContainer / Debug field (will not be part of release)
osmTags / OSM key/value pair storage container
osmuser / OSM user name (metadata)
osmuid / OSM user ID (metadata)
osmvisible / OSM visible flag (metadata)
osmversion / OSM version (metadata)
osmchangeset / OSM changeset ID (metadata)
osmtimestamp / OSM timestamp (metadata)
osmMemberOf / Relations that the feature is a member of
osmSupportingNode / Has attributes (key/value pairs) yes or no

Have you looked at the Humanitarian Data Model? I wonder if some of the mapping concepts there could be applied here? -Mikel

Will this list of mappings be expandable? Shareable? In different contexts, different feature types will be important. -Mikel

This storage arrangement keeps the number of fields manageable. However, some of the core GIS functionality is rendered inaccessible. For example, it is not immediately possible to label features, as the labeling information (such as name) is ‘hidden’ in a storage container. For the editing experience I expect this to have little impact, but I would imagine that once users have the OSM data within the ArcGIS environment they would like to use it for more than just editing.

Not clear to me why name must be hidden … I could imagine editors wanting to edit name … it's definitely one of the core features. -Mikel

This geoprocessing tool inspects the key/value pair storage field and creates a unique list of OSM keys. The user can then select the keys of interest, and they will be added as additional attributes of the feature class and populated with the values wherever such a key exists.
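The two steps of the attribute selector, collecting the unique keys and then promoting the selected ones to attributes, might look like the sketch below. Features are modeled as plain dictionaries with the tag container already decoded into `osmTags`; that layout and the function names are assumptions for this example.

```python
def unique_keys(features):
    """Collect the unique OSM keys found in the tag containers."""
    return sorted({key for feature in features for key in feature["osmTags"]})

def extract_attributes(features, selected_keys):
    """Add each selected key as an attribute; None where the key is absent."""
    for feature in features:
        for key in selected_keys:
            feature[key] = feature["osmTags"].get(key)
    return features

features = [
    {"OSMID": 1, "osmTags": {"highway": "residential", "name": "Main Street"}},
    {"OSMID": 2, "osmTags": {"highway": "path", "surface": "gravel"}},
]

print(unique_keys(features))
extract_attributes(features, ["name"])
print([(f["OSMID"], f["name"]) for f in features])
```

Promoting `name` in this way would make it available for labeling without flattening every key into its own field.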

  1. Add/Remove OSM Editor Extension

The way the ArcGIS desktop software is architected, there are a number of ways to extend the overall framework of ArcObjects as well as the applications (like ArcMap). The goal for the editor extension is to help streamline the user experience in capturing OSM data. Due to the flexible nature of the OSM data I decided on attaching a “behavior” to the downloaded OSM feature classes. “Behavior” means in this case that I would like to execute my editor (business) logic at certain moments and replace some of the core UI with user interface components which I think might be more suitable for traversing the rich OSM data model.

The downside is that now all clients (ArcGIS Desktops) interacting with the data are expected to know about this new behavior for OSM data and as such need to have my extension installed. In order to avoid cryptic error messages (…Unable to create object class extension COM component….), the Add and Remove tools attach the extension to, or detach it from, the created feature classes.

Once you remove the editor extension (behavior) from the feature class, all ArcGIS clients can handle the data but lose the convenient editing mechanism. In order to regain the custom experience, you add the extension again and everything is back to normal.

  1. I am also considering a tool to read standalone OSM files, but I am not sure whether that is really necessary; maybe not for the first release.

Yes, that could also be useful, but secondary. -Mikel

Workflows:

In the first step the user downloads OSM data from a given server and loads it into an ArcMap session with the proper feature templates attached. This step can be automated by combining the download and the symbolizer tool inside a model. A model is the ArcGIS framework's way of capturing and automating workflows in a visual modeling environment. In the screen capture below, the blue ovals are inputs, the yellow entities are functions (such as the custom OSM data download function), and the green ovals are the results of a processing step. Functions can be chained together such that the output of the first function is the input to the second function. A simple model combining download and symbolization is outlined in the screen capture below.

The results of a download (an area north east of Munich, Germany) are shown below.

Let’s take a brief look at the generation of templates and symbology. The current software reads the map feature templates from an XML file. The file is structured as follows:

    <domain name="highway">
      <domainvalue value="motorway">
        <description>A restricted access major divided highway, normally with 2 or more running lanes plus emergency hard shoulder. Equivalent to the Freeway, Autobahn, etc.</description>
        <geometrytype>line</geometrytype>
      </domainvalue>
      <domainvalue value="motorway_link">
        <geometrytype>line</geometrytype>
        <description>The link roads (sliproads/ramps) leading to/from a motorway from/to a motorway or lower class highway. Normally with the same motorway restrictions.</description>
      </domainvalue>
      <domainvalue value="trunk">
        <geometrytype>line</geometrytype>
        <description>Important roads that aren't motorways. Typically maintained by central, not local government. Need not necessarily be a divided highway. In the UK, all green signed A roads are, in OSM, classed as 'trunk'.</description>
      </domainvalue>
      <domainvalue value="trunk_link"> …..

The XML structure represents the core OSM features and their geometric representation. These are the entries that can be captured, and a feature template will therefore be generated for each of them.
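As a sketch of how such a file could drive template generation, the snippet below reads a shortened, hand-written sample with the same `domain`, `domainvalue`, and `geometrytype` elements and lists one candidate template per geometry type. The `domains` root element, the `amenity` entry, and the function names are assumptions added to make the sample self-contained.

```python
import xml.etree.ElementTree as ET

# Shortened, hand-written sample; the wrapping root element is assumed.
TEMPLATE_XML = """<domains>
  <domain name="highway">
    <domainvalue value="motorway">
      <description>A restricted access major divided highway.</description>
      <geometrytype>line</geometrytype>
    </domainvalue>
    <domainvalue value="motorway_link">
      <geometrytype>line</geometrytype>
      <description>Link roads leading to or from a motorway.</description>
    </domainvalue>
  </domain>
  <domain name="amenity">
    <domainvalue value="cafe">
      <geometrytype>point</geometrytype>
    </domainvalue>
  </domain>
</domains>"""

def list_templates(xml_text):
    """Return (theme, value, geometry type) for every capturable entry."""
    templates = []
    for domain in ET.fromstring(xml_text).iter("domain"):
        for value in domain.findall("domainvalue"):
            for geometry in value.findall("geometrytype"):
                templates.append(
                    (domain.get("name"), value.get("value"), geometry.text.strip()))
    return templates

def filter_by_theme(templates, theme):
    """Keep only the templates of one map theme, e.g. 'highway'."""
    return [template for template in templates if template[0] == theme]

templates = list_templates(TEMPLATE_XML)
print(templates)
print(filter_by_theme(templates, "amenity"))
```

Filtering by theme in this way mirrors the option of narrowing a long template list down to a single map theme.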

Is this XML format new? -Mikel

The symbology itself is currently stored in a lyr file and it can be used as an input for the symbolizer tool. The users can modify this symbology if they want to.

The next step in the workflow is to start an edit session. Once the session is started the user is presented with the Create Feature window. The window lists all the features that have a feature template and it suggests a digitizing tool to capture the feature type. The default behavior of showing all 400 OSM features can be a bit overwhelming at that point but the user can filter by layer type (i.e. map themes) or by filter types.