ExtractAirData Script update

Author: Koldo Goñi

Updated: 10/10/2013

Task description

Build Python script for data download. The download service shall work in the same way as the current data download service for airquality data:

discomap.eea.europa.eu/ArcGIS/rest/services/ExtractData/Air/GPServer/NRTAirData

We should consider if we should add additional parameters. It will also be beneficial if we can ensure that the output file (csv-file) is posted back. It’s a problem for current users that the filename changes.

Testing environment

Software requirements

The script requires next software being installed on the computer:

·  ArcGis 10.1 (python 2.7)

·  Pyodbc library

Data source

For testing purposes the connection string is pointing to a local copy of AirQuality_E2a DDBB.

In the current setup an intermediary DDBB called “AirQualityExport” is used. On this DDBB there are the tables, the view and the stored procedure used on the downloading script.

In the new setup, downloaded data is extracted from a view located on the E2a database directly.

The “tokens” and “tokenlog” tables must be created on the DDBB because the script provides a token management protocol that works with these two tables (appendix 1).

Store procedure “TokenLogInsert” is also created. This SP is called on the script to create a registry of each requested data (appendix 2).

As it is said before, export table is pointing to a new view with the same name than the original, “stationMeasurementsExport” (appendix 3).

The information provided on the current setup doesn’t match the new DDBB fields. For this reason the data provided by this query is different from the current setup.

The information is going to be defined in the future and, in consequence, this view is going to change to accomplish final requirements.

Tokens

The E2a database lack the “tokens” and “tokenLog” tables, and “tokenLogInsert” Stored procedure. Those tables and stored procedure must be created on the E2a DDBB for this script to work.

Zip and Output file name

The part of the script that generates the zip file containing csv file with data is changed. The “make_archive” function of shutil module is used. This function is implemented from python 2.7 version and successive.

The use of this function allows simplifying the zipping process. On this zip file it’s included just the content of the scratch folder and not the folder itself. We find this more convenient because the scratch folder name returned by “createFolderInScratch” function returns a variable folder name.

The output zip file name is an input parameter of the tool. In the case that this parameter is not introduced, the default value is the type of requested pollutant.

Other parameters

In the current script, file format parameter by default returns a csv tuple. On the future XML format is going to be implemented.

Another question that rose up with the new database is that the variables on parameter “pollutant” are more than one hundred. Must they be included on the combo for this parameter on the tool?

Pollutants in Observation-Property tables:

3

No. / Pollutant /
1 / 1,2,3-Trimethylbenzene (air)
2 / 1,2,4-Trimethylbenzene (air)
3 / 1,3,5-Trimethylbenzene (air)
4 / 1.3 Butadiene (air)
5 / 1-Butene (air)
6 / 1-Pentene (air)
7 / Acenaphthene (air+aerosol)
8 / Acenaphtylene (air+aerosol)
9 / Ammonium in PM2.5 (aerosol)
10 / Anthanthrene (air+aerosol)
11 / Anthracene (air+aerosol)
12 / Anthracene (precip+dry_dep)
13 / Arsenic (aerosol)
14 / Arsenic (precip+dry_dep)
15 / Benzene (air)
16 / Benzo(a)anthracene (precip+dry_dep)
17 / Benzo(a)anthracene in PM10 (aerosol)
18 / Benzo(a)anthracene in PM10 (air+aerosol)
19 / Benzo(a)pyrene (precip+dry_dep)
20 / Benzo(a)pyrene in PM10 (aerosol)
21 / Benzo(a)pyrene in PM10 (air+aerosol)
22 / Benzo(e)pyrene (air+aerosol)
23 / Benzo(ghi)perylene (air+aerosol)
24 / Benzo(ghi)perylene (precip+dry_dep)
25 / Benzo(k)fluoranthene (precip+dry_dep)
26 / Benzo(k)fluoranthene in PM10 (aerosol)
27 / Benzo(k)fluoranthene in PM10 (air+aerosol)
28 / Biphenyl (air+aerosol)
29 / Black smoke (air)
30 / Cadmium (aerosol)
31 / Cadmium (precip+dry_dep)
32 / Calcium in PM2.5 (aerosol)
33 / Carbon monoxide (air)
34 / Chloride in PM2.5 (aerosol)
35 / Chrysene (air+aerosol)
36 / cis-2-Butene (air)
37 / Cobalt (precip)
38 / Coronene (air+aerosol)
39 / Cyklopenta(cd)pyrene (air+aerosol)
40 / Dibenz(ac+ah)anthracenes (air+aerosol)
41 / Dibenzo(ae)pyrene (air+aerosol)
42 / Dibenzo(ah)pyrene (air+aerosol)
43 / Dibenzo(ai)pyrene (air+aerosol)
44 / Elemental Carbon in PM2.5 (aerosol)
45 / Ethane (air)
46 / Ethene (Ethylene) (air)
47 / Ethyl benzene (air)
48 / Ethyne (Acetylene) (air)
49 / Fluoranthene (air+aerosol)
50 / Fluoranthene (precip+dry_dep)
51 / Fluorene (air+aerosol)
52 / i-Butane (2-methylpropane) (air)
53 / i-Hexane (2-methylpentane) (air)
54 / Indeno_123cd_pyrene (precip+dry_dep)
55 / Indeno_123cd_pyrene in PM10 (aerosol)
56 / Indeno_123cd_pyrene in PM10 (air+aerosol)
57 / i-Octane (2,2,4-trimethylpentane) (air)
58 / i-Pentane (2-methylbutane) (air)
59 / Isoprene (2-methyl-1,3-butadiene) (air)
60 / Lead (aerosol)
61 / Lead (precip+dry_dep)
62 / m,p-Xylene (air)
63 / Magnesium in PM2.5 (aerosol)
64 / Mercury (aerosol)
65 / Mercury (air+aerosol)
66 / Mercury (precip+dry_dep)
67 / N1methylnaphtalene (air+aerosol)
68 / N1methylphenanthrene (air+aerosol)
69 / N2methylanthracene (air+aerosol)
70 / N2methylnaphtalene (air+aerosol)
71 / N2methylphenanthrene (air+aerosol)
72 / n-Butane (air)
73 / n-Heptane (air)
74 / n-Hexane (air)
75 / Nickel (aerosol)
76 / Nickel (precip+dry_dep)
77 / Nitrate in PM2.5 (aerosol)
78 / Nitrogen dioxide (air)
79 / Nitrogen monoxide (air)
80 / Nitrogen oxides (air)
81 / n-Octane (air)
82 / n-Pentane (air)
83 / o-Xylene (air)
84 / Ozone (air)
85 / Particulate matter < 10 µm (aerosol)
86 / Particulate matter < 2.5 µm (aerosol)
87 / Perylene (air+aerosol)
88 / Phenanthrene (air+aerosol)
89 / Phenanthrene (precip+dry_dep)
90 / Potassium in PM2.5 (aerosol)
91 / Propane (air)
92 / Propene (air)
93 / Pyrene (air+aerosol)
94 / Pyrene (precip+dry_dep)
95 / Reactive_mercury (air+aerosol)
96 / rRetene (air+aerosol)
97 / Sodium in PM2.5 (aerosol)
98 / Sulphate in PM2.5 (aerosol)
99 / Sulphur dioxide (air)
100 / Toluene (air)
101 / Total gaseous mercury (air+aerosol)
102 / Trans-2-Butene (air)

3

“Coordinate System” parameter is useless at this moment. Do we have to provide support for coordinates transformation over the latitude and longitude fields?

Tool

A tool is created on Arctoolbox with 7 parameters, most of them optional at the moment to make test work easier.

Parameter / Format / Default value / Observations
Pollutant / String / “Ozone (air)” / They are 102 different pollutants on the DDBB
Feature format / String / "CommaSeparated CSV .csv" / Invariable at the moment is going to be implemented to accept XML format
Start date / String / datetime.utcnow() / Datetime module function for current date
Total Hours / String / “-24”
Coordinate System / String / Useless at this moment
Token / String / "0D3137D5-1DA5-4271-B2CF-6943426A4A6C"
Output zip file name / String / requested Pollutant name

Bugs & errors

The script works perfect outside ArcGis.

Under an OS W7 and ArcMap 10.1 the second times we launch the tool from ArcToolBox an error rise up. A next launching of the tool produces ArcMap crash.

Picture 1 Error after second launch of the tool

Appendix

Appendix 1. Creating the 2 tokens tables

Tokens

CREATE TABLE [dbo].[Tokens](
[token] [uniqueidentifier] NOT NULL,
[name] [nvarchar](50) NULL,
[contact] [nvarchar](100) NULL,
[description] [nvarchar](100) NULL,
CONSTRAINT [PK_Tokens] PRIMARY KEY CLUSTERED
(
[token] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

GO

ALTER TABLE [dbo].[Tokens] ADD CONSTRAINT [DF_Tokens_token] DEFAULT (newid()) FOR [token]
GO

TokenLog

USE [Airqualityexport]
GO

/****** Object: Table [dbo].[TokenLog] Script Date: 10/07/2013 13:51:42 **/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[TokenLog](
[id] [bigint] IDENTITY NOT NULL,
[token] [uniqueidentifier] NULL,
[timestamp] [datetime2](7) NULL,
[comment] [nvarchar](100) NULL
) ON [PRIMARY]

GO

ALTER TABLE [dbo].[TokenLog] WITH CHECK ADD CONSTRAINT [FK_TokenLog_Tokens] FOREIGN KEY
REFERENCES [dbo].[Tokens] ([token])
GO

ALTER TABLE [dbo].[TokenLog] CHECK CONSTRAINT [FK_TokenLog_Tokens]
GO

ALTER TABLE [dbo].[TokenLog] ADD CONSTRAINT [DF_TokenLog_timestamp] DEFAULT (getdate()) FOR [timestamp]
GO

Appendix 2. Creating “tokenLogInsert“ stored procedure

SET ANSI_NULLS ON

GO

SET QUOTED_IDENTIFIER ON

GO

-- ======

-- Author:Koldo Goñi

-- Create date: 10/10/2013

-- Description: Procedure to insert record on Tokens Log used on AirExtractData downloading

-- ======

CREATE PROCEDURE tokenLogInsert

@token VARCHAR(36) ,

@status varchar(10)

AS

BEGIN

SET NOCOUNT ON;

INSERT INTO TokenLog (token,timestamp, comment)

VALUES (@token, getdate(), @status);

END

Appendix 3. “stationMeasurementsExport” view SQL

SELECT dbo.STATION.NAME AS StationName, dbo.STATION.ID AS StationID, dbo.STATION.RESOURCE AS ProviderURL, dbo.Property.label AS Component,

dbo.Value.datetime_begin, dbo.Value.datetime_end, dbo.Value.value_numeric, dbo.Unit.label AS MeasurementUnit

FROM dbo.STATION INNER JOIN

dbo.SAMPLINGPOINT ON dbo.STATION.PK_STATION = dbo.SAMPLINGPOINT.FK_STATION INNER JOIN

dbo.Observation ON dbo.SAMPLINGPOINT.PK_SAMPLINGPOINT = dbo.Observation.fk_samplingpoint INNER JOIN

dbo.Property ON dbo.Observation.fk_property = dbo.Property.pk_Property INNER JOIN

dbo.Value ON dbo.Observation.pk_observation = dbo.Value.fk_observation INNER JOIN

dbo.Unit ON dbo.Value.fk_unit = dbo.Unit.pk_Unit

3