Meadowlands Environmental Monitoring Network Project

A Report

Vasundhara Chaudhuri Chakraborty

BACKUP_CLEAN_DATA

Name Null? Type

------

STATIONID NOT NULL NUMBER(4)

SAMPLE_DATE_TIME NOT NULL DATE

BATTERY_VOLTAGE NUMBER(6,2)

PROGRAM_SIGNATURE NUMBER(10)

AIR_TEMPERATURE NUMBER(6,2)

RELATIVE_HUMIDITY NUMBER(6,2)

RAINFALL_AMOUNT NUMBER(6,2)

SOLAR_RADIATION NUMBER(7,3)

SOLAR_RADIATION_TOTAL NUMBER(10,6)

WIND_SPEED NUMBER(7,3)

WIND_DIRECTION_DEGREES NUMBER(7,3)

WIND_SPEED2 NUMBER(7,3)

WIND_DIRECTION_RADIANS NUMBER(7,3)

WIND_DIRECTION_STDV NUMBER(7,3)

WATER_TEMPERATURE NUMBER(7,3)

CONDUCTIVITY NUMBER(7,3)

PH NUMBER(7,2)

OXIDATION_REDUCTION_POTENTIAL NUMBER(7,2)

WATER_DEPTH NUMBER(7,3)

TURBIDITY NUMBER(6,2)

DISSOLVED_OXYGEN NUMBER(6,2)

SALINITY NUMBER(6,2)

DISSOLVED_OXYGEN_PCT NUMBER(7,2)

BAROMETRIC_PRESSURE NUMBER(10,3)

WIND_SPEED_SAMP NUMBER(10,3)

WIND_DIRECTION_SAMP NUMBER(7,3)

CLEAN_DATA:

Name Null? Type

------

STATIONID NOT NULL NUMBER(4)

SAMPLE_DATE_TIME NOT NULL DATE

BATTERY_VOLTAGE NUMBER(6,2)

PROGRAM_SIGNATURE NUMBER(10)

AIR_TEMPERATURE NUMBER(6,2)

RELATIVE_HUMIDITY NUMBER(6,2)

RAINFALL_AMOUNT NUMBER(6,2)

SOLAR_RADIATION NUMBER(7,3)

SOLAR_RADIATION_TOTAL cvcNUMBER(10,6)

WIND_SPEED NUMBER(7,3)

WIND_DIRECTION_DEGREES NUMBER(7,3)

WIND_SPEED2 NUMBER(7,3)

WIND_DIRECTION_RADIANS NUMBER(7,3)

WIND_DIRECTION_STDV NUMBER(7,3)

WATER_TEMPERATURE NUMBER(7,3)

CONDUCTIVITY NUMBER(7,3)

PH NUMBER(7,2)

OXIDATION_REDUCTION_POTENTIAL NUMBER(7,2)

WATER_DEPTH NUMBER(7,3)

TURBIDITY NUMBER(6,2)

DISSOLVED_OXYGEN NUMBER(6,2)

SALINITY NUMBER(6,2)

DISSOLVED_OXYGEN_PCT NUMBER(7,2)

BAROMETRIC_PRESSURE NUMBER(10,3)

WIND_SPEED_SAMP NUMBER(10,3)

WIND_DIRECTION_SAMP NUMBER(7,3)

CONTROL_EMS1

Name Null? Type

------

V_TYPE NOT NULL VARCHAR2(8)

AIR_TEMPERATURE NUMBER(7,3)

RELATIVE_HUMIDITY NUMBER(7,3)

SOLAR_RADIATION NUMBER(7,3)

RAINFALL_AMOUNT NUMBER(7,3)

WIND_DIRECTION_SAMP NUMBER(7,3)

WIND_SPEED_SAMP NUMBER(7,3)

BAROMETRIC_PRESSURE NUMBER(7,3)

WATER_TEMPERATURE NUMBER(7,3)

CONDUCTIVITY NUMBER(7,3)

PH NUMBER(7,3)

WATER_DEPTH NUMBER(7,3)

TURBIDITY NUMBER(7,3)

DISSOLVED_OXYGEN NUMBER(7,3)

SALINITY NUMBER(7,3)

DISSOLVED_OXYGEN_PER NUMBER(7,3)

DISSOLVED_OXYGEN_P NUMBER(7,3)

CONTROL_EMS2

Name Null? Type

------

V_TYPE NOT NULL VARCHAR2(8)

SAMPLE_DATE_TIME NUMBER(7,6)

BATTERY NUMBER(7,3)

WATER_TEMPERATURE NUMBER(7,3)

CONDUCTIVITY NUMBER(7,3)

PH NUMBER(7,3)

WATER_DEPTH NUMBER(7,3)

TURBIDITY NUMBER(7,3)

DISSOLVED_OXYGEN NUMBER(7,3)

SALINITY NUMBER(7,3)

DISSOLVED_OXYGEN_PCT NUMBER(7,3)

EMS1_DATA

Name Null? Type

------

STATIONID NOT NULL NUMBER(4)

SAMPLE_DATE_TIME NOT NULL DATE

BATTERY_VOLTAGE NUMBER(6,2)

PROGRAM_SIGNATURE NUMBER(10)

AIR_TEMPERATURE NUMBER(6,2)

RELATIVE_HUMIDITY NUMBER(6,2)

SOLAR_RADIATION NUMBER(7,3)

RAINFALL_AMOUNT NUMBER(6,2)

WIND_DIRECTION_SAMP NUMBER(7,3)

WIND_SPEED_SAMP NUMBER(10,3)

BAROMETRIC_PRESSURE NUMBER(10,3)

WATER_TEMPERATURE NUMBER(7,3)

CONDUCTIVITY NUMBER(7,3)

PH NUMBER(7,2)

WATER_DEPTH NUMBER(7,3)

TURBIDITY NUMBER(6,2)

DISSOLVED_OXYGEN NUMBER(6,2)

SALINITY NUMBER(6,2)

OXIDATION_REDUCTION_POTENTIAL NUMBER(7,2)

V_PH NUMBER(3)

V_SALINITY NUMBER(3)

V_AIR_TEMPERATURE NUMBER(3)

V_RELATIVE_HUMIDITY NUMBER(3)

V_SOLAR_RADIATION NUMBER(3)

V_RAINFALL_AMOUNT NUMBER(3)

V_WIND_DIRECTION_SAMP NUMBER(3)

V_WIND_SPEED_SAMP NUMBER(3)

V_BAROMETRIC_PRESSURE NUMBER(3)

V_WATER_TEMPERATURE NUMBER(3)

V_CONDUCTIVITY NUMBER(3)

V_WATER_DEPTH NUMBER(3)

V_TURBIDITY NUMBER(3)

V_DISSOLVED_OXYGEN NUMBER(3)

V_OXYGEN_REDUCTION_POTENTIAL NUMBER(3)

DISSOLVED_OXYGEN_P NUMBER(7,2)

V_DISSOLVED_OXYGEN_P NUMBER(3)

EMS2_DATA

Name Null? Type

------

STATIONID NOT NULL NUMBER(4)

SAMPLE_DATE_TIME NOT NULL DATE

BATTERY_VOLTAGE NUMBER(6,2)

PROGRAM_SIGNATURE NUMBER(10)

WATER_TEMPERATURE NUMBER(7,3)

CONDUCTIVITY NUMBER(7,3)

PH NUMBER(7,2)

WATER_DEPTH NUMBER(7,3)

TURBIDITY NUMBER(6,2)

DISSOLVED_OXYGEN NUMBER(6,2)

SALINITY NUMBER(6,2)

OXIDATION_REDUCTION_POTENTIAL NUMBER(7,2)

V_SAMPLE_DATE_TIME NUMBER(3)

V_WATER_TEMPERATURE NUMBER(3)

V_CONDUCTIVITY NUMBER(3)

V_PH NUMBER(3)

V_WATER_DEPTH NUMBER(3)

V_TURBIDITY NUMBER(3)

V_DISSOLVED_OXYGEN NUMBER(3)

V_SALINITY NUMBER(3)

V_OXIDATION_REDUCTION_PO NUMBER(3)

DISSOLVED_OXYGEN_PCT NUMBER(7,2)

V_DISSOLVED_OXYGEN_PCT NUMBER(3)

EMS3_DATA

Name Null? Type

------

SAMPLE_DATE_TIME NOT NULL DATE

BARO NUMBER(10,3)

CO2 NUMBER(6,2)

CO NUMBER(6,2)

NO2 NUMBER(6,2)

O3 NUMBER(6,2)

PART NUMBER(6,2)

RH NUMBER(6,2)

SND NUMBER(6,2)

SO2 NUMBER(6,2)

TEMP NUMBER(6,2)

WDIR NUMBER(6,2)

WSPD NUMBER(6,2)

PWR NUMBER(6,2)

SERVICE_RECORD

Name Null? Type

------

SERVICERECORDID NOT NULL NUMBER(38)

STATIONID NUMBER(38)

SERVICEDATE DATE

SERVICEDBY VARCHAR2(255)

SERVICENOTES VARCHAR2(255)

STATION

Name Null? Type

------

STATIONID NOT NULL NUMBER(38)

CANONICALNAME NOT NULL VARCHAR2(255)

DATEINSERVICE NOT NULL DATE

LOCATION VARCHAR2(255)

AREA VARCHAR2(255)

STREET VARCHAR2(255)

CITY VARCHAR2(255)

STATE VARCHAR2(2)

ZIP VARCHAR2(255)

FLOOR VARCHAR2(255)

PHONENUMBER VARCHAR2(15)

CONSTRAINTS

EMS1_DATA

CONSTRAINT_NAME C SEARCH_CONDITION

------

SYS_C004926 C "STATIONID" IS NOT NULL

SYS_C004927 C "SAMPLE_DATE_TIME" IS NOT NULL

PK_EMS1 P

CONSTRAINT_NAME COLUMN_NAME

------

PK_EMS1 SAMPLE_DATE_TIME

SYS_C004926 STATIONID

SYS_C004927 SAMPLE_DATE_TIME

EMS2_DATA

CONSTRAINT_NAME C SEARCH_CONDITION

------

SYS_C001658 C "STATIONID" IS NOT NULL

SYS_C001659 C "SAMPLE_DATE_TIME" IS NOT NULL

PK_EMS2 P

CONSTRAINT_NAME COLUMN_NAME

------

PK_EMS2 STATIONID

PK_EMS2 SAMPLE_DATE_TIME

SYS_C001658 STATIONID

SYS_C001659 SAMPLE_DATE_TIME

EMS3_DATA

CONSTRAINT_NAME C SEARCH_CONDITION

------

SYS_C001661 C "SAMPLE_DATE_TIME" IS NOT NULL

PK_EMS3 P

CONSTRAINT_NAME COLUMN_NAME

------

PK_EMS3 SAMPLE_DATE_TIME

SYS_C001661 SAMPLE_DATE_TIME

CONTROL_EMS1

CONSTRAINT_NAME C SEARCH_CONDITION

------

SYS_C004931 C "V_TYPE" IS NOT NULL

CONTROL_EMS2

CONSTRAINT_NAME C SEARCH_CONDITION

------

SYS_C004882 C "V_TYPE" IS NOT NULL

PK_V_TYPE P

CONSTRAINT_NAME COLUMN_NAME

------

PK_V_TYPE V_TYPE

SYS_C004882 V_TYPE

CLEAN_DATA

CONSTRAINT_NAME C SEARCH_CONDITION

------

SYS_C001652 C "STATIONID" IS NOT NULL

SYS_C001653 C "SAMPLE_DATE_TIME" IS NOT NULL

PK_CLEAN_DATA P

CONSTRAINT_NAME COLUMN_NAME

------

PK_CLEAN_DATA STATIONID

PK_CLEAN_DATA SAMPLE_DATE_TIME

SYS_C001652 STATIONID

SYS_C001653 SAMPLE_DATE_TIME

BACKUP_CLEAN_DATA

CONSTRAINT_NAME C SEARCH_CONDITION

------

SYS_C001650 C "STATIONID" IS NOT NULL

SYS_C001651 C "SAMPLE_DATE_TIME" IS NOT NULL

CONSTRAINT_NAME COLUMN_NAME

------

SYS_C001650 STATIONID

SYS_C001651 SAMPLE_DATE_TIME

STATION

CONSTRAINT_NAME C SEARCH_CONDITION

------

SYS_C001665 C "STATIONID" IS NOT NULL

SYS_C001666 C "CANONICALNAME" IS NOT NULL

SYS_C001667 C "DATEINSERVICE" IS NOT NULL

PK_STATION P

CONSTRAINT_NAME COLUMN_NAME

------

PK_STATION STATIONID

SYS_C001665 STATIONID

SYS_C001666 CANONICALNAME

SYS_C001667 DATEINSERVICE

SERVICE_RECORD

CONSTRAINT_NAME C SEARCH_CONDITION

------

SYS_C001663 C "SERVICERECORDID" IS NOT NULL

PK_SERVICE_RECORD P

FK_SERVICE__SERVICED_STATION R

CONSTRAINT_NAME COLUMN_NAME

------

FK_SERVICE__SERVICED_STATION STATIONID

PK_SERVICE_RECORD SERVICERECORDID

SYS_C001663 SERVICERECORDID

Programs and their locations

Under user hmdc

hmdc/public_html/cgi-bin/ems/query_ems01.pm

hmdc /public_html/cgi-bin/ems/query_ems02.pm

hmdc /public_html/cgi-bin/ems/query_ems03.pm

hmdc /public_html/cgi-bin/ems/main_ems.pl

hmdc /public_html/cgi-bin/ems/csv_output.pl

hmdc /public_html/cgi-bin/ems/html_output.pl

hmdc /public_html/cgi-bin/ems/create_chart.pl

hmdc /public_html/cgi-bin/ems/chart_variable_values.pl

Under user Weather:

Pub/ems_data/Perl_insert_ems01.pl

Pub/ems_data/Perl_insert_ems02.pl

Pub/ems_data/Perl_insert_ems03.pl

Pub/ems_data/validation_ems1.pl

Pub/ems_data/getfiles

Pub/ems_data/download_and_loadfiles

WWW/hmdc_public/stations/search.html

WWW/hmdc_public/stations/ems_search.html

What the Program does:

Pub/ems_data/getfiles: This program is run using a cron job every 1 hour.
This sets the path for downloading data and then calls the download_and_loadfiles.pl

Pub/ems_data/download_and_loadfiles:

This program is run using a cron job every 1 hour.This exports the data from cimicgis in .dat format and calls the insert program.

Pub/ems_data/Perl_insert_ems01.pl

Pub/ems_data/Perl_insert_ems02.pl

These programs are called from the download_and_loadfiles.pl program every 1 hr using a cron job in order to ingest data to the Oracle database.

Pub/ems_data/validation_ems1.pl: This program is run using a cron job every 1 hour when validating data automatically.

This validates the data in the EMS tables based on max/min allowable values specified in the Control_ems tables. In case the data is found to be invalid then it is marked with a specific code.
When validating data manually this program allows the user to validate the data for a specific time period by choosing the starting date and ending data from the command line manually .The program has to be run from /home/pub/ems_data and the starting & ending dates specified as per the required format.

The codes are as follows:
Greater than MAX: 200

Lesser than MIN: 300

Null values: 700

Step value greater than max: 400

Higher than Std Dev higher limit: 500

Lower than Std Dev lower limit: 600

Pub/ems_data/manual_validation_ems1.pl: This program allows the user to validate the data for a specific time period by choosing the date manually .This program can be run by the user as and when wished and the period for which the data is to be validated can be specified.
The other functionalities are same as the program for automatic validation.

WWW/hmdc_public/stations/search.html: This is the opening page for queries. This calls the main_ems.pl

hmdc /public_html/cgi-bin/ems/main_ems.pl

This checks for the various features like whether to show all data or only validated data, summary level etc and then based on the station for which the data needs to be shown,the Query programs are called.

hmdc/public_html/cgi-bin/ems/query_ems01.pm

hmdc /public_html/cgi-bin/ems/query_ems02.pm

hmdc /public_html/cgi-bin/ems/query_ems03.pm

These programs are called from main_ems.pl to retrieve data based on particular requirements and the Station as specified by the user.

hmdc /public_html/cgi-bin/ems/csv_output.pl: This program is called from the Query program in order to display data or allow data to be downloaded as csv files.

hmdc /public_html/cgi-bin/ems/html_output.pl: This program is called from the Query program in order to display data as a html table.

Note:

The proposed system shown in the diagram has been shown after an attempt at normalizing all the tables.
However I am not sure whether having one table for data from the different EMS stations will be efficient or not.

The new diagram includes only one table with sample_date_time+EMS Stn no as the primary key and the table holds all the fields present in the different EMS stations.
But there is a chance that later more stations might be added and the number of fields might increase.In such cases we would need to modify the table each time.

Also, there could be a possibility that while retrieving records for one particular station the program could be slow, unless maybe indexing could help.

To do list:

1)  Change the graph function so as not to plot invalid data.

2)  Prepare a update program or change the ingest program so that updated data can be reingested.

3)  Validation program for EMS2