Reviewed - Error Handling in ETL -Informatica.docError Handling in ETL -Informatica Ver1.0

Error Handling in ETL - Informatica

Ver1.0

(Oct 2006)


Document Revision List

Project: EDW & Reporting

Client: Destiny (Cartus)

Document Version No.: 1.0

Name / Version No. / Description / Revision Date
Error Handling in ETL-Informatica / 1.0 / Initial Version


ERROR HANDLING

This document contains Error Handling and processing logic. It is intended to survive the project and should be revised as enhancements are made to the data warehouse.

Following is the data flow of ETL information through to the Data Warehouse.

This process will be accessed when any ETL check fails for a record. The failed record is written to the error table and processing resumes with the next input record. Any record written to error tables is corrected and fed back into the process at a later time.

Error Table Requirements

Data warehouse will maintain an Error table in addition to all the Staging and EDW tables required for processing the business logic.

This error table will hold all those records which fail a lookup condition.

Ø  To get the corresponding Description and Abbreviation in Staging.

Ø  To get the corresponding Dimensional keys in EDW.

The error table will consist of the following fields:

ERROR_SEQ_NUM NUMBER(10)

ERROR_OCCUR_DATE DATE

ERROR_TGT_TABLE_NAME VARCHAR2(100)

ERROR_TGT_MAPPING_NAME VARCHAR2(500)

ERROR_SOURCE_DATA LONG

ERROR_MSG_TXT VARCHAR2(1000)

ERROR_SEQ_NUM

Every record passed to the error table will have a unique ERROR_SEQ_NUM Once a value of ERROR_SEQ_NUM is assigned to a row in an error table, the same value must never be assigned to another row in the same error table even after the original row is deleted from the table.

ERROR_OCCUR_DATE

ERROR_OCCUR_DATE indicates the date on which the record was written to the error table. The ERROR_OCCUR_DATE should be populated as SYSDATE.

ERROR_TGT_TABLE_NAME

The name of the Target table (Staging or EDW), which is being populated by the ETL.

ERROR_TGT_MAPPING_NAME

The name of the Mapping (Staging or EDW), which is being populated by the ETL.

ERROR_SOURCE_DATA

The source record (data in all source fields) which was written to the error table will be concatenated and separated by a delimiter ||.

ERROR_MSG_TXT

The error message concatenated and separated by a delimiter ||

Primary Key Constraint

The error table must have a primary key constraint consisting of a single column. The primary key ensures the uniqueness of every row in the table and prevents anomalies that occur when rows are not unique. It also serves to document the structure of the table. In the error table ERROR_SEQ_NUM will be the primary key.

Not Null Constraints

In addition to the primary key, the following columns must have a not null constraint:

ERROR_SEQ_NUM

ERROR_OCCUR_DATE

These constraints are required to ensure the Oracle optimizer uses indices on these columns during sorting operations.

Indexes

The following indexes are required:

create index index_name on (ERROR_OCCUR_DATE)

This single Error error table should be created either in the Staging or EDW schema and a synonym to this error table should be created which should be accessible from both the Staging and EDW schemas.

TCS Internal Page 5 of 5