19th International Roundtable on Business Survey Frames
Cardiff, United Kingdom 17 – 21 October 2005
Session 5: Developing new register systems and tools
Gaétan St-Louis, Statistics Canada
Business Register Redesign Project - Technology Strategy Plan

Business Register Redesign Project

Technology Strategy Plan

October 2005

Gaétan St-Louis

Statistics Canada

Table of Contents

1 Introduction

2 BR Redesign Project Goals and Strategies

2.1 Simplification

2.2 Accessibility

2.3 Integration

2.4 Timeliness

2.5 Improved Management and Follow-up Concerning the Respondent Burden

2.6 Tax data usage

2.7 Major efficiencies

2.8 Data collection areas

3 Strategic Technology Issues and Decisions

3.1 Database Management Technology Products

3.2 Database Management Data Models

3.3 IT Resource Management

3.4 Business Process and Workflow Management

3.5 Data Warehousing Products

3.6 Application Development Platforms

4 New BR Model

4.1 User Interface

4.2 Browser Module

4.3 Update Module

4.4 Structure Manager Module

4.5 Collection Entity Module

4.6 Workload Module

4.7 Survey Interface Module

4.8 Response Burden Module

4.9 Reports & Analysis Tools

4.10 Database content

5 Conclusion

1 Introduction

The current Canadian Business Register (BR) computer system is based on a complex architecture residing on the mainframe and on old software technology that is no longer used by other divisions of Statistics Canada. This situation has several negative effects such as:

  • It is laborious and costly to make changes to improve the efficiency of the Register. Integration of new modules is difficult given the complexity of the current system.
  • It is laborious and expensive to maintain the content of the Register. Automated or manual updates are costly to perform on the current system mainly due to mainframe usage.
  • Retrieval of information from the current system is extremely difficult and costly due to the old technology, the current architecture and the diverse non-integrated modules of the Register.

The redesign of the BR provides the opportunity to build a completely new system and tools so that the BR can continue to meet the needs of the Survey Programs and the System of National Accounts (SNA).

This document presents the Business Register Redesign Project technology strategy plan and provides an overview of the different components that are being built to achieve the main objectives of the redesign.

With respect to technology, we have adopted several strategies, which are identified under the following headings:

  • Database Management Technology Products
  • Database Management Data Models
  • IT Resource Management
  • Business Process and Workflow Management
  • Data Warehousing Products
  • Application Development Platforms

The last portion of the document describes the new model of the Business Register with each of its main components. This model is based on the various functions that are necessary to the operations of a central business register in the Canadian statistical system.

2 BR Redesign Project Goals and Strategies

The general structure of the Business Register was developed and put in place in the mid-1980s. Both the outside world and the internal environment have changed enormously in the last two decades. The changes made over the years to keep the BR roadworthy are not sufficient to enable the Register to respond adequately to emerging pressures in the medium and long term.

To properly define the objectives of the redesign project, a series of consultations was conducted during 2004. Multidisciplinary working groups, with multi-divisional or cross-agency membership, were created to assess the overall quality of the existing BR and to develop a strategic plan for setting priorities with respect to potential improvements. One theme to emerge was a desire to simplify the BR as a whole and to reduce its operating costs, while continuing to meet the needs of the different survey programs and the SNA. The project seeks to design a new Register that will meet the following objectives:

2.1 Simplification

The concepts, procedures and systems of a new Register must be more comprehensible. Currently, the BR is so complex (with respect to both the concepts and their operationalization) that it imposes a very steep learning curve on users and BRD staff alike. Also, communications between the Agency and respondents need to be improved, using language and concepts that are familiar to respondents rather than being based on the perspective of statisticians.

2.2 Accessibility

The BR should be made more accessible and user-friendly. A mass of information and data is stored in the BR, but it is hard to access because of a lack of efficient tools for querying the database and because of the system’s complex architecture. In other words, all information stored in a new BR must be easy to retrieve, both for purposes of consulting micro records and for conducting macro analyses.

2.3 Integration

It will be possible to link the systems of the new BR with other major systems relating to economic surveys (we are thinking primarily of tax databases, collection tools and the data warehouse of the Enterprise Statistics Division (ESD)).

2.4 Timeliness

The enterprises in the Register will be updated more frequently. First it is necessary to optimize automated updates (such as by making greater use of tax data) and to concentrate BRD staff on manual updates concerning enterprises that have more economic impact. Second, there can be no doubt that with today’s technologies, it will be possible to provide our employees with interfaces enabling them to do their work much more quickly and efficiently.

2.5 Improved Management and Follow-up Concerning the Respondent Burden

It has been stressed that Statistics Canada can improve this aspect of the economic surveys. For this to happen, the BR must serve all business surveys. This will result in greater consistency in our economic statistics and better management of the burden on our respondents.

2.6 Tax data usage

The Register has made extensive use of taxation data over the last few years. However, there are additional edited sources of tax data which are pertinent for frame maintenance. Recent research has demonstrated that tax data could be used to validate/correct industry classification, to detect complex enterprises, to identify new and inactive units in a more timely fashion, and to update size variables. These are opportunities to maximize the use of tax data to improve the timeliness and accuracy of the BR.

2.7 Major efficiencies

Although re-engineering the BR is a very ambitious project, it is a unique opportunity to review the conceptual framework and the current operating procedures and to simplify many aspects of the Register. When completed, the new environment will lead to major efficiencies within the Register. It should also lead to efficiencies for users of the Register since the redesign will facilitate the use of the BR.

2.8 Data collection areas

The BR redesign presents a great opportunity to increase the effectiveness of collection activities as they pertain to the overall quality of the Register and the timeliness of updates received from survey areas. Currently, collection staff deal with only one particular production unit or a group of production units for a given survey. The new environment should provide them with a complete picture of the operating structure of any given business, along with all of its associated survey collection requirements.

3 Strategic Technology Issues and Decisions

In this section we consider a number of issues related to our use of technology in the upcoming BR Redesign Project. Under each of the following sub-headings we introduce a topic of concern, state the issue in the form of a question, identify the threats, opportunities, strengths and weaknesses of alternative strategies for addressing the issue, and finally state the decision.

3.1 Database Management Technology Products

For the procurement of database management technology products we were presented with two viable options: mainstream vendor products (Oracle, SQL Server, and Sybase) and open source products (MySQL, PostgreSQL). The primary distinctions are access to the source code and cost.

The Government of Canada has opened the door for procurement of free and open source software, thereby creating a level playing field for procurement. Open source software is used in some government and public sector organizations. The open source software community of programmers is becoming more organized and effective in delivering products and services. The subject has recently received some attention at STC as well.

Issue: Do we now introduce Open Source DBMS products into the mix of STC products?

The risks related to the viability of open source DBMS products, and the uncertainty about our capability to deploy and manage them, are higher than we are willing to tolerate on this project. The same applies to open source products in general, including operating systems.

Decision: Employ mainstream database management technology (SQL Server).

The project team has selected Microsoft SQL Server 2005 as its first choice of DBMS product, subject to successfully meeting or exceeding software-specific performance requirements. Every effort, within reason, will be made to facilitate success, including upgrades to the product version, operating system version and processor capacity. If the trial completes successfully, then Microsoft SQL Server will be deployed as the platform for transactional databases.

If SQL Server 2005 fails the performance requirements, then Oracle 9i will be the second choice of DBMS product, subject to successfully meeting or exceeding software-specific performance requirements. The first Oracle trial will be in a Windows environment. If that fails, a second trial will be conducted in a UNIX environment. If a trial completes successfully, then Oracle 9i will be deployed as the platform for operational and transactional databases.
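
The selection sequence described above amounts to an ordered list of trials in which the first success wins. The following Python sketch is illustrative only: the names `TRIALS`, `select_dbms` and `passes_trial` are hypothetical, the performance requirements themselves are not specified in this document, and the SQL Server trial environment is left unspecified because the text does not name it.

```python
from typing import Callable, List, Optional, Tuple

Trial = Tuple[str, Optional[str]]  # (product, trial environment)

# Ordered trials as described in the text. The environment for the
# SQL Server trial is not stated, so it is recorded as None (assumption).
TRIALS: List[Trial] = [
    ("Microsoft SQL Server 2005", None),
    ("Oracle 9i", "Windows"),
    ("Oracle 9i", "UNIX"),
]


def select_dbms(
    passes_trial: Callable[[str, Optional[str]], bool]
) -> Optional[Trial]:
    """Return the first (product, environment) that passes its trial.

    `passes_trial` is a hypothetical callback standing in for the
    performance trial; None is returned if every trial fails.
    """
    for product, environment in TRIALS:
        if passes_trial(product, environment):
            return (product, environment)
    return None
```

The document does not say what happens if all three trials fail, so the sketch simply returns `None` in that case.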

3.2 Database Management Data Models

Database management technology allows for the implementation of many data models, including the hierarchical data model, the independent data file model (ADABAS), the relational data model (Oracle, Sybase, SQL Server) and the object-oriented data model (OBDII by Fujitsu). In addition to these, the ANSI SQL-99 standard introduces the object-relational (O-R) model, which essentially provides for user-defined data types and functions (together viewed as an object). The user-defined data types and functions become part of the SQL language and consequently become pervasive throughout the system code and beyond.

Another data model consideration is XML. The mainstream RDBMS vendors are quickly incorporating XML features and will likely overtake smaller XML DBMS vendors (Gartner).

Issue: Do we now begin to exploit the object-relational and XML features of proprietary and open source DBMS products?

The risks related to openly endorsing the use of the SQL-99 O-R features without well-defined and implemented controls are significant.

Decision: Use standard relational database model features, with the provision that the use of O-R extensions may be formally approved where deemed beneficial and manageable.

The project team will use the standard relational database model. The conceptual data models will be developed using OO modeling facilities (e.g. UML). Candidate objects for O-R implementation will be identified and proposed for consideration. The logical data models will be developed using single inheritance entity-relationship diagrams. Proposed candidate objects will be approved (or not) in the context of the logical models. Physical model specifications will include O-R extensions for approved objects only.
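
To illustrate how a single inheritance entity-relationship model can be implemented with standard relational features only (no O-R extensions), the sketch below maps one supertype and two subtypes to ordinary tables, with each subtype table sharing the supertype's primary key. It is a minimal sketch under stated assumptions: all table and column names and the sample rows are hypothetical and are not taken from the BR design, and SQLite (via Python's standard `sqlite3` module) stands in for the production DBMS.

```python
import sqlite3

# Hypothetical schema, for illustration only (not the BR data model).
SCHEMA = """
CREATE TABLE statistical_unit (           -- supertype
    unit_id   INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    unit_type TEXT NOT NULL
        CHECK (unit_type IN ('enterprise', 'establishment'))
);
CREATE TABLE enterprise (                 -- subtype, shares the supertype key
    unit_id      INTEGER PRIMARY KEY REFERENCES statistical_unit(unit_id),
    legal_status TEXT
);
CREATE TABLE establishment (              -- subtype, shares the supertype key
    unit_id       INTEGER PRIMARY KEY REFERENCES statistical_unit(unit_id),
    enterprise_id INTEGER NOT NULL REFERENCES enterprise(unit_id),
    industry_code TEXT
);
"""


def build_demo() -> sqlite3.Connection:
    """Create an in-memory database with one enterprise and one establishment."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    # Invented sample rows.
    conn.execute("INSERT INTO statistical_unit VALUES (1, 'Example Corp', 'enterprise')")
    conn.execute("INSERT INTO enterprise VALUES (1, 'corporation')")
    conn.execute("INSERT INTO statistical_unit VALUES (2, 'Example Plant', 'establishment')")
    conn.execute("INSERT INTO establishment VALUES (2, 1, '311')")
    conn.commit()
    return conn


def establishments_of(conn: sqlite3.Connection, enterprise_id: int) -> list:
    """Return the names of establishments belonging to one enterprise."""
    rows = conn.execute(
        "SELECT u.name FROM establishment e "
        "JOIN statistical_unit u ON u.unit_id = e.unit_id "
        "WHERE e.enterprise_id = ? ORDER BY u.name",
        (enterprise_id,),
    )
    return [name for (name,) in rows]
```

Attributes common to all units live in the supertype table, and a join on the shared key recovers the full subtype record. An approved O-R object would replace part of this mapping in the physical model only.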

3.3 IT Resource Management

Within organizations today there is a movement towards corporate management of the computing environment. STC has recognized this, as is evident in the Strategic Streamlining of IT Resources Initiative. One objective of this initiative is to position our organization for utility computing, also known as “on demand computing”. External influences such as budget cuts and the innovations in scalable redundant storage and processing technology are moving STC more in the direction of on demand computing. However, the services and utilities are not fully available today. Early adopters will need to manage the associated risks.

Utilities are data storage, data backup, database hosting and application hosting. Services are server administration, database administration, hardware/software procurement and contract management, and LAN support services.

Issue: Do we now target corporate on demand computing utilities and services for divisional systems?

The risks of continuing to develop and maintain local IT utilities and services have become significant. All existing corporate IT utilities and services deemed to be mature, reliable, and scaled to adequate capacity will be purchased under the terms of negotiated service level agreements (SLA).

Decision: Purchase existing corporate on-demand IT utilities and services, and petition the corporation to expand on the available utilities and services.

The project team will pursue the contracting of corporate IT utilities and services by negotiating an SLA for server housing, full system administration, Enterprise Storage (ESS) and Enterprise Backup (EBS). The team will also negotiate an SLA for SQL hosting and database administration for development, testing and production. Finally, it will petition for an expansion of IT utilities to include CPU on-demand.

3.4 Business Process and Workflow Management

A workflow management system supports the execution of business processes and provides business managers with process metadata. There are a number of standards groups involved in the workflow management business.

STC has some experience with workflow management products and has recently procured a corporate license that includes workflow management facilities.

Issue: Do we now purchase and employ workflow management technology?

The risks related to the use of new vendor products, unproven at Statistics Canada (STC), are significant. The workflow management requirements of the BR are not complex.

Decision: Custom-build a workflow management system.

The project will build a workflow management system that specifically addresses BRD needs, implements the “portfolio” caseload concept, and incorporates flexibility through condition/action formatted business rules.
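
As an illustration of what condition/action formatted business rules might look like, the Python sketch below routes a work item to a portfolio by applying the first rule whose condition holds. Everything in it is an assumption made for the example: the `WorkItem` fields, the revenue threshold, the portfolio names, and the reading of a “portfolio” as a named caseload are hypothetical and do not describe the actual BRD system.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class WorkItem:
    """A hypothetical unit of work, e.g. an update request for one enterprise."""
    enterprise_id: int
    revenue: float
    portfolio: str = "unassigned"
    log: List[str] = field(default_factory=list)


@dataclass
class Rule:
    """A business rule expressed as a condition and an action."""
    name: str
    condition: Callable[[WorkItem], bool]
    action: Callable[[WorkItem], None]


def assign(portfolio: str) -> Callable[[WorkItem], None]:
    """Build an action that places a work item in the given portfolio."""
    def action(item: WorkItem) -> None:
        item.portfolio = portfolio
    return action


# Illustrative rules only; the threshold and portfolio names are invented.
RULES = [
    Rule("large enterprise", lambda i: i.revenue >= 1_000_000, assign("manual-review")),
    Rule("default", lambda i: True, assign("automated-update")),
]


def route(item: WorkItem, rules: List[Rule] = RULES) -> WorkItem:
    """Apply the first rule whose condition holds and record which rule fired."""
    for rule in rules:
        if rule.condition(item):
            rule.action(item)
            item.log.append(rule.name)
            break
    return item
```

Because the rules are data rather than hard-coded branches, they can be reordered or extended without changing `route`, which is one way to obtain the flexibility mentioned above.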

3.5 Data Warehousing Products

STC has a number of successful data warehouse applications (Input-Output, Unified Enterprise Survey (UES)) as well as a number of projects underway (Green House Gases, Census MIS). The expertise in this technology is concentrated in the Data Warehouse Technology Centre in System Development Division (SDD).

BRD will implement data warehousing technology only where deemed appropriate by the design team. SQL Server will be the principal data warehouse technology.

3.6 Application Development Platforms

Our three major software vendors (SAS, Microsoft and Oracle) each supply a development platform for their products. Strategically, STC has tried to manage the diversity of these platforms. Even within the constrained list of vendor suites there are options presented to the developer, e.g. Visual Studio .Net provides the programmer the option to program in VB.Net, C#, VC++, J# and others.

Choosing a software development platform product does not necessarily imply a total solution. The data management layer of the application does not have to be from the same vendor, e.g. SAS applications can access either an Oracle database or a Microsoft database, a VB .Net application can access a Sybase database. Therefore there may be strategic advantages to separating our preference of development platform from database platform. Typically we choose the database vendor’s development platform for at least a major part of the application.

Another obvious consideration is that by choosing a vendor we are essentially choosing an application development technology and application architecture: Oracle is Java, Microsoft is .Net, and SAS is SAS (and Java).

Issue: Do we now select application development platforms on their merit, independent of database products?

Decision: Select the most appropriate development platform(s).

DBMS vendor development platforms do not address requirements of all layers of the application architecture equally well. The project team has established that Microsoft Visual Basic .Net 2005 will be the development platform for interface layer development. SAS 9 will be the development platform for bulk data transformations and batched transactions.

4 New BR Model

In this section, we will describe the different components that will make up Statistics Canada’s new Business Register. Each component represents functions that are essential to the administration and maintenance of a central business register.

The new business register (BR) system will be developed modularly. Each module will be developed by prototyping, using an iterative approach. The BR will be made up of a relational database (SQL Server), and interaction with this database will take place using modules that will be developed in VB.NET. The SAS software will be used for developing components that require aggregation of data, such as some reports or the grouping of flat files.

The diagram below shows the new BR environment with its main modules.

4.1 User Interface

All BR users will have the same user interface. Regardless of whether the user is from the BRD or from data collection operations or is a subject matter expert, he or she will enter the BR and view its data in the same way. On the other hand, even though the interface is common to all users, privileges will be controlled by means of a privilege administration tool.

It is important to provide a common interface, but it is even more important to control access and levels of access to the system. For example, if a Statistics Canada subject matter person wants to access the BR to check the information on businesses participating in his/her survey, that person will not necessarily have the privilege of updating the information stored in the BR. On the other hand, that same person may be given the privilege of updating the information contained in the collection entity (CE) for the enterprises participating in the survey. It should be noted that only employees sworn under the Statistics Act can request access to the BR.

The levels of access that will be permitted to the BR are as follows:

  • Browsing only
  • Updating of collection entities
  • Updating of information relating to the enterprise
  • Updating of legal and operational structures
  • Portfolio management
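
The access levels listed above could be modelled as combinable privileges checked by the privilege administration tool. The Python sketch below is only one possible representation: the `Access` flag names mirror the list, but the role names, the role-to-privilege assignments and the `is_allowed` helper are hypothetical.

```python
from enum import Flag, auto


class Access(Flag):
    """The five access levels listed above, modelled as combinable flags."""
    BROWSE = auto()
    UPDATE_COLLECTION_ENTITIES = auto()
    UPDATE_ENTERPRISE_INFO = auto()
    UPDATE_STRUCTURES = auto()
    MANAGE_PORTFOLIOS = auto()


# Hypothetical role-to-privilege assignments, for illustration only.
ROLE_PRIVILEGES = {
    "subject_matter": Access.BROWSE | Access.UPDATE_COLLECTION_ENTITIES,
    "brd_analyst": (
        Access.BROWSE
        | Access.UPDATE_ENTERPRISE_INFO
        | Access.UPDATE_STRUCTURES
    ),
}


def is_allowed(role: str, needed: Access) -> bool:
    """Return True if the role holds every privilege in `needed`."""
    granted = ROLE_PRIVILEGES.get(role, Access(0))  # unknown roles get nothing
    return (granted & needed) == needed
```

This mirrors the example given earlier in this section: a subject matter user may browse and update collection entities for a survey without being able to update enterprise information.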

4.2 Browser Module